Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add fail-fast for dicts #1543

Open
wants to merge 4 commits into
base: main
Choose a base branch
from
Open

Add fail-fast for dicts #1543

wants to merge 4 commits into from

Conversation

uriyyo
Copy link
Contributor

@uriyyo uriyyo commented Nov 17, 2024

Change Summary

Add fail-fast feature to:

  • dict
  • typed-dict
  • dataclass-arguments
  • model-fields

Related issue number

#1345

Checklist

  • Unit tests for the changes exist
  • Documentation reflects the changes where applicable
  • Pydantic tests pass with this pydantic-core (except for expected changes)
  • My PR is ready to review, please add a comment including the phrase "please review" to assign reviewers

@uriyyo
Copy link
Contributor Author

uriyyo commented Nov 17, 2024

please review

Copy link

codecov bot commented Nov 17, 2024

Codecov Report

Attention: Patch coverage is 90.00000% with 2 lines in your changes missing coverage. Please review.

Project coverage is 89.60%. Comparing base (ab503cb) to head (805c64a).
Report is 264 commits behind head on main.

Files with missing lines Patch % Lines
src/validators/dict.rs 87.50% 1 Missing ⚠️
src/validators/typed_dict.rs 88.88% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1543      +/-   ##
==========================================
- Coverage   90.21%   89.60%   -0.61%     
==========================================
  Files         106      112       +6     
  Lines       16339    17867    +1528     
  Branches       36       40       +4     
==========================================
+ Hits        14740    16010    +1270     
- Misses       1592     1837     +245     
- Partials        7       20      +13     
Files with missing lines Coverage Δ
python/pydantic_core/core_schema.py 94.88% <100.00%> (+0.11%) ⬆️
src/validators/dict.rs 96.84% <87.50%> (-0.04%) ⬇️
src/validators/typed_dict.rs 89.66% <88.88%> (-2.39%) ⬇️

... and 60 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 562fad3...805c64a. Read the comment docs.

Copy link

codspeed-hq bot commented Nov 17, 2024

CodSpeed Performance Report

Merging #1543 will not alter performance

Comparing uriyyo:more-fail-fast (805c64a) with main (562fad3)

Summary

✅ 157 untouched benchmarks

@uriyyo
Copy link
Contributor Author

uriyyo commented Nov 17, 2024

After implementing this feature I thought maybe it's make sense to add kinda validation context structure to not duplicate fail-fast check logic, it can look like this:

struct ValidationContext {
    fail_fast: bool,
    errors: Vec<ValLineError>,
}

impl ValidationContext {
    fn new(fail_fast: bool) -> Self {
        Self {
            fail_fast,
            errors: Vec::new(),
        }
    }

    fn default() -> Self {
        return Self::new(false);
    }

    fn should_stop(self) -> bool {
        self.fail_fast && self.has_errors()
    }

    fn has_errors(self) -> bool {
        !self.errors.is_empty()
    }

    fn add_error(mut self, error: ValLineError) {
        self.errors.push(error);
    }
}

Copy link
Contributor

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, overall the implementation seems fine.

RE your refactoring idea, I was beginning to think of something similar recently, see #1517, probably this solves related problems.

I realise that looking at this, I'm a bit unsure what the user-facing result is. What order do they expect validation of fields to be in? Does it matter if we validate the first field but not the last N, because the second was fail fast? What if the next time we randomly did these in a different order, and the performance was different? Maybe there were even side effects 👀

Under what cases is fail-fast useful for a model?

I would love to understand some more about this. I think if it's about making errors shorter, we could make it easier to see just the first error. If it's about performance, well, maybe we just make validation faster so that fail-fast doesn't matter.

Unlike lists and sequences, with potentially large inputs, a model should have a relatively small finite number of fields, so I don't know how much performance is gained by failing early...

@uriyyo uriyyo requested a review from davidhewitt December 26, 2024 08:36
@uriyyo
Copy link
Contributor Author

uriyyo commented Dec 26, 2024

@davidhewitt Your points totally makes sense to me. I removed fail-fast for model and dataclass and left it for dict and typed-dict.

@uriyyo uriyyo changed the title Add fail-fast for dicts, model and dataclass Add fail-fast for dicts Dec 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants