Reduce redundant validations on metadata instantiation #249
base: patches
Conversation
Thank you for running QA. From a quick survey, the issue seems related to "_"-prefixed variables, which may be handled in a slightly different way. I will look into this.
@gabelepoudre This is great. I'll admit that when I wrote this I wasn't focused on speed, and there can definitely be a number of speed-ups. I could see how having a "skip_validation" attribute would be beneficial to remove any additional reads. I think going through how things are initiated will probably fix some of the tests. I can have a look this weekend. Other thoughts would be to use Pydantic for validation instead of the current base code, or to initialize in a different way, like reading in a single JSON. Not sure what the optimum way of initializing/validating is.
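The "skip_validation" attribute floated in the comment above could look something like the following. This is a minimal, hypothetical sketch: `validate`, `Base`, and `Station` here are stand-ins for illustration, not mt_metadata's real internals.

```python
# Hypothetical sketch of a skip_validation pass-through. validate() stands in
# for the (relatively expensive) parse/format/validate work; the flag lets
# already-validated attr_dicts bypass it.

def validate(key, value):
    """Stand-in for the validation/formatting step."""
    if not isinstance(key, str):
        raise TypeError(f"attribute name must be str, got {type(key)}")
    return value

class Base:
    def _set_attr_dict(self, attr_dict, skip_validation=False):
        # attr_dict entries loaded from the packaged JSON schemas were
        # already validated at import time, so re-checking them on every
        # instantiation is redundant work.
        for key, value in attr_dict.items():
            if not skip_validation:
                value = validate(key, value)
            setattr(self, key, value)

class Station(Base):
    def __init__(self):
        # Safe to skip: this class-level attr_dict was validated on import.
        self._set_attr_dict({"id": "", "latitude": 0.0}, skip_validation=True)

s = Station()
assert s.latitude == 0.0
```

The design choice here is that the flag defaults to False, so external callers passing unvalidated data keep the current safe behavior, and only internal call sites that provably pass pre-validated dictionaries opt in.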
@kujaku11 To clarify, the change in this PR is likely the most significant hit to performance in mt_metadata right now, as far as I can tell. I can't comment on optimum, but as for better, there are a couple of things that I noted/tried while looking at this:
This is my first PR in this repo, so please do point out any deficiencies in the request. I am always open to discussion!
Background

In `to_runts` and `from_runts`, one of the notable (but perhaps not the largest) performance hits is the time spent generating metadata objects. In a light investigation, some of this slowness comes from repeatedly re-validating the attr_dict objects on each instantiation of the metadata, even though they were already validated on first import.

Changes
A new argument `skip_validation` (default False) traces from `metadata.Base._set_attr_dict()` through to the actual attribute setting on the base_object in `helpers.recursive_split_setattr()`. Because "validation" also includes some amount of formatting, parsing, etc., this argument is only safe when you know the incoming property or properties have already been validated. Fortunately, this seems to always be the case when any subclass of `metadata.Base` calls `super().__init__()`, as the global-scope JSON parsing performs this same validation when building each attr_dict. Therefore, we can skip re-validating the entire base schema on every instantiation of a subclass.

Impact
`Station()` instantiation time decreased on my machine from 0.0012488 s to 0.0003102 s, an improvement of about 4x.

Testing
timing code:
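The timing snippet itself is not included in this excerpt; a minimal sketch of how such a per-instantiation measurement might be made with the standard library follows. The `Station` class here is a stand-in; when profiling the library itself, the real mt_metadata class would be substituted.

```python
# Sketch of a per-call timing harness using the stdlib timeit module.
import timeit

class Station:
    """Stand-in for the metadata class being benchmarked."""
    def __init__(self):
        self.id = ""
        self.latitude = 0.0

n = 1000
# timeit accepts a callable directly; dividing by n gives seconds per call,
# comparable to the 0.0012488 s -> 0.0003102 s figures quoted above.
seconds_per_call = timeit.timeit(Station, number=n) / n
print(f"Station(): {seconds_per_call:.7f} s per instantiation")
```

Running this once on the patched branch and once on the base branch gives the before/after comparison reported in the Impact section.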