Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent parsing of kwargs in some Children of MongoStore #1013

Open
jmmshn opened this issue Nov 26, 2024 · 4 comments
Open

Inconsistent parsing of kwargs in some Children of MongoStore #1013

jmmshn opened this issue Nov 26, 2024 · 4 comments

Comments

@jmmshn
Copy link
Contributor

jmmshn commented Nov 26, 2024

It looks like we have a few places where the __init__ skips the MongoStore.__init__ which can leave some flags like safe_update unset.

super(MongoStore, self).__init__(**kwargs) # lgtm

It's not clear to me why we use super(MongoStore, self) here instead of super(). Perhaps it allow more fine-grained re-work of some of the initialization behavior, but it does create some weird edge cases that are hard to catch.

@tschaume
Copy link
Member

It could be a remnant from the python2 days which is unlikely, though. There's also this commit from 2020 that explicitly changed it from no argument to including the class. There's no explanation but it seems to been have done on purpose.

@jmmshn
Copy link
Contributor Author

jmmshn commented Nov 26, 2024

Thanks for the fast reply!
I did more digging and I think the different MongoStore.__init__ look the way they currently do because there MSONable does not parse parent or grandparent __init__. So for anything important that needs to be serialized we have to explicitly place this in the final child's __init__.

The are obvious benefits of this: the point of failure is now much closer to the the actual function being called.

But I don't know if it makes sense to just have as_dict move up the MRO and get all the __init__ args and kwargs that appear before MSONable. I think I'll just patch the bug I found with the missing MongURIStore.safe_update flag. And leave any decisions about reworking MSONable for later.

Illustration of the reason why we have those weird looking supers

from monty.json import MSONable

# Common pattern - but has issues
from monty.json import MSONable

# Common pattern - but has issues
class Parent(MSONable):
    def __init__(self, required_arg, **kwargs):
        self.required_arg = required_arg
        self.kwargs = kwargs

class Child(Parent):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)

c = Child(required_arg=1, a = 1)
c.as_dict()

The Child above will only serialize to:

{
'@module': '__main__', 
'@class': 'Child', 
'@version': None, 
'a': 1
}

Missing the required_arg.

Line responsible for this behavior:
https://github.com/materialsvirtuallab/monty/blob/a3d35a63d911127a2bb0a8331f3b6265d249151c/src/monty/json.py#L167C1-L168C1

Attempt at solving solving the issue:

def get_all_init_specs(cls):
    specs = []
    for c in cls.__mro__:
        try:
            if '__init__' in c.__dict__:  # Only get classes that define their own __init__
                spec = getfullargspec(c.__init__)
                specs.append((c.__name__, spec))
        except TypeError:  # Skip cases where we can't get argspec (like built-in classes)
            continue
    return specs

@rkingsbury
Copy link
Collaborator

Thanks for investigating @jmmshn . I'm happy to consider modifying this if you think it's best.

Speaking of MongoURIStore - I would like to consolidate it with MongoStore (see #483 ) maybe using a classmethod (e.g., MongoStore.from_uri() or something along those lines. We can discuss in the other issue if you have thoughts, but I wanted to link it here as it might be relevant.

@jmmshn
Copy link
Contributor Author

jmmshn commented Dec 2, 2024

@rkingsbury, the from_uri makes sense although the pymongo kinda overloads the usage of host for both the URI and standard hostnames in the MongoClient constructor so it's hard to map it one-to-one.
I play around with some edge cases to see if this is feasible.

However, I would like to see MSONable serialize up the MRO call stack which will solve this entire class of problems. Currently MSONable has (what I would argue is) the correct behavior for datalasses:

from monty.json import MSONable
from dataclasses import dataclass

@dataclass
class Parent(MSONable):
    required_arg: int

@dataclass
class Child(Parent):
    a: int
c = Child(required_arg=1, a = 1)
c.as_dict()

gives:

{'@module': '__main__',
 '@class': 'Child',
 '@version': None,
 'required_arg': 1,
 'a': 1}

I would be really nice to see it work like this generally.

I'll have to think a bit hard about how such a change will interact with cases where there is existing workarounds like we see for MonogoStore.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants