Error when pydantic validate outputs a dict #54

Open
percevalw opened this issue Dec 15, 2023 · 2 comments · May be fixed by #55
Labels
bug Something isn't working

percevalw commented Dec 15, 2023

Hi,

I noticed a bug when using pydantic's casting functionality.

Using a custom component in spaCy:

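import spacy  # must be imported before the factory decorator below is evaluated


class CastStrAsDict:
    """Custom pydantic (v1-style) type that casts a string into a dict."""

    @classmethod
    def validate(cls, value):
        # "mykey" becomes {"mykey": True}; any other value passes through unchanged
        if isinstance(value, str):
            return {value: True}
        return value

    @classmethod
    def __get_validators__(cls):
        yield cls.validate


@spacy.Language.factory("custom")
class CustomComponent:
    def __init__(self, nlp, name, value: CastStrAsDict):
        self.nlp = nlp
        self.name = name
        self.value = value

    def __call__(self, doc):
        return doc


nlp = spacy.blank("en")
# The string config value is cast to a dict during validation, which triggers the error
nlp.add_pipe("custom", config={"value": "mykey"})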
Traceback
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[1], line 27
     24 import spacy
     26 nlp = spacy.blank("en")
---> 27 nlp.add_pipe("custom", config={"value": "mykey"})

File venv/lib/python3.11/site-packages/spacy/language.py:821, in Language.add_pipe(self, factory_name, name, before, after, first, last, source, config, raw_config, validate)
    817     pipe_component, factory_name = self.create_pipe_from_source(
    818         factory_name, source, name=name
    819     )
    820 else:
--> 821     pipe_component = self.create_pipe(
    822         factory_name,
    823         name=name,
    824         config=config,
    825         raw_config=raw_config,
    826         validate=validate,
    827     )
    828 pipe_index = self._get_pipe_index(before, after, first, last)
    829 self._pipe_meta[name] = self.get_factory_meta(factory_name)

File venv/lib/python3.11/site-packages/spacy/language.py:709, in Language.create_pipe(self, factory_name, name, config, raw_config, validate)
    706 cfg = {factory_name: config}
    707 # We're calling the internal _fill here to avoid constructing the
    708 # registered functions twice
--> 709 resolved = registry.resolve(cfg, validate=validate)
    710 filled = registry.fill({"cfg": cfg[factory_name]}, validate=validate)["cfg"]
    711 filled = Config(filled)

File venv/lib/python3.11/site-packages/confection/__init__.py:759, in registry.resolve(cls, config, schema, overrides, validate)
    750 @classmethod
    751 def resolve(
    752     cls,
   (...)
    757     validate: bool = True,
    758 ) -> Dict[str, Any]:
--> 759     resolved, _ = cls._make(
    760         config, schema=schema, overrides=overrides, validate=validate, resolve=True
    761     )
    762     return resolved

File venv/lib/python3.11/site-packages/confection/__init__.py:808, in registry._make(cls, config, schema, overrides, resolve, validate)
    806 if not is_interpolated:
    807     config = Config(orig_config).interpolate()
--> 808 filled, _, resolved = cls._fill(
    809     config, schema, validate=validate, overrides=overrides, resolve=resolve
    810 )
    811 filled = Config(filled, section_order=section_order)
    812 # Check that overrides didn't include invalid properties not in config

File venv/lib/python3.11/site-packages/confection/__init__.py:863, in registry._fill(cls, config, schema, validate, resolve, parent, overrides)
    861     schema.__fields__[key] = copy_model_field(field, Any)
    862 promise_schema = cls.make_promise_schema(value, resolve=resolve)
--> 863 filled[key], validation[v_key], final[key] = cls._fill(
    864     value,
    865     promise_schema,
    866     validate=validate,
    867     resolve=resolve,
    868     parent=key_parent,
    869     overrides=overrides,
    870 )
    871 reg_name, func_name = cls.get_constructor(final[key])
    872 args, kwargs = cls.parse_args(final[key])

File venv/lib/python3.11/site-packages/confection/__init__.py:942, in registry._fill(cls, config, schema, validate, resolve, parent, overrides)
    940 exclude_validation = set([ARGS_FIELD_ALIAS, *RESERVED_FIELDS.keys()])
    941 validation.update(result.dict(exclude=exclude_validation))
--> 942 filled, final = cls._update_from_parsed(validation, filled, final)
    943 if exclude:
    944     filled = {k: v for k, v in filled.items() if k not in exclude}

File venv/lib/python3.11/site-packages/confection/__init__.py:964, in registry._update_from_parsed(cls, validation, filled, final)
    962     final[key] = value
    963 if isinstance(value, dict):
--> 964     filled[key], final[key] = cls._update_from_parsed(
    965         value, filled[key], final[key]
    966     )
    967 # Update final config with parsed value if they're not equal (in
    968 # value and in type) but not if it's a generator because we had to
    969 # replace that to validate it correctly
    970 elif key == ARGS_FIELD:

File venv/lib/python3.11/site-packages/confection/__init__.py:976, in registry._update_from_parsed(cls, validation, filled, final)
    973     elif str(type(value)) == "<class 'numpy.ndarray'>":
    974         final[key] = value
    975     elif (
--> 976         value != final[key] or not isinstance(type(value), type(final[key]))
    977     ) and not isinstance(final[key], GeneratorType):
    978         final[key] = value
    979 return filled, final

TypeError: string indices must be integers, not 'str'

We can reproduce this in a confection-only setting with:

import confection
from confection import BaseModel   # to work with pydantic v1 or v2

class CastStrAsDict:
    @classmethod
    def validate(cls, value):
        if isinstance(value, str):
            return {value: True}
        return value
    
    @classmethod
    def __get_validators__(cls):
        yield cls.validate
    
class SectionSchema(BaseModel):
    value: CastStrAsDict
    
class MainSchema(BaseModel):
    section: SectionSchema
    
cfg = confection.Config().from_str("""
[section]
value = "ok"
""")

confection.registry.fill(cfg, schema=MainSchema)
Traceback
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[6], line 26
     19     section: SectionSchema
     21 cfg = confection.Config().from_str("""
     22 [section]
     23 value = "ok"
     24 """)
---> 26 confection.registry.fill(cfg, schema=MainSchema)

File venv/lib/python3.11/site-packages/confection/__init__.py:773, in registry.fill(cls, config, schema, overrides, validate)
    764 @classmethod
    765 def fill(
    766     cls,
   (...)
    771     validate: bool = True,
    772 ):
--> 773     _, filled = cls._make(
    774         config, schema=schema, overrides=overrides, validate=validate, resolve=False
    775     )
    776     return filled

File venv/lib/python3.11/site-packages/confection/__init__.py:808, in registry._make(cls, config, schema, overrides, resolve, validate)
    806 if not is_interpolated:
    807     config = Config(orig_config).interpolate()
--> 808 filled, _, resolved = cls._fill(
    809     config, schema, validate=validate, overrides=overrides, resolve=resolve
    810 )
    811 filled = Config(filled, section_order=section_order)
    812 # Check that overrides didn't include invalid properties not in config

File venv/lib/python3.11/site-packages/confection/__init__.py:902, in registry._fill(cls, config, schema, validate, resolve, parent, overrides)
    899     if not isinstance(field.type_, ModelMetaclass):
    900         # If we don't have a pydantic schema and just a type
    901         field_type = EmptySchema
--> 902 filled[key], validation[v_key], final[key] = cls._fill(
    903     value,
    904     field_type,
    905     validate=validate,
    906     resolve=resolve,
    907     parent=key_parent,
    908     overrides=overrides,
    909 )
    910 if key == ARGS_FIELD and isinstance(validation[v_key], dict):
    911     # If the value of variable positional args is a dict (e.g.
    912     # created via config blocks), only use its values
    913     validation[v_key] = list(validation[v_key].values())

File venv/lib/python3.11/site-packages/confection/__init__.py:942, in registry._fill(cls, config, schema, validate, resolve, parent, overrides)
    940 exclude_validation = set([ARGS_FIELD_ALIAS, *RESERVED_FIELDS.keys()])
    941 validation.update(result.dict(exclude=exclude_validation))
--> 942 filled, final = cls._update_from_parsed(validation, filled, final)
    943 if exclude:
    944     filled = {k: v for k, v in filled.items() if k not in exclude}

File venv/lib/python3.11/site-packages/confection/__init__.py:964, in registry._update_from_parsed(cls, validation, filled, final)
    962     final[key] = value
    963 if isinstance(value, dict):
--> 964     filled[key], final[key] = cls._update_from_parsed(
    965         value, filled[key], final[key]
    966     )
    967 # Update final config with parsed value if they're not equal (in
    968 # value and in type) but not if it's a generator because we had to
    969 # replace that to validate it correctly
    970 elif key == ARGS_FIELD:

File venv/lib/python3.11/site-packages/confection/__init__.py:976, in registry._update_from_parsed(cls, validation, filled, final)
    973     elif str(type(value)) == "<class 'numpy.ndarray'>":
    974         final[key] = value
    975     elif (
--> 976         value != final[key] or not isinstance(type(value), type(final[key]))
    977     ) and not isinstance(final[key], GeneratorType):
    978         final[key] = value
    979 return filled, final

TypeError: string indices must be integers, not 'str'
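
My reading of the traceback, which I have not confirmed against the full confection source: the merge step recurses as soon as the validated value is a dict, even though the raw config entry it is merged into is still the original string, so `final[key]` ends up indexing a `str` with a `str` key. The sketch below is a simplified stand-in that only mirrors the lines visible in the traceback. The function name `update_from_parsed` and the sample dicts are assumptions for illustration, not confection's actual code.

```python
def update_from_parsed(validation, filled, final):
    """Simplified stand-in for the merge step shown in the traceback.

    This is NOT confection's implementation, only a reduced model of it.
    """
    for key, value in validation.items():
        if key not in filled:
            filled[key] = value
        if key not in final:
            final[key] = value
        if isinstance(value, dict):
            # Recurses whenever the *validated* value is a dict, even when
            # filled[key] / final[key] still hold the raw string.
            filled[key], final[key] = update_from_parsed(
                value, filled[key], final[key]
            )
        elif value != final[key]:
            final[key] = value
    return filled, final


def reproduce():
    """Run the merge on data shaped like the config above and return the error."""
    validated = {"section": {"value": {"ok": True}}}  # output of the validator
    filled = {"section": {"value": "ok"}}             # raw config values
    final = {"section": {"value": "ok"}}
    try:
        update_from_parsed(validated, filled, final)
    except TypeError as err:
        return err
    return None


error = reproduce()
print(type(error).__name__, error)
```

In the innermost call, `validation` is `{"ok": True}` while `filled` and `final` are both the string `"ok"`, so the comparison `value != final[key]` evaluates `"ok"["ok"]` and raises.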

Versions:

platform     macOS-13.1-arm64-arm-64bit                         
python       3.11.3
confection   0.1.4
spacy        3.7.2
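
To make environments easier to compare, here is a small sketch that collects installed versions with the standard library's `importlib.metadata`. The helper name `installed_versions` and the package list are assumptions chosen for illustration; adjust the list as needed.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Package is not installed in this environment
            found[name] = None
    return found


# Hypothetical list of packages relevant to this report
report = installed_versions(
    ["confection", "spacy", "pydantic", "pydantic-core", "typing-extensions"]
)
for name, ver in report.items():
    print(f"{name:<20}{ver or 'not installed'}")
```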
percevalw added a commit to percevalw/confection that referenced this issue Dec 15, 2023
percevalw added a commit to percevalw/confection that referenced this issue Dec 15, 2023
@percevalw percevalw changed the title Erorr when pydantic validate outputs a dict Error when pydantic validate outputs a dict Dec 15, 2023
adrianeboyd (Contributor) commented

I'm having trouble replicating this exact error.

With pydantic v1 the example runs without errors.

With pydantic v2, I get a different error:

Traceback (most recent call last):
  File "/home/adriane/spacy/issues/confection-54.py", line 26, in <module>
    confection.registry._fill(cfg, schema=MainSchema)
  File "/tmp/venv311-1/lib/python3.11/site-packages/confection/__init__.py", line 898, in _fill
    field_type = field.type_
                 ^^^^^^^^^^^
AttributeError: 'FieldInfo' object has no attribute 'type_'

Which versions of pydantic and all its dependencies are you using?

percevalw added a commit to percevalw/confection that referenced this issue Dec 19, 2023
percevalw (Author) commented Dec 19, 2023

Oops, my apologies for the confusion, I must have mixed things up. After replacing key = 1 with key='ok' in the snippet, the error appears (the caster only converts strings). I've updated my comment above to correct this. The second error disappears when BaseModel is imported from confection, which uses the deprecated pydantic.v1.BaseModel under pydantic v2.
The error shows up with both pydantic v1 (1.10.7) and v2 (2.5.2).
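
One possible direction for a fix, offered as an assumption and not necessarily what the linked PR does: only recurse when the existing config entries are dicts too, and otherwise treat the validated dict as a plain replacement value. The sketch below uses a reduced model of the merge step; `update_from_parsed_guarded` and the sample data are hypothetical names for illustration, not confection's code.

```python
def update_from_parsed_guarded(validation, filled, final):
    """Reduced model of the merge step with a type guard before recursing.

    Hypothetical sketch, not confection's implementation.
    """
    for key, value in validation.items():
        if key not in filled:
            filled[key] = value
        if key not in final:
            final[key] = value
        if (
            isinstance(value, dict)
            and isinstance(filled[key], dict)
            and isinstance(final[key], dict)
        ):
            # Safe to merge key by key: both sides are mappings
            filled[key], final[key] = update_from_parsed_guarded(
                value, filled[key], final[key]
            )
        elif value != final[key]:
            # The validator changed the value (e.g. str -> dict): replace it
            final[key] = value
    return filled, final


validated = {"section": {"value": {"ok": True}}}  # output of the validator
filled, final = update_from_parsed_guarded(
    validated,
    {"section": {"value": "ok"}},  # raw config values
    {"section": {"value": "ok"}},
)
print(filled)
print(final)
```

In this model the filled config keeps the raw string the user wrote, while the final (resolved) config carries the cast dict. Whether that split is the desired behaviour in confection is a design question for the maintainers.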

percevalw added a commit to percevalw/confection that referenced this issue Dec 19, 2023
@svlandeg svlandeg added the bug Something isn't working label Jan 25, 2024