Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should Port.validate fail if required=False and receives None which does not match valid_type? #226

Open
sphuber opened this issue Nov 9, 2021 · 4 comments

Comments

@sphuber
Copy link
Collaborator

sphuber commented Nov 9, 2021

Original issue reported in aiida-common-workflows:

Consider the following example of a simple port that is not required and expects a string:

In [1]: from plumpy import InputPort

In [3]: port = InputPort('test', valid_type=str, required=False)

In [4]: port.validate('string')

In [5]: port.validate(None)
Out[5]: plumpy.ports.PortValidationError("Error occurred validating port 'test': value 'test' is not of the right type. Got '<class 'NoneType'>', expected '<class 'str'>'")

When we validate None as a value, we hit the validation error, just as in the EOS workchain. Except that None should be a perfectly legal value if the port is optional. It seems weird to raise for this. The only reason I can think off to keep the current behavior is that it might be necessary to have an optional port but where an explicit None value is incorrect and so if None is passed explicitly (instead of the input being omitted entirely) you want the port to raise. But I am wondering if this is not an edge-case and in that case the develop should simply explicitly check for None in a validator or the process body and ignore it.

@sphuber
Copy link
Collaborator Author

sphuber commented Nov 9, 2021

@muhrin @chrisjsewell thoughts?

@muhrin
Copy link
Collaborator

muhrin commented Nov 9, 2021

Thanks Seb, I was actually just writing a response on the issue.

Essentially I agree with your analysis. All of this comes down to an original choice I made which is best demonstrated with dictionaries (which input port hierarchies emulate).

Essentially, should this:

input = {
    'value': None
}

be considered the same as:

input = {}

i.e. should the lack of a key in a dictionary be equivalent to the key being there with a value of None. This is equivalent to saying 'Should None be treated as a value or as the absence of a value?'

My instinct is that None should be treated as a value specifically to allow one to distinguish between the two dictionaries above. Other languages deal with this using a third type, like undef which accounts for this case. I think in plumpy we, internally at least, use an UNSET singleton for this case.

I don't remember if it's possible in plumpy to do (following from your example):

port.validate()

which should be True in this case.

To summarise, as it stands, I still prefer the original behaviour but I'm not opposed to the change. I would just caution that this assumption has certain consequences that aren't necessarily obvious and may appear in other parts of the code and so care should be taken to make sure the behaviour is consistent throughout.

@sphuber
Copy link
Collaborator Author

sphuber commented Nov 11, 2021

My instinct is that None should be treated as a value specifically to allow one to distinguish between the two dictionaries above. Other languages deal with this using a third type, like undef which accounts for this case. I think in plumpy we, internally at least, use an UNSET singleton for this case.

Indeed we do. We define UNSPECIFIED = () which we use as a default for unspecified keys in the pre-processing and validation of port namespaces.

I don't remember if it's possible in plumpy to do (following from your example):

port.validate()

which should be True in this case.

This actually doesn't work because Port.validate requires a value as the first argument:

In [1]: from plumpy import InputPort

In [2]: port = InputPort('test', valid_type=str, required=False)

In [3]: port.validate()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-3-5f493d0fc7eb> in <module>
----> 1 port.validate()

TypeError: validate() missing 1 required positional argument: 'value'

Note that for the PortNamespace.validate it is possible to pass None as that is the default. So there is a bit of a discrepancy here. Maybe we should change the signature for Port.validate to

def validate(self, value=UNSPECIFIED, ...):

which should then make the above example work.

To summarise, as it stands, I still prefer the original behaviour but I'm not opposed to the change. I would just caution that this assumption has certain consequences that aren't necessarily obvious and may appear in other parts of the code and so care should be taken to make sure the behaviour is consistent throughout.

This is the main point of doubt for me. Changing this would indeed most likely have potentially very subtle consequences. Not sure if it is worth it. Maybe it is best to keep the current convention and close this issue with an explicit description in the documentation.

@chrisjsewell
Copy link
Member

Other languages deal with this using a third type, like undef

yeh never found a good/defacto solution to this in python

Maybe we should change the signature for Port.validate to value=UNSPECIFIED

that makes sense to me

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants