Our CKAN implementation indexes datasets across a variety of domains. Our schema defines a core set of fields that every dataset must specify. To let dataset authors add key-value metadata beyond that core set, the schema also has a field called custom_fields that uses repeating_subfields to hold additional key-value pairs associated with the dataset. The key subfield has a validator that checks that the key's name meets certain requirements.
We want to block users from specifying the same key value in multiple entries in custom_fields. The obvious way to do this is to define a validator function, keys_are_unique(), that can be applied to custom_fields to verify that the keys are unique.
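For illustration, here is a minimal sketch of what such a validator could look like. It assumes that the validator receives custom_fields as a list of dicts, each with a "key" entry, which may not match how CKAN actually passes the data. The Invalid class is a local stand-in so the sketch runs without CKAN installed; in a real extension it would be ckan.plugins.toolkit.Invalid.

```python
from collections import Counter


class Invalid(Exception):
    """Local stand-in for ckan.plugins.toolkit.Invalid (assumption, so this runs without CKAN)."""


def keys_are_unique(value):
    """Return value unchanged if every entry's "key" is distinct, else raise Invalid.

    Assumes value is a list of dicts such as {"key": ..., "value": ...}.
    """
    # Count each key, skipping entries whose key is missing or empty.
    counts = Counter(entry["key"] for entry in value if entry.get("key"))
    duplicates = sorted(str(key) for key, n in counts.items() if n > 1)
    if duplicates:
        raise Invalid("Duplicate custom_fields keys: " + ", ".join(duplicates))
    return value
```

Returning the value unchanged follows the single-argument validator convention, where a validator either returns the (possibly converted) value or raises Invalid.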
After trying this and spending some time in my debugger, I discovered that the extension initially saves the top-level field's validators, but then replaces them with the subfields' validators. When this happens, the data type of the validators also changes from list[str] to dict. I haven't yet had time to dig into why or how this works in the grand scheme of CKAN dataset creation, but that's probably my next step.
One example of this behavior is in the _field_validators() function in ckanext/scheming/plugins.py.
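The behavior described above can be reproduced with a toy model. This is not the ckanext-scheming source: build_validators() is a hypothetical function, key_name_ok is a made-up validator name, and the ignore_missing default is an assumption. It only mirrors the two observations from the debugger, namely that the top-level validators are dropped and that the return type changes from a list to a dict.

```python
def build_validators(field):
    """Toy model of the reported behavior (hypothetical, not the ckanext-scheming code)."""
    # A field's validators are given as a space-separated string of names.
    validators = field.get("validators", "ignore_missing").split()

    if "repeating_subfields" in field:
        # The list built above is discarded and replaced by a dict that maps
        # each subfield name to that subfield's own validators.
        validators = {
            subfield["field_name"]: build_validators(subfield)
            for subfield in field["repeating_subfields"]
        }
    return validators


custom_fields = {
    "field_name": "custom_fields",
    "validators": "keys_are_unique",
    "repeating_subfields": [
        {"field_name": "key", "validators": "not_empty key_name_ok"},
        {"field_name": "value"},
    ],
}

result = build_validators(custom_fields)
print(type(result).__name__)  # dict
print(result)
```

In this model, keys_are_unique never appears in the result, which matches the symptom that a validator set on a field with repeating_subfields is silently ignored.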
As a user, I find this behavior unexpected -- I would expect that validators can be applied to both the top-level field and its subfields. I think it's worth updating the documentation for repeating_subfields and/or validators to explain that if a field has repeating_subfields, then validators for that field are ignored in favor of the validators defined for the subfields.
I also have two specific questions; any insight would be appreciated:
We are trying to allow users to add an arbitrary number of "extra" key-value pairs to a dataset that is defined by a ckanext-scheming schema. Is our current approach (custom_fields field with repeating_subfields) the "correct" way to do what we are trying to do, or is there another method that is better supported, either by ckanext-scheming or by CKAN itself?
Assuming that what we are doing is correct, roughly how difficult would it be to modify ckanext-scheming to support validation of both the subfields and the top-level field? Any tips on what we would need to do to support that? It almost seems like this would require changes to the way the data is modelled, which I assume would be nontrivial.
Environment
CKAN version: 2.10.1
ckanext-scheming version: release-3.0.0
Referenced code: ckanext-scheming/ckanext/scheming/plugins.py, lines 582 to 608 in 8646a9d