Create JSON schemas for import #356
Comments
A problem I've run into: JSON schemas are made to validate data in JSON format. Ergo, before we can validate data we need to load it and convert it to JSON. The use of CSV presents something of a problem:
We have a first version of JSON schemas for data import. We also have an initial version of a function that checks a CSV file against the JSON schemas (
Once the JSON schemas are being used, we can delete the view
Current problem: The import script uses the Python csv library to read the CSV files into memory as Python dictionaries. It then runs the Python jsonschema library against those dictionaries to check validity. The problem comes with the automatic data type conversion:
So, either all fields will be loaded as strings, or we need to ensure that string fields are quoted and non-string fields are not quoted.
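To make the two behaviours concrete, here is a minimal sketch using only the standard library. The sample data and the read_rows helper are hypothetical and are not taken from the import script. With the default settings every value arrives as a string; with csv.QUOTE_NONNUMERIC, unquoted values are converted to floats, which only works if every string field (including the header row) is quoted.

```python
import csv
import io

# Hypothetical sample: string fields (and headers) are quoted, numbers are not.
SAMPLE = '"site","depth"\n"A1",2.5\n'


def read_rows(text, quoting=csv.QUOTE_MINIMAL):
    """Parse CSV text into a list of dicts using the given quoting mode."""
    return list(csv.DictReader(io.StringIO(text), quoting=quoting))


# Default behaviour: every field is loaded as a string.
default_rows = read_rows(SAMPLE)

# QUOTE_NONNUMERIC: unquoted fields are converted to float on read.
typed_rows = read_rows(SAMPLE, quoting=csv.QUOTE_NONNUMERIC)

print(default_rows)  # depth is the string '2.5'
print(typed_rows)    # depth is the float 2.5
```

Note that with csv.QUOTE_NONNUMERIC an unquoted string field raises ValueError during reading, so this route depends on the files being quoted consistently.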
Alternatives:
I've decided to go for option 2. The idea is to load the CSV as strings, use the JSON schema to determine which attributes should have the "number" data type, cast those attributes to floats, and finally check the data against the JSON schema.
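A rough sketch of that idea, not the code in the draft MR. The schema, the field names and the helper names (number_fields, cast_numbers, check_csv) are all made up for illustration, and it assumes a flat schema with one "properties" entry per CSV column. Values that cannot be cast are left as strings so that the schema check reports them rather than the loader crashing.

```python
import csv
import io

# Hypothetical schema for one CSV row; the field names are illustrative only.
ROW_SCHEMA = {
    "type": "object",
    "properties": {
        "site": {"type": "string"},
        "depth": {"type": "number"},
    },
    "required": ["site", "depth"],
}


def number_fields(schema):
    """Return the property names whose declared type is "number"."""
    return {
        name
        for name, spec in schema.get("properties", {}).items()
        if spec.get("type") == "number"
    }


def cast_numbers(row, schema):
    """Return a copy of row with "number" fields cast to float where possible."""
    result = dict(row)
    for name in number_fields(schema):
        value = result.get(name)
        if value is None:
            continue  # missing column: let the schema's "required" handle it
        try:
            result[name] = float(value)
        except ValueError:
            pass  # keep the string so validation flags it as a non-number
    return result


def check_csv(text, schema):
    """Load CSV text as strings, cast numbers, and collect validation errors."""
    import jsonschema  # third-party; only needed for the validation step

    validator = jsonschema.Draft7Validator(schema)
    errors = []
    rows = csv.DictReader(io.StringIO(text))
    for row_no, row in enumerate(rows, start=1):
        for error in validator.iter_errors(cast_numbers(row, schema)):
            errors.append((row_no, error.message))
    return errors
```

Calling check_csv("site,depth\nA1,2.5\nB2,deep\n", ROW_SCHEMA) should then return an error for the second data row only, since "deep" cannot be cast and stays a string.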
Draft MR for me to continue working on next sprint: https://kwvmxgit.ad.nerc.ac.uk/bmgf-maps/data/db-test-data/-/merge_requests/80
We want to create a simple way for scientists to check whether their CSV files can be imported into the database/system. JSON schemas look like a good way of doing this.