[meta issue] HXL and data directly from and to SQL databases #10
**Maybe most databases accept HXL hashtags without changes**

Good thing: PostgreSQL actually accepts `#` as the first character of a column name (also tested with MariaDB, so there may be others). Until last week, as far as I knew, only SQLite accepted almost anything as a column name, so I was concerned about what to use to replace `#` and `+` on almost every other database engine. But since current databases may actually allow both `#` and `+`, this simplifies things a lot!

**Tests with csvsql**

The csvsql tool uses SQLAlchemy. The file is actually not that complex, so in the worst-case scenario we could just implement the same thing. (As a reference: as expected, csvsql exporting from generic CSV files may be OK, but I have not yet tested whether the types would be more generic.) BUT since in the HXLMeta issue we're already mapping more exact StorageTypes (and this is likely to take much more time to get right), I think that for exporting from HXLated datasets to the most common SQL databases we may not need something like SQLAlchemy at all. But for importing into HXL tools, some abstraction like SQLAlchemy (at least for Python HXL tools) is definitely worth looking at.
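A minimal sketch of both points, assuming the SQLAlchemy 1.4+ API; the table and column names here are hypothetical, and the in-memory SQLite URL only stands in for the PostgreSQL/MariaDB case discussed above:

```python
# Hedged sketch, not a definitive implementation: column names are literal HXL
# hashtags; SQLAlchemy quotes the identifiers, so '#' and '+' survive as-is.
from sqlalchemy import Column, MetaData, String, Table, create_engine, insert, select

engine = create_engine("sqlite://")  # e.g. "postgresql+psycopg2://user:pass@host/db"
metadata = MetaData()

hxl_table = Table(
    "hxl_example",                   # hypothetical table name
    metadata,
    Column("#adm1+code", String),
    Column("#adm1+name", String),
)
metadata.create_all(engine)

with engine.begin() as conn:
    conn.execute(insert(hxl_table), [{"#adm1+code": "BR-SP", "#adm1+name": "São Paulo"}])

with engine.connect() as conn:
    for row in conn.execute(select(hxl_table)):
        print(dict(row._mapping))    # {'#adm1+code': 'BR-SP', '#adm1+name': 'São Paulo'}
```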
**SQLite as a potential alternative for a local collection of taxonomies**

In addition to country/territory codes (and without resorting to loading the entire P-Codes for local usage), there are some taxonomies (at least the one for language codes) that I think would eventually be useful to have near the computer running complex inferences. On @HXL-CPLP we're already drafting taxonomies like the words used to represent true/false in different languages, so maybe it would be possible to mark some taxonomies as important enough that the user could build their own cache. One good initial candidate could be booleans (using 2-letter ISO codes as a namespace, something like +v_un_bool for the 6 UN languages and +v_eu_bool for a draft of 20+ European ones), plus some way for a person to "merge" more than one external source of reference. I'm not fully sure this would really be necessary (and, in fact, for a few tables, even a folder with plain HXLated CSVs would work). But for cases like the booleans, a single canonical table would not be ideal (if not because of the user, then because it makes on-the-fly implementation harder). Anyway, in both cases (local SQLite or CSVs), something that could "build" a local database that persists across executions (and that could also work offline, so that even under heavy use Google Drive could not get rate limited or blocked) seems a total win-win.
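A rough sketch of that persistent local cache, assuming only the stdlib sqlite3 module and an HXLated CSV whose second row holds the hashtags; the file, table, and column layout below are made up for illustration:

```python
# Hypothetical sketch: build (once) a local SQLite cache from an HXLated CSV of
# boolean terms and reuse it on later runs, offline and across executions.
import csv
import sqlite3
from pathlib import Path

CACHE_DB = Path("hxl-taxonomies-cache.sqlite")   # persists across executions
SOURCE_CSV = Path("boolean-terms.hxl.csv")       # example HXLated CSV


def build_cache(db_path: Path, csv_path: Path) -> None:
    """Create the cache table from the HXLated CSV (text headers, then hashtag row)."""
    with csv_path.open(newline="", encoding="utf-8") as f:
        rows = list(csv.reader(f))
    hashtags = rows[1]          # assumption: second row is the HXL hashtag row
    data = rows[2:]
    con = sqlite3.connect(db_path)
    with con:
        cols = ", ".join(f'"{h}" TEXT' for h in hashtags)   # SQLite accepts quoted '#'/'+'
        con.execute(f'CREATE TABLE IF NOT EXISTS "taxonomy_bool" ({cols})')
        placeholders = ", ".join("?" for _ in hashtags)
        con.executemany(f'INSERT INTO "taxonomy_bool" VALUES ({placeholders})', data)
    con.close()


def get_connection() -> sqlite3.Connection:
    """Reuse the cache if it already exists; rebuild it from the local CSV otherwise."""
    if not CACHE_DB.exists():
        build_cache(CACHE_DB, SOURCE_CSV)
    return sqlite3.connect(CACHE_DB)
```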
This issue is a draft. Some extra information may be edited/added later.