We get occasional requests for data formatted differently than our bulk exports. For example:
This has also uncovered some lingering data quality issues:
My original thought was to have more people use Elasticsearch directly (https://github.com/Safecast/safecastapi/wiki/Data-Sets#kibana--elasticsearch-access).
But the CSV export support is not great, and folks asking for the data seem to be much more familiar with Postgres/PostGIS.
I had also hoped to do more with S3 & Athena in this space, but as far as I can tell Athena has no support for linear distance queries, only cartesian distance (a query for radiation within 100 units would cover a different number of meters depending on how far north or south you are).
And finally, there was hope that Postgres replicas could help us here, but (0) they don't support temp tables, (1) they can't be made public, and (2) hard queries cause replication lag and ultimately fail out.
Opening this to brainstorm ideas about how we could more easily provide a clean data set in a flexible format people are generally familiar with.
Some ideas:
A nightly job that copies/packages data into a public RDS snapshot (or a public RDS instance)
Some data cleanup (or at least labeling) for quality issues stemming from known bugs
https://github.com/openaq/oh-snap might help if we want to do a public RDS snapshot (though it sounds like no one is really using the one openaq provides)
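To make the linear versus cartesian distance point concrete, here is a minimal Python sketch. It is only an illustration, not anything from the Safecast codebase: the function names are made up, and it assumes a spherical Earth with a mean radius of 6,371 km (the haversine formula). It compares one degree of longitude at the equator with one degree of longitude at 60° north.

```python
import math

EARTH_RADIUS_M = 6_371_000  # assumed mean Earth radius in meters (spherical model)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def cartesian_deg(lat1, lon1, lat2, lon2):
    """Naive cartesian distance in raw degree units, ignoring the Earth's shape."""
    return math.hypot(lat2 - lat1, lon2 - lon1)


# One degree of longitude is the same "1 unit" in cartesian terms everywhere...
units_equator = cartesian_deg(0.0, 0.0, 0.0, 1.0)
units_north = cartesian_deg(60.0, 0.0, 60.0, 1.0)

# ...but it covers roughly 111 km at the equator and roughly half that at 60° north.
meters_equator = haversine_m(0.0, 0.0, 0.0, 1.0)
meters_north = haversine_m(60.0, 0.0, 60.0, 1.0)

print(units_equator, units_north)
print(round(meters_equator), round(meters_north))
```

So a fixed radius in degree units selects a very different area on the ground depending on latitude, which is why a distance query in meters is preferable for this data.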