Task: Reconfigure the Docker setup to incorporate the timeseriesdb extension #997
Comments
I'd love to take a shot at this one and get involved with the project in general. After talking last night and looking through the issue I'm starting to get a feel for it. One caveat: I'll be out of town this weekend and won't be able to really sink my teeth into it until the latter part of next week. Will that be an issue?
Not an issue at all, @rmartinsen! I've assigned the ticket to you.
I've been rethinking the development process and realized I was overcomplicating things when we talked the other night. Rather than trying to swap out the database + container in one go, it's likely easier for you to create a new container for this timescale database and integrate that into our existing
For now, you can create a new container for the TimescaleDB database (with the PostGIS and TimescaleDB extensions) and integrate it into our existing
While you're working on the new database container, I'll write a new version of
After that, we can:
Summary of Tasks
Once these two parts are ready, we’ll work together to validate the new ETL script by writing data to the new database. Afterward, I’ll handle deployment to our VM, set up backups, and finalize the migration. Does this approach make sense to you?
@nlebovits That sounds like a plan. I'll get started on it over the next few days.
@rmartinsen see #1014. The new ETL pipeline is outlined there, and you should be able to integrate it directly into the new Docker container + pg database.
@nlebovits I put up a PR with the changes we discussed. It looks pretty similar to what you'd put together, just with a separate postgres container. Let me know if this is what you're looking for.
Just saw the other comment about #1014. I'll look at integrating that next.
Add TimescaleDB Extension and Configure Hypertables for Time Series Analysis
Describe the Task
Our current PostgreSQL instance is not optimized for time series data. Currently, the script dumps the existing postgres schema into a backup schema named with the date it was created, and then creates new data. Instead, we should switch to using a single schema with the timescaledb extension. Then we can convert our main tables to hypertables, partitioned monthly, instead of creating a bunch of backup schemas. Additionally, we want to implement data compression policies for data older than one year to optimize storage.
As an optional improvement, consider adding spatial indexing to tables containing geospatial data. If this is implemented, please document the process and any decisions made.
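To make the hypertable and compression steps described above more concrete, here is a minimal sketch that builds the SQL for one table. It is only an illustration under stated assumptions: the table and column names (example_table, recorded_at, geometry) and the helper hypertable_setup_sql are hypothetical and not taken from this repo, and it uses TimescaleDB's classic create_hypertable call with a one-month chunk interval plus add_compression_policy with a one-year threshold. The optional GiST index assumes PostGIS is installed and the column is a geometry type.

```python
import re
from typing import List, Optional

# Only plain identifiers are accepted, since names are interpolated into SQL text.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def hypertable_setup_sql(
    table: str, time_column: str, geom_column: Optional[str] = None
) -> List[str]:
    """Build the SQL statements for one table, in the order they should run.

    Hypothetical helper for illustration; it only returns strings and does
    not connect to a database.
    """
    for name in (table, time_column, geom_column):
        if name is not None and not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"not a plain SQL identifier: {name!r}")

    statements = [
        # Enable the extension once per database.
        "CREATE EXTENSION IF NOT EXISTS timescaledb;",
        # Convert the table to a hypertable with chunks of about one month;
        # migrate_data moves any rows already in the table into chunks.
        f"SELECT create_hypertable('{table}', '{time_column}', "
        "chunk_time_interval => INTERVAL '1 month', migrate_data => TRUE);",
        # Compression has to be enabled on the table before a policy is added.
        f"ALTER TABLE {table} SET (timescaledb.compress);",
        # Compress chunks once their data is older than one year.
        f"SELECT add_compression_policy('{table}', INTERVAL '1 year');",
    ]
    if geom_column is not None:
        # Optional spatial index (assumes PostGIS and a geometry column).
        statements.append(
            f"CREATE INDEX IF NOT EXISTS {table}_{geom_column}_gist "
            f"ON {table} USING GIST ({geom_column});"
        )
    return statements


if __name__ == "__main__":
    for stmt in hypertable_setup_sql("example_table", "recorded_at", "geometry"):
        print(stmt)
```

The printed statements could be pasted into an init SQL script or run from the ETL script through whatever database driver the project already uses. Generating them per table keeps the chunk interval and compression threshold in one place if several main tables need the same treatment.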
Another optional but beneficial addition is setting up pg_stat to monitor query performance and track table growth over time.
Acceptance Criteria
timescaledb extension to the PostgreSQL instance.
pg_stat to monitor query performance and table growth over time.
Additional Context
timescaledb and PostgreSQL configurations for time series analysis.
Existing Work
I've already put some work into this. Here is:
docker-compose.yml
Dockerfile
Dockerfile-pg
init_pq.sql
script.py
Please create a draft PR that includes strong documentation, including:
timescaledb and the setup of hypertables.
pg_stat setup).