Data Flow #27

Patowhiz · 2024-03-15T07:43:36Z

Overview:

This proposal aims to standardise the data flow within Climsoft from initial entry to the generation of final products, emphasizing the need for consistent data source identification, unified storage, robust QC checks, and transparent logging for auditability.

Detailed Description:

Data Ingestion:
Data is ingested through three source types, each of which defines an ingestion method:

Forms: Allow users to manually input data via forms, capturing real-time observations.
Machine: Enable automated data capture from instruments and sensors.
Import: Provide functionality for batch imports of data from external sources.

Each entry method must clearly document the source of the data to ensure traceability. Each data source is associated with exactly one source type.
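To make the source/source-type relationship concrete, here is a minimal sketch in Python. All names (`SourceType`, `Source`, `Entry`, the field names and the sample station) are hypothetical and only illustrate the idea that an entry cannot exist without a source, and a source always carries a source type. It is not Climsoft's actual model.

```python
from dataclasses import dataclass
from enum import Enum


class SourceType(Enum):
    # The three source types listed in this proposal.
    FORM = "form"
    MACHINE = "machine"
    IMPORT = "import"


@dataclass(frozen=True)
class Source:
    # A concrete data source, e.g. one entry form or one import definition.
    source_id: int
    name: str
    source_type: SourceType


@dataclass(frozen=True)
class Entry:
    # A single ingested value; `source` has no default, so it is mandatory.
    station_id: str
    element: str
    value: float
    source: Source


daily_form = Source(1, "Daily rainfall form", SourceType.FORM)
entry = Entry("STN001", "precip", 12.4, daily_form)
```

Because `source` is a required field, any code path that creates an entry has to say where the data came from, which is the traceability requirement above.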

Observations Table:

Centralise data storage by saving entries from all sources into one Observations table, maintaining data in its original form.
Ensure that the Observations table structure is conducive to identifying and querying the data source.
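As a rough illustration of what "conducive to identifying and querying the data source" could look like, the sketch below uses Python's built-in `sqlite3` with an in-memory database. The table layout, column names and sample rows are assumptions made for the example, not the actual Climsoft schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sources (
    id          INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    source_type TEXT NOT NULL
        CHECK (source_type IN ('form', 'machine', 'import'))
);
CREATE TABLE observations (
    station_id   TEXT NOT NULL,
    element      TEXT NOT NULL,
    obs_datetime TEXT NOT NULL,
    value        REAL,
    source_id    INTEGER NOT NULL REFERENCES sources(id),
    PRIMARY KEY (station_id, element, obs_datetime, source_id)
);
""")

# Hypothetical sample data: one form source and one import source.
conn.execute("INSERT INTO sources VALUES (1, 'Daily form', 'form')")
conn.execute("INSERT INTO sources VALUES (2, 'Logger file', 'import')")
conn.executemany(
    "INSERT INTO observations VALUES (?, ?, ?, ?, ?)",
    [
        ("STN001", "precip", "2024-03-15T06:00:00Z", 12.4, 1),
        ("STN001", "tmax", "2024-03-15T06:00:00Z", 29.5, 2),
    ],
)


def observations_by_source_type(connection, source_type):
    """Return (station, element, value) rows that came from one source type."""
    rows = connection.execute(
        """
        SELECT o.station_id, o.element, o.value
        FROM observations o
        JOIN sources s ON s.id = o.source_id
        WHERE s.source_type = ?
        ORDER BY o.obs_datetime, o.element
        """,
        (source_type,),
    )
    return rows.fetchall()
```

With the source stored on every row, a single join answers "which observations came from imports?" without needing separate tables per entry method.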

Quality Control (QC) Protocol:

Establish a comprehensive QC protocol that scrutinises data for accuracy and consistency.
Make corrections within the Observations table, allowing for real-time data integrity enhancement.
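One possible shape for a QC check with in-place correction is sketched below. The range limits, the `qc_status` values and the function names are all placeholders chosen for the example; the real QC tests and thresholds would need to be agreed separately.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    station_id: str
    element: str
    value: float
    qc_status: str = "pending"  # assumed states: pending, passed, failed


# Hypothetical plausibility limits per element (lower, upper).
LIMITS = {"tmax": (-60.0, 60.0), "precip": (0.0, 500.0)}


def range_check(obs: Observation) -> bool:
    """Flag the observation as passed or failed against its element limits."""
    low, high = LIMITS[obs.element]
    obs.qc_status = "passed" if low <= obs.value <= high else "failed"
    return obs.qc_status == "passed"


def correct(obs: Observation, new_value: float) -> bool:
    """Apply a correction to the same record, then re-run the check."""
    obs.value = new_value
    obs.qc_status = "pending"
    return range_check(obs)


# A likely keying error (295.0 instead of 29.5) fails, then passes once corrected.
obs = Observation("STN001", "tmax", 295.0)
first_result = range_check(obs)
second_result = correct(obs, 29.5)
```

The correction is made on the record itself rather than on a copy, so the observation always reflects the latest QC outcome.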

Logging and Audit Trails:

Create a robust logging system that captures every action taken on the data, including QC checks and edits.
Ensure that data change logs and QC test logs are transparent and easily retrievable for audit purposes.
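A simple way to picture the two logs is an append-only list that every edit and every QC test writes to. The sketch below is only an illustration: the entry fields, the `key` format and the function names are assumptions, and a real implementation would persist the log in the database.

```python
from datetime import datetime, timezone


def _now() -> str:
    # UTC timestamp in ISO 8601 form.
    return datetime.now(timezone.utc).isoformat()


def apply_edit(observation: dict, new_value: float, user: str, log: list) -> None:
    """Record the old and new value before changing the observation."""
    log.append({
        "action": "edit",
        "key": observation["key"],
        "old_value": observation["value"],
        "new_value": new_value,
        "user": user,
        "at": _now(),
    })
    observation["value"] = new_value


def record_qc(observation: dict, test_name: str, passed: bool, log: list) -> None:
    """Record the outcome of one QC test against the observation."""
    log.append({
        "action": "qc",
        "key": observation["key"],
        "test": test_name,
        "passed": passed,
        "at": _now(),
    })


def history(log: list, key: str) -> list:
    """Return every log entry for one observation, oldest first."""
    return [entry for entry in log if entry["key"] == key]


audit_log = []
obs = {"key": "STN001/tmax/2024-03-15T06:00:00Z", "value": 295.0}
record_qc(obs, "range_check", False, audit_log)
apply_edit(obs, 29.5, "qc_officer", audit_log)
record_qc(obs, "range_check", True, audit_log)
```

Since the old value is stored with each edit, the original entry can still be recovered from the log after a correction.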

Final Product Generation:

Define a clear pathway for data to be classified as 'final' post-QC for use in Climsoft's product generation.
Emphasize that final products are based on the highest quality, QC-verified observations.
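The sketch below shows one way the 'final' label could gate product generation. The rule used here (an observation is final once its QC status is "passed") is only an assumed placeholder for whatever criteria are eventually agreed, and the field and function names are hypothetical.

```python
def mark_final(observations: list) -> None:
    """Label observations as final; here, final means QC passed (assumed rule)."""
    for obs in observations:
        obs["final"] = obs["qc_status"] == "passed"


def final_only(observations: list) -> list:
    """Return only the observations that products are allowed to use."""
    return [obs for obs in observations if obs.get("final")]


def total(observations: list) -> float:
    """Example product: a total computed from final observations only."""
    return sum(obs["value"] for obs in final_only(observations))


rainfall = [
    {"value": 10.0, "qc_status": "passed"},
    {"value": 20.0, "qc_status": "passed"},
    {"value": 999.0, "qc_status": "failed"},  # excluded from the product
]
mark_final(rainfall)
product = total(rainfall)
```

Routing every product through a single filter like `final_only` keeps unverified values out of the outputs by construction.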

Proposal for Enhancements:

Streamlined Data Entry:
- Formalize data entry procedures that require source identification for every data input.
Quality Control Reinforcement:
- Implement a unified QC system that is both rigorous and standardized across all data types.
Auditability and Transparency:
- Develop an enhanced logging system for full transparency and accountability of data modifications and QC results.
Finality in Product Creation:
- Introduce criteria within Climsoft to determine and label data as 'final' for the production of climatological outputs.

Rationale:

The integrity of Climsoft's data and the trust in its climatological products hinge on a clear, accountable, and verifiable data management process. This proposal seeks to reinforce these aspects, ensuring Climsoft remains a reliable and authoritative tool for meteorological and hydrological data processing.

Request for Team Feedback:

I request feedback from the development community to refine this proposal. Contributions from the development team are essential to the successful enhancement of Climsoft's data workflow.

Patowhiz · 2024-06-03T07:35:45Z

After reflecting on this, I think we can assume that there are only 2 primary source types: Form and Import.
Import will be data that comes from a file (through http, ftp etc) or an API.

This means there is no need for the Machine/Digital data source. Note also that automatic stations record data and save it to their data loggers, and it is from these data loggers that we can import the data. So there is no need for a machine-to-machine concept.
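A small sketch of the reduced model, assuming hypothetical channel names: anything arriving as a file or through an API maps to Import, and only manual entry maps to Form. The channel labels ("http", "ftp", "api") are illustrative, not an agreed list.

```python
from enum import Enum


class SourceType(Enum):
    FORM = "form"
    IMPORT = "import"


# Assumed channel labels; files (over http, ftp etc) and APIs all count as imports.
IMPORT_CHANNELS = {"http", "ftp", "api"}


def source_type_for(channel: str) -> SourceType:
    """Map an arrival channel to one of the two source types."""
    if channel == "form":
        return SourceType.FORM
    if channel in IMPORT_CHANNELS:
        return SourceType.IMPORT
    raise ValueError(f"unknown channel: {channel}")
```

Data logger files from automatic stations would arrive through one of the import channels, so they need no source type of their own.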

Patowhiz changed the title ~~Data Flow in Climsoft~~ Climsoft Data Flow Mar 18, 2024

Patowhiz changed the title ~~Climsoft Data Flow~~ Data Flow Mar 30, 2024
