Skip to content

Commit

Permalink
Single doc reference for all folders and files
Browse files Browse the repository at this point in the history
  • Loading branch information
bpbond committed Oct 16, 2024
1 parent f7418f0 commit 7e69fb9
Show file tree
Hide file tree
Showing 2 changed files with 129 additions and 0 deletions.
26 changes: 26 additions & 0 deletions synoptic/docs/system-files.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
# List the 'important' files in the process system
# This list is then used to build the docs/structure.md file

files <- list.files(recursive = TRUE, full.names = TRUE, include.dirs = TRUE)

files <- files[grep("^./data/.+", files, invert = TRUE)]

files <- files[grep("^./data_TEST/Raw/.+", files, invert = TRUE)]
files <- files[grep("^./data_TEST/L0/.+", files, invert = TRUE)]
files <- files[grep("^./data_TEST/L1_normalize/.+", files, invert = TRUE)]
files <- files[grep("^./data_TEST/L1/.+", files, invert = TRUE)]
files <- files[grep("^./data_TEST/Logs/.+", files, invert = TRUE)]

files <- files[grep("^./L0_files/.+", files, invert = TRUE)]
files <- files[grep("^./L1_normalize_files/.+", files, invert = TRUE)]
files <- files[grep("^./L1_files/.+", files, invert = TRUE)]

files <- files[grep("html$", files, invert = TRUE)]
files <- files[grep("README.md$", files, invert = TRUE)]

x <- data.frame(dir = dirname(files), file = basename(files))
x <- x[order(x$dir),]


writeLines(paste(paste0("`", x$file, "`"), "| description"), "files.txt")

103 changes: 103 additions & 0 deletions synoptic/docs/system-files.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,103 @@
# system-files.md

This file describes the files in the sensor data processing system.
It excludes data files, temporary Quarto output files, and folder README files.

The sections folder are organized by **folder name** and then, within
each, is a list of the files in that folder and accompanying descriptions.

## ./synoptic

The root of the system, holding main Quarto files, helper files, and the driver script.

File | Description
---- | -------------
`data` | Test data folder that holds inputs (raw data), outputs, and logs; see below
`data_TEST` | Test data folder that holds inputs (raw data), outputs, and logs; see below
`docs` | Documentation folder; see below
`driver.R` | A 'driver' script that runs all the Quarto files sequentially and by default in parallel mode, which is fast but generates no logs
`flag-database.R` | Flag database helper script; not used currently
`flmd-generator.R` | File level metadata generation script, to help with ESS-DIVE submission
`helpers.R` | Various ancillary ('helper') functions and test code. This is sourced by all the Quarto files. For interactive use, this contains the `reset()` and `list_directories()` functions
`L0.qmd` | Quarto file for generating L0 data from raw data
`L1_normalize.qmd` | Quarto file for running the 'normalize' L1 step: unit conversion, design table matching, etc. This is the most complicated step in the pipeline
`L1.qmd` | Quarto file for generating L1 data. This is a memory-intensive step
`L2.qmd` | Quarto file for generating L2 data; not used currently
`metadata` | Metadata folder; see below
`out-of-service.R` | description

## ./synoptic/data

This directory tree holds the data used in a release run. It typically holds
the entire sensor data, making it big and slow to process.

File | Description
---- | -------------
`flag-db.sqlite` | Flag database for L2 step; not used currently
`L0` | Folder of L0 data written by the `L0.qmd` processing step
`L1` | Folder of L0 data written by the `L1.qmd` processing step
`L1_normalize` | Folder of intermediate data written by the `L1_normalize.qmd` processing step
`L2` | Folder of L2 data written by the `L2.qmd` processing step; not used currently
`Logs` | Folder of all log files
`Raw` | Folder of Raw data, copied (occasionally with edits) from the SERC Dropbox
`Raw_done` | Folder of completely processed raw data; not used currently

## ./synoptic/data_TEST

This directory tree holds the test data used to test and verify system performance.
These data are the default for the Quarto files and driver script, and run
by GitHub Actions when pull requests are opened. It is structured
identically to the `./data` directory (see above) but its `Raw/` folder
holds only short sample files from across all sites.

## ./synoptic/docs

Documentation folder.

File | Description
---- | -------------
`making-a-new-release.md` | Step-by-step instructions for creating a new data release
`design_table.csv` | This file

## ./synoptic/metadata

Metadata used by the processing system.

File | Description
---- | -------------
`design_table.csv` | The design table that links datalogger data with experimental subjects, measurement names, and instruments. The crucial central 'brain' of the entire L0-to-L1 process
`L1_metadata` | Folder of L1-specific metadata; see below
`L2_output_templates` | Folder of L2 template files; not used currently
`newvars_table.csv` | Table describing how to compute new variables in the L2 step; not used currently
`out-of-service` | Folder of data files used in the out-of-service step, part of L1_normalize; see below

## ./synoptic/metadata/L1_metadata

Files used by the `L1_normalize.qmd` and `L1.qmd` steps.

File | Description
---- | -------------
`CRC.txt` | Site description file for the CRC site, giving location, ecological context, contacts, and key publications; used in the `L1.qmd` metadata-generation step
`GCW.txt` | Site description file for the GCW site
`GWI.txt` | Site description file for the GWI site
`L1_metadata_columns.csv` | This specifies the names and ordering of the L1 data columns. Upon being generated, L1 files are checked against this list and the process will error if there's a discrepancy
`L1_metadata_template.txt` | Template for the various metadata files in each site-year folder. Information placeholders in square brackets are replaced in the L1 metadata-generation step
`L1_metadata_variables.csv` | Along with the design table, this is a key 'information center' for the system. Also known as the 'bounds and units table', it specifies unit conversions, expected bands, variable metadata, etc. Every output variable must have an entry here, or the L1_normalize step will error
`MSM.txt` | Site description file for the MSM site
`OWC.txt` | Site description file for the OWC site
`PTR.txt` | Site description file for the PTR site
`README_v???.txt` | Template for the overall README files
`README_v0-9.txt` | Overall README file for the v0-9 release; the release number is set in either `driver.R` or `L1.qmd` and L1 will error without a corresponding README file present here
`README_v1-0.txt` | Overall README file for the v1-0 release
`README_v1-1.txt` | Overall README file for the v1-1 release
`SWH.txt` | Site description file for the SWH site
`TMP.txt` | Site description file for the TMP site

## ./synoptic/metadata/out-of-service

Files used by `out-of-service.R`.

File | Description
---- | -------------
`exo_log.csv` | This file tracks the "Aquatroll Calibration/Removal Log" spreadsheet on the COMPASS-FME Google Drive and must be updated by hand
`troll_maintenance.csv` | This file tracks the "EXO calibration/deployment log" sheet on the COMPASS-FME Google Drive and must be updated by hand

0 comments on commit 7e69fb9

Please sign in to comment.