Ppe to add training labels to repo #91

Merged
merged 2 commits into from
Nov 27, 2023
1 change: 1 addition & 0 deletions Data/.gitignore
@@ -1,2 +1,3 @@
*.zip
!CleanedForecastsNWAC_CAIC_UAC_CAC.V1.2013-2021.zip
*.csv
Binary file not shown.
9 changes: 1 addition & 8 deletions Data/Readme.md
@@ -1,6 +1,5 @@
Training data can be downloaded here:
https://oapstorageprod.blob.core.windows.net/oap-training-data/Data/CleanedForecastsNWAC_CAIC_UAC_CAC.V1.2013-2021.zip

https://github.com/scottcha/OpenAvalancheProject/blob/master/Data/CleanedForecastsNWAC_CAIC_UAC_CAC.V1.2013-2021.zip

1. This file is an attempt to collect all the data from every avy center I’ve worked on so far.
2. Not every avy center publishes the same data, and not every forecast has the same data. Scanning through, there is a mix of no-data and null. I’m not sure it’s entirely consistent, but null should mean that the center doesn’t publish that data, while no-data means it wasn’t part of that forecast.
@@ -13,9 +12,3 @@ e. “Day1DangerNearTreeline” is the forecast at the mid elevation provided fo
f. “Day1DangerBelowTreeline” is the forecast at the lower elevation provided for the region
g. “ForecastUrl” is the archived URL from which I pulled the forecast, in case the data needs to be checked against the source.
4. There are many other columns meant to encode avy-rose values, avy-rose avalanche problems, and other elements of the forecast. Most of the column names should be self-explanatory.
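
The null versus no-data distinction described in the Readme above can be checked per column before training. The following is a minimal sketch, not part of the repository: it assumes the zip contains at least one CSV file, that missing markers appear literally as `no-data`, `null`, or an empty cell, and that the file is UTF-8 encoded. The helper names `classify` and `summarize_missing` are hypothetical.

```python
import csv
import io
import zipfile
from collections import Counter

# Assumed encodings of the two "missing" markers described in the Readme;
# the actual strings in the CSV may differ, so inspect a few rows first.
NO_DATA = "no-data"
NULL_VALUES = {"", "null"}


def classify(value):
    """Label one cell as 'null', 'no-data', or 'value'."""
    text = (value or "").strip().lower()
    if text in NULL_VALUES:
        return "null"
    if text == NO_DATA:
        return "no-data"
    return "value"


def summarize_missing(zip_path, columns):
    """Count null / no-data / value cells per column in the first CSV of a zip."""
    counts = {name: Counter() for name in columns}
    with zipfile.ZipFile(zip_path) as archive:
        csv_names = [n for n in archive.namelist() if n.lower().endswith(".csv")]
        if not csv_names:
            raise ValueError("no CSV file found in archive")
        with archive.open(csv_names[0]) as raw:
            # utf-8-sig also accepts plain UTF-8 and strips a BOM if present.
            text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
            for row in csv.DictReader(text):
                for name in columns:
                    counts[name][classify(row.get(name))] += 1
    return counts
```

Calling `summarize_missing("CleanedForecastsNWAC_CAIC_UAC_CAC.V1.2013-2021.zip", ["Day1DangerNearTreeline", "Day1DangerBelowTreeline"])` would then show how often each danger column is null (not published by that center) versus no-data (absent from that forecast), assuming the markers are encoded as guessed above.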


Older datasets for prior updates are archived below.
https://oapstorageprod.blob.core.windows.net/oap-training-data/Data/V1.1FeaturesWithLabels2013-2020.zip

A small training set that has been processed for input into an ML model (samples, features, timestep) can be downloaded here: https://oapstorageprod.blob.core.windows.net/oap-training-data/Data/MLDataWashington.zip The feature names are indexed according to the file https://oapstorageprod.blob.core.windows.net/oap-training-data/Data/FeatureLabels.csv
4 changes: 2 additions & 2 deletions README.md
@@ -55,7 +55,7 @@ Directories are organized as follows:

This aspect of the tutorial will cover how you can obtain new weather input data for a new date range or region. This part assumes you have avalanche forecast labels for the dates and region (OAP currently has historical forecast labels for three avalanche centers in the US from the 15-16 season through the 20-21 season and is working on expanding that).

Due to the large size of the input GFS data and the fact that it's already hosted by NCAR, OAP isn't currently providing copies of this data. If you want to start a data processing pipeline from the original data, you can start with this process here. If you aren't interested in the data processing steps and only in the ML steps, you can download the labels here: https://oapstorageprod.blob.core.windows.net/oap-training-data/Data/CleanedForecastsNWAC_CAIC_UAC_CAC.V1.2013-2021.zip and a subset of training data here: [TODO: replace with current link] and skip to the fourth notebook 4.TimeseriesAi
Due to the large size of the input GFS data and the fact that it's already hosted by NCAR, OAP isn't currently providing copies of this data. If you want to start a data processing pipeline from the original data, you can start with this process here. If you aren't interested in the data processing steps and only in the ML steps, you can download the labels here: https://github.com/scottcha/OpenAvalancheProject/blob/master/Data/CleanedForecastsNWAC_CAIC_UAC_CAC.V1.2013-2021.zip and a subset of training data here: [TODO: replace with current link] and skip to the fourth notebook 4.TimeseriesAi
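
One practical note on the new labels link: a github.com `/blob/` URL opens the file's web page, not the file itself, so a script that fetches it directly would receive HTML. The sketch below rewrites such a URL to the `/raw/` form that GitHub serves as the file content. The helper name `github_blob_to_raw` is hypothetical and not part of the repository, and it assumes the usual `/<owner>/<repo>/blob/<branch>/<path>` layout.

```python
from urllib.parse import urlsplit, urlunsplit


def github_blob_to_raw(url):
    """Rewrite a github.com '/blob/' page URL to the matching '/raw/' URL."""
    parts = urlsplit(url)
    segments = parts.path.split("/")
    # Expected path shape: /<owner>/<repo>/blob/<branch>/<path...>
    if parts.netloc != "github.com" or len(segments) < 6 or segments[3] != "blob":
        raise ValueError(f"not a github.com blob URL: {url}")
    segments[3] = "raw"
    return urlunsplit(
        (parts.scheme, parts.netloc, "/".join(segments), parts.query, parts.fragment)
    )
```

The returned URL can be passed to any HTTP client that follows redirects, for example `urllib.request.urlretrieve`, to save the zip locally.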

The input data is derived from the 0.25 degree GFS model hosted by NCAR at this site: https://rda.ucar.edu/datasets/ds084.1/

@@ -164,7 +164,7 @@ And then this is what it looks like when filtered to only the Olympics avalanche
# Files on disk structure

Training labels can be downloaded here:
https://oapstorageprod.blob.core.windows.net/oap-training-data/Data/CleanedForecastsNWAC_CAIC_UAC_CAC.V1.2013-2021.zip
https://github.com/scottcha/OpenAvalancheProject/blob/master/Data/CleanedForecastsNWAC_CAIC_UAC_CAC.V1.2013-2021.zip

1.RawWeatherData/
gfs/