Data not available in benchmark-tutorial.ipynb #210

Open
robmarkcole opened this issue Aug 15, 2022 · 4 comments

Comments

@robmarkcole

robmarkcole commented Aug 15, 2022

While working through competitions/cloud-cover/benchmark-tutorial.ipynb on a Hub instance, I found that the notebook states the data should be available in a volume, but this is not the case:

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
Input In [4], in <cell line: 5>()
      2 TRAIN_FEATURES = DATA_DIR / "train_features"
      3 TRAIN_LABELS = DATA_DIR / "train_labels"
----> 5 assert TRAIN_FEATURES.exists()

AssertionError: 
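
A bare `assert` on a path gives no hint about what is missing. A minimal sketch of a more descriptive check follows; the `require_dir` helper is hypothetical and not part of the notebook, and the message text is only a suggestion.

```python
from pathlib import Path


def require_dir(path):
    """Return `path` as a Path if it is an existing directory, else raise a descriptive error."""
    path = Path(path)
    if not path.is_dir():
        # Hypothetical message: point the reader at the competition data page
        raise FileNotFoundError(
            f"Expected data directory {path} was not found. "
            "The data may need to be downloaded from the competition page first."
        )
    return path
```

In the notebook cell this could replace the assertion, for example `TRAIN_FEATURES = require_dir(DATA_DIR / "train_features")`.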
@TomAugspurger

Thanks for the report. I was cleaning some things up in preparation for a Hub migration and completely forgot that this notebook existed :/

I'll need to think a bit about how to adjust for that. The notebook will fail to run in the meantime, unless you signed up for the competition: https://www.drivendata.org/competitions/83/cloud-cover/data/

@robmarkcole

No problem! I did not sign up for the competition, and it appears that, since it has finished, I no longer can.
I tried downloading, but as this is 53GB I guess I won't have space. I only want to train on the NIR band; is there a way to request just that band?

@TomAugspurger

> I tried downloading, but as this is 53GB I guess I won't have space.

You might have space outside of your home directory (e.g. /tmp) but that's reset each time the notebook server restarts (https://planetarycomputer.microsoft.com/docs/overview/environment/#understanding-the-file-system).
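
Whether a scratch location such as /tmp has room for the roughly 53GB download can be checked before starting. The sketch below uses only the standard library; the `has_room` helper and the 10% safety margin are assumptions for illustration.

```python
import shutil
from pathlib import Path

GB = 1024 ** 3


def has_room(path, needed_gb, margin=0.10):
    """Return True if the filesystem holding `path` has needed_gb plus a safety margin free."""
    usage = shutil.disk_usage(path)
    return usage.free >= needed_gb * GB * (1 + margin)


if __name__ == "__main__":
    # Compare the home directory with /tmp, which is reset when the server restarts
    for candidate in (Path.home(), Path("/tmp")):
        if candidate.exists():
            free_gb = shutil.disk_usage(candidate).free / GB
            print(f"{candidate}: {free_gb:.1f} GB free, fits 53 GB: {has_room(candidate, 53)}")
```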

> is there a way to request just that band?

I'm not sure. With the assets hosted by the Planetary Computer you can access single bands. I don't recall how these assets were distributed, but it might have been a large gz or ZIP file.

@robmarkcole

Yes, they are large gz files. No problem, I will download them elsewhere and create a NIR-only version for use here. Thanks!
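
Creating a single-band copy does not require unpacking the whole archive. The sketch below streams through a `.tar.gz` and writes out only one band per chip. It assumes the archive is a gzipped tar laid out as `<something>/<chip_id>/<band>.tif` and that the NIR band is stored as `B08.tif`; neither is confirmed in this thread, and `extract_band` is a hypothetical helper.

```python
import shutil
import tarfile
from pathlib import Path


def extract_band(archive_path, out_dir, band_filename="B08.tif"):
    """Stream a .tar.gz and write only members whose file name equals band_filename."""
    out_dir = Path(out_dir)
    extracted = []
    # "r|gz" reads the archive as a forward-only stream, one member at a time
    with tarfile.open(archive_path, mode="r|gz") as archive:
        for member in archive:
            if not member.isfile() or Path(member.name).name != band_filename:
                continue
            # Keep the chip's folder name so files from different chips do not collide
            target = out_dir / Path(member.name).parent.name / band_filename
            target.parent.mkdir(parents=True, exist_ok=True)
            source = archive.extractfile(member)
            with open(target, "wb") as handle:
                shutil.copyfileobj(source, handle)
            extracted.append(target)
    return extracted
```

Only the selected band is written to disk, so the output is a fraction of the full dataset, although the archive itself still has to be downloaded or readable from wherever this runs.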
