Add eurocrops data module. #1869

Merged (15 commits) on Apr 12, 2024

favyen2 (Contributor) commented Feb 8, 2024

This data module is based on `NAIPChesapeakeDataModule`, which splits the dataset's bounding box into 1/2 train, 1/4 val, and 1/4 test.
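For illustration, a 1/2, 1/4, 1/4 bounding-box split can be sketched in plain Python as follows. This is a minimal sketch, not the data module's actual code: `Box` is a hypothetical stand-in for a spatial bounding box, and which region goes to which split (west half for train, the east half divided into south and north quarters for val and test) is an assumption.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Hypothetical axis-aligned bounding box (illustration only)."""

    minx: float
    maxx: float
    miny: float
    maxy: float


def area(box: Box) -> float:
    """Return the area covered by a box."""
    return (box.maxx - box.minx) * (box.maxy - box.miny)


def split_half_quarter_quarter(roi: Box) -> tuple[Box, Box, Box]:
    """Split a region into 1/2 train, 1/4 val, and 1/4 test by area."""
    midx = roi.minx + (roi.maxx - roi.minx) / 2
    midy = roi.miny + (roi.maxy - roi.miny) / 2
    # Assumed layout: west half is train, east half is cut into two quarters.
    train = Box(roi.minx, midx, roi.miny, roi.maxy)
    val = Box(midx, roi.maxx, roi.miny, midy)
    test = Box(midx, roi.maxx, midy, roi.maxy)
    return train, val, test
```

Because each split is one contiguous block, the three sets can cover quite different geography.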

This may not be the best way to train an actual model. I think it would be more natural either to split by country, or to randomly assign each large grid cell (e.g. 4096x4096 pixels) to train/val/test and then sample within those grid cells. But I wasn't sure how to split by country, since `VectorDataset` automatically detects all the files, or how to assign large grid cells, since there's no sampler that can take multiple large bounding boxes and sample patches within them.
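To make the missing piece concrete, a sampler that draws patches from several large bounding boxes could look roughly like the sketch below. It is plain Python and does not use any TorchGeo API; the `sample_patch` name, the `(minx, maxx, miny, maxy)` tuple layout, and the area-weighted choice of cell are all assumptions made for illustration.

```python
import random

# Hypothetical cell layout: (minx, maxx, miny, maxy)
Cell = tuple[float, float, float, float]


def sample_patch(cells: list[Cell], size: float, rng: random.Random) -> Cell:
    """Draw one square patch of side `size` lying fully inside one cell."""
    # Keep only cells large enough to contain a whole patch.
    usable = [c for c in cells if c[1] - c[0] >= size and c[3] - c[2] >= size]
    if not usable:
        raise ValueError("no cell is large enough for the requested patch size")
    # Choose a cell with probability proportional to its area.
    weights = [(c[1] - c[0]) * (c[3] - c[2]) for c in usable]
    minx, maxx, miny, maxy = rng.choices(usable, weights=weights, k=1)[0]
    # Choose the patch origin so the patch stays inside the chosen cell.
    x = rng.uniform(minx, maxx - size)
    y = rng.uniform(miny, maxy - size)
    return (x, x + size, y, y + size)
```

Given the cells assigned to one split, calling `sample_patch` repeatedly yields patches that never straddle two cells, so patches from different splits cannot overlap.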

@github-actions bot added the datasets (Geospatial or benchmark datasets), testing (Continuous integration testing), and datamodules (PyTorch Lightning datamodules) labels Feb 8, 2024
@adamjstewart adamjstewart added this to the 0.6.0 milestone Feb 11, 2024
yichiac (Contributor) commented Mar 4, 2024

Hi @favyen2, could you use the new splitting function `random_grid_cell_assignment` for EuroCrops, as CDL (#1889) and NCCM (#1949) do? Splitting the dataset into many grid cells would give a more realistic train/val/test split.
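The idea behind grid-cell assignment can be sketched in plain Python as below. This only illustrates the technique and is not TorchGeo's `random_grid_cell_assignment`: the `assign_grid_cells` helper, its parameters, and the tuple layout `(minx, maxx, miny, maxy)` are hypothetical.

```python
import random

# Hypothetical cell layout: (minx, maxx, miny, maxy)
Cell = tuple[float, float, float, float]


def assign_grid_cells(
    bounds: Cell, grid_size: int, fractions: list[float], seed: int = 0
) -> list[list[Cell]]:
    """Cut `bounds` into grid_size x grid_size cells and deal them out to splits."""
    minx, maxx, miny, maxy = bounds
    dx = (maxx - minx) / grid_size
    dy = (maxy - miny) / grid_size
    cells = [
        (minx + i * dx, minx + (i + 1) * dx, miny + j * dy, miny + (j + 1) * dy)
        for i in range(grid_size)
        for j in range(grid_size)
    ]
    # Shuffle with a fixed seed so the assignment is reproducible.
    random.Random(seed).shuffle(cells)
    splits = []
    start = 0
    cumulative = 0.0
    for fraction in fractions:
        # Use cumulative fractions so rounding never drops or duplicates a cell.
        cumulative += fraction
        end = round(cumulative * len(cells))
        splits.append(cells[start:end])
        start = end
    return splits
```

For example, `assign_grid_cells((0, 6, 0, 6), 6, [0.5, 0.25, 0.25])` deals 36 cells into groups of 18, 9, and 9, scattered across the region rather than grouped into three contiguous blocks.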

adamjstewart (Collaborator) left a comment

Let's rename everything from `eurocrops_sentinel2` to `sentinel2_eurocrops`.

Outdated review threads: tests/conf/eurocrops_sentinel2.yaml (2), tests/data/eurocrops/data.py, tests/trainers/test_segmentation.py, torchgeo/datamodules/eurocrops.py (4), torchgeo/datasets/geo.py
favyen2 (Contributor, Author) commented Mar 21, 2024

I can't get this to work without some changes from #1889, such as setting the Sentinel-2 test data size to 128 instead of 36, so I will wait for that PR to be merged first, which will make this easier.

adamjstewart (Collaborator) commented

#1889 has now been merged, so feel free to rebase and copy and paste whatever you want from that data module.

Review threads: tests/conf/sentinel2_eurocrops.yaml (outdated), torchgeo/datamodules/sentinel2_eurocrops.py (2, one outdated)
adamjstewart previously approved these changes Apr 12, 2024
@adamjstewart adamjstewart enabled auto-merge (squash) April 12, 2024 12:00
@github-actions github-actions bot added the documentation Improvements or additions to documentation label Apr 12, 2024
auto-merge was automatically disabled April 12, 2024 12:53

Pull request was closed

@adamjstewart adamjstewart reopened this Apr 12, 2024
@adamjstewart adamjstewart merged commit 83353b0 into microsoft:main Apr 12, 2024
38 checks passed