Add Sentinel2_CDL Datamodule #1889

Merged
merged 47 commits into from
Mar 22, 2024

Conversation

yichiac
Contributor

@yichiac yichiac commented Feb 17, 2024

This PR adds the datamodule for CDL + Sentinel-2 images.

@github-actions github-actions bot added the testing (Continuous integration testing) and datamodules (PyTorch Lightning datamodules) labels Feb 17, 2024
@yichiac yichiac changed the title Adds CDL Sentinel2 Datamodule [WIP] Adds CDL Sentinel2 Datamodule Feb 17, 2024
@yichiac yichiac changed the title [WIP] Adds CDL Sentinel2 Datamodule Adds CDL Sentinel2 Datamodule Feb 21, 2024
@yichiac
Contributor Author

yichiac commented Feb 22, 2024

Are there any changes required to merge this datamodule?
Tagging @adamjstewart for review.

torchgeo/datamodules/cdlsentinel2.py Outdated Show resolved Hide resolved
torchgeo/datamodules/cdlsentinel2.py Outdated Show resolved Hide resolved
torchgeo/datamodules/cdlsentinel2.py Outdated Show resolved Hide resolved
tests/conf/cdlsentinel2.yaml Outdated Show resolved Hide resolved
@calebrob6
Member

Also need to add this to docs

@adamjstewart adamjstewart added this to the 0.6.0 milestone Feb 29, 2024
@adamjstewart
Collaborator

Needs same changes as NCCM.

@github-actions github-actions bot added the documentation (Improvements or additions to documentation) label Mar 4, 2024
adamjstewart
adamjstewart previously approved these changes Mar 15, 2024
Collaborator

@adamjstewart adamjstewart left a comment

Mostly looks good now. Look through the commits I added and make sure they all make sense. Couple final comments worth reviewing before we meet. I'll try to get the other PRs in similar shape so we can merge after our meeting today

torchgeo/datamodules/sentinel2cdl.py Outdated Show resolved Hide resolved
torchgeo/datamodules/sentinel2cdl.py Outdated Show resolved Hide resolved
adamjstewart
adamjstewart previously approved these changes Mar 15, 2024
adamjstewart
adamjstewart previously approved these changes Mar 15, 2024
@yichiac
Copy link
Contributor Author

yichiac commented Mar 15, 2024

@adamjstewart Thanks for the review!

@yichiac yichiac changed the title Add Sentinel2CDLDatamodule Add Sentinel2_CDL Datamodule Mar 18, 2024
adamjstewart
adamjstewart previously approved these changes Mar 18, 2024
Member

@calebrob6 calebrob6 left a comment

The code looks fine, but I want to make sure that you have actually tried to train a model with this!

        **kwargs: Any,
    ) -> None:
        """Initialize a new Sentinel2CDLDataModule instance.

Member

Can you give an example of how to instantiate this class? It is non-obvious from the current docstring.

Contributor Author

The datamodule can be initialized like this:

datamodule = Sentinel2CDLDataModule(
    crs="epsg:3857",
    batch_size=64,
    patch_size=224,
    cdl_paths="data/cdl/",
    sentinel2_paths="data/sentinel2/",
)

Member

cool, can you add that to the docstring?

        Returns:
            A matplotlib Figure with the image, ground truth, and predictions.
        """
        return self.cdl.plot(*args, **kwargs)
Member

Does this plot work when using this with a SemanticSegmentation trainer?

Collaborator

Yes, it's also tested in CI. It simply overrides the default that calls self.dataset.plot
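
The override pattern described here can be sketched in plain Python. The class names below (`ToyDataset`, `BaseDataModule`, `ToyIntersectionDataModule`) are hypothetical stand-ins rather than torchgeo classes, and the string return value replaces a real matplotlib Figure. Only the delegation to `self.cdl.plot` mirrors the snippet under review.

```python
from typing import Any


class ToyDataset:
    """Hypothetical dataset whose plot() reports what it was asked to draw."""

    def __init__(self, name: str) -> None:
        self.name = name

    def plot(self, sample: dict[str, Any]) -> str:
        return f"{self.name} plot with keys {sorted(sample)}"


class BaseDataModule:
    """Hypothetical base class: by default, plot() forwards to self.dataset."""

    dataset: ToyDataset

    def plot(self, *args: Any, **kwargs: Any) -> str:
        return self.dataset.plot(*args, **kwargs)


class ToyIntersectionDataModule(BaseDataModule):
    """Holds two datasets, so it overrides plot() to pick one explicitly."""

    def __init__(self) -> None:
        self.sentinel2 = ToyDataset("sentinel2")
        self.cdl = ToyDataset("cdl")

    def plot(self, *args: Any, **kwargs: Any) -> str:
        # Same shape as the reviewed code: delegate to the CDL dataset.
        return self.cdl.plot(*args, **kwargs)


result = ToyIntersectionDataModule().plot({"image": None, "mask": None})
```

In this sketch the subclass never sets `self.dataset`, which is why the default `plot()` cannot be relied on and an explicit override is needed.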

Collaborator

@adamjstewart adamjstewart left a comment

There are still a few details we need to work out for our paper experiments w.r.t. augmentations, but we can do that in a mass PR once all data modules are merged.

I agree with @calebrob6 that the documentation could always be improved, but this is also true for our existing datamodules, none of which give examples for instantiation. The foo_ prefix stuff is already used in some of our existing datamodules like NAIPChesapeake with no explanation. Let's make a tutorial for these someday that gives example usage. For now, I'll force merge this so we aren't holding up all other datamodule PRs.
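
Since the prefix convention is mentioned without explanation, here is a rough sketch of how prefixed keyword arguments such as `cdl_paths` and `sentinel2_paths` could be routed to two underlying datasets. The helper name `split_prefixed_kwargs` is hypothetical, and the assumption that the prefix is stripped before forwarding should be checked against the datamodule source.

```python
from typing import Any


def split_prefixed_kwargs(
    kwargs: dict[str, Any], prefixes: tuple[str, ...]
) -> dict[str, dict[str, Any]]:
    """Group keyword arguments by prefix, stripping the prefix from each key.

    Hypothetical helper for illustration; keys matching no prefix are ignored.
    """
    groups: dict[str, dict[str, Any]] = {prefix: {} for prefix in prefixes}
    for key, value in kwargs.items():
        for prefix in prefixes:
            if key.startswith(prefix):
                groups[prefix][key[len(prefix):]] = value
                break
    return groups


# Keyword arguments taken from the instantiation example in this thread.
groups = split_prefixed_kwargs(
    {"cdl_paths": "data/cdl/", "sentinel2_paths": "data/sentinel2/"},
    prefixes=("cdl_", "sentinel2_"),
)
cdl_kwargs = groups["cdl_"]
sentinel2_kwargs = groups["sentinel2_"]
```

Under that assumption, each dataset constructor would receive a plain `paths` argument, so one datamodule signature can configure both datasets without name clashes.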

        )

        self.train_aug = AugmentationSequential(
            K.Normalize(mean=self.mean, std=self.std),
Collaborator

Reminder that we still need to figure this out for all data modules in a future PR

@adamjstewart adamjstewart dismissed calebrob6’s stale review March 22, 2024 16:02

Requests are partially resolved, better tutorials can be a future goal

@adamjstewart adamjstewart merged commit 5a7b9e5 into microsoft:main Mar 22, 2024
24 checks passed
Labels
datamodules (PyTorch Lightning datamodules), documentation (Improvements or additions to documentation), testing (Continuous integration testing)
3 participants