NCCM: Add new data module #1902

shreyakannan1205 · 2024-02-23T05:41:39Z

The purpose of this PR is to add a data module for NCCM.


torchgeo/datamodules/nccm.py

torchgeo/samplers/batch.py

torchgeo/trainers/segmentation.py

adamjstewart · 2024-02-25T10:26:22Z

Now that #1878 is merged, the data module correctly detects that NCCM and Sentinel-2 don't actually overlap (specifically in time). You have two options to get the tests to pass:

Change the timestamp of the NCCM or Sentinel-2 fake data
Add more NCCM or Sentinel-2 fake data at different times

Both options are equally valid, I think.

adamjstewart

Code is fine, just need to get the tests to pass now.

adamjstewart · 2024-02-25T20:02:21Z

torchgeo/datamodules/nccm.py

+
+        self.dataset = self.sentinel2 & self.nccm
+        (self.train_dataset, self.val_dataset, self.test_dataset) = (
+            random_grid_cell_assignment(self.dataset, [0.8, 0.1, 0.1], 2, generator)


A grid_size of 2 seems great for testing, but I'm not sure it's great for actual usage... Are you sure this is what you want? For example, with CDL, this would mean we're splitting the US into NE, SE, SW, and NW quadrants and randomly assigning them to train/val/test.

grid_size > 2 seems more reasonable for splitting in the actual case. I think we have three options:

1. Set a parameter for grid_size so that users can define it themselves.

2. Create larger test data patches so that our pre-defined grid_size, say grid_size=8, can be executed.

3. Do both 1 and 2.
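To make the grid_size concern concrete, here is a minimal pure-Python sketch of assigning grid cells to splits. It is not torchgeo's random_grid_cell_assignment; the function name assign_grid_cells and its rounding scheme are assumptions chosen for illustration, and the real function may distribute cells differently.

```python
import random


def assign_grid_cells(grid_size, fractions, seed=0):
    """Randomly assign each cell of a grid_size x grid_size grid to a split.

    Hypothetical helper for illustration only; it does not reproduce
    torchgeo's random_grid_cell_assignment.
    """
    cells = [(row, col) for row in range(grid_size) for col in range(grid_size)]
    random.Random(seed).shuffle(cells)

    # Convert fractions to cell counts; the last split absorbs rounding error.
    counts = [round(fraction * len(cells)) for fraction in fractions[:-1]]
    counts.append(len(cells) - sum(counts))

    # Slice the shuffled cells into consecutive, non-overlapping splits.
    splits, start = [], 0
    for count in counts:
        splits.append(cells[start:start + count])
        start += count
    return splits


for grid_size in (2, 8):
    train, val, test = assign_grid_cells(grid_size, [0.8, 0.1, 0.1])
    print(grid_size, len(train), len(val), len(test))
```

Under this simplified scheme, a 2 x 2 grid yields 3, 0, and 1 cells for train, val, and test, so one split ends up empty, while an 8 x 8 grid yields 51, 6, and 7 cells. That is the motivation for exposing grid_size as a parameter rather than hard-coding 2.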


yichiac · 2024-02-29T15:48:07Z

Now that #1878 is merged, the data module correctly detects that NCCM and Sentinel-2 don't actually overlap (specifically in time). You have two options to get the tests to pass:

Change the timestamp of the NCCM or Sentinel-2 fake data

Add more NCCM or Sentinel-2 fake data at different times

All of these options are equally valid I think.

@shreyakannan1205 The Sentinel-2 test data is from 2022, while the NCCM data only covers 2017-2019. This is why the test fails with a "no spatiotemporal intersection" error. I checked that the NCCM and Sentinel-2 test data intersect spatially, so we only need to fix the temporal overlap. As @adamjstewart suggested, you can choose either option to get the tests to pass.
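The temporal mismatch can be illustrated with a small pure-Python sketch. It assumes each fake file covers a whole calendar year, which may not match the actual timestamps in the test data; year_range, overlaps, and any_overlap are hypothetical helpers and not part of torchgeo.

```python
from datetime import datetime


def year_range(first_year, last_year):
    """Closed time interval covering whole calendar years (an assumption)."""
    return datetime(first_year, 1, 1), datetime(last_year, 12, 31, 23, 59, 59)


def overlaps(a, b):
    """Two closed intervals overlap if each one starts before the other ends."""
    return a[0] <= b[1] and b[0] <= a[1]


def any_overlap(files_a, files_b):
    """True if any interval in files_a overlaps any interval in files_b."""
    return any(overlaps(a, b) for a in files_a for b in files_b)


nccm = [year_range(year, year) for year in (2017, 2018, 2019)]
sentinel2 = [year_range(2022, 2022)]
print(any_overlap(nccm, sentinel2))  # False: no shared time range

# Second option from the discussion: add fake data at a different time.
nccm.append(year_range(2022, 2022))
print(any_overlap(nccm, sentinel2))  # True: the 2022 files now overlap
```

Changing the timestamp of an existing file instead of adding a new one would pass the same check, which is why both options are interchangeable for the tests.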

shreyakannan1205 · 2024-03-08T03:21:48Z

Hi @adamjstewart @yichiac, I added the 2022 fake data to NCCM. Locally, my test cases pass, but when I push to GitHub, the tests throw an error. I tried making a few changes, but I am not sure what the issue is. It would be really helpful if you could look into it and let me know how to fix it. Thanks!

adamjstewart · 2024-03-08T13:29:53Z

The error message you're seeing means that the datasets have no area of intersection.

Why does it pass locally but not in CI? Because #1878 has been merged into main but not into your branch. Before running tests in CI, git will merge your PR into the main branch to make sure that your changes don't break main.

The easiest way to test locally would be to rebase your branch on upstream main or merge upstream main into it. You'll note that this PR contains 111 commits. This is because it also includes all of the commits from your NCCM PR, which were made on the main branch. I would highly recommend overwriting your main branch with the upstream main branch to avoid this.

shreyakannan1205 and others added 30 commits October 14, 2023 11:55

Add files via upload

0b3a2b1

Initial commit for adding Northeastern China Crop Map dataset

Added northeastern_china_cropmap (NCCM) definition to _init_.py

70f351d

Update northeastern_china_cropmap.py

bd0cf84

Added tests/data

0a3f1f9

added test_nccm.py

46f4424

Updated datasets.rst and geo_datasets.csv

c0023f9

Latest changes to nccm.py

9457eec

changes to data.py, nccm.py, test_nccm.py

53c3fc9

Update test_nccm.py

1ffe9f4

Debug 1

319f548

Merge branch 'main' of https://github.com/shreyakannan1205/torchgeo into main

01a86e7

new changes

17fa8f7

Latest update

4de40ce

Update torchgeo/datasets/nccm.py

ecf38a9

Co-authored-by: Yi-Chia Chang <[email protected]>

Fixed style errors

e2dfabc

Fixed style errors

7771f0e

Fixed style errors

75a304c

Update docs/api/datasets.rst

a23f9e9

Co-authored-by: Adam J. Stewart <[email protected]>

Update torchgeo/datasets/nccm.py

fe0ee69

Co-authored-by: Adam J. Stewart <[email protected]>

Update torchgeo/datasets/nccm.py

254215d

Co-authored-by: Adam J. Stewart <[email protected]>

Delete tests/data/nccm/.DS_Store

81e36cb

Update data.py

8ea5b64

Update nccm.py

0694926

Update torchgeo/datasets/nccm.py

7f73d2d

Co-authored-by: Adam J. Stewart <[email protected]>

Update torchgeo/datasets/nccm.py

5b0eb59

Co-authored-by: Adam J. Stewart <[email protected]>

Update torchgeo/datasets/nccm.py

574ad2c

Co-authored-by: Adam J. Stewart <[email protected]>

Update torchgeo/datasets/nccm.py

7c17dd6

Co-authored-by: Adam J. Stewart <[email protected]>

Update nccm.py

bfe69f6

Update nccm.py

125765f

Resolved few comments

234345f

adamjstewart changed the title from "New PR for NCCM datamodule" to "NCCM: Add new data module" Feb 25, 2024

adamjstewart added this to the 0.6.0 milestone Feb 25, 2024

adamjstewart reviewed Feb 25, 2024

torchgeo/datamodules/nccm.py (resolved)

torchgeo/datamodules/nccm.py (outdated, resolved)

torchgeo/samplers/batch.py (outdated, resolved)

torchgeo/trainers/segmentation.py (outdated, resolved)

Clean up unnecessary code

33c27c2

github-actions bot removed the samplers and trainers labels Feb 25, 2024

adamjstewart previously approved these changes Feb 25, 2024

adamjstewart reviewed Feb 25, 2024

shreyakannan1205 and others added 5 commits February 28, 2024 10:08

Add files via upload

4a488a3

new tif files

f83b00d

new changes

2899ca0

New changes

8826c07

Merge branch 'dm_branch' of https://github.com/shreyakannan1205/torchgeo into dm_branch

0f50145

shreyakannan1205 dismissed adamjstewart’s stale review via 0f50145 February 28, 2024 16:15

shreya28 added 6 commits February 29, 2024 16:37

Added more fake 2019 sentinel2 data

f6d4397

Added more fake 2019 sentinel2 data

4ff721f

Added more fake 2019 sentinel2 data

f4cb19e

changed size

e733dee

Changed grid size

b1572b3

new changes

3a220d4

yichiac mentioned this pull request Mar 4, 2024

Add eurocrops data module. #1869

Merged

shreya28 added 2 commits March 7, 2024 20:48

Added 2022 to nccm

0811d33

Deleted DS_store

13b7ed3

shreyakannan1205 closed this Mar 15, 2024

shreyakannan1205 deleted the dm_branch branch March 15, 2024 02:28

adamjstewart removed this from the 0.6.0 milestone Mar 22, 2024
