Skip to content

Converters for Corsica and Kenya #52

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

aninda-ghosh
Copy link

No description provided.

@@ -0,0 +1,139 @@
# TEMPLATE FOR A FIBOA CONVERTER
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd propose to rename the file to fs_fr_cor for consistency.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as below.

@@ -0,0 +1,138 @@
# TEMPLATE FOR A FIBOA CONVERTER
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd propose to rename the file to fs_ke for consistency.

Copy link
Author

@aninda-ghosh aninda-ghosh Jun 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure about this. Since in the meeting We (Hannah, Jacob, Me and Snehal) decided to keep a naming convention similar to boundaries__.parquet to keep it a bit separate than the rest of the global dataset. We can discuss more on this I feel.

Copy link
Contributor

@m-mohr m-mohr Jun 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just the converter name, not the output file name?! Anyway, I'm fine if it's consistent. I this case the other PRs should also align.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes okay, I will cross check to make sure all the fieldscape PRs align.

# Add columns with constant values.
# The key is the column name, the value is a constant value that's used for all rows.
ADD_COLUMNS = {
"determination_datetime": "2021-01-01T00:00:00Z"
Copy link
Contributor

@m-mohr m-mohr Jun 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2022 (as in the file name) and 2021 (as determination datetime) doesn't quite fit together.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right this is slight mistake.

# Title of the collection
TITLE = "Field boundaries for Kenya (Fieldscapes)"
# Description of the collection. Can be multiline and include CommonMark.
DESCRIPTION = """ The dataset contains field boundaries for the Kenya."""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
DESCRIPTION = """ The dataset contains field boundaries for the Kenya."""
DESCRIPTION = "The dataset contains field boundaries for the Kenya."


# Schemas for the fields that are not defined in fiboa
# Keys must be the values from the COLUMNS dict, not the keys
MISSING_SCHEMAS = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Schema for crop_name is missing.

# Title of the collection
TITLE = "Field boundaries for Corsica (Fieldscapes)"
# Description of the collection. Can be multiline and include CommonMark.
DESCRIPTION = """ The dataset contains field boundaries for the Corsica."""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
DESCRIPTION = """ The dataset contains field boundaries for the Corsica."""
DESCRIPTION = "The dataset contains field boundaries for the Corsica."

# Provider name, can be None if not applicable, must be provided if PROVIDER_URL is provided
PROVIDER_NAME = "Republique Francaise - Geoservices"
# URL to the homepage of the data or the provider, can be None if not applicable
PROVIDER_URL = "https://data.geopf.fr/telechargement/download/RPG/RPG_2-0__SHP_LAMB93_R94_2022-01-01/RPG_2-0__SHP_LAMB93_R94_2022-01-01.7z.001"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you run this? I don't think we have 7zip support yet. Related to #50

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it has to be downloaded and dun on the gpkg inside.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have 7z support now, no manuals steps required.

Copy link
Author

@aninda-ghosh aninda-ghosh Jun 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No there were manual steps involved like creating a subset of the original dataset, Not all the fieldscapes datasets cover the entire original data sources. Hence, this step.

# Keys must be the values from the COLUMNS dict, not the keys
MISSING_SCHEMAS = {
"required": [], # i.e. non-nullable properties
"properties": {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

schemas mssing for crop_name and crop_id

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have to do this for all the countries?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there's no extension, yes. That's why I recommended an extension.

Copy link
Contributor

@m-mohr m-mohr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Various changes needed, see the other comments.

@m-mohr
Copy link
Contributor

m-mohr commented Jun 17, 2024

Please rebase the PR to the latest CLI version, including renaming the URI constant to SOURCES and renaming the parameter cache_file to cache. I tried to push it upstream, but you didn't permit changes to this PR so you need to do these changes yourself. Thanks.

@aninda-ghosh
Copy link
Author

Yes rebased the fork with the latest cli, Will update these comments in the latest PR pushes.

@m-mohr m-mohr marked this pull request as draft July 10, 2024 15:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: In Progress
Development

Successfully merging this pull request may close these issues.

2 participants