Recent advancements in machine learning (ML) algorithms and reduced cloud computing costs have significantly improved the utility and applicability of Earth observations (EO). Concurrently, the importance of understanding the health and status of the world's farms, especially in how they will be impacted by climate change or assist in global poverty alleviation, has never been greater.
ML models learn from training data, which in applications such as agricultural monitoring are generated from ground reference data. These data are a critical component of the ML pipeline; without proper standardization, they can lead to models with biased or inaccurate predictions. Therefore, many members of the global ML EO community have expressed a desire to establish a core set of community-defined standards around ground reference data.
As a first step in that direction, Radiant Earth Foundation has collaborated with practitioners from around the world to create an initial version of these standards. This version includes:
- Best practices for the in-field data collection component and specifications for metadata fields to include.
- An example of ground reference data in GeoJSON format.
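As a rough illustration of what a ground reference record in GeoJSON can look like, the sketch below builds a single point observation in Python. The property names (`crop_type`, `collection_date`), the helper `make_ground_reference_feature`, and the sample values are hypothetical placeholders chosen for this example; they are not the metadata fields defined by these guidelines. The only fixed convention it relies on is GeoJSON itself, which stores coordinates in longitude, latitude order.

```python
import json


def make_ground_reference_feature(lon, lat, crop_type, collection_date):
    """Build a GeoJSON Feature for one field observation.

    The property names are illustrative assumptions, not the
    guideline's actual metadata specification.
    """
    if not (-180 <= lon <= 180 and -90 <= lat <= 90):
        raise ValueError("coordinates must be WGS 84 longitude/latitude")
    return {
        "type": "Feature",
        # GeoJSON positions are ordered [longitude, latitude].
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {
            "crop_type": crop_type,              # hypothetical field name
            "collection_date": collection_date,  # hypothetical field name
        },
    }


# Made-up sample observation, for illustration only.
feature = make_ground_reference_feature(36.82, -1.29, "maize", "2020-03-15")
collection = {"type": "FeatureCollection", "features": [feature]}
print(json.dumps(collection, indent=2))
```

Wrapping features in a `FeatureCollection` keeps all observations from one survey in a single file that common GIS tools can open directly.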
Our objective is to provide a starting framework that all of us, as members of the global EO, ML, and agricultural communities, can discuss, critique, and improve upon. These guidelines will continue to be developed through future meetings and events. Comments and improvements are welcome.
In April 2020, Radiant Earth Foundation and the CGIAR Platform for Big Data in Agriculture hosted a webinar to present this guideline and discuss its applications for ground data collection. You can watch the recordings of the webinar here.
You are welcome to provide feedback on this guide. You can submit an issue or contact us by email. Please also reach out to us if you have data you’d like to share.