Street2Sat is a new framework for obtaining large datasets of geo-referenced crop type labels from vehicle-mounted cameras that can be extended to other applications.
Paper accepted to ICML 2021 Tackling Climate Change Using AI Workshop. 🎉 Link coming soon!
- Clone the repository
- Set up and activate a Python virtual environment
python3 -m venv venv
source venv/bin/activate
- Install the dependencies
pip install -r requirements.txt
Multiple_image_pipeline.ipynb - demonstrates how predictions are generated.
db_to_shapefile.ipynb - demonstrates access to the Firestore database and generation of the dataset (GeoDataFrame, shapefile).
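As a rough sketch of the dataset-generation step, the following geopandas snippet builds a GeoDataFrame and writes a shapefile (the records and field names are hypothetical; fetching them from Firestore is assumed to have already happened):

```python
import geopandas as gpd
from shapely.geometry import Point

# Hypothetical records pulled from Firestore: one per geo-referenced label
records = [
    {"crop": "maize", "lat": 0.5143, "lon": 35.2698},
    {"crop": "sugarcane", "lat": 0.5151, "lon": 35.2710},
]

# Build a GeoDataFrame in WGS84 (lat/lon) and write it out as a shapefile
gdf = gpd.GeoDataFrame(
    records,
    geometry=[Point(r["lon"], r["lat"]) for r in records],
    crs="EPSG:4326",
)
gdf.to_file("street2sat_labels.shp")
```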
The code can also be accessed through a demo Flask app:
- Ensure MongoDB is installed.
- Start the Flask app:
export FLASK_APP=run
flask run
The app should be live at http://127.0.0.1:5000/
Ensure the Google Cloud CLI is installed.
Initial Setup (only done once)
gsutil mb gs://street2sat-uploaded
gsutil mb gs://street2sat-model-predictions
Deploying resources (done on every code update)
sh deploy.sh
Ground-truth labels on crop type and other variables are critically needed to develop machine learning methods that use satellite observations to combat climate change and food insecurity. These labels are difficult and costly to obtain over large areas, particularly in Sub-Saharan Africa where they are most scarce. Street2Sat is a new framework for obtaining large datasets of geo-referenced crop type labels from vehicle-mounted cameras that can be extended to other applications.
The Street2Sat pipeline has six steps:
An iterated approach to Otsu's method for thresholding was used to straighten images before they were used to train the YOLO object detector. Otsu's method was run on the whole image, on halves, thirds, and fourths of the image, and all the predicted rotation angles were averaged. In addition, the image was blurred with a Gaussian filter before applying Otsu's method in order to reduce incorrect rotations caused by small objects.
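As an illustration, here is a minimal OpenCV sketch of this step. The text above does not spell out how a rotation angle is derived from each thresholded partition, so the line fit over foreground centroids of vertical strips below is an assumption (and the degenerate whole-image case is omitted):

```python
import cv2
import numpy as np

def partition_angle(gray, n_strips):
    """Estimate a rotation angle from one partition: run Otsu's threshold
    on each vertical strip and fit a line through the foreground centroid
    row of each strip."""
    h, w = gray.shape
    xs, ys = [], []
    for i in range(n_strips):
        strip = gray[:, i * w // n_strips:(i + 1) * w // n_strips]
        _, mask = cv2.threshold(strip, 0, 255,
                                cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        fg_rows = np.nonzero(mask)[0]
        if fg_rows.size == 0:
            continue
        xs.append((i + 0.5) * w / n_strips)  # strip centre (x)
        ys.append(fg_rows.mean())            # foreground centroid row (y)
    if len(xs) < 2:
        return 0.0
    slope = np.polyfit(xs, ys, 1)[0]         # least-squares line fit
    return np.degrees(np.arctan(slope))

def estimate_rotation(image_bgr):
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    gray = cv2.GaussianBlur(gray, (25, 25), 0)  # suppress small objects
    # Average the angles predicted from the 2-, 3- and 4-strip partitions
    return float(np.mean([partition_angle(gray, n) for n in (2, 3, 4)]))
```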
The YOLO object detection framework is used to train a model that predicts where crops are located in each image. The training images were labeled for crop type by NASA Harvest teams with guidance from agricultural experts. The trained YOLO model is hosted on Heroku using Flask and MongoDB, and crop type prediction can be run from the web app.
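For example, loading a trained model and extracting bounding-box heights might look like the sketch below (the weights path and image name are hypothetical, and loading a YOLOv5-style model via torch.hub is an assumption about the exact YOLO variant used):

```python
import torch

# Load a trained YOLOv5 model from custom weights (path is hypothetical)
model = torch.hub.load("ultralytics/yolov5", "custom", path="street2sat.pt")

results = model("field_photo.jpg")   # run crop detection on one image
df = results.pandas().xyxy[0]        # one row per detected bounding box
# Bounding-box heights in pixels feed the distance estimation step below
df["height_px"] = df["ymax"] - df["ymin"]
print(df[["name", "confidence", "height_px"]])
```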
The heights of the predicted bounding boxes are used to predict the distance from the camera to the crop. We calculated the distance to each bounding box and then averaged the depths of all boxes of the same crop type class to get a single depth for the field. We used the following pinhole-camera equation to predict distance:

$$d = \frac{l_{focal} \cdot h_{crop} \cdot h_{image}}{h_{bbox} \cdot h_{sensor}}$$

The focal length $l_{focal}$ was obtained from the EXIF image metadata; the crop height $h_{crop}$ comes from known typical heights of each crop; $h_{bbox}$ is the height of the bounding box in pixels; $h_{image}$ is the height of the image in pixels; and the GoPro sensor height $h_{sensor}$ is 4.55 mm.
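A direct translation of this equation into code (the example numbers are made up; units are millimetres for the focal length and sensor height, metres for the crop height, and pixels for the image and box heights):

```python
def crop_distance_m(focal_mm, crop_height_m, image_height_px,
                    bbox_height_px, sensor_height_mm=4.55):
    """Pinhole-camera distance estimate from one bounding box."""
    return (focal_mm * crop_height_m * image_height_px) / (
        bbox_height_px * sensor_height_mm)

# Example: 3 mm focal length, 2.5 m maize, 3000 px tall image, 600 px box
d = crop_distance_m(3.0, 2.5, 3000, 600)  # ~8.2 m

# Average over all boxes of the same crop class for a single field depth
heights_px = [580, 600, 640]
avg_d = sum(crop_distance_m(3.0, 2.5, 3000, h)
            for h in heights_px) / len(heights_px)
```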
After obtaining the average distance to the crop in the image, the bearing of the camera must be calculated. To find the bearing, a velocity vector is computed between the current image and the closest other image in time from the same upload. We assume the camera points orthogonal (90°) to the driving direction and relocate the point by the average distance in that direction.
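A sketch of this relocation step using standard great-circle formulas (the coordinates are hypothetical, and offsetting to the right of travel rather than the left is an assumption):

```python
import math

def initial_bearing(lat1, lon1, lat2, lon2):
    """Bearing in degrees from the previous GPS fix to the current one."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    x = math.sin(dlon) * math.cos(phi2)
    y = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(x, y)) % 360

def offset_point(lat, lon, bearing_deg, dist_m, r_earth=6_371_000.0):
    """Move a point dist_m metres along bearing_deg on a spherical Earth."""
    d, b = dist_m / r_earth, math.radians(bearing_deg)
    phi1, lam1 = math.radians(lat), math.radians(lon)
    phi2 = math.asin(math.sin(phi1) * math.cos(d)
                     + math.cos(phi1) * math.sin(d) * math.cos(b))
    lam2 = lam1 + math.atan2(math.sin(b) * math.sin(d) * math.cos(phi1),
                             math.cos(d) - math.sin(phi1) * math.sin(phi2))
    return math.degrees(phi2), math.degrees(lam2)

# Drive direction from two consecutive GPS fixes, then relocate the label
# 90 degrees to the right of travel by the estimated average distance.
drive = initial_bearing(0.5143, 35.2698, 0.5151, 35.2710)
crop_lat, crop_lon = offset_point(0.5151, 35.2710, (drive + 90) % 360, 8.2)
```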
Errors could occur in this pipeline due to a variety of factors such as poor lighting, mixed crop fields, occlusions, and object detection errors. In future work, we plan to apply techniques for quality assessment and control (QA/QC) to identify points with possible errors and correct them, e.g., out-of-distribution detection to find outliers located on roads or other objects.