This Python script processes a list of GitHub contributors and attempts to geolocate them based on their GitHub profiles or other available information. The script generates a GeoJSON file containing the geographical locations of the contributors.
- Reads a list of contributors from a text file
- Searches for GitHub users by name or email
- Retrieves user location information from GitHub profiles
- Geocodes location strings using multiple services (Nominatim, OpenCage, Google Maps)
- Generates a GeoJSON FeatureCollection with user locations
- Provides fallback mechanisms for users without location information
- Handles API rate limits and uses multiple geocoding services
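As an illustration of the GeoJSON output mentioned above, the following is a minimal sketch of how a FeatureCollection of point locations can be assembled. The function name `to_feature_collection`, the `name` property, and the sample contributor are assumptions made for this example; the actual script may store different properties.

```python
import json


def to_feature_collection(contributors):
    """Build a GeoJSON FeatureCollection from (name, lat, lon) tuples.

    Hypothetical helper for illustration; not taken from the script itself.
    """
    features = []
    for name, lat, lon in contributors:
        features.append({
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"name": name},
        })
    return {"type": "FeatureCollection", "features": features}


# Made-up sample data, used only to show the output shape.
collection = to_feature_collection([("Ada Example", 51.5074, -0.1278)])
print(json.dumps(collection, indent=2))
```

Note that GeoJSON puts longitude before latitude, which is the reverse of the order most geocoding services report.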
Before running the script, make sure you have the following:
- Python 3.x installed
- Required Python packages:
  - requests
- API keys for the following services (optional, but recommended for better results):
  - GitHub API
  - Google Maps API
  - OpenCage API
  - Nationalize API
- Clone this repository or download the script.
- Install the required packages:
pip install requests
- Set up environment variables for API keys:
export GITHUB_TOKEN=your_github_token
export GOOGLE_MAPS_API_KEY=your_google_maps_api_key
export OPENCAGE_API_KEY=your_opencage_api_key
export NATIONALIZE_API_KEY=your_nationalize_api_key
- Prepare a text file (contributors_list.txt) listing the contributors, one per line, in any of the following formats:
Full Name <email@example.com>
Full Name https://github.com/username
Full Name
- Run the script:
python github_user_geolocation.py
- The script generates a GeoJSON file (contributors.json) in a directory named data, containing the locations of the contributors.
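The three input formats described above (name with email, name with GitHub URL, name only) could be parsed along the following lines. This is a sketch under assumptions: the regular expressions, the function name `parse_contributor`, and the returned dictionary keys are hypothetical and may not match how the script actually reads the file.

```python
import re

# Hypothetical patterns for the input formats; the real script may differ.
EMAIL_RE = re.compile(r"^(?P<name>.*?)\s*<(?P<email>[^<>\s]+)>\s*$")
URL_RE = re.compile(
    r"^(?P<name>.*?)\s*https?://github\.com/(?P<username>[A-Za-z0-9-]+)/?\s*$"
)


def parse_contributor(line):
    """Split one line of the contributors file into name, email, username."""
    line = line.strip()
    match = EMAIL_RE.match(line)
    if match:
        return {"name": match.group("name"),
                "email": match.group("email"),
                "username": None}
    match = URL_RE.match(line)
    if match:
        return {"name": match.group("name"),
                "email": None,
                "username": match.group("username")}
    # Neither an email nor a profile URL: treat the whole line as a name.
    return {"name": line, "email": None, "username": None}


# Made-up sample lines, one per supported format.
for sample in ("Ada Example <ada@example.com>",
               "Ada Example https://github.com/ada-example",
               "Ada Example"):
    print(parse_contributor(sample))
```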
- The script reads the list of contributors from the input file.
- For each contributor, it attempts to find their GitHub profile using their name, email, or GitHub username.
- If a GitHub profile is found, the script retrieves the user's location information.
- The location string is geocoded using multiple services (Nominatim, OpenCage, Google Maps) to obtain coordinates.
- If geocoding fails or no location is available, the script uses a fallback mechanism to predict a country based on the user's name.
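The multi-service geocoding step above can be pictured as a chain that tries each service in turn and stops at the first usable result. The sketch below uses stand-in functions instead of real network calls so that it runs offline; `geocode_with_fallback` and the `fake_*` geocoders are hypothetical names, and the order of services is an assumption based on the order they are listed in.

```python
def geocode_with_fallback(location, geocoders):
    """Try each geocoder in order; return (lat, lon, service_name) or None."""
    if not location:
        return None
    for name, geocode in geocoders:
        try:
            result = geocode(location)
        except Exception:
            # A failing service (timeout, quota, bad response) should not
            # stop the chain; move on to the next one.
            continue
        if result is not None:
            lat, lon = result
            return lat, lon, name
    return None


# Stand-ins for the real services, for illustration only.
def fake_nominatim(location):
    return None  # pretend the service found no match


def fake_opencage(location):
    raise RuntimeError("quota exceeded")  # pretend the service failed


def fake_google(location):
    return (48.8566, 2.3522)  # pretend the service found Paris


chain = [("nominatim", fake_nominatim),
         ("opencage", fake_opencage),
         ("google", fake_google)]
result = geocode_with_fallback("Paris, France", chain)
print(result)
```

When every geocoder returns nothing, the function returns `None`, which is the point at which a name-based country prediction would take over.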
- The accuracy of the geolocation depends on the information provided in GitHub profiles and the geocoding services used.
- API rate limits may affect the script's performance, especially for large lists of contributors.
- The script uses fallback mechanisms and name-based country prediction, which may not always be accurate.
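To make the rate-limit caveat concrete: GitHub's REST API reports the remaining quota and the reset time in the `X-RateLimit-Remaining` and `X-RateLimit-Reset` response headers. The following sketch shows one way to turn those headers into a wait time. The helper name `seconds_until_reset` is hypothetical, and whether the script handles limits this way is an assumption.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how long to wait before the next GitHub API call.

    `headers` is a mapping of response headers. A plain dict is
    case-sensitive, so the keys must match the spelling used here.
    """
    now = time.time() if now is None else now
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0
    # X-RateLimit-Reset is a Unix timestamp in seconds.
    reset_at = int(headers.get("X-RateLimit-Reset", "0"))
    return max(0.0, reset_at - now)


# Made-up header values; `now` is fixed so the result is reproducible.
exhausted = {"X-RateLimit-Remaining": "0", "X-RateLimit-Reset": "1100"}
print(seconds_until_reset(exhausted, now=1000))
```

A caller could pass the result to `time.sleep` before retrying, which trades run time for completeness on large contributor lists.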
This project is open-source and available under the MIT License.