We are interested in the specific locations of fast food chains in the US and in socioeconomic measures, such as median income and unemployment rate, for those same locations. By transforming two different datasets that share a common key, the Zip Code Tabulation Area (ZCTA), we hope our database will allow analysts to draw insights on the potential link between communities of low socioeconomic status, the location of fast food chains, and obesity levels across the US.
For further details on any of the following steps, and on the project's potential and limitations, please refer to our report and the notebooks for each part.
1/ US Census Bureau Demographic Data
We use a Census API wrapper to retrieve data from the American Community Survey 5-Year Data (2009-2018) at the Zip Code Tabulation Area (ZCTA) level. Please refer to our notebook.
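As a rough illustration of this step, the sketch below builds a query against the public Census Data API for all ZCTAs and reshapes the response into records. It is not the wrapper used in our notebook: the function names `build_acs5_url` and `rows_to_records` are hypothetical, and the variable `B19013_001E` (median household income) is only an assumed example of what might be requested.

```python
from urllib.parse import urlencode

# Public Census Data API endpoint for the ACS 5-Year dataset.
ACS5_BASE = "https://api.census.gov/data/{year}/acs/acs5"


def build_acs5_url(year, variables, api_key=None):
    """Build an ACS 5-Year query URL that covers every ZCTA.

    `variables` is a list of ACS variable codes; which codes the project
    actually pulls is an assumption here.
    """
    params = {
        "get": ",".join(["NAME", *variables]),
        "for": "zip code tabulation area:*",
    }
    if api_key:
        params["key"] = api_key
    return ACS5_BASE.format(year=year) + "?" + urlencode(params)


def rows_to_records(rows):
    """Convert the API's list-of-lists JSON (header row first) into dicts."""
    header, *data = rows
    return [dict(zip(header, row)) for row in data]
```

Calling `build_acs5_url(2018, ["B19013_001E"])` gives a URL that can be fetched with any HTTP client; the decoded JSON body can then be passed to `rows_to_records`.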
2/ Fast Food Restaurants Across America
This dataset was downloaded from Kaggle as a CSV file. Please refer to our notebook.
3/ Zip Code to ZCTA Cross Walk
This dataset was downloaded from UDS Mapper as a CSV file. Please refer to our notebook.
All of the input CSV files can be found here.
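To show why the crosswalk is needed, here is a minimal sketch of mapping each restaurant's ZIP code to a ZCTA so that it can be matched with the census data. The column names (`ZIP_CODE`, `ZCTA`, `postalCode`) and the helper names are assumptions for illustration; the notebooks hold the actual transformation.

```python
import csv


def normalize_zip(value):
    """Return a 5-character code, restoring leading zeros lost in CSV parsing."""
    code = str(value).strip().split("-")[0]  # drop any ZIP+4 suffix
    return code.zfill(5)


def load_crosswalk(lines, zip_col="ZIP_CODE", zcta_col="ZCTA"):
    """Read crosswalk CSV lines into a {zip: zcta} dictionary."""
    reader = csv.DictReader(lines)
    return {
        normalize_zip(row[zip_col]): normalize_zip(row[zcta_col])
        for row in reader
    }


def attach_zcta(restaurants, crosswalk, zip_col="postalCode"):
    """Yield each restaurant row with a 'zcta' field (None when unmatched)."""
    for row in restaurants:
        yield {**row, "zcta": crosswalk.get(normalize_zip(row[zip_col]))}
```

`load_crosswalk` accepts an open file handle or any iterable of CSV lines. Rows whose ZIP code has no crosswalk entry keep `zcta` as `None`, so they can be inspected rather than silently dropped.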
Please refer to the following notebooks:
After analyzing and transforming the data, we designed this ERD and schema before loading the data into the PostgreSQL database.
Please refer to our notebook.
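The sketch below illustrates how ZCTA can act as the key that links the tables. It is not the project's schema: the table and column names are hypothetical, and it uses Python's built-in `sqlite3` with an in-memory database as a stand-in for PostgreSQL so that it runs without a server.

```python
import sqlite3

# Hypothetical two-table layout keyed on ZCTA (not the project's actual ERD).
SCHEMA = """
CREATE TABLE census (
    zcta TEXT PRIMARY KEY,
    median_income INTEGER
);
CREATE TABLE restaurant (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    zcta TEXT REFERENCES census (zcta)
);
"""


def create_schema(conn):
    """Create the illustrative tables on an open connection."""
    conn.executescript(SCHEMA)


def restaurants_per_zcta(conn):
    """Return (zcta, median_income, restaurant_count) rows, one per ZCTA."""
    return conn.execute(
        "SELECT c.zcta, c.median_income, COUNT(r.id) "
        "FROM census AS c "
        "LEFT JOIN restaurant AS r ON r.zcta = c.zcta "
        "GROUP BY c.zcta, c.median_income "
        "ORDER BY c.zcta"
    ).fetchall()
```

The `LEFT JOIN` keeps ZCTAs with no restaurants in the result with a count of zero, which matters when comparing areas with and without fast food outlets.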
We make no claims as to the ownership of the data. Please use the data as you wish, but credit the appropriate people.