This notebook scrapes Wikipedia to collect information about the biggest cities in the US.
I start by scraping the Wikipedia page https://en.wikipedia.org/wiki/List_of_United_States_cities_by_population,
which contains information about the 314 most populous cities in the US.
To run this module you need the following libraries:
- pandas
- numpy
- requests
- bs4
- re (Python standard library)
- time (Python standard library)
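A minimal sketch of how these libraries might fit together for the first scraping step. It is not the notebook's actual code: the function names, the one-second delay, and the choice of the first table with class "wikitable" are assumptions, and the table that holds the city ranking may be a different one on the live page.

```python
import re
import time

import requests
from bs4 import BeautifulSoup

URL = "https://en.wikipedia.org/wiki/List_of_United_States_cities_by_population"


def fetch_html(url, delay=1.0):
    """Download a page, pausing first so repeated calls stay polite (assumed delay)."""
    time.sleep(delay)
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return response.text


def clean_cell(text):
    """Drop footnote markers such as [3] or [a] and trim whitespace."""
    return re.sub(r"\[[^\]]*\]", "", text).strip()


def parse_first_wikitable(html):
    """Return the rows of the first 'wikitable' as lists of cleaned strings."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", class_="wikitable")
    if table is None:
        return []
    rows = []
    for tr in table.find_all("tr"):
        cells = [
            clean_cell(cell.get_text(" ", strip=True))
            for cell in tr.find_all(["th", "td"])
        ]
        if cells:
            rows.append(cells)
    return rows
```

Calling `parse_first_wikitable(fetch_html(URL))` would return the header row followed by one list per table row; the rows can then be passed to `pandas.DataFrame` if a table structure is wanted.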
The scraped data contains the following columns:
2018_rank: (Integer) Rank of the city by population size
City: (String) Name of the city
State: (String) City's state
2018_estimate: (Integer) Estimated population in 2018
2010_census: (Integer) Population in the 2010 census
Change (%): (Float) Percentage change in population between 2010 and 2018
2016_land_area (km2): (Float) Land area in 2016, in square km
2016_density: (Integer) Population density in 2016
Location: (String) Latitude and Longitude
Description: (String) City's description from Wikipedia
Nickname: (String) City's nickname(s)
County: (String) Counties included in the city, or the county in which the city is located
Govern_type: (String) Type of city government
Govern_body: (String) Constitution of the city government
Govern_mayor: (String) City's mayor
Area_land: (String) Land area in square miles and square km
Area_water: (String) Water area in square miles and square km
Area_metro: (String) Metro area in square miles and square km
Elevation: (String) Elevation in ft and m
CSA: (Float) Combined Statistical Area (CSA)
Time_zone: (String) Time zone
ZIP_code: (String) ZIP codes included in the city
Area_code: (String) Area code in the city
FIPS_code: (String) FIPS code
Airport: (String) List of airports in the city
Website: (String) City's official website
Area_land (km2): (Float) Land area in square km
Area_water (km2): (Float) Water area in square km
Area_metro (km2): (Float) Metro area in square km
Elevation (m): (Float) Elevation in m
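The numeric columns above (Area_land (km2), Elevation (m), and so on) appear to be derived from the corresponding String columns. A minimal sketch of one way to do that with re, assuming the strings look like "302.64 sq mi (783.84 km2)" and "33 ft (10 m)"; the exact string format and the helper names are assumptions, not taken from the notebook.

```python
import re

# A number with optional thousands separators, decimals, and a leading
# ASCII or Unicode minus sign (Wikipedia often uses U+2212).
_NUMBER = r"([-\u2212]?\d[\d,]*(?:\.\d+)?)"


def _to_float(token):
    """Convert a matched number such as '3,142.7' to a float."""
    return float(token.replace(",", "").replace("\u2212", "-"))


def extract_km2(text):
    """Return the value that precedes 'km' in an area string, or None."""
    if not isinstance(text, str):
        return None
    match = re.search(_NUMBER + r"\s*km", text)
    return _to_float(match.group(1)) if match else None


def extract_metres(text):
    """Return the value that precedes a standalone 'm' in an elevation string, or None."""
    if not isinstance(text, str):
        return None
    match = re.search(_NUMBER + r"\s*m\b", text)
    return _to_float(match.group(1)) if match else None
```

With a pandas DataFrame `df` (hypothetical name), `df["Area_land"].apply(extract_km2)` would produce the Area_land (km2) column, leaving None wherever no km value is found.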