parte-practica.md

Data Journalism Exam - Practical Part

Diego Domínguez González - Journalism and Humanities, Group 61

pip install pandas
Defaulting to user installation because normal site-packages is not writeable
Requirement already satisfied: pandas in /usr/local/lib/python3.8/dist-packages (1.3.1)
Requirement already satisfied: python-dateutil>=2.7.3 in /usr/local/lib/python3.8/dist-packages (from pandas) (2.8.1)
Requirement already satisfied: numpy>=1.17.3 in /usr/local/lib/python3.8/dist-packages (from pandas) (1.21.1)
Requirement already satisfied: pytz>=2017.3 in /usr/local/lib/python3.8/dist-packages (from pandas) (2020.4)
Requirement already satisfied: six>=1.5 in /usr/lib/python3/dist-packages (from python-dateutil>=2.7.3->pandas) (1.14.0)
WARNING: You are using pip version 21.0.1; however, version 22.1.2 is available.
You should consider upgrading via the '/usr/bin/python3 -m pip install --upgrade pip' command.
Note: you may need to restart the kernel to use updated packages.
import pandas as pd
url = "https://api.covid19api.com/countries"
df = pd.read_json(url)
df
| | Country | Slug | ISO2 |
|---|---|---|---|
| 0 | Falkland Islands (Malvinas) | falkland-islands-malvinas | FK |
| 1 | Panama | panama | PA |
| 2 | Russian Federation | russia | RU |
| 3 | Turkmenistan | turkmenistan | TM |
| 4 | Guinea | guinea | GN |
| ... | ... | ... | ... |
| 243 | Guyana | guyana | GY |
| 244 | Hungary | hungary | HU |
| 245 | Kazakhstan | kazakhstan | KZ |
| 246 | Liberia | liberia | LR |
| 247 | Somalia | somalia | SO |

248 rows × 3 columns

df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 248 entries, 0 to 247
Data columns (total 3 columns):
 #   Column   Non-Null Count  Dtype 
---  ------   --------------  ----- 
 0   Country  248 non-null    object
 1   Slug     248 non-null    object
 2   ISO2     248 non-null    object
dtypes: object(3)
memory usage: 5.9+ KB
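
The per-country URLs used in the following sections appear to be built from the `Slug` column rather than from the country name (note that "Russian Federation" maps to `russia`). A minimal sketch of looking a slug up by name, assuming a small hand-copied sample of the rows shown above instead of a live request; the helper name `slug_for` is hypothetical.

```python
import pandas as pd

# Hand-copied sample of rows displayed above (not a live API call).
countries = pd.DataFrame(
    {
        "Country": ["Panama", "Russian Federation", "Hungary"],
        "Slug": ["panama", "russia", "hungary"],
        "ISO2": ["PA", "RU", "HU"],
    }
)


def slug_for(df, country):
    """Hypothetical helper: return the slug for an exact country name, or None."""
    match = df.loc[df["Country"] == country, "Slug"]
    return match.iloc[0] if not match.empty else None


print(slug_for(countries, "Russian Federation"))  # russia
```

Returning `None` for an unknown name keeps a typo from silently producing a malformed URL.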

Spain

url_esp = "https://api.covid19api.com/country/spain/status/confirmed/live"
df_esp = pd.read_json(url_esp)
df_esp
| | Country | CountryCode | Province | City | CityCode | Lat | Lon | Cases | Status | Date |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Spain | ES | | | | 40.46 | -3.75 | 0 | confirmed | 2020-01-22 00:00:00+00:00 |
| 1 | Spain | ES | | | | 40.46 | -3.75 | 0 | confirmed | 2020-01-23 00:00:00+00:00 |
| 2 | Spain | ES | | | | 40.46 | -3.75 | 0 | confirmed | 2020-01-24 00:00:00+00:00 |
| 3 | Spain | ES | | | | 40.46 | -3.75 | 0 | confirmed | 2020-01-25 00:00:00+00:00 |
| 4 | Spain | ES | | | | 40.46 | -3.75 | 0 | confirmed | 2020-01-26 00:00:00+00:00 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 883 | Spain | ES | | | | 40.46 | -3.75 | 12613634 | confirmed | 2022-06-23 00:00:00+00:00 |
| 884 | Spain | ES | | | | 40.46 | -3.75 | 12681820 | confirmed | 2022-06-24 00:00:00+00:00 |
| 885 | Spain | ES | | | | 40.46 | -3.75 | 12681820 | confirmed | 2022-06-25 00:00:00+00:00 |
| 886 | Spain | ES | | | | 40.46 | -3.75 | 12681820 | confirmed | 2022-06-26 00:00:00+00:00 |
| 887 | Spain | ES | | | | 40.46 | -3.75 | 12681820 | confirmed | 2022-06-27 00:00:00+00:00 |

888 rows × 10 columns

df_esp.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 888 entries, 0 to 887
Data columns (total 10 columns):
 #   Column       Non-Null Count  Dtype              
---  ------       --------------  -----              
 0   Country      888 non-null    object             
 1   CountryCode  888 non-null    object             
 2   Province     888 non-null    object             
 3   City         888 non-null    object             
 4   CityCode     888 non-null    object             
 5   Lat          888 non-null    float64            
 6   Lon          888 non-null    float64            
 7   Cases        888 non-null    int64              
 8   Status       888 non-null    object             
 9   Date         888 non-null    datetime64[ns, UTC]
dtypes: datetime64[ns, UTC](1), float64(2), int64(1), object(6)
memory usage: 69.5+ KB
df_esp["Date"][1]
Timestamp('2020-01-23 00:00:00+0000', tz='UTC')
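
The `Date` column holds timezone-aware UTC timestamps, as the output above shows. For daily figures the time and offset carry no information, so it can be convenient to drop them. A small sketch under that assumption, using two hand-written timestamps in the same form as the column; the variable names are illustrative only.

```python
import pandas as pd

# Two timestamps in the same UTC-aware form as the Date column above.
dates = pd.Series(
    pd.to_datetime(["2020-01-22T00:00:00Z", "2020-01-23T00:00:00Z"], utc=True)
)

# Drop the timezone but keep the UTC wall-clock time.
naive = dates.dt.tz_localize(None)

# Or keep only the calendar day as a string label.
labels = dates.dt.strftime("%Y-%m-%d")

print(naive[1])   # 2020-01-23 00:00:00
print(labels[1])  # 2020-01-23
```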
df_esp.describe()
| | Lat | Lon | Cases |
|---|---|---|---|
| count | 8.880000e+02 | 888.00 | 8.880000e+02 |
| mean | 4.046000e+01 | -3.75 | 4.092078e+06 |
| std | 7.109432e-15 | 0.00 | 3.947626e+06 |
| min | 4.046000e+01 | -3.75 | 0.000000e+00 |
| 25% | 4.046000e+01 | -3.75 | 4.569650e+05 |
| 50% | 4.046000e+01 | -3.75 | 3.347512e+06 |
| 75% | 4.046000e+01 | -3.75 | 5.069291e+06 |
| max | 4.046000e+01 | -3.75 | 1.268182e+07 |
df_esp.set_index("Date")
| Date | Country | CountryCode | Province | City | CityCode | Lat | Lon | Cases | Status |
|---|---|---|---|---|---|---|---|---|---|
| 2020-01-22 00:00:00+00:00 | Spain | ES | | | | 40.46 | -3.75 | 0 | confirmed |
| 2020-01-23 00:00:00+00:00 | Spain | ES | | | | 40.46 | -3.75 | 0 | confirmed |
| 2020-01-24 00:00:00+00:00 | Spain | ES | | | | 40.46 | -3.75 | 0 | confirmed |
| 2020-01-25 00:00:00+00:00 | Spain | ES | | | | 40.46 | -3.75 | 0 | confirmed |
| 2020-01-26 00:00:00+00:00 | Spain | ES | | | | 40.46 | -3.75 | 0 | confirmed |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2022-06-23 00:00:00+00:00 | Spain | ES | | | | 40.46 | -3.75 | 12613634 | confirmed |
| 2022-06-24 00:00:00+00:00 | Spain | ES | | | | 40.46 | -3.75 | 12681820 | confirmed |
| 2022-06-25 00:00:00+00:00 | Spain | ES | | | | 40.46 | -3.75 | 12681820 | confirmed |
| 2022-06-26 00:00:00+00:00 | Spain | ES | | | | 40.46 | -3.75 | 12681820 | confirmed |
| 2022-06-27 00:00:00+00:00 | Spain | ES | | | | 40.46 | -3.75 | 12681820 | confirmed |

888 rows × 9 columns

df_esp.set_index("Date")["Cases"]
Date
2020-01-22 00:00:00+00:00           0
2020-01-23 00:00:00+00:00           0
2020-01-24 00:00:00+00:00           0
2020-01-25 00:00:00+00:00           0
2020-01-26 00:00:00+00:00           0
                               ...   
2022-06-23 00:00:00+00:00    12613634
2022-06-24 00:00:00+00:00    12681820
2022-06-25 00:00:00+00:00    12681820
2022-06-26 00:00:00+00:00    12681820
2022-06-27 00:00:00+00:00    12681820
Name: Cases, Length: 888, dtype: int64
df_esp.set_index("Date")["Cases"].plot(title="Casos España Covid")
<AxesSubplot:title={'center':'Casos España Covid'}, xlabel='Date'>

[Figure: line plot "Casos España Covid" (x-axis: Date)]
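
The `Cases` series plotted above is cumulative, so the curve can only rise or stay flat. A sketch of deriving daily new cases with `Series.diff()`, using the last five Spanish values shown in the output above; the variable names are illustrative, and the first day is dropped because it has no previous value to subtract.

```python
import pandas as pd

# Last five cumulative values for Spain, copied from the output above.
cumulative = pd.Series(
    [12613634, 12681820, 12681820, 12681820, 12681820],
    index=pd.date_range("2022-06-23", periods=5, freq="D", tz="UTC"),
    name="Cases",
)

# Day-over-day difference; the first entry is NaN and is dropped.
daily = cumulative.diff().dropna().astype(int)

print(daily.tolist())  # [68186, 0, 0, 0]
```

The run of zeros matches the repeated total from 2022-06-24 to 2022-06-27 in the table, which would be worth checking against the source's reporting schedule before interpreting it as "no new cases".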

France

url_fra = "https://api.covid19api.com/country/france/status/confirmed/live"
df_fra = pd.read_json(url_fra)
df_fra
| | Country | CountryCode | Province | City | CityCode | Lat | Lon | Cases | Status | Date |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | France | FR | Martinique | | | 14.64 | -61.02 | 0 | confirmed | 2020-01-22 00:00:00+00:00 |
| 1 | France | FR | New Caledonia | | | -20.90 | 165.62 | 0 | confirmed | 2020-01-22 00:00:00+00:00 |
| 2 | France | FR | Reunion | | | -21.12 | 55.54 | 0 | confirmed | 2020-01-22 00:00:00+00:00 |
| 3 | France | FR | Wallis and Futuna | | | -14.29 | -178.12 | 0 | confirmed | 2020-01-22 00:00:00+00:00 |
| 4 | France | FR | French Guiana | | | 4.00 | -53.00 | 0 | confirmed | 2020-01-22 00:00:00+00:00 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 10652 | France | FR | Mayotte | | | -12.83 | 45.17 | 37877 | confirmed | 2022-06-27 00:00:00+00:00 |
| 10653 | France | FR | Saint Barthelemy | | | 17.90 | -62.83 | 4671 | confirmed | 2022-06-27 00:00:00+00:00 |
| 10654 | France | FR | French Guiana | | | 3.93 | -53.13 | 85596 | confirmed | 2022-06-27 00:00:00+00:00 |
| 10655 | France | FR | Reunion | | | -21.12 | 55.54 | 421269 | confirmed | 2022-06-27 00:00:00+00:00 |
| 10656 | France | FR | | | | 46.23 | 2.21 | 29823387 | confirmed | 2022-06-28 00:00:00+00:00 |

10657 rows × 10 columns
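
Unlike Spain, the French response has one row per territory per date (10657 rows instead of 888), so the same date appears many times. A hedged sketch of two ways to get one value per date, using the last five rows shown above. It assumes the country-level row is the one whose `Province` is an empty string, which the blank cell in the final row suggests but the output does not confirm.

```python
import pandas as pd

# Last five rows of the French table above (Province, Cases and Date only).
sample = pd.DataFrame(
    {
        "Province": ["Mayotte", "Saint Barthelemy", "French Guiana", "Reunion", ""],
        "Cases": [37877, 4671, 85596, 421269, 29823387],
        "Date": pd.to_datetime(
            ["2022-06-27", "2022-06-27", "2022-06-27", "2022-06-27", "2022-06-28"],
            utc=True,
        ),
    }
)

# Option 1 (assumption: empty Province marks the country-level row).
mainland = sample[sample["Province"] == ""].set_index("Date")["Cases"]

# Option 2: sum every listed territory per date.
total = sample.groupby("Date")["Cases"].sum()

print(mainland.index.is_unique)  # True
print(total.iloc[0])             # 549413
```

Which option is appropriate depends on whether the comparison is meant to cover metropolitan France only or all territories; the sums here cover only the sampled rows.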

df_fra.describe()
| | Lat | Lon | Cases |
|---|---|---|---|
| count | 10657.000000 | 10657.000000 | 1.065700e+04 |
| mean | 6.433109 | -34.730791 | 6.325769e+05 |
| std | 23.358877 | 88.296378 | 3.279857e+06 |
| min | -21.120000 | -178.120000 | 0.000000e+00 |
| 25% | -14.290000 | -62.830000 | 1.090000e+02 |
| 50% | 14.640000 | -56.320000 | 3.819000e+03 |
| 75% | 18.070000 | 2.210000 | 3.595300e+04 |
| max | 46.890000 | 165.620000 | 2.982339e+07 |
df_fra.set_index("Date")
| Date | Country | CountryCode | Province | City | CityCode | Lat | Lon | Cases | Status |
|---|---|---|---|---|---|---|---|---|---|
| 2020-01-22 00:00:00+00:00 | France | FR | Martinique | | | 14.64 | -61.02 | 0 | confirmed |
| 2020-01-22 00:00:00+00:00 | France | FR | New Caledonia | | | -20.90 | 165.62 | 0 | confirmed |
| 2020-01-22 00:00:00+00:00 | France | FR | Reunion | | | -21.12 | 55.54 | 0 | confirmed |
| 2020-01-22 00:00:00+00:00 | France | FR | Wallis and Futuna | | | -14.29 | -178.12 | 0 | confirmed |
| 2020-01-22 00:00:00+00:00 | France | FR | French Guiana | | | 4.00 | -53.00 | 0 | confirmed |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2022-06-27 00:00:00+00:00 | France | FR | Mayotte | | | -12.83 | 45.17 | 37877 | confirmed |
| 2022-06-27 00:00:00+00:00 | France | FR | Saint Barthelemy | | | 17.90 | -62.83 | 4671 | confirmed |
| 2022-06-27 00:00:00+00:00 | France | FR | French Guiana | | | 3.93 | -53.13 | 85596 | confirmed |
| 2022-06-27 00:00:00+00:00 | France | FR | Reunion | | | -21.12 | 55.54 | 421269 | confirmed |
| 2022-06-28 00:00:00+00:00 | France | FR | | | | 46.23 | 2.21 | 29823387 | confirmed |

10657 rows × 9 columns

df_fra.set_index("Date")["Cases"]
Date
2020-01-22 00:00:00+00:00           0
2020-01-22 00:00:00+00:00           0
2020-01-22 00:00:00+00:00           0
2020-01-22 00:00:00+00:00           0
2020-01-22 00:00:00+00:00           0
                               ...   
2022-06-27 00:00:00+00:00       37877
2022-06-27 00:00:00+00:00        4671
2022-06-27 00:00:00+00:00       85596
2022-06-27 00:00:00+00:00      421269
2022-06-28 00:00:00+00:00    29823387
Name: Cases, Length: 10657, dtype: int64
df_fra.set_index("Date")["Cases"].plot(title="Casos Francia Covid")
<AxesSubplot:title={'center':'Casos Francia Covid'}, xlabel='Date'>

[Figure: line plot "Casos Francia Covid" (x-axis: Date)]

Italy

url_ita = "https://api.covid19api.com/country/italy/status/confirmed/live"
df_ita = pd.read_json(url_ita)
df_ita
| | Country | CountryCode | Province | City | CityCode | Lat | Lon | Cases | Status | Date |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Italy | IT | | | | 41.87 | 12.57 | 0 | confirmed | 2020-01-22 00:00:00+00:00 |
| 1 | Italy | IT | | | | 41.87 | 12.57 | 0 | confirmed | 2020-01-23 00:00:00+00:00 |
| 2 | Italy | IT | | | | 41.87 | 12.57 | 0 | confirmed | 2020-01-24 00:00:00+00:00 |
| 3 | Italy | IT | | | | 41.87 | 12.57 | 0 | confirmed | 2020-01-25 00:00:00+00:00 |
| 4 | Italy | IT | | | | 41.87 | 12.57 | 0 | confirmed | 2020-01-26 00:00:00+00:00 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 883 | Italy | IT | | | | 41.87 | 12.57 | 18071634 | confirmed | 2022-06-23 00:00:00+00:00 |
| 884 | Italy | IT | | | | 41.87 | 12.57 | 18128044 | confirmed | 2022-06-24 00:00:00+00:00 |
| 885 | Italy | IT | | | | 41.87 | 12.57 | 18184917 | confirmed | 2022-06-25 00:00:00+00:00 |
| 886 | Italy | IT | | | | 41.87 | 12.57 | 18234242 | confirmed | 2022-06-26 00:00:00+00:00 |
| 887 | Italy | IT | | | | 41.87 | 12.57 | 18259261 | confirmed | 2022-06-27 00:00:00+00:00 |

888 rows × 10 columns

df_ita.describe()
| | Lat | Lon | Cases |
|---|---|---|---|
| count | 888.00 | 8.880000e+02 | 8.880000e+02 |
| mean | 41.87 | 1.257000e+01 | 4.670686e+06 |
| std | 0.00 | 1.777358e-15 | 5.231011e+06 |
| min | 41.87 | 1.257000e+01 | 0.000000e+00 |
| 25% | 41.87 | 1.257000e+01 | 2.689650e+05 |
| 50% | 41.87 | 1.257000e+01 | 3.745302e+06 |
| 75% | 41.87 | 1.257000e+01 | 4.885903e+06 |
| max | 41.87 | 1.257000e+01 | 1.825926e+07 |
df_ita.set_index("Date")
| Date | Country | CountryCode | Province | City | CityCode | Lat | Lon | Cases | Status |
|---|---|---|---|---|---|---|---|---|---|
| 2020-01-22 00:00:00+00:00 | Italy | IT | | | | 41.87 | 12.57 | 0 | confirmed |
| 2020-01-23 00:00:00+00:00 | Italy | IT | | | | 41.87 | 12.57 | 0 | confirmed |
| 2020-01-24 00:00:00+00:00 | Italy | IT | | | | 41.87 | 12.57 | 0 | confirmed |
| 2020-01-25 00:00:00+00:00 | Italy | IT | | | | 41.87 | 12.57 | 0 | confirmed |
| 2020-01-26 00:00:00+00:00 | Italy | IT | | | | 41.87 | 12.57 | 0 | confirmed |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2022-06-23 00:00:00+00:00 | Italy | IT | | | | 41.87 | 12.57 | 18071634 | confirmed |
| 2022-06-24 00:00:00+00:00 | Italy | IT | | | | 41.87 | 12.57 | 18128044 | confirmed |
| 2022-06-25 00:00:00+00:00 | Italy | IT | | | | 41.87 | 12.57 | 18184917 | confirmed |
| 2022-06-26 00:00:00+00:00 | Italy | IT | | | | 41.87 | 12.57 | 18234242 | confirmed |
| 2022-06-27 00:00:00+00:00 | Italy | IT | | | | 41.87 | 12.57 | 18259261 | confirmed |

888 rows × 9 columns

df_ita.set_index("Date")["Cases"]
Date
2020-01-22 00:00:00+00:00           0
2020-01-23 00:00:00+00:00           0
2020-01-24 00:00:00+00:00           0
2020-01-25 00:00:00+00:00           0
2020-01-26 00:00:00+00:00           0
                               ...   
2022-06-23 00:00:00+00:00    18071634
2022-06-24 00:00:00+00:00    18128044
2022-06-25 00:00:00+00:00    18184917
2022-06-26 00:00:00+00:00    18234242
2022-06-27 00:00:00+00:00    18259261
Name: Cases, Length: 888, dtype: int64
df_ita.set_index("Date")["Cases"].plot(title="Casos Italia Covid")
<AxesSubplot:title={'center':'Casos Italia Covid'}, xlabel='Date'>

[Figure: line plot "Casos Italia Covid" (x-axis: Date)]

Chart of the 3 countries

casos_fra = df_fra.set_index("Date")["Cases"]
casos_esp = df_esp.set_index("Date")["Cases"]
casos_ita = df_ita.set_index("Date")["Cases"]
vs = pd.concat([casos_fra,casos_esp,casos_ita],axis=1)
vs
| Date | Cases | Cases | Cases |
|---|---|---|---|
| 2020-01-22 00:00:00+00:00 | 0 | 0.0 | 0.0 |
| 2020-01-22 00:00:00+00:00 | 0 | 0.0 | 0.0 |
| 2020-01-22 00:00:00+00:00 | 0 | 0.0 | 0.0 |
| 2020-01-22 00:00:00+00:00 | 0 | 0.0 | 0.0 |
| 2020-01-22 00:00:00+00:00 | 0 | 0.0 | 0.0 |
| ... | ... | ... | ... |
| 2022-06-27 00:00:00+00:00 | 37877 | 12681820.0 | 18259261.0 |
| 2022-06-27 00:00:00+00:00 | 4671 | 12681820.0 | 18259261.0 |
| 2022-06-27 00:00:00+00:00 | 85596 | 12681820.0 | 18259261.0 |
| 2022-06-27 00:00:00+00:00 | 421269 | 12681820.0 | 18259261.0 |
| 2022-06-28 00:00:00+00:00 | 29823387 | NaN | NaN |

10657 rows × 3 columns

vs.columns = ["Francia","España","Italia"]
vs
| Date | Francia | España | Italia |
|---|---|---|---|
| 2020-01-22 00:00:00+00:00 | 0 | 0.0 | 0.0 |
| 2020-01-22 00:00:00+00:00 | 0 | 0.0 | 0.0 |
| 2020-01-22 00:00:00+00:00 | 0 | 0.0 | 0.0 |
| 2020-01-22 00:00:00+00:00 | 0 | 0.0 | 0.0 |
| 2020-01-22 00:00:00+00:00 | 0 | 0.0 | 0.0 |
| ... | ... | ... | ... |
| 2022-06-27 00:00:00+00:00 | 37877 | 12681820.0 | 18259261.0 |
| 2022-06-27 00:00:00+00:00 | 4671 | 12681820.0 | 18259261.0 |
| 2022-06-27 00:00:00+00:00 | 85596 | 12681820.0 | 18259261.0 |
| 2022-06-27 00:00:00+00:00 | 421269 | 12681820.0 | 18259261.0 |
| 2022-06-28 00:00:00+00:00 | 29823387 | NaN | NaN |

10657 rows × 3 columns
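
Assigning `vs.columns` afterwards only works if the list is in the same order as the series passed to `pd.concat`. A possible alternative, sketched here with the last two Spanish and Italian values from the outputs above, is to name the columns at concatenation time with the `keys` argument; the variable names are illustrative.

```python
import pandas as pd

# Last two dates for Spain and Italy, copied from the outputs above.
idx = pd.date_range("2022-06-26", periods=2, freq="D", tz="UTC")
esp = pd.Series([12681820, 12681820], index=idx, name="Cases")
ita = pd.Series([18234242, 18259261], index=idx, name="Cases")

# keys= labels each series as it is concatenated, so names and data stay paired.
vs_sample = pd.concat([esp, ita], axis=1, keys=["España", "Italia"])

print(list(vs_sample.columns))  # ['España', 'Italia']
```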

vs.plot(title="Gráfico de comparación entre países")
<AxesSubplot:title={'center':'Gráfico de comparación entre países'}, xlabel='Date'>

[Figure: line plot "Gráfico de comparación entre países" (x-axis: Date)]
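
In the combined table, the Spanish and Italian columns show as floats and end in `NaN`, because the French data has one date (2022-06-28) that the other two lack. A toy sketch of that alignment behaviour, with made-up values that are not real case counts, and of dropping the incomplete rows before plotting.

```python
import pandas as pd

# Toy values for illustration only (not real case counts).
days = pd.date_range("2022-06-26", periods=3, freq="D", tz="UTC")
a = pd.Series([1, 2, 3], index=days, name="a")     # covers all three dates
b = pd.Series([10, 20], index=days[:2], name="b")  # lacks the last date

# concat aligns on the index; the missing date becomes NaN and b turns float.
both = pd.concat([a, b], axis=1)

# Keep only dates present in every column and restore integer dtype.
complete = both.dropna().astype(int)

print(both["b"].dtype)  # float64
print(complete.shape)   # (2, 2)
```

Dropping rows is only one option; leaving the `NaN` in place also plots correctly, since pandas simply leaves a gap at the missing point.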