Validate accuweather data #83
Comments
It works like this:
>>> import h5py
>>> import numpy as np
>>> h5 = h5py.File('hourly_database.hdf5', 'r'); data = h5['weather_data'][:]
>>> np.unique(data[:,2])
array([ 0.00000000e+00, 1.00000000e+00, 4.00000000e+00,
2.01606212e+11])
Beware that the data is padded with rows of zeros at the bottom. |
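A minimal sketch of stripping the zero padding before inspecting a column, assuming the padding rows are entirely zero and sit at the end of the array. The helper name `strip_zero_padding` is hypothetical, and any 2-D array can stand in for `h5['weather_data'][:]`.

```python
import numpy as np

def strip_zero_padding(data):
    """Drop trailing rows that consist only of zeros (hypothetical helper)."""
    data = np.asarray(data)
    # Indices of rows holding at least one non-zero value (NaN counts as non-zero).
    nonzero_rows = np.flatnonzero(np.any(data != 0, axis=1))
    if nonzero_rows.size == 0:
        return data[:0]  # everything was padding
    # Keep everything up to and including the last row with real data.
    return data[: nonzero_rows[-1] + 1]
```

With the padding removed, `np.unique(trimmed[:, 2])` no longer reports a zero that comes only from the padded rows.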
I have checked the data using nanmax(), and the values are reasonable. I didn't understand what information unique() should give me; I get only NaNs. I have also looked through the caught exceptions and fixed some of them. So if the data is being used and you have time tonight, you can rerun it. But it will only add data from ~60 of the 3000 HTML files, which are not included in the current database because they threw errors before. If you run the scraper again, can you change the try block to:
try:
    sc_ac(date_string, city, DATAPATH)
except Exception as ex:
    if type(ex) == AssertionError: assertion_count += 1
    elif type(ex) == UnicodeDecodeError: unicode_count += 1 |
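One possible way to keep these per-type counts without a separate counter variable for each exception is a `collections.Counter` keyed by the exception class name. This is only a sketch: `run_all` and `fake_scrape` are hypothetical names, and `fake_scrape` merely stands in for `sc_ac`, whose real behaviour is not shown in this thread.

```python
from collections import Counter

def run_all(jobs, scrape):
    """Call scrape(*job) for every job and count failures by exception class name."""
    error_counts = Counter()
    for job in jobs:
        try:
            scrape(*job)
        except Exception as ex:
            # e.g. "AssertionError" or "UnicodeDecodeError"
            error_counts[type(ex).__name__] += 1
    return error_counts

def fake_scrape(date_string, city, datapath):
    """Hypothetical stand-in for sc_ac that fails for two made-up city names."""
    if city == "bad_assert":
        raise AssertionError("unexpected page layout")
    if city == "bad_unicode":
        b"\xff".decode("utf-8")  # raises UnicodeDecodeError
```

Calling `run_all(jobs, sc_ac)` with a list of `(date_string, city, DATAPATH)` tuples would then return the counts for every exception type that occurred, including ones that were not anticipated.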
Did you push your changes? |
I pulled some stuff, but I haven't noticed your changes in the file. Maybe I have overlooked them. |
I thought I pushed the changes. But it only changed my accuweather/functions.py file. You didn't get those?
|
I pulled and started running it an hour ago. So results should be ready soon. |
Please download the data with _aw suffix here: https://drive.google.com/folderview?id=0BwQc_CC3arWWMTNYaEpCOHlKZmc&usp=sharing
Then look at nanmax(), unique(), etc. for each column, and at the number of entries, to see if it makes sense.
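The checks requested above could be collected into one per-column summary. The sketch below assumes the data is a 2-D numeric array in which missing values are stored as NaN; `summarize_columns` is a hypothetical helper, not part of the project.

```python
import numpy as np

def summarize_columns(data):
    """Return per-column counts, maximum and number of distinct values, ignoring NaN."""
    data = np.asarray(data, dtype=float)
    summary = []
    for j in range(data.shape[1]):
        col = data[:, j]
        valid = col[~np.isnan(col)]  # drop NaN before max/unique
        summary.append({
            "column": j,
            "n_entries": int(valid.size),
            "n_nan": int(col.size - valid.size),
            "max": float(valid.max()) if valid.size else None,
            "n_unique": int(np.unique(valid).size),
        })
    return summary
```

Filtering NaN before calling `np.unique` keeps the list of distinct values readable, since depending on the NumPy version each NaN may otherwise be reported as its own unique value.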