Hi all,
I know from (personal) experience that working with time is not always straightforward in Python. There are different modules (time, datetime, dateutil, pytz, pandas.Timestamp, calendar, etc.), and there are a few caveats when using timezones and converting between datetimes and POSIX timestamps.
Do we need a small notebook with some basic commands and conventions on how to use time in our code?
As a start, we could use this email that I sent some weeks ago to my colleagues after a debugging session.
Dear python colleagues,
We have to be very careful with the datetime library. I hope I was the last one to realize that there are some very dangerous quirks in that library.
Let’s do a quiz: predict the outcome of the following two print statements:
import pytz
import datetime as dt
BXL = pytz.timezone('Europe/Brussels')
print(BXL.localize(dt.datetime(2014, 8, 3, 7, 41)))
print(dt.datetime(2014, 8, 3, 7, 41, tzinfo=BXL))
Here are the answers:
print(BXL.localize(dt.datetime(2014, 8, 3, 7, 41)))
2014-08-03 07:41:00+02:00
print(dt.datetime(2014, 8, 3, 7, 41, tzinfo=BXL))
2014-08-03 07:41:00+00:18
I was expecting both constructions to yield the same time, but pytz and datetime disagree: passing tzinfo=BXL directly attaches an offset of +00:18 instead of +02:00. Evidently, this kind of error can cause serious bugs.
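To get a feeling for the size of the error, here is a minimal sketch using only the standard library. The two fixed offsets (the names CEST and LMT are mine) simply stand in for the offsets printed above; they do not reproduce the full pytz behaviour. The +00:18 offset is most likely the local mean time entry for Brussels, which pytz documents as the default when a zone is passed via tzinfo.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed offsets that mimic the two printed results above.
CEST = timezone(timedelta(hours=2))      # offset attached by BXL.localize()
LMT = timezone(timedelta(minutes=18))    # offset attached by tzinfo=BXL

localized = datetime(2014, 8, 3, 7, 41, tzinfo=CEST)
via_tzinfo = datetime(2014, 8, 3, 7, 41, tzinfo=LMT)

# Same wall-clock time, but different instants once converted to UTC.
localized_utc = localized.astimezone(timezone.utc)
via_tzinfo_utc = via_tzinfo.astimezone(timezone.utc)
print(localized_utc)    # 2014-08-03 05:41:00+00:00
print(via_tzinfo_utc)   # 2014-08-03 07:23:00+00:00

# The two constructions are 1 hour 42 minutes apart.
gap = via_tzinfo_utc - localized_utc
print(gap)              # 1:42:00
```

So any timestamp built the second way would be shifted by 1 h 42 min in summer (and by 42 min in winter, when the correct offset is +01:00).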
It’s funny actually that this is documented on the site of pytz, but really, who reads this? (http://pytz.sourceforge.net/).
Considering also Patrick's mail yesterday, we should take a moment to standardize the way we use ‘time’ in 3E. Maybe that will involve avoiding the datetime library as much as possible and using pandas and numpy?
Gregoire told me yesterday he uses pd.Timestamp to create dates. I have tried it now, and indeed it is very powerful. Unfortunately, it is very badly documented. The function pd.to_datetime() is better documented, but is not 100% identical. The best way to see the options is by looking into the code of pandas.tslib.Timestamp (e.g. here: https://github.com/pydata/pandas/blob/master/pandas/tslib.pyx#L202)
Here is a summary of the functions that are most useful to me:
import pandas as pd
import pytz
import datetime as dt
import time
BXL = pytz.timezone('Europe/Brussels')
In [32]: pd.Timestamp.now() # naïve timestamp: dangerous
Out[32]: Timestamp('2014-11-11 12:08:45.997512')
In [33]: pd.Timestamp.now(tz='utc') # better to specify local time or UTC
Out[33]: Timestamp('2014-11-11 11:09:18.633738+0000', tz='UTC')
In [34]: pd.Timestamp('20141111T12:10:15', tz=BXL) # parsing of strings in many formats
Out[34]: Timestamp('2014-11-11 12:10:15+0100', tz='Europe/Brussels')
In [35]: pd.Timestamp('20141111T12:10:15', tz=BXL).value/1e9 # conversion to epoch: the timestamp is stored as time since EPOCH in ns, so to convert to POSIX epoch (in s) divide by 1e9
Out[35]: 1415704215.0
In [37]: now=time.time() # posix epoch
In [38]: pd.Timestamp(now, unit='s', tz='utc') # conversion from epoch, don't forget the unit='s'!!
Out[38]: Timestamp('2014-11-11 11:12:18.057659+0000', tz='UTC')
In [39]: pd.Timestamp(now, unit='s', tz=BXL) # conversion from epoch to localtime, don't forget the unit='s'!!
Out[39]: Timestamp('2014-11-11 12:12:18.057659+0100', tz='Europe/Brussels')
In [40]: assert _.value/1e9 == now
In [44]: dt_now = pd.Timestamp(now, unit='s', tz=BXL).to_pydatetime() # conversion to datetime (to_datetime() is deprecated in later pandas versions)
In [45]: pd.Timestamp(dt_now) # reading datetime
Out[45]: Timestamp('2014-11-11 12:12:18.057659+0100', tz='Europe/Brussels')
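For comparison, the same epoch round trip can be sketched with the standard library only. This is an illustration, not a replacement for the pandas calls above: the fixed +01:00 offset (named CET here) is an assumption standing in for Europe/Brussels in November, and the nanosecond value is assumed to correspond to what pd.Timestamp(...).value reports.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed +01:00 offset standing in for Europe/Brussels in November.
CET = timezone(timedelta(hours=1))
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

local = datetime(2014, 11, 11, 12, 10, 15, tzinfo=CET)

# Aware datetime -> POSIX epoch in seconds (same value as Out[35]).
epoch_s = local.timestamp()
print(epoch_s)       # 1415704215.0

# Integer nanoseconds since the epoch, avoiding float rounding.
epoch_ns = (local - EPOCH) // timedelta(microseconds=1) * 1000
print(epoch_ns)      # 1415704215000000000

# POSIX epoch -> aware datetime; always pass tz, otherwise the result is naive.
from_epoch = datetime.fromtimestamp(epoch_s, tz=timezone.utc)
print(from_epoch)    # 2014-11-11 11:10:15+00:00

# Aware datetimes compare by instant, so the round trip is exact here.
assert from_epoch == local
```

Working with the integer nanosecond value is a bit safer than comparing floats, as In [40] does, when sub-second precision matters.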
There is much more, but that may be better in a wiki page. Which wiki shall we use for this?
In summary, the take-home message:
NEVER EVER use datetime.datetime(y,m,d,..., tzinfo=my_timezone) again!! (unless you want to have ‘solar time’?)
and the second message: ALWAYS pass a tz=my_timezone or tz='utc' to a pd.Timestamp() call!!
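One possible way to enforce the second rule in our own functions is a small guard that rejects naive datetimes. This is only a sketch: require_aware is a hypothetical helper name, not something from pandas or pytz. Since pd.Timestamp is a subclass of datetime.datetime, the same check should also work on Timestamps.

```python
from datetime import datetime, timezone


def require_aware(moment):
    """Return moment unchanged, or raise ValueError if it has no usable timezone."""
    # A datetime is aware only if tzinfo is set AND it yields a UTC offset.
    if moment.tzinfo is None or moment.tzinfo.utcoffset(moment) is None:
        raise ValueError("naive datetime: attach a timezone first")
    return moment


# An aware datetime passes through untouched.
aware = require_aware(datetime(2014, 11, 11, 11, 10, 15, tzinfo=timezone.utc))

# A naive datetime is rejected.
try:
    require_aware(datetime(2014, 11, 11, 11, 10, 15))
    rejected = False
except ValueError:
    rejected = True
print(rejected)  # True
```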
See you,
Roel