Need for a small demo notebook about using time and doing conversions? #28

Open
saroele opened this issue Nov 23, 2014 · 2 comments
@saroele
Member

saroele commented Nov 23, 2014

Hi all,

I know from (personal) experience that working with time is not always straightforward in Python. There are many modules (time, datetime, dateutil, pytz, pandas.Timestamp, calendar, etc.) and there are a few caveats when using timezones and when converting between datetimes and POSIX timestamps.

Do we need a small notebook with some basic commands and conventions on how to use time in our code?

As a start, we could use this email that I sent some weeks ago to my colleagues after a debugging session.

Dear python colleagues,

We have to be very careful with the datetime library. I hope I was the last one to realize that there are some very dangerous quirks in that library.

Let’s do a quiz: predict the outcome of the following two print statements:

import pytz

import datetime as dt

BXL = pytz.timezone('Europe/Brussels')

print(BXL.localize(dt.datetime(2014,8,3,7,41)))

print(dt.datetime(2014,8,3,7,41, tzinfo=BXL))

Here are the answers:

print(BXL.localize(dt.datetime(2014,8,3,7,41)))

2014-08-03 07:41:00+02:00

print(dt.datetime(2014,8,3,7,41, tzinfo=BXL))

2014-08-03 07:41:00+00:18

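I was expecting both constructions to yield the same time, but they differ by 1 hour and 42 minutes: with tzinfo=BXL the datetime gets the Brussels local mean time offset (+00:18) instead of the summer-time offset valid on that date (+02:00). Errors like this can evidently cause serious bugs.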

It’s funny actually that this is documented on the site of pytz, but really, who reads this? (http://pytz.sourceforge.net/).
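The difference between the two constructions can also be checked programmatically rather than by reading printed output. What follows is only a minimal sketch in Python 3 syntax, assuming pytz is installed; the variable names are illustrative and not from the original mail.

```python
import datetime as dt
import pytz

BXL = pytz.timezone('Europe/Brussels')
naive = dt.datetime(2014, 8, 3, 7, 41)

# Recommended: localize() looks up the offset that is valid on that date
# (summer time in August, so UTC+2).
localized = BXL.localize(naive)

# Pitfall: passing the pytz zone to the constructor attaches the zone's
# default entry, which for Brussels is a local mean time offset.
constructed = dt.datetime(2014, 8, 3, 7, 41, tzinfo=BXL)

print(localized.utcoffset())     # 2:00:00
print(constructed.utcoffset())   # about 18 minutes; exact value depends on the pytz version
print(localized == constructed)  # same wall-clock fields, different instants

# Once a datetime is correctly localized, astimezone() converts it safely.
as_utc = localized.astimezone(pytz.utc)
print(as_utc)
```

Comparing utcoffset() values in a unit test is one way to catch this class of bug early.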

Considering also the mail of Patrick yesterday, we should take a moment to standardize the way we use ‘time’ in 3E. Maybe that will involve avoiding the datetime library as much as possible and using pandas and numpy?

Gregoire told me yesterday he uses pd.Timestamp to create dates. I have tried it now, and indeed it is very powerful. Unfortunately, it is very badly documented. The method pd.to_datetime() is better documented, but is not 100% identical. The best way to see the options is by looking into the code of pandas.tslib.Timestamp (e.g. here: https://github.com/pydata/pandas/blob/master/pandas/tslib.pyx#L202)

Here is a summary of the functions that are most useful to me:

import pandas as pd

import pytz

import datetime as dt

import time

BXL = pytz.timezone('Europe/Brussels')

In [32]: pd.Timestamp.now() # naïve timestamp: dangerous

Out[32]: Timestamp('2014-11-11 12:08:45.997512')

In [33]: pd.Timestamp.now(tz='utc') # better to specify local time or UTC

Out[33]: Timestamp('2014-11-11 11:09:18.633738+0000', tz='UTC')

In [34]: pd.Timestamp('20141111T12:10:15', tz=BXL) # parsing of strings in many formats

Out[34]: Timestamp('2014-11-11 12:10:15+0100', tz='Europe/Brussels')

In [35]: pd.Timestamp('20141111T12:10:15', tz=BXL).value/1e9 # conversion to epoch: the timestamp is stored as time since EPOCH in ns, so to convert to POSIX epoch (in s) divide by 1e9

Out[35]: 1415704215.0

In [37]: now=time.time() # posix epoch

In [38]: pd.Timestamp(now, unit='s', tz='utc') # conversion from epoch, don't forget the unit='s'!!

Out[38]: Timestamp('2014-11-11 11:12:18.057659+0000', tz='UTC')

In [39]: pd.Timestamp(now, unit='s', tz=BXL) # conversion from epoch to localtime, don't forget the unit='s'!!

Out[39]: Timestamp('2014-11-11 12:12:18.057659+0100', tz='Europe/Brussels')

In [40]: assert _.value/1e9 == now

In [44]: dt_now = pd.Timestamp(now, unit='s', tz=BXL).to_pydatetime() # conversion to datetime (older pandas releases also offered Timestamp.to_datetime())

In [45]: pd.Timestamp(dt_now) # reading datetime

Out[45]: Timestamp('2014-11-11 12:12:18.057659+0100', tz='Europe/Brussels')
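As a compact recap of the session above, here is a minimal sketch of the epoch round trip in Python 3 syntax. It assumes a reasonably recent pandas; the integer epoch is the value shown in Out[35], and the zone is passed by name instead of as a pytz object, which pandas also accepts.

```python
import pandas as pd

epoch_s = 1415704215  # integer POSIX seconds, the value from Out[35]

# Always state the unit and the timezone when building from an epoch.
ts_utc = pd.Timestamp(epoch_s, unit='s', tz='UTC')
ts_bxl = pd.Timestamp(epoch_s, unit='s', tz='Europe/Brussels')

# Both objects describe the same instant; only the rendering differs.
same_instant = ts_utc == ts_bxl

# .value holds nanoseconds since the epoch. Integer division avoids the
# float rounding that can make a check like `value/1e9 == now` fragile.
back_to_epoch = ts_bxl.value // 10**9

# Conversion to a standard-library datetime, keeping the timezone.
py_dt = ts_bxl.to_pydatetime()

print(ts_bxl)
print(same_instant, back_to_epoch == epoch_s)
print(py_dt.utcoffset())
```

Using an integer epoch here is a deliberate choice: with a float from time.time(), the round trip may differ in the last digits.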

There is much more, but that may be better in a wiki page. Which wiki shall we use for this?

In summary, the take-home message:

NEVER EVER use datetime.datetime(y,m,d,..., tzinfo=my_timezone) again!! (unless you want to have ‘solar time’?)

and the second message: ALWAYS pass a tz=my_timezone or tz='utc' to a pd.Timestamp() call!!
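One way to enforce the second message in shared code is a small guard that rejects naive datetimes at function boundaries. The helper below is hypothetical (the name require_aware is not from any library) and only a sketch using the standard library. It catches missing timezones, but it cannot detect the pytz tzinfo= pitfall, because such a datetime is aware, just with the wrong offset.

```python
import datetime as dt


def require_aware(moment):
    """Return `moment` unchanged, or raise ValueError if it has no UTC offset.

    Hypothetical helper for illustration only.
    """
    if moment.tzinfo is None or moment.utcoffset() is None:
        raise ValueError("naive datetime: attach a timezone first")
    return moment


# A fixed-offset zone such as datetime.timezone.utc is safe in the constructor.
aware = dt.datetime(2014, 11, 11, 11, 10, 15, tzinfo=dt.timezone.utc)
require_aware(aware)  # passes silently

try:
    require_aware(dt.datetime(2014, 11, 11, 12, 10, 15))  # no tzinfo
except ValueError as err:
    print("rejected:", err)
```

Since pd.Timestamp subclasses datetime.datetime, the same check should also work on pandas timestamps.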

See you,

Roel

@icarus75

Epochs never lie, everything else does. Nevertheless, I took your take-home message to heart, see flukso/tmpo-py@5325b19

@dirkdevriendt
Contributor

FYI: arrow (http://crsmithdev.com/arrow/) aims to be the requests of date/time related functionality in Python


3 participants