DATESTAMPS #220

Open
gellweiler opened this issue Feb 19, 2018 · 1 comment

@gellweiler

Hi there,

I think the TZ (timezone) regexp is flawed. It currently reads TZ (?:[APMCE][SD]T|UTC) and is used in the DATESTAMP_RFC822 and DATESTAMP_OTHER expressions. RFC822 (ARPA Internet Text Messages) also lists values such as GMT and UT as valid identifiers. CET, CEST, STD, ... are all commonly used timezone identifiers that don't show up in the spec, but I think they should be matched nonetheless. TZ could be replaced with something like this: ([ABCDEFGHIJKLMNPSTUVWY][CDEHKMRWZOUAGLVFJNBISPY][DSOARNPWMUCLGVHKT][TA]|[ABCDEFGHIJKLMNPQRSTUVWXY][CDFMNPRSWOTAEGKLUVXJYHBI][TDKCB]|[ABCEHMUW][CZEHKAOLY][WOAHSVDLR][DSMU][T]|BORTST|[ABCDEFGHIKLMNOPQRSTUVWXYZ]) which would match everything that is in the PHP timezone abbreviations list. Or maybe a less complicated but more generic regexp such as [A-Z]{1,6}?
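To make the gap concrete, here is a minimal sketch in Python (not the grok/Oniguruma engine the patterns actually run in) that compares the current TZ regexp with the generic [A-Z]{1,6} suggestion. The sample list and the `matches` helper are my own illustration, and whole-string matching via `fullmatch` is an assumption made for the comparison, since grok patterns are normally embedded in a larger expression.

```python
import re

# Current TZ pattern as quoted in this issue.
CURRENT_TZ = re.compile(r"(?:[APMCE][SD]T|UTC)")
# The more generic alternative suggested above.
GENERIC_TZ = re.compile(r"[A-Z]{1,6}")

# Abbreviations mentioned in this issue (illustrative, not exhaustive).
SAMPLES = ["PST", "EDT", "UTC", "GMT", "UT", "CET", "CEST"]


def matches(pattern, value):
    """Return True when the whole string matches the pattern."""
    return pattern.fullmatch(value) is not None


for tz in SAMPLES:
    print(f"{tz:5} current={matches(CURRENT_TZ, tz)} generic={matches(GENERIC_TZ, tz)}")
```

With these samples the current pattern accepts only PST, EDT and UTC, while the generic one accepts all seven.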

Also, I'm confused because the RFC822 spec wants a comma after %{DAY}, just like the RFC2822 spec, but this isn't reflected in the regexp.
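One possible way to handle the comma, sketched in Python under heavy simplification: make it optional after the day name, so both spellings are accepted. `DAY` and `REST` below are hypothetical stand-ins of mine, not the real grok definitions, and `is_rfc822_like` is just a helper name for this example.

```python
import re

# Simplified stand-in for grok's %{DAY}; abbreviated day names only (assumption).
DAY = r"(?:Mon|Tue|Wed|Thu|Fri|Sat|Sun)"
# Hypothetical, simplified remainder, e.g. "19 Feb 2018 10:00:00 GMT".
REST = r"\d{1,2} [A-Z][a-z]{2} \d{2,4} \d{2}:\d{2}:\d{2} [A-Z]{1,6}"

# ",?" accepts the comma the spec asks for while tolerating its absence.
RFC822_LIKE = re.compile(rf"{DAY},? {REST}")


def is_rfc822_like(value):
    """Return True when the whole string looks like the simplified date form."""
    return RFC822_LIKE.fullmatch(value) is not None


print(is_rfc822_like("Mon, 19 Feb 2018 10:00:00 GMT"))
print(is_rfc822_like("Mon 19 Feb 2018 10:00:00 GMT"))
```

Whether the comma should be optional or mandatory is a design choice; optional keeps existing inputs without a comma matching.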

There is a merge request for CET and CEST from 2015. But I believe the date stamp patterns in general need some work. I know it's easy to work around with custom patterns, but I think the current rules are confusing to end users.

Thanks for all the good work

@jordansissel
Contributor

Maybe the right solution is to stop using the TZ pattern entirely? I don't have confidence that any particular pattern for "TZ" is resistant to problems.

I'm OK if the TZ pattern is changed to match the PHP timezone abbreviations list, but I'm also fine if the regexp is just removed and we use %{WORD} instead, for example.

Whatever the change, assuming the TZ pattern stays in use, we'll need tests to cover the expected TZ abbreviations.
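A rough idea of what such coverage could look like, written as a table-driven check in Python rather than in the project's own test setup. The expectation lists and the `failures` helper are hypothetical, built only from abbreviations named in this thread; the pattern under test here is the generic [A-Z]{1,6} candidate and can be swapped for whichever regexp is chosen.

```python
import re

# Candidate TZ pattern under test; swap in whichever regexp is chosen.
TZ = re.compile(r"[A-Z]{1,6}")

# Hypothetical expectations, using abbreviations named in this thread.
SHOULD_MATCH = ["UTC", "GMT", "UT", "PST", "EDT", "CET", "CEST", "BORTST"]
SHOULD_NOT_MATCH = ["", "utc", "+0100", "TOOLONGTZ"]


def failures(pattern):
    """Return the inputs whose match result differs from the expectation."""
    bad = [s for s in SHOULD_MATCH if not pattern.fullmatch(s)]
    bad += [s for s in SHOULD_NOT_MATCH if pattern.fullmatch(s)]
    return bad


print(failures(TZ))
```

Running the same table against the current (?:[APMCE][SD]T|UTC) pattern would list the abbreviations it misses, which makes the table usable as a regression check for any replacement.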
