Extend string parsing support for Date32 #5282
"2020-9-8 01:02:03",
"2020-09-08 1:2:3",
It may make sense to support these two as well, though that isn't doable with the current TimestampParser.
Something like this could be used to be more flexible with the date(time) support format, while also adhering to some form of input validation:
fn parse_date(string: &str) -> Option<NaiveDate> {
    if string.len() > 10 {
        let mut parts = string.splitn(2, ' ');
        return match (parts.next(), parts.next()) {
            (Some(date), Some(time)) if string_to_time(time).is_some() => parse_date(date),
            _ => None,
        };
    };
This would likely represent a major performance regression as formulated, but as long as we don't regress performance I have no major objections.
Yeah, let's go with the current approach, since it handles the majority of use cases anyway.
Just a minor nit
arrow-cast/src/cast.rs
Outdated
@@ -7500,17 +7500,19 @@ mod tests {
        assert!(c.is_valid(0)); // "2000-01-01"
        assert_eq!(date_value, c.value(0));

        assert!(c.is_valid(1)); // "2000-01-01"
This comment is incorrect
Actually, since we trim away the time part, I think it's correct (I see 10957 as the value for both)?
The value being parsed is still 2000-01-01T12:00:00, which is what this should read, I think.
Oh I see; fixed now, thanks!
This now includes the timestamp format besides the plain date format.
Which issue does this PR close?
Closes #5280.
Rationale for this change
PostgreSQL supports casting a valid timestamp string to a date by discarding the time part.
What changes are included in this PR?
Fall back to parsing a datetime if the string is too long to be a plain date.
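
For illustration only, the fallback can be sketched with the standard library alone. This is not the arrow-cast implementation: every function name below is hypothetical, and the sketch assumes a strict `YYYY-MM-DD` date, a `T` or space separator, and a plain `HH:MM:SS` time with no fractional seconds or timezone. Under those assumptions it also rejects the non-zero-padded inputs discussed in the review.

```rust
/// Parse an all-ASCII-digit field (hypothetical helper; rejects signs such as "+1").
fn digits(s: &str) -> Option<u32> {
    if !s.is_empty() && s.bytes().all(|b| b.is_ascii_digit()) {
        s.parse().ok()
    } else {
        None
    }
}

/// Number of days in a month of the proleptic Gregorian calendar (0 for an invalid month).
fn days_in_month(year: i32, month: u32) -> u32 {
    match month {
        1 | 3 | 5 | 7 | 8 | 10 | 12 => 31,
        4 | 6 | 9 | 11 => 30,
        2 if (year % 4 == 0 && year % 100 != 0) || year % 400 == 0 => 29,
        2 => 28,
        _ => 0,
    }
}

/// Days since 1970-01-01, the quantity a Date32 stores
/// (the well-known "days from civil" algorithm).
fn days_from_civil(year: i32, month: u32, day: u32) -> i32 {
    let y = i64::from(year) - i64::from(month <= 2);
    let era = y.div_euclid(400);
    let yoe = y.rem_euclid(400);
    let mp = i64::from(if month > 2 { month - 3 } else { month + 9 });
    let doy = (153 * mp + 2) / 5 + i64::from(day) - 1;
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    (era * 146_097 + doe - 719_468) as i32
}

/// Strict `YYYY-MM-DD` parser (assumption: zero-padded fields only).
fn parse_date(s: &str) -> Option<(i32, u32, u32)> {
    let b = s.as_bytes();
    if b.len() != 10 || b[4] != b'-' || b[7] != b'-' {
        return None;
    }
    let year = digits(s.get(0..4)?)? as i32;
    let month = digits(s.get(5..7)?)?;
    let day = digits(s.get(8..10)?)?;
    if day == 0 || day > days_in_month(year, month) {
        return None;
    }
    Some((year, month, day))
}

/// Validate a strict `HH:MM:SS` time (assumption: no fraction, no timezone).
fn is_valid_time(s: &str) -> bool {
    let b = s.as_bytes();
    if b.len() != 8 || b[2] != b':' || b[5] != b':' {
        return false;
    }
    let field = |r: std::ops::Range<usize>| s.get(r).and_then(digits);
    matches!(
        (field(0..2), field(3..5), field(6..8)),
        (Some(h), Some(m), Some(sec)) if h < 24 && m < 60 && sec < 60
    )
}

/// Parse a date; if the input is longer than a plain date, require a valid
/// time part after it and then discard that time part.
fn parse_date32(s: &str) -> Option<i32> {
    let date = if s.len() > 10 {
        let (date, rest) = (s.get(..10)?, s.get(10..)?);
        let sep = rest.as_bytes()[0];
        if !(sep == b'T' || sep == b' ') || !is_valid_time(rest.get(1..)?) {
            return None;
        }
        date
    } else {
        s
    };
    let (y, m, d) = parse_date(date)?;
    Some(days_from_civil(y, m, d))
}
```

With this sketch, `parse_date32("2000-01-01")` and `parse_date32("2000-01-01T12:00:00")` both yield `Some(10957)`, matching the value mentioned in the review thread, while a string with an invalid time part yields `None`.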
Are there any user-facing changes?
None, except additional format support when casting to Date32.