-
-
Notifications
You must be signed in to change notification settings - Fork 134
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fast coercion date and time #417
base: main
Are you sure you want to change the base?
Conversation
# | ||
# Comparison: | ||
# DRY 2021-02-03T00:10:54.597+03:00: 157549.5 i/s | ||
# AM 2021-02-03T00:10:54.597+03:00: 40502.8 i/s - 3.89x (± 0.00) slower |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
its confused me, and I found strange code in rails and created issue
rails/rails#41316
36341e9
to
c778e7d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking the time to work on this. Performance improvements are always welcome! In my experience, regexps are typically very slow in Ruby but let's see where this will go 😄 I didn't know Date.parse is so slow btw.
ISO_DATE = /\A(\d{4})-(\d\d)-(\d\d)\z/.freeze | ||
def fast_string_to_date(string) | ||
if string =~ ISO_DATE | ||
::Date.new $1.to_i, $2.to_i, $3.to_i |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not thread-safe so it cannot use these nasty regexp globals
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is actually is. These are pseudo-global
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://ruby-doc.org/core-2.7.2/Regexp.html#class-Regexp-label-Special+global+variables
These global variables are thread-local and method-local variables.
@ermolaev why isn't it a patch to Ruby? |
I mean it sounds strange we invent "performance-optimized" code and copy it between line libraries... What if there's another, even faster set of regex? :) |
Just what I said lol rails/rails@9ea15f1 |
Another thought (sorry for spamming): we should provide a way to fix the format to ISO8601. Mb it should be a different key, e.g. |
ruby has other methods for parsing time in strict formats Time.iso8601("2011-10-05T22:26:12-04:00")
Time.strptime("2000-10-31", "%Y-%m-%d") this methods is fast, but this is a very strict format logic, this methods throw exeptions, and for fallback we need handle it, but exeptions is very slow
I think new keys will be confuse developers, and nobody will use them, also better to encapsulate the parsing details in the library |
But we don't define
These are strong assumptions and I don't share them. It's true that being faster by default is better but we also need to consider the price (support, bugs, etc). In my opinion, parsing dates without a predefined format is a potential source of errors. Delegating format detection to This is what I use personally TypeContainer.register(
'json.iso_time',
::Types::Time.constrained(
gteq: Time.new(2014, 1, 1),
lteq: Time.new(2038, 1, 1)
).constructor { |value, &failed|
Try[::ArgumentError] { ::Time.iso8601(value.sub(' ', 'T')) }.value_or {
failed.(value)
}
}
) I'm not against the change (though I'm not sure about the implementation) but I think we should try adding it to Ruby first. Side note: is there a safe |
I looked it up, seems like using |
Ruby has regexp for
Date benchmarkISO_DATE = /\A(\d{4})-(\d\d)-(\d\d)\z/
string = '2011-12-03'
Benchmark.ips do |x|
x.report('parse') do |n|
while n > 0
Date.parse(string)
n-=1
end
end
x.report('_parse') do |n|
while n > 0
::Date.new(*::Date._parse(string, false).values_at(:year, :mon, :mday))
n-=1
end
end
x.report('_iso8601') do |n|
while n > 0
::Date.new(*::Date._iso8601(string).values_at(:year, :mon, :mday))
n-=1
end
end
x.report('_strptime') do |n|
while n > 0
::Date.new(*Date._strptime(string, '%Y-%m-%d').values_at(:year, :mon, :mday))
n-=1
end
end
x.report('_strptime.values') do |n|
while n > 0
::Date.new(*Date._strptime(string, '%Y-%m-%d').values)
n-=1
end
end
x.report('regexp') do |n|
while n > 0
if string =~ ISO_DATE
::Date.new($1.to_i, $2.to_i, $3.to_i)
end
n-=1
end
end
x.compare!
end
# Comparison:
# _strptime.values: 1282303.3 i/s
# _strptime: 1194136.2 i/s - same-ish: difference falls within error
# regexp: 997567.6 i/s - 1.29x (± 0.00) slower
# _iso8601: 529221.1 i/s - 2.42x (± 0.00) slower
# parse: 282310.3 i/s - 4.54x (± 0.00) slower
# _parse: 270369.9 i/s - 4.74x (± 0.00) slower Time benchmarkISO_DATETIME = /
\A
(\d{4})-(\d\d)-(\d\d)(?:T|\s) # 2020-06-20T
(\d\d):(\d\d):(\d\d)(?:\.(\d{1,6})\d*)? # 10:20:30.123456
(?:(Z(?=\z)|[+-]\d\d)(?::?(\d\d))?)? # +09:00
\z
/x.freeze
def fast_string_to_time(string)
return unless ISO_DATETIME =~ string
usec = $7.to_i
usec_len = $7&.length
if usec_len&.< 6
usec *= 10**(6 - usec_len)
end
if $8
offset = $8 == "Z" ? 0 : $8.to_i * 3600 + $9.to_i * 60
end
::Time.local($1.to_i, $2.to_i, $3.to_i, $4.to_i, $5.to_i, $6.to_i, usec, offset)
end
string = '2021-02-04T01:48:47.551+03:00'
Benchmark.ips do |x|
x.report('parse') do |n|
while n > 0
::Time.parse(string)
n-=1
end
end
x.report('iso8601') do |n|
while n > 0
::Time.iso8601(string)
n-=1
end
end
x.report('_parse') do |n|
while n > 0
::Time.new(*::Date._parse(string, false).values_at(:year, :mon, :mday, :hour, :min, :sec, :sec_fraction))
n-=1
end
end
x.report('_iso8601') do |n|
while n > 0
::Time.new(*::Date._iso8601(string).values_at(:year, :mon, :mday, :hour, :min, :sec, :sec_fraction))
n-=1
end
end
x.report('_strptime') do |n|
while n > 0
::Time.new(*Date._strptime(string, '%Y-%m-%dT%H:%M:%S.%N%z').values_at(:year, :mon, :mday, :hour, :min, :sec, :sec_fraction))
n-=1
end
end
x.report('regexp') do |n|
while n > 0
fast_string_to_time(string)
n-=1
end
end
x.compare!
end
# Comparison:
# regexp: 201472.4 i/s
# _strptime: 159658.5 i/s - 1.26x (± 0.00) slower
# _iso8601: 140898.1 i/s - 1.43x (± 0.00) slower
# iso8601: 138998.2 i/s - 1.45x (± 0.00) slower
# _parse: 66379.7 i/s - 3.04x (± 0.00) slower
# parse: 57152.7 i/s - 3.53x (± 0.00) slower |
Thanks for this! A minor heads up. I benchmarked various options and looked into the ruby sources. It turns out that ruby itself relies on regular expressions internally. My benchmarking showed approx 20% difference between |
I benchmarked
dry-types
andactivemodel
and noticed thatdry-types
has low perfomance, and I dip inside code railshttps://github.com/rails/rails/blob/main/activemodel/lib/active_model/type/helpers/time_value.rb#L72
https://github.com/rails/rails/blob/main/activemodel/lib/active_model/type/date.rb#L30
rails use regexp for frequent case