-
Notifications
You must be signed in to change notification settings - Fork 736
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Perf: Use hackyOffset if it parses date strings in the expected format #1580
base: master
Are you sure you want to change the base?
Perf: Use hackyOffset if it parses date strings in the expected format #1580
Conversation
The time zone offset of a date can be computed on platforms that support it (anything that's not super ancient) by using Intl.DateTimeFormat.formatToParts with en-US to output an array of the date components. For legacy reasons, you can also generate a date string using Intl.DateTimeFormat.format which can be parsed into an array using regexes. The string/regex approach (hackyOffset) is way faster (2-4x), but much more susceptible to weird client configurations. This detects whether hackyOffset is able to parse a known date correctly, and uses it if it does.
*/ | ||
static hackyOffsetParsesCorrectly() { | ||
if (hackyOffsetParsesCorrectly === undefined) { | ||
const dtf = makeDTF("UTC"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a useless DTF to cache since we use fixed offset zones for UTC. Unclear if there's a real zone that would make more sense to use here (I'm biased towards America/New_York
)
/easycla |
This is a clever optimization and I'm not against it. But one question I have is how many calls it takes to amortize the cost of the upfront check. Do you have a sense of that? For all I know, it just takes one. |
that's a very good point import * as b from "benny"
const ts = 1710020705493
const jsDate = new Date(ts)
const utcTestDate = new Date(Date.UTC(1969, 11, 31, 15, 45, 55))
function makeDtf(zone: string) {
return new Intl.DateTimeFormat("en-US", {
hour12: false,
timeZone: zone,
year: "numeric",
month: "2-digit",
day: "2-digit",
hour: "2-digit",
minute: "2-digit",
second: "2-digit",
era: "short",
})
}
// from luxon
function hackyOffset(dtf: any, date: any) {
const formatted = dtf.format(date).replace(/\u200E/g, ""),
parsed = /(\d+)\/(\d+)\/(\d+) (AD|BC),? (\d+):(\d+):(\d+)/.exec(formatted),
[, fMonth, fDay, fYear, fadOrBc, fHour, fMinute, fSecond] = parsed ?? []
return [fYear, fMonth, fDay, fadOrBc, fHour, fMinute, fSecond]
}
const typeToPos = {
year: 0,
month: 1,
day: 2,
era: 3,
hour: 4,
minute: 5,
second: 6,
}
function partsOffset(dtf: any, date: any) {
const formatted = dtf.formatToParts(date)
const filled = []
for (let i = 0; i < formatted.length; i++) {
const { type, value } = formatted[i]
// @ts-expect-error[implicit-any-index-incorrect] hidden by suppressImplicitAnyIndexErrors
const pos = typeToPos[type]
if (type === "era") {
filled[pos] = value
} else if (pos !== undefined) {
filled[pos] = parseInt(value, 10)
}
}
return filled
}
const dtf = makeDtf("America/New_York")
void b.suite(
"Intl.DateTimeFormat",
b.add("hackyOffset", () => {
hackyOffset(dtf, jsDate)
}),
b.add("partsOffset", () => {
partsOffset(dtf, jsDate)
}),
b.add("makeDtf(UTC) no-cache", () => {
makeDtf("UTC")
}),
b.add("uncached partsOffset", () => {
const dtf = makeDtf("America/New_York")
partsOffset(dtf, jsDate)
}),
b.add("uncached hackyOffset", () => {
const utcDtf = makeDtf("UTC")
hackyOffset(utcDtf, utcTestDate)
const dtf = makeDtf("America/New_York")
hackyOffset(dtf, jsDate)
}),
b.cycle(),
b.complete()
)
DTF construction is very slow, so doing it twice is quite bad. I don't know why the error bars are so high for it, however. I think this math works, but let me know if it makes sense to you. Analytically, we can see that constructing a DTF takes 1MM/27,116 = 36.9us . hackyOffset takes 1MM/589,933 = 1.7us, while partsOffset takes 1MM/234,358= 4.3us we then expect a single uncached partsOffset to take 36.9us + 4.3us = 41.2us (observed is 44us) 77.2 - 41.2 = 36us to make up. We would save 4.3 - 1.7 = 2.6us per call, so 36/2.6 = 13.8 calls before our optimization pays off. Note that we may have to call |
How about just adding Settings.setUseHackyOffset()? Seems like that would allow specific applications that do a ton of formats to speed up while sacrificing a few locales, and others to not worry about it. |
This is part of a series of PRs based on performance work we have done to
improve a use-case involving parsing/formatting hundreds of thousands of dates
where luxon was the bottleneck.
This includes the commit from #1574 to establish the benchmark
This is the sketchy optimization of the bunch and I would understand not wanting to support this, but it is much, much faster. In our observations, about 1.5% of web traffic outputs in a format that doesn't parse correctly, so we fall back to
formatToParts
.The time zone offset of a date can be computed on platforms that support
it (anything that's not super ancient) by using
Intl.DateTimeFormat.formatToParts with en-US to output an array of the
date components. For legacy reasons, you can also generate a date string
using Intl.DateTimeFormat.format which can be parsed into an array using
regexes. The string/regex approach (hackyOffset) is way faster (2-4x),
but much more susceptible to weird client configurations.
This detects whether hackyOffset is able to parse a known date
correctly, and uses it if it does.
Benchmark Comparison (
name | before | after | after/before
):