-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make FixedTimeZone isbits #354
Conversation
Depends on: JuliaData/WeakRefStrings.jl#78 |
Thanks, this seems promising. Out of curiosity do you also want |
No problem! For now I'm focusing on |
I think we should probably switch to I also wonder if there's anyway to use even shorter InlineStrings... 🤔 |
Great! I think using
Funny you should ask... As an experiment I implemented a I still think we should proceed with using |
6a1e2c2
to
2672ff1
Compare
Worth noting: This will cause issues with projects using CSV.jl until it's next release, since v0.8.5 doesn't support WeakRefStrings 1.2 |
It's doubtful anyone will explicitly require this latest release so falling back to a slightly older TimeZones.jl should be fine |
WeakRefStrings.jl has a Julia minimum of 1.3. All of its dependencies are set to use Julia 1 and the PR that bumped up the minimum was: JuliaData/WeakRefStrings.jl#73. From looking over that PR it seems like WeakRefStrings.jl could be set to use |
eb7c954
to
56a254f
Compare
I'm looking into the performance of this change but I haven't benchmarked with Julia before, so here's what I did and the results: First I created a test script, using TimeZones, BenchmarkTools
BenchmarkTools.DEFAULT_PARAMETERS.gcsample = true
function bm_tzs()
map( t -> TimeZone(t, TimeZones.Class(:ALL)), timezone_names())
TimeZones._reset_tz_cache()
GC.gc()
end
@benchmark bm_tzs()
For both master and my branch I switched to each like so: > git checkout <branch name>
> git clean -dfX && julia --project=@. -e 'using Pkg; Pkg.build("TimeZones")' Then I tested each with the following Julia commands: julia --project=@. -t 1 include("test.jl") Here are the results for master from one run:
And from my branch:
I'm going to follow up and make sure they use the same number of samples, but so far this looks good 🤷 |
Reran with using TimeZones, BenchmarkTools
BenchmarkTools.DEFAULT_PARAMETERS.gcsample = true
BenchmarkTools.DEFAULT_PARAMETERS.seconds = 300
BenchmarkTools.DEFAULT_PARAMETERS.samples = 100
function bm_tzs()
map( t -> TimeZone(t, TimeZones.Class(:ALL)), timezone_names())
TimeZones._reset_tz_cache()
GC.gc()
end
@benchmark bm_tzs()
Master results:
this branch's results:
|
src/types/fixedtimezone.jl
Outdated
h = hash(tz.name, h) | ||
h = hash(tz.offset, h) | ||
return h | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to overload == and hash?
I expected making it isbits would decrease that need.
src/types/timezone.jl
Outdated
# Note: If the class `mask` does not match the time zone we'll still load the | ||
# information into the cache to ensure the result is consistent. | ||
tz, class = get!(_tz_cache(), str) do | ||
tz_path = joinpath(TZData.COMPILED_DIR, split(str, "/")...) | ||
tz, class = get!(_tz_cache(), String(str)) do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this required?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed pretty much all the changes that cast to strings, seems like it's handled better now
src/types/timezone.jl
Outdated
tz, class = get!(_tz_cache(), str) do | ||
tz_path = joinpath(TZData.COMPILED_DIR, split(str, "/")...) | ||
tz, class = get!(_tz_cache(), String(str)) do | ||
tz_path = joinpath(TZData.COMPILED_DIR, split(String(str), "/")...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An issue should be openned on InlineStrings.jl if this is required
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
src/types/timezone.jl
Outdated
elseif occursin(FIXED_TIME_ZONE_REGEX, str) | ||
FixedTimeZone(str), Class(:FIXED) | ||
elseif occursin(FIXED_TIME_ZONE_REGEX, String(str)) | ||
FixedTimeZone(String(str)), Class(:FIXED) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this required?
It seems like it is just going to be converted to a InlineString anyway so why convert to a String first?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
I think we can do better. VariableTimeZones are named after locations. I found before they all fit a ShortString15. |
Switched to InlineString15, here are the results if I test without specifying
|
339f04f
to
5b24406
Compare
I can't think of any more code changes or cleanup needed, so if everything looks good this can be merged |
src/types/fixedtimezone.jl
Outdated
function Base.isequal(a::FixedTimeZone, b::FixedTimeZone) | ||
return isequal(a.name, b.name) && isequal(a.offset, b.offset) | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to overload this?
(sorry I didn't mention it before)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
other than 1 last comment about isequal
this looks good to me.
(I don't have merge rights though)
Looks like this PR effectively lower-bounds Julia to 1.3 through the InlineStrings dependency, so it seems it has the same problems as the old WeakRefStrings. This means that CI won't run as-is. |
This should go without saying but CI should be passing before merge |
I've switched to ShortStrings instead, which does support Julia 1.0. The only quirk so far is with regex matching, but I've also only tested locally with Julia 1.6 |
2d95761
to
513aff6
Compare
Codecov Report
@@ Coverage Diff @@
## master #354 +/- ##
==========================================
- Coverage 93.75% 92.48% -1.27%
==========================================
Files 31 31
Lines 1553 1477 -76
==========================================
- Hits 1456 1366 -90
- Misses 97 111 +14
Continue to review full report at Codecov.
|
I messed up updating the project.toml, which I think is what caused the benchmarking failure I also re-ran performance tests after switching to ShortStrings, and here are the results:
|
Not sure but I think there may also be a code bug in the benchmarks based on this error: https://github.com/JuliaTime/TimeZones.jl/pull/354/checks?check_run_id=3828334686#step:5:263 |
I would rather not, we are wanting to deprecate ShortStrings.jl for InlineStrings. I wonder why InlineStrings doesn't support 1.0 |
513aff6
to
3f2271b
Compare
My best guess is that the older version of TimeZones is being built (in the compiled folder), and that is being used when benchmarking, which causes deserializing from those files to fail. To work around this I've pushed a change that moves checkout to before package building, similar to the julia tests in the package |
2c3651c
to
9c1e835
Compare
This should resolve the benchmarking issue: #360 I've tested running the CI code locally, replicated the issue, and fixed it by using this code as the base. I'll add more details on how I tested it in this MR |
9c1e835
to
f8dd12e
Compare
@omus can we revert the commit changing to ShortStrings and drop 1.0 support instead? |
I'm okay with that. We should drop 1.0 support in a separate PR and be sure to remove all of the |
I will make that PR presently |
f8dd12e
to
d53218d
Compare
Reran the custom benchmark script This branch:
master:
|
I'll be making the 1.6.0 release today |
Co-authored-by: Curtis Vogt <[email protected]>
This switches FixedTimeZone.name to an InlineString instead of a String, and doing that makes it an isbits type
As a side effect this also limits the number of bytes in a FixedTimeZone name
Usually timezones are named after locations, and the longest location name currently is Taumatawhakatangihangakoauauotamateaturipukakapikimaungahoronukupokaiwhenuakitanatahu. This is in New Zealand, so if a fixed time zone were created for it then it would be
Oceania/Taumatawhakatangihangakoauauotamateaturipukakapikimaungahoronukupokaiwhenuakitanatahu
That is still only 93 characters, so the 255 character limit should be fine for a while, and we could switch to InlineString127 if needed as well
Related issue: #271