This repository has been archived by the owner on Aug 21, 2024. It is now read-only.
I have a site with ~1300 documents and wanted to improve its render performance.
Two main observations:
PATH_CACHE is only read for formatted_last_modified_date, and not for last_modified_at_time
it looks like calls to git log ... scale linearly with the number of documents using last_modified_at within a liquid tag
re: PATH_CACHE usage
If both formatted_last_modified_date and last_modified_at_time read from PATH_CACHE, the initial site render is unaffected, but subsequent renders (e.g. after a site reset, when Jekyll detects a change while running jekyll serve) get faster.
Pros:
very minimal patch footprint
regeneration time for ~1300 documents went from ~28s to ~4s
Caveats:
initial generation is unaffected
some users may have come to expect / depend on a "live" call to git log or mtime. Changing this would serve them cached time data
it was unclear to me if there was reason for the separation; I may have missed something!
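To make the idea concrete, here is a minimal sketch of both readers going through one shared cache. It is not the plugin's actual code: the class name, the constructor signature, the lookup block (a stand-in for the expensive git log or mtime call), and the default format string are all assumptions for illustration.

```ruby
# Hypothetical stand-in for the plugin's Determinator. Only the names
# PATH_CACHE, last_modified_at_time and formatted_last_modified_date
# come from the issue text; everything else is assumed.
class CachedDeterminator
  PATH_CACHE = {} # path => Time, shared by every instance

  def initialize(path, format = "%d-%b-%y", &lookup)
    @path = path
    @format = format   # assumed default format
    @lookup = lookup   # assumed: the expensive git log / mtime call
  end

  # The raw time is looked up once per path and then served from the cache.
  def last_modified_at_time
    PATH_CACHE[@path] ||= @lookup.call(@path)
  end

  # The formatted reader reuses the same cached value.
  def formatted_last_modified_date
    last_modified_at_time.strftime(@format)
  end
end

calls = 0
lookup = ->(_path) { calls += 1; Time.utc(2024, 1, 2) }
doc = CachedDeterminator.new("docs/a.md", &lookup)
doc.last_modified_at_time
doc.formatted_last_modified_date
CachedDeterminator.new("docs/a.md", &lookup).last_modified_at_time
puts calls # the lookup ran once for this path
```

Because the cache outlives any single instance, a regeneration that rebuilds every document object still skips the expensive call. That is also the source of the "cached time data" caveat above.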
re: git log ... calls scaling w/ number of documents
Both initial site render and subsequent renders will see improvement if we replace the 1:1 calls with a single git log call and cache its data. The call ends up fast enough that we can flush the cache during reset so users will always have a freshly determined last_modified_at (presumed to be preferable).
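As a rough sketch of the single-call approach, the snippet below parses one git log run into a path-to-time map. The helper names and the "commit-time:" marker prefix are assumptions made for this example, not part of the plugin. The git options used (--name-only, --format with %ct) are standard, and git log lists commits newest first by default, so the first time a path appears is its most recent change.

```ruby
require "open3"

# Assumed marker prefix so timestamp lines can be told apart from paths.
MARKER = "commit-time:"

# Parse `git log --name-only` output (newest commit first) into path => Time.
# Accepts a String or an IO, since both respond to each_line.
def parse_git_log(io)
  times = {}
  current = nil
  io.each_line do |line|
    line = line.chomp
    next if line.empty?
    if line.start_with?(MARKER)
      current = Time.at(Integer(line.delete_prefix(MARKER))).utc
    elsif current
      times[line] ||= current # first sighting is the most recent commit
    end
  end
  times
end

# One git invocation for the whole site instead of one per document.
# Returns an empty hash when git fails (e.g. the directory is not a repo).
def last_modified_times(repo_dir)
  out, status = Open3.capture2(
    "git", "log", "--name-only", "--format=#{MARKER}%ct", chdir: repo_dir
  )
  status.success? ? parse_git_log(out) : {}
end
```

The resulting hash can be rebuilt on each site reset, which matches the "flush the cache during reset" behaviour described above. Paths that never show up in it (uncommitted files) would still need an mtime fallback.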
Pros:
feature-gated / off by default
initial generation and regeneration time for ~1300 documents went from ~28s to ~4s
Caveats:
larger patch footprint than PATH_CACHE usage
it is plausible that sites with very large git log histories (e.g. lots of commits, lots of file churn, or both) would run into issues here. The repo I tested with has ~1800 commits without a lot of file churn. It would be possible to store the paths passed to Determinator.new and stop reading the git log once every file has been encountered, but it would be a messy implementation.
some users may have come to expect cached data for the formatted last_modified_at
some users may have a large number of uncommitted files (e.g. be using jekyll without a git repo) and the time saved by caching mtime data between rerenders is significant. I don't think this is probable but at least wanted to mention it.
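For the "stop reading once every file has been encountered" idea in the caveats above, a minimal sketch could look like the following. The function name, its parameters, and the "commit-time:" marker are hypothetical. It assumes log lines arrive newest first in the shape produced by git log --name-only with a matching --format.

```ruby
require "set"

# Read log lines lazily and stop once every wanted path has a timestamp.
# `lines` is any enumerable of lines; `marker` is an assumed prefix that
# distinguishes timestamp lines from path lines.
def times_for(wanted_paths, lines, marker: "commit-time:")
  pending = Set.new(wanted_paths)
  times = {}
  current = nil
  lines.each do |line|
    break if pending.empty? # every wanted path resolved, stop reading
    line = line.chomp
    next if line.empty?
    if line.start_with?(marker)
      current = Time.at(Integer(line.delete_prefix(marker))).utc
    elsif current && pending.delete?(line)
      times[line] = current # newest commit touching this path
    end
  end
  times
end

# Streaming usage (assumes `paths` holds the paths given to Determinator.new):
# IO.popen(["git", "log", "--name-only", "--format=commit-time:%ct"]) do |io|
#   times_for(paths, io.each_line)
# end
```

A path that is never committed stays in the pending set, so the whole log is still read in that case. That is one reason this variant gets messy for sites with many uncommitted files.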
An example of the PATH_CACHE usage implementation is at https://github.com/klandergren/jekyll-last-modified-at/tree/use-path-cache
An example of the single git log call implementation is at https://github.com/klandergren/jekyll-last-modified-at/tree/cache-git-information
I will open each of these improvement approaches as separate pull requests so you can evaluate.
Thanks for creating this plugin!