-
-
Notifications
You must be signed in to change notification settings - Fork 9.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
livecheck: add GithubLatest strategy #9381
Conversation
Review period will end on 2020-12-03 at 12:59:20 UTC. |
You must have picked up my brainwaves, as this was the first PR I was going to create after finishing the docs PR, ahah. I've had this strategy implemented locally for months but I didn't have the opportunity to sneak it in during GSoC (and post-GSoC I promised myself to the docs before working on notable changes to livecheck myself). This strategy's definitely needed (and will save us from replicating similar |
Haha, looking forward to your review. I wanted to add this strategy earlier but got a little too occupied with other stuff 😅 |
🙈 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other than a few (minor) comments below (above?), 👍 from me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll take another pass through this later (specifically the documentation comments) but this should help to move things along.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Outside of the areas where I could add suggestions above, we should skip #preprocess_url
for the GithubLatest
strategy as well (in livecheck/livecheck.rb#latest_version
). Currently, the URL is processed into the .git
URL and then the GithubLatest
strategy uses that to create the "latest" URL, so the intermediate preprocessing step is simply a waste of effort.
I can push a commit after the suggested changes have been incorporated but here's a patch that addresses this:
diff --git a/Library/Homebrew/livecheck/livecheck.rb b/Library/Homebrew/livecheck/livecheck.rb
index 953126526055eacbacced7e17923718b60509b35..07cba1f076aac36891b0caf166b807d40db532ed 100644
--- a/Library/Homebrew/livecheck/livecheck.rb
+++ b/Library/Homebrew/livecheck/livecheck.rb
@@ -25,6 +25,11 @@ module Homebrew
lolg.it
].freeze
+ STRATEGY_SYMBOLS_TO_SKIP_PREPROCESS_URL = [
+ :github_latest,
+ :page_match,
+ ].freeze
+
UNSTABLE_VERSION_KEYWORDS = %w[
alpha
beta
@@ -381,8 +386,8 @@ module Homebrew
next
end
- # Do not preprocess the URL when livecheck.strategy is set to :page_match
- url = if livecheck_strategy == :page_match
+ # Only preprocess the URL when it's appropriate
+ url = if STRATEGY_SYMBOLS_TO_SKIP_PREPROCESS_URL.include?(livecheck_strategy)
original_url
else
preprocess_url(original_url)
I'm using an array for the symbols because brew style
will complain if we do url = if livecheck_strategy == :page_match || livecheck_strategy == :github_latest
(i.e., "Avoid comparing a variable with multiple items in a conditional, use Array#include? instead.").
For what it's worth, I'm working on updating the formulae in homebrew/core that currently use a GitHub "latest" URL in the Edit: The homebrew/core PR can be found at Homebrew/homebrew-core#66134. I've labeled it "do not merge" until we finish and merge this PR. I already tested these changes (using this PR and all the changes mentioned here), identified any issues, and either addressed them in that PR or through suggested changes here. |
Review period ended. |
@samford, I've incorporated the suggestions from your review, except the patches. Please do push any other commit(s) you have, thanks! |
I pushed some commits to address anything else that was left. Namely:
I did full before/after livecheck runs across homebrew/core with the current version of this PR and the related homebrew/core PR that updates Everything seems to be working properly from a functional standpoint now, so this just needs some final review to identify any areas that should be addressed before merging. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! 👍
I think this can be merged as is or improved a tiny bit following some or all of my comments below.
# Only preprocess the URL when it's appropriate | ||
url = if STRATEGY_SYMBOLS_TO_SKIP_PREPROCESS_URL.include?(livecheck_strategy) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
preprocess_url
function (still) has the following bit:
brew/Library/Homebrew/livecheck/livecheck.rb
Lines 327 to 332 in 24f7898
if host.end_with?("github.com") | |
return url if path.match? %r{/releases/latest/?$} | |
owner, repo = path.delete_prefix("downloads/").split("/") | |
url = "#{scheme}://#{host}/#{owner}/#{repo}.git" | |
elsif host.end_with?(*GITEA_INSTANCES) |
Shall we remove or make changes to it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My thinking is that /releases/latest
URLs shouldn't ever be processed and keeping this guard in place ensures this continues to be true regardless of whether #preprocess_url
is skipped for some strategies.
Removing this guard would mess up checks for any formulae that use a /releases/latest
URL in a livecheck
block (i.e., not updated to use strategy :github_latest
). PageMatch
checks only skip preprocessing when strategy :page_match
is present in the livecheck
block (i.e., checks that implicitly use PageMatch
don't skip preprocessing). Homebrew/homebrew-core#66134 will address this for homebrew/core but other taps could be affected if we removed this.
There isn't any notable benefit to removing this simple guard but there are clear detriments under certain circumstances. With that in mind, I think we should simply keep this guard.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it - thank you! Perhaps it would be helpful to add a comment to that block above explaining which cases it catches (formulae in non-homebrew/core taps + <some conditions> ) -- otherwise it's not obvious why we still check for github.com
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since I'm not the target audience, do you think something short like # Handles any `livecheck` blocks not using the `GithubLatest` strategy
would be sufficient? This guard would also apply if a check that uses a /releases/latest
URL mistakenly sneaks into homebrew/core, so I don't think we need to go into the details of when we're most likely to encounter this situation (e.g., non-core taps).
# A priority of zero causes livecheck to skip the strategy. We do this | ||
# for {GithubLatest} so we can selectively apply the strategy using | ||
# `strategy :github_latest` in a `livecheck` block. | ||
PRIORITY = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously, 0 priority was equivalent (specific) to the PageMatch strategy.
Now, we introduce another strategy with the same priority. This makes the way we write conditionals that process these priorities important.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original idea was that PRIORITY = 0
would cause a strategy to be omitted unless it's specified in a livecheck
block (e.g., strategy :github_latest
). I think the updated logic in Livecheck::Strategy#from_url
more accurately represents this, as it now ensures the strategy can be applied when specified in a livecheck
block.
Since PageMatch
was the only strategy with a priority of zero up to this point and it was special-cased, I forgot to account for the from_symbol(livecheck_strategy) != strategy
condition originally. Outside of this omission, we haven't treated PRIORITY = 0
as being equivalent to strategy == PageMatch
in livecheck's code and we specifically reference PageMatch
(or :page_match
) where appropriate.
PageMatch
is an exception to the above idea, rather than the rule. The PageMatch
strategy will automatically apply if a livecheck
block contains a url
and regex
, so it's not omitted like other zero-priority strategies. The reasons why it has a priority of zero is to be able to enforce the regex requirement before applying it and to also deprioritize it with respect to more specific strategies (e.g., those with the DEFAULT_PRIORITY
of 5 or a higher PRIORITY
).
I've thought about updating the #match?
method in strategies to allow for contextual information to be passed in. I imagine this would either be the Livecheck
object (i.e., the information in the formula's livecheck
block) or maybe an optional options
hash (to allow for more flexibility/extensibility in the future). With this setup, we could give PageMatch
a priority of 1 (deprioritized but not omitted) and enforce the regex
requirement in PageMatch#match?
.
I believe this is a better setup overall, as we wouldn't violate the PRIORITY = 0
omission idea above and also it would allow us to keep this information in strategies instead of Strategy#from_url
. I'll create a PR for this after I've finished the livecheck docs PR (and I allow myself to work on livecheck's internals again) and we can discuss it further there.
That said, I believe the related changes in this PR are sufficient for now but let me know if I've overlooked anything important.
!strategy::PRIORITY.positive? && | ||
from_symbol(livecheck_strategy) != strategy | ||
# Ignore strategies with a priority of 0 or lower, unless the | ||
# strategy is specified in the `livecheck` block |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We explicitly state on line 17 of this file that strategies with priorities of 0 or lower are ignored.
Shall we expand and say that "negative priorities raise an error (they don't at the moment, but they probably should), 0 priority is reserved for..., all normal strategies should use positive values below 10" ?
Why do we allow such strategies when they're specified in the livecheck
block? I guess this comment should explain why we're doing this instead of or in addition to what it currently says.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the comment on line 15 mentions, the 0-10 range is informal. It's technically possible for us to go above or below this range if it becomes necessary in the future and that's why we don't strictly enforce the range in the code.
The only ideas that we strictly enforce in the code are:
- The default priority is 5
- Strategies with a non-positive priority (0 or below) are omitted unless specified in the
livecheck
block - Strategies that apply to a URL are arranged from highest priority to lowest
I do think this explanation can be improved a little (e.g., I didn't mention here that the way to apply a strategy with a non-positive priority is using strategy
in a livecheck
block), so I'll take care of that in a moment.
# @param regex [Regexp] a regex used for matching versions in content | ||
# @return [Hash] | ||
def self.find_versions(url, regex = nil) | ||
%r{github\.com/(?:downloads/)?(?<username>[^/]+)/(?<repository>[^/]+)}i =~ url.sub(/\.git$/i, "") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked Homebrew/core, and the following packages use /downloads/
subfolder:
- apktool.rb
- btpd.rb
- cssembed.rb
- growly.rb
- henplus.rb
- jadx.rb
- midgard2.rb
- rpg.rb
- syck.rb
- tidyp.rb
- wrk-trello.rb
My understanding is that /downloads
folders have been deprecated for a while now -- see https://github.blog/2012-12-12-goodbye-uploads/. I wonder if we should remove downloads
from here and update the above formulae to use proper download paths.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've already been through the /downloads/
formulae recently (as part of another PR) and I took care of the straightforward ones, so this is what remains. As you would guess, most of these are highly outdated and tend to have low install counts in our analytics. The repository uploads URLs continue to resolve, so updating these is low priority in comparison to work that can have a greater impact.
Can potentially be updated
apktool.rb
: The/downloads/
URL is a resource from another GitHub repository that's used in thetest
block but I didn't see an alternative location for this particular file. The formula could be updated to use a differentapk
if a suitable replacement can be found.btpd.rb
: The formula would need to be updated to build using the0.16
tag archive.jadx.rb
: This uses the sameapk
file as theapktool
formula, so the situation is the same.midgard2.rb
: The formula would need to be updated to build using the12.09.1
tag archive.rpg.rb
: The formula would need to be updated to build using the0.3.0
tag archive. This release is from 2010, the repository is archived (I deprecated the formula in the past), and there have only been 12 installs over the past year, so it may not be worth the effort.tidyp.rb
: The formula would need to be updated to build using the1.04
tag archive.
Unlikely that we can update
cssembed.rb
:stable
is ajar
file and I don't see alternatives. The GitHub repository is archived (I deprecated the formula in the past), so it's not worth the effort.growly.rb
: The repository doesn't have any tags or releases (and we're not going to use thegrowly
file frommaster
). Besides that, the code hasn't been modified in nine years and the repository doesn't have a license file (missing license file for the repo ryankee/growly#7).henplus.rb
: The repository doesn't have any tags or releases and the formula is deprecated since it doesn't build. There's been some activity here and there in recent years but we're stuck unless they create a release.syck.rb
: This is one of the two formulae using agithub.s3.amazonaws.com/downloads
URL. There aren't any alternatives (to my knowledge) and it hasn't been updated in years (it's a project by why the lucky stiff and I don't imagine him coming back anytime soon).wrk-trello.rb
: This is one of the two formulae using agithub.s3.amazonaws.com/downloads
URL. There aren't any alternatives (to my knowledge) and it hasn't been updated in eight years.
Thinking about this some more, do we need to add some RuboCops for this strategy (in the
The second one here could potentially be implemented more like "if |
The rubocops would be useful, in my opinion. Shall I go ahead and add them (unless you already have them, that is)? |
You're more familiar with the |
Thanks again @nandahkrishna! |
Thanks everyone. I'm working on a new PR for the RuboCops, I hope to have it out soon with more details. |
brew style
with your changes locally?brew tests
with your changes locally?brew man
locally and committed any changes?This PR adds a
GithubLatest
strategy for Livecheck, to check the latest release marked on a project's GitHub repository.If my
grep
ing was right, there are about 214 Formulae withlivecheck
blocks to check the latest release, so this strategy would reduce replication of standard URL formats and regexes across multiple Formulae.There aren't too many exceptions to the default regex (eg: when the tag name contains the Formula name, non-version characters or version numbers separated by
-
or_
) – I was unsure whether to factor in those cases as well.