Proposal: remove unpopular licenses #33467

Open
wxiaoguang opened this issue Feb 1, 2025 · 6 comments
Labels
proposal/accepted: We have reviewed the proposal and agree that it should be implemented like that/at all.
type/proposal: The new feature has not been accepted yet but needs to be discussed first.

Comments

@wxiaoguang
Contributor

There are more than 720 licenses in Gitea: https://github.com/go-gitea/gitea/tree/main/options/license

Most of them are outdated, inactive, or unpopular.

We could keep just about 20-40 popular licenses.

Benefits:

  • reduce binary size
  • speed up execution time
  • save memory
    • especially for the License Detector, which consumes more than 100MB of memory at the moment
    • with most of the unnecessary licenses removed, the License Detector could consume only 3-5MB of memory
wxiaoguang added the type/proposal label Feb 1, 2025
@delvh
Member

delvh commented Feb 2, 2025

Another benefit:

  • users are not overloaded with possible options anymore (I've often asked myself "which of these licenses could be the one I want")

delvh added the proposal/accepted label Feb 2, 2025
@wxiaoguang
Contributor Author

So, as the first step, we need to stop this:

@lunny @techknowlogick

Image

@lunny
Member

lunny commented Feb 3, 2025

> So, as the first step, need to stop this:
>
> @lunny @techknowlogick
>
> Image

A pull request could be sent to change cron-licenses.yml so that it only runs manually: #33486

@silverwind
Member

silverwind commented Feb 3, 2025

> especially the License Detector, it consumes more than 100MB memory at the moment

100MB persistent? Excluding the actual license data, it should ideally have close to zero persistent memory and only run when needed, e.g. when a repo has been pushed to. It does not matter if a license update takes a few seconds in such cases; ideally it would start a lazy goroutine, debounced on something like ~10s of repo push inactivity.
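
The debouncing idea above can be sketched roughly as follows. This is only an illustration in Python (Gitea itself is written in Go and would use a goroutine instead), and the `Debouncer` class is a hypothetical name, not an existing Gitea API. The assumption is that every push calls `trigger()`, and the expensive detection runs once the pushes have stopped for `delay` seconds.

```python
import threading


class Debouncer:
    """Run `action` once, `delay` seconds after the most recent trigger().

    Hypothetical helper for illustration; not part of Gitea.
    """

    def __init__(self, delay, action):
        self._delay = delay
        self._action = action
        self._timer = None
        self._lock = threading.Lock()

    def trigger(self):
        # Each call cancels the pending timer (if any) and starts a new one,
        # so the action only fires after `delay` seconds without a trigger.
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
            self._timer = threading.Timer(self._delay, self._action)
            self._timer.daemon = True
            self._timer.start()
```

With something like `Debouncer(10, detect_license)` per repository (where `detect_license` stands in for whatever function does the detection), nothing license-related needs to stay in memory between pushes apart from the license data itself.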

@silverwind
Member

> 20-40 popular licenses

How about keeping the 50 most popular? Gitea's user base is diverse; 20 definitely sounds too low.

lafriks pushed a commit that referenced this issue Feb 3, 2025
Help #33467
The file can be changed or removed after that issue is resolved.
@eeyrjmr
Contributor

eeyrjmr commented Feb 7, 2025

> > 20-40 popular licenses
>
> How about keeping the 50 most popular? Gitea's user base is diverse; 20 definitely sounds too low.

Maybe...
There appear to be only ~30 that are really used on GitHub:

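import pandas as pd
# GitHub Innovation Graph data: number of pushers per SPDX license
url = "https://raw.githubusercontent.com/github/innovationgraph/refs/heads/main/data/licenses.csv"
tables = pd.read_csv(url)
# Total 2024 pushers per license, sorted and plotted as horizontal bars
totals = tables[tables.year == 2024].pivot_table(values="num_pushers", index="spdx_license", aggfunc="sum")
totals.sort_values("num_pushers", ascending=False).plot(kind="barh", figsize=(9, 9))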

2024 data
Image

2023 data

Image
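
To turn a ranking like this into a concrete cutoff, one option is to check what share of all pushers the top N licenses cover. The snippet below is a minimal sketch of that calculation using only the standard library; the function name `top_n_coverage` is hypothetical and the sample numbers are made up for illustration, not taken from the Innovation Graph data.

```python
def top_n_coverage(pushers_by_license, n):
    """Return (top-n license ids, share of all pushers they cover)."""
    ranked = sorted(pushers_by_license.items(), key=lambda kv: kv[1], reverse=True)
    total = sum(pushers_by_license.values())
    kept = ranked[:n]
    share = sum(count for _, count in kept) / total if total else 0.0
    return [name for name, _ in kept], share


# Made-up numbers for illustration only.
sample = {"MIT": 500, "Apache-2.0": 300, "GPL-3.0": 150, "BSD-3-Clause": 40, "Unlicense": 10}
kept, share = top_n_coverage(sample, 3)
print(kept, f"{share:.0%}")  # ['MIT', 'Apache-2.0', 'GPL-3.0'] 95%
```

Running the same function over the real per-license totals for N = 20, 30, 40, and 50 would show how much each additional batch of licenses actually adds.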
