Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unsanitize user and org names in DB #4762

Open
wants to merge 26 commits into
base: main
Choose a base branch
from
Open

Conversation

pat-s
Copy link
Contributor

@pat-s pat-s commented Jan 23, 2025

fix #3614

As discussed in chat, preferred to use non-sanitized values everywhere.

  • Store user and org names using the casing provided by the forge
  • Allow to search for users / orgs using case insensitive names
  • Removed all sanitation in the code
  • AFAICS only the org names were stored sanitized in the DB. User names and repo names are not affected

@pat-s pat-s added bug Something isn't working server labels Jan 23, 2025
Copy link

codecov bot commented Jan 23, 2025

Codecov Report

Attention: Patch coverage is 40.00000% with 18 lines in your changes missing coverage. Please review.

Project coverage is 28.28%. Comparing base (fac744d) to head (e638c62).
Report is 11 commits behind head on main.

Files with missing lines Patch % Lines
...ore/migration/024_unsanitize_org_and_user_names.go 39.28% 14 Missing and 3 partials ⚠️
server/store/datastore/migration/common.go 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4762      +/-   ##
==========================================
- Coverage   28.29%   28.28%   -0.01%     
==========================================
  Files         398      399       +1     
  Lines       28295    28318      +23     
==========================================
+ Hits         8005     8011       +6     
- Misses      19580    19594      +14     
- Partials      710      713       +3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@pat-s pat-s changed the title Sanitize user names in DB to match org names Unsanitize user and org names in DB Jan 23, 2025
@pat-s pat-s mentioned this pull request Jan 23, 2025
2 tasks
@xoxys
Copy link
Member

xoxys commented Jan 23, 2025

Tests are failing.

server/store/datastore/migration/common.go Outdated Show resolved Hide resolved
server/store/datastore/org.go Show resolved Hide resolved
server/store/datastore/repo.go Outdated Show resolved Hide resolved
@xoxys
Copy link
Member

xoxys commented Jan 24, 2025

After thinking of it again, I'm not sure if thats the best approach. Cant really say why but I see a lot of potential for issues and corner cases. Looks like this issue only occurs with forgejo/gitea and the api returns mixed capitalization while the forge only supports case-insensitive values. What do you think?

Edit: Tested it. You can create an org "Foo" in gitea and in that case its even displayed as "Foo" in the url.... But you cant create "foo" with the error "The organization name is already taken." This is a somewhat inconsistent behavior upstream, however I think we should switch back to the initial approach from @pat-s

@pat-s
Copy link
Contributor Author

pat-s commented Jan 27, 2025

If all users are sanitized, the user list in the UI would also reflect a non-optimal state.

The simplest fix which also wouldn't require a migration would probably be to just sanitize the comparison call of user.Login and org.Name?

@qwerty287
Copy link
Contributor

Why does the license header of the migration start with // Excerpt from:?

@xoxys
Copy link
Member

xoxys commented Feb 1, 2025

Ok I dont have time to look into this on my own. Feel free to go on if you think the current state is fine.

@xoxys
Copy link
Member

xoxys commented Feb 1, 2025

In best case we add a test case. Try to write "org1/foo" and "org1/Foo" to the db. If it fails, we are good to go if not we should still add db level restrictions.

@pat-s
Copy link
Contributor Author

pat-s commented Feb 1, 2025

Added two tests which explicitly check that attempting to add a case-sensitive duplicate for orgs and users results in an error.

@xoxys
Copy link
Member

xoxys commented Feb 1, 2025

Thanks!

xoxys
xoxys previously approved these changes Feb 1, 2025
@xoxys xoxys dismissed their stale review February 1, 2025 12:44

Comments

@woodpecker-bot
Copy link
Contributor

woodpecker-bot commented Feb 2, 2025

Deployment of preview was successful: https://woodpecker-ci-woodpecker-pr-4762.surge.sh

@anbraten anbraten requested a review from xoxys February 2, 2025 14:13
@anbraten anbraten mentioned this pull request Feb 2, 2025
@xoxys
Copy link
Member

xoxys commented Feb 2, 2025

Code LGTM, have you tested the PR locally?

@xoxys
Copy link
Member

xoxys commented Feb 4, 2025

@anbraten Something is wrong. I have applied this PR to an existing WP instance. While I was able to log-in with the user before this PR, this now fails:

woodpecker-server | {"level":"error","time":"2025-02-04T07:54:23Z","caller":"/build/server/api/login.go:228","message":"user org is not from the same forge woodpecker"}

@xoxys
Copy link
Member

xoxys commented Feb 4, 2025

I also noticed this sql errors in the logs:

woodpecker-db       | 2025-02-04 07:51:30.475 UTC [33] ERROR:  relation "migrations" does not exist at character 22
woodpecker-db       | 2025-02-04 07:51:30.475 UTC [33] STATEMENT:  SELECT count(*) FROM "migrations"

but the table is called migration. This looks like migrations at all are no longer working?

Edit: This seems to be unrelated https://github.com/woodpecker-ci/woodpecker/blob/main/server/store/datastore/migration/migration.go#L84

server/api/login.go Outdated Show resolved Hide resolved
@pat-s
Copy link
Contributor Author

pat-s commented Feb 6, 2025

but the table is called migration. This looks like migrations at all are no longer working?

I've seen this error before a few times.
Not sure how "blocking" it is as all past migrations were successfully applied in all my instances.
But it looks like this is a typo, yes. The table is called "migration".

While I was able to log-in with the user before this PR, this now fails:

This is worrisome. Can you check which forge ID got assigned to the user org? Actually, tests should catch this.

Co-authored-by: Robert Kaussow <[email protected]>
Comment on lines 87 to 85
name = strings.ToLower(name)
org := new(model.Org)
return org, wrapGet(sess.Where("name = ?", name).Get(org))
return org, wrapGet(sess.Where("LOWER(name) = ?", strings.ToLower(name)).Get(org))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be a mistake? I thought we don't want to sanitize calls/entries in the DB but only to the forge API.

Also removing sanitation in l87 and then re-adding it seems strange.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is to have searching case insensitive and storing case sensitive (the same way how forges do it, you can use both ways in urls etc and it shows the casing in the UI etc the user used).

Also removing sanitation in l87 and then re-adding it seems strange.

It was previously missing the lower part on both sides. The change makes it unique with other calls.

@xoxys
Copy link
Member

xoxys commented Feb 6, 2025

This is worrisome. Can you check which forge ID got assigned to the user org? Actually, tests should catch this.

woodpecker_server=# select * from orgs;
 id | forge_id |    name    | is_user | private 
----+----------+------------+---------+---------
  1 |        0 | woodpecker | t       | f

forge_id is 0 while only one forge exists with id 1:

woodpecker_server=# select * from forges;
 id | type  |             url              |                client                |                      client_secret                       | skip_verify | oauth_host | additional_options 
----+-------+------------------------------+--------------------------------------+----------------------------------------------------------+-------------+------------+--------------------
  1 | gitea | http://woodpecker-gitea:3000 | **** | **** | f           |            | {}
(1 row)

@anbraten
Copy link
Member

anbraten commented Feb 6, 2025

Strange, I started to add a method to the storage interface to directly get an org by name and forge-id, that might prevent such cases. Forge-id 0 sounds like not set at all (gos default for not setting a number).

@xoxys
Copy link
Member

xoxys commented Feb 6, 2025

Yeah maybe we mixed in too much different fixes into this PR? Can we revert it to focus on the sanitizing issue and move other improvements to another PR?

I have a backup of the DB before the migration from this PR so I can also re-test this PR.

@anbraten
Copy link
Member

anbraten commented Feb 7, 2025

Yeah maybe we mixed in too much different fixes into this PR? Can we revert it to focus on the sanitizing issue and move other improvements to another PR?

Extracted most other changes to #4817

@xoxys xoxys mentioned this pull request Feb 7, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working server
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CamelCase usernames get added as lower-case into DB and result in access issues
5 participants