Figure out how to block or filter out sources of code of conduct violations #77

wlandau · 2024-07-19T15:39:04Z

Using the GitHub API, it would be simple to check if the author of the PR is the same GitHub user name as the owner of the package source code repo. @shikokuchuo, what do you think?

shikokuchuo · 2024-07-19T15:43:01Z

Does that work if the repo is part of an organization such as ROpenSci?

wlandau · 2024-07-19T15:44:00Z

It would flag all those for manual review, which could get heavy-handed.

shikokuchuo · 2024-07-19T15:56:15Z

A variant would be to maintain a whitelist of approved contributors i.e. those which have contributed at least one package verified in this manner (or manually checked by a moderator).

wlandau · 2024-08-22T02:12:21Z

I think it may be easier to track code of conduct violations (package and user names) in a confidential repo, then have the bot use the data for pull request reviews in https://github.com/r-multiverse/contributions.

shikokuchuo · 2024-08-27T20:04:55Z

A blacklist is also fine, but this can still be public. If it's transparent then mistakes can be easily corrected.

wlandau · 2024-08-27T20:17:50Z

A blacklist is also fine, but this can still be public. If it's transparent then mistakes can be easily corrected.

I worry about defacement attacks: an adversary's username could contain hate speech, and I'm not sure I would feel comfortable maintaining a list of such usernames in a public-facing repo.

shikokuchuo · 2024-08-27T20:23:04Z

In that case, I'd prefer your original idea - and if that requires more manual review for orgs then so be it. Even if it's private, we'd still be storing such material, which doesn't strike me as a good thing.

wlandau · 2024-08-29T12:20:57Z

Hmmm.... it seems like that would disrupt the contribution workflow and accumulate a large database on the off chance someone behaves maliciously. A banned list would be much shorter, and we would rarely have to update it.

Is your main objection the part about the private repo, i.e. archiving a secret store of user data? In that case, I think I would be willing to maintain a public banned list if the list entries were masked just enough to avoid offending or traumatizing casual observers. Maybe base64 encode the whole thing, formatting included, and add obvious documentation warnings to not decode it? Maybe even encode it twice so people get a second warning and an encoded string the first time they try.

wlandau · 2024-08-29T12:26:18Z

Also, I really hope GitHub can help us, i.e. allowing us to report users as needed. Not sure what their policies and procedures are exactly.

shikokuchuo · 2024-08-29T12:31:50Z

Is your main objection the part about the private repo, i.e. archiving a secret store of user data?

My concern is not about storing the data per se, but the doing it in secret part. It seems to go against the transparency that all this open infrastructure allows. It also prevents verification that due process is being observed if such non-public lists are maintained.

I think I'm onboard with the approach of obfuscation using base64, or perhaps we can think of something better later on.

wlandau · 2024-09-03T12:34:24Z

Using the GitHub API, it would be simple to check if the author of the PR is the same GitHub user name as the owner of the package source code repo.

Actually, maybe this could work if the GitHub API can tell the difference between orgs and users. For users, we could check if the user is the owner. For orgs, we could check if the user is a member of the org. If this approach works, it seems preferable because

It is stateless.
It decreases anonymity just a bit.

shikokuchuo · 2024-09-03T13:13:50Z

Using the GitHub API, it would be simple to check if the author of the PR is the same GitHub user name as the owner of the package source code repo.

Actually, maybe this could work if the GitHub API can tell the difference between orgs and users. For users, we could check if the user is the owner. For orgs, we could check if the user is a member of the org.

Seems possible - with the /orgs/{org}/repos and /orgs/{org}/members endpoints.

This is a nice approach, but do you think it would generate too many manual reviews for community contributed packages not from the maintainer e.g. Arrow.

wlandau · 2024-09-03T15:23:23Z

This is a nice approach, but do you think it would generate too many manual reviews for community contributed packages not from the maintainer e.g. Arrow.

Maybe. But on the other hand, if a maintainer doesn't want their package in R-multiverse, then I think it's a good idea to make sure we respect their wishes.

shikokuchuo · 2024-09-04T09:14:18Z

Sure, that's fine. The example I had in mind was rather: I want to add x package to Multiverse for it to build the binaries I need (or so it's within the Production release) - and I am not the maintainer of said package. This would be flagged for review. I'm thinking this could generate too much noise.

wlandau · 2024-09-04T16:03:43Z

That's a good point. Nowadays maintainers just talk to each other when this comes up for CRAN, and we may want to encourage that, but we might still get more manual reviews. Not sure what the right call is here.

Looming in the back of my mind are also general questions about security: how do we identify contributors and prepare to hold them accountable if packages have malicious code or illegal files? Discharging that risk will help me think about this.

eitsupi · 2024-09-04T23:30:04Z

I would simply like to add another example. I am effectively the maintainer of https://github.com/pola-rs/r-polars, but I am not a member of the organization https://github.com/pola-rs.

wlandau · 2024-09-10T16:35:07Z

Yeah, I think at this point I'd be more in favor of a publicly-maintained base64-encoded "banned" list.

wlandau · 2024-11-17T14:53:38Z

I just found out about GitHub's feature of organization-level moderators. We can add org moderators at https://github.com/organizations/r-multiverse/settings/moderators, and any moderator can go to https://github.com/organizations/r-multiverse/settings/blocked_users and block users.

Now that I know about this, I have a new proposal for handling malicious activity at https://github.com/r-multiverse/contributions/pulls:

Rely on https://github.com/organizations/r-multiverse/settings/blocked_users to block users if needed.
Flag for manual review any package listed in https://github.com/RConsortium/r-advisory-database/tree/main/vulns. (It should be an easy check if we don't bother to search the YAML files for version numbers.)

shikokuchuo · 2024-11-17T15:53:59Z

👍🏻 That would be much simpler than what we were thinking about before.

wlandau · 2024-11-17T16:10:24Z

Awesome! Added https://github.com/orgs/r-multiverse/teams/administrators and https://github.com/orgs/r-multiverse/teams/moderators to https://github.com/organizations/r-multiverse/settings/moderators. That takes care of most of this issue. It only remains to scan https://github.com/RConsortium/r-advisory-database/tree/main/vulns.

wlandau · 2024-11-19T16:56:09Z

I am wondering if https://github.com/RConsortium/r-advisory-database might be flexible enough to accept reports of copyright violations, etc.

c.f. RConsortium/r-advisory-database#14

wlandau · 2024-11-20T10:14:53Z

It occurs to me that we still need an R-multiverse-native way to disable automatic acceptance of packages that previously were removed. Otherwise, if we remove a package due to a policy violation, the bot could just add it back again without anyone knowing.

I think we can accomplish this in a simple way using the existing package listing system at https://github.com/r-multiverse/contributions/tree/main/packages. Instead of a URL or JSON, the text file of a problematic package could contain an informative note such as "removed for x policy violation on date y". Then of course a contributor could make a PR to add it again, but since that PR edits an existing file, the bot would flag it for manual review.

I added support for this mechanism in wlandau/multiverse.internals@b1e7950. If the forthcoming PR is merged, we should mention it documentation for reviewers.

wlandau · 2024-11-20T21:26:47Z

I think we can accomplish this in a simple way using the existing package listing system

Implemented in r-multiverse/multiverse.internals#39, pretty simple and unobtrusive

wlandau changed the title ~~Require manual review if the pull request author is not the owner of the package?~~ Figure out how to block or filter out sources of code of conduct violations Aug 29, 2024

wlandau mentioned this issue Aug 29, 2024

Security expert #81

Closed

wlandau added the priority label Aug 29, 2024

wlandau mentioned this issue Sep 9, 2024

Disclaimer of Liability for Third-Party Downloads #82

Closed

wlandau self-assigned this Nov 17, 2024

wlandau mentioned this issue Nov 20, 2024

Align with upcoming review policy r-multiverse/multiverse.internals#39

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Figure out how to block or filter out sources of code of conduct violations #77

Figure out how to block or filter out sources of code of conduct violations #77

wlandau commented Jul 19, 2024

shikokuchuo commented Jul 19, 2024

wlandau commented Jul 19, 2024

shikokuchuo commented Jul 19, 2024 •

edited

Loading

wlandau commented Aug 22, 2024

shikokuchuo commented Aug 27, 2024

wlandau commented Aug 27, 2024

shikokuchuo commented Aug 27, 2024

wlandau commented Aug 29, 2024

wlandau commented Aug 29, 2024

shikokuchuo commented Aug 29, 2024

wlandau commented Sep 3, 2024

shikokuchuo commented Sep 3, 2024

wlandau commented Sep 3, 2024

shikokuchuo commented Sep 4, 2024

wlandau commented Sep 4, 2024

eitsupi commented Sep 4, 2024

wlandau commented Sep 10, 2024

wlandau commented Nov 17, 2024

shikokuchuo commented Nov 17, 2024

wlandau commented Nov 17, 2024

wlandau commented Nov 19, 2024

wlandau commented Nov 20, 2024

wlandau commented Nov 20, 2024

Figure out how to block or filter out sources of code of conduct violations #77

Figure out how to block or filter out sources of code of conduct violations #77

Comments

wlandau commented Jul 19, 2024

shikokuchuo commented Jul 19, 2024

wlandau commented Jul 19, 2024

shikokuchuo commented Jul 19, 2024 • edited Loading

wlandau commented Aug 22, 2024

shikokuchuo commented Aug 27, 2024

wlandau commented Aug 27, 2024

shikokuchuo commented Aug 27, 2024

wlandau commented Aug 29, 2024

wlandau commented Aug 29, 2024

shikokuchuo commented Aug 29, 2024

wlandau commented Sep 3, 2024

shikokuchuo commented Sep 3, 2024

wlandau commented Sep 3, 2024

shikokuchuo commented Sep 4, 2024

wlandau commented Sep 4, 2024

eitsupi commented Sep 4, 2024

wlandau commented Sep 10, 2024

wlandau commented Nov 17, 2024

shikokuchuo commented Nov 17, 2024

wlandau commented Nov 17, 2024

wlandau commented Nov 19, 2024

wlandau commented Nov 20, 2024

wlandau commented Nov 20, 2024

shikokuchuo commented Jul 19, 2024 •

edited

Loading