-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use teams for ownership #142
Conversation
This dummy team allows us to record the fact that a small number of repos are jointly owned by the Tech teams.
a0e2a8c
to
62831c8
Compare
metrics/github/repos.py
Outdated
return [r for r in query.repos(client, org) if _is_tech_owned(r)] | ||
tech_repo_names = list( | ||
flatten(query.team_repos(client, org, team) for team in _TECH_TEAMS) | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@iaindillingham flattening nested lists is a case where I think the list comprehension alternative is pretty intolerable. Although here my case is again damaged by the Python stdlib with its iterator-maximalism and terrible naming of standard functional operations (see my alias at the end of the module, which you might reasonably object to).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to check my understanding, you prefer the above to...
tech_repo_names = [
name
for names in (query.team_repos(client, org, team) for team in _TECH_TEAMS)
for name in names
]
I'm uneasy about the alias, although it is a better name. No objections, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. I think the comprehension version is horrible. Once we start splitting list comprehensions over multiple lines I actually prefer for loops. I'd be happy to consider the for loop alternative if you prefer it to flatten()
. Or I'd be happy to write flatten()
as a helper function, rather than a straight alias, which would allow me to absorb the list()
into it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def tech_repos(client, org):
tech_repo_names = []
for team in _TECH_TEAMS:
tech_repo_names.extend(query.team_repos(client, org, team))
return [r for r in _active_repos(client, org) if r.name in tech_repo_names]
or
def tech_repos(client, org):
tech_repo_names = flatten(query.team_repos(client, org, team) for team in _TECH_TEAMS)
return [r for r in _active_repos(client, org) if r.name in tech_repo_names]
def flatten(xss):
return [x for xs in xss for x in xs]
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, of course, black
won't stand for that and unnecessarily splits the flatten()
over multiple lines:
def tech_repos(client, org):
tech_repo_names = flatten(
query.team_repos(client, org, team) for team in _TECH_TEAMS
)
return [r for r in _active_repos(client, org) if r.name in tech_repo_names]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And my for-loop version had a bug, now fixed above. I think I now prefer the for-loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My thoughts are very much "You have bad taste, Iain". To my eye, the nested list comprehension has a superficial appeal: it looks tidy. But looking tidy and being tidy are different, as the mountain of junk behind my sofa testifies. The nested list comprehension is hard to parse (start in the middle and work from right to left, go to the bottom, go to the top?).
I'm drawn to the for
loop alternative. I don't think the flatten
helper adds much; I'd have to read the implementation to remind myself that it only flattens lists-in-lists.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Every time I see one of those nested for list comps I have to remind my self that it's equivalent to nested for loops in the same order as the list comp. At that point I think "wouldn't this be fewer lines if it was just nested for loops?" and realize that the answer is "yes".
I can forgive a very simple one like my rewritten flatten()
. But once they get split over multiple lines I usually abandon them.
I do like them in principle. Python just isn't a terse enough language to carry it off, I think.
metrics/github/repos.py
Outdated
return [r for r in query.repos(client, org) if _is_tech_owned(r)] | ||
tech_repo_names = list( | ||
flatten(query.team_repos(client, org, team) for team in _TECH_TEAMS) | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to check my understanding, you prefer the above to...
tech_repo_names = [
name
for names in (query.team_repos(client, org, team) for team in _TECH_TEAMS)
for name in names
]
I'm uneasy about the alias, although it is a better name. No objections, though.
…exclusion list This has a side-effect of slightly changing the behaviour with respect to archived repos. We don't insist that archived repos have an owning team recorded. Now that we only display data for repos owned by _current_ teams, any archived repos that were owned by old teams or which were active before we had a team structure will not be included in the data. This doesn't affect our main use of this data, which is to understand how the teams are operating today. It potentially affects our long-term historic view of the data. In practice only 4% of PRs are excluded by this change and eye-balling the resulting charts over long timescales shows no significant differences, so we've decided that this is okay.
Previously an unpredictable subset of archived repos were excluded based on team ownership records (see previous commit for details). For clarity and consistently we are now excluding them all explicitly.
…repos See last commit for context.
62831c8
to
4c6c4d5
Compare
Built on #141.