RHELPLAN-48018 - CI - use github library in test-harness #89
Conversation
test/run-tests
Outdated
random.shuffle(pulls)
pyghrepo = pygh.get_repo(f"{owner}/{repo}")
pulls = pyghrepo.get_pulls(state='open')
# random.shuffle(pulls) --- cannot shuffle PaginatedList
Can we convert this into a regular Python list, then do the shuffle? e.g. pulls = list(pyghrepo.get_pulls(state='open'))?
Your proposed code works!
I'm wondering: if there is more than one page (30 items by default), does the list function still work? What do you think? Or is it hard to imagine more than 30 open pull requests in a repo?
> Your proposed code works!
> I'm wondering: if there is more than one page (30 items by default), does the list function still work?

I don't know - you'll have to look at the source code to see if it is implemented as a generator or something like that.

> What do you think? Or is it hard to imagine more than 30 open pull requests in a repo?

Easier to imagine that github decides to change the page size to 10 or something like that :P
That is - we should make sure it works, regardless of the page size.
In the run-tests script, there are other objects we retrieve from github that may require paging - like statuses and comments - do all of these use the same page size? It should be pretty easy to create more than 30 comments in a PR and see how paging works with comments.
Sorry, my question was not accurate. It was just about list(pyghrepo.get_pulls(state='open')). The other PaginatedLists are handled in the loop, so there should not be an issue regardless of the page size or page count.
But it might be a good idea to add a converter function from PaginatedList to List...
Worst case, you have to do an explicit loop, e.g.
pulls = [pr for pr in pyghrepo.get_pulls(state='open')]
which will presumably page through all of the pages and return a list containing all of the prs from all of the pages.
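The paging concern above can be sketched without hitting GitHub at all, using a plain generator as a stand-in for PyGithub's PaginatedList (the names and page mechanics here are illustrative, not PyGithub's actual internals):

```python
import random

def fetch_pages(items, page_size=30):
    """Yield items one page at a time, the way a paginated API client would."""
    for i in range(0, len(items), page_size):
        yield from items[i:i + page_size]

# Simulate 75 open PRs, i.e. more than two 30-item pages.
prs = [f"pr-{n}" for n in range(75)]

# random.shuffle(fetch_pages(prs)) would fail: a generator has no length
# and no item assignment. Materializing it first walks every page:
pulls = list(fetch_pages(prs))
random.shuffle(pulls)  # now shuffle works, regardless of the page size

assert sorted(pulls) == sorted(prs)
```

The same pattern holds if GitHub ever changes the default page size: list() (or a list comprehension) consumes the iterator to exhaustion, so all pages end up in the list.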
Force-pushed from 6a11fd4 to dfe88f9
Issue 1 in the commit afe2957 was not an issue.
Using PyGitHub to take advantage of pagination. Replacing the original Session handle with the PyGitHub handle as gh.
@richm, I revisited the token issue. It seems I made some mistake in the beginning... It turned out there's no problem using PyGitHub with no auth.
Sorry for the noise. I think there's no concern in using PyGitHub. Could you please review this PR?
test/run-tests
Outdated
-def get_statuses(gh, owner, repo, sha, max_num_statuses):
+def get_statuses(commits):
     """
     Fetches all statuses of the given commit in a repository and returns a dict
     mapping context to its most recent status.
     Will return at most max_num_statuses. By default, github will return 30, but
     some of our PRs have more than that.
This comment should be changed.
test/run-tests
Outdated
        if comment_update > after:
            new_comments.append(comment)
    comments = new_comments
    statues = []
Typo: statues should be statuses.
test/run-tests
Outdated
            new_comments.append(comment)
    comments = new_comments
    statues = []
    for commit in commits:
This is for multiple commits? Not just head?
Ah, you are right... Somehow I was thinking I should look into all of them, but I should not have. Let me fix it.
@@ -601,21 +573,21 @@ def check_commit_needs_testing(status, commands):
        return True

    # or the status is pending without a hostname in the description
-   if status["state"] == "pending" and not status.get("description"):
+   if status.state == "pending" and not status.description:
The gh api will return None if you use status.description and the status has no description field?
It returns an empty string, not None. Is it ok?
>>> for status in statuses:
...     if status.description == '':
...         print(status.state, 'empty')
...     else:
...         print(status.state, status.description)
<<snip>>
pending empty
failure linux-system-roles-staging-123-9blzw@2020-07-16 04:59:59.853483
yes - so
if status.state == "pending" and not status.description:
is ok
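The reason this check is safe either way: both the empty string the API returns and None are falsy in Python, so `not status.description` covers both. A minimal sketch of the point (the description value is taken from the example output above; `status` itself is not needed):

```python
# A status with no description may surface as '' (observed above) or None,
# depending on the client; `not description` treats both the same.
for description in ("", None):
    assert not description  # falls into the "pending without description" branch

# A real hostname description is truthy, so the branch is skipped.
assert bool("linux-system-roles-staging-123-9blzw@2020-07-16 04:59:59.853483")
```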
test/run-tests
Outdated
        logging.debug(
            "PR will be tested because it is in state 'pending' and has no description"
        )
        return True

    # or a generic re-check was requested:
-   if status["updated_at"] < commands[COMMENT_CMD_TEST_ALL]:
+   if status.updated_at.strftime("%Y%m%d-%H%M%S") < commands[COMMENT_CMD_TEST_ALL]:
The gh api returns status.updated_at as a DateTime object?
I suggest converting this to a string once e.g.
updated_at = status.updated_at.strftime("%Y%m%d-%H%M%S")
if updated_at < commands[COMMENT_CMD_TEST_ALL]:
...
> The gh api returns the status.updated_at as a DateTime object?

Yes.

    | updated_at
    | :type: datetime.datetime
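The convert-once suggestion above works because the "%Y%m%d-%H%M%S" format is zero-padded with the most significant field first, so lexicographic comparison of the strings matches chronological order. A small self-contained check (the sample timestamps are illustrative):

```python
from datetime import datetime

FMT = "%Y%m%d-%H%M%S"
earlier = datetime(2020, 7, 16, 4, 59, 59)
later = datetime(2020, 7, 16, 5, 0, 0)

# String order agrees with datetime order for this fixed-width format.
assert (earlier.strftime(FMT) < later.strftime(FMT)) == (earlier < later)

# Convert once, then compare the cached string as many times as needed,
# instead of calling strftime inside every comparison.
updated_at = later.strftime(FMT)
assert updated_at > earlier.strftime(FMT)
```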
-        {"context": status_context, "state": "pending", "description": description},
+    commit = gh.get_repo(f"{task.owner}/{task.repo}").get_commit(task.head)
+    commit.create_status(
+        state="pending", context=status_context, description=description
Note that this now takes 2 requests instead of 1 - I'm assuming it is doing a request to get the commit, then another request to create the status
Maybe 3 requests - get_repo, then get_commit, then create_status.
Can we stash objects such as commit in Task?
yes
Not sure if there is a way to get it to dump the number of actual requests made - some of them might be cached - you could use your private github account token instead of the systemroller account, and use https://developer.github.com/v3/rate_limit/ to count how many requests are made for each Task.
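The "stash objects such as commit in Task" idea could look something like the sketch below. This is hypothetical: the real Task lives in test/run-tests, and `fetch_commit` here stands in for the gh.get_repo(...).get_commit(...) chain; the stub fetcher just counts calls to show that the commit is fetched once and reused.

```python
class Task:
    """Sketch: cache the fetched commit so repeated status updates reuse it."""

    def __init__(self, owner, repo, head, fetch_commit):
        self.owner = owner
        self.repo = repo
        self.head = head
        self._fetch_commit = fetch_commit  # e.g. a wrapper around the GitHub API
        self._commit = None

    @property
    def commit(self):
        # Hit the API only on first access; later accesses return the cache.
        if self._commit is None:
            self._commit = self._fetch_commit(self.owner, self.repo, self.head)
        return self._commit

# Stub fetcher that records every "API request" it makes.
calls = []
def fake_fetch(owner, repo, sha):
    calls.append(sha)
    return f"commit-{sha}"

task = Task("linux-system-roles", "test-harness", "dfe88f9", fake_fetch)
assert task.commit == "commit-dfe88f9"
assert task.commit == "commit-dfe88f9"
assert len(calls) == 1  # one fetch, however many statuses we set afterwards
```

Whether this helps in practice depends on how many status updates each Task performs between fetches; it does not reduce the first-time get_repo/get_commit cost.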
test/run-tests
Outdated
    continue

-    statuses = get_statuses(gh, owner, repo, head, args.max_num_statuses)
-    commands = get_comment_commands(gh, owner, repo, number)
+    statuses = get_statuses(pull.get_commits())
To get the list of statuses, you need to get the commit using e.g. commit = repo.get_commit(pull.head.sha), then use commit.get_combined_status().statuses to get the list of statuses.
I set the logging level to DEBUG and counted the GET and POST counts (assuming both Session and PyGitHub dump every GET and POST).
More precisely, this is the GET list from the master:
vs. pr89:
I think I need to investigate some more to eliminate the redundant GET calls...
It seems that the main engineering effort around CI is an optimization problem: minimizing the number of calls to github - the main limiting factor is the ratelimit. @i386x @pcahyna you have worked with this for a while - perhaps you already came to this conclusion with your previous efforts with CI.

It's possible that using pygh will make the rate limit problem worse... it seems the api is designed to get objects like this:

With the current REST-based api design, you only get the data you need at the time you need it - the calls to the rest api are minimized - not as programmer friendly, but more rate limit friendly. It may be that pygh is not designed to limit the number of requests, and it might be quite difficult to use it in such a way as to minimize the number of requests.

My current approach for testing multiple versions of ansible is to have separate deployments (OpenShift deploymentConfig). However, each pod increases the number of concurrent github requests by a factor of 2, increasing the risk of hitting the ratelimit. If we are going to test multiple versions of ansible in such a way that we minimize the number of requests to github in a given time period, we might need to take a radically different approach - have a single deployment that can test multiple different versions of ansible. https://github.com/linux-system-roles/test-harness/blob/master/test/run-tests#L414 - here, instead of calling ansible-playbook directly, have a loop that calls multiple versions of ansible:

DEFAULT_ANSIBLE_PLAYBOOK_SCRIPTS = ["ansible-playbook-2.7", "ansible-playbook-2.8", "ansible-playbook-2.9"]
# ... allow the user to override these in the config ...
for playbook_script in config["ansible_playbook_scripts"]:
    ansible_log = f"{artifactsdir}/{playbook_script}.log"
    for playbook in sorted(playbooks):
        print(f"Testing {playbook}...", end="")
        with redirect_output(ansible_log, mode="a"):
            result = run(
                playbook_script,
                "-vv",
                f"--inventory={inventory}",
I did some more research to see if there is any chance to reduce the GET calls in this PR. I could eliminate 4, but it still requires 8 of them. There could be more to go, but I don't think we can achieve as few as our original implementation.
And I should mention that the test duration is almost doubled compared to the master. Sad to say, but I'd guess PyGitHub does not fit our needs. It is intuitive and maybe good for some interactive use cases, but not for tools/services which require performance.
@richm, @pcahyna, @nhosoi What about using just one pod as a proxy between GitHub and the rest of the pods? The proxy pod would check GitHub and delegate the work to the other pods. It could also do some level of caching. This could decrease the number of requests dramatically, but on the other hand it could also force us to totally rework our CI.
Do you have the ansible logs from those tests?
I think the main problem with pygh is that it makes the ratelimit problem worse. afaict, it shouldn't make the performance much worse - so I'm curious why there is such a difference in the results above.
That is also a possibility - could you explain more about how that would work?
Sorry, there were some mistakes in my testing. I commented out the time.sleep calls in the scripts and reran the tests. With PyGitHub, it tends to be slower, but not doubled.
ok - that's what I would expect - so the main issue is that it makes many more requests, which risks ratelimiting
Closing this PR since there is a better way to improve the CI test script. Unless there are any stronger demands on using the github library, this issue could be closed as well.
Using PyGitHub to take advantage of pagination.
Replacing the original Session handle with the pyGitHub handle as gh.
Note: it's still in the WIP stage.
Using PyGitHub to take advantage of pagination.
Two issues arose in the effort.
1) To avoid the race among the multiple instances of the script, handle_task sets status to pending with the timestamp. Then the status is read just after that. The task handling goes forward only if the timestamp matches the one the instance sets.
~~I could not make the logic work since PyGitHub does not reread the status since it's already read once. That makes the two timestamps not match and the following task never be executed.~~
Also, to update the status, I tried Commit.create_statuses. But the status was not retrieved by the following Commit.get_status, either...
2) In the current usage, dry-run mode does not require a token, but PyGitHub always requires authentication.