Skip to content

Commit

Permalink
Use previous token to flush runners (#218)
Browse files Browse the repository at this point in the history
* Flush runners using prev token

* Catch correct error

* Force flush runners

* Improve BlockedStatus message

* Use force flush correctly

* Add rate-limit info
  • Loading branch information
cbartz committed Feb 7, 2024
1 parent 321e9f8 commit c9f6bfa
Show file tree
Hide file tree
Showing 16 changed files with 192 additions and 61 deletions.
3 changes: 3 additions & 0 deletions docs/explanation/charm-architecture.md
Original file line number Diff line number Diff line change
Expand Up @@ -69,6 +69,9 @@ The charm requires a GitHub personal access token for the [`token` configuration

The token is also passed to [repo-policy-compliance](https://github.com/canonical/repo-policy-compliance) to access GitHub API for the service.

Note that the GitHub API uses a [rate-limiting mechanism](https://docs.github.com/en/rest/using-the-rest-api/rate-limits-for-the-rest-api?apiVersion=2022-11-28). When this is reached, the charm may not be able to perform the necessary operations and may go into
BlockedStatus. The charm will automatically recover from this state once the rate limit is reset, but using a different token with a higher rate limit may be a better solution depending on your deployment requirements.

## GitHub repository setting check

The [repo-policy-compliance](https://github.com/canonical/repo-policy-compliance) is a [Flask application](https://flask.palletsprojects.com/) hosted on [Gunicorn](https://gunicorn.org/) that provides a RESTful HTTP API to check the settings of GitHub repositories. This ensures the GitHub repository settings do not allow the execution of code not reviewed by maintainers on the self-hosted runners.
Expand Down
6 changes: 3 additions & 3 deletions src-docs/charm.py.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@ Charm for creating and managing GitHub self-hosted runner instances.

---

<a href="../src/charm.py#L68"><img align="right" style="float:right;" src="https://img.shields.io/badge/-source-cccccc?style=flat-square"></a>
<a href="../src/charm.py#L69"><img align="right" style="float:right;" src="https://img.shields.io/badge/-source-cccccc?style=flat-square"></a>

## <kbd>function</kbd> `catch_charm_errors`

Expand All @@ -38,7 +38,7 @@ Catch common errors in charm.

---

<a href="../src/charm.py#L99"><img align="right" style="float:right;" src="https://img.shields.io/badge/-source-cccccc?style=flat-square"></a>
<a href="../src/charm.py#L103"><img align="right" style="float:right;" src="https://img.shields.io/badge/-source-cccccc?style=flat-square"></a>

## <kbd>function</kbd> `catch_action_errors`

Expand Down Expand Up @@ -67,7 +67,7 @@ Catch common errors in actions.
## <kbd>class</kbd> `GithubRunnerCharm`
Charm for managing GitHub self-hosted runners.

<a href="../src/charm.py#L142"><img align="right" style="float:right;" src="https://img.shields.io/badge/-source-cccccc?style=flat-square"></a>
<a href="../src/charm.py#L146"><img align="right" style="float:right;" src="https://img.shields.io/badge/-source-cccccc?style=flat-square"></a>

### <kbd>function</kbd> `__init__`

Expand Down
18 changes: 18 additions & 0 deletions src-docs/errors.py.md
Original file line number Diff line number Diff line change
Expand Up @@ -52,6 +52,15 @@ Represents an error when the shared filesystem could not be retrieved.



---

## <kbd>class</kbd> `GithubApiError`
Represents an error when the GitHub API returns an error.





---

## <kbd>class</kbd> `GithubClientError`
Expand Down Expand Up @@ -305,3 +314,12 @@ Construct the subprocess error.



---

## <kbd>class</kbd> `TokenError`
Represents an error when the token is invalid or has not enough permissions.





26 changes: 19 additions & 7 deletions src-docs/github_client.py.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,13 +8,25 @@ GitHub API client.
Migrate to PyGithub in the future. PyGithub is still lacking some API such as remove token for runner.


---

<a href="../src/github_client.py#L31"><img align="right" style="float:right;" src="https://img.shields.io/badge/-source-cccccc?style=flat-square"></a>

## <kbd>function</kbd> `catch_http_errors`

```python
catch_http_errors(func)
```

Catch HTTP errors and raise custom exceptions.


---

## <kbd>class</kbd> `GithubClient`
GitHub API client.

<a href="../src/github_client.py#L34"><img align="right" style="float:right;" src="https://img.shields.io/badge/-source-cccccc?style=flat-square"></a>
<a href="../src/github_client.py#L53"><img align="right" style="float:right;" src="https://img.shields.io/badge/-source-cccccc?style=flat-square"></a>

### <kbd>function</kbd> `__init__`

Expand All @@ -36,7 +48,7 @@ Instantiate the GiHub API client.

---

<a href="../src/github_client.py#L150"><img align="right" style="float:right;" src="https://img.shields.io/badge/-source-cccccc?style=flat-square"></a>
<a href="../src/github_client.py#L173"><img align="right" style="float:right;" src="https://img.shields.io/badge/-source-cccccc?style=flat-square"></a>

### <kbd>function</kbd> `delete_runner`

Expand All @@ -55,7 +67,7 @@ Delete the self-hosted runner from GitHub.

---

<a href="../src/github_client.py#L170"><img align="right" style="float:right;" src="https://img.shields.io/badge/-source-cccccc?style=flat-square"></a>
<a href="../src/github_client.py#L194"><img align="right" style="float:right;" src="https://img.shields.io/badge/-source-cccccc?style=flat-square"></a>

### <kbd>function</kbd> `get_job_info`

Expand Down Expand Up @@ -84,7 +96,7 @@ Get information about a job for a specific workflow run.

---

<a href="../src/github_client.py#L44"><img align="right" style="float:right;" src="https://img.shields.io/badge/-source-cccccc?style=flat-square"></a>
<a href="../src/github_client.py#L63"><img align="right" style="float:right;" src="https://img.shields.io/badge/-source-cccccc?style=flat-square"></a>

### <kbd>function</kbd> `get_runner_applications`

Expand All @@ -107,7 +119,7 @@ Get list of runner applications available for download.

---

<a href="../src/github_client.py#L63"><img align="right" style="float:right;" src="https://img.shields.io/badge/-source-cccccc?style=flat-square"></a>
<a href="../src/github_client.py#L83"><img align="right" style="float:right;" src="https://img.shields.io/badge/-source-cccccc?style=flat-square"></a>

### <kbd>function</kbd> `get_runner_github_info`

Expand All @@ -132,7 +144,7 @@ Get runner information on GitHub under a repo or org.

---

<a href="../src/github_client.py#L128"><img align="right" style="float:right;" src="https://img.shields.io/badge/-source-cccccc?style=flat-square"></a>
<a href="../src/github_client.py#L150"><img align="right" style="float:right;" src="https://img.shields.io/badge/-source-cccccc?style=flat-square"></a>

### <kbd>function</kbd> `get_runner_registration_token`

Expand All @@ -155,7 +167,7 @@ Get token from GitHub used for registering runners.

---

<a href="../src/github_client.py#L110"><img align="right" style="float:right;" src="https://img.shields.io/badge/-source-cccccc?style=flat-square"></a>
<a href="../src/github_client.py#L131"><img align="right" style="float:right;" src="https://img.shields.io/badge/-source-cccccc?style=flat-square"></a>

### <kbd>function</kbd> `get_runner_remove_token`

Expand Down
2 changes: 1 addition & 1 deletion src-docs/runner.py.md
Original file line number Diff line number Diff line change
Expand Up @@ -96,7 +96,7 @@ Create the runner instance on LXD and register it on GitHub.
### <kbd>function</kbd> `remove`

```python
remove(remove_token: str) → None
remove(remove_token: Optional[str]) → None
```

Remove this runner instance from LXD and GitHub.
Expand Down
4 changes: 2 additions & 2 deletions src-docs/runner_manager.py.md
Original file line number Diff line number Diff line change
Expand Up @@ -46,7 +46,7 @@ Construct RunnerManager object for creating and managing runners.

---

<a href="../src/runner_manager.py#L714"><img align="right" style="float:right;" src="https://img.shields.io/badge/-source-cccccc?style=flat-square"></a>
<a href="../src/runner_manager.py#L726"><img align="right" style="float:right;" src="https://img.shields.io/badge/-source-cccccc?style=flat-square"></a>

### <kbd>function</kbd> `build_runner_image`

Expand Down Expand Up @@ -172,7 +172,7 @@ Bring runners in line with target.

---

<a href="../src/runner_manager.py#L724"><img align="right" style="float:right;" src="https://img.shields.io/badge/-source-cccccc?style=flat-square"></a>
<a href="../src/runner_manager.py#L736"><img align="right" style="float:right;" src="https://img.shields.io/badge/-source-cccccc?style=flat-square"></a>

### <kbd>function</kbd> `schedule_build_runner_image`

Expand Down
5 changes: 3 additions & 2 deletions src-docs/runner_manager_type.py.md
Original file line number Diff line number Diff line change
Expand Up @@ -18,8 +18,9 @@ Strategy for flushing runners.

- <b>`FLUSH_IDLE`</b>: Flush only idle runners.
- <b>`FLUSH_IDLE_WAIT_REPO_CHECK`</b>: Flush only idle runners, then wait until repo-policy-check is completed for the busy runners.
- <b>`FORCE_FLUSH_BUSY`</b>: Force flush busy runners.
- <b>`FORCE_FLUSH_BUSY_WAIT_REPO_CHECK`</b>: Wait until the repo-policy-check is completed before force flush of busy runners.
- <b>`FLUSH_BUSY`</b>: Flush busy runners.
- <b>`FLUSH_BUSY_WAIT_REPO_CHECK`</b>: Wait until the repo-policy-check is completed before flush of busy runners.
- <b>`FORCE_FLUSH_WAIT_REPO_CHECK`</b>: Force flush the runners (remove lxd instances even on gh api issues, like invalid token). Wait until repo-policy-check is completed before force flush of busy runners.



Expand Down
46 changes: 29 additions & 17 deletions src/charm.py
Original file line number Diff line number Diff line change
Expand Up @@ -42,6 +42,7 @@
RunnerBinaryError,
RunnerError,
SubprocessError,
TokenError,
)
from event_timer import EventTimer, TimerStatusError
from firewall import Firewall, FirewallEntry
Expand Down Expand Up @@ -82,6 +83,9 @@ def func_with_catch_errors(self, event: EventT) -> None:
except ConfigurationError as err:
logger.exception("Issue with charm configuration")
self.unit.status = BlockedStatus(str(err))
except TokenError as err:
logger.exception("Issue with GitHub token")
self.unit.status = BlockedStatus(str(err))
except MissingConfigurationError as err:
logger.exception("Missing required charm configuration")
self.unit.status = BlockedStatus(
Expand Down Expand Up @@ -492,7 +496,7 @@ def _on_upgrade_charm(self, _event: UpgradeCharmEvent) -> None:
if not runner_manager:
return

runner_manager.flush(FlushMode.FORCE_FLUSH_BUSY_WAIT_REPO_CHECK)
runner_manager.flush(FlushMode.FLUSH_BUSY_WAIT_REPO_CHECK)
self._reconcile_runners(runner_manager)

@catch_charm_errors
Expand All @@ -502,31 +506,39 @@ def _on_config_changed(self, _event: ConfigChangedEvent) -> None:
Args:
event: Event of configuration change.
"""
self._set_reconcile_timer()

prev_config_for_flush: dict[str, str] = {}

if self.config["token"] != self._stored.token:
prev_config_for_flush["token"] = str(self._stored.token)
self._start_services()
self._stored.token = None

self._refresh_firewall()
self._set_reconcile_timer()

if self.config["path"] != self._stored.path:
prev_runner_manager = self._get_runner_manager(
path=str(self._stored.path)
) # Casting for mypy checks.
if prev_runner_manager:
self.unit.status = MaintenanceStatus("Removing runners from old org/repo")
prev_runner_manager.flush(FlushMode.FORCE_FLUSH_BUSY_WAIT_REPO_CHECK)
prev_config_for_flush["path"] = str(self._stored.path)
self._stored.path = self.config["path"]

runner_manager = self._get_runner_manager()
if runner_manager:
if prev_config_for_flush:
prev_runner_manager = self._get_runner_manager(**prev_config_for_flush)
if prev_runner_manager:
self.unit.status = MaintenanceStatus("Removing runners due to config change")
# it may be the case that the prev token has expired, so we need to use force flush
prev_runner_manager.flush(FlushMode.FORCE_FLUSH_WAIT_REPO_CHECK)

self._refresh_firewall()

try:
runner_manager = self._get_runner_manager()
except MissingConfigurationError as err:
self.unit.status = BlockedStatus(
f"Missing required charm configuration: {err.configs}"
)
else:
self._reconcile_runners(runner_manager)
self.unit.status = ActiveStatus()
else:
self.unit.status = BlockedStatus("Missing token or org/repo path config")

if self.config["token"] != self._stored.token:
runner_manager.flush(FlushMode.FORCE_FLUSH_BUSY_WAIT_REPO_CHECK)
self._stored.token = self.config["token"]

def _check_and_update_dependencies(self) -> bool:
Expand Down Expand Up @@ -669,7 +681,7 @@ def _on_flush_runners_action(self, event: ActionEvent) -> None:
"""
runner_manager = self._get_runner_manager()

runner_manager.flush(FlushMode.FORCE_FLUSH_BUSY_WAIT_REPO_CHECK)
runner_manager.flush(FlushMode.FLUSH_BUSY_WAIT_REPO_CHECK)
delta = self._reconcile_runners(runner_manager)

self._on_check_runners_action(event)
Expand Down Expand Up @@ -704,7 +716,7 @@ def _on_stop(self, _: StopEvent) -> None:
self._event_timer.disable_event_timer("reconcile-runners")

runner_manager = self._get_runner_manager()
runner_manager.flush(FlushMode.FORCE_FLUSH_BUSY)
runner_manager.flush(FlushMode.FLUSH_BUSY)

def _reconcile_runners(self, runner_manager: RunnerManager) -> Dict[str, Any]:
"""Reconcile the current runners state and intended runner state.
Expand Down
8 changes: 8 additions & 0 deletions src/errors.py
Original file line number Diff line number Diff line change
Expand Up @@ -150,6 +150,14 @@ class GithubClientError(Exception):
"""Base class for all github client errors."""


class GithubApiError(GithubClientError):
"""Represents an error when the GitHub API returns an error."""


class TokenError(GithubClientError):
"""Represents an error when the token is invalid or has not enough permissions."""


class JobNotFoundError(GithubClientError):
"""Represents an error when the job could not be found on GitHub."""

Expand Down
28 changes: 27 additions & 1 deletion src/github_client.py
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@
Migrate to PyGithub in the future. PyGithub is still lacking some API such as
remove token for runner.
"""

import functools
import logging
from datetime import datetime
from urllib.error import HTTPError
Expand All @@ -28,6 +28,25 @@
logger = logging.getLogger(__name__)


def catch_http_errors(func):
"""Catch HTTP errors and raise custom exceptions."""

@functools.wraps(func)
def wrapper(*args, **kwargs):
try:
return func(*args, **kwargs)
except HTTPError as exc:
if exc.code in (401, 403):
if exc.code == 401:
msg = "Invalid token."
else:
msg = "Provided token has not enough permissions or has reached rate-limit."
raise errors.TokenError(msg) from exc
raise errors.GithubApiError from exc

return wrapper


class GithubClient:
"""GitHub API client."""

Expand All @@ -41,6 +60,7 @@ def __init__(self, token: str):
self._token = token
self._client = GhApi(token=self._token)

@catch_http_errors
def get_runner_applications(self, path: GithubPath) -> RunnerApplicationList:
"""Get list of runner applications available for download.
Expand All @@ -60,6 +80,7 @@ def get_runner_applications(self, path: GithubPath) -> RunnerApplicationList:

return runner_bins

@catch_http_errors
def get_runner_github_info(self, path: GithubPath) -> list[SelfHostedRunner]:
"""Get runner information on GitHub under a repo or org.
Expand Down Expand Up @@ -107,6 +128,7 @@ def get_runner_github_info(self, path: GithubPath) -> list[SelfHostedRunner]:
]
return remote_runners_list

@catch_http_errors
def get_runner_remove_token(self, path: GithubPath) -> str:
"""Get token from GitHub used for removing runners.
Expand All @@ -125,6 +147,7 @@ def get_runner_remove_token(self, path: GithubPath) -> str:

return token["token"]

@catch_http_errors
def get_runner_registration_token(self, path: GithubPath) -> str:
"""Get token from GitHub used for registering runners.
Expand All @@ -147,6 +170,7 @@ def get_runner_registration_token(self, path: GithubPath) -> str:

return token["token"]

@catch_http_errors
def delete_runner(self, path: GithubPath, runner_id: int) -> None:
"""Delete the self-hosted runner from GitHub.
Expand Down Expand Up @@ -208,6 +232,8 @@ def get_job_info(self, path: GithubRepo, workflow_run_id: str, runner_name: str)
)

except HTTPError as exc:
if exc.code in (401, 403):
raise errors.TokenError from exc
raise errors.JobNotFoundError(
f"Could not find job for runner {runner_name}. "
f"Could not list jobs for workflow run {workflow_run_id}"
Expand Down
Loading

0 comments on commit c9f6bfa

Please sign in to comment.