Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement log flushing #808

Merged
merged 7 commits into from
May 1, 2024

Conversation

nkaretnikov
Copy link
Contributor

@nkaretnikov nkaretnikov commented Apr 11, 2024

Fixes #809.

Description

This pull request:

  • makes actions write to log storage incrementally: the output can be seen by users as soon as it's available
  • changes the logging format by prefixing each line with the name of the action: makes it easier to see what code is running
  • adds locking when appending to logs to avoid a race condition
  • moves the body of logged_command to ActionContext.run_command, next to run.

Pull request checklist

  • Did you test this change locally?
  • Did you update the documentation (if required)?
  • Did you add/update relevant tests for this change (if required)?

Additional information

How to test

  • start an environment build
  • go to the logs page for this build
  • check that new output appears while an action is running, by refreshing the page quickly
  • check that all output is prefixed with the action name

Copy link

netlify bot commented Apr 11, 2024

Deploy Preview for conda-store canceled.

Name Link
🔨 Latest commit b09a067
🔍 Latest deploy log https://app.netlify.com/sites/conda-store/deploys/663193c2a4ffeb0008d7dfbf

@nkaretnikov nkaretnikov changed the title WIP: Move conda-lock into a subprocess Implement log flushing Apr 14, 2024
@nkaretnikov nkaretnikov added this to the challenges: Round 2 🚀 milestone Apr 15, 2024
except Exception:
current_logs = b""
class LoggedStream:
"""Allows writing to storage via logging.StreamHandler"""
Copy link
Contributor Author

@nkaretnikov nkaretnikov Apr 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class is designed to be used with StreamHandler, via self.log in ActionContext, as well as a dedicated stream, via self.stdout and self.stderr. This allows capturing output no matter how it's written to, by writing to one of the streams directly or using the logging library, and without changing existing logging code.

From https://docs.python.org/3/library/logging.handlers.html#streamhandler:

The StreamHandler class, located in the core logging package, sends logging output to streams such as sys.stdout, sys.stderr or any file-like object (or, more precisely, any object which supports write() and flush() methods).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you leave this useful information in the docstring?

stderr=subprocess.STDOUT if redirect_stderr else subprocess.PIPE,
bufsize=1,
universal_newlines=True,
**kwargs,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://docs.python.org/3/library/subprocess.html

bufsize [...] 1 means line buffered (only usable if text=True or universal_newlines=True)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be useful to put this in the code (in the docstring, maybe?)

db.commit()
if not exists:
db.add(ba(build_id=build_id, key=key, artifact_type=artifact_type))
db.commit()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fixes a pre-existing issue. It's important to fix this now because this PR writes to logs line by line, so a new SQL entry would be created for each line.

Multiple SQL entries add nothing here, it's just an oversight. The format is:

1|1|LOGS|logs/6e6d70d9-1713098291-1-test.log
2|1|LOGS|logs/6e6d70d9-1713098291-1-test.log
3|1|LOGS|logs/6e6d70d9-1713098291-1-test.log
4|1|LOGS|logs/6e6d70d9-1713098291-1-test.log
5|1|LOGS|logs/6e6d70d9-1713098291-1-test.log
6|1|LOGS|logs/6e6d70d9-1713098291-1-test.log
7|1|LOGS|logs/6e6d70d9-1713098291-1-test.log

@nkaretnikov nkaretnikov marked this pull request as ready for review April 15, 2024 04:44
@nkaretnikov
Copy link
Contributor Author

@dcmcand @peytondmurray A gentle reminder that this still needs review. Will rebase after it's approved.

Copy link
Contributor

@peytondmurray peytondmurray left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this! I think it's basically there but adding (and moving some of the existing) explanatory documentation to be in docstrings would help with future maintenance.

Does it also make sense to write tests for this? e.g. run an action, expect some logging output with the appropriate logging prefix, etc.

I still need to run this locally to see what difference there is here, but will report back.

self.log = logging.getLogger(f"conda_store_server.action.{self.id}")
self.log.propagate = False
self.log.addHandler(logging.StreamHandler(stream=self.stdout))
self.log.setLevel(logging.INFO)
self.result = None
self.artifacts = {}

def run_command(self, command, redirect_stderr=True, **kwargs):
"""Runs command and immediately writes to logs"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bulk of this function has to do with redirecting output to context buffers from the subprocess, so it might make sense to mention something about that in the docstring.

I know this isn't done elsewhere in the code, so maybe it would be useful to discuss in tomorrow's meeting about docstrings and type annotations. But IMO it would make future future maintenance easier if there was something said about each parameter, e.g. why is there an option to redirect stderr but not stdout? Part of the reason I mention it is that LSPs will pull out docstrings and type annotations to give richer hints if we use those, and doing so will unlock static type checking in the future.

stderr=subprocess.STDOUT if redirect_stderr else subprocess.PIPE,
bufsize=1,
universal_newlines=True,
**kwargs,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be useful to put this in the code (in the docstring, maybe?)

@@ -40,7 +40,7 @@ def write_file(filename, s):
"constructor",
"--help",
]
logged_command(context, command, timeout=10)
logged_command(context, command)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
logged_command(context, command)
context.run_command(command)

Based on your comment under the logged_command function.

except Exception:
current_logs = b""
class LoggedStream:
"""Allows writing to storage via logging.StreamHandler"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you leave this useful information in the docstring?

# For instance, with local storage, this involves reading from and writing
# to a file. Locking here prevents a race condition when multiple tasks
# attempt to write to a shared resource, which is the log
with FileLock(f"{build.build_path(conda_store)}.log.lock"):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to have a timeout here? This lock is only active during writes to the file, not actual commands running, right? If so, something short (5s?) should be okay.

pass


def append_to_logs(db: Session, conda_store, build, logs: typing.Union[str, bytes]):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be okay to add types for the other parameters?


if isinstance(logs, str):
logs = logs.encode("utf-8")
def __init__(self, db, conda_store, build, prefix=None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want, the default can be prefix = "", which would eliminate the need for the branch on L34.

line = self.prefix + line
append_to_logs(self.db, self.conda_store, self.build, line + "\n")

def flush(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed?

Comment on lines +44 to +54
ba = orm.BuildArtifact
exists = (
db.query(ba)
.filter(ba.build_id == build_id)
.filter(ba.key == key)
.filter(ba.artifact_type == artifact_type)
.first()
)
db.commit()
if not exists:
db.add(ba(build_id=build_id, key=key, artifact_type=artifact_type))
db.commit()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks the same as Storage.fset. Can we just call that?

command, stderr=subprocess.STDOUT, encoding="utf-8", **kwargs
)
)
# This is here only for backward compatibility, new code should use the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to issue a DeprecationWarning here? cc @dcmcand for insight about our policy around this.

@peytondmurray peytondmurray removed this from the challenges: Round 2 🚀 milestone Apr 23, 2024
@nkaretnikov nkaretnikov merged commit 42ce8c0 into conda-incubator:main May 1, 2024
27 checks passed
peytondmurray pushed a commit to peytondmurray/conda-store that referenced this pull request May 4, 2024
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Done 💪🏾
Development

Successfully merging this pull request may close these issues.

[ENH] - Implement log flushing
2 participants