
Log to stdout and add more informative progress logs #9

Closed
alexgleith opened this issue Sep 12, 2023 · 6 comments

@alexgleith
Contributor

See: digitalearthpacific/dep-mangroves#3

But basically, running the data processes currently provides no feedback on where the tasks are.

An example of where this would have helped was the tiles that were loading in a ring around the world. If I could have seen in the logs that the process was stuck on finding STAC Items, that would have been a clue.

Logs just need to go to stdout, so that they are picked up by the standard Docker tooling we have running.
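
A minimal sketch of that, assuming the processing code uses Python's standard logging module:

```python
import logging
import sys

# Route all records at INFO and above to stdout so Docker/Argo/Loki can capture them.
logging.basicConfig(
    stream=sys.stdout,
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)

# The logger name and tile id are illustrative, not the actual dep-tools names.
logging.getLogger("dep_tools").info("Searching for STAC Items for tile %s", "64,20")
```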

@jessjaco
Collaborator

A few things here (sorry I missed the prior issue):
All the logging is done in https://github.com/digitalearthpacific/dep-tools/blob/main/dep_tools/runner.py. What is logged there is somewhat terse, but one of the goals was to have machine-readable logging, so there is just one row for each "task". (I also recognize the iterative running is not being utilized in the Argo workflow, but that is a separate issue.)

I think what I'd like to do is enable more explicit logging by calling self.logger.info in the runner with whatever info is desired. Then the logger could choose whether to log it or not.

Then we could either 1) replace the existing logger with, e.g., a stdout logger that logs at the info level, or 2) combine the existing logger with a stdout logger and log to both. Ideally the existing logger would log at the debug level and above, and info would also be sent to stdout.

I prefer the second method, as removing the existing logger would make some of the existing filtering code (to prevent redos) unusable. I actually started working on this solution in Suva (see https://github.com/jessjaco/azure-logger/blob/b23c9c71b15d9d714a7bde75f66d18fd246b2650/azure_logger/__init__.py#L75), but didn't get it working. For it to work, we just need to make sure that the log level can be set separately for each handler (or find a workaround).
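
A minimal sketch of the per-handler levels in standard Python logging (the FileHandler here is just a stand-in for the existing machine-readable log):

```python
import logging
import sys

logger = logging.getLogger("dep_tools.runner")
logger.setLevel(logging.DEBUG)  # pass everything through; each handler filters itself

# Stand-in for the existing one-row-per-task log; keeps debug and above.
task_log = logging.FileHandler("task_log.csv")
task_log.setLevel(logging.DEBUG)

# Human-readable progress to stdout; keeps info and above.
console = logging.StreamHandler(sys.stdout)
console.setLevel(logging.INFO)
console.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))

logger.addHandler(task_log)
logger.addHandler(console)

logger.debug("machine-readable task row")   # file only
logger.info("searching for STAC Items...")  # file and stdout
```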

@alexgleith
Contributor Author

I'm a bit cautious about the external state. I don't want you to change it, but it is a complication.

The way I've managed whether or not work is needed is to have a success flag file, and I like using the STAC document as that. Each task starts up, checks whether the STAC document exists, and skips the work, unless an overwrite flag is set, in which case it runs anyway. Since the STAC document is small and it's the last thing written, it's the perfect success-state file.
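
A sketch of that pattern (all names here are illustrative, not the actual dep-tools API):

```python
from pathlib import Path


def process_task(task_id: str, output_dir: Path, overwrite: bool = False) -> None:
    # The STAC item is small and written last, so its presence means the task completed.
    stac_path = output_dir / f"{task_id}.stac-item.json"
    if stac_path.exists() and not overwrite:
        print(f"{task_id}: STAC item exists, skipping")
        return

    run_processing(task_id, output_dir)  # hypothetical: computes and writes the data outputs
    write_stac_item(task_id, stac_path)  # hypothetical: written last, acts as the success flag
```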

For managing which tasks need to be run, we used a queue (AWS SQS or RabbitMQ) and put tasks on it, so each container has an iterator that pulls tasks off the queue (see the sketch below). Anyhow, as I said, I don't want you to change the central logging thing. But we must have logs in Argo and in a local dev environment, and I really don't mind if they're fairly verbose.
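
A sketch of that worker loop with boto3 and SQS (the queue URL and task handler are illustrative):

```python
import boto3


def handle_task(body: str) -> None:
    print(f"processing task: {body}")  # hypothetical: run the actual task here


sqs = boto3.client("sqs")
queue_url = "https://sqs.us-west-2.amazonaws.com/123456789012/dep-tasks"  # illustrative

while True:
    # Long-poll for a task; an empty response means the queue is drained.
    response = sqs.receive_message(
        QueueUrl=queue_url, MaxNumberOfMessages=1, WaitTimeSeconds=20
    )
    messages = response.get("Messages", [])
    if not messages:
        break  # nothing left; let the container exit

    for message in messages:
        handle_task(message["Body"])
        # Delete only after success so a failed task becomes visible again.
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])
```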

@jessjaco
Collaborator

I'm not wedded to the existing logger, and I can see how testing for the STAC file is a little cleaner. I do like how the existing setup gives a central place to view all the outputs for a year. It also logs errors (well, most errors), so I can just look at the file to see which outputs are missing and why. I am also concerned about the stdout logs not being archived anywhere, and the need to, e.g., copy and paste an error into an issue.

For now I'll look into adding the stdout logger alongside the existing one, as laid out above, and also using the STAC files as flags, at least optionally, to explore the other pathway for monitoring tasks.

@alexgleith
Contributor Author

> I am also concerned about the stdout logs not being archived anywhere

I'm still working on our logging solution. We do have Loki up and running, but if I look at Argo, there are no logs from the workflows (which I think is because the workflows aren't logging at all!).

Permanently storing those logs is something I can work on too, but requires a bit more infrastructure on the Azure side. We'll get there.

The Loki logs stick around for weeks currently. The Argo Workflows UI gets logs from the container, so when the container goes away, the logs do too... but that's separate from Loki.

@alexgleith self-assigned this Dec 4, 2023
@alexgleith
Contributor Author

I'm going to work on this eventually, so self-assigned.

@alexgleith
Contributor Author

Logging is ok. Could be more detailed during processing, but that's probably a per-process issue.
