
filter_by_log in print_tasks doesn't seem to be filtering #14

Open
alexgleith opened this issue Sep 29, 2023 · 2 comments
@alexgleith (Contributor)
I ran 0.0.5 and then re-ran it; all the tiles were added to the list of jobs to run, and each was then skipped as already completed.

I thought the print_tasks code would not even put the completed tasks on the queue.

@jessjaco (Collaborator)
I'm guessing you ran it with --datetime 2016/2022? If not, please tell me what params you used.

I need to think about how to handle this case. For the coastlines, this would indicate there is a single 2016/2022 product (like a seven-year mosaic). But for mangroves, we are iterating over years. So the logger is checking the 2016/2022 log and seeing there are no products there (i.e. this path: https://deppcpublicstorage.blob.core.windows.net/output/dep_s2_mangroves/0-0-5/logs/dep_s2_mangroves_2016-2022_log.csv). But it should be checking individual years, which are complete (see https://deppcpublicstorage.blob.core.windows.net/output/dep_s2_mangroves/0-0-5/logs/dep_s2_mangroves_2016_log.csv).

We can change the setup to accommodate this, but I should also think about your desire to migrate from the handy, convenient, compact, and readable CSV-based logs to pinging the STAC items directly.
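A minimal sketch of the mismatch described above. The names here (filter_by_log, the tile_id column, and the "complete in every year" rule) are assumptions for illustration, not the actual dep-tools API: the point is that for a 2016/2022 range the filter would need to consult the per-year completion logs rather than look for a single combined 2016-2022 log that was never written.

```python
import csv
import io


def completed_tiles(log_csv: str) -> set[str]:
    # Parse one year's completion log; a "tile_id" column is assumed.
    reader = csv.DictReader(io.StringIO(log_csv))
    return {row["tile_id"] for row in reader}


def filter_by_log(tasks: list[str], per_year_logs: dict[str, str]) -> list[str]:
    # One plausible semantics for a year range: skip a tile only if it
    # appears in EVERY iterated year's log, i.e. the whole range is done
    # for that tile. With no logs at all, nothing is filtered.
    if not per_year_logs:
        return list(tasks)
    done = set.intersection(
        *(completed_tiles(log_csv) for log_csv in per_year_logs.values())
    )
    return [tile for tile in tasks if tile not in done]
```

For example, if the 2016 log lists tiles 64_20 and 65_21 but the 2017 log lists only 64_20, then only 64_20 is dropped from the task list, since 65_21 still needs 2017.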

@alexgleith (Contributor, Author)
Yep, that’s what we did. The config we used is in the 0-0-5 file here: https://github.com/digitalearthpacific/dep-kubernetes-apps/pull/7/files

It worked fine, and 1,100 tiles were completed (skipped) really quickly, so I don't think we really need to change it. It just didn't do what I expected!

I'm pretty enthusiastic about not having an external CSV as a state file, writing normal logs to stdout and using the STAC docs as a "jobs-already-done" flag! We can talk about this later, though.
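A hedged sketch of the "STAC item as jobs-already-done flag" idea: instead of reading an external CSV state file, treat the existence of a published STAC item as proof a tile is complete. The completion predicate is injected, since how tile ids map to item URLs in the real pipeline is an assumption here.

```python
from typing import Callable, Iterable
import urllib.error
import urllib.request


def stac_item_exists(url: str) -> bool:
    # HEAD request: True if an item document is already published at url.
    try:
        with urllib.request.urlopen(urllib.request.Request(url, method="HEAD")):
            return True
    except urllib.error.HTTPError:
        return False


def filter_by_stac(tile_ids: Iterable[str],
                   is_done: Callable[[str], bool]) -> list[str]:
    # Keep only tiles whose completion flag (e.g. a STAC item) is absent.
    return [tile for tile in tile_ids if not is_done(tile)]
```

In the real pipeline, is_done could be something like lambda t: stac_item_exists(item_url_for(t)), where item_url_for (a hypothetical helper) maps a tile id to its expected STAC item URL; keeping the predicate separate also makes the filter trivially testable without network access.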
