GitHub Action Workflows for Scheduled Algorithm Deployment #25
@valentina-s, I see your CronPy schedule inside your main.yml script, which will call your script.py every 5 minutes after it is started. Where are you setting the conditions that start CronPy running? Is it always running? GitHub Actions are completely new to me.
CronPy is always running (based on the schedule). You can follow its progress here: https://github.com/valentina-s/cron_action/actions There is another workflow, called Manual workflow, which you can trigger manually or make run based on some event.
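A minimal sketch of a workflow combining both trigger styles discussed above (the `schedule` cron and a manual `workflow_dispatch` trigger). This is illustrative, not the actual main.yml; the job and action versions are assumptions:

```yaml
name: CronPy
on:
  schedule:
    - cron: "*/5 * * * *"   # every 5 minutes (scheduled runs are often delayed)
  workflow_dispatch:         # allows manual triggering from the Actions tab
jobs:
  run-script:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: python script.py
```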
Hi, @valentina-s!
Most of these might be just my misunderstanding of the idea/project. I'll send a few more follow-up questions later :D
More follow-up questions/thoughts:
In the second set of comments, you are on the right path: there can be different inputs (the streams, or the way the streams are accessed: per file? per unit of time?), and the outputs will differ based on the functions. The point of that question is for students to describe the details in the proposal (such as a few scenarios). There are already some Docker images specifically for the Orcasound data. We have some extra cloud resources, so if the GitHub Actions resources are not sufficient, having a way to utilize the cloud ones would be great!
@Molkree, you are welcome to submit a PR with the spectrogram artifact. The images will most probably start accumulating quickly and taking up a lot of space. You can change the retention policy to delete them after a few days:
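For example, the upload-artifact action accepts a `retention-days` input that overrides the repository default. The artifact name, path, and value here are illustrative:

```yaml
- uses: actions/upload-artifact@v4
  with:
    name: spectrograms
    path: spectrograms/
    retention-days: 3   # delete after 3 days instead of the repo default
```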
@valentina-s
From the docs. Alright, I'll send a PR shortly; I might also change it to process more than one file, and process the previous day by default, since there were no new files today...
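Defaulting to the previous day can be a one-liner with the standard library. A minimal sketch (the function name and the date-string prefix convention are assumptions, not from script.py):

```python
from datetime import date, timedelta

def default_target_date(today=None):
    """Fall back to the previous day, since today's files may not exist yet."""
    today = today or date.today()
    return today - timedelta(days=1)

# e.g. build a (hypothetical) file prefix like "2021-04-02"
prefix = default_target_date(date(2021, 4, 3)).isoformat()
```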
@valentina-s, quick question: have you ever seen error 404 here? `if r == 'Response [404]':` (from script.py) Edit: ahhh, alright, 404 is what I'm getting now.
Develop a workflow using GitHub Actions to apply processing functions to a stream of data, for example:
GitHub Actions can run sporadically at roughly 30-minute intervals but are free, so when the data is also publicly available (which is the case with the OOI and Orcasound streams), this approach allows anybody to test their models on the long streams and then compare the results.
Required skills: Python, GitHub, basic signal processing
Bonus skills: Docker, Cloud computing, deep learning
Possible mentor(s): Valentina @valentina-s, Scott @scottveirs, Val @veirs
References:
An example of a GitHub Action applied to the Ocean Observatories Initiative hydrophone stream:
https://github.com/orcasound/orca-action-workflow
A package to read data from the Ocean Observatories Archive:
https://ooipy.readthedocs.io/en/latest/
Getting Started:
script.py is only reading the file and creating a spectrogram, but it is not doing anything with it.
Points to consider in the proposal:
What are the inputs of the system, what are the outputs? Where will the inputs and outputs live?
What are the simple operations that can be done within the Github actions?
How can they be streamlined using Docker containers?
How can they be extended using Cloud computing resources?
How can the results be organized for easy access?
Can this be extended for multiple processes?
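As a starting point for the spectrogram step that script.py performs, here is a minimal numpy-only sketch (this is not the actual script.py; window size, hop, and the synthetic test signal are arbitrary choices):

```python
import numpy as np

def spectrogram(signal, n_fft=256, hop=128):
    """Magnitude spectrogram via a short-time FFT with a Hann window."""
    window = np.hanning(n_fft)
    frames = [signal[i:i + n_fft] * window
              for i in range(0, len(signal) - n_fft + 1, hop)]
    # rows = frequency bins (n_fft // 2 + 1), columns = time frames
    return np.abs(np.fft.rfft(frames, axis=1)).T

# 1 second of a 440 Hz tone sampled at 8 kHz
sr = 8000
t = np.arange(sr) / sr
spec = spectrogram(np.sin(2 * np.pi * 440 * t))
```

In a real workflow the array would then be saved (e.g. as an image) and uploaded as an artifact, rather than discarded.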