Project Set Up #2
@brucearctor I need some help here. I am not able to run the script (to generate the audio files from the videos) on Case HPC. Do I need to create a Docker environment for this?
Ultimately, yes, everything needs to be able to run in the infra -- likely just a bit of specific containerizing/packaging to get things working (python/tensorflow/etc. will run in that environment). Check with the community (e.g. Slack) or hop on one of the Wednesday or Friday calls for some preliminary tips, if needed.
For clips, see https://sites.google.com/case.edu/techne-data-requests/home
@turnermarkb Sorry, I could not fully understand your comments. I was thinking of running the audio processing on the "/mnt/rds/redhen/gallina/tv/2022" folder first, then on the other years (2021, ...), and generating the audio files in my Gallina home. After that I plan to do the tagging and store the results in a safe place. Is this the correct approach, or do we need to run this on some specified set of files?
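A minimal sketch of the extraction step being described, assuming ffmpeg is available and the source files are .mp4 videos; the output folder, file pattern, and sample rate are illustrative assumptions, not values confirmed in this thread:

```python
import subprocess
from pathlib import Path

SOURCE_DIR = Path("/mnt/rds/redhen/gallina/tv/2022")   # read-only source tree (layout assumed)
TARGET_DIR = Path.home() / "audio_out" / "2022"        # hypothetical output folder in the user's Gallina home

def extract_audio(video: Path, target_root: Path) -> Path:
    """Extract a mono 16 kHz WAV from one video file with ffmpeg."""
    out_path = target_root / video.relative_to(SOURCE_DIR).with_suffix(".wav")
    out_path.parent.mkdir(parents=True, exist_ok=True)
    subprocess.run(
        ["ffmpeg", "-y", "-i", str(video),
         "-vn",           # drop the video stream
         "-ac", "1",      # mono
         "-ar", "16000",  # 16 kHz, a common rate for audio tagging models
         str(out_path)],
        check=True,
    )
    return out_path

if __name__ == "__main__":
    for video in sorted(SOURCE_DIR.rglob("*.mp4")):
        extract_audio(video, TARGET_DIR)
```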
This seems like a good approach to me. How many clips, how will you make them, how much storage? Gallina is vast.
m
@brucearctor I was able to create a TensorFlow-based local Docker image from the GitHub workflows. Then I created a local Singularity container and copied it to the HPC. Now I plan to run the container on the HPC to execute my scripts. Can you please check whether I am going in the correct direction (blog: https://technosaby.github.io/gsoc/phase1/week5)? The latest code is on the main branch.
Just mentioning that we don’t require a Singularity container until near the end of your project. It’s fine to work on the code outside of Singularity until you’ve got it in good shape and then to make the container.
m
@turnermarkb Thanks for your suggestion. Can you please explain what you mean by "outside of Singularity"? Do you mean only using Docker and not Singularity? If so, I could not find a way to run Docker containers directly on the HPC. Please let me know if there is a resource I have missed. So I am using this approach: build scripts (local) -> put in a Docker container (local) -> build a Singularity .sif image (local) -> copy to HPC -> execute the container (HPC). As there is no existing audio pipeline, I am planning to build all the audio data from the videos in /mnt/rds/redhen/gallina/tv/2021 and extend it to other videos (years) later. As the size of this is big, I need to run it on the HPC. Please let me know if this approach is correct or if there is a faster way to do things, as the TensorFlow-based .sif image is also around 3 GB, so it takes a good amount of time to copy.
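A minimal sketch of that local build-and-ship flow, written as a small Python helper around subprocess; the Docker tag, the .sif filename, the HPC host/path, and the tagging script name are assumptions for illustration, not values confirmed in this thread:

```python
import subprocess

IMAGE = "audio-tagging:latest"                          # hypothetical local Docker tag
SIF = "audio_tagging.sif"                               # hypothetical Singularity image name
HPC_DEST = "user@rider.case.edu:/scratch/users/user/"   # assumed HPC login host and scratch path

def run(cmd):
    """Run one shell step, echoing it first and failing loudly on error."""
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# 1. Build the Docker image locally from the project's Dockerfile.
run(["docker", "build", "-t", IMAGE, "."])

# 2. Convert the image from the local Docker daemon into a Singularity .sif file.
run(["singularity", "build", SIF, f"docker-daemon://{IMAGE}"])

# 3. Copy the .sif to the HPC; on the cluster it can then be run with
#    `singularity exec audio_tagging.sif python tag_audio.py ...` (script name hypothetical).
run(["scp", SIF, HPC_DEST])
```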
I think that @turnermarkb is also saying -- no need (and probably not even desired ... until the end of your project) to get things running over years of data. It is great if you are prepared to do so, but you'll want to run it over years with what you determine to be the optimal model, which I imagine you will iterate on throughout the summer.
No, I mean you don’t need to use docker or singularity if it’s convenient not to, until the end;
we need your project to end up in a container, but some coders prefer to work without a container.
There are Red Hen audio pipelines: https://www.redhenlab.org/home/the-cognitive-core-research-topics-in-red-hen/audio-processing-pipeline
m
@brucearctor Thanks for your comments. I will keep this task for later and work on baselining. For now I am processing the audio using my script.
Yes.
m
Final model updates and merging into the Singularity container for delivery will be taken care of in the last milestone.
As discussed in the last meeting with @turnermarkb, since the tagging is being done properly, it is the right time to do the packaging and then focus on improving things from there. So I will work on making a Singularity image from my codebase.
Yes, start with the baseline of things working -- tagging works, now operationalize with good foundations -- then optimize/retrain/improve.
After copying the video files from /mnt/rds/redhen/gallina to my scratch folder and then running the scripts using the Singularity container built from the Dockerfile, all tags get generated properly. @brucearctor @turnermarkb
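A minimal sketch of what that run step could look like on the cluster, again via subprocess; the scratch path, the container filename, the batch size, and the entry script are hypothetical placeholders, not values confirmed in this thread:

```python
import shutil
import subprocess
from pathlib import Path

GALLINA = Path("/mnt/rds/redhen/gallina/tv/2022")      # source videos (assumed subset)
SCRATCH = Path("/scratch/users/user/videos")           # hypothetical scratch staging area
SIF = Path("/scratch/users/user/audio_tagging.sif")    # the container copied over earlier

SCRATCH.mkdir(parents=True, exist_ok=True)

# Stage a small batch of videos onto scratch before processing.
for video in sorted(GALLINA.rglob("*.mp4"))[:10]:
    shutil.copy2(video, SCRATCH / video.name)

# Run the tagging script inside the container, bind-mounting scratch so it is visible.
subprocess.run(
    ["singularity", "exec", "--bind", f"{SCRATCH}:{SCRATCH}", str(SIF),
     "python", "tag_audio.py", "--input", str(SCRATCH)],  # tag_audio.py is a placeholder name
    check=True,
)
```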
The idea is to prepare the project setup in Singularity inside Red Hen's infrastructure.