Project Set Up #2

technosaby · 2022-06-08T13:30:39Z

The idea is to prepare the project setup in a Singularity container inside Red Hen's infrastructure.

technosaby · 2022-06-21T09:35:24Z

@brucearctor I need some help here. I am not able to run the script (to generate the audio files from video) on the Case HPC. Do I need to create a Docker environment for it?

brucearctor · 2022-06-21T15:10:45Z

Ultimately, yes, everything needs to be able to run in the infra -- likely just a bit of specific containerizing/packaging to get things working [ python/tensorflow/etc will run in that environment ]. Check with the community ( ex: slack ) or hop on one of the calls Wednesday or Friday for some preliminary tips, if needed.

turnermarkb · 2022-06-21T16:19:52Z

For clips, see https://sites.google.com/case.edu/techne-data-requests/home

technosaby · 2022-06-22T01:46:52Z

@turnermarkb Sorry, I could not understand your comment. I was thinking of running the audio processing on the "/mnt/rds/redhen/gallina/tv/2022" folder first, then on other years (2021, ...), and generating the audio files in my Gallina home. After that I plan to do the tagging and store the results in safe. Is this the correct approach, or do we need to run this on some specified set of files?
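The audio-extraction step described above could look roughly like the sketch below. It is not the project's actual script: it assumes `ffmpeg` is available on the PATH, that the videos are `.mp4` files, and that mono 16 kHz WAV output is wanted. The function names and defaults are hypothetical.

```python
from __future__ import annotations

import subprocess
from pathlib import Path


def build_ffmpeg_cmd(video: Path, out_dir: Path, sample_rate: int = 16000) -> list[str]:
    """Build an ffmpeg command that writes the audio track of `video` as a mono WAV."""
    wav = out_dir / (video.stem + ".wav")
    return [
        "ffmpeg", "-y",           # overwrite an existing output file
        "-i", str(video),         # input video
        "-vn",                    # drop the video stream
        "-ac", "1",               # mono (assumption)
        "-ar", str(sample_rate),  # sample rate in Hz (assumption)
        str(wav),
    ]


def extract_all(src_dir: Path, out_dir: Path, pattern: str = "*.mp4",
                dry_run: bool = True) -> list[list[str]]:
    """Build (and optionally run) one ffmpeg command per video found under `src_dir`."""
    out_dir.mkdir(parents=True, exist_ok=True)
    cmds = [build_ffmpeg_cmd(v, out_dir) for v in sorted(src_dir.rglob(pattern))]
    if not dry_run:
        for cmd in cmds:
            subprocess.run(cmd, check=True)  # raises if ffmpeg fails
    return cmds
```

Keeping `dry_run=True` as the default lets the command list be inspected before anything is written to the output folder.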

turnermarkb · 2022-06-22T02:56:18Z

This seems like a good approach to me. How many clips, how will you make them, how much storage? Gallina is vast. m
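A rough way to answer the storage question before extracting anything is to estimate the size of uncompressed audio from clip count and duration. The sketch below assumes 16-bit PCM WAV output; the sample rate, channel count, and the example figures are illustrative assumptions, not measurements from Gallina.

```python
def wav_bytes(duration_s: float, sample_rate: int = 16000,
              channels: int = 1, bytes_per_sample: int = 2) -> int:
    """Approximate size of uncompressed PCM audio (ignores the small WAV header)."""
    return int(duration_s * sample_rate * channels * bytes_per_sample)


def total_gib(n_clips: int, avg_duration_s: float, **audio_params) -> float:
    """Estimated total storage in GiB for `n_clips` clips of average length `avg_duration_s`."""
    return n_clips * wav_bytes(avg_duration_s, **audio_params) / 2**30


# Hypothetical example: 1000 one-hour recordings, mono, 16 kHz, 16-bit.
print(f"{total_gib(1000, 3600):.1f} GiB")
```

With those hypothetical inputs the estimate is a little over 100 GiB, which suggests checking the available quota before running over a whole year.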

technosaby · 2022-06-26T10:41:34Z

@brucearctor I was able to create a TensorFlow-based local Docker image from GitHub workflows. Then I created a local Singularity container and copied it to the HPC. Now I plan to run the container on the HPC to execute my scripts.

Can you please check whether I am going in the right direction (blog: https://technosaby.github.io/gsoc/phase1/week5)? The latest code is in the main branch.

turnermarkb · 2022-06-26T15:40:36Z

Just mentioning that we don’t require a Singularity container until near the end of your project. It’s fine to work on the code outside of Singularity until you’ve got it in good shape and then to make the container. m

technosaby · 2022-06-26T18:33:00Z

@turnermarkb Thanks for your suggestion. Can you please explain what you mean by "outside of Singularity"? Do you mean using only Docker and not Singularity? If so, I could not find a way to run Docker containers directly on the HPC. Please let me know if there is a resource I have missed.

So I am using this approach:

Build scripts (local) -> Put in Docker container (local) -> Build Singularity container SIF image (local) -> Copy to HPC -> Execute container (HPC).

As there is no existing audio pipeline, I am planning to build all the audio data from the videos in /mnt/rds/redhen/gallina/tv/2021 and extend it to other years later. As the data is large, I need to run it on the HPC.

Please let me know if this approach is correct or if there is a faster way to do this. The TensorFlow-based SIF image is around 3 GB, so it takes a good amount of time to copy.
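The local-build-then-copy workflow above can be written down as a list of commands, as in the sketch below. The image name, SIF file name, host, remote directory, and script name are all hypothetical placeholders. It also assumes `docker`, `singularity`, `scp`, and `ssh` are installed where each step runs; a cluster may additionally need a module to be loaded before `singularity` is available.

```python
from __future__ import annotations


def pipeline_commands(image: str, sif: str, hpc_host: str,
                      remote_dir: str, script: str) -> list[list[str]]:
    """Return the shell commands for: Docker build -> SIF build -> copy -> run on HPC."""
    return [
        # 1. Build the Docker image locally from the Dockerfile in the current directory.
        ["docker", "build", "-t", image, "."],
        # 2. Convert the image held by the local Docker daemon into a SIF file.
        ["singularity", "build", sif, f"docker-daemon://{image}"],
        # 3. Copy the SIF file to the cluster.
        ["scp", sif, f"{hpc_host}:{remote_dir}/"],
        # 4. Run the script inside the container on the cluster.
        ["ssh", hpc_host, "singularity", "exec", f"{remote_dir}/{sif}", "python", script],
    ]


for cmd in pipeline_commands("tagger:latest", "tagger.sif",
                             "user@hpc.example.org", "/scratch/user", "run.py"):
    print(" ".join(cmd))
```

If the Docker image is pushed to a registry, running `singularity pull docker://<user>/<image>:<tag>` on the cluster side might avoid copying the 3 GB file; whether the cluster permits that is not confirmed in this thread.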

brucearctor · 2022-06-26T20:33:50Z

@technosaby -- I like that you're getting containers going. It does seem there is a possibility that CWRU HPC might support docker, in addition to singularity. The path you're on re: docker/singularity, seems fine for your current stage. Singularity isn't going to hurt anything -- ultimately, the choice of runtime singularity/docker should just be one minor implementation detail [ even though getting to work and run in HPC infrastructure is required ].
Prove that the tagging and a 'pipeline' can work for a single video file, then multiple, then more ... don't worry about addressing for year/years at this time. You'll want to explore [ at some times manually ] over many files to ensure you're happy with the performance of your tagger, and that the output produced on a given file is in the desired format.

I think that @turnermarkb is also saying -- no need ( and probably not even desired ... until the end of your project ) to get things running over years of data. It is great if you are prepared to do so, but you'll want to run it over years with what you determine to be the optimal model, which I imagine that you will iterate on throughout the summer.
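The "single file, then multiple, then more" progression can be sketched as a small helper that hands out growing batches of files. The stage sizes and file names below are arbitrary assumptions for illustration.

```python
from __future__ import annotations

from typing import Iterator, Sequence


def staged_batches(files: Sequence[str],
                   stages: Sequence[int] = (1, 5, 25)) -> Iterator[list[str]]:
    """Yield growing prefixes of `files`: one file, then a few, then more."""
    for n in stages:
        yield list(files[:n])
        if n >= len(files):
            break  # every file has been covered; later stages would repeat this batch


videos = [f"video_{i}.mp4" for i in range(10)]  # hypothetical file names
for batch in staged_batches(videos):
    print(len(batch), "file(s) in this stage")
```

Between stages, the tagger output for that batch can be checked by hand before moving on to a larger one.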

turnermarkb · 2022-06-26T22:55:23Z

No, I mean you don’t need to use docker or singularity if it’s convenient not to, until the end; we need your project to end up in a container, but some coders prefer to work without a container. There are Red Hen audio pipelines: https://www.redhenlab.org/home/the-cognitive-core-research-topics-in-red-hen/audio-processing-pipeline m

technosaby · 2022-06-27T01:46:57Z

@brucearctor Thanks for your comments. I will defer running over years of data and work on baselining first.

For now I am processing the audio using my script.

turnermarkb · 2022-06-27T04:13:58Z

Yes. m

technosaby · 2022-07-02T10:52:59Z

Final model updates and merging into the Singularity container for delivery will be taken care of in the last milestone.

technosaby · 2022-07-28T06:33:48Z

As discussed in the last meeting with @turnermarkb, since the tagging is working properly, this is the right time to do the packaging and then focus on improving the tagging from there. So I will work on making a Singularity image from my codebase.

brucearctor · 2022-07-28T15:45:43Z

Yes, start with the baseline of things working -- tagging works, now operationalize with good foundations -- then optimize/retrain/improve.

technosaby · 2022-07-31T12:00:35Z

After copying the video files from /mnt/rds/redhen/gallina to my scratch folder and then running the scripts in the Singularity container built from the Dockerfile, all tags are generated properly. @brucearctor @turnermarkb

technosaby self-assigned this Jun 8, 2022

technosaby added the enhancement (New feature or request) label Jun 8, 2022

technosaby added this to @technosaby's Tagging Sound Effects Jun 8, 2022

technosaby added this to the M1 - Complete Initial Set Up milestone Jun 8, 2022

technosaby moved this to Todo in @technosaby's Tagging Sound Effects Jun 16, 2022

technosaby moved this from Todo to In Progress in @technosaby's Tagging Sound Effects Jun 18, 2022

technosaby modified the milestones: M1 - Complete Initial Set Up, M3 - Transfer Learning, Final Model Preparation Jul 2, 2022

technosaby moved this from In Progress to Review in progress in @technosaby's Tagging Sound Effects Aug 1, 2022

technosaby moved this from Review in progress to Done in @technosaby's Tagging Sound Effects Aug 7, 2022
