Recipe for embedding reduction [Videos] #418

plon-Susk7 · 2024-10-24T10:46:18Z

This PR is related to issue #410

Added a notebook example for embedding reduction using the dimension_reduction operator.
Used the Hugging Face dataset "sayakpaul/ucf101-subset", which has 10 classes, as the example.
Used the vid_vec_rep_clip operator to extract embeddings.

aatmanvaidya · 2024-10-24T11:30:47Z

@plon-Susk7 great work!
just a few things

1. Can we add 1-2 more lines of description in the markdown? You have the headings; just one or two more lines explaining the process, e.g. that you are extracting embeddings using CLIP, etc.
2. I am guessing that you have to install the huggingface-hub and matplotlib libraries? Where are you installing them, in the .venv? What if we install them in the notebook? What I mean is, what if the first cell is something like this:

!pip install huggingface-hub
!pip install matplotlib
!pip install datasets

This way we make sure the user only has to run the notebook and does not have to worry about fixing package install issues.
3. The final plot looks great. Is there a chance the plot could be a bit more spread out? Right now the thumbnails overlap a lot.
4. Also, how many videos are there in the dataset?
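
A variant of the install cell suggested in item 2 could skip packages that are already present. This is only a minimal sketch, not the cell used in the notebook: the helper names `missing_packages` and `install` are hypothetical, and the mapping from pip names to module names is an assumption based on the three packages listed above.

```python
import importlib.util
import subprocess
import sys

# Maps pip distribution name -> importable module name (assumed mapping).
REQUIRED = {
    "huggingface-hub": "huggingface_hub",
    "matplotlib": "matplotlib",
    "datasets": "datasets",
}


def missing_packages(required):
    """Return the pip names whose module cannot be found in this environment."""
    return [
        pip_name
        for pip_name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


def install(packages):
    """Install packages into the interpreter that is running the notebook."""
    if packages:
        subprocess.check_call([sys.executable, "-m", "pip", "install", *packages])


# In the notebook's first cell one would call:
# install(missing_packages(REQUIRED))
```

Calling pip through `sys.executable` installs into the same environment the kernel runs in, which avoids the case where `!pip` points at a different Python than the notebook.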

aatmanvaidya · 2024-10-24T11:31:28Z

src/notebooks/03_plot_tsne_videos.ipynb

+    "\n",
+    "dataset_name = \"UCF101_subset/train\"\n",
+    "hf_dataset_identifier = \"sayakpaul/ucf101-subset\"\n",
+    "filename = \"UCF101_subset.tar.gz\"\n",


Does the tar.gz file get deleted at the end?

I added code to remove the tar.gz file from the cache.
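
For readers who want to see what such a cleanup step might look like, here is a minimal sketch. It is not the code from the notebook: the helper name `remove_cached_archive` is hypothetical, and it assumes the archive path is whatever the download call returned. Because the Hugging Face cache typically exposes files as symlinks into a blob store, the sketch also removes the symlink target when there is one.

```python
from pathlib import Path


def remove_cached_archive(archive_path):
    """Delete a downloaded archive; return True if something was removed.

    If the path is a symlink (as in the Hugging Face cache layout), the
    file it points to is removed as well so the disk space is freed.
    """
    path = Path(archive_path)
    if not path.exists() and not path.is_symlink():
        return False  # nothing to clean up
    target = path.resolve()  # real file behind a possible symlink
    path.unlink()
    if target != path and target.is_file():
        target.unlink()
    return True
```

In the notebook this would be called once the archive has been extracted, for example `remove_cached_archive(archive_path)` where `archive_path` is the local path of `UCF101_subset.tar.gz`.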

plon-Susk7 · 2024-10-24T12:50:59Z

> This way we make sure the user has to only run the notebook and they should not worry about fixing package install issue.
> 3. The final plot looks great, is there a chance that the plot is a bit more spatial, like right now the thumbnails overlap a lot
> 4. Also, how many videos are there in the dataset?

There are 10 classes in the dataset, so I took 5 from each. In total the notebook processes 50 videos.
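
The selection described here (5 videos from each of 10 classes, 50 in total) can be sketched as follows. This is an illustration under stated assumptions, not the notebook's code: it assumes the extracted `UCF101_subset/train` directory contains one sub-folder per class with `.avi` files inside, and the helper name `sample_videos` is hypothetical.

```python
from pathlib import Path


def sample_videos(train_dir, per_class=5, suffix=".avi"):
    """Pick the first `per_class` videos (sorted by name) from each class folder.

    Returns a dict mapping class name -> list of video paths.
    """
    samples = {}
    for class_dir in sorted(Path(train_dir).iterdir()):
        if not class_dir.is_dir():
            continue  # skip stray files next to the class folders
        videos = sorted(class_dir.glob(f"*{suffix}"))
        samples[class_dir.name] = videos[:per_class]
    return samples
```

With 10 class folders and at least 5 videos in each, `sum(len(v) for v in sample_videos("UCF101_subset/train").values())` gives the 50 videos mentioned above. Sorting before slicing keeps the selection reproducible across runs.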

aatmanvaidya · 2024-10-24T13:07:25Z

@plon-Susk7 this looks great, merging the PR now

Can you also do one small thing: on issue #410, can you write detailed instructions on how to download the Jupyter notebook into the .venv, then exec into the Docker container and run it?
I will add those instructions to the wiki.

added example for reduction

cc6c0a9

aatmanvaidya self-requested a review October 24, 2024 11:17

aatmanvaidya reviewed Oct 24, 2024

plon-Susk7 added 2 commits October 24, 2024 18:14

made changes

798520a

added code to remove from cache

84ffd7b

aatmanvaidya merged commit 98b9336 into tattle-made:development Oct 24, 2024
4 of 5 checks passed

plon-Susk7 deleted the development branch October 25, 2024 06:19

aatmanvaidya mentioned this pull request Dec 23, 2024

Create Recipes for common use case #410

Closed
