Create a vector search from youtube audio transcripts #289
Comments
Hi!
Please update the ticket |
Guys, any of you can contribute. Let's not wait for approval. We can start working and raise a PR whenever we want 🙌🏻 |
Hi all. Glad to see the enthusiasm here :) You don't have to ask permission to begin working on tickets. Please raise PRs and comment links to PRs here. I'll not be assigning anyone the ticket as such now |
Hey team. Please raise a draft PR that we can review to see if everyone is going in the right direction. Thanks. |
@ChakshuGautam I'm facing this issue while working in the Colab environment |
@kartikf4 Is this happening on non colab env as well? Any alternatives to this package that you tried out? |
@ChakshuGautam Well, I didn't try it in a local env, but I did try the alternative, yt-dlp (check here) |
Probably has something to do with Colab. Let's do it locally. |
Hi, I want to contribute to this. Can you assign it to me? |
@ChakshuGautam https://pypi.org/project/youtube-transcript-api/ gives the transcripts for all videos in English/Hindi (from the auto generated cc). |
@ChakshuGautam, @GautamR-Samagra, on further improvements to the issue:
|
@xorsuyash can you share a draft PR anyway so that we can review in chunks? |
@ChakshuGautam I've raised a draft PR |
Hey @xorsuyash, let's drop the vector and ColBERT parts until the issue is resolved. Single API: Also, I have some questions:
|
@xorsuyash Thanks for completing this. cc: @Shruti3004, @ChakshuGautam |
Description
Be able to parse all the videos from a YouTube channel or YouTube playlist, extract transcripts from their audio, and embed them in a vector DB to enable search and retrieval over them.
Implementation Details
It will include the following:
Can use https://github.com/ytdl-org/youtube-dl for scraping
Can use https://www.youtube.com/@3blue1brown as initial test set for the above
The ticket for using ColBERT is covered here; you only need to make it work locally using the notebook.
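To make the intended flow concrete, here is a minimal sketch of the chunk, embed, and search steps once transcripts are available. It is not the project's implementation. The segment shape (dicts with `start` and `text` keys), the `chunk_segments` and `TranscriptIndex` names, and the bag-of-words "embedding" are all assumptions made for illustration; a real setup would swap in a proper embedding model (or ColBERT) and a vector DB, and would fetch the segments with a scraper or transcript package.

```python
import math
import re
from collections import Counter


def chunk_segments(segments, max_words=40):
    """Group consecutive transcript segments into chunks of roughly max_words words.

    `segments` is assumed to be a list of {"start": float, "text": str} dicts
    (a hypothetical shape chosen for this sketch).
    """
    chunks, current, start = [], [], None
    for seg in segments:
        if start is None:
            start = seg["start"]
        current.append(seg["text"])
        if sum(len(t.split()) for t in current) >= max_words:
            chunks.append({"start": start, "text": " ".join(current)})
            current, start = [], None
    if current:  # flush the trailing partial chunk
        chunks.append({"start": start, "text": " ".join(current)})
    return chunks


def embed(text):
    """Toy embedding: bag-of-words term counts (placeholder for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class TranscriptIndex:
    """In-memory stand-in for a vector DB (hypothetical helper)."""

    def __init__(self):
        self.entries = []  # (video_id, chunk, vector)

    def add(self, video_id, chunks):
        for chunk in chunks:
            self.entries.append((video_id, chunk, embed(chunk["text"])))

    def search(self, query, k=3):
        """Return the top-k (video_id, start, text, score) tuples for a query."""
        q = embed(query)
        scored = [(cosine(q, vec), vid, chunk) for vid, chunk, vec in self.entries]
        scored.sort(key=lambda item: item[0], reverse=True)
        return [(vid, chunk["start"], chunk["text"], score)
                for score, vid, chunk in scored[:k]]


# Made-up sample segments, only to show the call pattern.
sample_segments = [
    {"start": 0.0, "text": "a vector has a direction and a length"},
    {"start": 5.0, "text": "eigenvalues describe how a matrix stretches space"},
]
index = TranscriptIndex()
index.add("video-1", chunk_segments(sample_segments, max_words=5))
top_hit = index.search("matrix eigenvalues", k=1)[0]
```

Returning the chunk's start time alongside the text lets a search result link back to the matching point in the video.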
Product Name
AI Tools
Organization Name
SamagraX
Domain
NA
Tech Skills Needed
PyTorch/Python, ML
Category
Feature
Mentor(s)
@GautamR-Samagra
Complexity
Medium