adds fine tuning notebook and sample datasets #60

Open
wants to merge 4 commits into
base: main

Conversation

justin-cechmanek
Contributor

No description provided.

Collaborator

@tylerhutcherson left a comment

Left a few feedback items. Nice work, Justin!

Collaborator

Let's put this dataset in the public S3 bucket redis-ai-resources and then use wget or curl to download it. You will see folders in there for some of the others. We are trying to move more in this direction.
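
A minimal sketch of what the notebook's download cell could look like once the files are in the bucket. The bucket name comes from the comment above; the object key `finetuning/sample_dataset.csv` and the local `datasets/` directory are hypothetical placeholders, and the URL assumes the bucket allows anonymous reads over the standard virtual-hosted S3 endpoint.

```python
import os
import urllib.request

BUCKET = "redis-ai-resources"  # bucket named in the review comment


def s3_public_url(bucket: str, key: str) -> str:
    """Build the virtual-hosted-style HTTPS URL for a public S3 object."""
    return f"https://{bucket}.s3.amazonaws.com/{key.lstrip('/')}"


def fetch_dataset(key: str, dest_dir: str = "datasets", bucket: str = BUCKET) -> str:
    """Download `key` into `dest_dir` unless a local copy already exists."""
    os.makedirs(dest_dir, exist_ok=True)
    dest = os.path.join(dest_dir, os.path.basename(key))
    if not os.path.exists(dest):
        # Only hit the network when the file is missing locally.
        urllib.request.urlretrieve(s3_public_url(bucket, key), dest)
    return dest


# Hypothetical key; replace with the real path once the files are uploaded.
# path = fetch_dataset("finetuning/sample_dataset.csv")
```

In a notebook cell the same download is a one-liner with `!wget <url>` or `!curl -O <url>`; the Python version only adds the skip-if-present check so re-running the notebook does not re-download.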

Collaborator

Same here: let's move this to S3, similar to the above.

Collaborator

@tylerhutcherson Jan 31, 2025

Nice! This is looking good. A few recommendations:

  • Let's downplay the semantic caching use case specifics a bit more?
  • Add hyperlinks to some of the relevant fine-tuning papers where possible (contrastive loss, etc.), or to alternative techniques that could apply to other use cases (leaning more general)
  • Potentially reduce the number of plots to just a few
  • Is it possible to use the RedisVL Hugging Face vectorizer class to serve the embeddings here (just to insert some value prop)?
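
For context on the contrastive loss mentioned above, here is a dependency-free sketch of the classic pairwise form. It is the textbook formulation, not necessarily the exact loss used in the notebook; the label convention (1 = similar pair, 0 = dissimilar pair), the Euclidean distance, and the default margin of 1.0 are assumptions made for illustration.

```python
import math
from typing import Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length embeddings."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(
    emb_a: Sequence[float],
    emb_b: Sequence[float],
    label: int,
    margin: float = 1.0,
) -> float:
    """Pairwise contrastive loss for a single pair.

    Similar pairs (label=1) are penalized by their squared distance.
    Dissimilar pairs (label=0) are penalized only while they sit
    closer together than `margin`.
    """
    d = euclidean(emb_a, emb_b)
    pull = label * d ** 2
    push = (1 - label) * max(0.0, margin - d) ** 2
    return 0.5 * (pull + push)
```

The margin is what keeps the model from pushing dissimilar pairs apart indefinitely: once a negative pair is farther apart than the margin, it contributes nothing to the loss.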

Contributor Author

Yes to the first 3 points.
Not sure we can use our Hugging Face vectorizer here, as it takes a string model name and pulls existing models from the Hugging Face Hub. I don't see a way to insert a local fine-tuned model. We could wrap it in our custom vectorizer if we want to use it with our cache.

2 participants