adds fine tuning notebook and sample datasets #60

justin-cechmanek · 2025-01-31T00:48:45Z

No description provided.

tylerhutcherson

Left a few feedback items. Nice work, Justin!

tylerhutcherson · 2025-01-31T14:35:44Z

python-recipes/finetuning/datasets/sample_dataset.csv

Let's put this dataset in the public S3 bucket redis-ai-resources and then use wget or curl to download it. You will see folders in there for some of the others. We are trying to move more in this direction.
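For a notebook cell that should not depend on wget or curl being installed, a minimal Python sketch of the same download could look like the following. The bucket name comes from the comment above; the object key, the `public_s3_url` and `download_dataset` helpers, and the local `datasets` folder are hypothetical and would need to match wherever the file actually lands in the bucket.

```python
import urllib.request
from pathlib import Path

BUCKET = "redis-ai-resources"  # bucket name taken from the review comment
# Hypothetical object key: the real folder layout in the bucket may differ.
DATASET_KEY = "finetuning/datasets/sample_dataset.csv"


def public_s3_url(bucket: str, key: str) -> str:
    """Build the virtual-hosted-style URL of a publicly readable S3 object."""
    return f"https://{bucket}.s3.amazonaws.com/{key.lstrip('/')}"


def download_dataset(bucket: str, key: str, dest_dir: str = "datasets") -> Path:
    """Download the object once and return the local path (skips if already present)."""
    dest = Path(dest_dir) / Path(key).name
    dest.parent.mkdir(parents=True, exist_ok=True)
    if not dest.exists():
        urllib.request.urlretrieve(public_s3_url(bucket, key), str(dest))
    return dest
```

Calling `download_dataset(BUCKET, DATASET_KEY)` at the top of the notebook would then replace reading the CSV from the repository. This only works if the object is publicly readable.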

tylerhutcherson · 2025-01-31T14:36:00Z

python-recipes/finetuning/datasets/sample_testset.csv

Same - let's move to S3 similar to above

tylerhutcherson · 2025-01-31T14:47:41Z

python-recipes/finetuning/00_text_finetuning.ipynb

Nice! This is looking good. A few recommendations:

Let's downplay the semantic caching use case specifics a bit more?

Add hyperlinks to some of the relevant fine-tuning papers where possible (contrastive loss, etc.), or to alternative techniques that could apply to other use cases (leaning more general)

Potentially reduce the number of plots to just a few

Is it possible to use the RedisVL Hugging Face vectorizer class to serve the embeddings here (just to add some value prop)?
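For context on the contrastive loss mentioned in the list above, here is a minimal pure-Python sketch of the classic pairwise formulation (Hadsell et al., 2006). It is not taken from the notebook: the Euclidean distance, the default margin of 1.0, and the 0.5 scaling factor are assumptions, and the notebook may well use a different distance (for example cosine) or a library implementation.

```python
import math
from typing import Sequence


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(
    a: Sequence[float],
    b: Sequence[float],
    label: int,
    margin: float = 1.0,
) -> float:
    """Pairwise contrastive loss.

    label == 1: the pair should be similar, so any distance is penalized.
    label == 0: the pair should be dissimilar, so it is penalized only
                while the two embeddings are closer than `margin`.
    """
    d = euclidean_distance(a, b)
    if label == 1:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2
```

Similar pairs are pulled together and dissimilar pairs are pushed apart until they are at least `margin` away, after which they contribute nothing to the loss.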

justin-cechmanek · 2025-01-31

Yes to the first 3 points.
Not sure we can use our Hugging Face vectorizer here, as it takes a string model name and pulls existing models from the Hugging Face Hub. I don't see a way to insert a local fine-tuned model. We could wrap it in our Custom vectorizer if we want to use it with our cache.
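The wrapping idea could be sketched roughly as below. This is a dependency-free illustration, not RedisVL code: `LocalModelEmbedder` and `ToyModel` are hypothetical names, and the only assumption about the fine-tuned model is that it exposes an `encode(list_of_strings)` method returning one vector per input, as sentence-transformers models do. The resulting `embed` callable is the kind of plain function a callable-based custom vectorizer could be handed.

```python
from typing import List, Sequence


class LocalModelEmbedder:
    """Hypothetical adapter: exposes a local model as plain embedding callables."""

    def __init__(self, model) -> None:
        # `model` is assumed to have encode(list[str]) -> sequence of vectors.
        # With sentence-transformers this could be a model loaded from a
        # local directory holding the fine-tuned weights.
        self._model = model

    def embed(self, text: str) -> List[float]:
        """Embed one string and return a plain list of floats."""
        return [float(x) for x in self._model.encode([text])[0]]

    def embed_many(self, texts: Sequence[str]) -> List[List[float]]:
        """Embed several strings in one call to the underlying model."""
        return [[float(x) for x in row] for row in self._model.encode(list(texts))]


class ToyModel:
    """Stand-in for a fine-tuned model, used only to make the sketch runnable."""

    def encode(self, sentences: Sequence[str]) -> List[List[float]]:
        # Two toy features per sentence: character count and space count.
        return [[float(len(s)), float(s.count(" "))] for s in sentences]


embedder = LocalModelEmbedder(ToyModel())
vector = embedder.embed("hello world")
```

Swapping `ToyModel()` for the real fine-tuned model would leave the adapter unchanged; whether the cache accepts such a callable directly depends on the vectorizer class used.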

justin-cechmanek added 2 commits January 29, 2025 17:43

adds initial finetuning notebook

0c080d3

adds finetuning notebook and sample datasets

1523be6

justin-cechmanek requested review from tylerhutcherson and rbs333 January 31, 2025 00:49

tylerhutcherson requested changes Jan 31, 2025


justin-cechmanek added 2 commits January 31, 2025 11:37

moves sample data to S3 bucket

9c79ee2

adds link to contrastive loss paper. removes some plots

c86478f
