diff --git a/README.md b/README.md
index 4d1ff3e..b48297d 100644
--- a/README.md
+++ b/README.md
@@ -1,7 +1,16 @@
 # Starbucks
 Starbucks: Improved Training for 2D Matryoshka Embeddings
+
+
+
-### General guidelines
+We propose Starbucks: a new 2D MRL fine-tuning and pre-training method.
+
+Starbucks is composed of two key processes: the Starbucks Masked Autoencoding (SMAE) pretraining and the Starbucks Representation Learning (SRL) fine-tuning processes.
+
+In Starbucks, the model loss is computed based on a limited target list of layer-dimension pairs, ranging from smaller to larger sizes, much like how the coffeehouse chain [Starbucks](https://en.wikipedia.org/wiki/Starbucks) offers coffee in different cup sizes, from Demi to Trenta.
+
+## General guidelines
 Our codebase is built on top of torch and transformers. We recommend using a conda environment to install the required dependencies.
@@ -22,7 +31,7 @@ For SRL fine-tuning on retrieval task, see [retrieval](retrieval/README.md).
 For SRL fine-tuning on STS task, see [sts](sts/README.md).
-### Model Checkpoints
+## Model Checkpoints
 We released our model checkpoints on Hugging Face Model Hub:
@@ -30,4 +39,4 @@ Pre-trained SMAE: [bert-base-uncased-fineweb100bt-smae](https://huggingface.co/i
 Fine-tuned Starbucks_STS: [Starbucks_STS](https://huggingface.co/ielabgroup/Starbucks_STS)
-Fine-tuned Starbucks_Retrieval: [Starbucks_Retrieval](https://huggingface.co/ielabgroup/Starbucks_Retrieval)
\ No newline at end of file
+Fine-tuned Starbucks_Retrieval: [Starbucks-msmarco](https://huggingface.co/ielabgroup/Starbucks-msmarco)
\ No newline at end of file
diff --git a/Starbucks.png b/Starbucks.png
new file mode 100644
index 0000000..7eca473
Binary files /dev/null and b/Starbucks.png differ