
Embedding parameters #7458

Merged: 8 commits merged into ggerganov:master on Jun 24, 2024

Conversation
YannFollet (Contributor)

Extra parameters:

--embd-normalize $integer$

| $integer$ | description         | formula |
|-----------|---------------------|---------|
| $-1$      | none                |         |
| $0$       | max absolute int16  | $\Large{{32760 * x_i} \over\max \lvert x_i\rvert}$ |
| $1$       | taxicab             | $\Large{x_i \over\sum \lvert x_i\rvert}$ |
| $2$       | euclidean (default) | $\Large{x_i \over\sqrt{\sum x_i^2}}$ |
| $>2$      | p-norm              | $\Large{x_i \over\sqrt[p]{\sum \lvert x_i\rvert^p}}$ |
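For readers who prefer code to formulas, the table maps onto logic along these lines (a minimal sketch only, not the actual llama.cpp implementation; the signature mirrors the `llama_embd_normalize` declaration quoted later in this conversation):

```cpp
#include <cmath>

// Sketch of the normalization modes above.
// embd_norm: -1 = none, 0 = max absolute int16, 1 = taxicab, 2 = euclidean, >2 = p-norm
static void embd_normalize_sketch(const float * inp, float * out, int n, int embd_norm) {
    double sum = 0.0;
    switch (embd_norm) {
        case -1: // none: copy through unchanged
            sum = 1.0;
            break;
        case 0:  // max absolute int16: the largest |x_i| maps to 32760
            for (int i = 0; i < n; i++) {
                if (sum < std::fabs(inp[i])) sum = std::fabs(inp[i]);
            }
            sum /= 32760.0;
            break;
        case 2:  // euclidean (L2, the default)
            for (int i = 0; i < n; i++) sum += inp[i] * inp[i];
            sum = std::sqrt(sum);
            break;
        default: // taxicab (1) and general p-norm (>2)
            for (int i = 0; i < n; i++) sum += std::pow(std::fabs(inp[i]), embd_norm);
            sum = std::pow(sum, 1.0 / embd_norm);
            break;
    }
    const double norm = sum > 0.0 ? 1.0 / sum : 0.0;
    for (int i = 0; i < n; i++) out[i] = (float)(inp[i] * norm);
}
```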

--embd-output-format $'string'$

| $'string'$ | description |
|------------|-------------|
| ''         | same as before (default) |
| 'array'    | single embeddings $[[x_1,...,x_n]]$; multiple embeddings $[_0[x_1,...,x_n],_1[x_1,...,x_n],...,_{n-1}[x_1,...,x_n]]$ |
| 'json'     | openai style |
| 'json+'    | add cosine similarity matrix |
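To make the shapes concrete: for a single 3-dimensional embedding (values invented for illustration), 'array' prints `[[0.0265279,-0.0119643,0.0411995]]`, while 'json' wraps it in an OpenAI-style envelope:

```json
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.0265279,-0.0119643,0.0411995]
    }
  ]
}
```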

--embd-separator $"string"$

$"string"$
"\n" (default)
"<#embSep#>" for exemple
"<#sep#>" other exemple

Examples

Unix-based systems (Linux, macOS, etc.):

```sh
./embedding -p 'Castle<#sep#>Stronghold<#sep#>Dog<#sep#>Cat' --embd-separator '<#sep#>' --embd-normalize 2 --embd-output-format '' -m './path/to/model.gguf' --n-gpu-layers 99 --log-disable 2>/dev/null
```

Windows:

```sh
embedding.exe -p 'Castle<#sep#>Stronghold<#sep#>Dog<#sep#>Cat' --embd-separator '<#sep#>' --embd-normalize 2 --embd-output-format '' -m './path/to/model.gguf' --n-gpu-layers 99 --log-disable 2>/dev/null
```
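With several prompts and the default output format, both commands print each normalized embedding followed by a cosine similarity matrix across the prompts; that matrix-printing path is the `n_prompts > 1` code discussed in the review comments below.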

YannFollet and others added 2 commits on May 22, 2024:

* add parameters for embeddings (--embd-normalize, --embd-output-format, --embd-separator) with a description in the README.md

@mofosyne added the labels "Review Complexity : Medium" (generally requires more time to grok but manageable by beginner to medium expertise level) and "embeddings" (embedding related topics) on May 22, 2024.
@mofosyne added the label "merge ready" (indicates that this may be ready to merge soon and is just holding out in case of objections) on May 28, 2024.
@mofosyne (Collaborator)

@YannFollet is this PR all good? I had a quick look over it, and had to do some merging to keep it in sync.

Marking as merging soon; I will revisit later to see if anyone still wants to comment on it.

@mofosyne (Collaborator) commented on Jun 9, 2024

@YannFollet this PR has gone out of sync. Can you synchronize it with the main branch? It appears there are no other objections and GG is happy with it, but it's out of sync, so it cannot easily be merged.

@YannFollet (Contributor, PR author) commented on Jun 18, 2024

@mofosyne Sorry, I hadn't seen your message. I have now done the merge with master.


Review comment on the similarity-matrix check:

```cpp
// print cosine similarity matrix
if (n_prompts > 1) {
    if (params.embd_out=="") {
```

Owner suggested change:

```diff
-    if (params.embd_out=="") {
+    if (params.embd_out.empty()) {
```
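(`empty()` states the intent directly and is the idiomatic way to test a `std::string` for emptiness.)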

common/common.h (outdated), comment on lines 156 to 158:

```cpp
int32_t embd_normalize = 2; // normalisation for embeddings (-1=none, 0=max absolute, 1=taxicab, 2=euclidean, >2=p-norm)
std::string embd_out = ""; // empty = default, "array" = [] or [[],[]...], "json" = openai style, "json+" = same "json" + cosine similarity matrix
std::string embd_sep = "\n"; // separator of embeddings
```

Owner: This group of parameters should be moved below into their own section, similar to what we do for other examples:

```cpp
// embedding
int32_t embd_normalize = 2;     // normalisation for embeddings (-1=none, 0=max absolute, 1=taxicab, 2=euclidean, >2=p-norm)

std::string embd_out   = "";    // empty = default, "array" = [] or [[],[]...], "json" = openai style, "json+" = same "json" + cosine similarity matrix
std::string embd_sep   = "\n";  // separator of embeddings
```
Comment on the argument-parsing code:

```cpp
CHECK_ARG
params.embd_sep = argv[i];
return true;
}
```

Owner: Expand gpt_params_print_usage for these parameters.
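A sketch of what such usage lines could look like (hypothetical wording and formatting; the merged PR defines the real text):

```cpp
#include <cstdio>

// Hypothetical usage text for the three new flags (illustration only):
static void print_embd_usage_sketch() {
    printf("  --embd-normalize N     normalization for embeddings (default: 2; -1=none, 0=max absolute int16, 1=taxicab, 2=euclidean, >2=p-norm)\n");
    printf("  --embd-output-format F '' = default, 'array' = [[],[]...], 'json' = openai style, 'json+' = 'json' + cosine similarity matrix\n");
    printf("  --embd-separator S     separator of embeddings (default: \\n)\n");
}
```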

common/common.h (outdated), @@ -377,7 +380,7 @@:

```diff
 // Embedding utils
 //

-void llama_embd_normalize(const float * inp, float * out, int n);
+void llama_embd_normalize(const float * inp, float * out, int n, int embd_norm = 2);
```

Owner suggested a whitespace/alignment tweak to the new declaration:

```suggestion
void llama_embd_normalize(const float * inp, float * out, int n, int embd_norm = 2);
```
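Note that the `embd_norm = 2` default keeps existing call sites source-compatible while selecting euclidean normalization. A hypothetical call site for illustration:

```cpp
#include <vector>

// emb_raw holds n_embd unnormalized floats (hypothetical buffers):
std::vector<float> emb_raw(n_embd), emb_norm(n_embd);

llama_embd_normalize(emb_raw.data(), emb_norm.data(), n_embd);    // euclidean (default)
llama_embd_normalize(emb_raw.data(), emb_norm.data(), n_embd, 1); // taxicab
```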

Comment on lines 193 to 196:

```cpp
if (params.embd_normalize==0)
    fprintf(stdout, "%6.0f ", emb[j * n_embd + i]);
else
    fprintf(stdout, "%9.6f ", emb[j * n_embd + i]);
```

Owner suggested change:

```diff
-if (params.embd_normalize==0)
-    fprintf(stdout, "%6.0f ", emb[j * n_embd + i]);
-else
-    fprintf(stdout, "%9.6f ", emb[j * n_embd + i]);
+if (params.embd_normalize == 0) {
+    fprintf(stdout, "%6.0f ", emb[j * n_embd + i]);
+} else {
+    fprintf(stdout, "%9.6f ", emb[j * n_embd + i]);
+}
```

Another comment, on the printing of the similarity matrix rows:

```cpp
        float sim = llama_embd_similarity_cos(emb + i * n_embd, emb + j * n_embd, n_embd);
        fprintf(stdout, "%6.2f ", sim);
    }
    fprintf(stdout, "%1.10s",prompts[i].c_str());
```

Owner suggested change:

```diff
-    fprintf(stdout, "%1.10s",prompts[i].c_str());
+    fprintf(stdout, "%1.10s", prompts[i].c_str());
```

Comment on lines 220 to 255:

```cpp
if (params.embd_out=="json" || params.embd_out=="json+" || params.embd_out=="array") {
    const bool notArray = params.embd_out!="array";

    fprintf(stdout, notArray?"{\n \"object\": \"list\",\n \"data\": [\n":"[");
    for (int j = 0;;) { // at least one iteration (one prompt)
        if (notArray) fprintf(stdout, " {\n \"object\": \"embedding\",\n \"index\": %d,\n \"embedding\": ",j);
        fprintf(stdout, "[");
        for (int i = 0;;) { // at least one iteration (n_embd > 0)
            fprintf(stdout, params.embd_normalize==0?"%1.0f":"%1.7f", emb[j * n_embd + i]);
            i++;
            if (i < n_embd) fprintf(stdout, ","); else break;
        }
        fprintf(stdout, notArray?"]\n }":"]");
        j++;
        if (j < n_prompts) fprintf(stdout, notArray?",\n":","); else break;
    }
    fprintf(stdout, notArray?"\n ]":"]\n");

    if (params.embd_out=="json+" && n_prompts > 1) {
        fprintf(stdout, ",\n \"cosineSimilarity\": [\n");
        for (int i = 0;;) { // at least two iterations (n_prompts > 1)
            fprintf(stdout, " [");
            for (int j = 0;;) { // at least two iterations (n_prompts > 1)
                float sim = llama_embd_similarity_cos(emb + i * n_embd, emb + j * n_embd, n_embd);
                fprintf(stdout, "%6.2f", sim);
                j++;
                if (j < n_prompts) fprintf(stdout, ", "); else break;
            }
            fprintf(stdout, " ]");
            i++;
            if (i < n_prompts) fprintf(stdout, ",\n"); else break;
        }
        fprintf(stdout, "\n ]");
    }

    if (notArray) fprintf(stdout, "\n}\n");
```

Owner suggested change (spaces around operators; logic unchanged):

```cpp
if (params.embd_out == "json" || params.embd_out == "json+" || params.embd_out == "array") {
    const bool notArray = params.embd_out != "array";
    fprintf(stdout, notArray ? "{\n \"object\": \"list\",\n \"data\": [\n" : "[");
    for (int j = 0;;) { // at least one iteration (one prompt)
        if (notArray) fprintf(stdout, " {\n \"object\": \"embedding\",\n \"index\": %d,\n \"embedding\": ", j);
        fprintf(stdout, "[");
        for (int i = 0;;) { // at least one iteration (n_embd > 0)
            fprintf(stdout, params.embd_normalize == 0 ? "%1.0f" : "%1.7f", emb[j * n_embd + i]);
            i++;
            if (i < n_embd) fprintf(stdout, ","); else break;
        }
        fprintf(stdout, notArray ? "]\n }" : "]");
        j++;
        if (j < n_prompts) fprintf(stdout, notArray ? ",\n" : ","); else break;
    }
    fprintf(stdout, notArray ? "\n ]" : "]\n");
    if (params.embd_out == "json+" && n_prompts > 1) {
        fprintf(stdout, ",\n \"cosineSimilarity\": [\n");
        for (int i = 0;;) { // at least two iterations (n_prompts > 1)
            fprintf(stdout, " [");
            for (int j = 0;;) { // at least two iterations (n_prompts > 1)
                float sim = llama_embd_similarity_cos(emb + i * n_embd, emb + j * n_embd, n_embd);
                fprintf(stdout, "%6.2f", sim);
                j++;
                if (j < n_prompts) fprintf(stdout, ", "); else break;
            }
            fprintf(stdout, " ]");
            i++;
            if (i < n_prompts) fprintf(stdout, ",\n"); else break;
        }
        fprintf(stdout, "\n ]");
    }
    if (notArray) fprintf(stdout, "\n}\n");
```
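For two prompts, the 'json+' branch above emits output shaped like this (embedding values shortened and invented for illustration):

```json
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.0265279,-0.0119643,0.0411995]
    },
    {
      "object": "embedding",
      "index": 1,
      "embedding": [0.0198112,0.0074201,-0.0355668]
    }
  ],
  "cosineSimilarity": [
    [  1.00,  0.57],
    [  0.57,  1.00]
  ]
}
```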

Later commits:

* group of parameters // embedding
* print usage for embedding parameters
@ggerganov merged commit 646ef4a into ggerganov:master on Jun 24, 2024
64 checks passed
Nexesenex added a commit to Nexesenex/croco.cpp that referenced this pull request Jun 25, 2024
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Jun 30, 2024
* add parameters for embeddings
--embd-normalize
--embd-output-format
--embd-separator
description in the README.md

* Update README.md

fix typo

* Trailing whitespace

* fix json generation, use " not '

* fix merge master

* fix code formatting
group of parameters // embedding
print usage for embedding parameters

---------

Co-authored-by: Brian <[email protected]>
MagnusS0 pushed a commit to MagnusS0/llama.cpp-normistral-tokenizer that referenced this pull request Jul 1, 2024
@YannFollet deleted the embedding-parameters branch on December 10, 2024.