
Api embeding as service #127

Merged
merged 24 commits into from
May 12, 2024

Conversation

Contributor

@Kleczyk Kleczyk commented Apr 4, 2024

embeding-api is working, more info in README.md. Closes #122

@github-actions github-actions bot requested a review from Sygnator April 4, 2024 01:37
@pgronkievitz pgronkievitz changed the title Api embeding as servis Api embeding as service Apr 4, 2024
Member

@pgronkievitz pgronkievitz left a comment


real quick, didn't take a deeper look at it

.gitignore Outdated
@@ -5,5 +5,6 @@ models/**
db/
static/
.idea/
embeding_models/e5-large-v2/
Member


issue: this shouldn't be like that, it's a submodule

Contributor Author


I removed this and added a new .gitignore

Member


issue: tons of typos, please use some spellchecker, preferably LanguageTool

Contributor Author


it was late xd done

Comment on lines 21 to 42
#### IMPORTANT

keep file tree like this !!!

```sh
embeding_models
├── e5-large-v2
│   ├── 1_Pooling
│   │   └── config.json
│   ├── config.json
│   ├── handler.py
│   ├── model.safetensors
│   ├── modules.json
│   ├── pytorch_model.bin
│   ├── README.md
│   ├── sentence_bert_config.json
│   ├── special_tokens_map.json
│   ├── tokenizer_config.json
│   ├── tokenizer.json
│   └── vocab.txt
└── README.md
```
Member


suggestion: just don't mess with submodule, as it's another repo you don't have permissions for :v

Contributor Author

@Kleczyk Kleczyk Apr 4, 2024


I will fix it to download only the models next; for now, as with llm, I have to download the repo for the container

Comment on lines 49 to 55
48 deploy:
49 resources:
50 reservations:
51 devices:
52 - driver: nvidia
53 count: 1
54 capabilities: [ gpu ]
Member


issue: create separate profiles for gpu and cpu or separate compose for the gpu (you can use 2 compose files using -f flag and 2nd option will just override the first one)
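The two-compose-file approach suggested here could look roughly like this (the file names below are assumed for illustration, not taken from the repo): the base file defines the CPU-only service, and a second file adds only the GPU device reservation, which `-f` layering merges on top.

```shell
# CPU-only run: just the base compose file.
docker compose -f docker-compose.yml up

# GPU run: the second -f file overrides/extends matching services
# from the first one (later files win on conflicting keys).
docker compose -f docker-compose.yml -f docker-compose.gpu.yml up
```

The alternative mentioned in the comment, Compose profiles, would instead keep both variants in one file and select them with `docker compose --profile gpu up`.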

Contributor Author


in the future

Comment on lines 8 to 10
python = "^3.10"
transformers = "^4.39.3"
torch = "^2.2.2"
Member


nitpick: use ~major.minor instead of ^major.minor.patch, it'll be "safer" as some tools can introduce breaking changes with minors (but they shouldn't)
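A sketch of what the suggested tilde constraints would look like in the quoted pyproject fragment (Poetry semantics: `~4.39` means `>=4.39.0,<4.40.0`, so only patch releases are pulled in automatically):

```toml
[tool.poetry.dependencies]
python = "~3.10"
transformers = "~4.39"
torch = "~2.2"
```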

Contributor Author


I removed this

.gitignore Outdated
Member


question: ???

Member


.

README.md Outdated Show resolved Hide resolved
embedding/.gitignore Outdated Show resolved Hide resolved
@@ -0,0 +1,61 @@
# embedding-api

## How run servis
Member


issue: SPELL. CHECK.

Contributor Author


done

Comment on lines 49 to 55
48 deploy:
49 resources:
50 reservations:
51 devices:
52 - driver: nvidia
53 count: 1
54 capabilities: [ gpu ]
Member


issue: i think i already told you about line numbers in here?

Contributor Author


done

Comment on lines 12 to 15
```sh
cd embedding_models
git clone [email protected]:intfloat/e5-large-v2
```
Member


issue: DO NOT do it like this. Either use submodules or subtrees; cloning on your own into a folder inside the repo is just dumb.

Contributor Author

@Kleczyk Kleczyk Apr 17, 2024


Ok. I'll do it as submodules
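For reference, a correct submodule workflow would look roughly like this (the target path is assumed from the thread; `git submodule add` is run from the repository root with an explicit path, rather than `cd`-ing into the directory and cloning):

```shell
# From the repository root: register the model repo as a submodule
# at a chosen path, then commit the resulting .gitmodules entry.
git submodule add [email protected]:intfloat/e5-large-v2 embedding/models/e5-large-v2
git commit -m "Add e5-large-v2 as a submodule"

# Anyone cloning the main repo later fetches submodule contents with:
git submodule update --init --recursive
```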

embedding/README.md Outdated Show resolved Hide resolved
.gitignore Outdated
Member


.

README.md Outdated
Comment on lines 77 to 79
```sh
cd models
git clone [email protected]:intfloat/e5-large-v2
cd embedding/models
git submodule add [email protected]:intfloat/e5-large-v2
```
Member


issue: nope, that's wrong, read more about submodules and how to use 'em

Member


issue: nope

@Kleczyk
Contributor Author

Kleczyk commented Apr 28, 2024

To explain why I removed most of the files: in llama.cpp, every LLM already ships with a "built-in embedding model". I knew this, but I thought that if we used a newer one the LLM would work better, and this was also said at the meetings. Nothing could be further from the truth: a model that was trained against a specific embedding model will work best with the one it was trained with. Using the llama.cpp server, I have direct access to the embedding model itself as a separate endpoint. For this reason, I discarded the idea of creating a separate service for embedding.
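To illustrate the point about a separate endpoint, this is roughly how the llama.cpp server exposes embeddings alongside completions (flag and endpoint names may differ between llama.cpp versions, and the model path is a placeholder):

```shell
# Start the llama.cpp server with embeddings enabled.
./server -m models/model.gguf --embedding --port 8080

# Request an embedding from the same process that serves completions.
curl -s http://localhost:8080/embedding \
  -H "Content-Type: application/json" \
  -d '{"content": "text to embed"}'
```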

@Kleczyk Kleczyk requested a review from pgronkievitz April 28, 2024 11:49
@@ -28,11 +28,27 @@ services:
depends_on:
- db

llm:
profiles: [ "dev", "prod" ]
llm-embedding:
Collaborator


Add -cpu to this line.

@Kleczyk Kleczyk dismissed pgronkievitz’s stale review May 12, 2024 16:48

because I deleted the files that drew objections and took a different approach

@Kleczyk Kleczyk merged commit 9f7c2bc into main May 12, 2024
3 checks passed
@Kleczyk Kleczyk deleted the api-embeding branch May 12, 2024 16:48

Successfully merging this pull request may close these issues: Embeding.