-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Api embeding as service #127
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
real quick, didn't took deeper look at it
.gitignore
Outdated
@@ -5,5 +5,6 @@ models/** | |||
db/ | |||
static/ | |||
.idea/ | |||
embeding_models/e5-large-v2/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue: this shouldn't be like that, it's a submodule
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remove this and add new .gitignore
embedingModels/README.md
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue: tons of typos, please use some spellchecker and preferably Language Tool
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it was late xd done
embedingModels/README.md
Outdated
#### IMPORTANT | ||
|
||
keep file tree like this !!! | ||
|
||
```sh | ||
embeding_models | ||
├── e5-large-v2 | ||
│ ├── 1_Pooling | ||
│ │ └── config.json | ||
│ ├── config.json | ||
│ ├── handler.py | ||
│ ├── model.safetensors | ||
│ ├── modules.json | ||
│ ├── pytorch_model.bin | ||
│ ├── README.md | ||
│ ├── sentence_bert_config.json | ||
│ ├── special_tokens_map.json | ||
│ ├── tokenizer_config.json | ||
│ ├── tokenizer.json | ||
│ └── vocab.txt | ||
└── README.md | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: just don't mess with submodule, as it's another repo you don't have permissions for :v
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will fix to download only models in nextgod for now as in llm I have to download the repo for the container
embedingModels/README.md
Outdated
48 deploy: | ||
49 resources: | ||
50 reservations: | ||
51 devices: | ||
52 - driver: nvidia | ||
53 count: 1 | ||
54 capabilities: [ gpu ] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue: create separate profiles for gpu and cpu or separate compose for the gpu (you can use 2 compose files using -f
flag and 2nd option will just override the first one)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the future
embedingModels/pyproject.toml
Outdated
python = "^3.10" | ||
transformers = "^4.39.3" | ||
torch = "^2.2.2" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: use ~major.minor
instead of ^major.minor.patch
, it'll be "safer" as some tools can introduce breaking changes with minors (but they shouldn't)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remove this
.gitignore
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: ???
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.
embedding/README.md
Outdated
@@ -0,0 +1,61 @@ | |||
# embedding-api | |||
|
|||
## How run servis |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue: SPELL. CHECK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
embedding/README.md
Outdated
48 deploy: | ||
49 resources: | ||
50 reservations: | ||
51 devices: | ||
52 - driver: nvidia | ||
53 count: 1 | ||
54 capabilities: [ gpu ] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue: i think i already told you about line numbers in here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
embedding/README.md
Outdated
```sh | ||
cd embedding_models | ||
git clone [email protected]:intfloat/e5-large-v2 | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue: DO NOT do it like this. either use submodules or subtrees. cloning on your own into folder inside repo is just dumb.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. I'll do it as submodules
Co-authored-by: Mateusz <[email protected]>
Co-authored-by: Patryk Gronkiewicz <[email protected]>
Co-authored-by: Patryk Gronkiewicz <[email protected]>
.gitignore
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.
README.md
Outdated
```sh | ||
cd models | ||
git clone [email protected]:intfloat/e5-large-v2 | ||
cd embedding/models | ||
git submodule add [email protected]:intfloat/e5-large-v2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue: nope, that's wrong, read more about submodules and how to use 'em
embedding/.gitignore
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue: nope
To explain why I removed most of the files. In lama.cpp all llm models have a "built-in embedding model" I knew this but I thought that if we use a newer one then llm will work better and this was also said at the meetings but nothing could be further from the truth since a particular model benefited from a specific embedding model during training will work better with the one it was trained with. Using the lama.cpp server I have access directly to the embedding model itself as a separate EP. For this reason, I discarded the idea of creating a separate service for embedding. |
docker-compose.yml
Outdated
@@ -28,11 +28,27 @@ services: | |||
depends_on: | |||
- db | |||
|
|||
llm: | |||
profiles: [ "dev", "prod" ] | |||
llm-embedding: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add -cpu to this line.
because I deleted the files to which there were objections and went another way
embeding-api is working more info in README.md Closes #122