Bump the versions, new models support (#463)
* up the versions
* fixing starcoder2 flash sa
* integrate groq / cerebras to the self-hosting (#466)
* qwen2.5 models
* upd README.md
* a warning
* get rid of the autogptq models
* version 1.8.0
* version 1.8.0
* deprecated versions in the readme
* add completion support for the passthrough models
* add multiline_code_completion_default_model
* _select_default_lora_if_exists for multiline_code_completion_default_model _add_results_for_passthrough_provider fix
* rm deepseek-coder-v2/16b/instruct MAX_JOBS=8
* gpt-4 is unavailable
1 parent 31ed965, commit 1b094ba
Showing 19 changed files with 414 additions and 88 deletions.
@@ -1,4 +1,4 @@
FROM nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04
FROM nvidia/cuda:12.4.1-cudnn-devel-ubuntu22.04

ENV INSTALL_OPTIONAL=TRUE
ENV MAX_JOBS=8

@@ -13,24 +13,28 @@ RUN DEBIAN_FRONTEND="noninteractive" TZ=Etc/UTC apt-get install -y \
ruby-full \
ruby-bundler \
build-essential \
cmake \
pkg-config \
libicu-dev \
zlib1g-dev \
libcurl4-openssl-dev \
libssl-dev \
&& rm -rf /var/lib/{apt,dpkg,cache,log}
RUN DEBIAN_FRONTEND="noninteractive" TZ=Etc/UTC apt remove cmake -y
RUN pip install cmake --upgrade

RUN git clone https://github.com/smallcloudai/linguist.git /tmp/linguist \
&& cd /tmp/linguist \
&& bundle install \
&& rake build_gem
ENV PATH="${PATH}:/tmp/linguist/bin"

RUN pip install --no-cache-dir torch==2.3.0 --index-url https://download.pytorch.org/whl/cu118
RUN pip install --no-cache-dir xformers==0.0.26.post1 --index-url https://download.pytorch.org/whl/cu118
RUN pip install --no-cache-dir torch==2.5.0
RUN pip install --no-cache-dir xformers==v0.0.28.post2
RUN pip install ninja
RUN VLLM_INSTALL_PUNICA_KERNELS=1 pip install -v --no-build-isolation git+https://github.com/smallcloudai/vllm@refact_v0.4.2_06052024
RUN pip install setuptools_scm
ENV CMAKE_ARGS="-DLLAMA_CUBLAS=on -DCMAKE_CUDA_ARCHITECTURES=60;61;70;75;80;86;89;90+PTX"
RUN pip install -v --no-build-isolation git+https://github.com/smallcloudai/vllm@refact_v0.6.3_2adb440

# there is no prebuild auto-gptq with torch 2.3.0 support
# there is no prebuild auto-gptq with torch 2.5.0 support
ENV TORCH_CUDA_ARCH_LIST="6.0;6.1;7.0;7.5;8.0;8.6;8.9;9.0+PTX"
RUN BUILD_CUDA_EXT=1 pip install -v --no-build-isolation git+https://github.com/PanQiWei/[email protected]
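
The Dockerfile above lists the target GPU architectures twice, once in `CMAKE_ARGS` (as `60;61;...`) and once in `TORCH_CUDA_ARCH_LIST` (as `6.0;6.1;...`). A minimal sketch of a consistency check between the two values follows. The helper names `torch_arch_to_cmake` and `cmake_archs` are hypothetical and not part of this repository, and the conversion is purely textual: it only compares the strings as written in the diff and makes no claim about which values CMake or PyTorch accept.

```python
# Hypothetical helpers (not from the repository): compare the two
# architecture lists that appear in the Dockerfile diff above.

def torch_arch_to_cmake(arch_list: str) -> str:
    """Rewrite "6.0;9.0+PTX" style entries as "60;90+PTX" (textual only)."""
    converted = []
    for entry in arch_list.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        # Split off an optional "+PTX" suffix so it can be re-attached.
        base, plus, suffix = entry.partition("+")
        major, _, minor = base.partition(".")
        converted.append(f"{major}{minor}{plus}{suffix}")
    return ";".join(converted)


def cmake_archs(cmake_args: str) -> str:
    """Extract the value of -DCMAKE_CUDA_ARCHITECTURES from a CMAKE_ARGS string."""
    for flag in cmake_args.split():
        name, _, value = flag.partition("=")
        if name == "-DCMAKE_CUDA_ARCHITECTURES":
            return value
    raise ValueError("CMAKE_CUDA_ARCHITECTURES not found in CMAKE_ARGS")


# Values copied verbatim from the Dockerfile in this commit.
TORCH_CUDA_ARCH_LIST = "6.0;6.1;7.0;7.5;8.0;8.6;8.9;9.0+PTX"
CMAKE_ARGS = "-DLLAMA_CUBLAS=on -DCMAKE_CUDA_ARCHITECTURES=60;61;70;75;80;86;89;90+PTX"

if torch_arch_to_cmake(TORCH_CUDA_ARCH_LIST) != cmake_archs(CMAKE_ARGS):
    raise SystemExit("architecture lists in the Dockerfile disagree")
```

A check like this could run in CI before the image build, so that a later edit to one list without the other is caught early.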