semiring/IRL-llama.cpp

in situ recurrent layering (and some ablation studies) on llama.cpp. Ugly experimental hacks. Nothing stable here.
This is an appalling hack of llama.cpp to see if we can create in situ Frankenmerges at the computation graph building level.

Layer chunks are specified in the environment variable $LLAMA_CHUNKS as a comma-separated string of floats of the form "first_chunk_begin, first_chunk_end, second_chunk_begin, second_chunk_end, ...".

e.g.,

export LLAMA_CHUNKS="0.0,0.6,0.2,0.8,0.6,1.0"

creates a Frankenmodel consisting of the first 60% of the model's layers, followed by the block of layers running from 20% to 80% of the way through the model, and finally the block from 60% to 100%.

Note that layer positions are always addressed as fractions of the model's total layer count (0.0 - 1.0), not as absolute layer indices.
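For concreteness, here is a minimal sketch of how a chunk string like the one above could be expanded into an explicit layer sequence before the computation graph is built. The function names (parse_chunks, build_layer_order), the n_layer parameter, and the truncation-based rounding are illustrative assumptions, not the repo's actual implementation.

// Sketch only: expand LLAMA_CHUNKS fractions into a concrete layer order.
// Names and rounding rule are assumptions for illustration.
#include <cstdlib>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Parse "a,b,c,d,..." into (begin, end) pairs of fractional boundaries.
static std::vector<std::pair<float, float>> parse_chunks(const char * spec) {
    std::vector<float> vals;
    std::stringstream ss(spec);
    std::string tok;
    while (std::getline(ss, tok, ',')) {
        vals.push_back(std::stof(tok));
    }
    std::vector<std::pair<float, float>> chunks;
    for (size_t i = 0; i + 1 < vals.size(); i += 2) {
        chunks.push_back({vals[i], vals[i + 1]});
    }
    return chunks;
}

// Expand fractional chunks into an explicit list of layer indices for a
// model with n_layer layers (truncation toward zero assumed here).
static std::vector<int> build_layer_order(int n_layer, const char * spec) {
    std::vector<int> order;
    for (const auto & c : parse_chunks(spec)) {
        const int il_begin = (int)(c.first  * n_layer);
        const int il_end   = (int)(c.second * n_layer);
        for (int il = il_begin; il < il_end; ++il) {
            order.push_back(il);
        }
    }
    return order;
}

// Usage: for a 32-layer model,
//   const char * spec = getenv("LLAMA_CHUNKS");   // "0.0,0.6,0.2,0.8,0.6,1.0"
//   auto order = build_layer_order(32, spec);
// would visit layers 0..18, then 6..24, then 19..31 under this rounding;
// the exact boundaries depend on the rounding rule the hack actually uses.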
