From a3cece5fbd15087f533d1f922843d5b4ff321af7 Mon Sep 17 00:00:00 2001
From: Wing Lian
Date: Thu, 28 Mar 2024 19:30:10 -0400
Subject: [PATCH] add readme for jamba

---
 examples/jamba/README.md | 5 +++++
 1 file changed, 5 insertions(+)
 create mode 100644 examples/jamba/README.md

diff --git a/examples/jamba/README.md b/examples/jamba/README.md
new file mode 100644
index 0000000000..aa98c02450
--- /dev/null
+++ b/examples/jamba/README.md
@@ -0,0 +1,5 @@
+# Jamba
+
+qlora w/ deepspeed needs at least 2x GPUs and 35GiB VRAM per GPU
+
+qlora single-gpu - training will start, but loss is off by an order of magnitude