Falcon-40B and Falcon-7B (Instruct) with streaming, top-k sampling, and beam search in 4-bit
Adapted from Serin32's starter code on Hugging Face: https://huggingface.co/tiiuae/falcon-40b/discussions/38