turboderp / exllama Public

Fork 221
Star 2.8k

Issues: turboderp/exllama

Labels 9 Milestones 0

59 Open 161 Closed

Issues list

Run on CPU without AVX2

#315 opened Apr 14, 2024 by ZanMax updated Apr 18, 2024

3

piece id is out of range

#314 opened Apr 9, 2024 by chethanwiz updated Apr 9, 2024

3

Multi-GPU issues

#281 opened Sep 9, 2023 by nktice updated Jan 23, 2024

9

When will the bfloat16 type of GPTQ algorithm be supported?

#310 opened Dec 20, 2023 by Kelang-Tian updated Dec 20, 2023

Illegal memory access when using a lora

#180 opened Jul 21, 2023 by ipb26 updated Dec 12, 2023

Bad output for 2080 ti

#254 opened Aug 19, 2023 by filipemesquita updated Dec 5, 2023

1

Possible to load model with low system ram?

#245 opened Aug 12, 2023 by gros87 updated Nov 28, 2023

4

Does it support safetytensor formate?>

#309 opened Nov 28, 2023 by lucasjinreal updated Nov 28, 2023

Error when using Beam Search

#308 opened Nov 17, 2023 by bibekyess updated Nov 17, 2023

Occasionally RuntimeError

#307 opened Nov 16, 2023 by leegohi04517 updated Nov 16, 2023

Using Exllama backend requires all the modules to be on GPU - how?

#306 opened Nov 6, 2023 by tigerinus updated Nov 6, 2023

1

CodeLLaMA + LoRA: RuntimeError: CUDA error: an illegal memory access was encountered

#290 opened Sep 15, 2023 by juanps90 updated Oct 12, 2023

3

Support for Baichuan2 models

#280 opened Sep 9, 2023 by bernardx updated Oct 12, 2023

1

OSError: CUDA_HOME environment variable is not set.

#291 opened Sep 17, 2023 by jamesbraza updated Sep 29, 2023

8

Changing hyper-parameters after initilization without reloading weights from disk.

#299 opened Sep 28, 2023 by kmccleary3301 updated Sep 28, 2023

followed instructions with error

#288 opened Sep 15, 2023 by hiqsociety updated Sep 25, 2023

2

Tried to build setup exllama but encountering ninja related errors, can someone please help me?

#258 opened Aug 22, 2023 by BwandoWando updated Sep 25, 2023

3

doesn't use CUDA_HOME?

#293 opened Sep 20, 2023 by j2l updated Sep 20, 2023

GPU Inference from IPython

#289 opened Sep 15, 2023 by Rajmehta123 updated Sep 15, 2023

Weird issue with context length

#220 opened Aug 3, 2023 by zzzacwork updated Sep 15, 2023

6

Lora support

#55 opened Jun 15, 2023 by alain40 updated Sep 14, 2023

GPU Usage Keeps High Even Without Inference Load

#253 opened Aug 19, 2023 by leonxia1018 updated Sep 13, 2023

7

Speed on A100

#266 opened Aug 30, 2023 by Ber666 updated Sep 11, 2023

4

Progress on the rewrite for older cards (Like the P40)

#279 opened Sep 8, 2023 by TimyIsCool updated Sep 10, 2023

1

RoPE Frequency Base and Frequency Scale Support

#262 opened Aug 28, 2023 by ChrisCates updated Sep 9, 2023

3
