
Refactor: Replace ollama with nexa-sdk #20

Open
iwr-redmond opened this issue Jan 25, 2025 · 3 comments
iwr-redmond commented Jan 25, 2025

Nodetool currently relies on Ollama, an unmanaged dependency that requires manual user installation.

It would be helpful to consider replacing this dependency with nexa-sdk, which can be installed and managed from within Python.

Refactor

Like Torch, nexa-sdk uses custom package repositories for different platforms.

Nodetool already has the ability to add extra-index-urls for Torch, which means that reusing this functionality to support nexa-sdk should be easy. In this scenario, the CUDA version for Torch would need to increase to 12.4, matching the installable version of ComfyUI.

Functionality

Looking at src/nodetool/providers/ollama, I observe that nexa-sdk:

  • Implements an OpenAI-compatible server for inference (docs)
  • Can be activated within the Python environment (example)
  • Can store GGUF models in a vendor-specified directory controlled via the NEXA_CACHE_ROOT environment variable
  • Has pull and list functions that are similar to ollama-py
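Taken together, those points suggest an integration along these lines: point `NEXA_CACHE_ROOT` at a directory Nodetool manages, then talk to the local OpenAI-compatible endpoint. This is a minimal stdlib-only sketch; the port (8000), endpoint path, and model name are assumptions for illustration, not confirmed nexa-sdk defaults.

```python
# Sketch: direct nexa-sdk's model cache into a Nodetool-managed directory and
# build a request for its OpenAI-compatible server. Host, port, and model name
# are ASSUMPTIONS; check the nexa-sdk docs for the real values.
import json
import os
import urllib.request

# Store GGUF models under Nodetool's own data directory.
os.environ["NEXA_CACHE_ROOT"] = os.path.expanduser("~/.nodetool/nexa-models")

def chat_request(prompt: str, model: str = "llama3.2") -> urllib.request.Request:
    """Build (but do not send) a chat-completion request for the local server."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        "http://127.0.0.1:8000/v1/chat/completions",  # assumed host/port
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = chat_request("Hello")
# urllib.request.urlopen(req)  # would dispatch to a running nexa server
```

Because the server speaks the OpenAI wire format, Nodetool's existing OpenAI-style provider code could likely be reused with only the base URL changed.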

EDIT: removed erroneous commentary regarding Torch version.

georgi (Collaborator) commented Jan 26, 2025

Hi there, that's an interesting proposal, thanks!

Nexa SDK would be a great alternative or addition to Ollama. It would also support a wider range of models, giving users options that better fit their hardware.

Do you know of any benchmarks? In particular, how well does it run on ONNX or MPS?

We'll play around with Nexa and see how it fits.

iwr-redmond (Author) commented

I don't think the ONNX support is well developed. However, there is an evaluation component for GGUF that you can use to measure performance.

Regarding additional models, T2I support is lagging behind at the moment (issues#358) and TTS support is currently disabled (pulls#359). While both of these are temporary issues that will be addressed in time, for now the most useful parts of Nexa SDK are the Python integration and the well-developed TGI backend.

Davidqian123 commented
Thanks for your suggestion. T2I and TTS support are both on the way; stay tuned!
