Skip to content

Conversation

mengshengwu
Copy link

@mengshengwu mengshengwu commented Oct 14, 2025

Introduction

Hello! Friends from Huggingface 👋

Thank you for maintaining this amazing library and ecosystem.
Following a recent discussion with Hugging Face (see here), your CEO expressed interest in exploring collaboration to bring on-device model inference to the community. This PR proposes adding Nexa SDK to the local apps section as a new runtime integration option.

About Nexa SDK

Nexa SDK is an on-device inference framework that runs any model on any device, across any backend. It runs on CPUs, GPUs, NPUs with backend support for CUDA, Metal, Vulkan, and Qualcomm / Intel / AMD NPU. It handles multiple input modalities including text 📝, image 🖼️, and audio 🎧. The SDK includes an OpenAI-compatible API server with support for JSON schema-based function calling and streaming. It supports model formats such as GGUF, MLX, Nexa AI's own .nexa format, enabling efficient quantized inference across diverse platforms.


Example Demos

Multi-Image Reasoning Demo

🖼️ Multi-Image Reasoning
Spot the difference across two images in multi-round dialogue.

Image + Audio Function Call Demo

🎤 Image + Text → Function Call
Snap a poster, add a voice note, and the AI agent creates a calendar event locally.

Multi-Audio Comparison Demo

🎶 Multi-Audio Comparison
Identify differences between two music clips — fully offline.


Related Links


Logo File

Nexa-Logo-Black


Thank you for your time reviewing this PR!
We’re excited to explore how Nexa SDK can extend Hugging Face models to mobile and edge platforms. 🙏

@mengshengwu mengshengwu changed the title feat(tasks): update Nexa SDK integration with multi-platform support feat(local-apps): Add Nexa SDK integration Oct 14, 2025
@mengshengwu mengshengwu marked this pull request as ready for review October 14, 2025 07:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants