YAML Metadata Warning: empty or missing yaml metadata in repo card (https://huggingface.co/docs/hub/model-cards#model-card-metadata)

Qwen3-0.6B for Apple NPU

Quickstart

  1. Install NexaSDK

  2. Run the model with one line of code:

    nexa infer NexaAI/qwen3-0.6b-ane
    

Model Description

Qwen3-0.6B is a compact 600-million-parameter language model from the Qwen team at Alibaba Cloud.
Designed for ultra-efficient inference, it provides strong multilingual understanding and basic reasoning in a tiny footprint.

With low memory requirements and fast latency, Qwen3-0.6B is ideal for mobile, embedded, and resource-constrained environments.

Features

  • Ultra-lightweight: Runs well on CPUs, mobile devices, and edge hardware.
  • Multilingual: Supports a broad set of languages.
  • Fast inference: Low-latency generation suited for real-time applications.
  • Efficient reasoning: Performs core reasoning and analysis tasks at small scale.
  • Fine-tunable: Adaptable for domain-specific use cases.

Use Cases

  • Mobile and embedded assistants
  • Lightweight chat and document apps
  • On-device summarization and Q&A
  • IoT and robotics agents
  • CPU-only or small-GPU deployments

Inputs and Outputs

Input

  • Text prompts or conversation history (tokenized sequences for API or SDK workflows)

Output

  • Generated text (answers, summaries, short reasoning)
  • Optional logits/probabilities

License

This repo is licensed under the Creative Commons Attribution–NonCommercial 4.0 (CC BY-NC 4.0) license, which allows use, sharing, and modification only for non-commercial purposes with proper attribution. All NPU-related models, runtimes, and code in this project are protected under this non-commercial license and cannot be used in any commercial or revenue-generating applications. Commercial licensing or enterprise usage requires a separate agreement. For inquiries, please contact dev@nexa.ai

Downloads last month
53
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Collection including NexaAI/Qwen3-0.6B-ANE