---
license: mit
tags:
- bitnet
- ternary
- trillim
- cpu-inference
base_model: 1bitLLM/bitnet_b1_58-large
---

# BitNet-Large-TRNQ

Ternary-quantized version of [1bitLLM/bitnet_b1_58-large](https://huggingface.co/1bitLLM/bitnet_b1_58-large), packaged for the [Trillim DarkNet](https://huggingface.co/Trillim) inference engine.

This model runs entirely on CPU, with no GPU required.

## Model Details

| | |
|---|---|
| **Architecture** | BitNet (`BitnetForCausalLM`) |
| **Parameters** | ~700M |
| **Hidden size** | 1536 |
| **Layers** | 24 |
| **Attention heads** | 16 |
| **Context length** | 2048 tokens |
| **Quantization** | Ternary ({-1, 0, 1}) |
| **Source model** | [1bitLLM/bitnet_b1_58-large](https://huggingface.co/1bitLLM/bitnet_b1_58-large) |
| **License** | MIT |

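For intuition, BitNet b1.58-style models map each weight to {-1, 0, 1} using an absmean scale. A minimal sketch of that scheme (illustrative only; it does not reproduce the packed Trillim on-disk format):

```python
import numpy as np

def ternary_quantize(w: np.ndarray, eps: float = 1e-8):
    """Absmean ternary quantization: scale by mean |w|, round, clip to {-1, 0, 1}."""
    gamma = float(np.abs(w).mean()) + eps        # per-tensor scale
    q = np.clip(np.round(w / gamma), -1, 1)      # ternary values
    return q.astype(np.int8), gamma

def dequantize(q: np.ndarray, gamma: float) -> np.ndarray:
    """Recover an approximation of the original weights."""
    return q.astype(np.float32) * gamma

# Example: mean |w| = 0.61, so 0.9 -> 1, -0.04 -> 0, -1.2 -> -1, 0.3 -> 0
w = np.array([0.9, -0.04, -1.2, 0.3])
q, gamma = ternary_quantize(w)
# q.tolist() == [1, 0, -1, 0]
```

Each weight thus costs ~1.58 bits of information (log2 of 3 states), which is what enables fast integer-only matrix multiplication on CPU.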
## Usage

```bash
pip install trillim
trillim pull Trillim/BitNet-Large-TRNQ
trillim serve Trillim/BitNet-Large-TRNQ
```

This starts an OpenAI-compatible API server at `http://127.0.0.1:8000`.
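Once the server is running, any OpenAI-compatible client can talk to it. A sketch with `curl` (the `/v1/chat/completions` route and the model identifier follow the OpenAI API convention and are assumptions, not documented Trillim behavior):

```shell
# Assumes the server from `trillim serve` is listening on 127.0.0.1:8000
curl http://127.0.0.1:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Trillim/BitNet-Large-TRNQ",
    "messages": [{"role": "user", "content": "Explain ternary quantization in one sentence."}]
  }'
```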

For interactive CLI chat:

```bash
trillim chat Trillim/BitNet-Large-TRNQ
```

## What's in this repo

| File | Description |
|---|---|
| `qmodel.tensors` | Ternary-quantized weights in Trillim format |
| `rope.cache` | Precomputed RoPE embeddings |
| `config.json` | Model configuration |
| `tokenizer.json` | Tokenizer |
| `tokenizer_config.json` | Tokenizer configuration |
| `tokenizer.model` | SentencePiece model |
| `tokenization_bitnet.py` | Custom tokenizer class |
| `trillim_config.json` | Trillim metadata |

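The `rope.cache` file holds precomputed rotary-position tables. A generic sketch of how such a cos/sin cache is typically built (illustrative only; the head dimension 96 is derived from hidden size 1536 / 16 heads, and the base 10000 is the common RoPE default, both assumptions about this model):

```python
import numpy as np

def rope_cache(seq_len: int = 2048, head_dim: int = 96, base: float = 10000.0):
    """Build cos/sin tables for rotary position embeddings (RoPE)."""
    # One frequency per pair of dimensions: base^(-2i/d)
    inv_freq = 1.0 / base ** (np.arange(0, head_dim, 2) / head_dim)
    # Angle for every (position, frequency) pair
    angles = np.outer(np.arange(seq_len), inv_freq)  # (seq_len, head_dim // 2)
    return np.cos(angles), np.sin(angles)

cos, sin = rope_cache()
# cos.shape == sin.shape == (2048, 48)
```

Precomputing these tables once avoids recomputing trigonometric functions for every token at inference time.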
## License

This model is released under the [MIT License](https://opensource.org/licenses/MIT), following the license of the source model.