EVALUATION-ONLY ACCESS (30-DAY TESTING)

Access to this repository is gated: you must accept the conditions below before you can view its files and content.

This is a private evaluation version of LlamaOra-6.4B-Instruct-FP8.

By agreeing, you accept:

  • 30-day internal testing only
  • No commercial use, redistribution, or reverse-engineering
  • Deletion of all files after evaluation
  • Full terms in LICENSE

Access is granted only to approved licensees.


LlamaOra-6.4B-Instruct-FP8

This repository contains the LlamaOra-6.4B-Instruct-FP8 model developed by Ora Computing. This is a compressed and fine-tuned derivative of the Llama 3.1‑8B‑Instruct model (Built with Llama).


Model Overview

Model name: LlamaOra-6.4B-Instruct-FP8
Base model: Llama 3.1‑8B‑Instruct
Derived size: ~6.4 billion parameters (compressed from the base model’s ~8.0 billion)
Purpose: Evaluation/test‑use only; optimized for internal benchmarking and non‑production integration.
License: See LICENSE (Custom Model License Agreement)

Intended Use & Restrictions

Permitted use

  • Internal testing, benchmarking and evaluation of the model by the named Licensee.
  • Exploration of model behaviours, prompt engineering, and non‑production prototypes.

Prohibited use

  • Deployment in a production or commercial service, a public‑facing API, or any resale or redistribution.
  • Fine‑tuning or creating derivative models for production use without separate agreement.
  • Disclosure or sharing of the model (or its weights) to third parties beyond the named Licensee.

Out‑of‑scope use

  • Any use that triggers the “Additional Commercial Terms” of the Llama 3.1 Community License (e.g., >700 million monthly active users).
  • Use of the model in regulated or safety‑critical contexts (unless separately permitted).

Accuracy

Benchmark                       Base (Llama-3.1-8B-Instruct)   This model (FP8)   Recovery
MMLU (0-shot)                   68.34                          65.82               96.31%
BOOLQ (0-shot)                  85.47                          84.28               98.61%
HellaSwag (norm, 0-shot)        79.50                          77.88               97.96%
Winogrande (0-shot)             73.64                          76.56              103.97%
ARC Challenge (norm, 0-shot)    55.63                          58.28              104.76%
Average                         72.52                          72.56              100.07%
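The recovery column is simply the compressed model's score expressed as a percentage of the base model's score. The figures above can be recomputed directly:

```python
# Recompute the recovery percentages reported in the accuracy table.
base = {
    "MMLU (0-shot)": 68.34,
    "BOOLQ (0-shot)": 85.47,
    "HellaSwag (norm, 0-shot)": 79.50,
    "Winogrande (0-shot)": 73.64,
    "ARC Challenge (norm, 0-shot)": 55.63,
}
fp8 = {
    "MMLU (0-shot)": 65.82,
    "BOOLQ (0-shot)": 84.28,
    "HellaSwag (norm, 0-shot)": 77.88,
    "Winogrande (0-shot)": 76.56,
    "ARC Challenge (norm, 0-shot)": 58.28,
}

for name in base:
    recovery = 100 * fp8[name] / base[name]
    print(f"{name}: {recovery:.2f}%")

avg_base = sum(base.values()) / len(base)  # 72.52
avg_fp8 = sum(fp8.values()) / len(fp8)     # 72.56
print(f"Average recovery: {100 * avg_fp8 / avg_base:.2f}%")
```

Note that the average recovery slightly exceeds 100% because the model outperforms the base on Winogrande and ARC Challenge.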

Finetuning

This model was finetuned using LoRA on high-quality instruction-response data.
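LoRA adapts a frozen pretrained weight by adding a trainable low-rank product, so only a small fraction of parameters are updated during finetuning. A minimal NumPy sketch of the idea, with toy dimensions rather than the actual training configuration (which is not disclosed here):

```python
import numpy as np

rng = np.random.default_rng(0)

d_out, d_in, rank = 64, 64, 8           # toy shapes; real layers are far larger
W = rng.standard_normal((d_out, d_in))  # frozen pretrained weight

# LoRA adapters: B starts at zero, so the adapted layer
# initially computes exactly the same function as the base.
A = rng.standard_normal((rank, d_in)) * 0.01
B = np.zeros((d_out, rank))

def adapted_forward(x):
    # Base path plus low-rank update; only A and B are trained.
    return W @ x + B @ (A @ x)

x = rng.standard_normal(d_in)
assert np.allclose(adapted_forward(x), W @ x)  # B == 0 -> no change yet

full = d_out * d_in
lora = rank * (d_out + d_in)
print(f"trainable params: {lora} vs full {full}")
```

With these toy shapes the adapters hold a quarter of the full layer's parameters; at realistic hidden sizes and small ranks the fraction is far lower, which is what makes LoRA finetuning cheap.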


Compression & Quantization

  • Parameter count: reduced from ~8.0 billion to ~6.4 billion (a ~20% reduction).
  • Quantization: We quantized the compressed model to static FP8 using the llm_compressor toolkit.
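FP8 E4M3 stores each value in 8 bits (1 sign, 4 exponent, 3 mantissa bits), and "static" quantization fixes the scale ahead of time from calibration data rather than recomputing it at inference. A rough simulation of the rounding error this introduces, assuming simple per-tensor symmetric scaling; this is an illustration, not the actual llm_compressor algorithm:

```python
import numpy as np

E4M3_MAX = 448.0  # largest finite value representable in FP8 E4M3

def quantize_e4m3_static(w, scale):
    """Scale into the E4M3 range, round to ~3 mantissa bits, then clip."""
    s = np.clip(w / scale, -E4M3_MAX, E4M3_MAX)
    # Round the significand to 3 fractional bits at each value's exponent.
    mant, exp = np.frexp(s)          # s = mant * 2**exp, mant in [0.5, 1)
    mant = np.round(mant * 16) / 16  # keep implicit bit + 3 mantissa bits
    return np.ldexp(mant, exp)

def dequantize(q, scale):
    return q * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(1024).astype(np.float32)

scale = np.max(np.abs(w)) / E4M3_MAX  # static per-tensor scale
q = quantize_e4m3_static(w, scale)
w_hat = dequantize(q, scale)

rel_err = np.max(np.abs(w - w_hat) / (np.abs(w) + 1e-8))
print(f"max relative error after FP8 round-trip: {rel_err:.3f}")
```

With 3 mantissa bits the worst-case relative rounding error is about 1/16 (half a unit in the last place), which is why FP8 models can track their BF16 originals closely on aggregate benchmarks while still diverging on individual outputs.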

Limitations & Risks

  • Compressed models may not replicate the full behaviour of the base model under all prompt categories, particularly domain‑specific or rare inputs.
  • Quantization (FP8) and architectural compression may introduce subtle degradation in accuracy or stability.
  • The model is provided as‑is for testing only and is not certified for production use.
  • Users should validate outputs carefully and monitor for bias or unintended behaviours.

Upstream Attribution

This model is derived from the Llama 3.1 model family released by Meta Platforms, Inc. under the Llama 3.1 Community License.

“Llama 3.1 is licensed under the Llama 3.1 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.”
For full terms, see: https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/LICENSE


Contact & Support

For licensing inquiries or to request extended evaluation rights, please contact:
stefan@oracomputing.com


Repository and model access are regulated. Do not redistribute or share without explicit written permission from Ora Computing.
