EVALUATION-ONLY ACCESS (30-DAY TESTING)

Access to this repository is gated: you must accept the conditions below before you can view its files and content.

This is a private evaluation version of LlamaOra-6.4B-Instruct-FP8.

By agreeing, you accept:

  • 30-day internal testing only
  • No commercial use, redistribution, or reverse-engineering
  • Deletion of all files after evaluation
  • Full terms in LICENSE

Access is granted only to approved licensees.


LlamaOra-6.4B-Instruct-FP8

This repository contains the LlamaOra-6.4B-Instruct-FP8 model developed by Ora Computing. This is a compressed and fine-tuned derivative of the Llama 3.1‑8B‑Instruct model (Built with Llama).


Model Overview

Model name: LlamaOra-6.4B-Instruct-FP8
Base model: Llama 3.1‑8B‑Instruct
Derived size: ~6.4 billion parameters (compressed from the base model’s ~8.0 billion)
Purpose: Evaluation/test‑use only; optimized for internal benchmarking and non‑production integration.
License: See LICENSE (Custom Model License Agreement)

Intended Use & Restrictions

Permitted use

  • Internal testing, benchmarking and evaluation of the model by the named Licensee.
  • Exploration of model behaviours, prompt engineering, and non‑production prototypes.

Prohibited use

  • Deployment in a production or commercial service, a public‑facing API, or any resale or redistribution.
  • Fine‑tuning or creating derivative models for production use without separate agreement.
  • Disclosure or sharing of the model (or its weights) to third parties beyond the named Licensee.

Out‑of‑scope use

  • Any use that triggers the “Additional Commercial Terms” of the Llama 3.1 Community License (e.g., >700 million monthly active users).
  • Use of the model in regulated or safety‑critical contexts (unless separately permitted).

Accuracy

Benchmark                       Base (Llama-3.1-8B-Instruct)   This model (FP8)   Recovery
MMLU (0-shot)                   68.34                          65.82               96.31%
BOOLQ (0-shot)                  85.47                          84.28               98.61%
HellaSwag (norm, 0-shot)        79.50                          77.88               97.96%
Winogrande (0-shot)             73.64                          76.56              103.97%
ARC Challenge (norm, 0-shot)    55.63                          58.28              104.76%
Average                         72.52                          72.56              100.07%
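The recovery column is simply the compressed model's score expressed as a percentage of the base model's score. The figures above can be recomputed directly:

```python
# Recompute the recovery percentages reported in the accuracy table.
base = {
    "MMLU (0-shot)": 68.34,
    "BOOLQ (0-shot)": 85.47,
    "HellaSwag (norm, 0-shot)": 79.50,
    "Winogrande (0-shot)": 73.64,
    "ARC Challenge (norm, 0-shot)": 55.63,
}
fp8 = {
    "MMLU (0-shot)": 65.82,
    "BOOLQ (0-shot)": 84.28,
    "HellaSwag (norm, 0-shot)": 77.88,
    "Winogrande (0-shot)": 76.56,
    "ARC Challenge (norm, 0-shot)": 58.28,
}

for name in base:
    recovery = 100 * fp8[name] / base[name]
    print(f"{name}: {recovery:.2f}%")

avg_base = sum(base.values()) / len(base)  # 72.52
avg_fp8 = sum(fp8.values()) / len(fp8)     # 72.56
print(f"Average recovery: {100 * avg_fp8 / avg_base:.2f}%")
```

Note that the average recovery slightly exceeds 100% because the model outperforms the base on Winogrande and ARC Challenge.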

Finetuning

This model was finetuned using LoRA on high-quality instruction-response data.
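LoRA adapts a frozen pretrained weight by adding a trainable low-rank product, so only a small fraction of parameters are updated during finetuning. A minimal NumPy sketch of the idea, with toy dimensions rather than the actual training configuration (which is not disclosed here):

```python
import numpy as np

rng = np.random.default_rng(0)

d_out, d_in, rank = 64, 64, 8           # toy shapes; real layers are far larger
W = rng.standard_normal((d_out, d_in))  # frozen pretrained weight

# LoRA adapters: B starts at zero, so the adapted layer
# initially computes exactly the same function as the base.
A = rng.standard_normal((rank, d_in)) * 0.01
B = np.zeros((d_out, rank))

def adapted_forward(x):
    # Base path plus low-rank update; only A and B are trained.
    return W @ x + B @ (A @ x)

x = rng.standard_normal(d_in)
assert np.allclose(adapted_forward(x), W @ x)  # B == 0 -> no change yet

full = d_out * d_in
lora = rank * (d_out + d_in)
print(f"trainable params: {lora} vs full {full}")
```

With these toy shapes the adapters hold a quarter of the full layer's parameters; at realistic hidden sizes and small ranks the fraction is far lower, which is what makes LoRA finetuning cheap.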


Compression & Quantization

  • Parameter count: reduced from ~8.0 billion to ~6.4 billion (a ~20% reduction).
  • Quantization: We quantized the compressed model to static FP8 using the llm_compressor toolkit.
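FP8 E4M3 stores each value in 8 bits (1 sign, 4 exponent, 3 mantissa bits), and "static" quantization fixes the scale ahead of time from calibration data rather than recomputing it at inference. A rough simulation of the rounding error this introduces, assuming simple per-tensor symmetric scaling; this is an illustration, not the actual llm_compressor algorithm:

```python
import numpy as np

E4M3_MAX = 448.0  # largest finite value representable in FP8 E4M3

def quantize_e4m3_static(w, scale):
    """Scale into the E4M3 range, round to ~3 mantissa bits, then clip."""
    s = np.clip(w / scale, -E4M3_MAX, E4M3_MAX)
    # Round the significand to 3 fractional bits at each value's exponent.
    mant, exp = np.frexp(s)          # s = mant * 2**exp, mant in [0.5, 1)
    mant = np.round(mant * 16) / 16  # keep implicit bit + 3 mantissa bits
    return np.ldexp(mant, exp)

def dequantize(q, scale):
    return q * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(1024).astype(np.float32)

scale = np.max(np.abs(w)) / E4M3_MAX  # static per-tensor scale
q = quantize_e4m3_static(w, scale)
w_hat = dequantize(q, scale)

rel_err = np.max(np.abs(w - w_hat) / (np.abs(w) + 1e-8))
print(f"max relative error after FP8 round-trip: {rel_err:.3f}")
```

With 3 mantissa bits the worst-case relative rounding error is about 1/16 (half a unit in the last place), which is why FP8 models can track their BF16 originals closely on aggregate benchmarks while still diverging on individual outputs.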

Limitations & Risks

  • Compressed models may not replicate the full behaviour of the base model under all prompt categories, particularly domain‑specific or rare inputs.
  • Quantization (FP8) and architectural compression may introduce subtle degradation in accuracy or stability.
  • The model is provided as‑is for testing only and is not certified for production use.
  • Users should validate outputs carefully and monitor for bias or unintended behaviours.

Upstream Attribution

This model is derived from the Llama 3.1 model family released by Meta Platforms, Inc. under the Llama 3.1 Community License.

“Llama 3.1 is licensed under the Llama 3.1 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.”
For full terms, see: https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/LICENSE


Contact & Support

For licensing inquiries or to request extended evaluation rights, please contact:
stefan@oracomputing.com


Repository and model access are regulated. Do not redistribute or share without explicit written permission from Ora Computing.
