EVALUATION-ONLY ACCESS (30-DAY TESTING)
This is a private evaluation version of LlamaOra-6.4B-Instruct-FP8.
By agreeing, you accept:
- 30-day internal testing only
- No commercial use, redistribution, or reverse-engineering
- Deletion of all files after evaluation
- Full terms in LICENSE
Access is granted only to approved licensees.
LlamaOra-6.4B-Instruct-FP8
This repository contains the LlamaOra-6.4B-Instruct-FP8 model developed by Ora Computing. This is a compressed and fine-tuned derivative of the Llama 3.1‑8B‑Instruct model (Built with Llama).
Model Overview
- Model name: LlamaOra-6.4B-Instruct-FP8
- Base model: Llama 3.1‑8B‑Instruct
- Derived size: ~6.4 billion parameters (compressed from the base model’s ~8.0 billion)
- Purpose: Evaluation and test use only; optimized for internal benchmarking and non‑production integration (a loading sketch follows below)
- License: See LICENSE (Custom Model License Agreement)
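For a quick smoke test during evaluation, the model can be loaded like any other FP8 safetensors checkpoint. The snippet below is a minimal sketch, assuming a vLLM build with FP8 support and approved access to this gated repository; it is illustrative, not an official quickstart.

```python
# Minimal evaluation smoke test (illustrative sketch; assumes vLLM with FP8
# support and access to the gated oracomputing/LlamaOra-6.4B-Instruct-FP8 repo).
from vllm import LLM, SamplingParams

llm = LLM(model="oracomputing/LlamaOra-6.4B-Instruct-FP8")
params = SamplingParams(temperature=0.0, max_tokens=64)

# Greedy decoding keeps the smoke test deterministic.
outputs = llm.generate(["Briefly explain FP8 quantization."], params)
print(outputs[0].outputs[0].text)
```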
Intended Use & Restrictions
Permitted use
- Internal testing, benchmarking, and evaluation of the model by the named Licensee.
- Exploration of model behaviors, prompt engineering, and non‑production prototypes.
Prohibited use
- Deployment in a production or commercial service, a public‑facing API, or any form of resale or redistribution.
- Fine‑tuning or creating derivative models for production use without separate agreement.
- Disclosure or sharing of the model (or its weights) to third parties beyond the named Licensee.
Out‑of‑scope use
- Any use that triggers the “Additional Commercial Terms” of the Llama 3.1 Community License (e.g., >700 million monthly active users).
- Use of the model in regulated or safety‑critical contexts (unless separately permitted).
Accuracy
| Benchmark | Llama-3.1-8B-Instruct (base model) | LlamaOra-6.4B-Instruct-FP8 (this model) | Recovery |
|---|---|---|---|
| MMLU (0-shot) | 68.34 | 65.82 | 96.31% |
| BoolQ (0-shot) | 85.47 | 84.28 | 98.61% |
| HellaSwag (norm, 0-shot) | 79.50 | 77.88 | 97.96% |
| Winogrande (0-shot) | 73.64 | 76.56 | 103.97% |
| ARC Challenge (norm, 0-shot) | 55.63 | 58.28 | 104.76% |
| Average | 72.52 | 72.56 | 100.07% |
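This card does not state which evaluation harness produced the table above. As a rough reproduction sketch, the snippet below assumes EleutherAI's lm-evaluation-harness (an assumption, not a documented dependency); task names and metric variants (e.g., the normalized HellaSwag and ARC scores) may need adjusting to match the exact setup.

```python
# Hypothetical reproduction sketch using lm-evaluation-harness's Python API;
# the actual tooling and task configuration are assumptions.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=oracomputing/LlamaOra-6.4B-Instruct-FP8",
    tasks=["mmlu", "boolq", "hellaswag", "winogrande", "arc_challenge"],
    num_fewshot=0,  # the table reports 0-shot scores
)
print(results["results"])
```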
Fine-tuning
This model was fine-tuned using LoRA on high-quality instruction-response data.
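The exact recipe is not published. As a minimal sketch of what a LoRA setup looks like with Hugging Face peft, the snippet below uses a rank, alpha, and target modules chosen purely for illustration, not taken from this model's training run.

```python
# Illustrative LoRA configuration with Hugging Face peft; all hyperparameters
# are assumptions, not the recipe used for LlamaOra-6.4B-Instruct-FP8.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
lora_config = LoraConfig(
    r=16,                                 # low-rank dimension (assumed)
    lora_alpha=32,                        # scaling factor (assumed)
    target_modules=["q_proj", "v_proj"],  # attention projections (assumed)
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the small LoRA adapters train
```

Because only the adapter weights are updated, LoRA keeps fine-tuning memory and compute well below a full-parameter run.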
Compression & Quantization
- Parameter count compression: The parameter count was reduced from ~8.0 billion to ~6.4 billion.
- Quantization: We quantized the compressed model to static FP8 using the `llm_compressor` toolkit (a sketch follows below).
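For reference, a one-shot static FP8 recipe with `llm_compressor` typically looks like the sketch below. The calibration dataset, ignored layers, and paths are assumptions; the exact configuration used to produce this model is not published.

```python
# Sketch of static FP8 quantization with the llm_compressor toolkit; the
# recipe, calibration setup, and paths are assumed, not this model's actual
# configuration. Import paths may vary across llm-compressor versions.
from llmcompressor.modifiers.quantization import QuantizationModifier
from llmcompressor.transformers import oneshot

recipe = QuantizationModifier(
    targets="Linear",    # quantize the linear layers (assumed)
    scheme="FP8",        # static FP8 weights and activations
    ignore=["lm_head"],  # output head often kept in higher precision (assumed)
)

oneshot(
    model="path/to/compressed-6.4b-checkpoint",  # hypothetical local path
    dataset="open_platypus",                     # calibration data (assumed)
    recipe=recipe,
    num_calibration_samples=512,                 # assumed
    output_dir="LlamaOra-6.4B-Instruct-FP8",
)
```

Static FP8 needs a small calibration pass to fix activation scales ahead of time, which is why the sketch passes a calibration dataset to the one-shot call.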
Limitations & Risks
- Compressed models may not replicate the full behavior of the base model under all prompt categories, particularly domain‑specific or rare inputs.
- Quantization (FP8) and architectural compression may introduce subtle degradation in accuracy or stability.
- The model is provided as‑is for testing only and is not certified for production use.
- Users should validate outputs carefully and monitor for bias or unintended behaviors.
Upstream Attribution
This model is derived from the Llama 3.1 model family released by Meta Platforms, Inc. under the Llama 3.1 Community License.
“Llama 3.1 is licensed under the Llama 3.1 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.”
For full terms, see: https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/LICENSE
Contact & Support
For licensing inquiries or to request extended evaluation rights, please contact:
stefan@oracomputing.com
Access to this repository and model is restricted. Do not redistribute or share without explicit written permission from Ora Computing.