Augustulus Latin Sentiment Analysis LoRA

Developed by Team Trojan Parse, University of Florida Senior Design Project

A LoRA (Low-Rank Adaptation) adapter fine-tuned on Llama-3.1-8B-Instruct for fine-grained sentiment classification of Ancient Latin texts across seven emotional intensity levels.

Project Information

  • Team Name: Trojan Parse
  • Team Members:
    • Alex John
    • Ryan Willson
    • Byron Boatright
    • Jake Marotta
    • Duncan Fuller
  • Project Repository: GitHub: Trojan-Parse-Project
  • Advisor: Eleni Bozia, Ph.D., Dr. phil. (Associate Professor of Classics and Digital Humanities)
  • Advisor Department: Department of Classics, University of Florida

Model Description

  • Model Name: augustulus-latin-sentiment-lora
  • Model type: LoRA Adapter for Ancient Language Sentiment Classification
  • Language: Classical/Ancient Latin
  • Base model: meta-llama/Llama-3.1-8B-Instruct
  • License: Llama 3.1 Community License
  • Purpose: Academic research and historical text analysis

Sentiment Categories

Our model classifies Ancient Latin texts into seven emotional intensity levels:

Positive Sentiments

  • EXTREMELY POSITIVE (+3): exsultatio, jubilum, beatitudo, summa felicitas

    • Examples: Triumphal declarations, ultimate joy, divine blessing
  • VERY POSITIVE (+2): gaudium, laetitia, amor, gloria, victoria, laudare

    • Examples: Military victories, celebrations, expressions of love/honor
  • MODERATELY POSITIVE (+1): felix, laetus, bonus, pulcher, spes

    • Examples: General contentment, hope, pleasant situations

Neutral (0)

  • Factual statements, descriptions without emotional valence

Negative Sentiments

  • MODERATELY NEGATIVE (-1): malus, tristis, anxius, timor

    • Examples: Minor concerns, sadness, mild fear
  • VERY NEGATIVE (-2): dolor magnus, timor vehemens, ira, furor

    • Examples: Great pain, intense anger, serious threats
  • EXTREMELY NEGATIVE (-3): desperatio, exitium, cruciatus, malum

    • Examples: Utter despair, destruction, torture, ultimate evil
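The seven-level scale above maps naturally onto signed integers. As a minimal sketch (the `Sentiment` class and its helper names are illustrative assumptions, not part of the released code), the labels and scores could be kept in one place like this:

```python
from enum import IntEnum


class Sentiment(IntEnum):
    """Seven-level scale from the category list; the class itself is hypothetical."""
    EXTREMELY_NEGATIVE = -3
    VERY_NEGATIVE = -2
    MODERATELY_NEGATIVE = -1
    NEUTRAL = 0
    MODERATELY_POSITIVE = 1
    VERY_POSITIVE = 2
    EXTREMELY_POSITIVE = 3

    @property
    def label(self) -> str:
        # "VERY_POSITIVE" -> "VERY POSITIVE", the form the model is asked to emit
        return self.name.replace("_", " ")

    @classmethod
    def from_label(cls, text: str) -> "Sentiment":
        # Accepts any casing and surrounding whitespace; raises KeyError otherwise
        return cls[text.strip().upper().replace(" ", "_")]


# All seven label strings, ordered from most negative to most positive
LABELS = [s.label for s in sorted(Sentiment)]
```

Keeping the scores as integers makes it straightforward to compute how far a prediction is from the reference label, which is more informative than exact-match accuracy on an ordinal scale.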

Performance

| Configuration | Accuracy | Notes |
|---|---|---|
| Base Llama 3.1 (zero-shot) | 43.8% | Unreliable, biased toward extremes |
| LoRA Adapter (raw predictions) | 37.5% | Systematic but conservative |
| LoRA + Linguistic Rules | 75.0% | Production-ready |

Category-Level Performance

  • Neutral Detection: 100% accuracy (3/3 test cases)
  • Moderate Categories: 100% accuracy (learned systematic patterns)
  • Extreme Categories: 83.3% accuracy (with intensity calibration)
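The linguistic rules and intensity calibration are not spelled out in this card. Purely as an illustration of the idea, the sketch below promotes a "VERY" prediction to "EXTREMELY" when the text contains a strong intensity marker. The stem lists are assumptions drawn from the example vocabulary in the category list, and `calibrate` is a hypothetical name; the project's actual rules may differ.

```python
# Hypothetical intensity markers, taken from the example vocabulary in the
# category list. Stems are used so that inflected forms also match.
# "malum" is left out because "malus" also appears at the -1 level.
EXTREME_POSITIVE_STEMS = ("exsultati", "jubil", "beatitud", "summa felicita")
EXTREME_NEGATIVE_STEMS = ("desperati", "exiti", "cruciat")


def calibrate(text: str, raw_label: str) -> str:
    """Promote a conservative VERY prediction to EXTREMELY if a marker is present."""
    lowered = text.lower()
    if raw_label == "VERY POSITIVE" and any(s in lowered for s in EXTREME_POSITIVE_STEMS):
        return "EXTREMELY POSITIVE"
    if raw_label == "VERY NEGATIVE" and any(s in lowered for s in EXTREME_NEGATIVE_STEMS):
        return "EXTREMELY NEGATIVE"
    # Every other prediction passes through unchanged
    return raw_label
```

Substring matching on stems is crude: it ignores negation and word boundaries, so a real rule set would likely need lemmatization or at least tokenization.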

Training Approach

Our training methodology combined multiple data sources and validation strategies:

Data Pipeline (5-day development cycle)

Phase 1: Initial Generation

  • Few-shot generation using base Llama 3.1
  • Context-aware synthetic examples
  • Balanced across all six sentiment categories

Phase 2: Consensus Filtering

  • Trained multiple LoRA variants on hand-annotated data
  • Consensus filtering: kept examples where ≥2 models agreed
  • Reduced noise and improved training data quality
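The consensus step can be sketched as a simple vote over the predictions of each LoRA variant. This is an assumed implementation: `consensus_filter` and its argument layout are hypothetical, and the only detail taken from the description is the threshold of at least two agreeing models.

```python
from collections import Counter


def consensus_filter(examples, predictions_per_model, min_agree=2):
    """Keep (text, label) pairs where at least `min_agree` models predict the same label.

    examples: list of texts.
    predictions_per_model: one list of labels per model, aligned with `examples`.
    """
    kept = []
    for i, text in enumerate(examples):
        votes = Counter(preds[i] for preds in predictions_per_model)
        label, count = votes.most_common(1)[0]
        if count >= min_agree:
            kept.append((text, label))
    return kept


texts = ["Gaudium magnum.", "Urbs capta est."]
model_predictions = [
    ["VERY POSITIVE", "VERY NEGATIVE"],   # variant A
    ["VERY POSITIVE", "NEUTRAL"],         # variant B
    ["NEUTRAL", "MODERATELY NEGATIVE"],   # variant C
]
filtered = consensus_filter(texts, model_predictions)
```

Here only the first text survives, because two of the three variants agree on it. With an even number of models a tie is possible, in which case `most_common` returns whichever tied label was seen first.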

Phase 3: Corpus Mining

  • Mined authentic Ancient Latin texts from Perseus Digital Library
  • Extracted high-confidence positive examples (previously underrepresented)
  • Combined ~40,000 corpus examples with synthetic data

Phase 4: Final Training & Iteration

  • Balanced dataset: 9,000 examples (1,500 per category)
  • Distributed training with data-parallel strategy
  • Multiple training runs to optimize hyperparameters
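Balancing to a fixed number of examples per category can be done by downsampling each label's pool. The following is a sketch under that assumption; `balance_by_label` is a hypothetical helper, and the card does not say whether the team downsampled, upsampled, or generated more data to reach the target.

```python
import random
from collections import defaultdict


def balance_by_label(examples, per_category, seed=0):
    """Randomly downsample (text, label) pairs to `per_category` items per label."""
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append((text, label))

    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    balanced = []
    for label in sorted(by_label):
        pool = by_label[label]
        if len(pool) < per_category:
            raise ValueError(f"only {len(pool)} examples for {label!r}")
        balanced.extend(rng.sample(pool, per_category))

    rng.shuffle(balanced)  # avoid long runs of a single label during training
    return balanced
```

Raising on an undersized category, rather than silently returning fewer items, makes underrepresented labels visible early, which matters here because positive examples are described as previously underrepresented.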

Final Training Configuration

  • Training Examples: 9,000 (balanced across 7 categories)
  • Training Epochs: 15
  • Architecture: LoRA adapter (rank: 128, alpha: 256)
  • Optimization: 8-bit quantization for efficiency
  • Hardware: High-performance GPU cluster
  • Framework: PyTorch, HuggingFace Transformers, PEFT
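To give a sense of what rank 128 and alpha 256 mean in practice, the sketch below works through the standard LoRA arithmetic: the update to a weight matrix is scaled by alpha / rank, and each adapted matrix gains rank × (d_in + d_out) trainable parameters. The 4096 hidden size is that of Llama 3.1 8B; which projection matrices were actually adapted is not stated in this card, so a single square projection is used as an example.

```python
def lora_scaling(rank: int, alpha: int) -> float:
    """Standard LoRA scaling factor applied to the low-rank update B @ A."""
    return alpha / rank


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for one adapted matrix: A is rank x d_in, B is d_out x rank."""
    return rank * (d_in + d_out)


RANK, ALPHA = 128, 256
HIDDEN = 4096  # hidden size of Llama 3.1 8B

scaling = lora_scaling(RANK, ALPHA)
adapter_params = lora_param_count(HIDDEN, HIDDEN, RANK)
full_params = HIDDEN * HIDDEN

print(scaling)                        # 2.0
print(adapter_params)                 # 1048576
print(adapter_params / full_params)   # 0.0625
```

For a 4096 × 4096 projection, the adapter therefore trains about 6% as many parameters as full fine-tuning of that matrix would.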

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model and adapter
base_model = "meta-llama/Llama-3.1-8B-Instruct"
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    device_map="auto",
    torch_dtype=torch.float16,
)

# Load Team Trojan Parse's adapter
# Replace YOUR_USERNAME with your Hugging Face username
model = PeftModel.from_pretrained(model, "YOUR_USERNAME/augustulus-latin-sentiment-lora")
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Classify sentiment
def classify_latin_sentiment(text):
    prompt = f'''Classify the sentiment of this Latin text as: EXTREMELY NEGATIVE, VERY NEGATIVE, MODERATELY NEGATIVE, NEUTRAL, MODERATELY POSITIVE, VERY POSITIVE, or EXTREMELY POSITIVE.

Latin text: {text}

Sentiment:'''
    
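    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # Greedy decoding; temperature has no effect when do_sample=False, so it is omitted
    outputs = model.generate(
        **inputs,
        max_new_tokens=20,
        do_sample=False
    )

    # Decode only the newly generated tokens, not the echoed prompt
    new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()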

# Example: Extreme positive (triumph)
text = "Victoria splendidissima! Dux gloriam aeternam meruit!"
print(classify_latin_sentiment(text))
# Output: EXTREMELY POSITIVE

# Example: Very negative (suffering in war)
text = "Bellum crudele et longum populum afflixerat."
print(classify_latin_sentiment(text))
# Output: VERY NEGATIVE
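Because the model generates free text, the returned string may carry extra tokens or different casing around the label. A small normalizer, sketched here as an assumption rather than as part of the released code (`normalize_label` is a hypothetical helper), can map the raw output onto one of the seven canonical labels:

```python
LABELS = [
    "EXTREMELY POSITIVE", "VERY POSITIVE", "MODERATELY POSITIVE", "NEUTRAL",
    "MODERATELY NEGATIVE", "VERY NEGATIVE", "EXTREMELY NEGATIVE",
]


def normalize_label(raw: str):
    """Return the first canonical label found in the model output, or None."""
    # Collapse whitespace and ignore case so " Very  positive\n" still matches
    upper = " ".join(raw.upper().split())
    hits = [(upper.find(label), label) for label in LABELS if label in upper]
    # The earliest match wins if the model keeps generating after the label
    return min(hits)[1] if hits else None


print(normalize_label(" Very Positive\nLatin text: ..."))  # VERY POSITIVE
print(normalize_label("I am not sure."))                   # None
```

Returning `None` for unparseable output lets the caller decide whether to retry, fall back to NEUTRAL, or flag the example for review.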

GGUF Model Download and Local Usage (Merged Fine-Tune)

The LoRA adapter has been merged with the base model and quantized to Q8_0 (8-bit) precision for efficient deployment on CPU/GPU via tools like llama.cpp and Ollama.

File Details

  • File Name: augustulus-latin-sentiment-8b-q8_0.gguf
  • Size: 8.0 GB
  • Quantization: Q8_0 (Recommended for best balance of speed and accuracy)

Llama 3.1 License Notice

IMPORTANT: This model (including the GGUF file) is a derivative of Meta's Llama 3.1 model and is governed by the Meta Llama 3.1 Community License.

  • Attribution: If you redistribute or build products with this model, you must include the statement "Built with Llama" in a prominent location (e.g., README, UI footer, about page).
  • Commercial Use: Allowed without additional permission as long as your product or service has fewer than 700 million monthly active users. Above that threshold, you need a separate commercial license from Meta.

See the full license text here: https://llama.meta.com/llama3_1/license

Usage Example (with Ollama)

This workflow uses a custom Modelfile to constrain the model to the sentiment-labeling task and to give it a short local name.

Create Modelfile

Save the following content as a file named Modelfile:

# Modelfile for the Augustulus Latin Sentiment Model

FROM hf.co/TronCodes/augustulus-latin-sentiment-lora/augustulus-latin-sentiment-8b-q8_0.gguf

SYSTEM """
You are Augustulus, an expert in Classical Latin sentiment analysis. Your task is to respond ONLY with one of the following exact labels: EXTREMELY POSITIVE, VERY POSITIVE, MODERATELY POSITIVE, NEUTRAL, MODERATELY NEGATIVE, VERY NEGATIVE, or EXTREMELY NEGATIVE. Do not provide any conversational text or explanation.
"""

TEMPLATE """
{{ if .System }}<|start_header_id|>system<|end_header_id|>{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>
"""

PARAMETER temperature 0.1
PARAMETER num_predict 20
PARAMETER stop "<|eot_id|>"

Create and Run

ollama create augustulus-latin -f Modelfile
ollama run augustulus-latin
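Once the model has been created, it can also be queried programmatically through Ollama's local HTTP API instead of the interactive prompt. The sketch below assumes the Ollama server is running on its default port (11434) and that the model was created under the name `augustulus-latin` as above; `build_request` and `classify` are hypothetical helper names.

```python
import json
import urllib.request

# Default local endpoint of the Ollama server (assumes an unchanged configuration)
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(text: str, model: str = "augustulus-latin") -> urllib.request.Request:
    """Build a non-streaming generate request for the locally created model."""
    payload = {"model": model, "prompt": text, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify(text: str) -> str:
    """Send the Latin text to Ollama and return the label the model responds with."""
    with urllib.request.urlopen(build_request(text), timeout=60) as resp:
        return json.loads(resp.read())["response"].strip()


# Usage (requires a running Ollama server):
#     classify("Victoria splendidissima! Dux gloriam aeternam meruit!")
```

The system prompt, template, and sampling parameters from the Modelfile are applied by Ollama itself, so the request only needs to carry the Latin text.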

Acknowledgments

We gratefully acknowledge:

  • Dr. Eleni Bozia (Ph.D., Dr. phil.) - Senior Project Advisor
  • University of Florida Department of Humanities - Computing resources and support
  • Perseus Digital Library - Access to Classical Latin corpus
  • Meta AI - Llama 3.1 base model
  • HuggingFace - PEFT library and model hosting infrastructure

Citation

@misc{trojan_parse_latin_sentiment_2025,
  author = {{Team Trojan Parse}},
  title = {Augustulus Latin Sentiment Analysis LoRA},
  year = {2025},
  publisher = {University of Florida},
  journal = {HuggingFace Model Hub},
  howpublished = {\url{https://huggingface.co/TronCodes/augustulus-latin-sentiment-lora}}
}