Commit 6f591dd (verified) · 0 Parent(s)

Duplicate from andreabac3/Fauno-Italian-LLM-7B

Co-authored-by: Andrea Bacciu <andreabac3@users.noreply.huggingface.co>

- .gitattributes +34 -0
- README.md +125 -0
- adapter_config.json +23 -0
- adapter_model.bin +3 -0
- fauno.drawio.png +0 -0
- screenshot_demo.png +0 -0
.gitattributes
ADDED
@@ -0,0 +1,34 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
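Each of the 34 lines above maps a glob pattern to the Git LFS filter, so matching files are stored as lightweight pointer files rather than full blobs. A minimal sketch of how such a file could be parsed (`fnmatch` only approximates Git's glob rules, and the sample text here is a subset of the real file):

```python
from fnmatch import fnmatch

def lfs_patterns(gitattributes_text: str) -> list[str]:
    """Collect the path patterns routed through the Git LFS filter."""
    patterns = []
    for line in gitattributes_text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and "filter=lfs" in parts[1:]:
            patterns.append(parts[0])
    return patterns

# A subset of the .gitattributes content above, for illustration
attributes = """\
*.bin filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
"""

patterns = lfs_patterns(attributes)
# fnmatch approximates gitattributes matching for simple patterns
print(any(fnmatch("adapter_model.bin", p) for p in patterns))  # True
```

This is why `adapter_model.bin` below appears in the diff as a three-line pointer instead of binary content.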
README.md
ADDED
@@ -0,0 +1,125 @@
---
license: gpl-3.0
datasets:
- andreabac3/MedQuaAD-Italian-Fauno-Baize
- andreabac3/StackOverflow-Italian-Fauno-Baize
- andreabac3/Quora-Italian-Fauno-Baize
- teelinsan/camoscio_cleaned
language:
- it
- en
tags:
- large language model
- italian large language model
- baize
- llama
- italian
---
# Fauno - Italian LLM

![Fauno logo](fauno.drawio.png)

Get ready to meet Fauno - the Italian language model crafted by the [RSTLess Research Group](https://rstless-lab.netlify.app/) at Sapienza University of Rome.

The research team behind Fauno includes [Andrea Bacciu](https://andreabac3.github.io/), [Dr. Giovanni Trappolini](https://sites.google.com/view/giovannitrappolini), [Andrea Santilli](https://www.santilli.xyz/), and [Professor Fabrizio Silvestri](https://sites.google.com/diag.uniroma1.it/fabriziosilvestri/home).

Fauno represents a cutting-edge development in open-source Italian Large Language Modeling. It is trained on extensive Italian synthetic datasets spanning a wide range of fields: medical data 🩺, technical content from Stack Overflow 💻, Quora discussions 💬, and Alpaca data 🦙 translated into Italian.

As a result, Fauno can answer your questions in Italian 🙋, fix your buggy code 🐛, and understand a minimum of medical literature 💊.

## The 🇮🇹 open-source version of ChatGPT!

Discover the capabilities of Fauno and experience the evolution of Italian language models for yourself.

![Demo screenshot](screenshot_demo.png)

### Why Fauno?

We started from a model called Baize, named after a legendary creature from Chinese literature. Continuing along this thematic line, we built our Italian model on Baize and named it Fauno, after an iconic figure from Roman mythology. This choice underlines the link between the two models while keeping a distinctive identity rooted in Italian culture.

# Did you know that you can run Fauno in a free Colab notebook?

Follow this link to access a Colab notebook with our 7B version! <a target="_blank" href="https://colab.research.google.com/drive/1AepJVWS-qU910zyq-Zi7wWFQ5tthVzUe">
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

## 🔎 Model details

Fauno is a fine-tuned version of the LoRA weights of [Baize](https://github.com/project-baize/baize-chatbot), which is itself built on [LLaMA](https://github.com/facebookresearch/llama).

We translated and cleaned the Baize data, then fine-tuned the 7B model on a single RTX A6000 (48 GB of VRAM) in 19 hours for one epoch.

- 13B: https://huggingface.co/andreabac3/Fauno-Italian-LLM-13B

Fauno 30B and 65B are coming soon!

## Model initialization
```python
from transformers import LlamaTokenizer, LlamaForCausalLM, GenerationConfig
from peft import PeftModel

tokenizer = LlamaTokenizer.from_pretrained("decapoda-research/llama-7b-hf")
model = LlamaForCausalLM.from_pretrained(
    "decapoda-research/llama-7b-hf",
    load_in_8bit=True,
    device_map="auto",
)
# Load the Fauno LoRA adapter on top of the base LLaMA weights
model = PeftModel.from_pretrained(model, "andreabac3/Fauno-Italian-LLM-7B")
model.eval()
```

## Inference
```python
# Default generation settings; tune temperature/top_p as needed
generation_config = GenerationConfig()

def evaluate(question: str) -> str:
    prompt = f"The conversation between human and AI assistant.\n[|Human|] {question}.\n[|AI|] "
    inputs = tokenizer(prompt, return_tensors="pt")
    input_ids = inputs["input_ids"].cuda()
    generation_output = model.generate(
        input_ids=input_ids,
        generation_config=generation_config,
        return_dict_in_generate=True,
        output_scores=True,
        max_new_tokens=256,
    )
    output = tokenizer.decode(generation_output.sequences[0]).split("[|AI|]")[1]
    return output

your_question: str = "Qual è il significato della vita?"
print(evaluate(your_question))
```

### Output
```
Il senso della vita è una domanda che molte persone hanno cercato di rispondere per secoli.
Alla fine, il senso della vita è soggettivo e varia da persona a persona.
Alcune persone credono che il senso della vita sia trovare la felicità, mentre altre credono che sia raggiungere i propri obiettivi o aiutare gli altri.
Alla fine, il senso della vita è determinato dall'individuo e dai suoi valori e credenze.
In definitiva, il senso della vita è qualcosa che ognuno deve trovare da solo.
Non c'è una risposta giusta o sbagliata, poiché ogni persona ha le proprie convinzioni e credenze.
La ricerca del senso della vita può essere un viaggio lungo e difficile, ma vale la pena perseguirla.
```
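The `evaluate` helper above handles a single turn. Baize-style prompts chain turns by appending `[|Human|]`/`[|AI|]` tags, so a conversation can be carried forward by rebuilding the prompt from the history. A sketch (the `history` list of pairs is our own convention, not part of the released code):

```python
def build_prompt(history: list[tuple[str, str]], question: str) -> str:
    """Assemble a Baize-style multi-turn prompt from (human, ai) pairs."""
    prompt = "The conversation between human and AI assistant.\n"
    for human, ai in history:
        prompt += f"[|Human|] {human}\n[|AI|] {ai}\n"
    # The trailing "[|AI|] " cues the model to produce the next reply
    prompt += f"[|Human|] {question}\n[|AI|] "
    return prompt

print(build_prompt([("Ciao!", "Ciao, come posso aiutarti?")], "Chi sei?"))
```

With multiple turns in the prompt, the last reply should be extracted with `rsplit("[|AI|]", 1)[1]` rather than `split(...)[1]`.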
# 📖 Cite our work

If you use our translated datasets or model weights in your research, please cite our work.
```bibtex
@misc{fauno,
  author = {Andrea Bacciu and Giovanni Trappolini and Andrea Santilli and Fabrizio Silvestri},
  title = {Fauno: The Italian Large Language Model that will leave you senza parole!},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/andreabac3/Fauno-Italian-LLM}},
}
```


## 🔑 License
This project is a derivative of Baize, and we adhere to the licensing constraints imposed by both the creators of Baize and the authors of LLaMA.

## ⚠️ Hallucinations
It is important to remark that current-generation models are prone to hallucinations. We therefore advise you not to take their answers at face value.

## 👏 Acknowledgements
- LLaMA - Meta AI: https://github.com/facebookresearch/llama
- Baize: https://github.com/project-baize/baize-chatbot
- Stanford Alpaca: https://github.com/tatsu-lab/stanford_alpaca
- Camoscio: https://github.com/teelinsan/camoscio

#### Image Credits
- Llama image: https://next14.com/en/nextnews-7-march-a-new-language-model-for-meta-bing-ai-on-windows-and-the-first-tokenized-real-estate-sales/
- Fauno logo: https://www.flaticon.com/free-icon/faun_7931635?term=faun&page=1&position=1&origin=tag&related_id=7931635
adapter_config.json
ADDED
@@ -0,0 +1,23 @@
{
    "base_model_name_or_path": "decapoda-research/llama-7b-hf",
    "bias": "none",
    "enable_lora": null,
    "fan_in_fan_out": false,
    "inference_mode": true,
    "init_lora_weights": true,
    "lora_alpha": 16,
    "lora_dropout": 0.05,
    "merge_weights": false,
    "modules_to_save": null,
    "peft_type": "LORA",
    "r": 8,
    "target_modules": [
        "q_proj",
        "k_proj",
        "v_proj",
        "down_proj",
        "gate_proj",
        "up_proj"
    ],
    "task_type": "CAUSAL_LM"
}
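With `r = 8` and six target modules per layer, the adapter's parameter count can be sanity-checked against the checkpoint size. A sketch assuming the standard LLaMA-7B dimensions (32 decoder layers, hidden size 4096, MLP intermediate size 11008), which the config itself does not state:

```python
# Assumed LLaMA-7B dimensions (not stated in adapter_config.json)
N_LAYERS, HIDDEN, INTERMEDIATE = 32, 4096, 11008
R = 8  # the "r" value from adapter_config.json

# (in_features, out_features) for each target module
MODULE_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, HIDDEN),
    "v_proj": (HIDDEN, HIDDEN),
    "down_proj": (INTERMEDIATE, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
}

# Each LoRA pair adds an A matrix (r x in) and a B matrix (out x r)
per_layer = sum(R * i + o * R for i, o in MODULE_SHAPES.values())
total_params = N_LAYERS * per_layer
print(total_params)            # 17891328 trainable adapter parameters
print(total_params * 4 / 1e6)  # ~71.6 MB in fp32
```

The fp32 estimate lands close to the 71.7 MB `adapter_model.bin` recorded below, which supports the assumed dimensions.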
adapter_model.bin
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:572f9f3c2c6eb0a8d6918be107ebff1c51da3ba4f2757d80c6c5bcff0dd8561a
size 71703053
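Because `*.bin` is LFS-tracked in `.gitattributes`, the diff above shows a Git LFS pointer file rather than the binary weights: a spec version, the SHA-256 of the real content, and its size in bytes. A small sketch parsing such a pointer:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into its key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

# The pointer content from the diff above
pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:572f9f3c2c6eb0a8d6918be107ebff1c51da3ba4f2757d80c6c5bcff0dd8561a
size 71703053
"""

info = parse_lfs_pointer(pointer)
print(int(info["size"]) / 1e6)  # ~71.7 MB adapter checkpoint
```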
fauno.drawio.png
ADDED

screenshot_demo.png
ADDED