Ambuj Varshney
committed on
Create README.md
README.md
ADDED
@@ -0,0 +1,39 @@
---
license: apache-2.0
datasets:
- HuggingFaceFW/fineweb
language:
- en
library_name: transformers
tags:
- IoT
- sensor
- embedded
---

# TinyLLM

## Overview

This repository hosts a small language model developed as part of the TinyLLM framework ([arxiv link]). The models are designed for embedded sensing applications and fine-tuned with sensor data, so that a language model can be hosted locally on devices with limited compute, such as single-board computers. They are based on the GPT-2 architecture and were trained on NVIDIA H100 GPUs. This repository provides base models that can be further fine-tuned for specific downstream embedded sensing tasks.

## Model Information

- **Parameters:** 124M (Hidden Size = 768)
- **Architecture:** Decoder-only transformer
- **Training Data:** Up to 10B tokens from the [SHL](http://www.shl-dataset.org/) and [Fineweb](https://huggingface.co/datasets/HuggingFaceFW/fineweb) datasets, combined in a 4:6 ratio
- **Input and Output Modality:** Text
- **Context Length:** 1024
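
As a rough sanity check on the figures above, the parameter count of a GPT-2 style decoder can be worked out from its dimensions. The sketch below takes the hidden size (768) and context length (1024) from the list; the layer count (12) and vocabulary size (50257) are assumptions borrowed from the standard GPT-2 small configuration and are not stated in this model card. The function name `gpt2_param_count` is hypothetical.

```python
def gpt2_param_count(n_layer: int, d_model: int, n_ctx: int, vocab_size: int) -> int:
    """Count parameters of a GPT-2 style decoder with tied input/output embeddings."""
    # Token embedding table plus learned positional embedding table.
    embeddings = vocab_size * d_model + n_ctx * d_model
    # Fused QKV projection and attention output projection (weights + biases).
    attention = (d_model * 3 * d_model + 3 * d_model) + (d_model * d_model + d_model)
    # Two-layer MLP with a 4x expansion (weights + biases).
    mlp = (d_model * 4 * d_model + 4 * d_model) + (4 * d_model * d_model + d_model)
    # Two layer norms per block, each with a scale and a shift vector.
    layer_norms = 2 * (2 * d_model)
    per_block = attention + mlp + layer_norms
    final_layer_norm = 2 * d_model
    return embeddings + n_layer * per_block + final_layer_norm


# n_layer and vocab_size are ASSUMED (standard GPT-2 small); d_model and n_ctx come from the card.
total = gpt2_param_count(n_layer=12, d_model=768, n_ctx=1024, vocab_size=50257)
print(f"{total:,} parameters (~{total / 1e6:.0f}M)")  # 124,439,808 parameters (~124M)
```

Under these assumptions the count matches the 124M figure listed above. A training setup that pads the vocabulary would give a slightly larger number.
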
## Acknowledgements
We acknowledge the open-source frameworks [llm.c](https://github.com/karpathy/llm.c) and [llama.cpp](https://github.com/ggerganov/llama.cpp), which were instrumental in training and testing these models.
## Usage
The model can be used in two primary ways:
1. **With Hugging Face’s Transformers Library**
2. **With llama.cpp**
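
For the first option, a minimal sketch of text generation with the Transformers library might look like the following. It assumes the checkpoint loads through the generic `AutoTokenizer` and `AutoModelForCausalLM` classes, which is typical for GPT-2 style models but is not confirmed by this README. The helper names `clamp_new_tokens` and `generate` are hypothetical, and the repository id must be supplied by the caller because it is not given here.

```python
CONTEXT_LENGTH = 1024  # context length stated under Model Information


def clamp_new_tokens(prompt_tokens: int, requested: int,
                     context_length: int = CONTEXT_LENGTH) -> int:
    """Limit generation so that prompt plus new tokens fit in the context window."""
    if prompt_tokens >= context_length:
        raise ValueError("prompt already fills the context window")
    return min(requested, context_length - prompt_tokens)


def generate(model_id: str, prompt: str, max_new_tokens: int = 50) -> str:
    """Load a causal LM from the Hub (or a local path) and continue the prompt."""
    # Imported lazily so the helper above can be used without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(prompt, return_tensors="pt")
    budget = clamp_new_tokens(inputs["input_ids"].shape[1], max_new_tokens)
    output_ids = model.generate(
        **inputs,
        max_new_tokens=budget,
        # GPT-2 style tokenizers usually define no pad token; reuse EOS (assumption).
        pad_token_id=tokenizer.eos_token_id,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate` with this repository's Hub id (or a local checkpoint directory) and a short prompt returns the prompt followed by the model's continuation. On a single-board computer, the llama.cpp route is likely the lighter option, since it avoids a full PyTorch install.
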
## Disclaimer
This model is intended solely for research purposes.