---
license: cc-by-nd-4.0
language:
- en
model-index:
- name: TinyMyo
  results:
  - task:
      type: gesture-classification
    dataset:
      type: ninapro_db5
      name: Ninapro DB5
    metrics:
    - name: acc@1
      type: acc@1
      value: 0.8941
      verified: false
    - name: f1
      type: f1
      value: 0.7797
      verified: false
  - task:
      type: gesture-classification
    dataset:
      type: epn612
      name: EPN-612
    metrics:
    - name: acc@1
      type: acc@1
      value: 0.9674
      verified: false
    - name: f1
      type: f1
      value: 0.9674
      verified: false
  - task:
      type: gesture-classification
    dataset:
      type: uci_emg
      name: UCI-EMG
    metrics:
    - name: acc@1
      type: acc@1
      value: 0.9756
      verified: false
    - name: f1
      type: f1
      value: 0.9755
      verified: false
  - task:
      type: gesture-classification
    dataset:
      type: gni_meta_rl
      name: Generic Neuromotor Interface (Discrete Gesture)
    metrics:
    - name: CLER
      type: classification-error-rate
      value: 0.153
      verified: false
  - task:
      type: kinematic-regression
    dataset:
      type: ninapro_db8
      name: Ninapro DB8
    metrics:
    - name: MAE
      type: mean-absolute-error
      value: 8.77
      verified: false
    - name: RMSE
      type: root-mean-square-error
      value: 13.35
      verified: false
    - name: R2
      type: r2
      value: 0.62
      verified: false
  - task:
      type: speech-synthesis
    dataset:
      type: gaddy_silent_speech
      name: Gaddy Silent Speech (MFCC to Audio)
    metrics:
    - name: WER
      type: word-error-rate
      value: 0.3354
      verified: false
  - task:
      type: speech-recognition
    dataset:
      type: gaddy_silent_speech
      name: Gaddy Silent Speech (EMG to Text)
    metrics:
    - name: WER
      type: word-error-rate
      value: 0.3395
      verified: false
---

<div align="center">
  <img src="https://raw.githubusercontent.com/MatteoFasulo/BioFoundation/refs/heads/TinyMyo/docs/model/logo/TinyMyo_logo.png" alt="TinyMyo Logo" width="400" />
  <h1>TinyMyo: a Tiny Foundation Model for Flexible EMG Signal Processing at the Edge</h1>
</div>
<p align="center">
  <a href="https://github.com/pulp-bio/BioFoundation">
    <img src="https://img.shields.io/github/stars/pulp-bio/BioFoundation?color=ccf" alt="GitHub">
  </a>
  <a href="https://creativecommons.org/licenses/by-nd/4.0/">
    <img src="https://img.shields.io/badge/License-CC_BY--ND_4.0-lightgrey.svg" alt="License">
  </a>
  <a href="https://arxiv.org/abs/2512.15729">
    <img src="https://img.shields.io/badge/arXiv-2512.15729-b31b1b.svg" alt="Paper">
  </a>
</p>

**TinyMyo** is a **3.6M-parameter** Transformer-based **foundation model for surface EMG (sEMG)**.
It is pretrained on >480 GB of EMG data and optimized for **ultra-low-power, real-time deployment**, including **microcontrollers (GAP9)**, where it achieves an inference time of **0.785 s**, an energy cost of **44.91 mJ**, and a power envelope of **57.18 mW**.

TinyMyo is built for **broad generalization** across datasets, sensor configurations, movement tasks, subjects, and domains (gesture, kinematics, speech).
---

# 📜 License &amp; Usage (Model Weights)

The released TinyMyo weights are licensed under **CC BY-ND 4.0**.
This summary is not legal advice; please read the full license.

### ✅ You may

* **Use** and **redistribute** the **unmodified** TinyMyo weights (including commercially) **with attribution**.
* **Fine-tune/modify internally** for research or production without redistributing modified weights.
* **Publish code, configs, evaluations, and papers** using TinyMyo.

### 🚫 You may not

* **Share or host modified weights** in any form (including LoRA/adapter deltas, pruned/quantized models).
* **Claim endorsement** from the TinyMyo authors without permission.
* **Use the TinyMyo name** for derivative models.

### 🤝 Contributing Improvements

To upstream improvements, submit a **PR** to the
**[BioFoundation repository](https://github.com/pulp-bio/BioFoundation)** with:

1. Full reproducibility artifacts (configs, logs, seeds, environment).
2. Evaluation on standard protocols (e.g., DB5, EPN-612, UCI EMG, DB8, Silent Speech).
3. Comparison to TinyMyo's reported metrics.

Approved PRs will be retrained and released as **official TinyMyo** checkpoints under CC BY-ND.

---

# 📥 1. Default Input &amp; Preprocessing

Unless specified otherwise, TinyMyo expects:

* **Channels:** 16
* **Sampling rate:** 2000 Hz
* **Segment length:** 1000 samples (0.5 s)
* **Windowing:** 50% overlap (pretraining)
* **Preprocessing:**
  * 4th-order **20–450 Hz bandpass**
  * **50 Hz notch filter**
  * **Min–max normalization** (pretraining)
  * **Z-score normalization** (downstream)

Datasets with &lt;16 channels are **zero-padded (pretraining only)**.
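
As a concrete reference, the defaults above can be sketched with NumPy/SciPy. This is a minimal illustration, not the repository's actual pipeline; in particular the notch quality factor (`Q=30`) and the normalization epsilons are assumptions not stated in this card.

```python
import numpy as np
from scipy.signal import butter, iirnotch, filtfilt

FS = 2000  # Hz, TinyMyo's default sampling rate

def preprocess(emg, fs=FS, pretraining=True):
    """Filter and normalize a (channels, samples) EMG array."""
    # 4th-order 20-450 Hz Butterworth bandpass
    b, a = butter(4, [20, 450], btype="bandpass", fs=fs)
    x = filtfilt(b, a, emg, axis=-1)
    # 50 Hz notch; Q=30 is an assumed quality factor
    bn, an = iirnotch(50, Q=30, fs=fs)
    x = filtfilt(bn, an, x, axis=-1)
    if pretraining:
        # per-channel min-max normalization (pretraining)
        mn = x.min(axis=-1, keepdims=True)
        mx = x.max(axis=-1, keepdims=True)
        return (x - mn) / (mx - mn + 1e-8)
    # per-channel z-score normalization (downstream)
    return (x - x.mean(axis=-1, keepdims=True)) / (x.std(axis=-1, keepdims=True) + 1e-8)

def window(x, length=1000, overlap=0.5):
    """Slice (channels, samples) into overlapping segments of `length` samples."""
    step = int(length * (1 - overlap))
    starts = range(0, x.shape[-1] - length + 1, step)
    return np.stack([x[..., s:s + length] for s in starts])
```

With the defaults, a 2 s recording yields seven 1000-sample windows at 50% overlap.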

---

# 🔬 2. Pretraining Overview

TinyMyo is pretrained via masked reconstruction on **three large-scale EMG datasets**:

| Dataset     | Subjects | fs      | Channels | Size    |
| ----------- | -------- | ------- | -------- | ------- |
| Ninapro DB6 | 10       | 2000 Hz | 14       | 20.3 GB |
| Ninapro DB7 | 22       | 2000 Hz | 12       | 30.9 GB |
| EMG2Pose    | 192      | 2000 Hz | 16       | 431 GB  |

## Tokenization: Channel-Independent Patches

Unlike EEG foundation models that mix channels early, TinyMyo uses **per-channel patching**:

* Patch length: **20 samples**
* Patch stride: **20 samples**
* Tokens/channel: **50**
* Total sequence length: **800 tokens** (16 × 50)
* Positional encoding: **RoPE (rotary)**

This preserves electrode-specific structure while allowing attention to learn cross-channel relationships.
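
The patching step can be sketched in a few lines of PyTorch; `patchify` is a hypothetical helper for illustration, not an API from the BioFoundation repo.

```python
import torch

def patchify(x, patch_len=20, stride=20):
    """Split (batch, channels, samples) EMG into per-channel patches.

    Each channel is patched independently, then the per-channel token
    sequences are concatenated along the token axis, giving
    (batch, channels * n_patches, patch_len).
    """
    b, c, t = x.shape
    patches = x.unfold(-1, patch_len, stride)           # (b, c, n_patches, patch_len)
    return patches.reshape(b, c * patches.shape[2], patch_len)

x = torch.randn(2, 16, 1000)   # default 16-channel, 1000-sample segment
tokens = patchify(x)           # (2, 800, 20): 16 channels x 50 tokens/channel
```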

## Transformer Encoder

* **8 layers**, **3 heads**
* Embedding dim: **192**
* Pre-LayerNorm
* Dropout &amp; drop-path: **0.1**
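
With stock PyTorch modules, an encoder of this shape can be approximated as below. The FFN width (768) is an assumption; RoPE and drop-path are omitted, since `nn.TransformerEncoderLayer` supports neither, so this is only a size-matching sketch.

```python
import torch
import torch.nn as nn

# Pre-LN Transformer encoder matching TinyMyo's reported shape:
# 8 layers, 3 heads, embedding dim 192, dropout 0.1.
embed = nn.Linear(20, 192)  # patch_len 20 -> embedding dim 192
layer = nn.TransformerEncoderLayer(
    d_model=192, nhead=3,
    dim_feedforward=768,       # assumed 4x expansion, not stated in this card
    dropout=0.1,
    batch_first=True,
    norm_first=True,           # norm_first=True gives pre-LayerNorm
)
encoder = nn.TransformerEncoder(layer, num_layers=8)

tokens = torch.randn(2, 800, 20)   # (batch, tokens, patch_len)
z = encoder(embed(tokens))         # (2, 800, 192) latent sequence
```

With these widths the encoder lands at roughly 3.6M parameters, consistent with the model size reported above.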

## Lightweight Decoder

A **single linear layer** (~3.9k params) reconstructs masked patches.
Following SimMIM, this forces the encoder to learn robust latent structure.

## Masking Objective

* **50% random masking** with a learnable `[MASK]` token
* Loss: **Smooth L1** with a small penalty on visible patches

$$
\mathcal{L} = \mathcal{L}_{\text{masked}} + 0.1\,\mathcal{L}_{\text{visible}}
$$
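
A minimal sketch of this objective follows; the per-patch mean reduction over the Smooth-L1 terms is an assumption, as the exact reduction is not specified here.

```python
import torch
import torch.nn.functional as F

def masked_recon_loss(pred, target, mask, visible_weight=0.1):
    """Smooth-L1 reconstruction loss: full weight on masked patches,
    0.1 weight on visible ones (L = L_masked + 0.1 * L_visible)."""
    # per-patch Smooth L1, averaged over the patch dimension
    per_patch = F.smooth_l1_loss(pred, target, reduction="none").mean(-1)
    l_masked = per_patch[mask].mean()
    l_visible = per_patch[~mask].mean()
    return l_masked + visible_weight * l_visible

pred = torch.randn(2, 800, 20)        # decoder reconstructions
target = torch.randn(2, 800, 20)      # ground-truth patches
mask = torch.rand(2, 800) < 0.5       # 50% random masking
loss = masked_recon_loss(pred, target, mask)
```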

## Training Setup

* Optimizer: **AdamW** (β = (0.9, 0.98), weight decay 0.01)
* LR: **1e-4** with cosine decay
* Batch size: **512** (with gradient accumulation)
* Epochs: **50**, warm-up: 10
* Hardware: **4× NVIDIA GH200 GPUs**
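
These hyperparameters map onto PyTorch roughly as follows; the linear shape of the warm-up is an assumption, and the `nn.Linear` module is just a stand-in for the encoder.

```python
import torch

model = torch.nn.Linear(192, 192)  # stand-in for the TinyMyo encoder
opt = torch.optim.AdamW(model.parameters(), lr=1e-4,
                        betas=(0.9, 0.98), weight_decay=0.01)
# 10 warm-up epochs, then cosine decay over the remaining 40
warmup = torch.optim.lr_scheduler.LinearLR(opt, start_factor=0.01, total_iters=10)
cosine = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=40)
sched = torch.optim.lr_scheduler.SequentialLR(opt, [warmup, cosine], milestones=[10])
```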

---

# 🧠 3. Architecture Summary

### Model Variant

| Variant | Params   | (Layers, Dim) |
| ------- | -------- | ------------- |
| TinyMyo | **3.6M** | (8, 192)      |

---

# 🎯 4. Downstream Tasks

TinyMyo generalizes across **gesture classification**, **kinematic regression**, and **speech EMG**, with state-of-the-art or competitive results.

---

## 4.1 Hand Gesture Classification

Evaluated on:

* **Ninapro DB5** (52 classes, 10 subjects)
* **EPN-612** (5 classes, 612 subjects)
* **UCI EMG** (6 classes, 36 subjects)
* **Meta Neuromotor Interface** (9 gestures)

### Preprocessing

* EMG filtering: **20–90 Hz bandpass + 50 Hz notch**
* Window sizes:
  * **200 ms** (best for DB5)
  * **1000 ms** (best for EPN, UCI)

### Linear Classification Head

* Input: **C × 192**
* Params: **&lt;40k**
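
A head within this budget might look as follows; the mean-pooling over tokens is an assumption, since this card only specifies the input shape and the &lt;40k parameter budget.

```python
import torch
import torch.nn as nn

class LinearHead(nn.Module):
    """Linear gesture-classification head on TinyMyo features.

    Pooling strategy is assumed; the card states only a C x 192 input
    and a <40k-parameter linear head.
    """
    def __init__(self, dim=192, n_classes=52):  # 52 classes for Ninapro DB5
        super().__init__()
        self.fc = nn.Linear(dim, n_classes)

    def forward(self, z):                 # z: (batch, tokens, 192)
        return self.fc(z.mean(dim=1))     # mean-pool tokens -> class logits

head = LinearHead()
logits = head(torch.randn(4, 800, 192))  # (4, 52)
```

For 52 classes this uses about 10k parameters, comfortably inside the stated budget.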

### Performance (Fine-tuned)

| Dataset                  | Metric | Result            |
| ------------------------ | ------ | ----------------- |
| **Ninapro DB5** (200 ms) | Acc    | **89.41 ± 0.16%** |
| **EPN-612** (1000 ms)    | Acc    | **96.74 ± 0.09%** |
| **UCI EMG** (1000 ms)    | Acc    | **97.56 ± 0.32%** |
| **Neuromotor**           | CLER   | **0.153 ± 0.006** |

TinyMyo achieves **new state-of-the-art** results on DB5, EPN-612, and UCI.

---
## 4.2 Hand Kinematic Regression (Ninapro DB8)

* Predict **5 joint angles**
* Windows: **200 ms** or **1000 ms**
* Normalization: z-score only

### Regression Head (~788k params)

* Depthwise + pointwise convolutions
* Upsampling
* Global average pooling
* Linear projection to 5 outputs
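
A rough sketch of such a head is shown below; all layer widths and kernel sizes are assumptions and do not reproduce the ~788k parameter count, only the stated structure.

```python
import torch
import torch.nn as nn

class RegressionHead(nn.Module):
    """Kinematic-regression head sketch: depthwise + pointwise convs,
    upsampling, global average pooling, linear projection to 5 joint
    angles. Widths are illustrative, not the paper's configuration."""
    def __init__(self, dim=192, hidden=384, n_joints=5):
        super().__init__()
        self.dw = nn.Conv1d(dim, dim, kernel_size=5, padding=2, groups=dim)  # depthwise
        self.pw = nn.Conv1d(dim, hidden, kernel_size=1)                      # pointwise
        self.up = nn.Upsample(scale_factor=2, mode="linear", align_corners=False)
        self.out = nn.Linear(hidden, n_joints)

    def forward(self, z):               # z: (batch, tokens, dim)
        x = z.transpose(1, 2)           # -> (batch, dim, tokens)
        x = self.up(torch.relu(self.pw(self.dw(x))))
        return self.out(x.mean(dim=-1)) # GAP over time -> (batch, 5)

head = RegressionHead()
angles = head(torch.randn(2, 800, 192))  # (2, 5) predicted joint angles
```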

### Performance

* **MAE = 8.77 ± 0.12°** (1000 ms)

Note: prior works reporting ~6.9° MAE are **subject-specific**; TinyMyo trains a **single cross-subject model**, a significantly harder setting.

---
## 4.3 Speech Production &amp; Recognition (Silent Speech)

Dataset: **Gaddy Silent Speech**
(8 channels, 1000 Hz, face/neck EMG)

### Speech Production (EMG → MFCC → HiFi-GAN → Audio)

Pipeline:

1. Residual downsampling
2. TinyMyo encoder
3. Linear projection → **26-dim MFCC**
4. HiFi-GAN vocoder

**WER:** **33.54 ± 1.12%**, state-of-the-art with **>90% fewer parameters** in the transduction model.

### Speech Recognition (EMG → Text)

* TinyMyo encoder
* Linear projection → **37 characters**
* **CTC** loss
* 4-gram LM + beam search
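
The recognition head reduces to a linear projection trained with CTC. The sketch below uses illustrative sequence lengths and adds the CTC blank symbol on top of the 37 characters; it is not the repository's training code.

```python
import torch
import torch.nn as nn

vocab = 37                                # character vocabulary size
proj = nn.Linear(192, vocab + 1)          # +1 output for the CTC blank
ctc = nn.CTCLoss(blank=vocab, zero_infinity=True)

feats = torch.randn(2, 100, 192)          # (batch, time, dim) encoder output
log_probs = proj(feats).log_softmax(-1).transpose(0, 1)  # CTC wants (time, batch, vocab)
targets = torch.randint(0, vocab, (2, 20))               # dummy character targets
loss = ctc(log_probs, targets,
           input_lengths=torch.full((2,), 100),
           target_lengths=torch.full((2,), 20))
```

At inference, the greedy or beam-searched character posteriors would then be rescored with the 4-gram language model.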

**WER:** **33.95 ± 0.97%**

TinyMyo is EMG-only, unlike multimodal systems such as MONA-LISA.

---
# ⚡ 5. Edge Deployment (GAP9 MCU)

TinyMyo runs efficiently on **GAP9 (RISC-V)** via:

* **INT8 quantization**, including attention
* Multi-level memory streaming (L3 → L2 → L1)
* Integer LayerNorm, GELU, and softmax
* Static memory arena via liveness analysis

### Runtime (DB5 pipeline)

* **Inference time:** **0.785 s**
* **Energy:** **44.91 mJ**
* **Average power:** **57.18 mW**

This is the **first EMG foundation model demonstrated on a microcontroller**.

---

# 📊 6. Results Summary

### Pretraining

* Smooth L1 reconstruction with high fidelity
* Total compute ≈ **4.0 GFLOPs**

### Downstream Highlights

* **DB5:** 89.41%
* **EPN-612:** 96.74%
* **UCI EMG:** 97.56%
* **Neuromotor:** 0.153 CLER
* **DB8 Regression:** MAE 8.77°
* **Silent Speech Production:** 33.54% WER
* **Silent Speech Recognition:** 33.95% WER

TinyMyo matches or exceeds state-of-the-art performance while being smaller and more efficient than all prior EMG foundation models.

---
# 🛠️ Code &amp; Usage

To fine-tune TinyMyo on downstream tasks, follow the examples in the
**[BioFoundation repository](https://github.com/pulp-bio/BioFoundation)**.

```bash
python -u run_train.py +experiment=TinyMyo_finetune \
  pretrained_safetensors_path=/path/to/model.safetensors
```

Environment variables:

* `DATA_PATH`: dataset path
* `CHECKPOINT_DIR`: checkpoint directory to load
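
For example, before launching training you might set (the paths below are placeholders, not defaults from the repository):

```shell
# Placeholder paths; point them at your own data and checkpoint directory.
export DATA_PATH=/data/ninapro_db5
export CHECKPOINT_DIR=/checkpoints/tinymyo_pretrained
```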

---

## 🔗 Resources

- **Code:** https://github.com/pulp-bio/BioFoundation

---

# 📚 Citation

Please cite TinyMyo using:

```bibtex
@misc{fasulo2025tinymyotinyfoundationmodel,
  title={TinyMyo: a Tiny Foundation Model for Flexible EMG Signal Processing at the Edge},
  author={Matteo Fasulo and Giusy Spacone and Thorir Mar Ingolfsson and Yawei Li and Luca Benini and Andrea Cossettini},
  year={2025},
  eprint={2512.15729},
  archivePrefix={arXiv},
  primaryClass={eess.SP},
  url={https://arxiv.org/abs/2512.15729},
}
```

---

# 📧 Contact &amp; Support

* Questions or issues?
  Open an issue on the **BioFoundation GitHub repository**.