Pianist Transformer
Pianist Transformer is a state-of-the-art model for generating expressive, human-like piano performances from musical scores. In subjective listening studies, its renditions were rated statistically indistinguishable from human performances, and were often preferred over them.
This work is based on the paper: Pianist Transformer: Towards Expressive Piano Performance Rendering via Scalable Self-Supervised Pre-Training.
- Paper: https://arxiv.org/abs/2512.02652
- Github: https://github.com/yhj137/PianistTransformer
- Project Page: https://yhj137.github.io/pianist-transformer-demo/
- Try Online Demo: https://huggingface.co/spaces/yhj137/pianist-transformer-rendering
Model Description
Pianist Transformer addresses the data scarcity problem in expressive performance rendering by leveraging a large-scale, self-supervised pre-training strategy. The model first learns the deep principles of musical structure from a massive 10-billion-token MIDI corpus before being fine-tuned for the final rendering task.
It uses an efficient 135M-parameter asymmetric Transformer architecture (10-layer encoder, 2-layer decoder) with sequence compression, enabling it to process long musical contexts while maintaining fast inference speeds.
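To see how the 135M budget is consistent with this encoder-heavy split, here is a rough back-of-the-envelope parameter estimate. Note that the hidden size, FFN ratio, and vocabulary size below are illustrative assumptions, not the authors' actual configuration, and biases, layer norms, and the compression module are ignored:

```python
def estimate_params(d_model, enc_layers, dec_layers, vocab_size, ffn_ratio=4):
    """Rough Transformer parameter estimate (biases and norms ignored)."""
    attn = 4 * d_model * d_model               # Q, K, V, output projections
    ffn = 2 * ffn_ratio * d_model * d_model    # two feed-forward matrices
    enc = enc_layers * (attn + ffn)            # self-attention + FFN per encoder layer
    dec = dec_layers * (2 * attn + ffn)        # self- + cross-attention per decoder layer
    emb = vocab_size * d_model                 # token embeddings (output weights tied)
    return enc + dec + emb

# Assumed hyperparameters: d_model=896, vocab=2048 -> ~124M, the same
# order of magnitude as the reported 135M
print(estimate_params(896, 10, 2, 2048))
```

With ten encoder layers against two decoder layers, most of the capacity sits on the score-reading side, while the short decoder keeps autoregressive generation fast.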
Key Features
- Human-Level Expressivity: Generates nuanced performances that rival and are sometimes preferred over human pianists.
- Scalable Pre-training: Overcomes the limitations of small supervised datasets by learning from a vast, diverse corpus of unlabeled piano music.
- Efficient Architecture: A custom design provides a strong balance between performance quality and real-world inference latency.
- DAW-Friendly Output: Includes a novel post-processing algorithm to convert model output into a standard, fully editable MIDI file with a dynamic tempo map.
How to Use
For detailed instructions on data preparation and inference, please refer to the official GitHub repository; the full pipeline there is required for correct use of the model.
Citation
If you use this model in your work, please cite the original paper:
@misc{you2025pianisttransformerexpressivepiano,
  title={Pianist Transformer: Towards Expressive Piano Performance Rendering via Scalable Self-Supervised Pre-Training},
  author={Hong-Jie You and Jie-Jing Shao and Xiao-Wen Yang and Lin-Han Jia and Lan-Zhe Guo and Yu-Feng Li},
  year={2025},
  eprint={2512.02652},
  archivePrefix={arXiv},
  primaryClass={cs.SD}
}