Pianist Transformer

This is the base model for the Pianist Transformer, pre-trained on a 10-billion-token corpus of diverse piano MIDI data. It has learned a deep, generalized understanding of musical structure, harmony, and rhythm. This model is not fine-tuned for a specific task; it is intended as a starting point for fine-tuning on downstream music-related tasks.

This work is based on the paper: Pianist Transformer: Towards Expressive Piano Performance Rendering via Scalable Self-Supervised Pre-Training.

Model Description

  • Architecture: 135M-parameter asymmetric Encoder-Decoder Transformer.
  • Pre-training Objective: A self-supervised masked denoising task. The model was trained to reconstruct corrupted segments of musical sequences, compelling it to internalize the underlying principles of music without explicit labels (a toy sketch of this objective follows the list).
  • Data: Pre-trained on a 10B-token corpus aggregated from Aria-MIDI, GiantMIDI-Piano, PDMX, and other public datasets.
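
The snippet below is a minimal, illustrative sketch of the masked-denoising objective described above, not the authors' actual training code. The corruption parameters, token ids (`MASK_ID`, `PAD_ID`), and the `model(encoder_input=..., decoder_input=...)` signature are all hypothetical placeholders; see the official repository for the real pre-training pipeline.

```python
# Toy sketch of a masked-denoising pre-training step (hypothetical names throughout).
import torch
import torch.nn.functional as F

MASK_ID = 3  # hypothetical id used to replace corrupted spans
PAD_ID = 0   # hypothetical padding id, ignored by the loss


def corrupt(tokens: torch.Tensor, span_len: int = 8, p: float = 0.3) -> torch.Tensor:
    """Replace random fixed-length spans of each sequence with MASK_ID."""
    corrupted = tokens.clone()
    for b in range(tokens.size(0)):
        for start in range(0, tokens.size(1) - span_len, span_len):
            if torch.rand(()) < p:
                corrupted[b, start:start + span_len] = MASK_ID
    return corrupted


def denoising_step(model, tokens, optimizer):
    """One training step: encode the corrupted sequence, decode the original one."""
    corrupted = corrupt(tokens)
    # `model` is assumed to be an encoder-decoder returning vocabulary logits.
    logits = model(encoder_input=corrupted, decoder_input=tokens[:, :-1])
    loss = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        tokens[:, 1:].reshape(-1),
        ignore_index=PAD_ID,
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```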

How to Use

This model serves as a powerful foundation for the expressive performance rendering task. We encourage you to fine-tune it on your own aligned score-performance dataset to create a high-quality, customized rendering system.

The official GitHub repository provides a complete framework to facilitate this process. It includes:

  • Data pre-processing code to help you prepare your own dataset in the required format.
  • The full fine-tuning script used in our paper.

➡️ Get Started on GitHub
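
If you want to inspect or load the released weights directly before fine-tuning, a minimal sketch is shown below. The checkpoint filename `model.safetensors` and the idea of mapping the state dict onto your own model class are assumptions; the official GitHub repository documents the exact loading and fine-tuning path.

```python
# Minimal sketch: download the base checkpoint and inspect its parameters.
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

ckpt_path = hf_hub_download(
    repo_id="yhj137/pianist-transformer-base",
    filename="model.safetensors",  # assumed filename
)
state_dict = load_file(ckpt_path)

# Check parameter names and shapes before loading them into your fine-tuning model.
for name, tensor in list(state_dict.items())[:5]:
    print(name, tuple(tensor.shape))
```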

Citation

If you use this model in your work, please cite the original paper:

@misc{you2025pianisttransformerexpressivepiano,
      title={Pianist Transformer: Towards Expressive Piano Performance Rendering via Scalable Self-Supervised Pre-Training}, 
      author={Hong-Jie You and Jie-Jing Shao and Xiao-Wen Yang and Lin-Han Jia and Lan-Zhe Guo and Yu-Feng Li},
      year={2025},
      eprint={2512.02652},
      archivePrefix={arXiv},
      primaryClass={cs.SD}
}