Pianist Transformer

This is the base model for the Pianist Transformer, pre-trained on a 10-billion-token corpus of diverse piano MIDI data. It has learned a deep, generalized understanding of musical structure, harmony, and rhythm. This model is not fine-tuned for a specific task; it is intended as a starting point for fine-tuning on downstream music-related tasks.

This work is based on the paper: Pianist Transformer: Towards Expressive Piano Performance Rendering via Scalable Self-Supervised Pre-Training.

Model Description

  • Architecture: 135M-parameter asymmetric Encoder-Decoder Transformer.
  • Pre-training Objective: A self-supervised masked denoising task. The model was trained to reconstruct corrupted segments of musical sequences, compelling it to internalize the underlying principles of music without explicit labels (a toy sketch of this objective follows the list).
  • Data: Pre-trained on a 10B-token corpus aggregated from Aria-MIDI, GiantMIDI-Piano, PDMX, and other public datasets.
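
The snippet below is a minimal, illustrative sketch of the masked-denoising objective described above, not the authors' actual training code. The corruption parameters, token ids (`MASK_ID`, `PAD_ID`), and the `model(encoder_input=..., decoder_input=...)` signature are all hypothetical placeholders; see the official repository for the real pre-training pipeline.

```python
# Toy sketch of a masked-denoising pre-training step (hypothetical names throughout).
import torch
import torch.nn.functional as F

MASK_ID = 3  # hypothetical id used to replace corrupted spans
PAD_ID = 0   # hypothetical padding id, ignored by the loss


def corrupt(tokens: torch.Tensor, span_len: int = 8, p: float = 0.3) -> torch.Tensor:
    """Replace random fixed-length spans of each sequence with MASK_ID."""
    corrupted = tokens.clone()
    for b in range(tokens.size(0)):
        for start in range(0, tokens.size(1) - span_len, span_len):
            if torch.rand(()) < p:
                corrupted[b, start:start + span_len] = MASK_ID
    return corrupted


def denoising_step(model, tokens, optimizer):
    """One training step: encode the corrupted sequence, decode the original one."""
    corrupted = corrupt(tokens)
    # `model` is assumed to be an encoder-decoder returning vocabulary logits.
    logits = model(encoder_input=corrupted, decoder_input=tokens[:, :-1])
    loss = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        tokens[:, 1:].reshape(-1),
        ignore_index=PAD_ID,
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```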

How to Use

This model serves as a powerful foundation for the expressive performance rendering task. We encourage you to fine-tune it on your own aligned score-performance dataset to create a high-quality, customized rendering system.

The official GitHub repository provides a complete framework to facilitate this process. It includes:

  • Data pre-processing code to help you prepare your own dataset in the required format.
  • The full fine-tuning script used in our paper.

➡️ Get Started on GitHub
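
If you want to inspect or load the released weights directly before fine-tuning, a minimal sketch is shown below. The checkpoint filename `model.safetensors` and the idea of mapping the state dict onto your own model class are assumptions; the official GitHub repository documents the exact loading and fine-tuning path.

```python
# Minimal sketch: download the base checkpoint and inspect its parameters.
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

ckpt_path = hf_hub_download(
    repo_id="yhj137/pianist-transformer-base",
    filename="model.safetensors",  # assumed filename
)
state_dict = load_file(ckpt_path)

# Check parameter names and shapes before loading them into your fine-tuning model.
for name, tensor in list(state_dict.items())[:5]:
    print(name, tuple(tensor.shape))
```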

Citation

If you use this model in your work, please cite the original paper:

@misc{you2025pianisttransformerexpressivepiano,
      title={Pianist Transformer: Towards Expressive Piano Performance Rendering via Scalable Self-Supervised Pre-Training}, 
      author={Hong-Jie You and Jie-Jing Shao and Xiao-Wen Yang and Lin-Han Jia and Lan-Zhe Guo and Yu-Feng Li},
      year={2025},
      eprint={2512.02652},
      archivePrefix={arXiv},
      primaryClass={cs.SD}
}