---
datasets:
  - d3LLM/Ling-Coder-dParallel-merged-512-120k
base_model: Dream-org/Dream-Coder-v0-Instruct-7B
pipeline_tag: text-generation
library_name: transformers
license: apache-2.0
tags:
  - diffusion
  - fast-inference
  - d3llm
---

# d3LLM: Ultra-Fast Diffusion LLM using Pseudo-Trajectory Distillation 🚀

**d3LLM-Dream-Coder** is an ultra-fast diffusion language model introduced in the paper *d3LLM: Ultra-Fast Diffusion LLM using Pseudo-Trajectory Distillation*. It is built on `Dream-org/Dream-Coder-v0-Instruct-7B`.

## Model Description

d3LLM (pseuDo-Distilled Diffusion Large Language Model) is a framework designed to strike a balance between accuracy and parallelism in diffusion LLMs. It achieves up to 10× speedup over vanilla diffusion models like LLaDA/Dream and 5× speedup over autoregressive (AR) models.

The model utilizes two primary innovations:

- **Pseudo-Trajectory Distillation:** a training method that teaches the model which tokens can be decoded confidently at early steps.
- **Entropy-Based Multi-Block Decoding:** an inference strategy using a KV-cache refresh mechanism to maintain accuracy while maximizing parallelism (see the sketch after this list).
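To make the decoding idea concrete, below is a minimal, illustrative sketch of entropy-thresholded parallel unmasking within one block. This is **not** the authors' implementation; the function name, threshold value, and fallback rule are assumptions for illustration only.

```python
# Illustrative sketch: commit all masked positions whose predictive entropy
# is below a threshold in the same step, so "easy" tokens decode in parallel
# while uncertain ones wait for later steps.
import torch
import torch.nn.functional as F

def entropy_parallel_unmask(logits: torch.Tensor,
                            still_masked: torch.Tensor,
                            entropy_threshold: float = 0.5):
    """logits: [block_len, vocab]; still_masked: [block_len] bool.
    Returns (greedy token ids, positions decoded this step)."""
    probs = F.softmax(logits, dim=-1)
    entropy = -(probs * torch.log(probs.clamp_min(1e-9))).sum(dim=-1)
    confident = (entropy < entropy_threshold) & still_masked
    # Guarantee progress: if no position is confident enough, decode the
    # single lowest-entropy masked position instead.
    if not confident.any():
        masked_idx = still_masked.nonzero(as_tuple=True)[0]
        best = masked_idx[entropy[masked_idx].argmin()]
        confident[best] = True
    token_ids = probs.argmax(dim=-1)
    return token_ids, confident

# Toy usage with random logits standing in for real model outputs.
logits = torch.randn(8, 32000)
still_masked = torch.ones(8, dtype=torch.bool)
tokens, decoded = entropy_parallel_unmask(logits, still_masked)
print(decoded.sum().item(), "tokens decoded this step")
```

In the released model, this per-step decision additionally interacts with the multi-block schedule and the KV-cache refresh mechanism; the actual decoding loop is in the official repository.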

## Resources

## Usage

For detailed usage instructions, evaluation scripts, and training code, please refer to the official GitHub repository. Since the model uses a custom architecture, ensure you have `transformers==4.49.0` installed and pass `trust_remote_code=True` when loading the model.
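A minimal loading sketch, assuming the standard `transformers` API: the repository id below is a placeholder, and the generation entry point is supplied by the model's remote code, so consult the GitHub repository for the supported generation call.

```python
# Minimal loading sketch (repo id is assumed; adjust to the actual model id).
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "d3LLM/d3LLM-Dream-Coder"  # placeholder, not verified

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModel.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
).eval()
```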

## Citation

```bibtex
@article{arxiv'26:d3llm,
  title   = {d3LLM: Ultra-Fast Diffusion LLM using Pseudo-Trajectory Distillation},
  author  = {Yu-Yang Qian and Junda Su and Lanxiang Hu and Peiyuan Zhang and Zhijie Deng and Peng Zhao and Hao Zhang},
  journal = {ArXiv preprint},
  volume  = {arXiv:2601.07568},
  year    = {2026}
}
```