Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context Paper • 1901.02860 • Published Jan 9, 2019
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention Paper • 2404.07143 • Published Apr 10, 2024
Fine-Tuning Your First Large Language Model (LLM) with PyTorch and Hugging Face Article • Feb 11
The Pile: An 800GB Dataset of Diverse Text for Language Modeling Paper • 2101.00027 • Published Dec 31, 2020
Direct Preference Optimization: Your Language Model is Secretly a Reward Model Paper • 2305.18290 • Published May 29, 2023
Making LLMs Smaller Without Breaking Them: A GLU-Aware Pruning Approach Article • Nov 24, 2024