Massively Multilingual Adaptation of Large Language Models Using Bilingual Translation Data Paper • 2506.00469 • Published May 31 • 4
Scaling Low-Resource MT via Synthetic Data Generation with LLMs Paper • 2505.14423 • Published May 20 • 2
Introducing MTEB v2: Evaluation of embedding and retrieval systems for more than just text Article • Oct 20 • 33
Fine-Tune Wav2Vec2 for English ASR in Hugging Face with 🤗 Transformers Article • Mar 12, 2021 • 37
Distilling LLM Agent into Small Models with Retrieval and Code Tools Paper • 2505.17612 • Published May 23 • 81
Fixing Data That Hurts Performance: Cascading LLMs to Relabel Hard Negatives for Robust Information Retrieval Paper • 2505.16967 • Published May 22 • 24
🥬 LettuceDetect Goes Multilingual: Fine-tuning EuroBERT on Synthetic Translations Article • May 19 • 9
Train 400x faster Static Embedding Models with Sentence Transformers Article • Jan 15 • 219
Model2Vec: Distill a Small Fast Model from any Sentence Transformer Article • Oct 14, 2024 • 99
LLM-R2: A Large Language Model Enhanced Rule-based Rewrite System for Boosting Query Efficiency Paper • 2404.12872 • Published Apr 19, 2024 • 11