Fine-tuning LLMs to 1.58bit: extreme quantization made easy Article • Published Sep 18, 2024 • 272
Hogwild! Inference: Parallel LLM Generation via Concurrent Attention Paper • 2504.06261 • Published Apr 8 • 110
VAPO: Efficient and Reliable Reinforcement Learning for Advanced Reasoning Tasks Paper • 2504.05118 • Published Apr 7 • 26
Qwen2.5-Coder Collection Code-specific model series based on Qwen2.5 • 40 items • Updated Jul 21 • 348
BASS: Batched Attention-optimized Speculative Sampling Paper • 2404.15778 • Published Apr 24, 2024 • 11
ChatEDA: A Large Language Model Powered Autonomous Agent for EDA Paper • 2308.10204 • Published Aug 20, 2023 • 1
Microsoft Research Papers Collection #PapersToRead from Microsoft Research in the broad space of Generative AI, Multi-agent systems, responsible AI practices, LLM Ops, and language models • 20 items • Updated Jun 26, 2024 • 5
Papers Collection Large Language Model (LLM) and NLP related papers. • 334 items • Updated about 3 hours ago • 13
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27, 2024 • 626
LLM Augmented LLMs: Expanding Capabilities through Composition Paper • 2401.02412 • Published Jan 4, 2024 • 38