FastKV: Decoupling of Context Reduction and KV Cache Compression for Prefill-Decoding Acceleration Paper • 2502.01068 • Published Feb 3, 2025
Mixture of Scales: Memory-Efficient Token-Adaptive Binarization for Large Language Models Paper • 2406.12311 • Published Jun 18, 2024