inference-optimization's collections (3)

Mixed Precision Models
Collection of Mixed Precision LLaMA and Qwen Models

KV Cache Quantization
Collection on FP8 Quantization of Weights, Activations and KV Cache