Apply for community grant: Academic project (gpu)
Cache-to-Cache (C2C) enables Large Language Models to communicate directly through their KV-Caches, bypassing text generation. By projecting and fusing KV-Caches between models, C2C achieves 8.5–10.5% higher accuracy than individual models and 3.0–5.0% better performance than text-based communication, with a 2.0× speedup in latency.
It has attracted considerable attention on X: https://x.com/jiqizhixin/status/1985219136000299215
Hi @fuvty, we've assigned ZeroGPU to this Space. Please check the compatibility and usage sections of this page so your Space can run on ZeroGPU.
If you can, we ask that you upgrade to Pro ($9/month) to enjoy higher ZeroGPU quota and other features like Dev Mode, Private Storage, and more: hf.co/pro
Thank you so much for your generous help! I'm sorry that I accidentally reset the GPU earlier.