"CacheGen: KV Cache Compression and Streaming for Fast Language Model Serving" was accepted at SIGCOMM'24.

Congratulations to Yuhan et al.