CacheGen: KV Cache Compression and Streaming for Fast Language Model Serving accepted at SIGCOMM'24

May 7, 2024

Congratulations to Yuhan et al.