Junchen's Lab
Qizheng Zhang
Latest
CacheGen: KV Cache Compression and Streaming for Fast Language Model Serving
CacheBlend: Fast Large Language Model Serving for RAG with Cached Knowledge Fusion
Automatic and Efficient Customization of Neural Networks for ML Applications
GRACE: Loss-Resilient Real-Time Video through Neural Codecs
OneAdapt: Fast Adaptation for Deep Learning Applications via Backpropagation
AccMPEG: Optimizing Video Encoding for Video Analytics