KVarN: Native vLLM backend for KV-cache quantization by Huawei

51 points | by theanonymousone 2 hours ago

7 comments