Google's TurboQuant AI-compression algorithm can reduce LLM memory usage by 6x

8 points | by geoffbp 16 hours ago

6 comments