Quantized Llama models with increased speed and a reduced memory footprint

507 points | by egnehots 6 days ago

120 comments