Bluffbench is near saturation: LLMs can interpret counterintuitive plots

2 points | by ionychal 14 hours ago
