Reward hacking is swamping model intelligence gains

3 points | by matt_d 8 hours ago