Tree Search Distillation for Language Models Using PPO

86 points | by at2005 2 days ago

14 comments