Adaptive speculative decoding: picking draft lengths at runtime

2 points | by hasheddan 8 hours ago
