RL on Incorrect Synthetic Data Scales the Efficiency of LLM Math Reasoning by Eight-Fold, Setlur et al. 2024

by /u/StartledWatermelon in /r/mlscaling

Upvotes: 17
