/u/StartledWatermelon's posts

Large Language Monkeys: Scaling Inference Compute with Repeated Sampling, Brown et al. 2024 [Given a sufficient number of attempts, smaller models can reach parity with larger models in solving tasks. The Pareto frontier for compute cost varies from task to task]

25 upvotes • r/mlscaling

Human-like Episodic Memory for Infinite Context LLMs, Fountas et al. 2024

18 upvotes • r/mlscaling

RL on Incorrect Synthetic Data Scales the Efficiency of LLM Math Reasoning by Eight-Fold, Setlur et al. 2024

17 upvotes • r/mlscaling

[R] How Well Can a Long Sequence Model Model Long Sequences? Comparing Architectural Inductive Biases on Long-Context Abilities

6 upvotes • r/MachineLearning
