Built RL for long-horizon agents – tested on 32x H100s but too poor to train

Article URL: https://github.com/Danau5tin/terminal-bench-rl

Comments URL: https://news.ycombinator.com/item?id=44721791

Points: 31

# Comments: 2

Created 11h ago | 29 Jul 2025, 12:20:15

Other posts in this group

The Making of Dario Amodei

Article URL: https://www.bigtechnology.com/p/the-making-of-dario-amodei

Comments URL:

29 Jul 2025, 21:40:18 | Hacker News

How the brain increases blood flow on demand

Article URL: https://hms.harvard.edu/news/how-brain-increases-blood-flow-demand

Comments URL:

29 Jul 2025, 21:40:12 | Hacker News

Maru OS – Your phone is your PC

Article URL: https://maruos.com/

Comments URL: https://news.ycombinator.com/item?id=44727298

29 Jul 2025, 21:40:11 | Hacker News

More honey bees dying, even as antibiotic use halves

Article URL: https://news.uoguelph.ca/2025/07/more-honey-bees-dying-even-as-antibiotic-use-halves/

29 Jul 2025, 21:40:10 | Hacker News

Supervised Fine Tuning on Curated Data is Reinforcement Learning

Article URL: https://arxiv.org/abs/2507.12856

Comments URL: https://news.ycombinator.c

29 Jul 2025, 21:40:10 | Hacker News

RIP Shunsaku Tamiya, the man who made plastic model kits a global obsession

Article URL: https://JapaneseNostalgicCar.com/rip-shunsaku-tamiya-plastic-model-kits/

Comments URL:

29 Jul 2025, 21:40:08 | Hacker News

CodeCrafters (YC S22) is hiring first Marketing Person

Article URL: https://www.ycombinator.com/companies/codecrafters/jobs/7ATipKJ-1st-marketing-hire

29 Jul 2025, 21:40:05 | Hacker News
