Supervised Fine Tuning on Curated Data is Reinforcement Learning

Article URL: https://arxiv.org/abs/2507.12856

Comments URL: https://news.ycombinator.com/item?id=44727788

Points: 13

# Comments: 4

Created 3d ago | 29.07.2025, 21:40:10

Other posts in this group

Native Sparse Attention

Was submitted as "DeepSeek won the best paper award at ACL 2025"

Here is the awards page:

02.08.2025, 04:30:13 | Hacker news

Hardening mode for the compiler

Article URL: https://discourse.llvm.org/t/rfc-hardening-mode-for-the-compiler/87660

Comments URL:

02.08.2025, 04:30:12 | Hacker news

Peak Energy just shipped the US's first grid-scale sodium-ion battery

Article URL: https://electrek.co/2025/07/30/peak-energy-us-first-grid-scale-sodium-ion-battery/

02.08.2025, 04:30:10 | Hacker news

Robert Wilson has died

https://www.nytimes.com/2025/07/31/theater/robert-wilson-dea... (

02.08.2025, 04:30:07 | Hacker news

Meta violated privacy law, jury says in menstrual data fight

Article URL: https://www.courthousenews.com/meta-violated-privacy-law-jury-says-in-menstrual-d

02.08.2025, 02:20:06 | Hacker news

Contrarian climate assessment from U.S. government draws pushback

Article URL: https://www.science.org/content/article/contrarian-climate-assessme

02.08.2025, 02:20:04 | Hacker news

The Rickover Corpus: A digital archive of Admiral Rickover's speeches and memos

Article URL: https://rickovercorpus.org/

Comments URL: https://news.ycombinator.com/item?id

02.08.2025, 02:20:03 | Hacker news
