ETH Zurich and EPFL to release an LLM developed on public infrastructure


Other posts in this group

jank is C++
11 Jul 2025, 20:10:34 | Hacker News
Show HN: RULER – Easily apply RL to any agent

Hey HN, Kyle here, one of the co-founders of OpenPipe.

Reinforcement learning is one of the best techniques for making agents more reliable, and has been widely adopted by frontier labs. However

11 Jul 2025, 20:10:33 | Hacker News