Article URL: https://github.com/ollama/ollama/issues/3185
Comments URL: https://news.ycombinator.com/item?id=44003741
Points: 104
# Comments: 37
Created 10h ago | 16 May 2025, 15:10:09
Other posts in this group

Hey everyone!
Over the past two years I threw myself back into full-time engineering with a simple goal: write code that gives back to the community. After a lot of late-night FOMO (“AI w


Article URL: https://cacm.acm.org/news/the-collapse-of-gpt/

I discovered that in LLM inference, keys and values in the KV cache have very different quantization sensitivities. Keys need higher precision than values to maintain quality.
I patched llama.cp
Article URL: https://clojurescript.org/news/2025-05-16-release
