Article URL: https://developers.googleblog.com/en/start-building-with-gemini-25-flash/
Comments URL: https://news.ycombinator.com/item?id=43720845
Points: 123
# Comments: 51
Created 29d ago | Apr 17, 2025, 20:10:08
Other posts in this group

I discovered that in LLM inference, keys and values in the KV cache have very different quantization sensitivities. Keys need higher precision than values to maintain quality.
I patched llama.cp
Article URL: https://clojurescript.org/news/2025-05-16-release




Article URL: https://bobacollection.staxmuseum.org/
Comments URL: https://news.y