Article URL: https://www.library.hbs.edu/working-knowledge/you-re-right-you-are-working-longer-and-attending-more-meetings
Comments URL: https://news.ycombinator.com/item?id=44003449
Points: 21
# Comments: 14
Created 11h ago | 16 May 2025, 12:40:17
Other posts in this group.

I discovered that in LLM inference, keys and values in the KV cache have very different quantization sensitivities. Keys need higher precision than values to maintain quality.
I patched llama.cp
Article URL: https://clojurescript.org/news/2025-05-16-release




Article URL: https://bobacollection.staxmuseum.org/
Comments URL: https://news.y