Created Jul 13, 2020, 2:32:04 PM

Hey everyone!
Over the past two years I threw myself back into full-time engineering with a simple goal: write code that gives back to the community. After a lot of late-night FOMO (“AI w


Article URL: https://cacm.acm.org/news/the-collapse-of-gpt/

I discovered that in LLM inference, keys and values in the KV cache have very different quantization sensitivities. Keys need higher precision than values to maintain quality.
I patched llama.cpp
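The asymmetry described above can be sketched in a few lines of numpy. This is an illustrative toy, not the actual llama.cpp patch: it keeps keys at 8-bit and values at 4-bit using simple per-tensor symmetric quantization, and measures the attention-output error against full precision.

```python
import numpy as np

def quantize(x, bits):
    # Per-tensor symmetric quantization: round to a signed integer grid
    # of the given bit width, then dequantize back to float.
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(x).max() / qmax
    return np.round(x / scale) * scale

rng = np.random.default_rng(0)
d, n = 64, 128                       # head dim, cached sequence length
q = rng.standard_normal((1, d))      # one query vector
K = rng.standard_normal((n, d))      # cached keys
V = rng.standard_normal((n, d))      # cached values

def attn(K_, V_):
    # Single-query softmax attention over the cache.
    s = (q @ K_.T) / np.sqrt(d)
    w = np.exp(s - s.max())
    w /= w.sum()
    return w @ V_

ref = attn(K, V)
# Mixed-precision cache: higher precision for keys than for values.
err = np.abs(attn(quantize(K, 8), quantize(V, 4)) - ref).mean()
print(f"mean abs error with 8-bit K / 4-bit V: {err:.4f}")
```

The intuition for the split: key error propagates through the softmax, where it can reshuffle attention weights, while value error only enters linearly in the weighted sum, so values tolerate coarser quantization.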
Article URL: https://clojurescript.org/news/2025-05-16-release
