X X^t can be faster

Created May 16, 2025, 5:20:12 PM


