X X^t can be faster

Created 9h ago | 16.05.2025, 17:20:12



Other posts in this group

Show HN: Solidis – Tiny TS Redis client, no deps, for serverless

Hey everyone!

Over the past two years I threw myself back into full-time engineering with a simple goal: write code that gives back to the community. After a lot of late-night FOMO (“AI w

17.05.2025, 00:20:10 | Hacker News

Show HN: KVSplit – Run 2-3x longer contexts on Apple Silicon

I discovered that in LLM inference, keys and values in the KV cache have very different quantization sensitivities. Keys need higher precision than values to maintain quality.

I patched llama.cp

16.05.2025, 21:50:10 | Hacker News