Show HN: We made our own inference engine for Apple Silicon

We wrote our inference engine in Rust; it is faster than llama.cpp in every use case we have tested. Your feedback is very welcome. It is written from scratch with the idea that you can add support for any kernel and platform.
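A minimal sketch of what a pluggable kernel/platform design can look like in Rust (this is a hypothetical illustration, not uzu's actual API): each backend implements a shared trait, and the engine dispatches to whichever backend is available on the current device.

    // Hypothetical sketch of a pluggable-backend design (not uzu's actual API).
    // Each compute platform (CPU, Metal, ...) implements a common Kernel trait.
    trait Kernel {
        fn name(&self) -> &str;
        // Naive matmul signature used only for illustration: C (m x n) = A (m x k) * B (k x n).
        fn matmul(&self, a: &[f32], b: &[f32], m: usize, k: usize, n: usize) -> Vec<f32>;
    }

    struct CpuKernel;

    impl Kernel for CpuKernel {
        fn name(&self) -> &str { "cpu" }
        fn matmul(&self, a: &[f32], b: &[f32], m: usize, k: usize, n: usize) -> Vec<f32> {
            let mut out = vec![0.0f32; m * n];
            for i in 0..m {
                for j in 0..n {
                    for p in 0..k {
                        out[i * n + j] += a[i * k + p] * b[p * n + j];
                    }
                }
            }
            out
        }
    }

    fn main() {
        // The engine would select a backend at runtime (e.g. a Metal kernel on Apple Silicon).
        let backend: Box<dyn Kernel> = Box::new(CpuKernel);
        let c = backend.matmul(&[1.0, 2.0, 3.0, 4.0], &[5.0, 6.0, 7.0, 8.0], 2, 2, 2);
        println!("{} -> {:?}", backend.name(), c);
    }

Adding a new platform then means implementing the trait for that backend rather than touching the engine's core loop.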


Comments URL: https://news.ycombinator.com/item?id=44570048

Points: 72

# Comments: 23

https://github.com/trymirai/uzu

Ask HN: Is it time to fork HN into AI/LLM and "Everything else/other?"

I would very much like to enjoy HN the way I did years ago, as a place where I'd discover things that I never otherwise would have come across.

The increasing AI/LLM domination of the site has made it much less appealing to me.


Comments URL: https://news.ycombinator.com/item?id=44571740

Points: 99

# Comments: 86

https://news.ycombinator.com/item?id=44571740

Ask HN: What's Your Useful Local LLM Stack?

What I’m asking HN:

What does your actually useful local LLM stack look like?

I’m looking for something that provides you with real value — not just a sexy demo.

---

After a recent internet outage, I realized I need a local LLM setup as a backup — not just for experimentation and fun.

My daily (remote) LLM stack:

  - Claude Max ($100/mo): My go-to for pair programming. Heavy user of both the Claude web and desktop clients.
  - Windsurf Pro ($15/mo): Love the multi-lin
