Show HN: AutoThink – Boosts local LLM performance with adaptive reasoning

I built AutoThink, a technique that makes local LLMs reason more efficiently by adaptively allocating computational resources based on query complexity.

The core idea: instead of giving every query the same "thinking time," classify each query as HIGH or LOW complexity and allocate thinking tokens accordingly. High-complexity queries get 70-90% of tokens, while simple queries get 20-40%.
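To make the allocation concrete, here is a minimal sketch. It assumes the percentages apply to a fixed `max_tokens` budget, and the keyword-based `classify_complexity` is a hypothetical stand-in for a real classifier; none of these names come from the AutoThink code.

```python
import re

# Thinking-token share per complexity class, in percent (ranges from the post).
THINKING_SHARE_PCT = {"HIGH": (70, 90), "LOW": (20, 40)}

# Hypothetical cue words; a real system would use a trained classifier instead.
HIGH_CUES = {"prove", "derive", "explain", "why", "compare", "optimize"}


def classify_complexity(query: str) -> str:
    """Toy heuristic: long queries or reasoning cue words count as HIGH."""
    words = re.findall(r"[a-z]+", query.lower())
    if len(words) > 40 or HIGH_CUES.intersection(words):
        return "HIGH"
    return "LOW"


def thinking_budget(query: str, max_tokens: int) -> tuple[int, int]:
    """Return the (min, max) number of thinking tokens for this query."""
    low_pct, high_pct = THINKING_SHARE_PCT[classify_complexity(query)]
    # Integer arithmetic keeps the token counts exact.
    return max_tokens * low_pct // 100, max_tokens * high_pct // 100
```

With a 1000-token budget, a query classified HIGH would get between 700 and 900 thinking tokens, and one classified LOW between 200 and 400.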

I also implemented steering vectors derived from Pivotal Token Search (originally from Microsoft's Phi-4 paper) that guide the model's reasoning patterns during generation. These vectors encourage behaviors like numerical accuracy, self-correction, and thorough exploration.
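The post does not spell out how the vectors are applied, so the following is only a sketch of the common additive form of activation steering: a vector is scaled and added to a layer's hidden state during generation. The function name, the `strength` parameter, and the use of plain Python lists are all assumptions for illustration.

```python
def apply_steering(hidden: list[float], vector: list[float], strength: float = 1.0) -> list[float]:
    """Return hidden + strength * vector, element by element.

    Assumed formulation: in a real model this would run inside a forward
    hook on a chosen layer, on tensors rather than lists.
    """
    if len(hidden) != len(vector):
        raise ValueError("hidden state and steering vector must have the same size")
    return [h + strength * v for h, v in zip(hidden, vector)]
```

A larger `strength` pushes the hidden state further in the direction of the encouraged behavior, and `strength=0` leaves the model unchanged.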

Results on DeepSeek-R1-Distill-Qwen-1.5B:

- GPQA-Diamond: 31.06% vs 21.72% baseline (+43% relative improvement)

- MMLU-Pro: 26.38% vs 25.58% baseline

- Uses fewer tokens than baseline approaches
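For reference, the relative improvement figures follow from the reported scores. The helper below is just arithmetic on the numbers above, not part of the AutoThink code:

```python
def relative_improvement(score: float, baseline: float) -> float:
    """Percentage gain of score over baseline, relative to the baseline."""
    return (score - baseline) / baseline * 100


gpqa = relative_improvement(31.06, 21.72)      # about 43%
mmlu_pro = relative_improvement(26.38, 25.58)  # about 3.1%
```

In absolute terms that is 9.34 percentage points on GPQA-Diamond and 0.80 percentage points on MMLU-Pro.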

Works with any local reasoning model: DeepSeek, Qwen, or custom fine-tuned models. No API dependencies.

The technique builds on two things I developed: an adaptive classification framework that can learn new complexity categories without retraining, and an open source implementation of Pivotal Token Search.

Technical paper: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5253327

Code and examples: https://github.com/codelion/optillm/tree/main/optillm/autoth...

PTS implementation: https://github.com/codelion/pts

I'm curious about your thoughts on adaptive resource allocation for AI reasoning. Have you tried similar approaches with your local models?

Comments URL: https://news.ycombinator.com/item?id=44112326

Points: 127

# Comments: 11

Created 1d | 28 May 2025, 05:20:19
