Ask HN: How can ChatGPT serve 700M users when I can't run one GPT-4 locally?

Sam said yesterday that ChatGPT handles ~700M weekly users. Meanwhile, I can't even run a single GPT-4-class model locally without insane VRAM or painfully slow speeds.
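For a rough sense of the VRAM problem, here is a back-of-envelope sketch. The parameter counts are placeholders picked for illustration (GPT-4's actual size has not been published), and the function name `weight_memory_gb` is made up for this example. It counts weights only and ignores KV cache and activations, so real requirements would be higher.

```python
# Back-of-envelope: memory needed just to hold model weights.
# Parameter counts below are hypothetical placeholders, not published figures.

BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Return weight storage in GB (1 GB = 1e9 bytes), weights only."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    for params in (70e9, 500e9, 1e12):  # assumed model sizes
        for dtype in BYTES_PER_PARAM:
            gb = weight_memory_gb(params, dtype)
            print(f"{params / 1e9:>6.0f}B params @ {dtype}: {gb:>7.0f} GB")
```

Under these assumptions, even a 70B-parameter model at fp16 needs about 140 GB for weights alone, far more than a 24 GB consumer card, so a local run ends up either heavily quantized or offloaded to slower memory.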

Sure, they have huge GPU clusters, but there must be more going on - model optimizations, sharding, custom hardware, clever load balancing, etc.

What engineering tricks make this possible at such massive scale while keeping latency low?

Curious to hear insights from people who've built large-scale ML systems.

Comments URL: https://news.ycombinator.com/item?id=44840728

Points: 51

# Comments: 36


Created 6d ago | Aug 8, 2025, 20:30:11

