Ask HN: How can ChatGPT serve 700M users when I can't run one GPT-4 locally?

Sam said yesterday that ChatGPT handles ~700M weekly users. Meanwhile, I can't even run a single GPT-4-class model locally without insane VRAM or painfully slow speeds.

Sure, they have huge GPU clusters, but there must be more going on: model optimizations, sharding, custom hardware, clever load balancing, etc.

What engineering tricks make this possible at such massive scale while keeping latency low?

Curious to hear insights from people who've built large-scale ML systems.

Comments URL: https://news.ycombinator.com/item?id=44840728

Points: 51

Comments: 36

Created 8h | Aug 8, 2025, 8:30:11 PM
