From GPT-4 to GPT-5: Measuring Progress in Medical Language Understanding [pdf]

I recently ran a thorough healthcare eval on GPT-5. The results show a slight regression in GPT-5's performance compared to GPT-4-era models.

I found this interesting. The detailed results are here: https://www.fertrevino.com/docs/gpt5_medhelm.pdf


Comments URL: https://news.ycombinator.com/item?id=44979107

Points: 54

# Comments: 25


Posted 1d ago | 22 August 2025, 02:10:09


