I recently ran a thorough healthcare eval on GPT-5. The results show a slight regression in GPT-5 performance compared to GPT-4-era models.
I found this interesting. Here are the detailed results: https://www.fertrevino.com/docs/gpt5_medhelm.pdf
Comments URL: https://news.ycombinator.com/item?id=44979107
Points: 54
# Comments: 25
