Would you board a plane safety-tested by GenAI?

Ben and Ryan are joined by Robin Gupta for a conversation about benchmarking and testing AI systems. They talk through the lack of trust and confidence in AI, the inherent challenges of nondeterministic systems, the role of human verification, and whether we can (or should) expect an AI to be reliable. https://stackoverflow.blog/2024/05/24/would-you-board-a-plane-safety-tested-by-genai/

Created May 24, 2024, 5:50:06 AM



Other posts in this group

The world’s most popular web framework is going AI native

On today’s episode we chat with Jared Palmer, VP of AI at Vercel, who says the company has three key goals. First, support AI native web apps like ChatGPT and Claude. Second, use GenAI to make it easi

Jun 14, 2024, 5:40:04 AM | StackOverflow blog
A peek behind the curtain with Stack Overflow’s sales engineers

In this episode, Alexa Montelibano and Tiago Torre, sales engineers at Stack Overflow, take you behind the scenes to show how customer feedback shapes our products, including OverflowAI. Alexa and Tia

Jun 11, 2024, 8:40:06 PM | StackOverflow blog
This startup uses a team of AI agents to write and review their pull requests

In this episode we chat with Saumil Patel, co-founder and CEO of Squire AI. The company uses an agentic workflow to automatically review your code, write your pull requests, and even review and provid

Jun 7, 2024, 2:10:07 PM | StackOverflow blog
Breaking up is hard to do: Chunking in RAG applications

A look at some of the current thinking around chunking data for retrieval-augmented generation (RAG) systems. https://stackoverflow.blog/2024/06/06/breaking-up-is-hard-to-do-chunking-in-rag-applicatio

Jun 6, 2024, 1:10:05 PM | StackOverflow blog