Stack Overflow blog

How do you evaluate an LLM? Try an LLM.

On this episode: Stack Overflow senior data scientist Michael Geden tells Ryan and Ben how data scientists evaluate large language models (LLMs) and their output. They cover the challenges involved in evaluating LLMs, how LLMs are being used to evaluate other LLMs, the importance of data validation, the need for human raters, and the tradeoffs involved in selecting and fine-tuning LLMs. https://stackoverflow.blog/2024/04/16/how-do-you-evaluate-an-llm-try-an-llm/

Diverting more backdoor disasters

In the wake of the XZ backdoor, Ben and Ryan unpack the security implications of relying on open-source software projects maintained by small teams. They also discuss the open-source nature of Linux, the high cost of education in the US, the value of open-source contributions for job seekers, and what Apple is up to AI-wise. https://stackoverflow.blog/2024/04/12/diverting-more-backdoor-disasters/

Climbing the GenAI decision tree

In this sponsored episode, Ben and Ryan are joined by Ria Cheruvu, an AI evangelist at Intel, to discuss the different approaches to incorporating AI models into organizations. https://stackoverflow.blog/2024/04/10/climbing-the-genai-decision-tree/

Want to be a great software engineer? Don’t be a jerk.

The home team convenes to discuss the XZ backdoor attack, what great software engineers have in common, how GenAI is changing the face of drug development, and the rise of managed service providers for AI. https://stackoverflow.blog/2024/04/09/want-to-be-a-great-software-engineer-don-t-be-a-jerk/

What a year building AI has taught Stack Overflow

We sit down with Jessica Clark, a senior data scientist at Stack Overflow, to discuss how our company approaches generative AI and data quality. https://stackoverflow.blog/2024/04/05/what-a-year-building-ai-has-taught-stack-overflow/

Are long context windows the end of RAG?

The home team is joined by Michael Foree, Stack Overflow’s director of data science and data platform, and occasional cohost Cassidy Williams, CTO at Contenda, for a conversation about long context windows, retrieval-augmented generation, and how Databricks’ new open LLM could change the game for developers. Plus: How will FTX co-founder Sam Bankman-Fried’s sentence of 25 years in prison reverberate in the blockchain and crypto spaces? https://stackoverflow.blog/2024/04/02/are-long-context-windo

Will antitrust suits benefit developers?

Ben and Ryan talk about how tiny nations are making huge money from their domain names, the US government’s antitrust case against Apple, the implications of a four-day work week, Reddit’s IPO, and more. https://stackoverflow.blog/2024/03/29/will-antitrust-suits-benefit-developers/

