AI News Monitor

OpenAI Launches HealthBench to Evaluate AI Models in Healthcare

OpenAI has unveiled HealthBench, a new benchmark designed to assess AI models in healthcare using real-world applicability and physician judgment. The benchmark includes 5,000 simulated health conversations, each evaluated against physician-created rubrics to ensure accuracy and relevance. Developed in collaboration with 262 physicians across 60 countries, HealthBench spans 49 languages and 26 medical specialties. Each…

Will Chester

May 17, 2025

1–2 minutes

AI analysis, artificial intelligence, HealthBench, medical AI, OpenAI

OpenAI has unveiled HealthBench, a new benchmark designed to assess AI models in healthcare using real-world applicability and physician judgment. The benchmark includes 5,000 simulated health conversations, each evaluated against physician-created rubrics to ensure accuracy and relevance.

Developed in collaboration with 262 physicians across 60 countries, HealthBench spans 49 languages and 26 medical specialties. Each model response is graded based on criteria such as expertise-tailored communication, emergency referrals, and response depth, with evaluations conducted using GPT-4.1.

“Our findings show that large language models have improved significantly and already outperform experts in writing responses,” OpenAI stated. However, the company noted room for improvement, particularly in context-seeking and reliability.

HealthBench is now publicly available on GitHub, aligning with OpenAI’s broader efforts to advance AI in high-impact fields like healthcare.

The Larger Trend: Project Stargate Faces Delays

The announcement comes as Project Stargate, a $500 billion AI infrastructure initiative involving OpenAI CEO Sam Altman, Oracle’s Larry Ellison, and SoftBank’s Masayoshi Son, encounters setbacks.

Originally touted as a potential game-changer for healthcare—including AI-driven cancer vaccines—the project is now facing delays due to economic uncertainty and U.S. tariffs on critical tech components. SoftBank’s pledged $100 billion investment has yet to materialize, with financing discussions still pending.

Despite challenges, OpenAI remains focused on AI-driven healthcare innovation, with HealthBench serving as a key step toward ensuring real-world benefits.

2 responses to “OpenAI Launches HealthBench to Evaluate AI Models in Healthcare”

Judge Sides With Meta in Authors’ Copyright Lawsuit Over AI Training – AI News Monitor

June 26, 2025 at 12:56 pm

[…] lawsuit is one of several targeting AI companies, including OpenAI and Microsoft, as the industry grapples with how to balance innovation with intellectual property […]

Reply
Meta to Invest Hundreds of Billions in AI Superclusters, Zuckerberg Announces – AI News Monitor

July 14, 2025 at 1:20 pm

[…] announcement underscores Meta’s aggressive push to catch up to rivals like OpenAI and Google in the AI arms race. After facing criticism for the tepid reception of its Llama 4 models […]

Reply

AI News Monitor

Leave a comment Cancel reply

In Middle East, AI Adoption Surges, Outpacing Global Trends and Reshaping Work

U.A.E. Commits $1 Billion to Bring A.I. to Africa, Announced at G20 Summit

Across Africa, a Push to Harness A.I. in Classrooms and Avoid ‘Digital Colonization

Trending