Claude, GPT, Gemini Agents Fail 72% of U.S. Healthcare Workflows

A new benchmark evaluating AI agents such as Claude, GPT, and Gemini found that they fail 72% of U.S. healthcare workflows, which points to significant limits on how reliably they handle complex medical tasks.