Research · LLM · Benchmarking · Expert Curation · OpenAI
Researchers Release Humanity's Last Exam Benchmark
Relevance Score: 8.9
An international consortium released Humanity's Last Exam (HLE) in early 2025: a 2,500-question benchmark of expert-crafted short-answer and multiple-choice items spanning mathematics, the humanities, and the natural sciences, designed to be unambiguous yet difficult for large language models. Leading systems initially scored in the single digits, with GPT-5 later reaching about 25 percent. HLE aims to track frontier-model expertise, though it measures task performance rather than general intelligence.


