
Mercor Finds AI Agents Fail Consulting Tasks

businessinsider.com | February 9, 2026

Relevance Score: 6.1

Mercor published the APEX-Agents benchmark, which shows that leading AI agents completed under 25% of real-world consulting, banking, and legal tasks on the first try and only about 40% after eight attempts. OpenAI's GPT-5.2 initially completed roughly 23%, while Anthropic's Opus 4.6 reached nearly 33%. The study found that agents perform well at research and single-tool data analysis but fail at long-horizon, multi-step planning and cross-file coordination. Mercor CEO Brendan Foody says rapid model improvement could nonetheless displace some consulting roles soon.

Scoring Rationale

Moderate novelty and practical relevance, limited by a single-company benchmark and lack of peer-reviewed validation.

Sources

AI agents failed at real-world consulting tasks — but Mercor's CEO says they're still on track to replace consultants
businessinsider.com
