Product Launchdiffusion llmreasoninginception labsopenai compatible
Inception Labs Launches Mercury 2 Diffusion LLM
9.1
Relevance Score
Last week Inception Labs launched Mercury 2, a diffusion-based large language model that generates over 1,000 tokens per second and delivers five to ten times lower end-to-end latency than speed-optimized autoregressive models, CEO Stefano Ermon told The New Stack. Mercury 2 is available via an OpenAI-compatible API, with AWS Bedrock integration coming soon, targeting faster, cheaper inference for reasoning workloads.


