OpenAI Unveils Lighter Codex Model Powered by Cerebras Megachip

OpenAI on Thursday introduced a lightweight version of its agentic coding tool Codex, deepening its hardware partnership with Cerebras and signaling a sharper focus on ultra-fast AI performance.

The new model, dubbed GPT-5.3-Codex-Spark, is described as a smaller, faster variant of the Codex 5.3 system launched earlier this month. Designed for rapid inference and real-time collaboration, Spark is positioned as a productivity-focused tool aimed at developers seeking swift prototyping and iteration rather than long-duration computational tasks.

To achieve that speed, OpenAI is integrating Cerebras’ Wafer Scale Engine 3 (WSE-3), the chipmaker’s third-generation wafer-scale processor built with 4 trillion transistors. The move marks the first tangible milestone in a multi-year agreement between the two companies valued at more than $10 billion, announced last month.

“Integrating Cerebras into our mix of compute solutions is all about making our AI respond much faster,” OpenAI said when the deal was announced. With Spark, that ambition appears to be taking operational form.

Dual-Mode Codex Vision

OpenAI described Spark as the first step toward a dual-mode Codex system — one optimized for real-time collaboration and rapid iteration, and another designed for deeper reasoning and longer-running execution.

“Codex-Spark is the first step toward a Codex that works in two complementary modes: real-time collaboration when you want rapid iteration, and long-running tasks when you need deeper reasoning and execution,” the company said in a statement.

The research preview of Spark is currently available to ChatGPT Pro users within the Codex app.

In a social media post ahead of the launch, OpenAI Chief Executive Sam Altman hinted at the announcement. “We have a special thing launching to Codex users on the Pro plan later today,” he wrote. “It sparks joy for me.”

Cerebras’ Rising Profile

Founded more than a decade ago, Cerebras has gained renewed prominence amid the global race to build faster, more efficient AI hardware. The company recently raised $1 billion in fresh capital at a reported valuation of $23 billion and has previously signaled intentions to pursue an initial public offering.

Its WSE-3 megachip — engineered to deliver extremely low latency — is particularly suited for AI workloads demanding real-time response, an increasingly critical metric in developer-facing tools.

“What excites us most about GPT-5.3-Codex-Spark is partnering with OpenAI and the developer community to discover what fast inference makes possible — new interaction patterns, new use cases, and a fundamentally different model experience,” said Sean Lie, co-founder and chief technology officer of Cerebras. “This preview is just the beginning.”

Speed as Strategy

The release underscores a broader industry shift from merely scaling AI models to optimizing inference speed and user experience. As generative AI tools mature, developers and enterprises are demanding systems that not only reason deeply but also respond instantly.

With Spark, OpenAI appears to be betting that lower latency — powered by specialized silicon — will define the next phase of AI-assisted coding.

For now, the launch signals a closer fusion between AI software and custom-built hardware, as the race intensifies to make advanced AI not just smarter, but faster.
