OpenAI and Broadcom Unveil LLM-Optimized Inference Chip

Karan Bhatia
4 hours ago

OpenAI and Broadcom (NASDAQ: AVGO) have unveiled Jalapeño, OpenAI’s first Intelligence Processor, an accelerator architected around OpenAI’s vision for the future of LLM inference. It is the first AI accelerator in a multi-generation compute platform that the companies are building together to make advanced AI faster, more reliable, and more widely accessible.

Building the AI Stack In-House.

OpenAI has taken delivery of its first custom AI chip, Jalapeño, from partners Broadcom and Celestica, marking a significant step toward building its own AI infrastructure stack. Designed around OpenAI’s expertise in large language models and inference, the chip is engineered to support both current and future AI workloads.

Early engineering samples are already running production-scale ML workloads, including GPT-5.3-Codex-Spark. While final benchmarking is ongoing, initial testing indicates that Jalapeño delivers substantially better performance per watt than today's leading AI chips. Broadcom’s silicon design and networking technologies, along with Celestica’s system integration expertise, are helping bring the platform to large-scale deployment.

Leadership Perspective.

OpenAI President and Co-Founder Greg Brockman said Jalapeño is a key part of the company's long-term strategy to build its own AI infrastructure stack, enabling faster, more reliable, and more affordable AI by making compute more abundant and efficient.

Richard Ho, who leads OpenAI's hardware program, said the chip was purpose-built for LLM inference, with its architecture optimized for the memory, networking, and serving patterns required by frontier AI models. Early testing indicates Jalapeño can execute key AI workloads close to the hardware's theoretical performance limits.

Broadcom President and CEO Hock Tan described the collaboration as a long-term effort to build the infrastructure needed for the next generation of AI, with a multi-generation chip roadmap aimed at supporting gigawatt-scale AI data centers alongside partners including Microsoft.

Built for LLM Inference.

Jalapeño is purpose-built for large language model (LLM) inference rather than adapted from general-purpose AI hardware. Designed around OpenAI's experience operating products such as ChatGPT, Codex, and its API, the chip is optimized for the compute, memory, networking, and serving requirements of modern and future LLMs. Its goal is to deliver the throughput of leading AI accelerators while achieving the low latency needed for interactive AI applications.

The chip also reflects OpenAI's full-stack strategy, which spans model development, chip architecture, networking, deployment systems, and user-facing products. By optimizing every layer together, OpenAI aims to improve compute efficiency and model performance, reduce costs, and create a continuous feedback loop that yields more capable, reliable, and affordable AI.

Accelerated Chip Development.

OpenAI and Broadcom co-developed Jalapeño from initial design to manufacturing tape-out in just nine months, one of the fastest development cycles for a high-performance AI accelerator. The rapid timeline was enabled by close software-hardware collaboration between OpenAI's engineering teams and Broadcom's silicon design teams, and by the use of OpenAI's own AI models to accelerate portions of the chip design and optimization process.

By using AI to help build the infrastructure that powers future AI systems, OpenAI believes chip development can become faster and more efficient, ultimately lowering the cost of compute and expanding access to advanced AI technologies.

A Multi-Generation AI Compute Platform.

Jalapeño marks the first generation of OpenAI's long-term custom AI compute platform. Planned for initial deployment by the end of 2026, the platform combines OpenAI's accelerator architecture with Broadcom's silicon implementation, networking, and connectivity technologies, alongside Celestica's expertise in board, rack, and system integration. Together, the partners aim to build a scalable AI infrastructure platform that will evolve over multiple hardware generations.

Making AI More Accessible.

OpenAI views inference as the stage where AI delivers value to users. By improving the cost, speed, and reliability of inference, Jalapeño is expected to enable faster responses across ChatGPT, more capable AI agents, lower-cost API services, and more dependable performance during periods of high demand.

Ultimately, the company believes advances in custom AI infrastructure will help make frontier AI models more accessible and affordable worldwide for developers, businesses, researchers, students, and everyday users.

MENLO TIMES