
Etched Emerges Out of Stealth with $1B+ in Customer Contracts and $800M Raised

  • Writer: Karan Bhatia
  • 1 day ago

Etched, a company building frontier inference clusters and led by Gavin Uberti, Robert Wachen, and Chris Zhu, has emerged from stealth with more than $1B in customer contracts and $800M raised. The Series B was led by Stripes, with participation from Ribbit Capital, Radical Ventures, Positive Sum, Primary, and Argo.


Frontier Inference Infrastructure.


The company builds hardware systems for frontier inference clusters, co-designing chips, racks, software, and manufacturing methods to optimize throughput, latency, cost, and power efficiency across both prefill and decode workloads.


Earlier this year, the company's A0 silicon came back from TSMC, fabricated on the N4P process. The team is now validating its first rack-scale product with customers, supported by reported demand of approximately $1B.


The organization has grown to more than 400 engineers with experience from companies including NVIDIA, Google TPU teams, Broadcom, SK Hynix, and TSMC. It has raised approximately $800M across multiple financings, including a strategic investment from VentureTech Alliance, and is expanding its collaboration with leading semiconductor manufacturers.


Designing a New Pareto Frontier.


The company is building inference systems that aim to push out the Pareto frontier (the best achievable trade-offs among throughput, latency, cost, and power efficiency) for frontier AI workloads, including trillion-parameter MoEs, long-context models, and agentic systems. This requires deep co-design across chips, packages, PCBs, cooling systems, interconnects, and cluster software.


Two key architectural breakthroughs underpin this approach:


Low Voltage Inference (LVI):


Traditional AI accelerators are constrained by thermal limits, where higher FLOPs utilization leads to throttling and reduced sustained throughput. LVI addresses this by operating key compute blocks at significantly lower voltage, enabling higher FLOPs density and sustaining over 80% of peak compute without thermal throttling. Achieving this requires end-to-end co-design, spanning circuit design, scheduling, power delivery, packaging, and cooling systems.
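
As a rough illustration of why lower voltage eases thermal limits, the textbook CMOS switching-power relation P = a * C * V^2 * f says dynamic power falls with the square of supply voltage. The sketch below applies only that general relation. It is not Etched's published model, and the voltages and clock ratios are hypothetical values chosen for illustration.

```python
def dynamic_power(c_eff_farads: float, voltage: float, freq_hz: float,
                  activity: float = 1.0) -> float:
    """Textbook CMOS switching-power estimate: P = activity * C * V^2 * f."""
    return activity * c_eff_farads * voltage ** 2 * freq_hz


def relative_power(v_low: float, v_nominal: float,
                   f_low: float = 1.0, f_nominal: float = 1.0) -> float:
    """Ratio of switching power at (v_low, f_low) to (v_nominal, f_nominal).

    Capacitance and activity factor cancel, so only voltage and clock matter.
    """
    return (v_low / v_nominal) ** 2 * (f_low / f_nominal)


if __name__ == "__main__":
    # Hypothetical operating points, not figures from the article.
    same_clock = relative_power(v_low=0.60, v_nominal=0.75)
    slower_clock = relative_power(v_low=0.60, v_nominal=0.75, f_low=0.9)
    print(f"same clock:   {same_clock:.2f}x nominal switching power")    # 0.64x
    print(f"10% slower:   {slower_clock:.2f}x nominal switching power")  # 0.58x
```

In practice the achievable clock usually drops as voltage drops, and leakage power is not captured here, which is one reason the approach is described as needing co-design across circuits, power delivery, and cooling.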


Cluster Scale Memory (CSM):


To overcome memory and interconnect bottlenecks in current HBM-based systems, CSM introduces a low-latency shared memory architecture across the cluster. This hybrid HBM/SRAM approach enables higher capacity while significantly reducing memory access latency, improving both throughput and responsiveness for inference workloads.
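
The article does not describe how CSM is implemented. As a generic way to reason about any two-tier memory, the sketch below computes expected access latency when some fraction of accesses is served from a fast tier (SRAM) and the rest from a slower one (HBM). The function name, hit rates, and latency figures are all hypothetical and are not Etched's numbers.

```python
def effective_latency_ns(sram_hit_rate: float, sram_ns: float, hbm_ns: float) -> float:
    """Expected latency of a two-tier memory.

    Hits are served from the fast tier, misses from the slow tier:
        t_avg = h * t_sram + (1 - h) * t_hbm
    """
    if not 0.0 <= sram_hit_rate <= 1.0:
        raise ValueError("sram_hit_rate must be between 0 and 1")
    return sram_hit_rate * sram_ns + (1.0 - sram_hit_rate) * hbm_ns


if __name__ == "__main__":
    # Illustrative latencies only: 10 ns for the fast tier, 100 ns for the slow tier.
    for hit_rate in (0.0, 0.5, 0.75):
        t = effective_latency_ns(hit_rate, sram_ns=10.0, hbm_ns=100.0)
        print(f"hit rate {hit_rate:.2f} -> {t:.1f} ns average")
```

The model shows the general trade-off: the larger the share of accesses the fast tier can absorb, the closer average latency gets to the fast tier's, while the slow tier still supplies capacity.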


These systems are being developed in close collaboration with leading AI companies, cloud providers, and hyperscalers. The company has validated rack-scale systems in representative data center environments and simulated production-scale workloads using real traffic patterns, supported by distributed engineering teams working closely with supply-chain partners globally.


Getting to Gigawatt Scale.


Early customer testing shows strong performance across throughput, latency, and power efficiency for inference workloads. The company plans to share additional performance and roadmap updates over the coming months.


First rack systems are scheduled to ship this summer, with production already underway to fulfill more than $1B in customer contracts. To support continuous development and deployment cycles, the company has established a Taiwan manufacturing facility, along with a data center, test environment, and NPI prototyping lab in its San Jose office.


The organization is structured for vertical integration at gigawatt scale, co-locating chip designers, inference engineers, thermal specialists, and systems teams to accelerate iteration across the full hardware and software stack.




Menlo Times is a global media platform covering AI, Deeptech, Venture Capital, Fintech, Robotics, and Security through news, analysis, and insights from founders and operators.
© 2026 Menlo Times. All rights reserved.