Goodfire AI Announces Series B Funding at a $1.25 Billion Valuation
- Karan Bhatia

Goodfire AI, a research company that uses interpretability to understand, learn from, and design AI systems, led by Eric Ho, Daniel Balsam, Tom McGrath, and others, has announced a $150 million Series B funding round at a $1.25 billion valuation. The round was led by B Capital, with participation from Juniper Ventures, DFJ Growth, Salesforce Ventures, Menlo Ventures, Lightspeed Venture Partners, South Park Commons, Wing Venture Capital, Eric Schmidt, and others.
Goodfire was founded to address a core challenge in modern AI: frontier models function as black boxes, making critical decisions with unpredictable and poorly understood behavior.
Its work advances the science of interpretability, opening, understanding, and editing model internals. A world-class team and partnerships with leaders such as the Arc Institute, Mayo Clinic, and Microsoft have enabled breakthroughs, including novel techniques to decompose model internals, a 50% reduction in LLM hallucinations, and the discovery of new Alzheimer’s biomarkers.

These results reinforce the belief that interpretability is essential for building powerful, steerable, and safe AI systems, with much more progress still to be made.
Foundation models have been deployed at unprecedented speed: they power production code, interact across the internet, and are even being integrated into military systems, all while their internal decision-making remains largely opaque. Traditional black-box methods for shaping behavior are crude and prone to unpredictable failures.
A deeper, principled understanding of these systems is essential to unlock safe and powerful AI. Far from being inscrutable, models exhibit an intricate internal structure that can be leveraged. Interpretability tools can guide and align models while also serving as microscopes to explore the vast knowledge they acquire about the world.
“Interpretability serves as a toolset for a new domain of science: a means to form hypotheses, run experiments, and ultimately design intelligence rather than encountering it by chance.”
— Eric Ho, CEO of Goodfire
What’s Being Built
Goodfire is advancing a future where models can be understood at a fundamental level, enabling principled, aligned, and more useful AI.
A “model design environment” has been developed: a platform leveraging interpretability-based primitives to extract insights from models and data, improve behavior, and monitor performance in production. The platform supports two main applications:
Intentional model design, including interpretable training pipelines and inference-time monitoring (a brief monitoring sketch follows this list)
Scientific discovery through model-to-human knowledge transfer
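As a rough illustration, not Goodfire's actual platform or API, the sketch below shows what an inference-time monitoring primitive could look like: each response's hidden activations are scored against a hypothetical feature direction associated with an unwanted behavior, and responses above a threshold are flagged. The feature direction, threshold, and synthetic activations are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
HIDDEN_DIM = 768

# Hypothetical feature direction for an undesired behavior (e.g. hallucination),
# as might be extracted from a model's hidden activations by an interpretability
# method such as a sparse autoencoder. Random here, purely for illustration.
feature_direction = rng.normal(size=HIDDEN_DIM)
feature_direction /= np.linalg.norm(feature_direction)

FLAG_THRESHOLD = 0.5  # illustrative; in practice tuned on labeled examples

def feature_score(hidden_states: np.ndarray) -> float:
    """Mean projection of per-token hidden states onto the feature direction.

    hidden_states has shape (num_tokens, HIDDEN_DIM): one activation vector
    per generated token, taken from some chosen layer of the model.
    """
    return float((hidden_states @ feature_direction).mean())

def monitor_response(hidden_states: np.ndarray) -> dict:
    """Flag a response whose feature activation exceeds the threshold."""
    score = feature_score(hidden_states)
    return {"score": score, "flagged": score > FLAG_THRESHOLD}

# Synthetic activations stand in for a real model's hidden states.
fake_activations = rng.normal(size=(32, HIDDEN_DIM))
print(monitor_response(fake_activations))
```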
Intentional Design
Intentional design forms the first pillar of Goodfire’s approach: creating methods to understand model behavior, debug issues, reshape responses with precision, and monitor performance in production.
Current training approaches rely on large datasets and metrics like loss curves, often without clarity on why or how changes occur. A more effective method provides richer feedback, guiding models on what to learn, which areas to focus on, and which responses are correct or incorrect.
Interpretability tools enable the detection of undesirable behaviors and targeted intervention on the corresponding parts of the model. For example, interpretability-informed training recently reduced hallucinations in a large language model by 50%.
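The 50% hallucination reduction is Goodfire's reported result; the sketch below only illustrates one generic form of targeted intervention, sometimes called activation steering, in which the component of a layer's activations along a feature direction tied to an unwanted behavior is dampened at inference time. The direction, layer choice, and steering strength here are illustrative assumptions, not the method behind that result.

```python
import numpy as np

rng = np.random.default_rng(1)
HIDDEN_DIM = 768

# Hypothetical direction in activation space associated with an unwanted behavior.
bad_feature = rng.normal(size=HIDDEN_DIM)
bad_feature /= np.linalg.norm(bad_feature)

def suppress_feature(activations: np.ndarray, direction: np.ndarray,
                     strength: float = 1.0) -> np.ndarray:
    """Dampen the component of each activation vector along `direction`.

    activations: (num_tokens, HIDDEN_DIM) hidden states from one layer.
    strength=1.0 projects the direction out entirely; smaller values dampen it.
    """
    projections = activations @ direction            # shape (num_tokens,)
    return activations - strength * np.outer(projections, direction)

# Synthetic activations stand in for a real forward pass.
acts = rng.normal(size=(16, HIDDEN_DIM))
steered = suppress_feature(acts, bad_feature, strength=0.8)
print("mean |projection| before:", np.abs(acts @ bad_feature).mean())
print("mean |projection| after: ", np.abs(steered @ bad_feature).mean())
```

In practice the direction would come from interpretability analysis rather than a random vector, and any intervention would be validated against held-out evaluations.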
Scientific Discovery
Narrowly superhuman models are now surpassing human experts in domains ranging from molecular biology to materials science. The knowledge contained within these models is often unknown to science, yet remains locked inside a black box.

Interpretability tools provide a means to unlock this knowledge. This forms the second pillar of the platform: scientific discovery through interpretability-driven model-to-human knowledge transfer. Early proof of concept came from extracting novel chess concepts from AlphaZero and teaching them to a grandmaster. More recently, analysis of an epigenetic foundation model developed by Prima Mente led to the identification of a novel class of Alzheimer’s biomarker, the first major natural science finding obtained by reverse-engineering a foundation model.
In addition to Prima Mente, partnerships with organizations such as Arc Institute and Mayo Clinic are advancing the discovery of novel science, improving model performance, reducing confounders, and ensuring AI-driven insights are scientifically rigorous and clinically relevant.
Interpretability serves as a core toolkit for digital biology. As large foundation models take a central role in digital science, interpretability methods function as a microscope, revealing the knowledge these models acquire from vast datasets.
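As a toy picture of model-to-human knowledge transfer, and not the AlphaZero or Prima Mente analyses themselves, the sketch below fits a linear probe on a model's internal representations to test whether a human-interpretable concept is linearly encoded; a probe that predicts the concept well gives researchers a concrete direction in activation space to study further. The embeddings and labels are synthetic placeholders.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
HIDDEN_DIM, N = 256, 2000

# Synthetic stand-ins for a foundation model's internal representations and for
# human-assigned labels of some concept of interest (e.g. a biomarker class).
concept_direction = rng.normal(size=HIDDEN_DIM)
embeddings = rng.normal(size=(N, HIDDEN_DIM))
labels = (embeddings @ concept_direction + rng.normal(scale=2.0, size=N)) > 0

# A linear probe: if a simple classifier on internal activations predicts the
# concept well, the model plausibly represents it, and the probe's weight
# vector points to where that knowledge lives.
probe = LogisticRegression(max_iter=1000).fit(embeddings[:1500], labels[:1500])
print("held-out probe accuracy:", probe.score(embeddings[1500:], labels[1500:]))
```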
Foundational Research
Goodfire is fundamentally a research lab. Beyond platform development, continued investment focuses on key directions in foundational interpretability research with the potential to transform the field of machine learning. Curiosity-driven basic research into how neural networks behave underpins the field of interpretability.
Current efforts have only begun to explore the possibilities of interpretability, with foundational breakthroughs providing the path to more powerful tools and deeper understanding.
Towards Alignment
Alignment, transparency, and reliability are essential as increasingly powerful models are deployed.
The goal is to debug and correct model behavior, specifying AI responses with precision in ways that generalize across contexts. Interpretability provides critical tools for this: offering transparency into model decision-making, predicting behavior in previously untested scenarios, and enabling targeted, intelligent intervention.
This capability allows partners, customers, and the broader community to design, edit, and debug models reliably before entrusting them with important tasks. Progress in these directions generates empirical insights from real-world models and applications, advancing toward a future where AI drives human health and scientific breakthroughs. While much work remains, each research advancement moves closer to this vision.
Chief Scientist Tom McGrath previously led a research team at Google DeepMind but chose to dedicate himself fully to Goodfire. Team members left startups and nonprofits, relocated to San Francisco, and put academic career paths on hold, working for months from a windowless office in South Park Commons.
The group formed around a shared conviction: interpretability represents one of the most important technical challenges of this generation, and solving it remains an ambitious yet vital goal.