Japan’s Sakana AI Launches Fugu, an AI That Manages Other AIs

  • Sakana Fugu is a single OpenAI-compatible API that internally orchestrates a pool of expert models.

  • Sakana says Fugu Ultra scores 73.7% on SWE-Bench Pro, ahead of Claude Opus 4.8, GPT-5.5, and Gemini 3.1 Pro.

  • Early testers found it slow and costly, and critics question the benchmark and sovereignty claims.

Japan's Sakana AI did not enter the race by building a bigger model. It built one that manages other models. Meet Fugu, a system that looks like a single model on the surface but coordinates a team of expert AIs behind the scenes.

How it works

Japan's Sakana AI Launches Fugu, an AI That Manages Other AIs

Fugu is itself a language model trained to call other LLMs in an agent pool, including instances of itself recursively, and it handles model selection, delegation, verification, and synthesis internally. The approach is grounded in two ICLR 2026 papers, TRINITY and Conductor, which assign Thinker, Worker, and Verifier roles and learn coordination instead of using hard-coded rules. It ships in two tiers, Fugu for everyday low-latency work and Fugu Ultra for hard multi-step tasks.

Why now

The timing is pointed. Co-founder David Ha positioned Fugu as a hedge after Anthropic's June 12 move to revoke public access to Claude Mythos 5 and Fable 5 under a US export control order. Because Fugu routes across a swappable pool, the pitch is that if one provider restricts access, it simply routes around the disruption.

The real-world tests

Sakana claims Fugu beat Gemini 3.1 Pro, Opus 4.8, and GPT-5.5 in its own tests on automated research, mechanical design, and financial forecasting, and showed a Rubik's Cube solved faster than individual models. In one AutoResearch run, an agent ran 123 experiments over about 14 hours on a single H100 and reached the best mean score against three frontier baselines.

The honest part

Read the wins with care, because most come from Sakana's own benchmarks. Wharton's Ethan Mollick said Fugu Ultra is incredibly slow, with coding tests taking 30 minutes, and that it did not match Fable in real use. The orchestrator is closed source and relies partly on closed-source model APIs, and Sakana has not disclosed the mix of models it uses. Pricing runs $20, $100, and $200 a month, and Fugu is not available in the EU or EEA at launch.

Still, the idea matters. The next breakthrough in AI may not be one giant model. It may be an organization of specialized models working together, and Japan just showed what that could look like.

Read and watch more:

https://x.com/hardmaru/status/2068884466056225025

https://www.youtube.com/watch?v=jD9ZlugJBBs

Quick Links:

Similar Posts