holistio 2 hours ago [-]
You pay $200/month to Anthropic, $200/month to OpenAI, $200/month to Cursor, $200/month to Google, and seeing that it didn't come to a nice round $1024/month, you pay $200/month to Sakana to coordinate it all, because why not.
While you're at it, feel free to send me $200 as well, I'll generate a crypto address ending with "AI".
holistio 2 hours ago [-]
TIL: I just found out that base58 disallows I (capital i), l (lowercase L), O (capital o) and 0 (zero), so I could only generate GrxoJt4eNXE2QaQ55iPSa7hhiYdzCo8ZeAuokmh2Cai.
(don't send anything, sharing only because of the base58 fun fact I didn't know)
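For anyone who wants to check the base58 point, here is a minimal Python sketch. It assumes the Bitcoin-style base58 alphabet; the helper name `can_end_with` is made up for illustration.

```python
# Bitcoin-style base58 alphabet (an assumption about which variant is meant):
# digits and letters with 0, O, I and l left out.
BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def can_end_with(suffix: str) -> bool:
    """Return True if every character of `suffix` is a valid base58 character."""
    return all(ch in BASE58_ALPHABET for ch in suffix)


# Characters that look alike and are therefore excluded from the alphabet.
excluded = sorted(set("0OIl") - set(BASE58_ALPHABET))

print(len(BASE58_ALPHABET))  # 58
print(excluded)              # ['0', 'I', 'O', 'l']
print(can_end_with("AI"))    # False: capital I is not in the alphabet
print(can_end_with("Cai"))   # True: lowercase i is allowed
```

That is why a vanity suffix like "AI" is impossible while "ai" works.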
robertwt7 52 minutes ago [-]
at this point I might just try Neuralwatt and see how many requests I can get with GLM5.2. I've read a lot of reviews saying that it's very cheap to run using Neuralwatt cloud
someone_1234 1 hours ago [-]
Or use openrouter and switch to whichever model you want to use (I think so)
ljlolel 1 hours ago [-]
Or TrustedRouter if you want privacy and open source
rvz 2 hours ago [-]
Pay $0 to run a local model or even a cheap DeepSeek V4 model via their API which is close to free per million tokens.
These prices are just going to get raced to $0.
holistio 2 hours ago [-]
Maybe. But for now it's fascinating how $200/month has kind of become a normal tier.
It's similar to how AirPods normalised all of us having $300+ headphones. All of us would have scoffed at the idea a decade ago.
p1esk 2 hours ago [-]
Many people here spent a lot more than $300 on headphones long before AirPods appeared.
mc3301 1 hours ago [-]
Those were hobbyists, audiophiles, professionals, artists (recording, performing, etc.).
They are talking about a much larger group of people.
holistio 2 hours ago [-]
I had a really nice Sennheiser before that, too. But now you hop on the subway and everybody sports one.
sofixa 31 minutes ago [-]
The Sony WH-1000XM series and the Bose QC35 were the standard quality headphones years before AirPods were a thing, and both retailed at $300+.
kijin 2 hours ago [-]
Not while the hardware required to run a local model at an acceptable speed costs way more than $200.
Guess what, the big players are hoarding all the RAM and GPUs so that other people can't afford decent hardware. It's working out beautifully for them!
sofixa 29 minutes ago [-]
> Not while the hardware required to run a local model at an acceptable speed costs way more than $200
It's $200/month. You have to take into account energy costs and all the rest of a system, but if you break even within 1-2 years ($2400-$4800) it'd be a pretty good deal. And $4000 buys you a pretty decent system.
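A rough sketch of that break-even arithmetic in Python. The $200/month and $4000 figures are from the comment above; the $40/month running cost is a made-up placeholder for energy and upkeep, not a measured number.

```python
from typing import Optional


def months_to_break_even(
    hardware_cost: float,
    subscription_per_month: float,
    running_cost_per_month: float = 0.0,
) -> Optional[float]:
    """Months until owning the hardware is cheaper than subscribing.

    Returns None if running the hardware costs as much as the subscription,
    in which case it never pays off.
    """
    saved_per_month = subscription_per_month - running_cost_per_month
    if saved_per_month <= 0:
        return None
    return hardware_cost / saved_per_month


# $4000 system vs a $200/month plan, ignoring running costs.
print(months_to_break_even(4000, 200))      # 20.0

# Same system with a hypothetical $40/month for energy and upkeep.
print(months_to_break_even(4000, 200, 40))  # 25.0
```

Both cases land at about two years or just past it, so the 1-2 year window only holds if running costs stay low. This ignores whether a local model matches the hosted one in quality.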
audreyt 2 hours ago [-]
[flagged]
njoyablpnting 9 minutes ago [-]
Looking at the technical report I'm a bit confused. The improvement from using their orchestrator models seems minimal (in some cases lower than just the model which I'm assuming is in the orchestrator's pool?). Maybe it's sort of acting as an additional reasoning step upfront? Sort of like how if you asked Claude to create a plan for how best to prompt itself, you would probably end up with a better result than just the base prompt.
Also, from the technical report, looks like they're training on the output of Claude Code, etc. I'm guessing this doesn't violate TOS because they're technically not a directly competing model. This brings me to what I see as the main risk with this service, which is that it seems like an easy thing for a frontier lab to make obsolete, either by models beginning to converge in terms of strengths or by improving their own harnesses to include more of this meta-reasoning.
chenzhekl 34 minutes ago [-]
I will probably never pay Sakana, as they are involved in military contracts.
Yeah, I was trying to parse their "defense policy"
https://sakana.ai/company-info/defense-policy.html?lang=en
But it seems like a lot of words to say we have no policy and we'll just go along with the powers that be. Like they rely on deferring to the Pacifist constitution, which the current administration is moving mountains to try and change. And when it does, you can bet they will not want to give up their defense contracts.
nickandbro 13 minutes ago [-]
I imagine if it was Deepseek partnering with the CCP it would be different?
chenzhekl 7 minutes ago [-]
I was just stating facts about Sakana, and that was enough to trigger you? For the same reason, I don’t use GPT either. At least for now, DeepSeek has no ties to the defense sector. And don’t talk as if the CCP were the devil. The U.S. president is the world’s biggest arms dealer, after all.
cortesi 2 hours ago [-]
As a developer outside the US I think it's vital to have alternatives to OpenAI and Anthropic, but sadly this is not it. For $200/month you get < 3 hours of use per week, the API is extremely slow, and the output quality in my tests is nowhere near Fable. It's nowhere remotely near usable as a day-to-day workhorse. Very disappointing.
ngl, I thought sakana.ai was doing cooler stuff than this. that said, the release of a product like this makes sense because it follows your natural intuition when using these models. The best way to use LLMs is to have at least two in your pocket, because the models do a good job at covering each other's assets and filling in obvious model-specific blindspots.
it's interesting that they're offering it in the form of fixed-cost subscription plans too. My impression was that the first-party providers can do this because they have API inference margins to the tune of 80ish percent. Anyone else orchestrating on top of these models has to pass these costs through or eat them themselves.
embedding-shape 3 hours ago [-]
> Frontier-level performance without single-vendor dependency. [...] Plug collective intelligence directly into your workflows today with a single API.
Do multiple vendors run this "single API", or how is this not replacing one single-vendor dependency with another single-vendor dependency?
chvid 49 minutes ago [-]
This would have been much more interesting and impactful if it had relied on open source models rather than commercial models that are only available via an API.
The reasoning chains could have been used, and the resulting combined model could easily and effectively have been distilled.
GolfPopper 2 hours ago [-]
This is a joke, right?
NitpickLawyer 25 minutes ago [-]
Not necessarily. There were some tests last year-ish from hf that showed that simply alternating (randomly) between claude and gpt (whatever their versions were at the time) on a task produced better results than either of them individually. So during a task, the first call was sent to one, then the other and so on.
There's also the concept of "smart routing" requests based on some heuristics / embeddings. You'd get "simple" tasks handled by smaller (cheaper) models and use a bigger model to curate / sort / merge the results.
There's a lot of things to try here. I wouldn't personally pay for this service, but I don't think it's "a joke"...
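To make the two ideas concrete, a toy Python sketch. Everything here is hypothetical: the "models" are stub functions, the word-count rule stands in for whatever heuristics or embeddings a real router would use, and none of it is how hf, Sakana or anyone else actually does it.

```python
import random
from typing import Callable, Dict, List

ModelFn = Callable[[str], str]


def make_stub(name: str) -> ModelFn:
    """Stand-in for a real model call; just tags the prompt with the model name."""
    return lambda prompt: f"[{name}] {prompt}"


# Hypothetical model pool.
MODELS: Dict[str, ModelFn] = {
    "small": make_stub("small"),
    "big": make_stub("big"),
}


def route(prompt: str, max_simple_words: int = 20) -> str:
    """Crude 'smart routing': short prompts go to the small model."""
    return "small" if len(prompt.split()) <= max_simple_words else "big"


def alternate(prompts: List[str], names: List[str], rng: random.Random) -> List[str]:
    """Random alternation: each call in a task goes to a randomly chosen model."""
    return [MODELS[rng.choice(names)](p) for p in prompts]


print(route("fix this typo"))         # small
print(route(" ".join(["word"] * 50)))  # big
print(alternate(["step 1", "step 2", "step 3"], ["small", "big"], random.Random(0)))
```

The interesting part in practice is the routing rule and the merge step, which this sketch deliberately leaves trivial.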
epsteingpt 2 hours ago [-]
Beta user: they piloted OpenRouter fusion before it was seen as the viable step. Everyone's understood for months now that having different models check each other is the best path forward.
This gets you that in a nice neat package, without the underlying tinkering mechanics.
If (big iff) the usage mechanics work out, then this is actually a really good anti-big-model strategy.
They'll be incentivized for your success, not token-maximizing for their investors.
The team is super smart too. What's not to like?
Wishing them the best on launch.
prodigycorp 1 hours ago [-]
if you've used codex or claude, how do the usage limits on fugu feel compared to the pro plans on either? honestly wouldn't mind subscribing to this if it's as generous as what codex is giving me monthly, which seems unrealistic.
epsteingpt 60 minutes ago [-]
Hard to say - since I used it in Beta with free credits, where the usage felt more 'Opus' than 'ChatGPT' but more efficient token wise. Switching models every time is annoying.
But their paid plans I'm not sure yet - planning to subscribe and can let you know.
Almost no chance it will be as generous as OpenAI though. They just don't have the money :-)
david_shi 2 hours ago [-]
Their research around building a domain specific model is pretty cool, it's kind of like Karpathy's autoresearch but pointed at deciding the optimal model to use at each step of the inference.
If cost becomes an even bigger problem being able to choose "best performance possible" or "strong but cost effective" will be useful.
OpenRouter Fusion is basically ask N models + synthesizer step.
This is ask a special orchestrator they built, which is in front of a bunch of models, which model would suit the request best.
Regular Fugu seems to be just "pick the best model and route the request there"
Fugu Ultra can generate like a little mini workflow/plan instead to achieve a result
1. Ask GPT to derive the math.
2. Ask Opus to check for implementation/security issues.
3. Ask Gemini to synthesize or resolve disagreement.
4. Return final answer.
I could be wrong, but that's what it seems to be at a glance, so I think it's more dynamic than OpenRouter Fusion.
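As a toy Python sketch of that guess, with the steps above written as a plan that gets executed in order. This only illustrates the guess and is not Sakana's implementation: the models are stubs, and `PLAN` and `run_plan` are made-up names.

```python
from typing import Callable, Dict, List, Tuple

ModelFn = Callable[[str], str]


def stub(name: str) -> ModelFn:
    """Stand-in for a real model call; wraps the prompt in the model name."""
    def call(prompt: str) -> str:
        return f"{name}({prompt})"
    return call


# Hypothetical model pool behind the orchestrator.
MODELS: Dict[str, ModelFn] = {
    "gpt": stub("gpt"),
    "opus": stub("opus"),
    "gemini": stub("gemini"),
}

# In the guessed design the orchestrator model would generate this plan per
# request. It is hard-coded here to mirror the numbered steps above.
PLAN: List[Tuple[str, str]] = [
    ("gpt", "derive the math"),
    ("opus", "check for implementation/security issues"),
    ("gemini", "synthesize or resolve disagreement"),
]


def run_plan(task: str, plan: List[Tuple[str, str]]) -> str:
    """Run each step in order, feeding the previous output into the next model."""
    context = task
    for model_name, instruction in plan:
        context = MODELS[model_name](f"{instruction}: {context}")
    return context  # final answer


print(run_plan("T", PLAN))
```

Plain routing would be the special case of a one-step plan, and a Fusion-style setup would fan the same task out to every model before a single synthesis step.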
runeblaze 2 hours ago [-]
links to two papers with at least enough apparent quality and novelty to get into ICLR 2026
> So basically... openrouter
:skull:
i now really wonder how many people of the public understood my thesis defense lol
Is there any official source that could confirm whether Fable (or Mythos) is parallelized test-time compute (like GPT 5.5 Pro) or a sparse Mixture-of-Experts (MoE) transformer combined with a multi-agent, inference-time compute scaling architecture (Gemini 3.1 Deep Think)?
adamnemecek 3 hours ago [-]
Seems kinda underwhelming considering they raised like $400M.
itemize123 2 minutes ago [-]
it's just one of their products right
ffsm8 2 hours ago [-]
400m is the new 400k!
Just look at the other companies' valuations and how much they raised vs what they delivered
nickandbro 4 hours ago [-]
Very interesting. I wonder if it functions kinda similarly to how OpenRouter's fusion API does. Hopefully it doesn't take too long to respond.
Looks like Fusion calls a bunch of models, then uses an LLM to synthesize the results and passes that to another model for final output.
Fugu looks like it's doing something different? Using an LLM earlier on in the flow as an orchestrator to decide which other LLMs to call. More coordinator than simply synthesizing results, and more "agentic".
It's interesting because it's all exposed behind a single OpenAI compatible endpoint (Responses API?) and so then presumably someone could use this for one of their single agents. Now you have agent-of-agents, nested in some sense. The token usage increases accordingly!
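If it really is a Responses-style endpoint, pointing an existing agent at it should mostly be a matter of swapping the base URL and model name. A stdlib-only Python sketch that builds such a request without sending it; the base URL, API key and model name are placeholders, not real Sakana values, and the `model`/`input` body shape is assumed from the OpenAI Responses API.

```python
import json
import urllib.request


def build_responses_request(
    base_url: str, api_key: str, model: str, prompt: str
) -> urllib.request.Request:
    """Build (but do not send) a POST to an OpenAI-style /responses endpoint."""
    body = json.dumps({"model": model, "input": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/responses",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# All three values below are hypothetical placeholders.
req = build_responses_request(
    base_url="https://api.example.invalid/v1",
    api_key="sk-placeholder",
    model="hypothetical-orchestrator",
    prompt="Hello",
)

print(req.get_method())      # POST
print(req.full_url)          # https://api.example.invalid/v1/responses
print(json.loads(req.data))  # {'model': 'hypothetical-orchestrator', 'input': 'Hello'}
```

From the caller's side it looks like one model, so any fan-out to several models happens (and gets billed) behind that single call.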
bprasanna 3 hours ago [-]
Isn't this what perplexity is?
puttycat 2 hours ago [-]
Can someone explain this in layman's terms? I don't understand any of it
Basically, if you combine a bunch of near-frontier models (like GPT 5.5, etc) you can get performance that sometimes surpasses top line models like Claude's Fable.
Sakana seems to have a separate approach using a domain specific model to perform the model routing step.
chenzhekl 31 minutes ago [-]
But it's priced the same as frontier models. Why do I not directly pay for frontier models?
2 hours ago [-]
ljlolel 3 hours ago [-]
I’ve also developed and open-sourced a Mythos-level model using fusion/synthesis on TrustedRouter
Just letting you guys know that the model is not a moat.
nixosbestos 3 hours ago [-]
AI noob question, is this like Amp? I just use Amp, I ask it to do neat stuff and it does it. I desperately need to invest in my AI skills but every day I open two new tabs and add it to "AI stuff" folder, and then go back to drowning in work to do.
https://japannews.yomiuri.co.jp/politics/defense-security/20...
https://x.com/cortesi/status/2068898694238486658
https://arxiv.org/pdf/2512.04695
We open sourced it all
and will be releasing a similar orchestrator next week on TrustedRouter
https://trustedrouter.com/blog/fusion-evals-open-source