As always, the show is cut into a podcast and a newsletter:
Spotify: thursdai.news/spotify
Apple: thursdai.news/apple
Newsletter: thursdai.news/mar-5
YouTube: youtu.be/SVimfToLUjA
Great show with cohosts @nisten @ryancarson @ldjconfirmed @Yampeleg and @WolframRvnwlf who presented Wolfbench!
thursdai.news/mar-5
Called it!
OpenAI launched GPT 5.4 live during @thursdai_pod, so we had a chance to test it immediately and found some surprising quirks!
Also covered the Anthropic vs Dow drama, Qwen 3.5 small (and Junyang's departure) and more!
Our new episode just dropped, check it out!
Liked this recap & breakdown? Follow me @altryne and @thursdai_pod to stay up to date on everything AI-related every week!
Overall, the sentiment is VERY positive, as it usually is for a new model. Now... will this be enough to get folks to resume their OpenAI subscriptions after leaving for Anthropic last weekend? We'll see!
Blog: openai.com/index/intro...
System Card: openai.com/index/gpt-5...
Poe (@poe_platform) is showing that this model is great at long context and needle-in-a-haystack searches.
x.com/poe_platfor...
@wadefoster (Wade Foster, CEO of Zapier)
" It's the new state of the art for multi-step tool use"
x.com/wadefoster/...
Shumer glazes a bit but has decent feedback as well, showing that it's also not great at frontend yet. Overall, he calls it the best by far and says the debate of "which model to use" is over.
x.com/mattshumer_...
Claire praises this model for "more human speaking" and tool calls, but says it's still bad at frontend
x.com/clairevo/st...
Now for some community reactions, though take these with a grain of salt, as there's a bit of a selection bias (OpenAI prefers folks who talk positively about the models)
Bernardo Aceituno, co-founder @stackai said this is the first model to pass their benchmark
x.com/BernAceitun...
while Claude opened the website, looked at it, and suggested an actual fix that can help.
Is it perfect? Nah. We tested it out live on ThursdAI, and it still shows a bit of an "I'll do EXACTLY what you say" mentality, like Codex 5.3. I asked it for 1 thing to improve on my website, and it told me I need to add a <main> element,
One of the most exciting things about this is not even the model itself, it's the ability to steer its thinking mid-thought. I don't think this is talked about enough. Interruption! You can steer this model mid-thinking in the ChatGPT (and soon iOS) interfaces
More eval goodness: this model SLAPS at math. On the @EpochAIResearch Frontier AI bench, GPT 5.4 solved a math problem no model has been able to solve before, as shown by @nasqret
x.com/nasqret/sta...
However, if we zoom in, we see that it's better at lower thinking thresholds on the same tasks as Codex! Here on SWE-bench Pro, you can see that the medium-thinking score is the same, while without thinking, the new model gets 47% accuracy!
Now for some benchmarks and evals. If you look at coding, this model does not show a significant jump over the code-dedicated Codex 5.3, but for a generalized model, it's absolutely crushing coding tasks while also being SOTA on a bunch of economic tasks, specifically GDPval
While the pricing is for the API, Codex app users can turn on the /fast mode in settings (and CLI users can add it to their ~/.cursor/config.toml file) and enjoy a 1.5x speedup at 2x the token burn!
Beast mode: 1M context in FAST mode!
x.com/providerpro...
First, I think for most Codex users, upgrading from Codex 5.3 is a no-brainer, specifically as this model has a 1M context window and is cheaper than Sonnet 4.6.
Pricing (up to 272K tokens): Input $2.50 / Output $15.00 per 1M tokens (cached tokens: 90% discount)
Pricing (over 272K tokens): Input $5.00 / Output $22.50 per 1M tokens
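To get a rough feel for what those tiers mean in practice, here's a minimal Python sketch. `gpt54_cost_usd` is a hypothetical helper, not anything from OpenAI's SDK; it assumes the higher tier applies to the whole request once the prompt exceeds 272K tokens (the exact tier semantics aren't spelled out here) and ignores the cached-token discount.

```python
def gpt54_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Rough API cost estimate based on the tiered pricing above.

    Assumption (not confirmed): the >272K rate applies to the entire
    request once the prompt crosses the threshold, and cached-token
    discounts are ignored.
    """
    if input_tokens <= 272_000:
        in_rate, out_rate = 2.50, 15.00   # $ per 1M tokens
    else:
        in_rate, out_rate = 5.00, 22.50
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

# A 100K-token prompt with a 5K-token answer:
print(gpt54_cost_usd(100_000, 5_000))   # → 0.325

# A 300K-token prompt (over the threshold) with a 10K-token answer:
print(gpt54_cost_usd(300_000, 10_000))  # → 1.725
```

So a typical long-ish coding session stays well under a dollar per request, and even a near-max-context call is only a couple of dollars before cache discounts.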
Today, OpenAI launched their "best model yet", GPT 5.4 Thinking (and 5.4 Pro). After 5.3 Codex just at the beginning of Feb, OpenAI has once again pulled the coding advances back into the more generalized model
Pricing, new capabilities, and vibechecks in thread!
You can also just listen to the show right here:
thursdai.news/feb-26
And finally, @philipkiely was there with his first book! "Inference is everything," as Philip said! Inference Engineering is available as a free PDF and as a gorgeous physical book (that I also just got in the mail!)
Don't miss these interviews!
thursdai.news/guests/phil...
@bencera_ straight up gave us singularity vibes: dude crossed $700K ARR LIVE on the show
thursdai.news/guests/benc...
x.com/altryne/sta...
Chatting with @dabit3 was amazing; I've been following his career forever.
@nisten even said watching one of Nader's vids changed his whole career path!
Nader just joined @cognition and walked us through why
thursdai.news/guests/dabit3
You can find the edited version of our live show, show notes and links on our brand new (totally not vibecoded) website here:
thursdai.news/ep/feb-26-2026
3 years doing this weekly and I've never felt closer to the singularity than right now
Everyone's shipping async AI agents. Everything's converging.
This week we covered Anthropic vs DoD, had 3 incredible interviews with @dabit3 @philipkiely and @bencera_ and way more!
Finally, we live-reacted to the drop of Cerebras-powered Codex and Gemini beating ARC-AGI, and debated the AI psychosis which makes it hard for us to sleep!
Check out the full episode youtu.be/wQb4JK5xKMw
Then a chat with Olive revealed how the heck they get close to Opus on SWE-bench Verified with only 10B active parameters!
This was a packed show! Open-source LLMs are catching up: @louszbd from @Zai_org told us the new GLM 5 is built for agentic architecture and is bigger, better, faster, stronger.