As always, the show is cut into a podcast and a newsletter:
Spotify: thursdai.news/spotify
Apple: thursdai.news/apple
Newsletter: thursdai.news/mar-5
YouTube: youtu.be/SVimfToLUjA
Great show with cohosts @nisten @ryancarson @ldjconfirmed @Yampeleg and @WolframRvnwlf who presented Wolfbench!
thursdai.news/mar-5
Called it!
OpenAI launched GPT 5.4 live during @thursdai_pod, so we had a chance to test it immediately and found some surprising quirks!
Also covered the Anthropic vs Dow drama, Qwen 3.5 small (and Junyang's departure) and more!
Our new episode just dropped, check it out!
Liked this recap & breakdown? Follow me @altryne and @thursdai_pod to stay up to date on everything AI-related every week!
Overall, the sentiment is VERY positive, as it usually is for a new model. Now... will this be enough to get folks to resume their OpenAI subscriptions after leaving for Anthropic last weekend? We'll see!
Blog: openai.com/index/intro...
System Card: openai.com/index/gpt-5...
Poe (@poe_platform) is showing that this model is great at long context and needle-in-a-haystack searches.
x.com/poe_platfor...
@wadefoster (Wade Foster, CEO of Zapier)
" It's the new state of the art for multi-step tool use"
x.com/wadefoster/...
Shumer glazes a bit but has decent feedback as well, showing that it's also not great at frontend yet. Overall, he calls it the best by far and says the debate of "which model to use" is over.
x.com/mattshumer_...
Claire praises this model for "more human speaking" and tool calls, but says it's still bad at frontend
x.com/clairevo/st...
Now for some community reactions, though take these with a grain of salt, as there's a bit of a selection bias (OpenAI prefers folks who talk positively about the models)
Bernardo Aceituno, co-founder @stackai said this is the first model to pass their benchmark
x.com/BernAceitun...
while Claude opened the website, looked at it, and suggested an actual fix that can help.
Is it perfect? Nah. We tested it out live on ThursdAI, and it still shows a bit of an "I'll do EXACTLY what you say" mentality, like Codex 5.3. I asked it for 1 thing to improve on my website, and it told me I need to add a <main> element,
One of the most exciting things about this is not even the model itself, it's the ability to steer its thinking mid-thought. I don't think this is talked about enough. Interruption! You can steer this model mid-thinking in the ChatGPT (and soon iOS) interfaces
More eval goodness: this model SLAPS at math. On the @EpochAIResearch Frontier AI bench, GPT 5.4 solved a math problem no model has been able to solve before, as shown by @nasqret
x.com/nasqret/sta...
However, if we zoom in, we see that it's better at lower thinking thresholds on the same tasks as Codex! Here on SWE-bench Pro, you can see that the medium-thinking score is the same, while without thinking, the new model gets 47% accuracy!
Now for some benchmarks and evals. If you look at coding, this model does not show a significant jump over the code-dedicated Codex 5.3, but for a generalized model, it's absolutely crushing coding tasks while also being SOTA on a bunch of economic tasks, specifically GDPval
While the pricing is for the API, Codex app users can turn on the /fast mode in settings (and CLI users can add it to their ~/.cursor/config.toml file) and enjoy a 1.5x speedup at 2x the token burn!
Beast mode: 1M context in FAST mode!
x.com/providerpro...
First, I think for most Codex users, upgrading from Codex 5.3 is a no-brainer, specifically as this model has a 1M context window and is cheaper than Sonnet 4.6.
Pricing (up to 272K tokens): Input $2.50 / Output $15.00 per 1M tokens (cached tokens: 90% discount)
Pricing (over 272K tokens): Input $5.00 / Output $22.50 per 1M tokens
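To get a rough feel for what those tiers mean in practice, here's a minimal Python sketch. `gpt54_cost_usd` is a hypothetical helper, not anything from OpenAI's SDK; it assumes the higher tier applies to the whole request once the prompt exceeds 272K tokens (the exact tier semantics aren't spelled out here) and ignores the cached-token discount.

```python
def gpt54_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Rough API cost estimate based on the tiered pricing above.

    Assumption (not confirmed): the >272K rate applies to the entire
    request once the prompt crosses the threshold, and cached-token
    discounts are ignored.
    """
    if input_tokens <= 272_000:
        in_rate, out_rate = 2.50, 15.00   # $ per 1M tokens
    else:
        in_rate, out_rate = 5.00, 22.50
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

# A 100K-token prompt with a 5K-token answer:
print(gpt54_cost_usd(100_000, 5_000))   # → 0.325

# A 300K-token prompt (over the threshold) with a 10K-token answer:
print(gpt54_cost_usd(300_000, 10_000))  # → 1.725
```

So a typical long-ish coding session stays well under a dollar per request, and even a near-max-context call is only a couple of dollars before cache discounts.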
Today, OpenAI launched their "best model yet", GPT 5.4 Thinking (and 5.4 Pro). After 5.3 Codex just at the beginning of Feb, OpenAI has once again pulled the coding advances back into the more generalized model
Pricing, new capabilities, and vibechecks in thread!
You can also just listen to the show right here:
thursdai.news/feb-26
And finally, @philipkiely was there with his first book! "Inference is everything," as Philip said! Inference Engineering is available as a free PDF and as a gorgeous physical book (that I also just got in the mail!)
Don't miss these interviews!
thursdai.news/guests/phil...
@bencera_ straight up gave us singularity vibes: dude crossed $700K ARR LIVE on the show
thursdai.news/guests/benc...
x.com/altryne/sta...
Chatting with @dabit3 was amazing; I've been following his career forever.
@nisten even said watching one of Nader's vids changed his whole career path!
Nader just joined @cognition and walked us through why
thursdai.news/guests/dabit3
You can find the edited version of our live show, show notes and links on our brand new (totally not vibecoded) website here:
thursdai.news/ep/feb-26-2026
3 years doing this weekly and I've never felt closer to the singularity than right now
Everyone's shipping async AI agents. Everything's converging.
This week we covered Anthropic vs DoD, had 3 incredible interviews with @dabit3 @philipkiely and @bencera_ and way more!
Finally, we live-reacted to the drop of Cerebras-powered Codex and Gemini beating ARC-AGI, and debated the AI psychosis which makes it hard for us to sleep!
Check out the full episode youtu.be/wQb4JK5xKMw
Then a chat with Olive revealed how the heck they get close to Opus on SWE-bench Verified with only 10B active parameters!
This was a packed show! Open-source LLMs are catching up: @louszbd from @Zai_org told us the new GLM 5 is built for agentic architecture and is bigger, better, faster, stronger.