New Paper: Towards a science of AI agent reliability
Quantifying the capability-reliability gap
A new paper by @sayash.bsky.social and @randomwalker.bsky.social examines what “reliability” means in an AI context. They propose consistency, robustness, calibration, and safety, and they define these in operationally useful ways. A worthy read! www.normaltech.ai/p/new-paper-...
03.03.2026 17:41
👍 3
🔁 2
💬 0
📌 1
Panel 1: Text: “Imagine an alternate universe in which people don’t have words for different forms of transportation, only the collective noun ‘vehicle’.” Illustration: a stick figure stands next to a much more detailed motorcycle, with a speech bubble saying “Woah! Sweet vehicle!”
Panel 2: Text: “They use that word to refer to: cars, buses, bikes, spacecraft, and all other ways of getting from place A to place B.” Illustration: A car, a school bus, a bicycle, and a space shuttle, all with a stamp that says “vehicle” on them.
Panel 1: Text: “Conversations in this world are confusing.” Illustration: A speech bubble coming from the left: “Can you drive a vehicle?”, A speech bubble coming from the right: “Definitely!”, An illustration of a car crashed into a tree. Left speech bubble: “I thought you said you could drive!” Right speech bubble: “I can! I’m just used to ones with two wheels!”
Panel 2: Text: “There are furious debates about whether or not vehicles are environmentally friendly… even though no one realizes that one side is talking about bikes and the other is talking about trucks.” Illustration: Left speech bubble: “Vehicles produce so much pollution!” Right speech bubble: “That’s an exaggeration! They are actually very green!”
Panel 1: Text: “There is a breakthrough in rocketry, but the media focuses on how vehicles have gotten faster so people call their car (‘car’ is crossed out and replaced with ‘vehicle’) dealer to ask when faster models will be available.” Illustration: A TV news report with a picture of a rocket ship and a chyron saying “Breaking: Vehicles reach 1000 mph!”. Below that is a drawing of two stick figures talking at a car dealership. One says, “So I can take this to space, right?”
Panel 2: Text: “Meanwhile, fraudsters have capitalized on the fact that consumers don’t know what to believe when it comes to vehicle technology, so scams are rampant in the vehicle sector.” Illustration: A stick figure with a mean smile and a sparkle next to his eye pats a car that has plane wings taped to it. A speech bubble says, “Oh yeah! You can fly this baby across the ocean!”
Panel 1: Text: “Now replace the word “vehicle” with “artificial intelligence” and we have a pretty good descriptor of the world we live in.” Illustration: One crowd of people say “AI is bad for the environment!” Underneath them is a large box that is labeled “Size of AI people are concerned about”, another crowd of people says “AI is used for climate research!” Underneath them is a much smaller box saying “Size of AI used for climate research”. In the foreground there is a person watching the debate with several question marks above it.
Panel 2: Credits. “Text from AI Snake Oil: What Artificial Intelligence Can Do, What It Can’t, and How to Tell the Difference by Arvind Narayanan and Sayash Kapoor. Art by Ayla Taylor. www.aylataylor.com”
A silly little comic based on the opening section of AI Snake Oil by @randomwalker.bsky.social and @sayash.bsky.social.
03.03.2026 03:14
👍 13
🔁 5
💬 1
📌 0
Dr. Arvind Narayanan (@randomwalker.bsky.social ) will be joining us as a Keynote Speaker at the Conference on Society-Centered AI 2026! Join us Feb 12-14 at Duke for industry and academic keynotes, research spotlight talks, and poster sessions. Learn more and Register here: sites.duke.edu/scai/
07.01.2026 15:00
👍 5
🔁 2
💬 0
📌 1
After science
Twenty-five years ago, Ted Chiang wrote a prescient science fiction short that began: “It has been 25 years since a report of original research was last submitted to our editors for publication, makin...
New piece w/ James Evans in Science explores what we call 'science after science', an era where our ability to control nature may exceed our ability to understand it; a new struggle to sustain curiosity & understanding under AI's predictive dominance. #ai #science
www.science.org/doi/10.1126/...
14.11.2025 18:23
👍 25
🔁 10
💬 0
📌 0
Three schematic diagrams. The first illustrates selective publishing of internal research, the second selective causal focus, and the third selective access and funding for researchers.
1. We ( @jbakcoleman.bsky.social, @cailinmeister.bsky.social, @jevinwest.bsky.social, and I) have a new preprint up on the arXiv.
There we explore how social media companies and other online information technology firms are able to manipulate scientific research about the effects of their products.
24.10.2025 00:47
👍 758
🔁 356
💬 16
📌 21
AI as Normal Technology
A new paper that we will expand into our next book
The "normal" framing is so key — @randomwalker.bsky.social gave us all such a useful way of articulating a valuable idea in this discussion. www.normaltech.ai/p/ai-as-norm...
17.10.2025 05:00
👍 201
🔁 26
💬 5
📌 2
Invitation to Apply: Princeton AI Policy Precepts in Washington, DC
📢📢 Call for federal employees for the AI Precepts in Washington, DC. Learn from experts Arvind Narayanan (@randomwalker.bsky.social), Mihir Kshirsagar, Peter Henderson (@peterhenderson.bsky.social), & Sayash Kapoor (@sayash.bsky.social). Deadline to apply: Fri, Oct. 3
mailchi.mp/princeton.ed...
26.09.2025 15:42
👍 1
🔁 1
💬 1
📌 0
Paperback cover of AI Snake Oil: What Artificial Intelligence Can Do, What It Can’t, and How to Tell the Difference by Arvind Narayanan and Sayash Kapoor
From two of TIME’s 100 Most Influential People in AI, what you need to know about #AI—and how to defend yourself against bogus AI claims and products.
AI Snake Oil by @randomwalker.bsky.social and @sayash.bsky.social is now available in #paperback: press.princeton.edu/books/paperb...
23.09.2025 15:07
👍 9
🔁 3
💬 1
📌 0
Our #podcast series on Harry Frankfurt’s seminal work, On Bullshit continues with Arvind Narayanan who explores the subject of bullshit in #AI.
press.princeton.edu/ideas/the-tr...
@randomwalker.bsky.social @newbooksnetwork.bsky.social @calebzakarin.bsky.social
08.09.2025 23:09
👍 13
🔁 4
💬 0
📌 0
📣 Prof Arvind Narayanan (@randomwalker.bsky.social) is hiring a Princeton University undergrad for a Video Editor & Production Assistant this semester to help with his brand new YouTube channel @ArvindOnAI
Getting started right away so feel free to comment + share with students! Link to apply 👇
03.09.2025 15:11
👍 1
🔁 3
💬 1
📌 0
Podcast Episode: Separating AI Hope from AI Hype
If you believe the hype, artificial intelligence will soon take all our jobs, or solve all our problems, or destroy all boundaries between reality and lies, or help us live forever, or take over the w...
Even superintelligent AI cannot simply replace humans for most of what we do, nor can it perfect or ruin our world unless we let it, AI Snake Oil’s @randomwalker.bsky.social tells EFF’s Cindy Cohn and @thejasonkelley.com on the new “How to Fix the Internet.”
19.08.2025 17:17
👍 31
🔁 14
💬 1
📌 0
Podcast Episode: Separating AI Hope from AI Hype
If you believe the hype, artificial intelligence will soon take all our jobs, or solve all our problems, or destroy all boundaries between reality and lies, or help us live forever, or take over the w...
Great @eff.org podcast with @randomwalker.bsky.social, touching on his AI as Normal Technology paper w/ @sayash.bsky.social for our @knightcolumbia.org AI & Democratic Freedoms project. Short 🧵 of a few other papers related to this podcast discussion:
www.eff.org/deeplinks/20...
15.08.2025 17:30
👍 17
🔁 7
💬 1
📌 0
Advancing science- and evidence-based AI policy
Policy must be informed by, but also facilitate the generation of, scientific evidence
What kind of AI governance do we need? Our new piece in @science.org answers this: we need policy grounded in evidence and built to generate more of it. Evidence-based policymaking is not a slogan—it’s a design challenge for democratic governance in the age of AI www.science.org/doi/10.1126/... 🧵
31.07.2025 23:27
👍 129
🔁 55
💬 7
📌 3
Note that the data collection ended right before ChatGPT was released, so my guess is that the percentages are no longer small.
18.07.2025 01:00
👍 3
🔁 0
💬 0
📌 0
Could AI slow science?
Confronting the production-progress paradox
Fabulous post by @randomwalker.bsky.social & Sayash raising the same concern many of us have about whether we're on the right track with how we're using AI for science. Everyone should read it, take a deep breath & think through the implications.
www.aisnakeoil.com/p/could-ai-s...
17.07.2025 16:05
👍 158
🔁 69
💬 7
📌 11
Understanding Social Media Recommendation Algorithms
I’m reading a very well written 2023 paper on social media recommender systems from @randomwalker.bsky.social. I had completely forgotten that in the 00s “neither Facebook nor Twitter had the ability to reshare or retweet posts in your feed.” What a huge shift!
knightcolumbia.org/content/unde...
10.07.2025 15:51
👍 2
🔁 3
💬 0
📌 0
We’re hiring at Princeton on AI and society, working with Arvind Narayanan or me depending on fit.
I think current AI developments are all a huge deal but am very unexcited by current state of the AGI and/or AI safety discourse.
Please share as you see fit.
puwebp.princeton.edu/AcadHire/app...
20.06.2025 12:36
👍 83
🔁 51
💬 2
📌 1
After consideration, I will post occasionally, but heavily censor what I share compared to other sites.
I tried making the transition, but talking about AI here is just really fraught in ways that are tough to mitigate & make it hard to have good discussions (the point of social!). Maybe it changes
26.05.2025 04:25
👍 423
🔁 25
💬 75
📌 33
Two Paths for A.I.
The technology is complicated, but our choices are simple: we can remain passive, or assert control.
For @newyorker.com, Joshua Rothman spoke with @randomwalker.bsky.social and @sayash.bsky.social, authors of AI Snake Oil and a recently published paper “AI as Normal Technology”, which argues that practical obstacles will slow AI’s uses and potential: www.newyorker.com/culture/open...
28.05.2025 18:06
👍 17
🔁 3
💬 1
📌 3
"A hypothesis on the accelerating decline of reading:
* Broadly speaking, people read for pleasure/entertainment and for learning/obtaining information.
* Reading for pleasure has been declining for a while and is being replaced by videos (very sharply among young people). This trend will surely continue.
* Reading for obtaining information is getting intermediated by chatbots. We are in the very early stages of this shift, so I think people underappreciate the magnitude of what's coming. It's not just that AI is replacing traditional web search. Even when it comes to reading news articles, business documents, or scientific papers, the vision that tech companies are pushing on us is AI summarization + synthesis + Q&A.
* We don't have to accept this, but I predict that most people will. It's a tradeoff between speed/convenience and accuracy/depth of understanding — the same tradeoff that was once offered to us when it became possible to search the web to look up a quick fact as opposed to reading about the topic in depth in an encyclopedia.
* Just as most people in most cases prefer a shallow web search over deeper reading, most people in most cases will prefer AI-intermediated access to knowledge. Traditional reading won't disappear, but people will do it vastly less often, except in hobbyist reading communities and professions where traditional reading is needed.
* The decline of reading-for-pleasure (due to video) and reading-for-information (due to AI) will accelerate each other, as reading text without an intermediary will come to be seen as a chore.
* Personally, I find this sad. But while it's tempting to moralize all this, I think that's unproductive. Yelling at individuals to resist new media has been done for centuries and has never worked.
* Even if people individually rationally choose these tradeoffs, I think we collectively lose something; critical reading skills are arguably essential for a democracy. We need to figure out what to do about that."
clear, depressing set of observations from @randomwalker.bsky.social - "The decline of reading-for-pleasure (due to video) and reading-for-information (due to AI) will accelerate each other, as reading text without an intermediary will come to be seen as a chore."
22.05.2025 14:31
👍 333
🔁 107
💬 14
📌 19
Moving towards informative and actionable social media research
Social media is nearly ubiquitous in modern life, and concerns have been raised about its putative societal impacts, ranging from undermining mental health and exacerbating polarization to fomenting v...
New preprint with @jbakcoleman.bsky.social @lewan.bsky.social @randomwalker.bsky.social @orbenamy.bsky.social @lfoswaldo.bsky.social where we argue for a complex-system perspective to understand the causal effects of social media on society and for a triangulation of methods
arxiv.org/abs/2505.09254
15.05.2025 06:31
👍 76
🔁 28
💬 2
📌 3
I'm excited that I can finally share what I've been working on for the past 9 months:
The United Nations 2025 Human Development Report: "A matter of choice: People and possibilities in the age of AI" 🧵
hdr.undp.org/content/huma...
06.05.2025 09:03
👍 109
🔁 28
💬 6
📌 5
AGI is not a milestone
There is no capability threshold that will lead to sudden impacts
“AGI is not a milestone because it is not actionable. A company declaring it has achieved, or is about to achieve, AGI has no implications for how businesses should plan, what safety interventions we need, or how policymakers should react.”
@randomwalker.bsky.social
open.substack.com/pub/aisnakeo...
01.05.2025 11:59
👍 6
🔁 1
💬 1
📌 0
Okay just started @randomwalker.bsky.social and @sayash.bsky.social's new essay and this is 🔥🔥🔥.
"Resilience as the overarching approach to catastrophic risk" -- yes thank you exactly this.
kfai-documents.s3.amazonaws.com/documents/c3...
24.04.2025 20:41
👍 13
🔁 2
💬 1
📌 0
text says "ML Reproducibility Challenge Princeton University, New Jersey, USA, August 21 2025"
We are hosting @reproml.org 2025 on Aug. 21. There will be invited talks, oral presentations, and poster sessions. Keynote speakers include @randomwalker.bsky.social, @soumithchintala.bsky.social, @jfrankle.com, @jessedodge.bsky.social, @stellaathena.bsky.social
Register now: bit.ly/4cP8vIq
24.04.2025 18:57
👍 1
🔁 1
💬 0
📌 0
In this clip from our event last week, @randomwalker.bsky.social describes how we can map out the landscape of AI along two dimensions: how well the AI tool works, and how harmful (or benign) it is.
Watch a full recording of the event: youtu.be/C3TqcUEFR58
24.04.2025 15:21
👍 12
🔁 4
💬 1
📌 1
AI as Normal Technology
A new paper that we will expand into our next book
IMO, the most important piece on AI of the last 6 months and I recommend it to everyone. A genuinely careful consideration of the technology and its intersections with culture and labor from @randomwalker.bsky.social and @sayash.bsky.social Authors of AI Snake Oil substack.com/home/post/p-...
19.04.2025 12:37
👍 102
🔁 26
💬 4
📌 3