Thanks to my friends at @datascienceweekly for featuring Probably Overthinking It ... now available in paperback!
Honestly, this is a weakness of my writing -- I don't do enough signposting. But contrary to what I actually do, I think there should be something at the beginning that presents the value proposition and something at the end that provides the big picture.
Probably Overthinking It is now available in paperback.
To celebrate, I'm publishing The Lost Chapter, which is about the strangest paradox in probability, the girl named Florida problem.
The key to the problem is Captain Chesley Sullenberger.
www.allendowney.com/blog/2025/12...
Modeling the SAT Math Gap with #PyMC
Why do male test takers score ~30 points higher? Ability or selection bias?
At PyData Boston 2025, @allendowney.bsky.social shows how Bayesian modeling untangles confounding to reveal what's real.
#BayesianModeling #DataScience
"I'm way closer to LeBron James than you are to me."
-- Brian Scalabrine
He's probably right, because in a lognormal distribution of ability, it's levels to this …
www.allendowney.com/blog/2025/11...
The #Scicloj group is initiating a new study group:
#DSP in #Clojure.
We will follow @allendowney.bsky.social's #ThinkDSP book and practice it in Clojure.
clojureverse.org/t/clojure-ds...
The newest chapter of Think Linear Algebra is up now!
It is about least squares regression, QR decomposition, and orthogonality:
allendowney.github.io/ThinkLinearA...
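For readers who want a taste of how those three ideas connect, here is a minimal sketch in Python with NumPy. It is not taken from the chapter; the function name `lstsq_qr` and the data are made up for illustration. It fits a line by least squares using a QR decomposition, then checks that the residual is orthogonal to the columns of the design matrix.

```python
import numpy as np

def lstsq_qr(A, b):
    """Solve min ||A x - b|| using the reduced QR decomposition (illustrative helper)."""
    Q, R = np.linalg.qr(A)               # A = Q R, Q has orthonormal columns
    return np.linalg.solve(R, Q.T @ b)   # solve the triangular system R x = Q^T b

# Hypothetical data: points that lie exactly on y = 1 + 2x
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 1 + 2 * x

# Design matrix with an intercept column and a slope column
A = np.column_stack([np.ones_like(x), x])

coef = lstsq_qr(A, y)       # intercept and slope
residual = y - A @ coef     # what the fitted line does not explain

print(coef)
print(A.T @ residual)       # orthogonality: close to zero for a least squares fit
```

The orthogonality check is the point: at the least squares solution, the residual has no component along any column of `A`.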
4. And if 5-year survival increases over time, it is tempting to conclude that treatment has improved.
In fact, none of these inferences are correct.
www.allendowney.com/blog/2025/10...
2. Looking at the difference in survival between early and late detection, it is tempting to conclude that more screening would save lives.
3. In a case where a patient is diagnosed late and dies of cancer, it is tempting to say that they would have survived if their cancer had been caught early.
Five-year survival rates might be the most misleading statistics in medicine.
Even smart people can make incorrect inferences. Here are the top four:
1. If a patient is diagnosed early, it is tempting to think the probability is 91% that they will survive five years after diagnosis.
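One mechanism behind misleading survival rates is lead-time bias. The toy sketch below uses made-up ages and a hypothetical helper, `survives_five_years`, and is not the analysis from the linked post. It shows that moving the diagnosis earlier can turn a five-year "death" into a five-year "survival" even when the age at death does not change at all.

```python
def survives_five_years(age_at_diagnosis, age_at_death):
    """Return True if the patient lives at least five years after diagnosis."""
    return age_at_death - age_at_diagnosis >= 5

# Hypothetical patient: the age at death is the same in both scenarios
age_at_death = 70

late = survives_five_years(67, age_at_death)    # diagnosed late, dies 3 years later
early = survives_five_years(63, age_at_death)   # diagnosed early, dies 7 years later

print(late, early)
```

In this toy case early detection improves the statistic without extending the patient's life, which is one reason a higher five-year survival rate does not by itself show that screening or treatment helped.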
And we've got the graduates to prove we were right.
The curriculum at Olin College is our answer to this question:
1) Engineering early and often,
2) Emphasize design, entrepreneurship, teamwork, and communication,
And my focus was on
3) Use computational tools before or instead of math
The effect of this error on engineering education is like the effect of the iceberg on the Titanic.
The original sin of the engineering curriculum is the Foundation Fallacy:
The assumption that math (especially calculus) and science (especially physics) are (1) the foundations of engineering, and therefore (2) the prerequisites of engineering education.
www.allendowney.com/blog/2025/10...
My new auxiliary emergency backup team is taking it down to the wire...
I just posted a new chapter of Think Linear Algebra.
It's about projection, rejection, rotation, and pool!
allendowney.github.io/ThinkLinearA...
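As a quick illustration of the first two terms (a sketch with made-up vectors and helper names, not code from the chapter): the projection of `v` onto `u` is the part of `v` that points along `u`, and the rejection is what is left over.

```python
import numpy as np

def project(v, u):
    """Component of v along u (illustrative helper)."""
    return (v @ u) / (u @ u) * u

def reject(v, u):
    """Component of v perpendicular to u (illustrative helper)."""
    return v - project(v, u)

# Hypothetical vectors
v = np.array([3.0, 4.0])
u = np.array([1.0, 0.0])

p = project(v, u)   # part of v along the x-axis
r = reject(v, u)    # part of v perpendicular to the x-axis

print(p, r)
print(r @ u)        # the rejection is orthogonal to u
```

The two pieces always add back up to the original vector, so `p + r` equals `v`.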
Yes, I had not made that distinction, and you are very right.
I'm not sure I would have bothered debunking it if I had realized.
Oh, no -- it gets worse! It looks like they also included missing data in the analysis, treating it as zero. That explains the black line in the figure.
I love a good Simpson's paradox. Sadly, this is not one of them.
www.allendowney.com/blog/2025/10...
In fact, I think the whole paper is nonsense.
Published in Nature, too.
I gave a talk about that chapter here: www.youtube.com/watch?v=44D1...
Thanks! It was a fun interview to record.
Now if only the playback had speed control :(
Sadly, my primary team (the Red Sox) and emergency backup team (the Padres) were both knocked out of the playoffs yesterday.
Now I am left to cheer for my team of last resort (the Notyankees).
Sometimes we can use Bayesian methods to infer the effect of selection bias and produce an unbiased estimate.
Here's an example that uses PyMC to solve a classic probability puzzle (the image shows what I think is the original version).
www.allendowney.com/blog/2025/09...
At this point I'm just barely making enough $ on this cohort to cover the platform fees.
It's going to run anyway but I'd really love to have a few more people signed up.
Use the code SIXTY for 60% off at registration.
I have published five new chapters of Think Linear Algebra!
Read about the project here
allendowney.com/blog/2025/09...
Or jump straight to the book
allendowney.github.io/ThinkLinearA...
And now… Asteroids!
On September 3 I'm giving a talk for the Boston Python User Group, called "A Future of Data Science"
www.meetup.com/bostonpython...
This is a talk from posit::conf last year, updated with new data and the experience of an interesting year.
Between 2021 and 2024, marijuana was legalized in eight states totaling 18% of the US population. During this time, adult use increased and youth use was unchanged.
Data from NSDUH 2024.
As a graduate of an all-boys high school, I am very interested to see the results...
For anyone who likes LLMs and daytime game shows.
Some news articles suggest young men are conservative Republicans with sexist attitudes.
But big picture, young men's views are pretty much on trend.
Here's the data: allendowney.substack.com/p/are-young-...