Use DuckDB for SQL: duckdb.org
RIP DHS. We expected this, but it’s still a shock.
www.nytimes.com/2025/02/26/h...
Annual budget of USAID is about $50B. Annual budget of USA is about $7T. Eliminating USAID does not save anything in this context.
Causal inference methods for treatment effect estimation usually assume independent units. However, this assumption is often questionable because units may interact, resulting in spillover effects between them. We develop augmented inverse probability weighting (AIPW) for estimation and inference of the expected average treatment effect (EATE) with observational data from a single (social) network with spillover effects. In contrast to overall effects such as the global average treatment effect (GATE), the EATE measures, in expectation and on average over all units, how the outcome of a unit is causally affected by its own treatment, marginalizing over the spillover effects from other units. We develop cross-fitting theory with plugin machine learning to obtain a semiparametric treatment effect estimator that converges at the parametric rate and asymptotically follows a Gaussian distribution. The asymptotics are developed using the dependency graph rather than the network graph, which makes explicit that we allow for spillover effects beyond immediate neighbors in the network. We apply our AIPW method to the Swiss StudentLife Study data to investigate the effect of hours spent studying on exam performance accounting for the students' social network.
"Treatment Effect Estimation with Observational Network Data using Machine Learning"
Arxiv: arxiv.org/abs/2206.14591
#rstats code: github.com/corinne-rahe...
#stats
A red squirrel poses for a photo
Welcome to our Crib
Huge congratulations to my Florida State Univ Population Center colleagues Mike McFarland and Matt Hauer (@drdemography.bsky.social) on this important and very newsworthy (!!) paper on the impact of leaded gasoline on US public health. #demography
acamh.onlinelibrary.wiley.com/doi/abs/10.1...
www.statnews.com/2023/03/13/m...
After I moved to Canada a couple of years ago I realized that I was no longer constantly running a massive stress routine in the background of my mind worrying about health care and guns. It was weirdly noticeable only when it stopped.
@ucberkeleyofficial.bsky.social is accepting applications for Fall 2025 for the PhD program in Demography AND the Graduate Group in Sociology & Demography. Seeking a diverse and strong cohort; applications DUE 12/17/2024.
Learn more about the program:
www.demog.berkeley.edu/graduate-pro...
Falling #fertility across the world will lead to significant changes in countries' age pyramids. By 2100, when today's newborns are in their 70s, they (or their elders!) will be the largest age group in many countries.
#demography
#rstats code: github.com/schmert/bone...
Looks like it might be time to reiterate what psychologists have been screaming from the rooftops for years: learning styles, as presented to the general public, are a myth, and the idea damages students’ sense of efficacy www.nature.com/articles/s41...
Scientists, academics, researchers: We’re excited to share that @altmetric.com is now tracking mentions of your research on Bluesky! 🧪
Saloni, Edouard, and Lucas wrote up the history of Our World in Data during the COVID pandemic.
It's about the impact we hoped to achieve and how it felt to us during that time.
ourworldindata.org/owid-covid-h...
Is there an equivalent graphic for water fluoridation and tooth decay?
Book outline
Over the past decade, embeddings — numerical representations of machine learning features used as input to deep learning models — have become a foundational data structure in industrial machine learning systems. TF-IDF, PCA, and one-hot encoding have always been key tools in machine learning systems as ways to compress and make sense of large amounts of textual data. However, traditional approaches were limited in the amount of context they could reason about with increasing amounts of data. As the volume, velocity, and variety of data captured by modern applications has exploded, creating approaches specifically tailored to scale has become increasingly important. Google’s Word2Vec paper made an important step in moving from simple statistical representations to semantic meaning of words. The subsequent rise of the Transformer architecture and transfer learning, as well as the latest surge in generative methods has enabled the growth of embeddings as a foundational machine learning data structure. This survey paper aims to provide a deep dive into what embeddings are, their history, and usage patterns in industry.
Cover image
Just realized BlueSky allows sharing valuable stuff cause it doesn't punish links. 🤩
Let's start with "What are embeddings" by @vickiboykis.com
The book is a great summary of embeddings, from history to modern approaches.
The best part: it's free.
Link: vickiboykis.com/what_are_emb...
At @rOpenSci.hachyderm.io.ap.brid.gy we're pairing first-time code contributors with experienced maintainers. If you are an rOpenSci or other #RStats package author and want to help build the road for new contributors and get co-maintainers, sign up for co-working!
ropensci.org/blog/2024/10...
Important to understand that (1) political appointees, not the university administration, are doing this; (2) they're not cancelling courses, but removing them from the list of courses that satisfy breadth requirements (i.e. death by strangling rather than a knife to the back).
www.tallahassee.com/story/news/l...
AI for medical transcription - in this case Whisper sneaks in its own hallucinatory phrases
apnews.com/article/ai-a...
though i wish the AI did invent ‘hyperactivated antibiotics’ we are going to need them soon 😏
h/t @placentadoc.bsky.social
#MedSky
For the Thanksgiving break I will be in Guyana visiting one of our children who is working there for two years.
en.wikipedia.org/wiki/Guyana
Just set up an account for the openVA Team @openva.net where I will post things related to the group.
Backyard now!
CGD's very own starter pack... experts and staff former and present...
bsky.app/starter-pack...
Can anyone give lit tips for papers showing this qualitative age pattern of a mortality rate ratio (e.g. frail vs not, sick vs not, high SES vs low, in nursing home vs general pop, with disease vs without)?
SQL! SQL. JUST USE SQL
Some recent beautiful evenings
Reading Peter Turchin’s interesting and provocative books. This characterization of social science disciplines in ‘Ultrasociety’ is amusing: