Albert Varela

@albertvarela

Lecturer in Quantitative Methods - School of Sociology and Social Policy, University of Leeds. 
Interested in measurement and analysis of job quality, poverty and social mobility.

613
Followers
864
Following
11
Posts
04.07.2023
Joined

Latest posts by Albert Varela @albertvarela

NEW: Haowen Zheng, Robert Andersen, Anders Holm, Kristian Bernt Karlson, "Is College Really “the” Equalizer? New Evidence Addressing Unobserved Selection." sociologicalscience.com/articles-v13...

03.03.2026 17:45 👍 17 🔁 8 💬 1 📌 2
Statistical Rethinking Lecture A09 - Modeling Events
YouTube video by Richard McElreath

Lecture A09 - I get mildly ranty about the low quality of papers on discrimination and somehow also introduce generalized linear models for events and illustrate post-stratification. The theme continues next week with modeling sensitivity to unmeasured confounding. I will try to be less ranty.

03.03.2026 10:26 👍 50 🔁 9 💬 2 📌 0
A slacking occupational structure in Britain? Reflections on a recent column in the Financial Times

A slacking occupational structure in Britain? Reflections on my substack on a piece in the @financialtimes.com by John Burn-Murdoch. And what does this piece imply for the debate on overeducation?
@data.ft.com hermwerf.substack.com/p/a-slacking...

21.02.2026 13:29 👍 6 🔁 4 💬 1 📌 0

Problem about the loneliness epidemic is, it's everywhere except in representative survey data. Let's look at where the claim comes from. 1/

17.02.2026 07:13 👍 597 🔁 228 💬 21 📌 35
Simulated null distribution for data with a sample size of 100, difference in group means of 5, and a p-value of 0.142

Simulated null distribution of a slope of 0.8 and p-value of 0.002

Finally, we have to decide if the p-value meets an evidentiary standard or threshold that would provide us with enough evidence that we aren’t in the null world (or, in more statsy terms, enough evidence to reject the null hypothesis).

There are lots of possible thresholds. By convention, most people use a threshold (often shortened to α) of 0.05, or 5%. But that’s not required! You could have a lower standard with an α of 0.1 (10%), or a higher standard with an α of 0.01 (1%).

Statistically significant
The p-value is < 0.001 and our threshold for α is 0.05

In a world where there is no relationship between x and y, the probability of seeing a slope of at least 0.901 is < 0.1%

Since < 0.001 is less than 0.05, we have enough evidence to say that the slope is statistically significant.
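
To make the decision step concrete, here is a minimal Python sketch of a simulated null world for a difference in means. It is not the code behind the linked site, which uses OJS and R. The data, the function name `permutation_p_value`, the seeds, and the two-sided comparison are all assumptions made for illustration.

```python
import random


def mean(values):
    """Plain arithmetic mean."""
    return sum(values) / len(values)


def permutation_p_value(group_a, group_b, n_shuffles=5000, seed=1234):
    """Return the observed difference in means and a two-sided p-value.

    The p-value is the share of shuffled (null-world) differences that are
    at least as large in absolute value as the observed difference.
    """
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_shuffles):
        # Shuffling breaks any link between group label and outcome.
        rng.shuffle(pooled)
        shuffled_diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(shuffled_diff) >= abs(observed):
            extreme += 1
    return observed, extreme / n_shuffles


# Hypothetical data: two groups of 50 (total sample size of 100).
data_rng = random.Random(42)
group_a = [data_rng.gauss(55, 10) for _ in range(50)]
group_b = [data_rng.gauss(50, 10) for _ in range(50)]

alpha = 0.05  # the conventional threshold; 0.1 or 0.01 would work the same way
observed, p_value = permutation_p_value(group_a, group_b)
significant = p_value < alpha

print(f"difference = {observed:.2f}, p = {p_value:.3f}, significant = {significant}")
```

Changing `alpha` changes only the final comparison. The p-value itself depends on the data and the simulated null world, not on the threshold.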

Evidentiary standards

When thinking about p-values and thresholds, I like to imagine myself as a judge or a member of a jury. Many legal systems around the world have formal evidentiary thresholds or standards of proof. If prosecutors provide evidence that meets a threshold (i.e. goes beyond a reasonable doubt, or shows evidence on a balance of probabilities), the judge or jury can rule guilty. If there’s not enough evidence to clear the standard or threshold, the judge or jury has to rule not guilty.

With p-values:

If the probability of seeing an effect or difference (or δ) in a null world is less than 5% (or whatever the threshold is), we rule it statistically significant and say that the difference does not fit in that world. We’re pretty confident that it’s not zero.
If the p-value is larger than the threshold, we do not have enough evidence to claim that δ doesn’t come from a world where there’s no difference. We don’t know if it’s not zero.
Importantly, if the difference is not significant, that does not mean that there is no difference. It just means that we can’t detect one if there is. If a prosecutor doesn’t provide sufficient evidence to clear a standard or threshold, it does not mean that the defendant didn’t do whatever they’re charged with†—it means that the judge or jury can’t detect guilt.
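
The last point, that "not significant" is not the same as "no difference", can be checked by simulation. The sketch below is an illustration under invented assumptions: a true difference of 3, a standard deviation of 10, and a normal approximation for the p-value instead of the shuffling approach. The function names are hypothetical.

```python
import random
from statistics import NormalDist, mean, variance


def two_sample_p_value(a, b):
    """Two-sided p-value from a normal approximation to the difference in means."""
    standard_error = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    z = (mean(a) - mean(b)) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


def share_not_significant(true_difference, n_per_group, alpha=0.05,
                          n_studies=1000, seed=2026):
    """Share of simulated studies that fail to clear the threshold
    even though the true difference is not zero."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(n_studies):
        a = [rng.gauss(true_difference, 10) for _ in range(n_per_group)]
        b = [rng.gauss(0, 10) for _ in range(n_per_group)]
        if two_sample_p_value(a, b) >= alpha:
            misses += 1
    return misses / n_studies


# Same real difference, different amounts of evidence.
small_sample_misses = share_not_significant(true_difference=3, n_per_group=20)
large_sample_misses = share_not_significant(true_difference=3, n_per_group=200)

print(f"n = 20 per group:  {small_sample_misses:.0%} of studies not significant")
print(f"n = 200 per group: {large_sample_misses:.0%} of studies not significant")
```

In both settings the difference is real by construction. The small studies usually cannot detect it, which is the jury failing to find enough evidence rather than the defendant being innocent.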

I just whipped up this little #QuartoPub site last week that demonstrates how I teach p-values/hyp-testing through simulation both with live OJS and with #rstats, and I think it's super neat! It has examples for diff-in-means, diff-in-props, and regression slopes nullworlds.andrewheiss.com #statsky

11.02.2026 21:14 👍 139 🔁 26 💬 3 📌 5
Statistical Rethinking 2026 Lecture B04 - Group-level confounding and intro to social networks
YouTube video by Richard McElreath

This time I try to explain group-level confounding and some ways to deal with it. Lecture B04 of Statistical Rethinking 2026 - fixed effects, Mundlak machines, latent Mundlak machines, intro to social network analysis and the social relations model. Full lecture list: github.com/rmcelreath/s...

30.01.2026 13:27 👍 102 🔁 16 💬 6 📌 4

As a statistical educator, it had not occurred to me that I need to caution students against regressing a variable on a function of itself. My naivete is unbounded.

The Peri & Sparber paper (linked below) looks really good! It has synthetic data analyses and everything.

08.01.2026 09:16 👍 57 🔁 7 💬 2 📌 0

And we're live, Lecture A1 is online. Introduction to Bayesian workflow, generative models, estimands, estimators, estimates, error checking, beginnings of probability theory and Bayesian updating. www.youtube.com/watch?v=ztbY...

06.01.2026 11:02 👍 199 🔁 64 💬 2 📌 5

Here are my favourite 2025 papers on climate policy/politics (listed in no particular order).
1. Ascari, Guido, Andrea Colciago, Timo Haber, and Stefan Wöhrmüller. 2025. ‘Inequality along the European Green Transition’. Economic Journal.
doi.org/10.1093/ej/u...

30.12.2025 15:19 👍 34 🔁 8 💬 1 📌 3
Public understanding and attitudes to irregular migration in the UK - I-CLAIM
This report summarises findings from the February 2025 I-CLAIM survey of 1,147 UK adults, exploring public knowledge of irregular migration, how people define it, and their attitudes toward irregular ...

What do Britons know and think about #irregularmigration?
Read new @iclaimeu.bsky.social report out today

i-claim.eu/project/publ...

11.12.2025 07:35 👍 24 🔁 15 💬 3 📌 5
British Politics' Midlife Crisis
Why British Parties Can't Make Peace with Their Actual Voters

On the morning of Keir Starmer's conference speech here's a new post on an odd psychopathology in British politics - our main parties don't like the people who vote for them - the dreaded Professional Managerial Class. And so they are acting out like a divorced dad seeking cooler voters. 1/n

30.09.2025 06:40 👍 1185 🔁 472 💬 21 📌 163

Young researchers in social policy, submit!

05.12.2025 08:19 👍 3 🔁 3 💬 0 📌 0
John Fox: Books and Software

#rstats
It is with profound sadness that I heard that my long-time friend and colleague, John Fox, passed away this week.
He was the author of {car}, {effects}, {Rcmdr}, ... and numerous influential books. I will miss him greatly.
www.john-fox.ca

28.11.2025 15:26 👍 202 🔁 66 💬 12 📌 10
The ‘Danish model’ is the darling of centre-left parties like Labour. The problem is, it doesn’t even work in Denmark | Cas Mudde
This week’s local elections are the latest reminder that when social democrats move rightwards, they’re making a mistake, says academic and author Cas Mudde

After more than 10 years of “the Danish Model”, nativism is hegemonic in the country, the far right polls near record highs again, and the Social Democrats lost Copenhagen and poll at a historic low.

European Social Democrats should look at the facts, not the myths!

Me in @theguardian.com

22.11.2025 13:59 👍 642 🔁 259 💬 14 📌 30
Visas are a key tool for states to regulate incoming mobility from abroad, which can have ramifications for the
establishment and perpetuation of global inequalities. In this article, we systematically analyze visa appointment
wait times in German embassies and consulates worldwide. Using computational methods, we collect—and
publish—fine-grained longitudinal data on the closest available appointment dates for various visa types,
covering a total of 16,182 visa appointment requests. Our analysis reveals strong and systematic variance: the
poorer the country a diplomatic mission is based in, the longer the wait time and the lower the chances of finding
an available appointment (which ranges from almost 0 to 100 percent). We also argue that Germany’s system is
quite opaque compared to other established immigration countries such as the U.S. These core findings raise
important questions in light of current debates about global justice, legal pathways to migration, and efforts to
attract foreign talent.

Graph that shows that 44.1 percent of requests did not lead to an appointment that could be selected. For the 55.9 percent where an appointment was available the distribution of wait times follows a steep curve with short wait times in many cases and a long tail of few cases with very long wait times of up to 98 days.

The average wait times and chances to find an appointment varied a lot between Germany's diplomatic missions. The latter range from almost 0 to 100 percent.

This variance is not random. Rather, economic wellbeing (GDP per capita) is a key predictor of wait times and chances of finding an appointment. The poorer the country a German embassy/consulate is based in, the longer the wait time and the lower the chances of finding an appointment.

New #openaccess study

We made >16,000 visa appointment requests at German embassies and consulates worldwide

Key finding: The poorer the country, the longer the wait time and the lower the chance to get an appointment.

"A time penalty for the Global South?"
shorturl.at/ZiAFb

19.11.2025 10:30 👍 46 🔁 20 💬 3 📌 3

“Little boxes” and “Coat of many colors”

11.11.2025 18:08 👍 2 🔁 0 💬 0 📌 0

The electoral outcome most strongly linked to deprivation is not any party’s vote share, but turnout. Across almost all indicators, turnout is markedly lower in more deprived areas, with only barriers to housing & services and quality in the living environment showing weaker correlations.

03.11.2025 08:38 👍 31 🔁 14 💬 2 📌 0
The Statistical Advantages of Multilevel Analysis of Individual Heterogeneity and Discriminatory Accuracy for Estimating Intersectional Inequalities - George Leckie, Andrew Bell, Juan Merlo, SV Subram...
Multilevel Analysis of Individual Heterogeneity and Discriminatory Accuracy (MAIHDA) is a multilevel regression approach grounded in intersectionality theory. I...

Pleased to see this out in print - detailing MAIHDA's desirable statistical properties.

"MAIHDA is especially valuable when inequalities are subtle or data for marginalised intersections are sparse - conditions common in practice"

journals.sagepub.com/doi/10.1177/...

@clarerevans.bsky.social

22.10.2025 16:00 👍 21 🔁 8 💬 0 📌 1
Work in the Digital Era: How Technology is Transforming Work and Occupations
This report provides a comprehensive analysis of the impact of digital technologies on work and occupations in Europe, critically reassessing dominant narratives of mass unemployment and job polarisat...

We just published a new report synthesizing more than 7 years of research on the impact of digital technologies on employment in Europe carried out with my team in the JRC. Lots of evidence and ideas for discussion! #EconSky #sociology
@sergiotorrejon.com @lauranurski.bsky.social

12.09.2025 17:03 👍 14 🔁 9 💬 1 📌 0

Ever asked yourself how to detect and extract social groups from texts with computational social science? @haukelicht.bsky.social and I have a solution for you out at @bjpols.bsky.social. You can also find the pre-trained models on huggingface!

01.09.2025 15:46 👍 94 🔁 32 💬 2 📌 1
Abstract

What do unions do? On average they make the members about $870k more wealthy over time, new findings at Social Forces show.

01.09.2025 18:20 👍 48 🔁 15 💬 1 📌 2
Models as Prediction Machines: How to Convert Confusing Coefficients into Clear Quantities

Abstract
Psychological researchers usually make sense of regression models by interpreting coefficient estimates directly. This works well enough for simple linear models, but is more challenging for more complex models with, for example, categorical variables, interactions, non-linearities, and hierarchical structures. Here, we introduce an alternative approach to making sense of statistical models. The central idea is to abstract away from the mechanics of estimation, and to treat models as “counterfactual prediction machines,” which are subsequently queried to estimate quantities and conduct tests that matter substantively. This workflow is model-agnostic; it can be applied in a consistent fashion to draw causal or descriptive inference from a wide range of models. We illustrate how to implement this workflow with the marginaleffects package, which supports over 100 different classes of models in R and Python, and present two worked examples. These examples show how the workflow can be applied across designs (e.g., observational study, randomized experiment) to answer different research questions (e.g., associations, causal effects, effect heterogeneity) while facing various challenges (e.g., controlling for confounders in a flexible manner, modelling ordinal outcomes, and interpreting non-linear models).

Figure illustrating model predictions. On the X-axis the predictor, annual gross income in Euro. On the Y-axis the outcome, predicted life satisfaction. A solid line marks the curve of predictions on which individual data points are marked as model-implied outcomes at incomes of interest. Comparing two such predictions gives us a comparison. We can also fit a tangent to the line of predictions, which illustrates the slope at any given point of the curve.
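
The three quantities in that figure (a prediction, a comparison of two predictions, and a slope) can be sketched in a few lines of Python. The curve below is a made-up stand-in for a fitted model and the function names are hypothetical. The paper itself works with the marginaleffects package, whose API is not reproduced here.

```python
import math


def predicted_life_satisfaction(income_eur):
    """Hypothetical fitted curve: satisfaction rises with log income.

    The coefficients 2.0 and 0.5 are invented for illustration.
    """
    return 2.0 + 0.5 * math.log(income_eur)


def comparison(model, low, high):
    """Difference between two model predictions."""
    return model(high) - model(low)


def slope(model, at, step=1.0):
    """Tangent slope at one point, via a central finite difference."""
    return (model(at + step) - model(at - step)) / (2 * step)


prediction_20k = predicted_life_satisfaction(20_000)
doubling_effect = comparison(predicted_life_satisfaction, 20_000, 40_000)
slope_at_20k = slope(predicted_life_satisfaction, 20_000)
slope_at_80k = slope(predicted_life_satisfaction, 80_000)

print(f"prediction at 20k: {prediction_20k:.2f}")
print(f"comparison 20k -> 40k: {doubling_effect:.3f}")
print(f"slope at 20k: {slope_at_20k:.2e}, slope at 80k: {slope_at_80k:.2e}")
```

Because the curve is non-linear, the slope depends on where it is evaluated, so no single coefficient summarises the relationship across the whole income range.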

A figure illustrating various ways to include age as a predictor in a model. On the x-axis age (predictor), on the y-axis the outcome (model-implied importance of friends, including confidence intervals).

Illustrated are:
1. age as a categorical predictor, resulting in predictions that bounce around a lot with wide confidence intervals,
2. age as a linear predictor, which forces a straight line through the data points and has a very tight confidence band, and
3. age splines, which lie somewhere in between, smoothly following the data but with more uncertainty than the straight line.

Ever stared at a table of regression coefficients & wondered what you're doing with your life?

Very excited to share this gentle introduction to another way of making sense of statistical models (w @vincentab.bsky.social)
Preprint: doi.org/10.31234/osf...
Website: j-rohrer.github.io/marginal-psy...

25.08.2025 11:49 👍 1007 🔁 288 💬 47 📌 22
“The System Sucks”: Computer Programs and Technical Control in Entry-Level White-Collar Work
Researchers often examine how technology controls the labor of precarious workers while demonstrating the limits of technology on controlling professional workers. Drawing on a subset of 46 in-depth i...

Check out my new article in the Journal of Organizational Sociology, where I examine how technology limits the autonomy of entry-level workers. I theorize two subtypes of technical control and discuss its implications for gender inequality
www.degruyterbrill.com/document/doi...

13.08.2025 14:15 👍 23 🔁 6 💬 2 📌 0
Two societies, one sociology, and no theory
This article shows the declining effectiveness of the sociological classics to make sense of the dramatically changing economy and society. However, the various ‘post-something’ analyses of such tran...

Not recent, but may be of interest onlinelibrary.wiley.com/doi/epdf/10....

05.08.2025 18:45 👍 4 🔁 0 💬 1 📌 0
Write Reproducible and Readable Analysis Code – European Network for Open Criminology
Find out how to make your analysis code easy to share, understand, and reproduce.

New how-to guide now available on the European Network for Open Criminology website. This time @asiermoneva.com shares advice on writing reproducible and readable analysis code. Highly recommended! esc-enoc.github.io/how-to/repro...

30.07.2025 14:51 👍 17 🔁 11 💬 0 📌 0

The outsourcing boom of the Major-Blair years saved money in the short run but left the state without the capacity to do anything but buy in services from canny private providers who have us over a barrel and are raking it in

23.07.2025 07:27 👍 5 🔁 1 💬 0 📌 0

🚨 Major release alert
We’re thrilled to launch lissyrtools v0.2.0 — our R package that makes working with LIS & LWS microdata simpler, faster, and clearer 📦
🧵 1/12

12.06.2025 15:09 👍 12 🔁 8 💬 1 📌 0

Interested in employment and social security research? Please follow the account below (we've moved from X and need to rebuild our following!)

12.06.2025 11:44 👍 2 🔁 2 💬 0 📌 0
mlmRev::egsingle |>
  performance::check_group_variation(
    select = c("female", "grade", "math"),
    by = c("schoolid", "childid"),
    include_by = TRUE
  )
#> Check schoolid variation
#>
#> Variable | Variation |  Design
#> ------------------------------
#> childid  |      both |  nested
#> female   |    within | crossed
#> grade    |      both |
#> math     |      both |
#>
#> Check childid variation
#>
#> Variable | Variation | Design
#> -----------------------------
#> schoolid |   between |
#> female   |   between |
#> grade    |      both |
#> math     |      both |

🆕 Introducing check_group_variation() in the {performance} #Rstats package! 🎉

This function makes it easy to check if variables vary within or between levels of grouping variables.

Perfect for understanding and designing mixed models 🚀

easystats.github.io/performance/...
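
For intuition about what "within" and "between" variation mean, here is a rough Python sketch. It is not the {performance} implementation and does not reproduce its rules or its design column. The function `classify_variation`, the numeric-only logic, the tolerance, and the toy rows are all assumptions for illustration.

```python
from collections import defaultdict


def classify_variation(rows, variable, group, tolerance=1e-9):
    """Classify a numeric variable as varying 'within', 'between', 'both' or 'none'.

    Simplified rule (an assumption, not the package's algorithm):
    - varies within  : at least one group contains differing values
    - varies between : the group means are not all equal
    """
    values_by_group = defaultdict(list)
    for row in rows:
        values_by_group[row[group]].append(row[variable])

    varies_within = any(
        max(values) - min(values) > tolerance for values in values_by_group.values()
    )
    group_means = [sum(values) / len(values) for values in values_by_group.values()]
    varies_between = max(group_means) - min(group_means) > tolerance

    if varies_within and varies_between:
        return "both"
    if varies_within:
        return "within"
    if varies_between:
        return "between"
    return "none"


# Toy data: two children, each observed in three grades.
rows = [
    {"childid": 1, "female": 1, "grade": 1, "math": -1.2},
    {"childid": 1, "female": 1, "grade": 2, "math": -0.4},
    {"childid": 1, "female": 1, "grade": 3, "math": 0.3},
    {"childid": 2, "female": 0, "grade": 1, "math": -0.6},
    {"childid": 2, "female": 0, "grade": 2, "math": 0.5},
    {"childid": 2, "female": 0, "grade": 3, "math": 1.4},
]

for variable in ("female", "grade", "math"):
    print(variable, classify_variation(rows, variable, "childid"))
```

In this balanced toy panel, `grade` varies only within children because every child passes through the same grades. In real, unbalanced data it can vary both ways.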

#stats #easystats

27.05.2025 06:48 👍 44 🔁 13 💬 3 📌 0

Writing some paragraphs about odds ratio and, more generally, different scales in nonlinear models.

Any favorite articles on odds ratios?
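
One way to see why the scale matters: the same odds ratio implies different changes in probability depending on the baseline. A small Python sketch, where the odds ratio of 2 and the baseline probabilities are arbitrary illustrative choices:

```python
def apply_odds_ratio(baseline_probability, odds_ratio):
    """Probability obtained by multiplying the baseline odds by an odds ratio."""
    baseline_odds = baseline_probability / (1 - baseline_probability)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1 + new_odds)


odds_ratio = 2.0
for baseline in (0.1, 0.5, 0.9):
    new_probability = apply_odds_ratio(baseline, odds_ratio)
    change = new_probability - baseline
    print(f"baseline {baseline:.1f} -> {new_probability:.3f} (change {change:+.3f})")
```

The odds ratio is constant by construction, but the change on the probability scale is largest in the middle of the range and shrinks towards 0 and 1.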

26.05.2025 12:59 👍 119 🔁 20 💬 19 📌 0