Lukas Jung

@lhdjung

PhD student in psychological metascience at Uni Bern | R package developer – {scrutiny}, {unsum}, and more | https://github.com/lhdjung | #rstats | forensic metascience | error detection | missing value handling

2,351 followers · 1,127 following · 169 posts · joined 06.10.2023

Latest posts by Lukas Jung @lhdjung

python vs R?
base R vs tidyverse?
framework X vs Y?

unless you're getting paid, use the one that brings you the most joy.

life's too short to keep on writing code that you neither enjoy nor get paid for.

#rstats

06.03.2026 12:43 👍 46 🔁 9 💬 1 📌 3

Yes, and since package developers are generally users of other packages, too, it ends up working better for everyone.

04.03.2026 18:51 👍 4 🔁 1 💬 1 📌 0

New preprint! 🎉 How much post-publication correction takes place in privacy and security research? Short answer: not so much.

With Nele Borgert, Luisa Jansen, Theresa Halbritter (none on Bluesky), and @malte.the100.ci

03.03.2026 11:59 👍 5 🔁 1 💬 0 📌 0

I should think twice before having Claude comment on my drafts

01.03.2026 20:58 👍 1 🔁 0 💬 0 📌 0
Related software

Very interesting! Will include {tidychain} in this list of similar software: lhdjung.github.io/scrutiny/art...

You might find other interesting tools there if you like forensic metascience, as this type of work has increasingly been called. The package itself is also about it.

01.03.2026 10:45 👍 1 🔁 0 💬 0 📌 0

Also love the header image:
- Our beautiful feather pen
- Their disgusting poultry thing

28.02.2026 21:05 👍 2 🔁 0 💬 0 📌 0

The pipe wars, of course, are distinguished by the fact that one side is correct. #rstats

28.02.2026 21:02 👍 13 🔁 2 💬 2 📌 0

Remember what? You must be talking to the wrong person, and who are you anyways 🥸

26.02.2026 20:07 👍 1 🔁 0 💬 0 📌 0

This one is actually somewhat reasonable.

26.02.2026 19:27 👍 1 🔁 0 💬 0 📌 0

With honorable mentions:

- “Row names weren’t a mistake”
- “I unironically love <<-”
- “Everything should be a matrix”
- “Global variables create community.”
- “Recycling rules make sense if you believe.”
- “stringsAsFactors = TRUE was misunderstood.”
- “Preallocation is a state of mind.”

26.02.2026 18:28 👍 3 🔁 1 💬 1 📌 0

My favorites from what I got:

- “Nested ifelse() builds character”
- "attach() gets a bad rap"
- "setwd() isn't that bad"
- "Excel is underrated"

26.02.2026 18:24 👍 10 🔁 2 💬 4 📌 0
R plot drawn by plot(iris$Sepal.Length)

Base R plotting really is more transparent. Just look at these points.

26.02.2026 18:11 👍 20 🔁 2 💬 0 📌 0
Modes and missing values (moder)

Code based on table() assumes that NA is a known and distinct value, which is a bit out of line with its general meaning: lhdjung.github.io/moder/articl...

There are also issues with type coercion.

25.02.2026 11:28 👍 1 🔁 0 💬 0 📌 0
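A minimal base R sketch of the two issues mentioned in the post above. The vector `x` and the variable names are made up for illustration and are not taken from the linked article.

```r
x <- c(1, 1, 2, NA, NA, NA)

# Default: table() drops NA, and the most frequent value comes back
# as a character name even though x is numeric (type coercion).
counts_default <- table(x)
mode_default <- names(counts_default)[which.max(counts_default)]

# With useNA = "ifany", NA is counted like any known value and "wins",
# although the three NAs need not stand for the same underlying value.
counts_na <- table(x, useNA = "ifany")
mode_na <- names(counts_na)[which.max(counts_na)]

is.character(mode_default)
is.na(mode_na)
```

Either way, the result depends on a decision about NA that table()-based code makes implicitly.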

Thanks – the na.rm option was basically a compromise between what I thought was correct and what people would want 😅 The main point of the package is finding known correct solutions while not ignoring missing values, e.g., mode_all(c(1, 1, 2, 2, 2, 2, NA))

25.02.2026 11:25 👍 0 🔁 0 💬 1 📌 0
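The idea of a "known correct solution" despite missing values can be sketched in base R. This is a hypothetical helper, `certain_mode()`, written for illustration only; it is not taken from the moder source and only covers the case of a single mode. The assumption is that a value is the mode for certain if its lead over the runner-up is larger than the number of missing values.

```r
# Hypothetical helper, not part of moder: returns the mode only if no
# assignment of the missing values could change it, and NA otherwise.
certain_mode <- function(x) {
  n_na <- sum(is.na(x))
  known <- x[!is.na(x)]
  if (length(known) == 0) {
    return(x[NA_integer_])
  }
  vals <- unique(known)
  counts <- tabulate(match(known, vals))
  ord <- order(counts, decreasing = TRUE)
  top <- counts[ord[1]]
  runner_up <- if (length(vals) > 1) counts[ord[2]] else 0L
  # Even if every NA were the runner-up, the top value must still lead.
  if (top - runner_up > n_na) vals[ord[1]] else vals[NA_integer_]
}

certain_mode(c(1, 1, 2, 2, 2, 2, NA))   # 2 leads by two, one NA
certain_mode(c(1, 1, 2, 2, 2, NA, NA))  # lead of one, two NAs
```

Working on `unique()` and `match()` instead of `table()` keeps the type of the input, so a numeric vector yields a numeric result.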

"Some maintain that “writing is thinking” and value the process and struggle of writing, whereas others are more in the “writing is torture” camp [...]"

Third camp: Writing is torture because writing is thinking.

25.02.2026 10:04 👍 3 🔁 0 💬 0 📌 0
Mode estimation in R: The moder package determines single or multiple modes (most frequent values). By default, its 'mode_' functions check whether missing values make this impossible, and return `NA` in this case. They ha...

You were actually looking for... lhdjung.github.io/moder/

I haven't touched it in a while, though

25.02.2026 09:13 👍 0 🔁 0 💬 1 📌 0

To make things more interesting, our teacher set the default grade to A6 at the start of the year and we had to improve it through our performance.

He was a career jumper from tech, so I now think he just initialized a variable at zero. With time comes perspective. Back then, people were into hating it.

24.02.2026 23:38 👍 1 🔁 0 💬 0 📌 0

When I was in grade 9 (Germany), the scale switched from the good and natural 1-6 to B1-B4, where B1 was best and B4 was equivalent to A1, which was also a thing and went all the way down to A6.

24.02.2026 23:35 👍 0 🔁 0 💬 1 📌 0

Or any other code, for that matter

24.02.2026 22:11 👍 2 🔁 0 💬 1 📌 0

Lagging behind as an #rstats developer since I haven't implemented my own pipe operator

24.02.2026 10:37 👍 17 🔁 2 💬 0 📌 0

Tensions long simmering under the surface have erupted for what might have been the first time since the large-scale wars of R Twitter.

Seriously though, just some fun and polite disagreement over code style and the old base/tidyverse questions.

24.02.2026 08:29 👍 1 🔁 0 💬 0 📌 0

Bizarro pipe erasure

23.02.2026 21:41 👍 2 🔁 0 💬 1 📌 0

Well I suppose you mean:

tbl |>
  head() |>
  View()

23.02.2026 21:20 👍 2 🔁 0 💬 1 📌 0
Optimize `as_wide_n_tibble()` helper · lhdjung/unsum@312df0e

Unfortunately, forcing binary operators into prefix form is sometimes the most efficient way because it avoids intermediate allocations.

This refactoring achieved a 1.8x speedup but I'm not proud of the syntax: github.com/lhdjung/unsu...

(The function handled large data, so it was a bottleneck)

23.02.2026 20:14 👍 2 🔁 0 💬 1 📌 0

Try piping into tryCatch(), it will change your life

23.02.2026 11:09 👍 1 🔁 1 💬 0 📌 0
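A short sketch of what piping into tryCatch() can look like, assuming R 4.1 or later for the native pipe. The function name `safe_log()` is made up for illustration. Because the native pipe simply places the left-hand side as the first argument, the expression is still evaluated inside tryCatch(), so its handlers apply.

```r
# Hypothetical wrapper: returns NA instead of warning on invalid input.
safe_log <- function(x) {
  log(x) |>
    tryCatch(warning = function(w) NA_real_)
}

safe_log(10)
safe_log(-1)  # log(-1) warns, so the handler's value is returned
```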

Maybe I'm more sensitive to this as a package dev. I don't always know in advance how large the data will be on the user's machine, and if I write inefficient code, the user will suffer for nothing.

Things could be different in a one-off analysis. But the principle is always the same.

7 / 7 Fin

22.02.2026 16:29 👍 6 🔁 1 💬 0 📌 0

A more subtle benefit of the pipe is performance. Intermediate variables mean new copies and allocations each time. With large data, your memory will get clogged up by many redundant copies.

Pipes avoid this problem completely. They are as efficient as nested calls but much more readable.

6 / 7

22.02.2026 16:29 👍 8 🔁 1 💬 1 📌 0
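The "as efficient as nested calls" part can be checked for the native pipe (R 4.1 or later), which the parser rewrites into a nested call before anything is evaluated. The expression below is an arbitrary example; `x` is never evaluated because both sides are only quoted.

```r
piped  <- quote(x |> sqrt() |> sum())
nested <- quote(sum(sqrt(x)))

# Both are the same call object after parsing.
identical(piped, nested)  # TRUE
```

This holds for `|>` specifically; the magrittr pipe `%>%` is an ordinary function call and is not rewritten by the parser.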

There is really no black and white choice. Anyways, since the pipe has gotten a beating on here:

Using too many intermediate variables can clutter the code a lot, especially if their only job is to lead from one transformation to the next. This is where the pipe does a better job.

5 / 7

22.02.2026 16:29 👍 6 🔁 1 💬 1 📌 0

The tidyverse style guide also distinguishes between good and bad usage of the pipe:

(style.tidyverse.org/pipes.html)

4 / 7

22.02.2026 16:29 👍 7 🔁 1 💬 1 📌 0

This is not a new idea. Here is the first edition of the book:

(r4ds.had.co.nz/pipes.html)

3 / 7

22.02.2026 16:29 👍 5 🔁 1 💬 1 📌 0