Hegseth: "Death & destruction from the sky all day. We're playing for keeps. Our warfighters have maximum authorities granted personally by POTUS & yours truly. Our rules of engagement are bold, precise designed to unleash American power, not shackle it ... we are punching them while they are down"
04.03.2026 13:11
🔥 What if web text isn't the best place to start training LLMs? Our latest work shows that warming up models on procedural data (e.g. from formal languages & simple algorithms) speeds up subsequent pretraining on language, code, and math, on models up to 1.3B parameters. ⬇️🧵
20.02.2026 12:39
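As a toy illustration of what "procedural data from formal languages" can look like (my own minimal sketch; the function name and sampling scheme are hypothetical, not the paper's actual pipeline), here is a generator for balanced-bracket (Dyck) strings:

```python
import random

def dyck_string(rng, max_depth=8, p_open=0.5, max_len=64):
    """Sample one balanced-bracket (Dyck) string, a classic formal language."""
    out, depth = [], 0
    while len(out) < max_len:
        # Open a bracket when we must (depth 0) or by a coin flip; otherwise close one.
        if depth < max_depth and (depth == 0 or rng.random() < p_open):
            out.append("(")
            depth += 1
        else:
            out.append(")")
            depth -= 1
            if depth == 0 and rng.random() < 0.3:
                break  # occasionally stop at a complete string
    out.append(")" * depth)  # close whatever is still open
    return "".join(out)

rng = random.Random(0)
corpus = [dyck_string(rng) for _ in range(1000)]
assert all(s.count("(") == s.count(")") for s in corpus)
```

Strings like these carry hierarchical structure but no world knowledge, which is the intuition behind warming up on them before web text.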
Small Language Models (SLMs) don't have the capacity to remember everything in their training data. Which tokens should they learn to predict, and when should they ask for help? We tackle this question in our new preprint.
You can check it out on arxiv: arxiv.org/abs/2602.12005
🧵 1/7
13.02.2026 16:16
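One simple way to operationalize "asking for help" is a confidence-gated fallback (a generic sketch under my own assumptions; the threshold `tau` and the toy models below are illustrative, not the preprint's method):

```python
def predict_with_fallback(small_lm, large_lm, context, tau=0.5):
    """Greedy next-token choice that 'asks for help': use the small model
    unless its confidence falls below the threshold tau."""
    token, p = small_lm(context)  # each model returns a (token, probability) pair
    if p >= tau:
        return token, "small"
    token, p = large_lm(context)
    return token, "large"

# Toy stand-in models (hypothetical): fixed tokens with fixed confidences.
small = lambda ctx: ("the", 0.3)
large = lambda ctx: ("entropy", 0.9)
print(predict_with_fallback(small, large, "..."))  # → ('entropy', 'large')
```

The interesting research question is where to set that boundary per token, rather than globally.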
we need to talk about that Ring Super Bowl ad
10.02.2026 20:18
CONTRIBUTOR: GREG BROCKMAN
AMOUNT: $25,000,000
ELECTION TO DATE: $25,000,000
The largest Trump super PAC donor so far this cycle is the president of OpenAI
26.01.2026 00:55
Federal agents with weapons drawn, moments before murdering American citizens on the streets of Minneapolis at the dawn of 2026.
What should academics be doing right now?
I have been writing up some thoughts on what the research says about effective action, and what universities specifically can do.
davidbau.github.io/poetsandnurs...
It's on GitHub. Suggestions and pull requests welcome.
github.com/davidbau/poe...
26.01.2026 03:27
Nasra Ahmed, a 23-year-old US citizen, was arrested and detained by ICE. She was held for TWO DAYS.
ICE agents handcuffed her, called her a racial slur, and she was knocked to the ground so hard she got a concussion.
This cannot continue happening. ICE needs to leave.
22.01.2026 01:42
In our new work, Complete(d)P, we try to answer 3 questions about hyperparameter (HP) scaling:
❓ How to transfer across model size, tokens & batch size? ✅ Complete(d)P
❓ Do per-module HPs matter? ✔️ 2x speed-ups possible
❓ Do they transfer to larger scale? ✔️ With the right parameterisation
06.01.2026 15:21
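For flavor, the simplest instance of HP transfer across model size is the muP-style learning-rate rule below (a generic sketch only; Complete(d)P's actual per-module parameterisation is in the paper):

```python
import math

def scaled_lr(base_lr, base_width, width):
    """muP-style rule of thumb: the hidden-layer learning rate shrinks as
    1/width, so an optimum tuned at a small proxy width transfers upward."""
    return base_lr * base_width / width

# Tune once at width 256, then reuse the tuned value at width 1024.
assert math.isclose(scaled_lr(1e-3, 256, 1024), 2.5e-4)
```

The point of such parameterisations is that the small-scale HP sweep is the only sweep you pay for.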
What coding with an LLM feels like sometimes.
03.12.2025 09:29
5. But apart from that, I'm more and more convinced we need a better incentive structure around this, maybe some kind of "submitting-reviewing credit system" where you earn credits for doing (good) reviews, and you need to spend those credits when you submit papers.
30.11.2025 13:09
4. This would be quite similar to the ARR (ACL Rolling Review) system, but without a promise that if you submit by date X, the review will be finished in time to be considered for conference Y.
30.11.2025 13:09
3. I think this could be realized with a TMLR-like reviewing process where, upon acceptance, the reviewers can nominate the paper to be considered for whatever the most appropriate upcoming conference is (like what TMLR is currently doing for ICLR).
30.11.2025 13:09
1. I second the sentiment that TMLR is amazing. If I had a choice, I'd only review for TMLR.
2. Maybe it's time for conferences to only present papers at the level equivalent to what is now "nominated for a spotlight/oral".
30.11.2025 13:09
Came across this job market paper that I actually enjoyed reading. It picks up on an intuitive idea and studies it systematically. I wonder how a similar paper, some years down the line, would look at the massive stimulus-fueled post-2008 tech boom. drive.google.com/file/d/1SClt...
27.11.2025 20:42
The "Log Lady" from Twin Peaks consoling her log
when you call console.log() this is what happens
24.11.2025 23:19
One more banger by Zach Weinersmith. How can one person cover such a breadth of topics with such depth at the same time? Big fan.
17.11.2025 16:45
Internship - Machine Learning Research on Uncertainty - Jobs at Apple (MY)
Apply for an Internship - Machine Learning Research on Uncertainty job at Apple. Read about the role and find out if it's right for you.
Our research team is hiring PhD interns! Spend your next summer in Paris and explore the next frontiers of LLMs for uncertainty quantification, calibration, RL and post-training, and Bayesian experimental design.
Details & Application ➡️ jobs.apple.com/en-my/detail...
14.11.2025 16:26
I understood that they got super-high scores "in general", which might or might not be a good thing depending on whether you agree with them :D But since that's the case, I'm jealous of your lucky draw!
13.11.2025 06:38
rightfully so?
12.11.2025 17:13
📢 We're looking for a researcher in cogsci, neuroscience, linguistics, or related disciplines to work with us at Apple Machine Learning Research! We're hiring a one-year interdisciplinary AIML Resident to work on understanding reasoning and decision making in LLMs. 🧵
07.11.2025 21:19
Do AI agents ask good questions? We built "Collaborative Battleship" to find out, and discovered that weaker LMs + Bayesian inference can beat GPT-5 at 1% of the cost.
Paper, code & demos: gabegrand.github.io/battleship
Here's what we learned about building rational information-seeking agents... 🧵👇
27.10.2025 19:17
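The Bayesian core of "asking good questions" is expected information gain; here is a minimal sketch over a toy hypothesis space (my illustration, not the paper's code):

```python
import math

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

def expected_info_gain(hypotheses, priors, question):
    """EIG of a yes/no question = prior entropy minus expected posterior
    entropy. `question(h)` returns the answer under hypothesis h."""
    eig = entropy(priors)
    p_yes = sum(p for h, p in zip(hypotheses, priors) if question(h))
    for answer, p_a in ((True, p_yes), (False, 1 - p_yes)):
        if p_a == 0:
            continue
        post = [p / p_a for h, p in zip(hypotheses, priors) if question(h) == answer]
        eig -= p_a * entropy(post)
    return eig

# Toy "Battleship": the ship is in one of 4 cells, uniform prior.
hyps, priors = [0, 1, 2, 3], [0.25] * 4
split = lambda h: h < 2  # "is it in the left half?", a balanced question
print(expected_info_gain(hyps, priors, split))  # → 1.0
```

A balanced question gains exactly one bit; an agent that greedily maximizes EIG asks the most splitting question available.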
It's that time of the year!
The Apple Machine Learning Research (MLR) team in Paris is hiring a few interns to do cool research for ±6 months & work towards publications/OSS.
Check requirements and apply ➡️ jobs.apple.com/en-us/detail...
More info ✉️ mlr_paris_internships@group.apple.com
17.10.2025 13:07
As an unironic take on "LLM Science", I like this bit by @dbusbridge.bsky.social:
45:04
icml.cc/virtual/2025...
23.10.2025 08:03
Reminder that Amortized Inference Workshop submissions at the ELLIS Unconference are still open until **Oct 16, 2025**.
Only a short abstract (½ page) is needed, so go ahead!
Workshop: Dec 2, 2025, co-located with EurIPS.
Website: sites.google.com/view/amortiz...
14.10.2025 10:16
Waymo is coming to London next year.
15.10.2025 09:57
LLMs are currently one big parameter block that stores all sorts of facts. In our new preprint, we add context-specific memory parameters to the model and pretrain it along with a big bank of memories.
📄 arxiv.org/abs/2510.02375
[1/10] 🧵
06.10.2025 16:06
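A cartoon version of "context-specific memory parameters" (a toy numpy sketch; the class and argument names are mine, and the real model would integrate its memories through attention during pretraining rather than simple concatenation):

```python
import numpy as np

class MemoryBank:
    """Per-context memory parameters alongside shared weights: the shared
    block stores general knowledge, while each context id owns a few extra
    trainable vectors the model can read at inference time."""
    def __init__(self, n_contexts, n_slots, d_model, seed=0):
        rng = np.random.default_rng(seed)
        self.mem = rng.normal(0.0, 0.02, (n_contexts, n_slots, d_model))

    def inject(self, hidden, context_id):
        # hidden: (seq, d_model); prepend this context's memory slots
        return np.concatenate([self.mem[context_id], hidden], axis=0)

bank = MemoryBank(n_contexts=100, n_slots=4, d_model=16)
h = np.zeros((10, 16))
out = bank.inject(h, context_id=7)
assert out.shape == (14, 16)  # 4 memory slots + 10 original positions
```

Swapping the memory bank per context lets the shared parameters stay small while factual capacity scales with the bank.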
Our two phenomenal interns, Alireza Mousavi-Hosseini and Stephen Zhang @syz.bsky.social, have been cooking up some really cool work with Michal Klein and me over the summer.
Relying on optimal transport couplings (to pick noise and data pairs) should, in principle, help guide flow matching.
🧵
03.10.2025 20:50
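The coupling idea can be sketched as a minibatch assignment problem (an illustrative toy using `scipy`; the actual work's couplings and solver may differ):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def ot_pairs(noise, data):
    """Minibatch OT coupling: match each noise sample to a data sample by
    solving an assignment problem on squared distances, instead of pairing
    them at random. Straighter noise-to-data paths ease flow matching."""
    cost = ((noise[:, None, :] - data[None, :, :]) ** 2).sum(-1)
    rows, cols = linear_sum_assignment(cost)
    return noise[rows], data[cols]

rng = np.random.default_rng(0)
x0 = rng.normal(size=(64, 2))        # noise batch
x1 = rng.normal(size=(64, 2)) + 5.0  # data batch, shifted away from the noise
a, b = ot_pairs(x0, x1)
# The matched pairs are, in total, no farther apart than the random pairing.
```

Training the flow on `(a[i], b[i])` pairs then targets velocities along these shorter, less crossing paths.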
Thanks both for the kind words! 🙏
01.09.2025 14:44