Shubhendu Trivedi

@shubhendu

Interests on bsky: ML research, applied math, and general mathematical and engineering miscellany. Also: Uncertainty, symmetry in ML, reliable deployment; applications in LLMs, computational chemistry/physics, and healthcare. https://shubhendu-trivedi.org

922 Followers
252 Following
4,352 Posts
Joined 30.09.2023

Latest posts by Shubhendu Trivedi @shubhendu

Chinese Open Source: A Definitive History Open source used to be a niche topic.

Must read on Chinese open source from Kevin Xu with the very similarly named substack (story for another time)

interconnect.substack.com/p/chinese-op...

06.03.2026 16:49 👍 13 🔁 5 💬 0 📌 2

War is war.

06.03.2026 22:27 👍 2 🔁 0 💬 0 📌 0

AAAI sends out emails where it's so vague that you can't even tell whether you were sent a reviewer or area chair invitation.

05.03.2026 20:13 👍 3 🔁 0 💬 0 📌 0
Compact deep neural network models of the visual cortex - Nature Parsimonious deep neural network models can be used for prediction of visual neuron responses.

Nature research paper: Compact deep neural network models of the visual cortex

go.nature.com/3OKRXZU

04.03.2026 09:07 👍 23 🔁 5 💬 0 📌 0

Surprise, then defensive, then basically acknowledging.

04.03.2026 15:48 👍 1 🔁 0 💬 0 📌 0

We put probabilistic circuits into diffusion language models and got a big boost in reasoning performance!

04.03.2026 10:03 👍 16 🔁 2 💬 0 📌 0

tbc, each time this happened, I did mention it to the student i.e. that I can't provide useful input if I am not able to discern what they know.

04.03.2026 06:29 👍 5 🔁 0 💬 1 📌 0

It's strange. I was only asking stuff like -- why this area? how did you get interested? What have you read or looked into? It was a way for me to probe their internal state to make some suggestions. Doesn't need perfect answers. I am terribly inarticulate these days, I would have empathized!

04.03.2026 06:29 👍 3 🔁 0 💬 1 📌 0

Willing to accept all this clockwork-like triumphalism about being "proven right about the AI bubble" on any correction if the above just gets done with. bsky.app/profile/shub...

04.03.2026 06:23 👍 0 🔁 0 💬 0 📌 0

Tech has continued on the distribution trend. Now in its 6th month. Since the date below you see it went up, but was again pushed below the 100 DMA. I hope the geopolitical turmoil and turbulence all around gives tech a reason to quickly puke 15-20% and get done with it. So bored of this crap.

04.03.2026 06:19 👍 1 🔁 0 💬 1 📌 0

Are AI models effective collaborators, or mere assistants awaiting your next command? (Preprint: arxiv.org/abs/2602.24188)

To find out, we make AI collaborate with itself, in private information games: tasks that require sharing private information, like this chess board ordering task.

04.03.2026 00:15 👍 54 🔁 21 💬 3 📌 1

*was easier

I mean, it was not like I was offering a position or anything, it was purely about how to navigate getting into research. So it baffled me. Also not the only time it has happened.

03.03.2026 21:28 👍 5 🔁 0 💬 1 📌 0

I tend to talk to undergrads (although collaborating as easier as a free agent). I just had a chat the other day, when a student reached out to me, mentioned interest in some area, and wanted advice on how to approach a PhD in said area. But for _even that_ he was clearly reading off LLM responses.

03.03.2026 21:24 👍 4 🔁 0 💬 1 📌 0
An arrangement of seven squares. Six of the squares are identical and are arranged so that reading from left to right they form three stacks that abut and are of heights 2,3,1, with each base square aligned with the second square in the stack to its left.

A tilted larger square overlays the six and shares a vertex with the lower right vertex of the bottom-most square. The top left vertex of the uppermost of the six squares lies on an edge of the larger square.

A line is drawn from the left-hand vertex of the larger square to the lower right vertex of the rightmost small square.

The angle formed by this line and the left-hand edge of the larger square is marked with a question mark.


notes.mathforge.org/notes/publis...

#geometrypuzzle #UKMathsChat #mathsky

02.03.2026 17:49 👍 6 🔁 2 💬 4 📌 0

NB: I am not labeling it as "evil and demonic" -- that was Schmitt's appellation, who also argued for leaning into "technicity" and to follow it to "its logical conclusion."

02.03.2026 19:00 👍 1 🔁 0 💬 0 📌 0

master the new technology and which type of genuine friend-enemy groupings can develop on this new ground."

Also reminded me of Yuk Hui's writings, e.g. www.e-flux.com/journal/153/... which also engages with the sociotechnics of the Nomos and the "evil and demonic spirit" of Schmitt's technicity.

02.03.2026 18:55 👍 1 🔁 0 💬 1 📌 0

process of neutralization; every strong politics will make use of it. For this reason, the present century can only be understood provisionally as the century of technology. How ultimately it should be understood will be revealed only when it is known which type of politics is strong enough to

02.03.2026 18:52 👍 1 🔁 0 💬 1 📌 0

Slightly different, but reminded me of this from The Age of Neutralizations and Depoliticizations (1929): "The process of continuous neutralization of various domains of cultural life has reached its end because technology is at hand. Technology is no longer neutral ground in the sense of the

02.03.2026 18:52 👍 1 🔁 0 💬 1 📌 0

I have added a new tutorial on discrete diffusion models:
github.com/gpeyre/ot4ml

01.03.2026 14:40 👍 51 🔁 15 💬 0 📌 0
Scalable Kernel-Based Distances for Statistical Inference and Integration Representing, comparing, and measuring the distance between probability distributions is a key task in computational statistics and machine learning. The choice of representation and the associated di...

Long time lurker, first time poster. My thesis, titled "Scalable Kernel-Based Distances for Statistical Inference and Integration" is now on arxiv: arxiv.org/abs/2602.21846 .

The results in core chapters (3, 4, 5, 6) are previously published work; Chapter 5 has bonus results on UQ for BQ.

27.02.2026 11:29 👍 17 🔁 1 💬 0 📌 0
Stochastic Processes The volume Stochastic Processes by K. Itô was published as No. 16 of Lecture Notes Series from Mathematics Institute, Aarhus University in August, 1969, based on Lectures given at that Institute durin...

Back in Århus and TIL that Itô spent about 2 years here in the 60s, between his time at Stanford and Cornell. Somehow I had completely missed this. There is even a Springer book covering lectures he gave here:

27.02.2026 14:49 👍 8 🔁 1 💬 1 📌 0

Thank you. I was just headed to the local church of scientology to register. Now I am on an Uber back, reconsidering.

27.02.2026 15:02 👍 1 🔁 0 💬 0 📌 0

Geoffrey Hinton

27.02.2026 14:24 👍 1 🔁 0 💬 0 📌 0
Probing the Geometry of Diffusion Models with the String Method Understanding the geometry of learned distributions is fundamental to improving and interpreting diffusion models, yet systematic tools for exploring their landscape remain limited. Standard latent-sp...

Hopefully the string method won't become another ML meme. It's not common to see ML papers using it despite the natural conceptual alignment
arxiv.org/abs/2602.22122

27.02.2026 06:39 👍 12 🔁 0 💬 0 📌 0

*professional skeptic

Holier than thou 100x, like the sage who prematurely escaped the Himalayan cave. They are like a very prominent point in the choreographed persona graph, very much like the "silicon valley founder."

26.02.2026 22:10 👍 0 🔁 0 💬 0 📌 0

This is very much similar to the professional skeptical, cynical "high brow" academic persona. The only uncorrupted judges in the room. Like looking at the employer or net worth of an individual before examining anything said by them ("follow the money," "uncover the bias").

26.02.2026 22:08 👍 1 🔁 0 💬 1 📌 0

Some of the choreography is also necessary for hiring (otherwise, people would not know about you). But a lot of the time it is really about developing psychological armour. There is a lot of pressure, the odds are against you, so you are better off offloading the failures to a persona.

26.02.2026 22:03 👍 1 🔁 0 💬 1 📌 0

But it makes sense. Due to SM feedback loops over the years the whole thing has converged to a highly choreographed performance (due to mimetic imitation of highly successful founders). VCs are also not attentive (you would know if you have done pitching) and are highly reliant on pattern matching.

26.02.2026 22:03 👍 0 🔁 0 💬 1 📌 0

Not just that. They start aligning with this whole persona of being a founder. Similar ways of announcing things, mystery ("something new," "stepping back," "stay tuned," who cares), even style of pictures, the skill > destiny sort of enlightened posting. Feigning being pumped at any random thing.

26.02.2026 22:03 👍 8 🔁 0 💬 1 📌 0

Agents interact with environments to get information. But exploration (tools, retrieval, user interaction) is costly.

Calibrate-Then-Act allows LLM agents to balance exploration and cost:
- Estimate uncertainty about the environment
- Reason about cost-uncertainty tradeoffs
- Act accordingly

23.02.2026 16:00 👍 17 🔁 6 💬 1 📌 1