
Arna Ghosh

@arnaghosh

Research Scientist at Google Research, working on Bio-inspired AI • PhD at Mila & McGill University, Vanier scholar • Ex-RealityLabs, Meta AI • Comedy+Cricket enthusiast

309
Followers
225
Following
59
Posts
11.09.2024
Joined

Latest posts by Arna Ghosh @arnaghosh

New preprint out 🎉

What happens to the hippocampal “place code” when an animal is actively engaged in a task?

The answer surprised us (and might surprise you too!).

Let's dive in ⬇️

Link:
"Hippocampal trace coding dominates and disrupts place coding" www.biorxiv.org/content/10.6...

19.02.2026 22:25 👍 61 🔁 22 💬 3 📌 3

🧪🧠 New preprint: helping resolve a decades-long debate in synaptic plasticity

NMDA receptors are central to Hebbian learning. Yet for >30 years, the existence and function of presynaptic NMDA receptors have remained controversial.

📄 doi.org/10.64898/202...

1/6

16.02.2026 20:28 👍 55 🔁 19 💬 1 📌 2

Someone added a `.stop_gradient()` and left it running. 😜

12.02.2026 05:29 👍 7 🔁 0 💬 0 📌 0

New paper out at PNAS: www.pnas.org/doi/10.1073/...
Revisiting the high-dimensional geometry of population responses in the visual cortex with @jpillowtime.bsky.social. The review took forever because a reviewer was doubtful that our new estimator could infer eigenvalues beyond the rank of the data! (1/6)

27.01.2026 16:34 👍 70 🔁 19 💬 2 📌 2
RetINaBox: A Hands-On Learning Tool for Experimental Neuroscience
An exciting aspect of neuroscience is developing and testing hypotheses via experimentation. However, due to logistical and financial hurdles, the experiment and discovery component of neuroscience is...

Are you thinking about doing neuroscience outreach but want to make it more exciting or hands-on?

Check out RetINaBox! (A collab led by the Trenholm lab)

We tried to bring the experience of experimental neuroscience to a classroom setting:

www.eneuro.org/content/13/1...

#neuroscience 🧪

13.01.2026 14:56 👍 39 🔁 14 💬 0 📌 0

Whoaaa!! This is a fantastic effort, and an amazing resource.
Huge congratulations to the authors! 🎉

05.12.2025 16:16 👍 1 🔁 0 💬 0 📌 0

Last day of poster sessions and presentations at
@neuripsconf.bsky.social. Full schedule featuring Mila-affiliated researchers presenting their work at #NeurIPS2025 here mila.quebec/en/news/foll...

05.12.2025 16:07 👍 1 🔁 1 💬 0 📌 0

In San Diego attending #NeurIPS2025?
Come to our poster to talk more about representation geometry in LLMs. 😃
🗓️ Friday 4:30-7:30 pm session
📍 Exhibit Hall C, D, E
🏁 Poster # 2502

05.12.2025 14:39 👍 2 🔁 1 💬 0 📌 0
Visual motion and landmark position align with heading direction in the zebrafish interpeduncular nucleus - Nature Communications
How are various visual signals integrated in the vertebrate brain for navigation? Here authors show that different spatial signals are topographically organized and align to one another in the zebrafi...

(1/n) We are excited to share our new paper in Nature Communications, by Hagar Lavian (@hlavian.bsky.social) and team, revealing how the zebrafish brain integrates visual navigation signals! www.nature.com/articles/s41...

24.11.2025 16:17 👍 54 🔁 21 💬 3 📌 2
Image of robots struggling with a social dilemma.

1/ Why does RL struggle with social dilemmas? How can we ensure that AI learns to cooperate rather than compete?

Introducing our new framework: MUPI (Embedded Universal Predictive Intelligence) which provides a theoretical basis for new cooperative solutions in RL.

Preprint🧵👇

(Paper link below.)

03.12.2025 19:19 👍 65 🔁 27 💬 5 📌 6

Population coding 🙌

02.12.2025 06:07 👍 3 🔁 0 💬 0 📌 0
How I contributed to rejecting one of my favorite papers of all time
I believe we should talk about the mistakes we make.

How I contributed to rejecting one of my favorite papers of all time. Yes, I teach it to students daily and refer to it in lots of papers. Sorry. open.substack.com/pub/kording/...

02.12.2025 01:27 👍 119 🔁 28 💬 1 📌 10

Thanks Ken! ☺️
Here's the (more updated) NeurIPS version: proceedings.neurips.cc/paper_files/...

Also, more recently we extended the use of power laws for characterizing how representations change over (pre/post) training in LLMs. 🙂
🧵 here: bsky.app/profile/arna...

18.11.2025 04:13 👍 4 🔁 0 💬 0 📌 0

This is an excellent blueprint for a fascinating use of an AI scientist! And the results are super cool and interesting! 🤩
I have been asked this when talking about our work on using power laws to study representation quality in deep neural networks; glad to have a more concrete answer now! 😃

16.11.2025 22:29 👍 14 🔁 3 💬 1 📌 0

Conrad Hal Waddington was born OTD in 1905.

His “epigenetic landscape” is a diagrammatic representation of the constraints influencing embryonic development.

On his 50th birthday, his colleagues gave him a pinball machine on the model of the epigenetic landscape.

🧪 🦫🦋 🌱🐋 #HistSTM #philsci #evobio

08.11.2025 16:03 👍 125 🔁 35 💬 5 📌 5

You mean the algorithms "generate" some auxiliary targets and then do supervised learning?

08.11.2025 08:17 👍 0 🔁 0 💬 0 📌 0

I got you 😉

08.11.2025 08:14 👍 2 🔁 0 💬 1 📌 0
AI and Neuroscience | IVADO

I’m looking for interns to join our lab for a project on foundation models in neuroscience.

Funded by @ivado.bsky.social and in collaboration with the IVADO regroupement 1 (AI and Neuroscience: ivado.ca/en/regroupem...).

Interested? See the details in the comments. (1/3)

🧠🤖

07.11.2025 13:52 👍 43 🔁 23 💬 1 📌 0

A tad late (announcements coming) but very happy to share the latest developments in my previous preprint!

Previously, we showed that neural representations for control of movement are largely distinct following supervised or reinforcement learning, with the latter most closely matching NHP recordings.

06.11.2025 02:09 👍 46 🔁 8 💬 1 📌 2

Thank you! 😁

03.11.2025 13:34 👍 0 🔁 0 💬 0 📌 0

Indeed! We show in the paper that the DPO objective is analogous to contrastive learning objectives used for self-supervised vision pretraining, which is entropy-seeking in nature (as shown in previous works).

I feel spectral metrics can go a long way in unlocking LLM understanding+design. 🚀

03.11.2025 01:51 👍 5 🔁 0 💬 1 📌 0
Tracing the Representation Geometry of Language Models from Pretraining to Post-training
Standard training metrics like loss fail to explain the emergence of complex capabilities in large language models. We take a spectral approach to investigate the geometry of learned representations a...

A big shoutout to @koustuvsinha.com for insightful discussions that shaped this work, and
@natolambert.bsky.social + the OLMo team!

Paper 📝: arxiv.org/abs/2509.23024
👩‍💻 Code : Coming soon! 👨‍💻

31.10.2025 16:19 👍 6 🔁 0 💬 0 📌 0

This work was done with dream team 🤩
@melodylizx.bsky.social @kumarkagrawal.bsky.social Komal Teru @glajoie.bsky.social @adamsantoro.bsky.social @tyrellturing.bsky.social
at @mila-quebec.bsky.social @berkeleyair.bsky.social @cohere.com & @googleresearch.bsky.social!

🧵9/9

31.10.2025 16:19 👍 4 🔁 0 💬 1 📌 0
The multi-phasic information geometry changes in LLM pretraining and post-training. Pretraining undergoes an initial warmup phase, which corresponds to echolalia behavior, followed by an entropy-seeking phase, where the model learns high-frequency n-gram statistics, and finally a compression-seeking phase, where the model learns long-range dependencies. The post-training stages of SFT and DPO exhibit entropy-seeking behavior, where the model memorizes instruction-following behavior, whereas RLVR exhibits compression-seeking behavior, where the model learns generalized reasoning at the cost of exploration.

Takeaway: LLM training exhibits multi-phasic information geometry changes! ✨

- Pretraining: Compress → Expand (Memorize) → Compress (Generalize).

- Post-training: SFT/DPO → Expand; RLVR → Consolidate.

Representation geometry offers insights into when models memorize vs. generalize! 🤓

🧵8/9

31.10.2025 16:19 👍 3 🔁 0 💬 1 📌 0

BONUS: Is task-relevant info contained in the top eigendirections?

On SciQ:

- Removing top 10/50 directions barely hurts accuracy.✅

- Retaining only top 10/50 directions CRUSHES accuracy.📉

As supported by our theoretical results, eigenspectrum tail encodes critical task information! 🤯
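The removal vs. retention experiment can be sketched as a simple projection test (a minimal sketch with random data and hypothetical dimensions, not the paper's actual pipeline):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 256))  # hypothetical (samples, hidden_dim) representations

# Eigendirections of the centered data via SVD (rows of Vt, sorted by variance)
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

k = 10
top = Vt[:k]  # top-k eigendirections

# "Remove top-k": project the dominant directions out of every representation
X_removed = Xc - Xc @ top.T @ top

# "Retain only top-k": keep just the dominant subspace
X_retained = Xc @ top.T @ top

# The retained matrix lives in a rank-k subspace; the removed one keeps the tail
assert np.linalg.matrix_rank(X_retained) <= k
```

Feeding `X_removed` vs. `X_retained` into a downstream probe is one way to compare how much task information the tail of the spectrum carries.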

🧵7/9

31.10.2025 16:19 👍 2 🔁 0 💬 1 📌 0

Why do these geometric phases arise?🤔

We show, both through theory and with simulations in a toy model, that these non-monotonic spectral changes occur due to gradient descent dynamics with cross-entropy loss under 2 conditions:

1. skewed token frequencies
2. representation bottlenecks
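A toy setup in the spirit of those two conditions (my own sketch, not the paper's model): a linear bottleneck network trained with cross-entropy on Zipf-distributed tokens, tracking the effective rank of the learned representations over training.

```python
import numpy as np

rng = np.random.default_rng(0)
V, d, n_steps = 50, 8, 500  # vocab size, bottleneck width, training steps

# Condition 1: skewed (Zipf-like) token frequencies
freq = 1.0 / np.arange(1, V + 1)
freq /= freq.sum()

# Condition 2: representation bottleneck (d << V)
E = rng.normal(scale=0.1, size=(V, d))  # token -> bottleneck representation
W = rng.normal(scale=0.1, size=(d, V))  # bottleneck -> logits
lr = 0.5

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

ranks = []
for step in range(n_steps):
    tok = rng.choice(V, size=64, p=freq)  # tokens sampled by frequency
    H = E[tok]                            # (batch, d) bottleneck activations
    P = softmax(H @ W)
    # Cross-entropy gradient for a toy identity-prediction task
    G = P.copy()
    G[np.arange(len(tok)), tok] -= 1.0
    G /= len(tok)
    gW = H.T @ G
    gH = G @ W.T
    W -= lr * gW
    np.add.at(E, tok, -lr * gH)  # handles repeated token indices correctly
    # Track the effective rank of the representation matrix over training
    s = np.linalg.svd(E - E.mean(axis=0), compute_uv=False)
    p = s / s.sum()
    ranks.append(float(np.exp(-(p * np.log(p + 1e-12)).sum())))
```

Plotting `ranks` against `step` is one way to look for non-monotonic spectral dynamics in a setting with both skewed frequencies and a bottleneck.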

🧵6/9

31.10.2025 16:19 👍 4 🔁 0 💬 1 📌 0
Supervised Finetuning (SFT) exhibits entropy-seeking expansion, coupled with decreased OOD robustness, whereas Reinforcement Learning from Verifiable Rewards (RLVR) exhibits compression-seeking consolidation, coupled with reduced exploration.

Post-training also yields distinct geometric signatures:

- SFT & DPO exhibit entropy-seeking expansion, favoring instruction memorization but reducing OOD robustness.📈

- RLVR exhibits compression-seeking consolidation, learning reward-aligned behaviors at the cost of reduced exploration.📉

🧵5/9

31.10.2025 16:19 👍 2 🔁 0 💬 1 📌 0

How do these phases relate to LLM behavior?

- Entropy-seeking: Correlates with short-sequence memorization (♾️-gram alignment).

- Compression-seeking: Correlates with dramatic gains in long-context factual reasoning, e.g. TriviaQA.

Curious about ♾️-grams?
See: bsky.app/profile/liuj...
🧵4/9

31.10.2025 16:19 👍 5 🔁 0 💬 1 📌 0
OLMo-2 and Pythia models undergo multiple distinct geometric phases during pretraining, indicating a non-monotonic change in representation complexity underlying monotonic decrease in training loss.

LLMs have 3 pretraining phases:

Warmup: Rapid compression, collapsing representation to dominant directions.

Entropy-seeking: Manifold expansion, adding info in non-dominant directions.📈

Compression-seeking: Anisotropic consolidation, selectively packing more info in dominant directions.📉

🧵3/9

31.10.2025 16:19 👍 6 🔁 0 💬 1 📌 0

When investigating OLMo (@ai2.bsky.social) & Pythia (@eleutherai.bsky.social) model checkpoints, as expected, pretraining loss ⬇️monotonically.

BUT

🎢The spectral metrics (RankMe, αReQ) change non-monotonically (with more pretraining)!

Takeaway: We discover geometric phases of LLM learning!
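For reference, the two spectral metrics can be sketched roughly as follows (my paraphrase of a RankMe-style effective rank and an αReQ-style power-law fit; the implementation details are assumptions, not the paper's exact code):

```python
import numpy as np

def rankme(X, eps=1e-12):
    """Effective rank: exponential of the entropy of normalized singular values."""
    s = np.linalg.svd(X - X.mean(axis=0), compute_uv=False)
    p = s / (s.sum() + eps)
    return float(np.exp(-np.sum(p * np.log(p + eps))))

def alpha_req(X):
    """Power-law decay exponent of the covariance eigenspectrum:
    least-squares fit of log(lambda_i) against log(i), returning -slope."""
    Xc = X - X.mean(axis=0)
    lam = np.linalg.svd(Xc, compute_uv=False) ** 2 / (len(X) - 1)
    lam = lam[lam > 1e-12]  # drop numerically-zero eigenvalues
    i = np.arange(1, len(lam) + 1)
    slope, _ = np.polyfit(np.log(i), np.log(lam), 1)
    return float(-slope)

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 128))  # stand-in for a (tokens, hidden_dim) matrix
print(rankme(X))     # high effective rank for isotropic Gaussian data
print(alpha_req(X))  # small exponent: a nearly flat spectrum
```

A steeper spectrum (larger α, lower effective rank) indicates more compressed representations; a flatter one indicates expansion.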

🧵2/9

31.10.2025 16:19 👍 5 🔁 0 💬 1 📌 0