SKADA-Bench: Benchmarking Unsupervised Domain Adaptation Methods...
Unsupervised Domain Adaptation (DA) consists of adapting a model trained on a labeled source domain so that it performs well on an unlabeled target domain under a data distribution shift. While many...
SKADA-Bench: Benchmarking Unsupervised Domain Adaptation Methods with Realistic Validation on Diverse Modalities has been published in TMLR today. It was a huge team effort to design (and publish) an open-source, fully reproducible DA benchmark 🧵 1/n. openreview.net/forum?id=k9F...
29.07.2025 12:54
We are happy to organize the BERT²S workshop @neuripsconf.bsky.social 2025 on Recent Advances in Time Series Foundation Models.
berts-workshop.github.io
Submit by August 22.
Speakers and panelists: Chenghao Liu, Mingsheng Long, Zoe Piran, Danielle C. Maddix, Ameet Talwalkar, Qingsong Wen
22.07.2025 14:41
Here is the recording with the slides for those interested!
Recording: youtu.be/UONvP1TL0-g?...
Slides: drive.google.com/file/d/14ZIo...
Paper: arxiv.org/pdf/2410.02724
@cohere.com @cohereforai.bsky.social
24.06.2025 16:07
Very happy to be presenting Large Language Models as Markov Chains at Cohere Labs on June 19th at 6 pm CET (Paris time)!
Huge thanks to Andrej Jovanović @cohere.com @cohereforai.bsky.social for the invitation!
Paper: arxiv.org/pdf/2410.02724
Learn more: cohere.com/events/Coher...
13.06.2025 07:54
Skada Sprint Alert: Contribute to Domain Adaptation in Python
Machine learning models often fail when the data distribution changes between training and testing. That's where Domain Adaptation comes in: helping models stay reliable across domains. A toy illustration of one classic adaptation recipe is sketched below.
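For concreteness, here is a minimal sketch of importance weighting under covariate shift, one classic DA recipe, using only scikit-learn. It is illustrative and is not Skada's own API (see the library docs for that); all names and the toy data are placeholders.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Toy covariate shift: source and target features come from
# shifted Gaussians; only the source is labeled.
X_src = rng.normal(0.0, 1.0, size=(500, 2))
y_src = (X_src[:, 0] + X_src[:, 1] > 0).astype(int)
X_tgt = rng.normal(0.7, 1.0, size=(500, 2))

# Step 1: train a domain classifier to tell source from target.
X_dom = np.vstack([X_src, X_tgt])
d = np.concatenate([np.zeros(500), np.ones(500)])
dom_clf = LogisticRegression().fit(X_dom, d)

# Step 2: reweight source samples by p(target|x) / p(source|x),
# so source points that look like target data count more.
p = dom_clf.predict_proba(X_src)
w = p[:, 1] / np.clip(p[:, 0], 1e-6, None)

# Step 3: fit the task model on the reweighted source data.
clf = LogisticRegression().fit(X_src, y_src, sample_weight=w)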
20.05.2025 09:30
Congrats!
17.04.2025 13:19
Paper: arxiv.org/pdf/2410.02724
Slides: drive.google.com/file/d/1JDrV... (better with Adobe Reader for nice GIFs)
Website: ambroiseodt.github.io
28.02.2025 13:03
Thanks a lot @haeggee.bsky.social and @mjaggi.bsky.social for having me in the MLO group at EPFL @icepfl.bsky.social to present "Large Language Models as Markov Chains".
Slides are available on my website (link in thread).
New experiments with Llama and Gemma models in the updated paper!
28.02.2025 13:03
Very happy to have (humbly) contributed to this work!
This is a collab with the usual open-source suspects from Inria, @polytechniqueparis.bsky.social and @univparissaclay.bsky.social.
Check it out if you are interested in open-source reproducible research!
12.02.2025 16:09
Policy gradient methods like DeepSeek's GRPO are great for finetuning LLMs via RLHF.
But what happens when we swap autoregressive generation for discrete diffusion, a rising architecture promising faster and more controllable LLMs?
Introducing SEPO!
Paper: arxiv.org/pdf/2502.01384
🧵
04.02.2025 15:42
Finally, I can't thank you enough, Wes and @viviencabannes.bsky.social, for this collab: you are a rare combination of super-smart and fun to work with!
Hopefully, more to come soon!
"As for me, if I had to sum up my life today with you, I would say that it is first of all about encounters."
04.02.2025 11:56
We want to thank Elvis Dohmatob, Eshaan Nichani, @giupaolo.bsky.social, Faniriana Rakoto Endor, and Ievgen Redko for fruitful discussions during the elaboration of this work.
04.02.2025 11:56
On the theoretical side, we show that clustering heads can be learned via gradient descent and provide theoretical insights into the two-stage learning observed in practice.
6/🧵
04.02.2025 11:56
We investigate loss spikes and suggest potential mitigation strategies that could lead to more stable training. We also examine the transferability of circuits, showcasing the usefulness of curriculum learning and data curation.
5/🧵
04.02.2025 11:56
In the second, we unveil "Clustering Heads", circuits that learn the invariance of the task. Their training dynamics unfold in two phases: 1) clustering of the attention embeddings according to the invariance, and 2) classifier fitting. A toy sketch of this two-stage picture is below.
4/🧵
04.02.2025 11:56
In the first paper, we show how gradient descent (GD) reinforces useful circuits in transformers while pruning others, creating sub-circuits that help solve complex tasks by breaking them down into intermediate reasoning steps.
3/🧵
04.02.2025 11:56
We consider the sparse modular addition problem, where the inputs are sequences of L tokens in the ring of integers modulo p and the corresponding targets are the sum of the first k terms modulo p. Formally, we aim to learn the mapping:
2/🧵
04.02.2025 11:56
Proud to share our work on the training dynamics in Transformers with Wassim Bouaziz & @viviencabannes.bsky.social @Inria @MetaAI.
Easing Optimization Paths: arxiv.org/pdf/2501.02362 (accepted at ICASSP 2025 🥳)
Clustering Heads: arxiv.org/pdf/2410.24050
Code: github.com/facebookrese...
1/🧵
04.02.2025 11:56
Presenting our work on Unsupervised Accuracy Estimation at #NeurIPS2024 this week!
Poster Session 4 West, Thursday at 4:30 pm
Poster #4310, East Exhibit Hall A-C
DM me if you'd like to chat :)
10.12.2024 14:44
Check out the new version of this awesome domain adaptation library! So nice to work with such good people.
06.12.2024 19:25
Hi @vickiboykis.com, thanks for your interest. Don't hesitate if you have any questions on the paper; @ozekri.bsky.social and I would be happy to help :)
04.12.2024 10:23
Haha, thanks, still a lot to learn before that!
03.12.2024 21:35
This is joint work with Renchunzi Xie, Vasilii Feofanov, Weijian Deng, Jianfeng Zhang, and Bo An.
Finally, I want to thank @ramealexandre.bsky.social and Youssef Attia El Hili for fruitful discussions during the elaboration of this work.
🧵/🧵
03.12.2024 16:58
🥳 Finally, the awaited surprise!
Our work includes a result akin to the one of
@petar-v.bsky.social in "softmax is not enough (for sharp out-of-distribution)" (arxiv.org/pdf/2410.01104). We discuss its implications in the context of unsupervised accuracy estimation.
12/🧵
03.12.2024 16:58
Last but not least, we discuss in great detail the limitations of our approach and how to formalize prediction bias in unsupervised settings. We believe this is a missing piece in the current literature and hope our work can be a first step toward bridging this gap.
11/🧵
03.12.2024 16:58
We also qualitatively demonstrate the superiority of our approach.
10/🧵
03.12.2024 16:58
We obtain SOTA performance across various shifts (subpopulation, synthetic, natural) and architectures (ResNet, ConvNeXt, and Vision Transformers).
9/🧵
03.12.2024 16:58
Thus, we truncate the exponential when the model is not calibrated. As we cannot access test labels, we provide a criterion to automatically select the proper normalization: softmax or Taylor. This boils down to a simple three-step recipe, sketched below:
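A minimal sketch of the normalization choice, assuming a second-order Taylor truncation of the exponential for concreteness; the function names and the 'looks_calibrated' flag are illustrative placeholders, not the paper's exact recipe.

import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def taylor_norm(z):
    # Truncated exp(z) ~ 1 + z + z^2 / 2 grows polynomially,
    # so it does not saturate like the full exponential.
    v = 1.0 + z + 0.5 * z**2
    v = np.clip(v, 1e-12, None)  # defensive: keep scores positive
    return v / v.sum(axis=1, keepdims=True)

def normalize(logits, looks_calibrated):
    # 'looks_calibrated' stands in for the paper's label-free
    # criterion; trust softmax only when calibration holds.
    return softmax(logits) if looks_calibrated else taylor_norm(logits)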
8/🧵
03.12.2024 16:58
Here's where it gets tricky! How do you normalize the logits? Simply using the softmax is bad, as it is overconfident (see arxiv.org/pdf/2310.14814 and arxiv.org/pdf/2205.09310). We even show that it accumulates prediction bias in miscalibrated scenarios.
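To see the overconfidence concretely: with logits (5, 0, 0), softmax assigns exp(5)/(exp(5) + 2) ≈ 0.987 to the first class, so even a modest logit gap is read as near-certainty.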
7/🧵
03.12.2024 16:58