Check out the paper for more analyses and details! Huge shout out to my advisors at Google (@tallinzen.bsky.social, Ann Yuan, and Katja Filippova) for supervising this project over the summer!
Paper Link: arxiv.org/abs/2602.04212
Overall, we show that while language models can learn novel, structured representations in context, they are a long way from being able to use these representations as a general-purpose in-context world model. Frontier models mitigate the problem somewhat, but do not eliminate it.
These results were generated using relatively small instruction-tuned models. Do state-of-the-art reasoning models (like Gemini and GPT-5) also struggle to use representations learned in context? Broadly speaking, yes!
We find that models also struggle to deploy their in-context representations in this task.
We expand our investigation to a novel task: adaptive world modeling. This task combines graph tracing with a few-shot learning task that systematically maps tokens at one point in the topology to tokens at another (e.g., token_i -> token_i+2).
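To make this concrete, here's a toy Python sketch of what an adaptive world modeling prompt could look like (the ring graph, node labels, and prompt format are my own illustration, not the paper's exact setup):

```python
import random

# Toy setup: a ring graph whose nodes are labeled with arbitrary words.
ring = ["apple", "bird", "chair", "door", "fish", "moon", "river", "stone"]
N = len(ring)

def walk_on_ring(length=60, seed=0):
    """Random walk over the ring; returns the visited node labels as a string."""
    rng = random.Random(seed)
    i = rng.randrange(N)
    seq = []
    for _ in range(length):
        seq.append(ring[i])
        i = (i + rng.choice([-1, 1])) % N  # step to one of the two ring neighbors
    return " ".join(seq)

# Few-shot pairs instantiating the mapping node_i -> node_{i+2} around the ring.
shots = [(ring[i], ring[(i + 2) % N]) for i in range(3)]
mapping = "\n".join(f"{a} -> {b}" for a, b in shots)

query = ring[5]  # "moon"; the intended answer is ring[7] == "stone"
prompt = f"{walk_on_ring()}\n\n{mapping}\n{query} ->"
print(prompt)
```

The key point: the few-shot pairs alone underdetermine the mapping; the model can only infer "two steps around the ring" by consulting the graph structure it learned in context from the walk.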
Surprisingly, we find that models perform dramatically worse in the instruction condition, despite having encoded the in-context representations just as well as in the prefilled condition!
...and another where the model is instructed to generate the next word in the sequence. The prefilled condition is identical to the standard graph tracing task, whereas the instruction condition requires the model to delay its prediction for several tokens.
First, we study whether in-context representations are deployable when an instruction-tuned model is tasked with performing next-word prediction. We consider two conditions: one in which the sequence is formatted as a prefilled model response...
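Concretely, the two conditions look something like this (the prompt wording below is my paraphrase, not the exact text from the paper):

```python
# A short random-walk sequence over the graph, as in the tracing task
# (abbreviated here for illustration; real prompts are much longer).
walk = "apple bird chair bird fish moon fish"

# Prefilled condition: the sequence sits in the assistant's own turn, so the
# model simply continues it, identical to standard graph tracing.
prefilled = [{"role": "assistant", "content": walk}]

# Instruction condition: the sequence sits in the user turn, and the model
# must process the instruction and produce response scaffolding before its
# prediction, delaying that prediction by several tokens.
instructed = [{
    "role": "user",
    "content": f"Here is a sequence of words: {walk}\n"
               "Generate the most likely next word in the sequence.",
}]
```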
We consider an in-context learned representation to be flexibly deployable if it can be used in new contexts. Otherwise, we call the representation inert.
Among other things, that work shows that LM representations come to reflect the graph's topology. This indicates that models can learn novel, structured representations in context. We push this a step further and ask whether these representations can be used to solve new tasks!
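For intuition, here's a hedged sketch of one way to check whether hidden states mirror the graph's topology (the model choice, layer, and details here are my assumptions, not necessarily the procedure from the paper):

```python
import torch
from sklearn.decomposition import PCA
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # small stand-in; the papers use much larger models
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# A toy, abbreviated random-walk sequence over the graph's word labels;
# in practice this would be hundreds of tokens long.
words = ["apple", "bird", "chair", "door", "fish", "moon", "river", "stone"]
prompt = "apple bird apple stone river stone moon fish door chair door fish"

inputs = tok(prompt, return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)
hidden = out.hidden_states[-1][0]  # (seq_len, d_model), last layer

# Average hidden states over each word's occurrences (assumes each word is a
# single token, which holds for these common words under GPT-2's BPE).
ids = inputs["input_ids"][0]
means = []
for w in words:
    w_id = tok(" " + w)["input_ids"][0]  # leading space matters for BPE
    means.append(hidden[ids == w_id].mean(dim=0))

# Project to 2D: if the model has internalized the topology, nearby graph
# nodes should land near each other in this projection.
coords = PCA(n_components=2).fit_transform(torch.stack(means).numpy())
```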
Our starting point is an excellent paper from @corefpark.bsky.social et al. This work defines an in-context graph tracing task, which involves next-word prediction on a sequence of words generated by a random walk on a graph. (Figure lifted from their beautiful paper!)
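If you haven't seen graph tracing before, here's a minimal toy version in Python (my own illustration; the graph choice and node labels are arbitrary):

```python
import random

import networkx as nx

# Toy setup: a 4x4 grid graph whose nodes are labeled with arbitrary words.
graph = nx.grid_2d_graph(4, 4)
words = ["apple", "bird", "chair", "cloud", "door", "fish", "glass", "horse",
         "lamp", "moon", "nail", "ocean", "piano", "quilt", "river", "stone"]
node_to_word = dict(zip(graph.nodes, words))

def random_walk_sequence(length=50, seed=0):
    """Random walk over the graph; emit the word attached to each visited node."""
    rng = random.Random(seed)
    node = rng.choice(list(graph.nodes))
    seq = []
    for _ in range(length):
        seq.append(node_to_word[node])
        node = rng.choice(list(graph.neighbors(node)))
    return " ".join(seq)

# The in-context task: given this prompt, a model should put high probability
# on words whose nodes neighbor the last visited node.
print(random_walk_sequence())
```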
AI is now being deployed for long-horizon tasks. This has renewed the relevance of a longstanding ambition: to build systems capable of flexibly adapting to different environments. We ask whether LLMs can already accomplish this goal in controlled, synthetic settings.
🚨New preprint! In-context learning underlies LLMs' real-world utility, but what are its limits? Can LLMs learn completely novel representations in context and flexibly deploy them to solve tasks? In other words, can LLMs construct an in-context world model? Let's see! 🧵
Huge shoutout to my collaborators and advisors @jennhu.bsky.social, Ishita Dasgupta, @romapatel.bsky.social, @thomasserre.bsky.social, and Ellie Pavlick for their contributions to this project!
I'm excited to share that this paper was accepted at ICLR 2026! We show that language models encode one of the most basic ingredients of a world model: the ability to distinguish plausible from implausible states. Check out the paper for more details!
See you in Rio!
Paper: arxiv.org/abs/2507.12553
I had a great time helping out on this project with @jennhu.bsky.social and Michael Franke! If you're interested in the intersection of interpretability and cogsci, check it out!