In our newest work (led by the amazing
@sunnytqin.bsky.social , w/ @emalach.bsky.social, Samy Jelassi), we investigate a core question for LLMs: "to backtrack or not to backtrack" in two prototypical logic-heavy puzzles: CountDown and Sudoku.
11.04.2025 16:29
Will be presenting this work at #NeurIPS2024, today 11am, poster #2311. Come visit us!
12.12.2024 16:45
Heading to NeurIPS tomorrow ✈️
Will be presenting a few papers during the week. Ping me if you want to chat!
09.12.2024 14:55
I defended my PhD dissertation back in May. I didn't have time to share it widely then (newborn baby), but I think some of you might enjoy it, especially the opening chapters: benjaminedelman.com/assets/disse...
02.12.2024 00:20
Just put together a starter pack for Deep Learning Theory. Let me know if you'd like to be included, or suggest someone to add to the list!
go.bsky.app/2qnppia
22.11.2024 21:35
How does test loss change as we change the training data? And how does this interact with scaling laws?
We propose a methodology for approaching these questions, showing that simple shifted power-law fits can predict performance across datasets and losses.
21.11.2024 15:11
๐ 19
๐ 7
๐ฌ 1
๐ 2