Check out our work on preference modeling through latent (& interpretable) attribute representation learning!
PrefPalette allows you to understand _why_ something is preferred and _how_ preference varies depending on context
WHY do you prefer one thing over another?
Reward models treat preference as a black box, but human brains decompose decisions into hidden attributes
In our recent COLM paper, we built the first system to mirror how people really make decisions: PrefPalette
Why it matters (thread below) 🧵
See our work on procedurally generating challenging reasoning problems for detecting inconsistencies in stories! FlawedFictions is a great example of what I'm most excited about: reliable synthetic data for reasoning in under-explored domains.
(I'm at ICLR to chat, DMs open!)
Excited to be at #ICLR2025!
I'll be giving an oral presentation for Creativity Index on Fri 25th 11:06, Garnet 212 & 219
I'll also be presenting posters:
- ExploreToM, Sat 26th 10:00, Hall 3 + 2B #49
- CreativityIndex, Fri 25th 15:00, Hall 3 + 2B #618
Hope to see you there!
A screenshot of the first page of the paper, containing the paper title: Finding Flawed Fictions: Evaluating Complex Reasoning in Language Models via Plot Hole Detection, and the names of the authors: Kabir Ahuja, Melanie Sclar, and Yulia Tsvetkov. All three authors are from the CSE department at the University of Washington in Seattle, USA. They can be reached at {kahuja,msclar,yuliats}@cs.washington.edu
New paper!
Tired of reasoning benchmarks full of math & code? In our work we consider the problem of reasoning about plot holes in stories -- inconsistencies in a storyline that break the internal logic or rules of a story's world
With @melaniesclar.bsky.social and @tsvetshop.bsky.social
1/n
New paper! So o3-mini and R1 seem to excel at math & coding. But how good are they in other domains where verifiable rewards are not easily available, such as theory of mind (ToM)? Do they show similar behavioral patterns? What if I told you it's... interesting, like the below? 🧵
Would love to be added, thank you!
Great point! In general, higher temperature does lead to a higher creativity index, but when compared to the gap between human and LLMs, the improvement is minimal. We only tried temperature in the usual [0, 1] range. @gximing.bsky.social will be able to share many more details!
A screenshot from the linked paper's figure 1. The figure is a fairly complicated three-column figure, but, in essence, it sketches out how the authors compare LLM sequences to the pretraining data and human-authored sequences to the pretraining data. Humans write more novel n-gram sequences.
LLMs generate novel word sequences not contained in their pretraining data. However, compared to humans, models generate significantly fewer novel n-grams.
RLHF = 30% *more* copying than base!
Awesome work from the awesome Ximing Lu (gloriaximinglu.github.io) et al.
arxiv.org/pdf/2410.04265
Are LLMs as creative as humans? Not quite!
Introducing CREATIVITY INDEX: a metric that quantifies the linguistic creativity of a text by reconstructing it from existing text snippets on the web. Spoiler: professional human writers like Hemingway are still far more creative than LLMs!
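To make the idea concrete, here's a minimal sketch of the kind of n-gram matching behind a metric like this. Everything here is illustrative (function name, toy corpus, choice of n); the real Creativity Index matches snippets against web-scale pretraining data, not an in-memory list.

```python
def novel_ngram_fraction(text, corpus_texts, n=5):
    """Fraction of word n-grams in `text` that do NOT appear in any
    corpus document. Higher = more linguistically novel text.
    Toy stand-in for matching against web-scale data."""
    def ngrams(words, n):
        # All contiguous word n-grams, as hashable tuples
        return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

    corpus_grams = set()
    for doc in corpus_texts:
        corpus_grams |= ngrams(doc.lower().split(), n)

    text_grams = ngrams(text.lower().split(), n)
    if not text_grams:
        return 0.0  # text shorter than n words
    return len(text_grams - corpus_grams) / len(text_grams)

# Toy usage: a 6-word text against a 5-word corpus doc
score = novel_ngram_fraction("the old man and the sea",
                             ["the old man and the"], n=5)
# One of the two 5-grams is found in the corpus -> 0.5
```

Exact n-gram overlap is the simplest version; looser, paraphrase-tolerant matching would lower measured novelty further.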
Does last name count? π
Would love to be added, thank you!!
Would love to be on the list, thank you so much for making this happen!