
Sonia Murthy

@soniakmurthy

cs phd student and kempner institute graduate fellow at harvard. interested in language, cognition, and ai soniamurthy.com

57
Followers
19
Following
12
Posts
30.01.2025
Joined

Latest posts by Sonia Murthy @soniakmurthy

Thanks Hope! I just came across your related work with the CSS team at Microsoft; I'd love to chat about it sometime if you're free 🙂

11.02.2025 23:20 👍 1 🔁 0 💬 0 📌 0

Hi Daniel, thanks so much. The preprint is reliable, though it's missing a little additional discussion that made it into the camera-ready. I can email you the camera-ready and will update arXiv with it shortly. Thank you!

11.02.2025 23:13 👍 1 🔁 0 💬 0 📌 0
Preview
Alignment reduces conceptual diversity of language models - Kempner Institute As large language models (LLMs) have become more sophisticated, there’s been growing interest in using LLM-generated responses in place of human data for tasks such as polling, user studies, and […]

NEW blog post: Do modern #LLMs capture the conceptual diversity of human populations? #KempnerInstitute researchers find #alignment reduces conceptual diversity of language models. bit.ly/4hNjtiI

10.02.2025 15:19 👍 12 🔁 3 💬 0 📌 0

Many thanks to my collaborators and @kempnerinstitute.bsky.social for helping make this idea come to life, and to @rdhawkins.bsky.social for helping plant the seeds 🌱

10.02.2025 17:20 👍 2 🔁 1 💬 0 📌 0
Preview
GitHub - skmur/onefish-twofish: One fish, two fish, but not the whole sea: Alignment reduces language models' conceptual diversity (NAACL 2025)

(9/9) Code and data for our experiments can be found at: github.com/skmur/onefis...
Preprint: arxiv.org/abs/2411.04427

Also, check out our feature in the @kempnerinstitute.bsky.social Deeper Learning Blog! bit.ly/417WVDL

10.02.2025 17:20 👍 0 🔁 0 💬 1 📌 0

(8/9) We think that better understanding such tradeoffs will be important to building LLMs that are aligned to human values: human values are diverse, so our models should be too.

10.02.2025 17:20 👍 1 🔁 0 💬 1 📌 0

(7/9) This suggests a trade-off: increasing model safety in terms of value alignment decreases safety in terms of diversity of thought and opinion.

10.02.2025 17:20 👍 3 🔁 0 💬 1 📌 0
Post image Post image

(6/9) We put a suite of aligned models, and their instruction fine-tuned counterparts, to the test and found:
* no model reaches human-like diversity of thought;
* aligned models show LESS conceptual diversity than their instruction fine-tuned counterparts.

10.02.2025 17:20 👍 1 🔁 0 💬 1 📌 0
Post image

(5/9) Our experiments are inspired by human studies in two domains with rich behavioral data.

10.02.2025 17:20 👍 2 🔁 0 💬 1 📌 0
Post image

(4/9) We introduce a new way of measuring the conceptual diversity of synthetically generated LLM "populations" by considering how the variability of each "individual" relates to that of the population as a whole.

10.02.2025 17:20 👍 0 🔁 0 💬 1 📌 0
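(The thread doesn't spell out the metric itself, so here is a hypothetical sketch of the within- vs. between-individual idea: if each simulated "individual" answers a prompt several times, conceptual diversity shows up as pooled population variance exceeding the average within-individual variance. All names and numbers below are illustrative, not the paper's actual implementation.)

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: each "individual" answers the same question several
# times; responses are points in some feature space (e.g., embeddings).
n_individuals, n_samples, dim = 20, 10, 3
means = rng.normal(0.0, 1.0, (n_individuals, dim))  # each individual's concept
responses = means[:, None, :] + rng.normal(0.0, 0.3, (n_individuals, n_samples, dim))

# Within-individual variability: average spread of one individual's answers.
within = responses.var(axis=1).mean()

# Population variability: spread of all answers pooled across individuals.
population = responses.reshape(-1, dim).var(axis=0).mean()

# If individuals genuinely differ (conceptual diversity), pooled variance
# exceeds within-individual variance; a ratio near 1 indicates a
# homogeneous "population" whose members all share one concept.
diversity_ratio = population / within
print(diversity_ratio > 1.0)
```

A homogeneous population (set the `means` spread to 0) would drive the ratio toward 1, which is one way a lack of human-like diversity could register.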

(3/9) One key issue is whether LLMs capture conceptual diversity: the variation among individuals' representations of a particular domain. How do we measure this? And how does alignment affect it?

10.02.2025 17:20 👍 2 🔁 0 💬 1 📌 0

(2/9) There's a lot of interest right now in getting LLMs to mimic the response distributions of "populations" (heterogeneous collections of individuals) for the purposes of political polling, opinion surveys, and behavioral research.

10.02.2025 17:20 👍 2 🔁 0 💬 1 📌 0
Post image

(1/9) Excited to share my recent work on "Alignment reduces LMs' conceptual diversity" with @tomerullman.bsky.social and @jennhu.bsky.social, to appear at #NAACL2025! 🐟

We want models that match our values... but could this hurt their diversity of thought?
Preprint: arxiv.org/abs/2411.04427

10.02.2025 17:20 👍 63 🔁 10 💬 2 📌 4