How do LLMs (mis)represent culture?
How often?
Misrepresentations = missing knowledge? Spoiler: NO!
At #CHI2026 we are bringing ✨TALES✨, a participatory evaluation of cultural (mis)representations & knowledge in multilingual LLM stories for India.
arxiv.org/abs/2511.21322
1/10
02.02.2026 21:38
NLP 4 Democracy - COLM 2025
I will be at #COLM2025 this week, and would love to connect with folks interested in applications (and critiques) of language modeling in social science research!
And join us for the NLP4Democracy workshop on Friday!
sites.google.com/andrew.cmu.e...
#NLP #NLProc #LLM #ComputationalSocialScience
06.10.2025 19:31
For the SoLaR workshop
@COLM_conf
we are soliciting opinion abstracts to encourage new perspectives and opinions on responsible language modeling, 1-2 of which will be selected to be presented at the workshop.
Please use the Google form below to submit your opinion abstract ⬇️
08.08.2025 12:40
I'll be at #ACL2025 🇦🇹!!
Would love to chat about all things pragmatics, redefining "helpfulness", and enabling better cross-cultural capabilities!
Presenting our work on culturally offensive nonverbal gestures:
Wed @ Poster Session 4
Hall 4/5, 11:00-12:30
26.07.2025 02:46
Using Hand Gestures To Evaluate AI Biases - Language Technologies Institute - School of Computer Science - Carnegie Mellon University
LTI researchers have created a model to help generative AI systems understand the cultural nuance of gestures.
Hand gestures are a major mode of human communication, but they don't always translate well across cultures. New research from @akhilayerukola.bsky.social, @maartensap.bsky.social and others is aimed at giving AI systems a hand with overcoming cultural biases:
lti.cmu.edu/news-and-eve...
27.06.2025 18:04
An overview of the work "Research Borderlands: Analysing Writing Across Research Cultures" by Shaily Bhatt, Tal August, and Maria Antoniak. The overview describes: we survey and interview interdisciplinary researchers (§3) to develop a framework of writing norms that vary across research cultures (§4) and operationalise them using computational metrics (§5). We then use this evaluation suite for two large-scale quantitative analyses: (a) surfacing variations in writing across 11 communities (§6); (b) evaluating the cultural competence of LLMs when adapting writing from one community to another (§7).
Curious how writing differs across (research) cultures?
Tired of "cultural" evals that don't consult people?
We engaged with interdisciplinary researchers to identify & measure ✨cultural norms✨ in scientific writing, and show that LLMs flatten them!
arxiv.org/abs/2506.00784
[1/11]
09.06.2025 23:29
When it comes to text prediction, where does one LM outperform another? If you've ever worked on LM evals, you know this question is a lot more complex than it seems. In our new #acl2025 paper, we developed a method to find fine-grained differences between LMs:
🧵 1/9
09.06.2025 13:47
NLP 4 Democracy - COLM 2025
Super excited to organize the first workshop on ✨NLP for Democracy✨ at COLM @colmweb.org!!
Check out our website: sites.google.com/andrew.cmu.e...
Call for submissions (extended abstracts) due June 19, 11:59pm AoE
#COLM2025 #LLMs #NLP #NLProc #ComputationalSocialScience
21.05.2025 16:39
Yes! tbh this method is probably much more immediately useful for helping one understand subtle differences between [models trained on] subtly different data subsets, vs a loftier goal of helping one find "the" best data mixture -- to anyone considering this method, please feel free to reach out :)
06.05.2025 04:16
These days RAG systems have gotten popular for boosting LLMs, but they're brittle. Minor shifts in phrasing (style, politeness, typos) can wreck the pipeline. Even advanced components don't fix the issue.
Check out this extensive eval by @neelbhandari.bsky.social and @tianyucao.bsky.social!
18.04.2025 01:49
For our last @MilaNLProc lab seminar, it was a pleasure to have @akhilayerukola.bsky.social presenting "Need for Culturally Contextual Safety Guardrails: A Case Study in Non-Verbal Gestures".
14.03.2025 16:16
New #ICLR2025 Paper Alert!
Can Audio Foundation Models like Moshi and GPT-4o truly engage in natural conversations?
We benchmark their turn-taking abilities and uncover major gaps in conversational AI. 🧵
arxiv.org/abs/2503.01174
05.03.2025 16:03
Check out Akhila's very cool work on culturally contextual hand gestures and how current systems (can't) handle them!
26.02.2025 22:29
My PhD student Akhila's been doing some incredible cultural work in the last few years! Check out our latest work on cultural safety and hand gestures, showing most vision and/or language AI systems are very cross-culturally unsafe!
26.02.2025 17:20
Also, this work began while I interned with Nanyun Peng and @skgabrie.bsky.social at sunny UCLA under the guidance of my advisor @maartensap.bsky.social! Grateful for their mentorship throughout!
26.02.2025 18:29
Special thanks to @sunipadev.bsky.social, @841io.bsky.social, @nouhadziri.bsky.social, Jocelyn Shen, @shaily99.bsky.social, @simi97k.bsky.social, @vijaytarian.bsky.social, and @apratapa.xyz for helpful discussions and feedback on this work!
26.02.2025 16:22
Huge shoutout to my amazing collaborators: @skgabrie.bsky.social, Nanyun (Violet) Peng, @maartensap.bsky.social!!
26.02.2025 16:22
I'm passionate about developing culturally contextual safety guardrails to make AI more sensitive and aware. If this work interests you, please feel free to reach out; I'd love to connect!
26.02.2025 16:22
The cross-cultural safety risks aren't theoretical; they're already impacting several applications, such as:
- AI-powered travel guides
- AI-generated ad visuals
- Automated content moderation
Culturally contextual safety guardrails are needed for AI systems!
26.02.2025 16:22
Key Takeaway:
All models (T2I, LLMs, and VLMs) exhibit US-centric biases, with higher accuracy in identifying offensive gestures in US contexts than in non-US ones (e.g., the middle finger in the US vs. the UK).
26.02.2025 16:22
Key Takeaway:
All models (T2I, LLMs, and VLMs) often default to US-centric interpretations of universal concepts (e.g., mapping "good luck" to the fingers-crossed gesture), overlooking the cultural variation in gestures used to express them.
26.02.2025 16:22
Key Takeaway:
(a) T2I models struggle to reject offensive gestures. LLMs tend to over-flag gestures as offensive. VLMs show mixed results, with some performing near chance and others over-flagging.
(b) Adding scene context doesn't affect LLMs but worsens T2I and VLM performance.
26.02.2025 16:22
Table outlining different prompt formulations used to evaluate T2I (Text-to-Image), LLM (Large Language Model), and VLM (Vision-Language Model) responses to gestures, illustrated with the "fingers-crossed" gesture in Vietnam. The table categorizes prompts into three conditions: (1) Explicit: Country, directly stating both "fingers-crossed" and "Vietnam"; (2) Explicit: Country + Scene, adding contextual details such as a "women's community gathering"; and (3) Implicit Mention, referencing the gesture's meaning ("wishing someone luck") without explicitly naming the gesture, while still mentioning Vietnam. The table also specifies evaluation metrics: RQ1 and RQ3 focus on rejection and offensiveness classification rates, while RQ2 measures error rates.
We assess how well T2I systems, LLMs, and VLMs understand cross-cultural gestures, revealing gaps in AI's ability to navigate nonverbal communication safely.
26.02.2025 16:22
Table displaying examples of aggregated annotations from MC-SIGNS, listing gestures, their associated cultural meanings, contexts where they may be inappropriate, and their offensiveness ratings. The table includes gestures such as 'Horns' in Brazil (infidelity), 'Fig Sign' in Indonesia (female genitalia), and 'OK' in Turkey (homophobic). Each gesture is rated for offensiveness (Off/Obs) or hatefulness (Hate) based on annotations from five evaluators, with specific scenarios suggested for avoidance, such as public spaces, professional settings, or LGBTQ+ forums.
Introducing MC-SIGNS: a testbed of 288 gesture-country pairs across 25 gestures & 85 countries, carefully annotated by cultural experts for:
1. Offensiveness: how inappropriate a gesture is
2. Confidence score
3. Cultural meaning: the associated gloss
4. Contextual factors: when/where it may be risky
26.02.2025 16:22
Why does this matter?
Humans can resolve such misunderstandings through social cues and context.
But AI? It generates STATIC content (ads, travel tips, images) without accounting for cross-cultural safety risks.
26.02.2025 16:22
Figure showing that interpretations of gestures vary dramatically across regions and cultures. "Crossing your fingers," commonly used in the US to wish for good luck, can be deeply offensive to female audiences in parts of Vietnam. Similarly, the "fig gesture," a playful "got your nose" game with children in the US, carries strong sexual connotations in Japan and can be highly offensive.
Did you know? Gestures used to express universal concepts, like wishing for luck, vary DRAMATICALLY across cultures.
The fingers-crossed gesture means luck in the US but is deeply offensive in Vietnam!
We introduce MC-SIGNS, a testbed to evaluate how LLMs/VLMs/T2I systems handle such nonverbal behavior!
arxiv.org/abs/2502.17710
26.02.2025 16:22
Coding agents often don't ask follow-up clarifying questions.
But interactivity isn't about asking more questions; it's about asking better questions!
Check out this new work led by Sanidhya Vijay! www.linkedin.com/in/sanidhya-...
19.02.2025 20:34
Could I be added too?
22.11.2024 04:36
🙋‍♀️
21.11.2024 00:48