Akhila Yerukola

@akhilayerukola

PhD student at CMU LTI; Interested in pragmatics and cross-cultural understanding; intern @ Allen Institute for AI | Prev: Senior Research Engineer @ Samsung Research America | Masters @ Stanford https://akhila-yerukola.github.io/

402
Followers
236
Following
18
Posts
20.11.2024
Joined

Latest posts by Akhila Yerukola @akhilayerukola

🎭 How do LLMs (mis)represent culture?
🧮 How often?
🧠 Misrepresentations = missing knowledge? spoiler: NO!

At #CHI2026 we are bringing ✨TALES✨ a participatory evaluation of cultural (mis)reps & knowledge in multilingual LLM-stories for India

📜 arxiv.org/abs/2511.21322

1/10

02.02.2026 21:38 👍 45 🔁 22 💬 1 📌 2
NLP 4 Democracy - COLM 2025

I will be at #COLM2025 this week, and would love to connect with folks interested in applications (and critiques) of language modeling in social science research!

And join us for the NLP4Democracy workshop on Friday!

sites.google.com/andrew.cmu.e...

#NLP #NLProc #LLM #ComputationalSocialScience

06.10.2025 19:31 👍 16 🔁 5 💬 0 📌 0

🔈 For the SoLaR workshop
@COLM_conf
we are soliciting opinion abstracts to encourage new perspectives and opinions on responsible language modeling, 1-2 of which will be selected to be presented at the workshop.

Please use the google form below to submit your opinion abstract ⬇️

08.08.2025 12:40 👍 8 🔁 4 💬 1 📌 0

I'll be at #ACL2025 🇦🇹!!
Would love to chat about all things pragmatics 🧠, redefining "helpfulness" 🤔 and enabling better cross-cultural capabilities 🗺️ 🫶

Presenting our work on culturally offensive nonverbal gestures 👇
🕛 Wed @ Poster Session 4
📍 Hall 4/5, 11:00-12:30

26.07.2025 02:46 👍 4 🔁 1 💬 0 📌 0
Using Hand Gestures To Evaluate AI Biases - Language Technologies Institute - School of Computer Science - Carnegie Mellon University
LTI researchers have created a model to help generative AI systems understand the cultural nuance of gestures.

Hand gestures are a major mode of human communication, but they don't always translate well across cultures. New research from @akhilayerukola.bsky.social, @maartensap.bsky.social and others is aimed at giving AI systems a hand with overcoming cultural biases:
lti.cmu.edu/news-and-eve...

27.06.2025 18:04 👍 8 🔁 3 💬 0 📌 0
An overview of the work "Research Borderlands: Analysing Writing Across Research Cultures" by Shaily Bhatt, Tal August, and Maria Antoniak. The overview describes that we survey and interview interdisciplinary researchers (§3) to develop a framework of writing norms that vary across research cultures (§4) and operationalise them using computational metrics (§5). We then use this evaluation suite for two large-scale quantitative analyses: (a) surfacing variations in writing across 11 communities (§6); (b) evaluating the cultural competence of LLMs when adapting writing from one community to another (§7).

🖋️ Curious how writing differs across (research) cultures?
🚩 Tired of "cultural" evals that don't consult people?

We engaged with interdisciplinary researchers to identify & measure ✨cultural norms✨ in scientific writing, and show that ❗LLMs flatten them❗

📜 arxiv.org/abs/2506.00784

[1/11]

09.06.2025 23:29 👍 72 🔁 30 💬 1 📌 5

When it comes to text prediction, where does one LM outperform another? If you've ever worked on LM evals, you know this question is a lot more complex than it seems. In our new #acl2025 paper, we developed a method to find fine-grained differences between LMs:

🧵 1/9

09.06.2025 13:47 👍 72 🔁 21 💬 2 📌 2
NLP 4 Democracy - COLM 2025

📣 Super excited to organize the first workshop on ✨NLP for Democracy✨ at COLM @colmweb.org!!

Check out our website: sites.google.com/andrew.cmu.e...

Call for submissions (extended abstracts) due June 19, 11:59pm AoE

#COLM2025 #LLMs #NLP #NLProc #ComputationalSocialScience

21.05.2025 16:39 👍 47 🔁 18 💬 1 📌 6

Yes! tbh this method is probably much more immediately useful for helping one understand subtle differences between [models trained on] subtly different data subsets, vs a loftier goal of helping one find "the" best data mixture -- to anyone considering this method, please feel free to reach out :)

06.05.2025 04:16 👍 2 🔁 1 💬 0 📌 0

These days RAG systems have gotten popular for boosting LLMs, but they're brittle 💔. Minor shifts in phrasing (✍️ style, politeness, typos) can wreck the pipeline. Even advanced components don't fix the issue.

Check out this extensive eval by @neelbhandari.bsky.social and @tianyucao.bsky.social!

18.04.2025 01:49 👍 1 🔁 1 💬 0 📌 0

📖 For our last @MilaNLProc lab seminar, it was a pleasure to have @akhilayerukola.bsky.social presenting "Need for Culturally Contextual Safety Guardrails: A Case Study in Non-Verbal Gestures".

14.03.2025 16:16 👍 8 🔁 3 💬 0 📌 0

🚀 New #ICLR2025 Paper Alert! 🚀

Can Audio Foundation Models like Moshi and GPT-4o truly engage in natural conversations? 🗣️🔊

We benchmark their turn-taking abilities and uncover major gaps in conversational AI. 🧵👇

📜: arxiv.org/abs/2503.01174

05.03.2025 16:03 👍 9 🔁 6 💬 1 📌 0

Check out Akhila's VERY cool work on culturally contextual hand gestures and how current systems (can't) handle them 🤖

26.02.2025 22:29 👍 6 🔁 2 💬 0 📌 0

My PhD student Akhila's been doing some incredible cultural work in the last few years! Check out our latest work on cultural safety and hand gestures, showing most vision and/or language AI systems are very cross-culturally unsafe!

26.02.2025 17:20 👍 27 🔁 3 💬 0 📌 0

Also, this work began while I interned with Nanyun Peng and @skgabrie.bsky.social at sunny UCLA under the guidance of my advisor @maartensap.bsky.social! Grateful for their mentorship throughout! 🙌

26.02.2025 18:29 👍 2 🔁 0 💬 0 📌 0

Special thanks to: @sunipadev.bsky.social, @841io.bsky.social, @nouhadziri.bsky.social, Jocelyn Shen, @shaily99.bsky.social, @simi97k.bsky.social,
@vijaytarian.bsky.social, @apratapa.xyz for helpful discussions and feedback on this work!

26.02.2025 16:22 👍 1 🔁 0 💬 1 📌 0

Huge shoutout to my amazing collaborators: @skgabrie.bsky.social, Nanyun (Violet) Peng, @maartensap.bsky.social!!

26.02.2025 16:22 👍 2 🔁 0 💬 1 📌 0

🚀 I'm passionate about developing culturally contextual safety guardrails to make AI more sensitive and aware. If this work interests you, please feel free to reach out. I'd love to connect!

26.02.2025 16:22 👍 1 🔁 0 💬 1 📌 0
Mind the Gesture: Evaluating AI Sensitivity to Culturally Offensive Non-Verbal Gestures
Gestures are an integral part of non-verbal communication, with meanings that vary across cultures, and misinterpretations that can have serious social and diplomatic consequences. As AI systems becom...

For more interesting findings, please check out our preprint 📜 arxiv.org/abs/2502.17710

Data 📚 github.com/Akhila-Yeruk...

26.02.2025 16:22 👍 2 🔁 1 💬 1 📌 0

The cross-cultural safety risks aren't theoretical – they're already impacting several applications, such as:
✈️ AI-powered travel guides
🎭 AI-generated ad visuals
🤖 Automated content moderation
Culturally contextual safety guardrails are needed for AI systems!

26.02.2025 16:22 👍 1 🔁 0 💬 1 📌 0

🔬 Key Takeaway 🥉
All models (T2I, LLMs, and VLMs) exhibit US-centric biases, with higher accuracy in identifying offensive gestures in US contexts than in non-US ones (e.g., middle finger 🖕 in US vs UK)

26.02.2025 16:22 👍 0 🔁 0 💬 1 📌 0

🔬 Key Takeaway 🥈
All models (T2I, LLMs, and VLMs) often default to US-centric interpretations of universal concepts (e.g., "good luck" → 🤞), overlooking the cultural variation in gestures used to express them

26.02.2025 16:22 👍 1 🔁 0 💬 1 📌 0

🔬 Key Takeaway 🥇
(a) T2I models struggle to reject offensive gestures. LLMs tend to over-flag gestures as offensive. VLMs show mixed results, with some performing near chance and others over-flagging
(b) Adding scene context doesn't affect LLMs but worsens T2I and VLM performance

26.02.2025 16:22 👍 1 🔁 0 💬 1 📌 0
Table outlining different prompt formulations used to evaluate T2I (Text-to-Image), LLM (Large Language Model), and VLM (Vision-Language Model) responses to gestures, illustrated with the 'fingers-crossed' gesture in Vietnam. The table categorizes prompts into three conditions: (1) Explicit: Country – directly stating both 'fingers-crossed' and 'Vietnam', (2) Explicit: Country + Scene – adding contextual details such as a 'women's community gathering,' and (3) Implicit Mention – referencing the gesture's meaning ('wishing someone luck') without explicitly naming the gesture, while still mentioning Vietnam. The table also specifies evaluation metrics: RQ1 and RQ3 focus on rejection and offensiveness classification rates, while RQ2 measures error rates.

We assess how well T2I systems, LLMs, and VLMs understand cross-cultural gestures, revealing gaps in AI's ability to navigate nonverbal communication safely. 💫

26.02.2025 16:22 👍 1 🔁 0 💬 1 📌 0
Table displaying examples of aggregated annotations from MC-SIGNS, listing gestures, their associated cultural meanings, contexts where they may be inappropriate, and their offensiveness ratings. The table includes gestures such as 'Horns' in Brazil (infidelity), 'Fig Sign' in Indonesia (female genitalia), and 'OK' in Turkey (homophobic). Each gesture is rated for offensiveness (Off/Obs) or hatefulness (Hate) based on annotations from five evaluators, with specific scenarios suggested for avoidance, such as public spaces, professional settings, or LGBTQ+ forums.

🌍 Introducing MC-SIGNS, a testbed of 288 gesture-country pairs across 25 gestures & 85 countries, carefully annotated by cultural experts for:
1️⃣ Offensiveness – how inappropriate a gesture is
2️⃣ Confidence score
3️⃣ Cultural meaning – associated gloss
4️⃣ Contextual factors – when/where it may be risky

26.02.2025 16:22 👍 1 🔁 0 💬 1 📌 0

Why This Matters? 🤔
Humans can resolve such misunderstandings through social cues and context.
But AI? It generates STATIC content (ads 🎭, travel tips 🛫🏝️, and images 📸) without accounting for the cross-cultural safety risks.

26.02.2025 16:22 👍 2 🔁 0 💬 1 📌 0
Figure showing that interpretations of gestures vary dramatically across regions and cultures. 'Crossing your fingers,' commonly used in the US to wish for good luck, can be deeply offensive to female audiences in parts of Vietnam. Similarly, the 'fig gesture,' a playful 'got your nose' game with children in the US, carries strong sexual connotations in Japan and can be highly offensive.

Did you know? Gestures used to express universal concepts, like wishing for luck, vary DRAMATICALLY across cultures?
🤞 means luck in US but deeply offensive in Vietnam 🚨

📣 We introduce MC-SIGNS, a test bed to evaluate how LLMs/VLMs/T2I handle such nonverbal behavior!

📜: arxiv.org/abs/2502.17710

26.02.2025 16:22 👍 33 🔁 7 💬 1 📌 3

Coding agents often don't ask follow-up clarifying questions 🤷‍♀️

But interactivity isn't about asking more questions; it's about asking better questions! 🤖💬

Check out this new work led by Sanidhya Vijay! www.linkedin.com/in/sanidhya-...

19.02.2025 20:34 👍 2 🔁 0 💬 0 📌 0

could I be added too?

22.11.2024 04:36 👍 1 🔁 0 💬 0 📌 0

🙋‍♀️

21.11.2024 00:48 👍 1 🔁 0 💬 0 📌 0