Hellina Hailu Nigatu

@hellinanigatu

CS PhD candidate @UCBerkeley. Interested in multilingual and low-resourced language NLP + HCI. @SIGHPC CDS Fellow. Interned @MBZUAI. Current intern at DAIR. Website: https://hhnigatu.github.io

2,251
Followers
258
Following
136
Posts
15.11.2024
Joined
Latest posts by Hellina Hailu Nigatu @hellinanigatu

i’m looking to recruit a postdoc to work on this (documentation + evaluation on accuracy, reliability and societal impacts). hope to advertise detailed descriptions of the role in the coming weeks

18.10.2025 13:27 👍 28 🔁 29 💬 1 📌 0

Book #12 How Dare the Sun Rise? By Sandra Uwiringiyimana from the DRC

This was a heavy one...I had to sit for a while with the first few chapters as Sandra recounted her experience of loss and grief...

16.10.2025 16:40 👍 8 🔁 2 💬 0 📌 0

😂😂 works for me

10.10.2025 22:18 👍 1 🔁 0 💬 0 📌 0

ብቻዋን የበላች....😌

10.10.2025 20:56 👍 1 🔁 0 💬 1 📌 0

Congrats!!

06.10.2025 22:47 👍 0 🔁 0 💬 0 📌 0

There is so much about navigating the Internet in a low resourced language that makes one unnecessarily vulnerable to malicious actors. It's not just a quality of experience difference, but literally the soft belly through which misinformation spreaders attack.

25.09.2025 22:21 👍 10 🔁 6 💬 0 📌 0

Thank you friend ❤

25.09.2025 23:28 👍 1 🔁 0 💬 0 📌 0

This work was done with my wonderful collaborators Nuredin Ali, Fiker Tewelde, @schancellor.bsky.social and @iamdaricia.bsky.social

5/n

25.09.2025 17:54 👍 3 🔁 0 💬 1 📌 0

Based on our findings, we introduce the concept of Data Horizons: a critical boundary where algorithmic structures begin to degrade the relevance and reliability of search results.

4/n

25.09.2025 17:54 👍 2 🔁 0 💬 1 📌 0

We investigate online health information on #YouTube and #TikTok in two low-web data languages, Amharic and Tigrinya. We find that linguistic, technological, and socio-cultural constraints on information access and production lead to degraded information quality for low-web data languages.

3/n

25.09.2025 17:54 👍 4 🔁 0 💬 1 📌 0

While social media platforms are increasingly being used as sources of information for critical sectors like healthcare, the quality and quantity of information available is not always guaranteed, especially for languages with limited data available online.
2/n

25.09.2025 17:54 👍 2 🔁 0 💬 1 📌 0

Very excited for our upcoming #AIES paper Into the Void: Understanding Online Health Information in Low-Web Data Languages.

Link: arxiv.org/pdf/2509.20245

1/n

25.09.2025 17:54 👍 9 🔁 1 💬 1 📌 1

I will DM you!

24.09.2025 23:08 👍 0 🔁 0 💬 0 📌 0

እንኳን አብሮ አደረሰን!
So far so good navigating the documentation! Will reach out if i need help or have questions 😊 thank you!

24.09.2025 00:40 👍 1 🔁 0 💬 1 📌 0

@meg48.bsky.social's Ethiopian New Year's gift to me is a new version of HornMorpho exactly as i am working on a project that requires a morphological analyzer for Amharic, Tigrinya, and Afan Oromo 💃💃

23.09.2025 17:00 👍 0 🔁 0 💬 1 📌 0

That explains a lot 😂😂

04.09.2025 23:44 👍 1 🔁 0 💬 0 📌 0

What are you up to Nina 👀

04.09.2025 17:29 👍 0 🔁 0 💬 1 📌 0

If you or your students are interested in visualization tools, may I suggest signing up for my student @parkie-doo.sh's study! We're learning *a lot* about how to build direct manipulation programming tools these days! Please pass the sign up link along to your labs!
docs.google.com/forms/d/e/1F...

28.08.2025 20:55 👍 5 🔁 1 💬 0 📌 0
Portrait of Milagros Miceli in a frame that reads TIME100/AI 2025.

I am thrilled to be recognized by TIME as one of the 100 most influential people worldwide in the field of artificial intelligence for my work with @dataworkersinquiry.bsky.social.

>> #TIME100AI time.com/time100ai

I want to take this opportunity to share a few reflections on this work 👇🧵

28.08.2025 12:21 👍 57 🔁 17 💬 5 📌 3

Oh no! I ran out of wall space for my tally!!!😌

26.08.2025 17:38 👍 1 🔁 0 💬 0 📌 0

I am gonna start a tally for every time i have to contend with publication policies at top tier conferences that implicitly stall Global South scholarship.

26.08.2025 17:36 👍 2 🔁 0 💬 1 📌 0

Came across a common Ethiopian name on one of the poems in this book as a dedication 😊

13.08.2025 22:07 👍 1 🔁 0 💬 0 📌 0

this is not to say all MT is bad or MT has no place in contribution...more on that as an output of my work @dairinstitute.bsky.social 😎

12.08.2025 20:25 👍 2 🔁 1 💬 0 📌 0

Lol here is an example:

A google translated Tigrinya article: ti.wikipedia.org/wiki/%E1%88%...

English version: en.wikipedia.org/wiki/Wedding...

I took the part that says "Ethiopia" from the English article and ran it through Google Translate...almost identical output save a few words.

12.08.2025 20:20 👍 1 🔁 2 💬 1 📌 0

Book #11
Missing in action and presumed Dead by Rashidah Ismaili from Benin

Got this from Thrift Books and by luck got a version with the author signature ☺️

It's a beautiful collection of poems and my fav one is Nomad attached in the picture below

12.08.2025 15:02 👍 3 🔁 0 💬 1 📌 0

Omg our advisor @schasins.bsky.social got us beanbags for our lab space a while back and we loveee them

11.08.2025 18:00 👍 2 🔁 0 💬 0 📌 0

This is a good step IMO...but i think we conflate "Wikipedia" with "English Wikipedia" and "AI Generated" with "LLM generated"

We should also be having conversations on Machine Translated text in non-English Wikipedia...those are also "AI Generated"

11.08.2025 16:50 👍 9 🔁 4 💬 0 📌 1

Was a pleasure to work with you Chinasa ❤ here is to many more collaborations 🥂

30.07.2025 21:08 👍 1 🔁 0 💬 1 📌 0
Screenshot of paper on the ACL website with the title (Examining the Cultural Encoding of Gender Bias in LLMs for Low-Resourced African Languages) and abstract that reads: "Abstract
Large Language Models (LLMs) are deployed in several aspects of everyday life. While the technology could have several benefits, like many socio-technical systems, it also encodes several biases. Trained on large, crawled datasets from the web, these models perpetuate stereotypes and regurgitate representational bias that is rampant in their training data. Languages encode gender in varying ways; some languages are grammatically gendered, while others do not. Bias in the languages themselves may also vary based on cultural, social, and religious contexts. In this paper, we investigate gender bias in LLMs by selecting two languages, Twi and Amharic. Twi is a non-gendered African language spoken in Ghana, while Amharic is a gendered language spoken in Ethiopia. Using these two languages on the two ends of the continent and their opposing grammatical gender system, we evaluate LLMs in three tasks: Machine Translation, Image Generation, and Sentence Completion. Our results give insights into the gender bias encoded in LLMs using two low-resourced languages and broaden the conversation on how culture and social structures play a role in disparate system performances."

My latest work, “Examining the Cultural Encoding of Gender Bias in LLMs for Low-Resourced African Languages,” co-authored with Abigail Oppong and Hellina Nigatu, is now published at the Workshop on Gender Bias in Natural Language Processing at #ACL2025!

aclanthology.org/2025.gebnlp-...

30.07.2025 20:16 👍 7 🔁 3 💬 1 📌 0