Verena Blaschke

@verenablaschke

PhD student @mainlp.bsky.social (@cislmu.bsky.social, LMU Munich). Interested in language variation & change, currently working on NLP for dialects and low-resource languages. verenablaschke.github.io

188 Followers
262 Following
28 Posts
Joined 03.02.2025

Latest posts by Verena Blaschke @verenablaschke

❗Deadline extension:
- Paper submission: Jan 2
- Commitment for pre-reviewed papers: Jan 10

sites.google.com/view/vardial...

16.12.2025 12:40 👍 0 🔁 0 💬 0 📌 0
VarDial 2026 - Shared Tasks. AMIYA (عامية) Shared Task: Arabic Modeling In Your Accent. The AMIYA shared task will offer a chance for researchers to demonstrate innovations and improvements in language modeling of dialectal Arabic...

Interested in developing LLMs that work for dialectal Arabic? Introducing the AMIYA shared task: Arabic Modeling In Your Accent, just accepted to VarDial 2026. Please consider submitting and joining us in Morocco if you do! sites.google.com/view/vardial...

12.11.2025 15:53 👍 7 🔁 4 💬 1 📌 0

📄 DistaLs: A Comprehensive Collection of Language Distance Measures
👥 Rob van der Goot, Esther Ploeger, @verenablaschke.bsky.social, Tanja Samardžić
🔗 aclanthology.org/2025.emnlp-d...
🎯 A convenient toolkit for obtaining distance measures across languages
▶️ www.youtube.com/watch?v=SSk9...

05.11.2025 13:17 👍 1 🔁 1 💬 0 📌 0

It was fun to do a bit of science outreach, but I also found it super interesting to get a look behind the scenes of how a TV segment is made 😃

24.10.2025 13:07 👍 2 🔁 0 💬 0 📌 0

The work I talked about is mainly described in this paper on German dialect ASR:
bsky.app/profile/vere...

24.10.2025 13:07 👍 2 🔁 0 💬 1 📌 0
Capriccio: Warum KI kein Bairisch versteht ("Why AI doesn't understand Bavarian") - watch here. Artificial intelligence is changing language: it not only adopts our vocabulary but also, in turn, shapes the way we express ourselves. Dialects are the one thing AI demonstrably struggles with,...

🎥 @barbaraplank.bsky.social, Hinrich Schütze, and I were featured on TV, talking about how AI tools struggle with dialects!
www.ardmediathek.de/video/capric...

24.10.2025 13:07 👍 8 🔁 2 💬 1 📌 0
We're investigating how publishers handle name changes and the barriers scholars face. If you've changed your name (or are considering it) and dealt with updating your academic publications, we want to hear from you.

Researchers who have changed their name for any reason, such as gender transition, marriage, divorce, immigration, cultural reasons, or citation formatting issues. Whether you've successfully updated your work, are currently trying, or decided not to because of barriers, your opinion matters.

Your input will help us advocate for better, more inclusive policies in academic publishing. It takes around 5-10 minutes to complete.

Survey Link: https://forms.cloud.microsoft/e/E0XXBmZdEP

Please share with anyone who might benefit.


We're surveying researchers about name changes in academic publishing.

If you've changed your name and dealt with updating publications, we want to hear your experience. Any reason counts: transition, marriage, cultural reasons, etc.

forms.cloud.microsoft/e/E0XXBmZdEP

21.10.2025 12:45 👍 16 🔁 23 💬 2 📌 2

Timeline:
- Paper submission: Dec 19
- Commitment for pre-reviewed papers: Jan 2
- Acceptance notifs: Jan 23
- Camera-ready: Feb 3
- Workshop: TBD (Mar 24-29)

Organizers:
Yves Scherrer, Noëmi Aepli, @tosaja.bsky.social, Nikola Ljubešić, Preslav Nakov, @tiedeman.bsky.social, Marcos Zampieri & me

21.10.2025 10:36 👍 0 🔁 1 💬 1 📌 0
VarDial @ EACL 2026, with important dates (see next post for text version). 
Photo CC-0.


VarDial 2026 will be colocated with @eaclmeeting.bsky.social! We're looking forward to your papers on NLP for similar languages, varieties and dialects :)

Deadline: Dec 19 (Jan 2 for pre-reviewed ARR papers)
sites.google.com/view/vardial...

21.10.2025 10:36 👍 14 🔁 10 💬 1 📌 0
Slide: "Dialect NLP: How (and why) to process non-standard language varieties"


Moin! I'm on my way to Hamburg to meet the @ds-hamburg.bsky.social group and give a talk about dialect NLP! ⚓

20.10.2025 04:53 👍 12 🔁 0 💬 1 📌 0

Thanks a lot!

14.10.2025 14:36 👍 0 🔁 0 💬 0 📌 0

Has the "Black LLMirror" work already been published / is it going to be turned into a publication? I'd love to read more about it!

14.10.2025 13:14 👍 0 🔁 0 💬 1 📌 0

#Interspeech2025 had a science fair today with lots of interactive speech tech demos, not just for conference attendees but also/especially for curious laypeople! The demos were fun, and I like the idea of combining a conference w/ a bit of scicomm for the local public

17.08.2025 20:11 👍 3 🔁 0 💬 0 📌 0
A Multi-Dialectal Dataset for German Dialect ASR and Dialect-to-Standard Speech Translation Although Germany has a diverse landscape of dialects, they are underrepresented in current automatic speech recognition (ASR) research. To enable studies of how robust models are towards dialectal var...

Check out the...
- talk on Mon Aug 18, 15:50–16:10
- preprint: arxiv.org/abs/2506.02894
- suppl. material: github.com/mainlp/betth...

Joint work w/ Miriam Winkler & @barbaraplank.bsky.social from @mainlp.bsky.social, and Constantin Förster & Gabriele Wenger-Glemser from Bayerischer Rundfunk!

07.08.2025 08:46 👍 5 🔁 1 💬 0 📌 0

Automatic metrics like WER and human quality judgements are moderately correlated. Dialectal words are often rendered as nonsense. Dialectal syntactic structures are often retained in the output – whether this is acceptable in Std German is hit-or-miss.

07.08.2025 08:46 👍 0 🔁 0 💬 1 📌 0

All ASR models we benchmark perform much better on Standard German than dialectal audio. Whether the transcriptions of the dialectal audios tend to be closer to the Std German references or to the dialectal references depends on the model decoder type.

07.08.2025 08:46 👍 0 🔁 0 💬 1 📌 0
A sentence from the dataset with a Standard German and a dialectal transcription that differ on the word and phrase level.


Betthupferl contains sentences from three dialect groups spoken in southeast Germany, as well as Std German sentences for comparison. The dialectal sentences have both dialectal and Std German gold transcriptions, showing differences between pronunciation, word choice and morphosyntax.

07.08.2025 08:46 👍 0 🔁 0 💬 1 📌 0
Paper title ("A multi-dialectal dataset for German dialect ASR and dialect-to-standard speech translation") and a map of the German state Bavaria showing where the Franconian, Bavarian, and Alemannic dialect groups are spoken

At #Interspeech2025 I'm going to present Betthupferl, a dataset for German dialect ASR & dialect-to-standard speech translation! We analyze differences between dialectal & Standard German transcriptions, benchmark ASR models, and examine shortcomings of current ASR models & evaluation metrics.

07.08.2025 08:46 👍 16 🔁 4 💬 1 📌 1

UPDATE: Our poster presentation got moved to Tuesday, 16:00–17:30 (session 10)! #ACL2025NLP

27.07.2025 14:39 👍 3 🔁 1 💬 0 📌 0

The poster presentation slot got moved to Tuesday, 16:00–17:30!

27.07.2025 14:27 👍 1 🔁 0 💬 0 📌 0
Analyzing the Effect of Linguistic Similarity on Cross-Lingual Transfer: Tasks and Experimental Setups Matter Cross-lingual transfer is a popular approach to increase the amount of training data for NLP tasks in a low-resource context. However, the best strategy to decide which cross-lingual data to include i...

Joint work with Masha Fedzechkina and @maartjeterhoeve.bsky.social produced during my internship at Apple last year!
See you at the Findings poster reception on Monday July 28 (18:00-19:30) :)
Preprint: arxiv.org/abs/2501.14491

18.07.2025 10:45 👍 1 🔁 0 💬 1 📌 0

In practice, selecting a transfer language based on just one relevant similarity measure or the transfer results on a similar NLP task w/ similar input representations works well -- although it's best to compare multiple promising transfer candidates.

18.07.2025 10:45 👍 0 🔁 0 💬 1 📌 0

... Topic classification based on n-grams is sensitive to string overlap (+ correlated linguistic measures), but topic classification based on mBERT embeddings doesn't show any strong correlations – here, inclusion in the pre-training data is important instead.

18.07.2025 10:45 👍 0 🔁 0 💬 1 📌 0

Fortunately, the patterns confirm our intuitions – e.g., syntactic similarity matters for parsing but not for topic classification. However, input representations matter too...

18.07.2025 10:45 👍 0 🔁 0 💬 1 📌 0
Correlations between transfer results per experiment (parsing, POS tagging, topic classification with different input representations) and similarity measures. The results vary a lot across experiments and measures – some are described in the next posts.

At #ACL2025NLP I'll present our analysis of the effect of linguistic similarity on cross-lingual transfer! We looked at how 10 similarity measures correlate w/ transfer results btwn 263 languages across 3 NLP tasks. Different similarity measures matter for diff. experiments (no one-size-fits-all)!

18.07.2025 10:43 👍 21 🔁 1 💬 1 📌 1
Watch lectures from the best researchers. On-demand video platform giving you access to lectures from conferences worldwide.

My ACL 2024 keynote talk on "Are LLMs Narrowing Our Horizon? Let’s Embrace Variation in NLP!" is online now:

underline.io/events/466/s...

2024.aclweb.org/program/keyn...

It was a huge honor for me to give last year's flagship-in-NLP-conference keynote in Bangkok 🇹🇭

20.06.2025 14:31 👍 19 🔁 3 💬 1 📌 0

Your Bavarian doesn't stop at "Servus" and "Pfiade"? Then you're exactly who we're looking for!
We're looking for Bavarian speakers who would like to answer a short survey about AI-generated Bavarian for a master's thesis.
With every participation, we bring the Bavarian dialect a little further into the digital world!

04.06.2025 14:15 👍 6 🔁 3 💬 0 📌 1

Bavarian dialect speakers needed! Our MSc student Miriam wants to find out 1. how good/bad LLM-generated "Bavarian" is, and 2. whether dialect speakers agree with each other on this. The survey takes <5 min: survey.ifkw.lmu.de/dialquali25/ Thank you for sharing/participating!

30.05.2025 14:17 👍 3 🔁 3 💬 0 📌 1

The first archival *CL Queer in AI workshop will kick off in about 15 min! Join us in-person if you're at NAACL or virtually 💜

We will have presentations from our amazing contributors and invited speakers. Read on for more details 🧵

04.05.2025 15:16 👍 5 🔁 2 💬 1 📌 0

Happening now at #NAACL2025 in room Pecos.

Kicking off with amazing talks and a panel by Monojit Choudhury, Isabelle Augenstein, and Katia Shutova

04.05.2025 15:31 👍 5 🔁 1 💬 0 📌 0