I'm so happy to see this and so sad I missed the talk! The field would not be the same without Kathy in so many ways, from her research contributions to her generous mentorship of many of us beyond her advisees.
...@andre-t-martins.bsky.social, Mary Nurminen, Doug Oard, @amelija166mp.bsky.social, Michel Simard, @yvofr.bsky.social
This is a big team effort with
Omri Asscher, Kalika Bali, @luisabentivogli.bsky.social, Frédéric Blain, @bowkerl.bsky.social, Monojit Choudhury, @haldaume3.bsky.social, Kevin Duh, Ge Gao, Alvin Grissom II, @markar.bsky.social, Elaine C. Khoong, @wildlewis.bsky.social...
We started this conversation at an NII Shonan seminar last year and wrote a survey highlighting key directions that emerged. Let us know what you think!
arxiv.org/abs/2506.13468
We argue for a human-centered approach: MT systems shouldn’t just produce correct outputs; they should support diverse users, goals, and contexts.
But let's not start from scratch: Translation Studies and HCI offer a wealth of theoretical and empirical work to rethink MT as a socio-technical problem.
What should Machine Translation research look like in the age of multilingual LLMs?
Here’s one answer from researchers across NLP/MT, Translation Studies, and HCI.
"An Interdisciplinary Approach to Human-Centered Machine Translation"
arxiv.org/abs/2506.13468
Welcome @sarahwiegreffe.bsky.social !!!
Tell me you're in France without telling me: live news coverage of high school philosophy exams!
2025 Bac de philo questions:
- Notre avenir dépend-il de la technique ? (Does our future depend on technology?)
- La vérité est-elle toujours convaincante ? (Is truth always convincing?)
- Or discuss an excerpt from Rawls’ Theory of Justice.
www.lemonde.fr/campus/live/...
Disagreement between LLMs can be a strength! @dayeonki.bsky.social shows that having multiple LLMs debate improves their answers to culturally variable social norm questions. #ACL2025
1/ How can a monolingual English speaker 🇺🇸 decide if an automatic French translation 🇫🇷 is good enough to be shared?
Introducing ❓AskQE❓, an #LLM-based Question Generation + Answering framework that detects critical MT errors and provides actionable feedback 🗣️
#ACL2025
Life around submission deadlines is much more sane for me when we have a group paper clinic early (2 weeks before the deadline). Sometimes we even have an "extended abstract" clinic 1 month earlier.
That said, I did not follow my own advice this cycle, and I am now in recovery mode too!
🚨 New Paper 🚨
1/ We often assume that well-written text is easier to translate ✏️
But can #LLMs automatically rewrite inputs to improve machine translation? 🌍
Here’s what we found 🧵
ModernBERT or DeBERTaV3?
What's driving performance: architecture or data?
To find out we pretrained ModernBERT on the same dataset as CamemBERTaV2 (a DeBERTaV3 model) to isolate architecture effects.
Here are our findings:
Congratulations @arijriabi.bsky.social! 🎉
CamemBERT 2.0: A Smarter French 🇫🇷 Language Model Aged to Perfection 👌
We release a much-needed update for the previous SOTA French encoder LM.
We introduce two new models, CamemBERTa-v2 and CamemBERT-v2, based on the DeBERTaV3 and RoBERTa recipes, respectively.
So what's new?
[1/8]
Florian Cafiero - "A Riddle in a Haystack: Using Large Language Models for the Detection of Rare Phenomena" - ALMAnaCH seminar 7th March 2025 at 11am CET
We are happy to announce our next seminar, given by Florian Cafiero @floriancafiero.bsky.social (PSL @ecoledeschartes.bsky.social) entitled "A Riddle in a Haystack: Using Large Language Models for the Detection of Rare Phenomena" on Friday 7th March at 11am CET. Details here: t.co/pPbWfkALM4!
🤔 Interested in how #HCI thinks about using #LLMs, or looking to understand best practices for human-LLM interaction?
🚨🚨New paper: Understanding the LLM-ification of CHI: Unpacking the Impact of LLMs at CHI through a Systematic Literature Review 🧵
First up, a new task for 2025:
*Instruction-following for speech processing!*
Explore instruction-following for speech ⇨
Integrate speech foundation models with LLMs across tasks such as speech translation, recognition, summarization, and QA.
🔗: iwslt.org/2025/instruc...
We are pleased to announce that our 2025 shared tasks have launched! Find details and data on our website, with evaluation data to be released April 1!
iwslt.org/2025/#shared...
We will be highlighting one task per day here and the other site. Join us for an exciting year of speech translation!!
I'll be in Germany next week to visit TUM Heilbronn and LMU Munich. Looking forward to learning from NLP researchers there and sharing recent work on human-centered machine translation! (And to discovering how much German I can actually understand after 2 weeks on Duolingo 😅)
Our piece is finally out in the Imminent blog! 🎉 It presents preliminary findings of our recent study evaluating the usefulness of word-level quality estimation in real-world post-editing settings (paper forthcoming)! 🧵1/
imminent.translated.com/can-word-lev...
ICYMI: the UMD LSC is looking for a postdoctoral fellow with an interdisciplinary research agenda in language sciences.
languagescience.umd.edu/news/job-opp...
If your interests connect to #NLP research that helps people communicate across languages, please reach out!
Interesting to see how Le Monde uses AI: MT (English articles via DeepL + postediting!), TTS, video captioning and translation, proofreading, and experimenting with rewriting content from news agency to their style specs www.lemonde.fr/le-monde-et-...
I hope I am not late to the party (was away post-quals chilling) but here are some thoughts on why this is bad IMO:
First, a disclaimer that I am writing this as an African who is a speaker of multiple African languages, NLP researcher of African languages, and HCI researcher focusing broadly on..
Generating new English terms would also be interesting! The paper looks at translating existing English terms. A fundamental challenge for LLMs is that some of the new terms are rare or even unseen in a Common Crawl corpus, but yes, there is lots of potential for LLMs as discovery tools.
Such a good thread idea!
arxiv.org/abs/2305.10284
"Towards More Robust NLP System Evaluation: Handling Missing Scores in Benchmarks" by Anas Himmi et al. They explore how to rank LLMs when scores for some tasks are missing, and use the Borda count to construct reliable leaderboards.
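For anyone unfamiliar with the Borda count, here is a rough sketch of the idea on a leaderboard with gaps. This is a simplified illustration, not the method from the paper: it simply skips missing entries and averages each system's normalized Borda points over the tasks it was scored on. The function name `borda_ranking`, the system and task names, and the assumption that higher scores are better are all made up for the example.

```python
from typing import Dict, List, Optional

# system name -> task name -> score (None when the score is missing)
Scores = Dict[str, Dict[str, Optional[float]]]


def borda_ranking(scores: Scores) -> List[str]:
    """Rank systems by average normalized Borda points (hypothetical helper).

    Assumes higher scores are better. Missing scores are skipped rather
    than imputed, which is a simplification.
    """
    systems = list(scores)
    tasks = sorted({task for per_task in scores.values() for task in per_task})
    totals = {s: 0.0 for s in systems}
    counts = {s: 0 for s in systems}

    for task in tasks:
        # Only systems that actually have a score on this task take part.
        present = [s for s in systems if scores[s].get(task) is not None]
        if len(present) < 2:
            continue  # nothing to compare against
        for s in present:
            beaten = sum(
                1 for o in present if o != s and scores[o][task] < scores[s][task]
            )
            ties = sum(
                1 for o in present if o != s and scores[o][task] == scores[s][task]
            )
            # Divide by the number of opponents so every task weighs the same,
            # however many systems were scored on it.
            totals[s] += (beaten + 0.5 * ties) / (len(present) - 1)
            counts[s] += 1

    average = {s: totals[s] / counts[s] if counts[s] else 0.0 for s in systems}
    # Best average first; break ties alphabetically for a stable order.
    return sorted(systems, key=lambda s: (-average[s], s))


example = {
    "A": {"qa": 0.9, "mt": 0.8, "sum": None},
    "B": {"qa": 0.7, "mt": None, "sum": 0.6},
    "C": {"qa": 0.8, "mt": 0.5, "sum": 0.4},
}
print(borda_ranking(example))
```

Averaging over scored tasks keeps a system from being penalized just for having fewer entries, but it can also flatter a system that was only run on easy comparisons, which is part of why missing scores need more careful handling.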
What #NLP papers do you wish more people knew about?
I'll start: "Toward Machine Translation of Scientific Neologisms", by Lerner & Yvon
aclanthology.org/2024.jeptaln...
A real task in cross-lingual communication, linguistically grounded, and hard for LLMs!
Well worth a read, even through MT!
This underscores the need to address quality estimation in speech translation independently from text translation, as the use cases for speech translation continue to grow.
Paper: aclanthology.org/2024.emnlp-m...
GitHub: github.com/h-j-han/Spee...
🤗 Models and Data: huggingface.co/collections/...
We construct a benchmark for the task of quality estimation for speech translation (SpeechQE). Using it, we find that dedicated end-to-end approaches are generally better suited to SpeechQE than cascaded approaches that rely on quality estimation tools designed for text.
Estimating translation quality is key to building trustworthy machine translation systems, but most work focuses on text. How well can we assess *speech* translation? 🎤
Check out our 𝐒𝐩𝐞𝐞𝐜𝐡𝐐𝐄 💬 paper with Hyojung Han and Kevin Duh at #EMNLP2024!
aclanthology.org/2024.emnlp-m...