Marine Carpuat

@marinecarpuat

Associate Professor in Computer Science at the University of Maryland. Human-Centered Natural Language Processing & Machine Translation

3,158 Followers
243 Following
20 Posts
Joined 08.11.2024
Latest posts by Marine Carpuat @marinecarpuat

I'm so happy to see this and so sad I missed the talk! The field would not be the same without Kathy in so many ways, from her research contributions to her generous mentorship of many of us beyond her advisees.

30.07.2025 17:40 👍 1 🔁 0 💬 0 📌 0

...@andre-t-martins.bsky.social, Mary Nurminen, Doug Oard, @amelija166mp.bsky.social, Michel Simard, @yvofr.bsky.social

18.06.2025 12:08 👍 0 🔁 0 💬 0 📌 0

This is a big team effort with

Omri Asscher, Kalika Bali, @luisabentivogli.bsky.social, Frédéric Blain, @bowkerl.bsky.social, Monojit Choudhury, @haldaume3.bsky.social, Kevin Duh, Ge Gao, Alvin Grissom II, @markar.bsky.social, Elaine C. Khoong, @wildlewis.bsky.social...

18.06.2025 12:08 👍 1 🔁 1 💬 1 📌 0
An Interdisciplinary Approach to Human-Centered Machine Translation Machine Translation (MT) tools are widely used today, often in contexts where professional translators are not present. Despite progress in MT technology, a gap persists between system development and...

We started this conversation at an NII Shonan seminar last year and wrote a survey highlighting key directions that emerged. Let us know what you think!

arxiv.org/abs/2506.13468

18.06.2025 12:08 👍 0 🔁 0 💬 1 📌 0

We argue for a human-centered approach: MT systems shouldn’t just produce correct outputs; they should support diverse users, goals, and contexts.

But let's not start from scratch: Translation Studies and HCI offer a wealth of theoretical and empirical work to rethink MT as a socio-technical problem.

18.06.2025 12:08 👍 0 🔁 0 💬 1 📌 0

What should Machine Translation research look like in the age of multilingual LLMs?

Here’s one answer from researchers across NLP/MT, Translation Studies, and HCI.
"An Interdisciplinary Approach to Human-Centered Machine Translation"
arxiv.org/abs/2506.13468

18.06.2025 12:08 👍 18 🔁 7 💬 1 📌 0

Welcome @sarahwiegreffe.bsky.social !!!

17.06.2025 11:25 👍 0 🔁 0 💬 1 📌 0
Live: 2025 philosophy bac. Answers to your questions about the exam topics: “The written philosophy exam validates students’ ability to step back a little from questions whose answer ... “Does our future depend on technology?”, “Is truth always convincing?”, or “Do we need art?” for the essay, John Rawls and Adam Smith for the text commentary...

Tell me you're in France without telling me: live news coverage of high school philosophy exams!
2025 Bac de philo questions:
- Notre avenir dépend-il de la technique ? (Does our future depend on technology?)
- La vérité est-elle toujours convaincante ? (Is truth always convincing?)
- Or discuss an excerpt from Rawls’ Theory of Justice.
www.lemonde.fr/campus/live/...

16.06.2025 09:57 👍 3 🔁 1 💬 1 📌 0

Disagreement between LLMs can be a strength! @dayeonki.bsky.social shows that having multiple LLMs debate improves their answers to culturally variable social norm questions. #ACL2025

13.06.2025 09:09 👍 6 🔁 2 💬 0 📌 0
1/ How can a monolingual English speaker 🇺🇸 decide if an automatic French translation 🇫🇷 is good enough to be shared?

Introducing ❓AskQE❓, an #LLM-based Question Generation + Answering framework that detects critical MT errors and provides actionable feedback 🗣️

#ACL2025

21.05.2025 17:48 👍 1 🔁 2 💬 1 📌 0

Life around submission deadlines is much more sane for me when we have a group paper clinic early (2 weeks before the deadline). Sometimes we even have an "extended abstract" clinic 1 month earlier.

That said, I did not follow my own advice this cycle, and I am now in recovery mode too!

21.05.2025 08:11 👍 4 🔁 0 💬 0 📌 0
🚨 New Paper 🚨

1/ We often assume that well-written text is easier to translate ✏️

But can #LLMs automatically rewrite inputs to improve machine translation? 🌍

Here’s what we found 🧵

17.04.2025 01:32 👍 8 🔁 4 💬 1 📌 0

ModernBERT or DeBERTaV3?

What's driving performance: architecture or data?

To find out, we pretrained ModernBERT on the same dataset as CamemBERTaV2 (a DeBERTaV3 model) to isolate architecture effects.

Here are our findings:

14.04.2025 15:41 👍 44 🔁 15 💬 3 📌 0

Congratulations @arijriabi.bsky.social! 🎉

20.03.2025 09:24 👍 3 🔁 0 💬 0 📌 0

CamemBERT 2.0: A Smarter French 🇫🇷 Language Model Aged to Perfection 👌

We release a much-needed update for the previous SOTA French encoder LM.

We introduce two new models, CamemBERTa-v2 and CamemBERT-v2, based on the DeBERTaV3 and RoBERTa recipes, respectively.

So what's new?

[1/8]

15.11.2024 17:07 👍 20 🔁 10 💬 1 📌 4
Florian Cafiero - "A Riddle in a Haystack: Using Large Language Models for the Detection of Rare Phenomena" - ALMAnaCH seminar 7th March 2025 at 11am CET

We are happy to announce our next seminar, given by Florian Cafiero @floriancafiero.bsky.social (PSL @ecoledeschartes.bsky.social) entitled "A Riddle in a Haystack: Using Large Language Models for the Detection of Rare Phenomena" on Friday 7th March at 11am CET. Details here: t.co/pPbWfkALM4!

05.03.2025 12:54 👍 9 🔁 3 💬 1 📌 0

🤔 Interested in how #HCI thinks about using #LLMs, or looking to understand best practices for human-LLM interaction?

🚨🚨New paper: Understanding the LLM-ification of CHI: Unpacking the Impact of LLMs at CHI through a Systematic Literature Review 🧵

31.01.2025 21:49 👍 17 🔁 7 💬 1 📌 2
Instruction-following Speech Processing track Home of the IWSLT conference and SIGSLT.

First up, a new task for 2025:
*Instruction-following for speech processing!*

Explore instruction-following for speech ⇨
Integrate speech foundation models with LLMs across tasks such as speech translation, recognition, summarization, and QA.

🔗: iwslt.org/2025/instruc...

28.01.2025 18:13 👍 8 🔁 6 💬 1 📌 0
IWSLT 2025 Home of the IWSLT conference and SIGSLT.

We are pleased to announce that our 2025 shared tasks have launched! Find details and data on our website, with evaluation data to be released April 1!
iwslt.org/2025/#shared...

We will be highlighting one task per day here and the other site. Join us for an exciting year of speech translation!!

28.01.2025 17:57 👍 3 🔁 2 💬 0 📌 0

I'll be in Germany next week to visit TUM Heilbronn and LMU Munich. Looking forward to learning from NLP researchers there and sharing recent work on human-centered machine translation! (And to discovering how much German I can actually understand after 2 weeks on Duolingo 😅)

23.01.2025 13:52 👍 7 🔁 1 💬 0 📌 0
Can Word-level Quality Estimation Inform and Improve Machine Translation Post-editing? - Imminent - Translated's Research Center. Imminent is Translated’s Research Center which supports companies in localization, funds language data resear...

Our piece is finally out on the Imminent blog! 🎉 It presents preliminary findings of our recent study evaluating the usefulness of word-level quality estimation in real-world post-editing settings (paper forthcoming)! 🧵1/

imminent.translated.com/can-word-lev...

10.12.2024 16:47 👍 26 🔁 9 💬 1 📌 0

ICYMI: the UMD LSC is looking for a postdoctoral fellow with an interdisciplinary research agenda in language sciences.
languagescience.umd.edu/news/job-opp...

If your interests connect to #NLP research that helps people communicate across languages, please reach out!

11.12.2024 10:57 👍 3 🔁 1 💬 1 📌 0
In what ways does “Le Monde” use AI? In keeping with its commitments, “Le Monde” publishes an exhaustive list of its newsroom’s uses of editorial assistance tools based on generative artificial intelligence.

Interesting to see how Le Monde uses AI: MT (English articles via DeepL + post-editing!), TTS, video captioning and translation, proofreading, and experimenting with rewriting news agency content to match their style specs www.lemonde.fr/le-monde-et-...

10.12.2024 09:22 👍 7 🔁 1 💬 0 📌 0

I hope I am not late to the party (was away post-quals chilling) but here are some thoughts on why this is bad IMO:

First, a disclaimer that I am writing this as an African who is a speaker of multiple African languages, NLP researcher of African languages, and HCI researcher focusing broadly on..

02.12.2024 23:43 👍 125 🔁 60 💬 9 📌 8

Generating new English terms would also be interesting! The paper looks at translating existing English terms. A fundamental challenge for LLMs is that some of the new terms are rare or even unseen in a Common Crawl corpus, but yes, there is lots of potential for LLMs as discovery tools.

27.11.2024 10:24 👍 1 🔁 0 💬 0 📌 0
Towards More Robust NLP System Evaluation: Handling Missing Scores in Benchmarks The evaluation of natural language processing (NLP) systems is crucial for advancing the field, but current benchmarking approaches often assume that all systems have scores available for all tasks, w...

Such a good thread idea!

arxiv.org/abs/2305.10284

"Towards More Robust NLP System Evaluation: Handling Missing Scores in Benchmarks" by Anas Himmi et al. They explore how to rank LLMs when scores for certain tasks are missing, using the Borda count to construct reliable leaderboards.

27.11.2024 10:12 👍 2 🔁 1 💬 0 📌 0
Vers la traduction automatique des néologismes scientifiques Paul Lerner, François Yvon. Actes de la 31ème Conférence sur le Traitement Automatique des Langues Naturelles, volume 1 : articles longs et prises de position. 2024.

What #NLP papers do you wish more people knew about?

I'll start: "Toward Machine Translation of Scientific Neologisms", by Lerner & Yvon
aclanthology.org/2024.jeptaln...

A real task in cross-lingual communication, linguistically grounded, and hard for LLMs!

Well worth a read, even through MT!

27.11.2024 09:46 👍 31 🔁 7 💬 5 📌 0
SpeechQE: Estimating the Quality of Direct Speech Translation HyoJung Han, Kevin Duh, Marine Carpuat. Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing. 2024.

This underscores the need to address quality estimation in speech translation independently from text translation, as the use cases for speech translation continue to grow.
Paper: aclanthology.org/2024.emnlp-m...
GitHub: github.com/h-j-han/Spee...
🤗 Models and Data: huggingface.co/collections/...

12.11.2024 12:47 👍 1 🔁 0 💬 0 📌 0
We construct a benchmark for the task of quality estimation for speech translation (SpeechQE). Using it, we find that dedicated end-to-end approaches are generally better suited to SpeechQE than cascaded approaches that rely on quality estimation tools designed for text.

12.11.2024 12:47 👍 1 🔁 0 💬 1 📌 0

Estimating translation quality is key to building trustworthy machine translation systems, but most work focuses on text. How well can we assess *speech* translation? 🎤
Check out our 𝐒𝐩𝐞𝐞𝐜𝐡𝐐𝐄 💬 paper with Hyojung Han and Kevin Duh at #EMNLP2024!
aclanthology.org/2024.emnlp-m...

12.11.2024 12:46 👍 6 🔁 1 💬 2 📌 0