
@mgaido91

29 Followers · 74 Following · 3 Posts · Joined 07.01.2025

Latest posts by @mgaido91


🎓 Today we attended the FBK PhD Day, and our amazing students presented their research lines:
@apierg.bsky.social: gender-inclusive MT
@linaconti.bsky.social: explainable speech translation
@zhihangxie.bsky.social: long-form SpeechLLMs
@uzumakidhairya.bsky.social: resource-efficient translation

25.02.2026 15:37 👍 7 🔁 7 💬 0 📌 0
Qualtrics Survey | Qualtrics Experience Management | The most powerful, simple and trusted way to gather experience data. Start your journey to experience management and try a free account today.

🔍 We're studying how AI is used in Italy, and to do it we built a survey!

👉 bit.ly/sondaggio_ai...

(It's anonymous, takes ~10 minutes, and if you take part or share it around you help us a lot 🙏)

We're also interested in reaching people who don't work with AI and aren't AI experts!

03.06.2025 10:24 👍 16 🔁 18 💬 1 📌 0
FAMA - a FBK-MT Collection | The First Large-Scale Open-Science Speech Foundation Model for English and Italian

🚀 New tech report out! Meet FAMA, our open-science speech foundation model family for both ASR and ST in 🇬🇧 English and 🇮🇹 Italian.

The models are live and ready to try on @hf.co:
🔗 huggingface.co/collections/...

📄 Preprint: arxiv.org/abs/2505.22759

#ASR #ST #OpenScience #MultilingualAI

30.05.2025 15:35 👍 7 🔁 3 💬 0 📌 0
Reserved topic scholarships | Doctoral Program - Information Engineering and Computer Science

📢 Come and join our group!
We offer a fully funded 3-year PhD position:

📔 Automatic translation with large multimodal models: iecs.unitn.it/education/ad...

📍Full details for application: iecs.unitn.it/education/ad...

📅 Deadline May 12, 2025

#NLProc #FBK

22.04.2025 10:14 👍 9 🔁 9 💬 1 📌 0
https://arxiv.org/abs/2501.04561

Interesting to see a multimodal LLM built by combining modality encoders and an LLM with adapters, as in the SFM+LLM paradigm, independently for each modality. This modularity may ease the creation of more multimodal LLMs through collaborations of single-modality experts. arxiv.org/abs/2501.04561
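The modular recipe the post describes can be sketched in a few lines: each modality gets its own (frozen) encoder and a small trained adapter projecting into the shared LLM embedding space, and the adapted sequences are simply concatenated. Everything here (dimensions, encoder stubs, the `Adapter` class) is hypothetical illustration, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
D_LLM = 16  # hypothetical LLM embedding size

class Adapter:
    """Linear projection from a modality encoder's feature space into the
    LLM embedding space (the per-modality adapter of the SFM+LLM paradigm)."""
    def __init__(self, d_in, d_out=D_LLM):
        self.W = rng.normal(scale=0.02, size=(d_in, d_out))

    def __call__(self, feats):          # (T, d_in) -> (T, d_out)
        return feats @ self.W

# Hypothetical frozen single-modality encoders, stubbed as random features.
def speech_encoder(wav):
    return rng.normal(size=(len(wav) // 320, 80))   # ~frame rate reduction

def vision_encoder(img):
    return rng.normal(size=(49, 32))                # e.g. 7x7 patch grid

adapters = {"speech": Adapter(80), "vision": Adapter(32)}

def build_multimodal_prefix(inputs):
    """Adapt each modality independently, then concatenate along the
    sequence axis to form the multimodal prefix fed to the LLM."""
    parts = [adapters[name](feats) for name, feats in inputs]
    return np.concatenate(parts, axis=0)            # (sum of T_i, D_LLM)

prefix = build_multimodal_prefix([
    ("speech", speech_encoder(np.zeros(3200))),     # 10 speech frames
    ("vision", vision_encoder(None)),               # 49 visual patches
])
print(prefix.shape)  # (59, 16)
```

Because each adapter only needs its own encoder's output format, a new modality can in principle be contributed by a separate team without touching the others.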

16.04.2025 13:20 👍 3 🔁 0 💬 0 📌 1

📢 The evaluation period of the Instruction Following task at @iwslt.bsky.social just started!

🖥️ Consider submitting your speech-to-text system!

The outputs can be easily uploaded on the SPEECHM platform developed in the Meetween project (www.meetween.eu)!
➡️ iwslt2025.speechm.cloud.cyfronet.pl

01.04.2025 12:39 👍 9 🔁 5 💬 0 📌 0

While we look forward to a sunny Geneva, why wait to join the conversation?

We’ve created a starter pack for our #GITT2025 friends!
🕵️ Follow researchers working on gender bias in MT
💬 Stay up to date and dive into the discussion!

All info at sites.google.com/tilburgunive...

28.02.2025 09:22 👍 21 🔁 16 💬 1 📌 1
AlignFormer: Modality Matching Can Achieve Better Zero-shot Instruction-Following Speech-LLM | Integrating speech into LLMs (speech-LLM) has gained increasing attention recently. The mainstream solution is to connect a well-trained speech encoder and LLM with a neural adapter. However, the lengt...

Very interesting to see more and more methods to close the length mismatch between speech and text sequences (a.k.a. length adapters -- see arxiv.org/abs/2402.12025) for SFM+LLM models! This one, merging CTC and Q-Former, sounds very cool to me:
arxiv.org/abs/2412.01145
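One common length-adapter recipe (in the spirit of the CTC-based compression the post alludes to, not the exact method of either paper) averages consecutive encoder frames that share the same CTC prediction and drops blank frames, shrinking the speech sequence toward text length. A minimal sketch, with a hypothetical `BLANK` id and toy features:

```python
import numpy as np

BLANK = 0  # conventional CTC blank index (assumption)

def ctc_compress(frames, ctc_ids):
    """Average runs of consecutive frames sharing the same non-blank CTC
    prediction and discard blank frames, so the output length tracks the
    number of predicted units rather than the number of audio frames."""
    out, run, run_id = [], [], None
    for f, i in zip(frames, ctc_ids):
        if i != run_id:                      # run boundary
            if run and run_id != BLANK:
                out.append(np.mean(run, axis=0))
            run, run_id = [], i
        run.append(f)
    if run and run_id != BLANK:              # flush the final run
        out.append(np.mean(run, axis=0))
    return np.stack(out) if out else np.empty((0, frames.shape[1]))

# 8 frames predicted as "h h <b> i i i <b> <b>" collapse to 2 vectors.
frames = np.arange(16, dtype=float).reshape(8, 2)
ids = [7, 7, BLANK, 4, 4, 4, BLANK, BLANK]
print(ctc_compress(frames, ids).shape)  # (2, 2)
```

A Q-Former-style adapter instead learns a fixed set of query vectors that cross-attend to the speech frames; the paper's contribution is in how it combines the two ideas.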

14.02.2025 10:34 👍 2 🔁 0 💬 0 📌 1
Simultaneous track | Home of the IWSLT conference and SIGSLT.

Next up: simultaneous speech translation!

🎯 Goal: to explore ways to translate speech into another language in real time, as in simultaneous interpreting.

🔗 Link: iwslt.org/2025/simulta...

30.01.2025 19:31 👍 6 🔁 4 💬 1 📌 0
Instruction-following Speech Processing track | Home of the IWSLT conference and SIGSLT.

First up, a new task for 2025:
*Instruction-following for speech processing!*

Explore instruction-following for speech ⇨
Integrate speech foundation models with LLMs across tasks such as speech translation, recognition, summarization, and QA.

🔗: iwslt.org/2025/instruc...

28.01.2025 18:13 👍 8 🔁 6 💬 1 📌 0

Today's task: model compression!!
🆕 New at IWSLT! But no less exciting 🔥

🎯 Goal: Compress a large, general-purpose multimodal model, making speech translation more efficient ⚡️, deployable 📲, and sustainable ♻️, while preserving translation quality ⭐️
#AI #SpeechTech #ModelCompression #LLMcompression

29.01.2025 16:48 👍 8 🔁 5 💬 1 📌 0
Prepending or Cross-Attention for Speech-to-Text? An Empirical Comparison | Following the remarkable success of Large Language Models (LLMs) in NLP tasks, there is increasing interest in extending their capabilities to speech -- the most common form of communication. To integ...

I'm happy to share that our paper "Prepending or Cross-Attention for Speech-to-Text? An Empirical Comparison" has been accepted at @naaclmeeting.bsky.social 2025! #NAACL2025

@mgaido91.bsky.social 👏

📃 Preprint: arxiv.org/abs/2501.02370
⏰ Code will be released soon

#NLProc #Speech
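The two integration strategies in the paper's title can be contrasted in a toy sketch (shapes and the single-head attention below are illustrative assumptions, not the paper's implementation): prepending concatenates adapted speech embeddings before the text tokens so the LLM's self-attention sees one long sequence, while cross-attention keeps the LLM sequence at text length and injects speech only through attention.

```python
import numpy as np

rng = np.random.default_rng(1)
D = 16                                  # hypothetical shared embedding size
speech = rng.normal(size=(50, D))       # adapted speech features (T_s, D)
text = rng.normal(size=(12, D))         # text token embeddings   (T_t, D)

# Strategy 1 -- prepending (decoder-only style): speech embeddings are
# concatenated before the text tokens; self-attention runs over all 62.
prepended = np.concatenate([speech, text], axis=0)      # (62, D)

def cross_attention(queries, keys_values):
    """Single-head scaled dot-product attention: text queries attend to
    speech keys/values, as in encoder-decoder style integration."""
    scores = queries @ keys_values.T / np.sqrt(queries.shape[1])
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ keys_values                        # (T_t, D)

# Strategy 2 -- cross-attention: the LLM sequence stays at text length;
# speech enters only through the attention over its features.
attended = cross_attention(text, speech)
print(prepended.shape, attended.shape)  # (62, 16) (12, 16)
```

The shapes make the trade-off visible: prepending grows the self-attention sequence by the speech length, while cross-attention pays an extra attention module per layer but keeps the main sequence short.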

23.01.2025 08:44 👍 10 🔁 3 💬 0 📌 0
[Image ALT: a polar bear cub lying in a pile of branches.]

Hello world! 👋 We're coming out of hibernation to bring you this happy news:
1) We're organising the 3rd edition of GITT at #MTSummit! Working on #gender & #translation #technology? We'll see you there!
2) We're moving away from Twitter, so share the news and help us find old and new GITT friends!

22.01.2025 12:17 👍 26 🔁 15 💬 0 📌 1
Instruction-following Speech Processing track | Home of the IWSLT conference and SIGSLT.

Our #iwslt 2025 task on instruction-following speech models is out! Submission by April 15th. Check it out at: iwslt.org/2025/instruc...

09.01.2025 09:43 👍 4 🔁 2 💬 0 📌 1