
Jaesung Huh

@jaesunghuh

PhD student @VGG_Oxford; ex-intern @Meta Reality Labs; Audio-visual learning

101
Followers
451
Following
7
Posts
24.11.2024
Joined

Latest posts by Jaesung Huh @jaesunghuh


The extended EPIC-SOUNDS paper was accepted at TPAMI.
arxiv.org/abs/2302.006...
This extends our ICASSP 2023 oral paper with detection and further analysis.
epic-kitchens.github.io/epic-sounds/
work by @jaesunghuh.bsky.social Jacob Chalk @ekazakos.bsky.social
@oxford-vgg.bsky.social @bristoluni.bsky.social

22.07.2025 12:00 👍 7 🔁 4 💬 0 📌 0
📖 Advancing Active Speaker Detection for Egocentric Videos

This paper explores how to train a robust Active Speaker Detection (ASD) model for egocentric videos.

Paper : ieeexplore.ieee.org/stamp/stamp....

Machine learning for multimodal data I (Poster)
Apr 11: 11:30 am - 1:00 pm

01.04.2025 06:19 👍 0 🔁 0 💬 0 📌 0

📖 The VoxCeleb Speaker Recognition Challenge: A Retrospective

This paper presents a review of the VoxCeleb Speaker Recognition Challenges (VoxSRC).

Paper : arxiv.org/abs/2408.14886

Speaker Recognition II (Poster)
Apr 9: 5:00 pm - 6:30 pm

01.04.2025 06:19 👍 0 🔁 0 💬 1 📌 0

Attending #ICASSP2025 next week in Hyderabad from April 7th. Ping me if you want to meet in person! 🤗

I'm presenting two papers ⬇️⬇️⬇️

01.04.2025 06:19 👍 0 🔁 0 💬 1 📌 0

VoxConverse has become one of the most widely used speaker diarization evaluation datasets since 2020. Please also check out the paper and dataset below.

Code : github.com/JaesungHuh/a...
Paper : arxiv.org/abs/2007.01216
Dataset : mm.kaist.ac.kr/datasets/vox...

20.02.2025 04:26 👍 0 🔁 1 💬 0 📌 0

I'm releasing the audio-visual diarization pipeline that was used to create the VoxConverse dataset. Along with the original code, an enhanced version featuring new VAD and speaker verification models is now available.

20.02.2025 04:25 👍 1 🔁 0 💬 1 📌 0


As my PhD journey is coming to an end (I'm currently looking for a full-time Research Engineer / Scientist position!), I'm trying to open-source all the code I've worked on. This is the first release.

#VoxConverse #Speakerdiarization #Audiovisual

20.02.2025 04:16 👍 2 🔁 0 💬 1 📌 1