The extended EPIC-SOUNDS paper has been accepted at TPAMI
arxiv.org/abs/2302.006...
This follows the ICASSP 2023 oral, extended with detection and further analysis
epic-kitchens.github.io/epic-sounds/
work by @jaesunghuh.bsky.social Jacob Chalk @ekazakos.bsky.social
@oxford-vgg.bsky.social @bristoluni.bsky.social
22.07.2025 12:00
Advancing Active Speaker Detection for Egocentric Videos
This paper explores how to train a robust Active Speaker Detection (ASD) model for egocentric videos.
Paper : ieeexplore.ieee.org/stamp/stamp....
Machine learning for multimodal data I (Poster)
Apr 11: 11:30 am - 1:00 pm
01.04.2025 06:19
Attending #ICASSP2025 in Hyderabad next week, from April 7th. Ping me if you want to meet in person!
I'm presenting two papers ⬇️⬇️⬇️
01.04.2025 06:19
VoxConverse has become one of the most widely-used speaker diarization evaluation datasets since 2020. Please also check out the paper and dataset below.
Code : github.com/JaesungHuh/a...
Paper : arxiv.org/abs/2007.01216
Dataset : mm.kaist.ac.kr/datasets/vox...
20.02.2025 04:26
I'm releasing the audio-visual diarization pipeline that was used to create the VoxConverse dataset. Along with the original code, an enhanced version featuring new VAD and speaker verification models is now available.
20.02.2025 04:25
GitHub - JaesungHuh/av-diarization: Audio-visual diarization pipeline used for creating VoxConverse dataset
As my PhD journey is coming to an end (I'm currently looking for a full-time Research Engineer / Scientist position!), I'm trying to open-source all the code I've worked on! This is the first edition.
#VoxConverse #Speakerdiarization #Audiovisual
20.02.2025 04:16