Honglie Chen
Meta AI, University of Oxford
Verified email at meta.com
Title
Cited by
Year
Vggsound: A large-scale audio-visual dataset
H Chen, W Xie, A Vedaldi, A Zisserman
ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020
Cited by: 567 | Year: 2020
Localizing Visual Sounds the Hard Way
H Chen, W Xie, T Afouras, A Nagrani, A Vedaldi, A Zisserman
Conference on Computer Vision and Pattern Recognition (CVPR), 2021
Cited by: 208 | Year: 2021
Auto-avsr: Audio-visual speech recognition with automatic labels
P Ma, A Haliassos, A Fernandez-Lopez, H Chen, S Petridis, M Pantic
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by: 101 | Year: 2023
Audio-Visual Synchronisation in the Wild
H Chen, W Xie, T Afouras, A Nagrani, A Vedaldi, A Zisserman
British Machine Vision Conference (BMVC), 2021
Cited by: 46 | Year: 2021
Synthvsr: Scaling up visual speech recognition with synthetic supervision
X Liu, E Lakomkin, K Vougioukas, P Ma, H Chen, R Xie, M Doulaty, ...
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023
Cited by: 20 | Year: 2023
AutoCorrect: Deep Inductive Alignment of Noisy Geometric Annotations
H Chen, W Xie, A Vedaldi, A Zisserman
British Machine Vision Conference (BMVC), 2019
Cited by: 18 | Year: 2019
SparseVSR: Lightweight and noise robust visual speech recognition
A Fernandez-Lopez, H Chen, P Ma, A Haliassos, S Petridis, M Pantic
arXiv preprint arXiv:2307.04552, 2023
Cited by: 5 | Year: 2023
The CHiME-8 MMCSG Challenge: Multi-modal conversations in smart glasses
K Zmolikova, S Merello, K Kalgaonkar, J Lin, N Moritz, P Ma, M Sun, ...
Cited by: 3 | Year: 2024
Localizing visual sounds the hard way
A Vedaldi, H Chen, W Xie, T Afouras, A Nagrani, A Zisserman
Institute of Electrical and Electronics Engineers, 2021
Cited by: 2 | Year: 2021
Large Language Models Are Strong Audio-Visual Speech Recognition Learners
U Cappellazzo, M Kim, H Chen, P Ma, S Petridis, D Falavigna, A Brutti, ...
arXiv preprint arXiv:2409.12319, 2024
Cited by: 1 | Year: 2024
RT-LA-VocE: Real-Time Low-SNR Audio-Visual Speech Enhancement
H Chen, R Mira, S Petridis, M Pantic
arXiv preprint arXiv:2407.07825, 2024
Cited by: 1 | Year: 2024
MSRS: Training Multimodal Speech Recognition Models from Scratch with Sparse Mask Optimization
A Fernandez-Lopez, H Chen, P Ma, L Yin, Q Xiao, S Petridis, S Liu, ...
arXiv preprint arXiv:2406.17614, 2024
Cited by: 1 | Year: 2024
The CHiME-8 MMCSG Challenge: Multi-modal conversations in smart glasses
K Žmolíková, S Merello, K Kalgaonkar, J Lin, N Moritz, P Ma, M Sun, ...
Proc. CHiME 2024, 7-12, 2024
Cited by: 1 | Year: 2024
Unified Speech Recognition: A Single Model for Auditory, Visual, and Audiovisual Inputs
A Haliassos, R Mira, H Chen, Z Landgraf, S Petridis, M Pantic
arXiv preprint arXiv:2411.02256, 2024
Year: 2024
Learning with multimodal self-supervision
H Chen
University of Oxford, 2021
Year: 2021
Supplementary Material: SynthVSR: Scaling Up Visual Speech Recognition With Synthetic Supervision
X Liu, E Lakomkin, K Vougioukas, P Ma, H Chen, R Xie, M Doulaty, ...
SparseVSR: Lightweight and Noise Robust Visual Speech Recognition–Extended Abstract
A Fernandez-Lopez, H Chen, P Ma, A Haliassos, S Petridis, M Pantic