Code Llama: Open foundation models for code B Roziere, J Gehring, F Gloeckle, S Sootla, I Gat, XE Tan, Y Adi, J Liu, ... arXiv preprint arXiv:2308.12950, 2023 | 1388 | 2023 |
The Llama 3 herd of models A Dubey, A Jauhri, A Pandey, A Kadian, A Al-Dahle, A Letman, A Mathur, ... arXiv preprint arXiv:2407.21783, 2024 | 1206 | 2024 |
High fidelity neural audio compression A Défossez, J Copet, G Synnaeve, Y Adi arXiv preprint arXiv:2210.13438, 2022 | 584 | 2022 |
Simple and controllable music generation J Copet, F Kreuk, I Gat, T Remez, D Kant, G Synnaeve, Y Adi, A Défossez Advances in Neural Information Processing Systems 36, 2024 | 368 | 2024 |
On generative spoken language modeling from raw audio K Lakhotia, E Kharitonov, WN Hsu, Y Adi, A Polyak, B Bolte, TA Nguyen, ... Transactions of the Association for Computational Linguistics 9, 1336-1354, 2021 | 327 | 2021 |
AudioGen: Textually guided audio generation F Kreuk, G Synnaeve, A Polyak, U Singer, A Défossez, J Copet, D Parikh, ... arXiv preprint arXiv:2209.15352, 2022 | 302 | 2022 |
Speech resynthesis from discrete disentangled self-supervised representations A Polyak, Y Adi, J Copet, E Kharitonov, K Lakhotia, WN Hsu, A Mohamed, ... arXiv preprint arXiv:2104.00355, 2021 | 298 | 2021 |
Text-free prosody-aware generative spoken language modeling E Kharitonov, A Lee, A Polyak, Y Adi, J Copet, K Lakhotia, TA Nguyen, ... arXiv preprint arXiv:2109.03264, 2021 | 118 | 2021 |
Generative spoken dialogue language modeling TA Nguyen, E Kharitonov, J Copet, Y Adi, WN Hsu, A Elkahky, ... Transactions of the Association for Computational Linguistics 11, 250-266, 2023 | 95 | 2023 |
The Llama 3 herd of models, 2024 A Dubey, A Jauhri, A Pandey, A Kadian, A Al-Dahle, A Letman, A Mathur, ... URL https://arxiv.org/abs/2407.21783 | 74 | |
Textless speech emotion conversion using discrete and decomposed representations F Kreuk, A Polyak, J Copet, E Kharitonov, TA Nguyen, M Rivière, WN Hsu, ... arXiv preprint arXiv:2111.07402, 2021 | 64 | 2021 |
Textually pretrained speech language models M Hassid, T Remez, TA Nguyen, I Gat, A Conneau, F Kreuk, J Copet, ... Advances in Neural Information Processing Systems 36, 2024 | 47 | 2024 |
Expresso: A benchmark and analysis of discrete expressive speech resynthesis TA Nguyen, WN Hsu, A d'Avirro, B Shi, I Gat, M Fazel-Zarani, T Remez, ... arXiv preprint arXiv:2308.05725, 2023 | 39 | 2023 |
STOP: A dataset for spoken task oriented semantic parsing P Tomasello, A Shrivastava, D Lazar, PC Hsu, D Le, A Sagar, A Elkahky, ... 2022 IEEE Spoken Language Technology Workshop (SLT), 991-998, 2023 | 32 | 2023 |
Masked audio generation using a single non-autoregressive transformer A Ziv, I Gat, GL Lan, T Remez, F Kreuk, A Défossez, J Copet, G Synnaeve, ... arXiv preprint arXiv:2401.04577, 2024 | 26 | 2024 |
Brouhaha: multi-task training for voice activity detection, speech-to-noise ratio, and C50 room acoustics estimation M Lavechin, M Métais, H Titeux, A Boissonnet, J Copet, M Rivière, ... 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1-7, 2023 | 22 | 2023 |
textless-lib: A library for textless spoken language processing E Kharitonov, J Copet, K Lakhotia, TA Nguyen, P Tomasello, A Lee, ... arXiv preprint arXiv:2202.07359, 2022 | 17 | 2022 |
ASR4REAL: An extended benchmark for speech models M Riviere, J Copet, G Synnaeve arXiv preprint arXiv:2110.08583, 2021 | 15 | 2021 |
Augmentation invariant discrete representation for generative spoken language modeling I Gat, F Kreuk, TA Nguyen, A Lee, J Copet, G Synnaeve, E Dupoux, Y Adi arXiv preprint arXiv:2209.15483, 2022 | 10 | 2022 |
Generative Spoken Language Model based on continuous word-sized audio tokens R Algayres, Y Adi, TA Nguyen, J Copet, G Synnaeve, B Sagot, E Dupoux arXiv preprint arXiv:2310.05224, 2023 | 7 | 2023 |