Koichi Saito
Title
Cited by
Year
GibbsDDRM: A partially collapsed Gibbs sampler for solving blind inverse problems with denoising diffusion restoration
N Murata, K Saito, CH Lai, Y Takida, T Uesaka, Y Mitsufuji, S Ermon
International conference on machine learning, 25501-25522, 2023
Cited by 46 · Year 2023
Unsupervised vocal dereverberation with diffusion-based generative models
K Saito, N Murata, T Uesaka, CH Lai, Y Takida, T Fukui, Y Mitsufuji
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by 26 · Year 2023
Training speech enhancement systems with noisy speech datasets
K Saito, S Uhlich, G Fabbro, Y Mitsufuji
arXiv preprint arXiv:2105.12315, 2021
Cited by 15 · Year 2021
VRDMG: Vocal restoration via diffusion posterior sampling with multiple guidance
C Hernandez-Olivan, K Saito, N Murata, CH Lai, MA Martínez-Ramirez, ...
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024
Cited by 10 · Year 2024
Sampling-frequency-independent convolutional layer and its application to audio source separation
K Saito, T Nakamura, K Yatabe, H Saruwatari
IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 2928-2943, 2022
Cited by 10 · Year 2022
Sampling-frequency-independent audio source separation using convolution layer based on impulse invariant method
K Saito, T Nakamura, K Yatabe, Y Koizumi, H Saruwatari
2021 29th European Signal Processing Conference (EUSIPCO), 321-325, 2021
Cited by 10 · Year 2021
SoundCTM: Unifying Score-based and Consistency Models for Full-band Text-to-Sound Generation
K Saito, D Kim, T Shibuya, CH Lai, Z Zhong, Y Takida, Y Mitsufuji
International Conference on Learning Representations (ICLR), 2025, https …, 2025
Cited by 6* · Year 2025
SpecMaskGIT: Masked generative modeling of audio spectrograms for efficient audio synthesis and beyond
M Comunità, Z Zhong, A Takahashi, S Yang, M Zhao, K Saito, Y Ikemiya, ...
25th International Society for Music Information Retrieval (ISMIR) Conference, 2024
Cited by 5 · Year 2024
VERSA: A Versatile Evaluation Toolkit for Speech, Audio, and Music
J Shi, H Shim, J Tian, S Arora, H Wu, D Petermann, JQ Yip, Y Zhang, ...
arXiv preprint arXiv:2412.17667, 2024
Cited by 2 · Year 2024
DisMix: Disentangling Mixtures of Musical Instruments for Source-level Pitch and Timbre Manipulation
YJ Luo, KW Cheuk, W Choi, T Uesaka, K Toyama, K Saito, CH Lai, ...
arXiv preprint arXiv:2408.10807, 2024
Cited by 2 · Year 2024
Aligning Text-to-Music Evaluation with Human Preferences
Y Huang, Z Novack, K Saito, J Shi, S Watanabe, Y Mitsufuji, J Thickstun, ...
arXiv preprint arXiv:2503.16669, 2025
2025
From White Noise to Symphony: Diffusion Models for Music and Sound -- ISMIR24 Diffusion Model Tutorial
CH Lai, B Nguyen, K Saito, S Ermon, Y Mitsufuji
https://github.com/ChiehHsinJesseLai/ISMIR24DiffusionModelTutorial, 2024
2024
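DNN-based audio source separation using a sampling-frequency-independent convolutional layer based on filter design in the frequency domain
K Saito, T Nakamura, K Yatabe, H Saruwatari
IPSJ SIG Technical Reports (Web) 2021 (MUS-131), 2021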
2021
Orator: LLM-Guided Multi-Shot Speech Video Generation
J Chen, Y Fu, A Zeng, Z Wang, S Cen, X Yu, J Tanke, Y Chen, K Saito, ...
Disentangling Multi-instrument Music Audio for Source-level Pitch and Timbre Manipulation
YJ Luo, KW Cheuk, W Choi, WH Liao, K Toyama, T Uesaka, K Saito, ...
Audio Imagination: NeurIPS 2024 Workshop AI-Driven Speech, Music, and Sound …