Zhuoming Chen
Verified email at
Specinfer: Accelerating large language model serving with tree-based speculative inference and verification
X Miao, G Oliaro, Z Zhang, X Cheng, Z Wang, Z Zhang, RYY Wong, A Zhu, ...
Proceedings of the 29th ACM International Conference on Architectural …, 2024
Quantized training of gradient boosting decision trees
Y Shi, G Ke, Z Chen, S Zheng, TY Liu
Advances in neural information processing systems 35, 18822-18833, 2022
Sequoia: Scalable, Robust, and Hardware-aware Speculative Decoding
Z Chen, A May, R Svirschevski, Y Huang, M Ryabinin, Z Jia, B Chen
arXiv preprint arXiv:2402.12374, 2024
TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding
H Sun, Z Chen, X Yang, Y Tian, B Chen
arXiv preprint arXiv:2404.11912, 2024
GNNPipe: Scaling Deep GNN Training with Pipelined Model Parallelism
J Chen, Z Chen, X Qian
arXiv preprint arXiv:2308.10087, 2023
SpecExec: Massively Parallel Speculative Decoding for Interactive LLM Inference on Consumer Devices
R Svirschevski, A May, Z Chen, B Chen, Z Jia, M Ryabinin
arXiv preprint arXiv:2406.02532, 2024
Quark: A Gradient-Free Quantum Learning Framework for Classification Tasks
Z Zhang, Z Chen, H Huang, Z Jia