Wanrong Zhu
Adobe Research
Verified email at adobe.com
Title · Cited by · Year
OpenFlamingo: An Open-Source Framework for Training Large Autoregressive Vision-Language Models
A Awadalla, I Gao, J Gardner, J Hessel, Y Hanafy, W Zhu, K Marathe, ...
arXiv preprint arXiv:2308.01390, 2023
Cited by 461* · 2023
Large Language Models are Implicitly Topic Models: Explaining and Finding Good Demonstrations for In-Context Learning
X Wang, W Zhu, WY Wang
NeurIPS 2023, 2023
Cited by 169* · 2023
LayoutGPT: Compositional Visual Planning and Generation with Large Language Models
W Feng, W Zhu, T Fu, V Jampani, A Akula, X He, S Basu, XE Wang, ...
NeurIPS 2023, 2023
Cited by 155 · 2023
Multimodal C4: An Open, Billion-Scale Corpus of Images Interleaved with Text
W Zhu, J Hessel, A Awadalla, SY Gadre, J Dodge, A Fang, Y Yu, ...
NeurIPS 2023 - Dataset and Benchmark Track, 2023
Cited by 140 · 2023
Text Infilling
W Zhu, Z Hu, E Xing
arXiv preprint arXiv:1901.00158, 2019
Cited by 90 · 2019
GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation
A Yan, Z Yang, W Zhu, K Lin, L Li, J Wang, J Yang, Y Zhong, J McAuley, ...
arXiv preprint arXiv:2311.07562, 2023
Cited by 73 · 2023
Texar: A Modularized, Versatile, and Extensible Toolkit for Text Generation
Z Hu, H Shi, B Tan, W Wang, Z Yang, T Zhao, J He, L Qin, D Wang, X Ma, ...
ACL 2019: System Demonstration, 159–164, 2019
Cited by 67 · 2019
VisIT-Bench: A Benchmark for Vision-Language Instruction Following Inspired by Real-World Use
Y Bitton, H Bansal, J Hessel, R Shao, W Zhu, A Awadalla, J Gardner, ...
NeurIPS 2023 - Dataset and Benchmark Track, 2023
Cited by 55 · 2023
VELMA: Verbalization Embodiment of LLM Agents for Vision and Language Navigation in Street View
R Schumann, W Zhu, W Feng, TJ Fu, S Riezler, WY Wang
AAAI 2024, 2023
Cited by 48 · 2023
Diagnosing Vision-and-Language Navigation: What Really Matters
W Zhu, Y Qi, P Narayana, K Sone, S Basu, XE Wang, Q Wu, M Eckstein, ...
NAACL 2022, 5981–5993, 2021
Cited by 44 · 2021
End-to-end Dense Video Captioning as Sequence Generation
W Zhu, B Pang, A Thapliyal, WY Wang, R Soricut
COLING 2022, 5651–5665, 2022
Cited by 40 · 2022
Neuro-Symbolic Causal Language Planning with Commonsense Prompting
Y Lu, W Feng, W Zhu, W Xu, XE Wang, M Eckstein, WY Wang
ICLR 2023, 2022
Cited by 36* · 2022
Visualize Before You Write: Imagination-Guided Open-Ended Text Generation
W Zhu, A Yan, Y Lu, W Xu, XE Wang, M Eckstein, WY Wang
Findings of EACL 2023, 78–92, 2022
Cited by 33 · 2022
Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation
W Zhu, XE Wang, TJ Fu, A Yan, P Narayana, K Sone, S Basu, WY Wang
EACL 2021, 1207–1221, 2020
Cited by 33 · 2020
Multimodal Procedural Planning via Dual Text-Image Prompting
Y Lu, P Lu, Z Chen, W Zhu, XE Wang, WY Wang
Findings of EMNLP 2024, 2023
Cited by 32 · 2023
Imagination-Augmented Natural Language Understanding
Y Lu, W Zhu, XE Wang, M Eckstein, WY Wang
NAACL 2022, 4392–4402, 2022
Cited by 29 · 2022
ImaginE: An Imagination-based Automatic Evaluation Metric for Natural Language Generation
W Zhu, XE Wang, A Yan, M Eckstein, WY Wang
Findings of EACL 2023, 93–105, 2021
Cited by 10 · 2021
Towards Understanding Sample Variance in Visually Grounded Language Generation: Evaluations and Observations
W Zhu, XE Wang, P Narayana, K Sone, S Basu, WY Wang
EMNLP 2020, 8806–8811, 2020
Cited by 10 · 2020
List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs
A Yan, Z Yang, J Wu, W Zhu, J Yang, L Li, K Lin, J Wang, J McAuley, ...
arXiv preprint arXiv:2404.16375, 2024
Cited by 6 · 2024
MMSci: A Multimodal Multi-Discipline Dataset for PhD-Level Scientific Comprehension
Z Li, X Yang, K Choi, W Zhu, R Hsieh, HJ Kim, JH Lim, S Ji, B Lee, X Yan, ...
AI for Accelerated Materials Design-Vienna 2024, 2024
Cited by 6 · 2024
Articles 1–20