Main Publications
MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks
arXiv, 2024
POINTS: Improving Your Vision-language Model with Affordable Strategies
Yuan Liu, Zhongyin Zhao, Ziyuan Zhuang, Le Tian, Xiao Zhou, Jie Zhou
arXiv, 2024
VLMEvalKit: An Open-Source Toolkit for Evaluating Large Multi-Modality Models
ACM Multimedia, 2024
Rethinking Overlooked Aspects in Vision-Language Models
Yuan Liu, Le Tian, Xiao Zhou, Jie Zhou
arXiv, 2024
Improving Pixel-based MIM by Reducing Wasted Modeling Capability
Yuan Liu, Songyang Zhang, Jiacheng Chen, Zhaohui Yu, Kai Chen, Dahua Lin
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
MMBench: Is Your Multi-modal Model an All-around Player?
Yuan Liu, Haodong Duan, Yuanhan Zhang, Bo Li, Songyang Zhang, Wangbo Zhao, Yike Yuan, Jiaqi Wang, Conghui He, Ziwei Liu, Kai Chen, Dahua Lin
European Conference on Computer Vision, 2024 (Oral)
PixMIM: Rethinking Pixel Reconstruction in Masked Image Modeling
Yuan Liu, Songyang Zhang, Jiacheng Chen, Kai Chen, Dahua Lin
Transactions on Machine Learning Research, 2024
MoQuad: Motion-focused Quadruple Construction for Video Contrastive Learning
Yuan Liu, Jiacheng Chen, Hao Wu
European Conference on Computer Vision Workshop, 2022
Contrast and Order Representations for Video Self-Supervised Learning
Kai Hu, Jie Shao, Yuan Liu, Bhiksha Raj, Marios Savvides, Zhiqiang Shen
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021