Publications

Publications by category in reverse chronological order. Generated by jekyll-scholar.

2026

  1. HERMES: KV Cache as Hierarchical Memory for Efficient Streaming Video Understanding
    Haowei Zhang, Shudong Yang, Jinlan Fu, See-Kiong Ng, and Xipeng Qiu
    2026

2025

  1. GroundedPRM: Tree-Guided and Fidelity-Aware Process Reward Modeling for Step-Level Reasoning
    Yao Zhang, Yu Wu, Haowei Zhang, Weiguo Li, Haokun Chen, Jingpei Wu, Guohao Li, Zhen Han, and Volker Tresp
    2025
  2. MMVU: Measuring Expert-Level Multi-Discipline Video Understanding
    Yilun Zhao*, Haowei Zhang*, Lujing Xie*, Guo Gan, Yitao Long, Zhiyuan Hu, Tongyan Hu, Weiyuan Chen, Chuhan Li, Junyang Song, Zhijian Xu, Chengye Wang, Weifeng Pan, Ziyao Shangguan, Xiangru Tang, Zhenwen Liang, Yixin Liu, Chen Zhao, and Arman Cohan
    In CVPR 2025, Jun 2025

2024

  1. Visual Question Decomposition on Multimodal Large Language Models
    Haowei Zhang*, Jianzhe Liu*, Zhen Han, Shuo Chen, Bailan He, Volker Tresp, Zhiqiang Xu, and Jindong Gu
    In Findings of the Association for Computational Linguistics: EMNLP 2024, Nov 2024
  2. OpenT2T: An Open-Source Toolkit for Table-to-Text Generation
    Haowei Zhang*, Shengyun Si*, Yilun Zhao*, Lujing Xie, Zhijian Xu, Lyuhao Chen, Linyong Nan, Pengcheng Wang, Xiangru Tang, and Arman Cohan
    In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, Nov 2024

2023

  1. Investigating Table-to-Text Generation Capabilities of Large Language Models in Real-World Information Seeking Scenarios
    Yilun Zhao*, Haowei Zhang*, Shengyun Si*, Linyong Nan, Xiangru Tang, and Arman Cohan
    In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: Industry Track, Dec 2023