Qihua Dong's


I am currently a PhD student at Northeastern University, Boston. I graduated from City University of Hong Kong with a major in computer science and a minor in math.

My research interests focus on reasoning and visual understanding in (M)LLMs, including reinforcement learning and tool-use agents. My prior experience spans multimodal LLMs, image segmentation, and medical image analysis.

P.S. You can reach me by email, Twitter, or GitHub. Collaborations are welcome!

News

Apr 2026 Two papers accepted at ACL 2026: a Findings paper on a hierarchical visual agent with compact visual/text context for chart reasoning, and a Main Conference survey on thinking with images.
Mar 2026 Our Amazon work, Visual Reasoning through Tool-supervised Reinforcement Learning, was accepted to CVPR 2026 Findings and is now available on arXiv. If you find it interesting, feel free to upvote it on Hugging Face Papers.
Mar 2026 The code and data for our ICLR 2026 paper Ref-Adv: Exploring MLLM Visual Reasoning in Referring Expression Tasks are released at ref-adv.github.io.
Feb 2026 Our new preprint is released: Fine-T2I: An Open, Large-Scale, and Diverse Dataset for High-Quality T2I Fine-Tuning. The dataset is available on Hugging Face and was the #1 trending dataset!
Jan 2026 One paper accepted at ICLR 2026: Ref-Adv: Exploring MLLM Visual Reasoning in Referring Expression Tasks!

Projects

Authors marked with * contributed equally to the work.
  1. Qihua Dong, Gozde Sahin, Pei Wang, Zhaowei Cai, and 3 more authors
    CVPR 2026 Findings, 2026
  2. Qihua Dong, Yang Kuo, Ju Lin, Handong Zhao, and 5 more authors
    Proc. ICLR, 2026
  3. Qihua Dong, Luis Figueroa, Handong Zhao, Kushal Kafle, and 4 more authors
    arXiv preprint, 2025
  4. Qihua Dong*, Hao Du*, Ying Song, Yan Xu, and 1 more author
    Proc. ICCV, 2023
  5. Ruozhen He*, Qihua Dong*, Jiaying Lin, and Rynson W. H. Lau
    Proc. AAAI, 2023
  6. Hao Du*, Qihua Dong*, Yan Xu, and Jing Liao
    IEEE Transactions on Medical Imaging, 2023