Qihua Dong's

I am currently a PhD student at Northeastern University, Boston. I graduated from City University of Hong Kong with a major in computer science and a minor in math.

My research interests focus on reasoning and visual understanding in (M)LLMs, including reinforcement learning and tool-use agents. My prior experience spans multimodal LLMs, image segmentation, and medical image analysis.

P.S. You can reach me by email, Twitter, or GitHub. Collaborations are welcome!

News

Mar 2026 Our Amazon internship work, Visual Reasoning through Tool-supervised Reinforcement Learning, was accepted to CVPR 2026 Findings. Paper coming soon.
Mar 2026 The code and data for our ICLR 2026 paper Ref-Adv: Exploring MLLM Visual Reasoning in Referring Expression Tasks are released at ref-adv.github.io.
Feb 2026 Our new preprint is released: Fine-T2I: An Open, Large-Scale, and Diverse Dataset for High-Quality T2I Fine-Tuning. The dataset is available on HuggingFace and was trending #1 in datasets!
Jan 2026 One paper accepted at ICLR 2026: Ref-Adv: Exploring MLLM Visual Reasoning in Referring Expression Tasks!
Oct 2025 Honored to receive the Outstanding Reviewer Award at ICCV 2025!

Projects

Authors marked with * contributed equally to the work.
  1. Visual Reasoning through Tool-supervised Reinforcement Learning
    Qihua Dong, Gozde Sahin, Pei Wang, Zhaowei Cai, and 3 more authors
    CVPR 2026 Findings, 2026
  2. Qihua Dong, Yang Kuo, Ju Lin, Handong Zhao, and 5 more authors
    Proc. ICLR, 2026
  3. Qihua Dong, Luis Figueroa, Handong Zhao, Kushal Kafle, and 4 more authors
    arXiv preprint, 2025
  4. Qihua Dong*, Hao Du*, Ying Song, Yan Xu, and 1 more author
    Proc. ICCV, 2023
  5. Ruozhen He*, Qihua Dong*, Jiaying Lin, and Rynson W. H. Lau
    Proc. AAAI, 2023
  6. Hao Du*, Qihua Dong*, Yan Xu, and Jing Liao
    IEEE Transactions on Medical Imaging, 2023