I am a Research Fellow at Harvard University working with Prof. Marinka Zitnik, following my Ph.D. at Nankai University advised by Prof. Ming-Ming Cheng. My research spans foundational AI models (representation learning, generative modeling, and efficient architectures) and AI for Science, where I develop autonomous agentic AI systems for biomedical reasoning and scientific discovery. I have published in top-tier venues across AI and AI for Science, including Cell (1), TPAMI (4), and major machine learning conferences such as CVPR and NeurIPS (6). My work has received over 7,000 citations and 8,000 GitHub stars, and has been adopted in real-world products.
On the foundational AI models side, I have introduced several influential models for large-scale representation and generative intelligence, including Res2Net, a widely adopted multi-scale backbone; LUSS, the first fully unsupervised large-scale semantic segmentation framework; MDT, the first mask diffusion transformer enabling state-of-the-art image synthesis and efficient training; and UniTS, the first unified multi-task time-series foundation model. These contributions establish general-purpose modeling capabilities across vision, multimodal learning, and time-series analysis.
On the AI for Science side, I develop autonomous, agentic AI systems that integrate scientific knowledge, tool use, and multi-step reasoning. My work published in Cell introduces the first AI agent for biomedical discovery. Building on this direction, ToolUniverse provides a unified ecosystem of AI scientists that enables large-scale, cross-domain scientific interactions, forming the infrastructure for general-purpose scientific agents. TxAgent is an agentic "AI scientist" for medicine that leverages multi-step reasoning and extensive interactions with ToolUniverse to perform therapeutic decision-making with high accuracy and interpretability. Together, these contributions outline a new paradigm for agentic AI in science — systems that autonomously reason across heterogeneous data, tools, and scientific domains.
Qworld: Question-Specific Evaluation Criteria for LLMs
Shanghua Gao*, Yuchang Su*, Pengwei Sui, Curtis Ginder, Marinka Zitnik
arXiv, 2026
Democratizing AI Scientists Using ToolUniverse
Shanghua Gao, Richard Zhu, Pengwei Sui, Zhenglun Kong, Sufian Aldogom, Yepeng Huang, Ayush Noori, Reza Shamji, Krishna Parvataneni, Theodoros Tsiligkaridis, Marinka Zitnik
In review, 2025
[code] [arXiv] [aiscientist.tools] [Nature] [Science] [Kempner]
TxAgent: An AI Agent for Therapeutic Reasoning Across a Universe of Tools
Shanghua Gao, Richard Zhu, Zhenglun Kong, Ayush Noori, Xiaorui Su, Curtis Ginder, Theodoros Tsiligkaridis, Marinka Zitnik
In review, 2025
Empowering Biomedical Discovery with AI Agents
Shanghua Gao, Ada Fang*, Yepeng Huang*, Valentina Giunchiglia*, Ayush Noori*, Jonathan Richard Schwarz, Yasha Ektefaie, Jovana Kondic, Marinka Zitnik
Cell, 2024
UniTS: a Unified Multi-Task Time Series Model
Shanghua Gao, Teddy Koker, Owen Queen, Thomas Hartvigsen, Theodoros Tsiligkaridis, Marinka Zitnik
NeurIPS, 2024
Let's Think Outside the Box: Exploring Leap-of-Thought in Large Language Models with Creative Humor Generation
Shanshan Zhong, Zhongzhan Huang, Shanghua Gao, Wushao Wen, Liang Lin, Marinka Zitnik, Pan Zhou
CVPR, 2024
Masked Diffusion Transformer is a Strong Image Synthesizer
Shanghua Gao, Pan Zhou, Ming-Ming Cheng, Shuicheng Yan
ICCV, 2023
EditAnything: Empowering Unparalleled Flexibility in Image Editing and Generation
Shanghua Gao, Zhijie Lin, Xingyu Xie, Pan Zhou, Ming-Ming Cheng, Shuicheng Yan
ACM Multimedia, 2023
Large-scale Unsupervised Semantic Segmentation
Shanghua Gao, Zhong-Yu Li, Ming-Hsuan Yang, Ming-Ming Cheng, Junwei Han, Philip Torr
TPAMI, 2023
Towards Sustainable Self-supervised Learning
Shanghua Gao, Pan Zhou, Ming-Ming Cheng, Shuicheng Yan
Tech report, 2022
RF-Next: Efficient Receptive Field Search for Convolutional Neural Networks
Shanghua Gao, Zhong-Yu Li, Qi Han, Ming-Ming Cheng, Liang Wang
TPAMI, 2022
Global2Local: Efficient Structure Search for Video Action Segmentation
Shanghua Gao*, Qi Han*, Zhong-Yu Li, Pai Peng, Liang Wang, Ming-Ming Cheng
CVPR, 2021
A Highly Efficient Model to Study the Semantics of Salient Object Detection
Ming-Ming Cheng*, Shanghua Gao*, Ali Borji, Yong-Qiang Tan, Zheng Lin, Meng Wang
TPAMI, 2021
Highly Efficient Salient Object Detection with 100K Parameters
Shanghua Gao, Yong-Qiang Tan, Ming-Ming Cheng, Chengze Lu, Yunpeng Chen, Shuicheng Yan
ECCV, 2020
Representative Batch Normalization with Feature Calibration
Shanghua Gao, Qi Han, Duo Li, Ming-Ming Cheng, Pai Peng
Oral, CVPR, 2021
Res2Net: A New Multi-scale Backbone Architecture
Shanghua Gao, Ming-Ming Cheng, Kai Zhao, Xin-Yu Zhang, Ming-Hsuan Yang, Philip Torr
TPAMI, 2021
JCS: An Explainable COVID-19 Diagnosis System by Joint Classification and Segmentation
Yu-Huan Wu, Shanghua Gao, Jie Mei, Jun Xu, Deng-Ping Fan, Chao-Wei Zhao, Ming-Ming Cheng
TIP, 2021
Point-based Iterative Graph Exploration for Road Graphs Extraction
Yong-Qiang Tan, Shanghua Gao, Xuan-Yi Li, Ming-Ming Cheng, Bo Ren
CVPR, 2020
Bifocal-Lens Converging Based OAM Wireless Communications
Shanghua Gao, Wenchi Cheng, Hailin Zhang
IEEE JCIN, 2019
High-Efficient Beam-Converging for UCA Based Radio Vortex Wireless Communications
Shanghua Gao, Wenchi Cheng, Hailin Zhang, Zan Li
ICCC, 2017