I am a Research Fellow at Harvard University working with Prof. Marinka Zitnik, following my Ph.D. at Nankai University advised by Prof. Ming-Ming Cheng. My research spans foundational AI models (representation learning, generative modeling, and efficient architectures) and AI for Science, where I develop autonomous agentic AI systems for biomedical reasoning and scientific discovery. I have published in top-tier venues across AI and AI for Science, including Cell (1), TPAMI (4), and major machine learning conferences such as CVPR and NeurIPS (6). My work has received over 7,000 citations and 8,000 GitHub stars, and has transitioned into real-world products.
On the foundational AI models side, I have introduced several influential models for large-scale representation and generative intelligence, including Res2Net, a widely adopted multi-scale backbone; LUSS, the first fully unsupervised large-scale semantic segmentation framework; MDT, the first mask diffusion transformer enabling state-of-the-art image synthesis and efficient training; and UniTS, the first unified multi-task time-series foundation model. These contributions establish general-purpose modeling capabilities across vision, multimodal learning, and time-series analysis.
On the AI for Science side, I develop autonomous, agentic AI systems that integrate scientific knowledge, tool use, and multi-step reasoning. My work published in Cell introduces the first AI agent for biomedical discovery. Building on this direction, ToolUniverse provides a unified ecosystem of AI scientists that enables large-scale, cross-domain scientific interactions, forming the infrastructure for general-purpose scientific agents. TxAgent is an agentic "AI scientist" for medicine that leverages multi-step reasoning and extensive interactions with ToolUniverse to perform therapeutic decision-making with high accuracy and interpretability. Together, these contributions outline a new paradigm for agentic AI in science—systems that autonomously reason across heterogeneous data, tools, and scientific domains.
Democratizing AI Scientists Using ToolUniverse
Shanghua Gao, Richard Zhu, Pengwei Sui, Zhenglun Kong, Sufian Aldogom, Yepeng Huang, Ayush Noori, Reza Shamji, Krishna Parvataneni, Theodoros Tsiligkaridis, Marinka Zitnik
In Review 2025
[code] [arXiv] [https://aiscientist.tools] [AI Agents in Nature] [AI Cell Models in Science] [Kempner Institute]
TxAgent: An AI Agent for Therapeutic Reasoning Across a Universe of Tools
Shanghua Gao, Richard Zhu, Zhenglun Kong, Ayush Noori, Xiaorui Su, Curtis Ginder, Theodoros Tsiligkaridis, Marinka Zitnik
In Review 2025
[arXiv] [code] [project website] [Kempner Institute] [New York Times] [TxAgent evaluation portal]
Empowering Biomedical Discovery with AI Agents
Shanghua Gao, Ada Fang*, Yepeng Huang*, Valentina Giunchiglia*, Ayush Noori*, Jonathan Richard Schwarz, Yasha Ektefaie, Jovana Kondic, Marinka Zitnik
Cell, 2024
UniTS: a Unified Multi-Task Time Series Model
Shanghua Gao, Teddy Koker, Owen Queen, Thomas Hartvigsen, Theodoros Tsiligkaridis, Marinka Zitnik
Conference on Neural Information Processing Systems (NeurIPS), 2024
Let's Think Outside the Box: Exploring Leap-of-Thought in Large Language Models with Creative Humor Generation
Shanshan Zhong, Zhongzhan Huang, Shanghua Gao, Wushao Wen, Liang Lin, Marinka Zitnik, Pan Zhou
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024
Masked Diffusion Transformer is a Strong Image Synthesizer
Shanghua Gao, Pan Zhou, Ming-Ming Cheng, Shuicheng Yan
IEEE International Conference on Computer Vision (ICCV), 2023
EditAnything: Empowering Unparalleled Flexibility in Image Editing and Generation
Shanghua Gao, Zhijie Lin, Xingyu Xie, Pan Zhou, Ming-Ming Cheng, Shuicheng Yan
Proceedings of the 31st ACM International Conference on Multimedia, 2023
Large-scale Unsupervised Semantic Segmentation
Shanghua Gao, Zhong-Yu Li, Ming-Hsuan Yang, Ming-Ming Cheng, Junwei Han, Philip Torr
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Towards Sustainable Self-supervised Learning
Shanghua Gao, Pan Zhou, Ming-Ming Cheng, Shuicheng Yan
Tech report, 2022
RF-Next: Efficient Receptive Field Search for Convolutional Neural Networks
Shanghua Gao, Zhong-Yu Li, Qi Han, Ming-Ming Cheng, Liang Wang
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Global2Local: Efficient Structure Search for Video Action Segmentation
Shanghua Gao*, Qi Han*, Zhong-Yu Li, Pai Peng, Liang Wang, Ming-Ming Cheng
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021
A Highly Efficient Model to Study the Semantics of Salient Object Detection
Ming-Ming Cheng*, Shanghua Gao*, Ali Borji, Yong-Qiang Tan, Zheng Lin, Meng Wang
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Highly Efficient Salient Object Detection with 100K Parameters
Shanghua Gao, Yong-Qiang Tan, Ming-Ming Cheng, Chengze Lu, Yunpeng Chen, Shuicheng Yan
European Conference on Computer Vision (ECCV), 2020
Representative Batch Normalization with Feature Calibration
Shanghua Gao, Qi Han, Duo Li, Ming-Ming Cheng, Pai Peng
Oral, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021
Res2Net: A New Multi-scale Backbone Architecture
Shanghua Gao, Ming-Ming Cheng, Kai Zhao, Xin-Yu Zhang, Ming-Hsuan Yang, Philip Torr
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
JCS: An Explainable COVID-19 Diagnosis System by Joint Classification and Segmentation
Yu-Huan Wu, Shanghua Gao, Jie Mei, Jun Xu, Deng-Ping Fan, Chao-Wei Zhao, Ming-Ming Cheng
IEEE Transactions on Image Processing (TIP), 2021
Point-based Iterative Graph Exploration for Road Graphs Extraction
Yong-Qiang Tan, Shanghua Gao, Xuan-Yi Li, Ming-Ming Cheng, Bo Ren
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020
Optimizing the F-measure for Threshold-free Salient Object Detection
Kai Zhao, Shanghua Gao, Wenguan Wang, Ming-Ming Cheng
IEEE International Conference on Computer Vision (ICCV), 2019
Hi-Fi: Hierarchical Feature Integration for Skeleton Detection
Kai Zhao, Wei Shen, Shanghua Gao, Dandan Li, Ming-Ming Cheng
International Joint Conference on Artificial Intelligence (IJCAI), 2018
Salient Objects in Clutter: Bringing Salient Object Detection to the Foreground
Deng-Ping Fan, Ming-Ming Cheng*, Jiang-Jiang Liu, Shanghua Gao, Qibin Hou, Ali Borji
European Conference on Computer Vision (ECCV), 2018
Orbital Angular Momentum for Wireless Communications
Wenchi Cheng, Wei Zhang, Haiyue Jing, Shanghua Gao, Hailin Zhang
IEEE Wireless Communications Magazine, 2018
Bifocal-Lens Converging Based OAM Wireless Communications
Shanghua Gao, Wenchi Cheng, Hailin Zhang
IEEE Journal of Communications and Information Networks (JCIN), 2019
High-Efficient Beam-Converging for UCA Based Radio Vortex Wireless Communications
Shanghua Gao, Wenchi Cheng, Hailin Zhang, Zan Li
IEEE/CIC International Conference on Communications in China (ICCC), 2017