Shanghua Gao

On the job market (AY 2025–26). Email: shanghuagao@gmail.com

I am a Research Fellow at Harvard University working with Prof. Marinka Zitnik, following my Ph.D. at Nankai University advised by Prof. Ming-Ming Cheng. My research spans foundational AI models (including representation learning, generative modeling, and efficient architectures) and AI for Science, where I develop autonomous agentic AI systems for biomedical reasoning and scientific discovery. I have published in top-tier venues, including Cell (1), TPAMI (4), and major machine learning and computer vision conferences such as CVPR and NeurIPS (6). My work has received over 7,000 citations and 8,000 GitHub stars and has successfully transitioned into real-world products.

On the foundational AI models side, I have introduced several influential models for large-scale representation and generative intelligence, including Res2Net, a widely adopted multi-scale backbone; LUSS, the first fully unsupervised large-scale semantic segmentation framework; MDT, the first mask diffusion transformer enabling state-of-the-art image synthesis and efficient training; and UniTS, the first unified multi-task time-series foundation model. These contributions establish general-purpose modeling capabilities across vision, multimodal learning, and time-series analysis.

On the AI for Science side, I develop autonomous, agentic AI systems that integrate scientific knowledge, tool use, and multi-step reasoning. My work published in Cell introduces the first AI agent for biomedical discovery. Building on this direction, ToolUniverse provides a unified ecosystem of AI scientists that enables large-scale, cross-domain scientific interactions, forming the infrastructure for general-purpose scientific agents. TxAgent is an agentic "AI scientist" for medicine that leverages multi-step reasoning and extensive interactions with ToolUniverse to perform therapeutic decision-making with high accuracy and interpretability. Together, these contributions outline a new paradigm for agentic AI in science—systems that autonomously reason across heterogeneous data, tools, and scientific domains.

My Works CV

Democratizing AI Scientists Using ToolUniverse

Shanghua Gao, Richard Zhu, Pengwei Sui, Zhenglun Kong, Sufian Aldogom, Yepeng Huang, Ayush Noori, Reza Shamji, Krishna Parvataneni, Theodoros Tsiligkaridis, Marinka Zitnik

In Review 2025

[code] [arXiv] [https://aiscientist.tools] [AI Agents in Nature] [AI Cell Models in Science] [Kempner Institute]

TxAgent: An AI Agent for Therapeutic Reasoning Across a Universe of Tools

Shanghua Gao, Richard Zhu, Zhenglun Kong, Ayush Noori, Xiaorui Su, Curtis Ginder, Theodoros Tsiligkaridis, Marinka Zitnik

In Review 2025

[arXiv] [code] [project website] [Kempner Institute] [New York Times] [TxAgent evaluation portal]

Empowering Biomedical Discovery with AI Agents

Shanghua Gao, Ada Fang*, Yepeng Huang*, Valentina Giunchiglia*, Ayush Noori*, Jonathan Richard Schwarz, Yasha Ektefaie, Jovana Kondic, Marinka Zitnik

Cell, 2024

[pdf] [arXiv] [https://aiscientist.tools]

UniTS: A Unified Multi-Task Time Series Model

Shanghua Gao, Teddy Koker, Owen Queen, Thomas Hartvigsen, Theodoros Tsiligkaridis, Marinka Zitnik

Conference on Neural Information Processing Systems (NeurIPS), 2024

[pdf] [arXiv] [project website] [code] [poster]

Let's Think Outside the Box: Exploring Leap-of-Thought in Large Language Models with Creative Humor Generation

Shanshan Zhong, Zhongzhan Huang, Shanghua Gao, Wushao Wen, Liang Lin, Marinka Zitnik, Pan Zhou

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024

[pdf] [project page] [code]

Masked Diffusion Transformer is a Strong Image Synthesizer

Shanghua Gao, Pan Zhou, Ming-Ming Cheng, Shuicheng Yan

IEEE International Conference on Computer Vision (ICCV), 2023

[pdf] [code]

EditAnything: Empowering Unparalleled Flexibility in Image Editing and Generation

Shanghua Gao, Zhijie Lin, Xingyu Xie, Pan Zhou, Ming-Ming Cheng, Shuicheng Yan

Proceedings of the 31st ACM International Conference on Multimedia, 2023

[pdf] [code]

Large-scale Unsupervised Semantic Segmentation

Shanghua Gao, Zhong-Yu Li, Ming-Hsuan Yang, Ming-Ming Cheng, Junwei Han, Philip Torr

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023

[pdf] [project] [code] [ImageNet-S]

Towards Sustainable Self-supervised Learning

Shanghua Gao, Pan Zhou, Ming-Ming Cheng, Shuicheng Yan

Tech report, 2022

[pdf] [code]

RF-Next: Efficient Receptive Field Search for Convolutional Neural Networks

Shanghua Gao, Zhong-Yu Li, Qi Han, Ming-Ming Cheng, Liang Wang

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022

Global2Local: Efficient Structure Search for Video Action Segmentation

Shanghua Gao*, Qi Han*, Zhong-Yu Li, Pai Peng, Liang Wang, Ming-Ming Cheng

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021

[pdf-pami] [pdf-cvpr] [project] [code]

A Highly Efficient Model to Study the Semantics of Salient Object Detection

Ming-Ming Cheng*, Shanghua Gao*, Ali Borji, Yong-Qiang Tan, Zheng Lin, Meng Wang

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021

Highly Efficient Salient Object Detection with 100K Parameters

Shanghua Gao, Yong-Qiang Tan, Ming-Ming Cheng, Chengze Lu, Yunpeng Chen, Shuicheng Yan

European Conference on Computer Vision (ECCV), 2020

[pdf-pami] [pdf-eccv] [bib] [project] [code]

Representative Batch Normalization with Feature Calibration

Shanghua Gao, Qi Han, Duo Li, Ming-Ming Cheng, Pai Peng

Oral, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021

[pdf] [project] [bib] [code]

Res2Net: A New Multi-scale Backbone Architecture

Shanghua Gao, Ming-Ming Cheng, Kai Zhao, Xin-Yu Zhang, Ming-Hsuan Yang, Philip Torr

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021

[pdf] [bib] [DEMO] [project] [code] [PPT] [中文版]

JCS: An Explainable COVID-19 Diagnosis System by Joint Classification and Segmentation

Yu-Huan Wu, Shanghua Gao, Jie Mei, Jun Xu, Deng-Ping Fan, Chao-Wei Zhao, Ming-Ming Cheng

IEEE Transactions on Image Processing (TIP), 2021

[pdf] [bib]

Point-based Iterative Graph Exploration for Road Graphs Extraction

Yong-Qiang Tan, Shanghua Gao, Xuan-Yi Li, Ming-Ming Cheng, Bo Ren

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020

[pdf] [bib] [project]

Optimizing the F-measure for Threshold-free Salient Object Detection

Kai Zhao, Shanghua Gao, Wenguan Wang, Ming-Ming Cheng

IEEE International Conference on Computer Vision (ICCV), 2019

[pdf] [bib] [project] [code]

Hi-Fi: Hierarchical Feature Integration for Skeleton Detection

Kai Zhao, Wei Shen, Shanghua Gao, Dandan Li, Ming-Ming Cheng

International Joint Conference on Artificial Intelligence (IJCAI), 2018

[pdf] [bib] [project] [code] [中文版]

Salient Objects in Clutter: Bringing Salient Object Detection to the Foreground

Deng-Ping Fan, Ming-Ming Cheng*, Jiang-Jiang Liu, Shanghua Gao, Qibin Hou, Ali Borji

European Conference on Computer Vision (ECCV), 2018

[pdf] [bib] [project] [中文版]

Orbital Angular Momentum for Wireless Communications

Wenchi Cheng, Wei Zhang, Haiyue Jing, Shanghua Gao, Hailin Zhang

IEEE Wireless Communications Magazine, 2018

[pdf] [bib] [project]

Bifocal-Lens Converging Based OAM Wireless Communications

Shanghua Gao, Wenchi Cheng, Hailin Zhang

IEEE Journal of Communications and Information Networks (JCIN), 2019

High-efficient Beam-converging for UCA Based Radio Vortex Wireless Communications

Shanghua Gao, Wenchi Cheng, Hailin Zhang, Zan Li

IEEE/CIC International Conference on Communications in China (ICCC), 2017

[pdf_journal] [pdf_conf] [bib_journal] [bib_conf] [project]

Contact

  • Email: shanghuagao@gmail.com
  • Address: Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA