Shanghua Gao 高尚华

On the job market (AY 2025–26) | shanghuagao@gmail.com

I am a Research Fellow at Harvard University working with Prof. Marinka Zitnik, following my Ph.D. at Nankai University advised by Prof. Ming-Ming Cheng. My research spans foundational AI models (including representation learning, generative modeling, and efficient architectures) and AI for Science, where I develop autonomous agentic AI systems for biomedical reasoning and scientific discovery. I have published in top-tier venues across AI and AI for Science, including Cell (1), TPAMI (4), and major machine learning conferences such as CVPR and NeurIPS (6). My work has received over 7,000 citations and 8,000 GitHub stars and has transitioned into real-world products.

On the foundational AI models side, I have introduced several influential models for large-scale representation and generative intelligence, including Res2Net, a widely adopted multi-scale backbone; LUSS, the first fully unsupervised large-scale semantic segmentation framework; MDT, the first masked diffusion transformer, enabling state-of-the-art image synthesis and efficient training; and UniTS, the first unified multi-task time-series foundation model. These contributions establish general-purpose modeling capabilities across vision, multimodal learning, and time-series analysis.

On the AI for Science side, I develop autonomous, agentic AI systems that integrate scientific knowledge, tool use, and multi-step reasoning. My work published in Cell introduces the first AI agent for biomedical discovery. Building on this direction, ToolUniverse provides a unified ecosystem of AI scientists that enables large-scale, cross-domain scientific interactions, forming the infrastructure for general-purpose scientific agents. TxAgent is an agentic "AI scientist" for medicine that leverages multi-step reasoning and extensive interactions with ToolUniverse to perform therapeutic decision-making with high accuracy and interpretability. Together, these contributions outline a new paradigm for agentic AI in science—systems that autonomously reason across heterogeneous data, tools, and scientific domains.

Selected Works

Full List on Google Scholar | GitHub

Democratizing AI Scientists Using ToolUniverse
Shanghua Gao, Richard Zhu, Pengwei Sui, Zhenglun Kong, Sufian Aldogom, Yepeng Huang, Ayush Noori, Reza Shamji, Krishna Parvataneni, Theodoros Tsiligkaridis, Marinka Zitnik

In Review, 2025

[code] [arXiv] [https://aiscientist.tools] [AI Agents in Nature] [AI Cell Models in Science] [Kempner Institute]

TxAgent: An AI Agent for Therapeutic Reasoning Across a Universe of Tools
Shanghua Gao, Richard Zhu, Zhenglun Kong, Ayush Noori, Xiaorui Su, Curtis Ginder, Theodoros Tsiligkaridis, Marinka Zitnik

In Review, 2025

[arXiv] [code] [project website] [Kempner Institute] [New York Times] [TxAgent evaluation portal]

Empowering Biomedical Discovery with AI Agents
Shanghua Gao, Ada Fang*, Yepeng Huang*, Valentina Giunchiglia*, Ayush Noori*, Jonathan Richard Schwarz, Yasha Ektefaie, Jovana Kondic, Marinka Zitnik

Cell, 2024

[pdf] [arXiv] [https://aiscientist.tools]

UniTS: a Unified Multi-Task Time Series Model
Shanghua Gao, Teddy Koker, Owen Queen, Thomas Hartvigsen, Theodoros Tsiligkaridis, Marinka Zitnik

Conference on Neural Information Processing Systems (NeurIPS), 2024

[pdf] [arXiv] [project website] [code] [poster]

Let's Think Outside the Box: Exploring Leap-of-Thought in Large Language Models with Creative Humor Generation
Shanshan Zhong, Zhongzhan Huang, Shanghua Gao, Wushao Wen, Liang Lin, Marinka Zitnik, Pan Zhou

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024

[pdf] [project page] [code]

Masked Diffusion Transformer is a Strong Image Synthesizer
Shanghua Gao, Pan Zhou, Ming-Ming Cheng, Shuicheng Yan

IEEE International Conference on Computer Vision (ICCV), 2023

[pdf] [code]

EditAnything: Empowering unparalleled flexibility in image editing and generation
Shanghua Gao, Zhijie Lin, Xingyu Xie, Pan Zhou, Ming-Ming Cheng, Shuicheng Yan

Proceedings of the 31st ACM International Conference on Multimedia, 2023

[pdf] [code]

Large-scale Unsupervised Semantic Segmentation
Shanghua Gao, Zhong-Yu Li, Ming-Hsuan Yang, Ming-Ming Cheng, Junwei Han, Philip Torr

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023

[pdf] [project] [code] [ImageNet-S]

Towards Sustainable Self-supervised Learning
Shanghua Gao, Pan Zhou, Ming-Ming Cheng, Shuicheng Yan

Tech report, 2022

[pdf] [code]

RF-Next: Efficient Receptive Field Search for Convolutional Neural Networks
Shanghua Gao, Zhong-Yu Li, Qi Han, Ming-Ming Cheng, Liang Wang

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022

Global2Local: Efficient Structure Search for Video Action Segmentation

Shanghua Gao*, Qi Han*, Zhong-Yu Li, Pai Peng, Liang Wang, Ming-Ming Cheng

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021

[pdf-pami] [pdf-cvpr] [project] [code]

A Highly Efficient Model to Study the Semantics of Salient Object Detection
Ming-Ming Cheng*, Shanghua Gao*, Ali Borji, Yong-Qiang Tan, Zheng Lin, Meng Wang

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021

Highly Efficient Salient Object Detection with 100K Parameters

Shanghua Gao, Yong-Qiang Tan, Ming-Ming Cheng, Chengze Lu, Yunpeng Chen, Shuicheng Yan

European Conference on Computer Vision (ECCV), 2020

[pdf-pami] [pdf-eccv] [bib] [project] [code]

Representative Batch Normalization with Feature Calibration
Shanghua Gao, Qi Han, Duo Li, Ming-Ming Cheng, Pai Peng

Oral, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021

[pdf] [project] [bib] [code]

Res2Net: A New Multi-scale Backbone Architecture
Shanghua Gao, Ming-Ming Cheng, Kai Zhao, Xin-Yu Zhang, Ming-Hsuan Yang, Philip Torr

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021

[pdf] [bib] [DEMO] [project] [code] [PPT] [Chinese version]

JCS: An explainable COVID-19 diagnosis system by joint classification and segmentation
Yu-Huan Wu, Shanghua Gao, Jie Mei, Jun Xu, Deng-Ping Fan, Chao-Wei Zhao, Ming-Ming Cheng

IEEE Transactions on Image Processing (TIP), 2021

[pdf] [bib]

Point-based Iterative Graph Exploration for Road Graphs Extraction
Yong-Qiang Tan, Shanghua Gao, Xuan-Yi Li, Ming-Ming Cheng, Bo Ren

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020

[pdf] [bib] [project]

Optimizing the F-measure for Threshold-free Salient Object Detection
Kai Zhao, Shanghua Gao, Wenguan Wang, Ming-Ming Cheng

IEEE International Conference on Computer Vision (ICCV), 2019

[pdf] [bib] [project] [code]

Hi-Fi: Hierarchical Feature Integration for Skeleton Detection
Kai Zhao, Wei Shen, Shanghua Gao, Dandan Li, Ming-Ming Cheng

International Joint Conference on Artificial Intelligence (IJCAI), 2018

[pdf] [bib] [project] [code] [Chinese version]

Salient Objects in Clutter: Bringing Salient Object Detection to the Foreground
Deng-Ping Fan, Ming-Ming Cheng*, Jiang-Jiang Liu, Shanghua Gao, Qinbin Hou, Ali Borji

European Conference on Computer Vision (ECCV), 2018

[pdf] [bib] [project] [Chinese version]

Orbital Angular Momentum for Wireless Communications
Wenchi Cheng, Wei Zhang, Haiyue Jing, Shanghua Gao, Hailin Zhang

IEEE Wireless Communications Magazine, 2018

[pdf] [bib] [project]

Bifocal-Lens Converging Based OAM Wireless Communications
Shanghua Gao, Wenchi Cheng, Hailin Zhang

IEEE Journal of Communications and Information Networks (JCIN), 2019

High-efficient beam-converging for UCA based radio vortex wireless communications

Shanghua Gao, Wenchi Cheng, Hailin Zhang, Zan Li

IEEE/CIC International Conference on Communications in China (ICCC), 2017

[pdf_journal] [pdf_conf] [bib_journal] [bib_conf] [project]

Contact

Email: shanghuagao@gmail.com

Address: Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA