Homepage: https://ha0tang.github.io/
Research Interests
Generative models, world models, spatial intelligence, large language models, multimodal large models, computer vision
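Research and Education
Hao Tang, Ph.D., is an Assistant Professor/Researcher and doctoral supervisor at the School of Computer Science, Peking University, where he is a Boya Young Scholar and a Weiming Young Scholar. He heads the Embodied and Generative Intelligence Lab at Peking University and serves as a research advisor for the Peking University Turing Class. He conducted postdoctoral research at Carnegie Mellon University (CMU) in the United States and at ETH Zurich in Switzerland, and received his Ph.D. from the University of Trento, Italy. He has also held academic visits and research internships at institutions including the University of Oxford (UK), the National University of Singapore, Northeastern University (USA), and IIAI (UAE). In addition, he actively promotes collaboration between industry, academia, and research, and has served as a senior technical consultant for several startups in the United States, the United Kingdom, Romania, and China.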
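Honors and Awards
Selected for a national-level overseas high-level talent program; recipient of the National Outstanding Overseas Student Award (returnee category); and listed for three consecutive years (2023–2025) among Stanford University's World's Top 2% Scientists.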
Selected Publications (*Corresponding Author(s))
[1] Jiawei Mao, Yu Yang, Xuesong Yin, Ling Shao, Hao Tang*. AllRestorer: All-in-One Transformer for Image Restoration under Composite Degradations. IEEE TPAMI, 2026
[2] Hao Tang, Ling Shao, Zhenyu Zhang, Luc Van Gool, Nicu Sebe. Spatial-Temporal Graph Mamba for Music-Guided Dance Video Synthesis. IEEE TPAMI, 2025
[3] Hao Tang, Ling Shao, Nicu Sebe, Luc Van Gool. Enhanced Multi-Scale Cross-Attention for Person Image Generation. IEEE TPAMI, 2025
[4] Hao Tang, Ling Shao, Nicu Sebe, Luc Van Gool. Graph Transformer GANs with Graph Masked Modeling for Architectural Layout Generation. IEEE TPAMI, 2024
[5] Hui Wei, Hao Tang, Xuemei Jia, Zhixiang Wang, Hanxun Yu, Zhubo Li, Shin'ichi Satoh, Luc Van Gool, Zheng Wang. Physical Adversarial Attack Meets Computer Vision: A Decade Survey. IEEE TPAMI, 2024
[6] Hao Tang, Guolei Sun, Nicu Sebe, Luc Van Gool. Edge Guided GANs with Multi-Scale Contrastive Learning for Semantic Image Synthesis. IEEE TPAMI, 2023
[7] Hao Tang, Philip HS Torr, Nicu Sebe. Multi-Channel Attention Selection GANs for Guided Image-to-Image Translation. IEEE TPAMI, 2022
[8] Hao Tang, Ling Shao, Philip HS Torr, Nicu Sebe. Local and Global GANs with Semantic-Aware Upsampling for Image Generation. IEEE TPAMI, 2022
[9] Songtao Li, Hao Tang*. Multimodal Alignment and Fusion: A Survey. Springer IJCV, 2025
[10] Hao Tang, Ling Shao, Philip HS Torr, Nicu Sebe. Bipartite Graph Reasoning GANs for Person Pose and Facial Image Synthesis. Springer IJCV, 2022
[11] Hongpeng Wang, Zeyu Zhang, Wenhao Li, Hao Tang*. MoRL: Reinforced Reasoning for Unified Motion Understanding and Generation. In ACL 2026, San Diego, USA
[12] Yuxuan Fan, Jing Hao, Hong Chen, Jiahao Bao, Yihua Shao, Yuci Liang, Kuo Feng Hung, Hao Tang*. OralGPT-Plus: Learning to Use Visual Tools via Reinforcement Learning for Panoramic X-ray Analysis. In CVPR 2026, Denver, USA
[13] Jun Liu, Zhenglun Kong, Peiyan Dong, Changdi Yang, Tianqi Li, Hao Tang*, et al. Structured Agent Distillation for Large Language Model Agents. In AAMAS 2026, Paphos, Cyprus
[14] Zhengri Wu, Yiran Wang, Yu Wen, Zeyu Zhang, Biao Wu, Hao Tang*. StereoAdapter: Adapting Stereo Depth Estimation to Underwater Scenes. In ICRA 2026, Vienna, Austria
[15] Nonghai Zhang, Zeyu Zhang, Jiazi Wang, Yang Zhao, Hao Tang*. VaseVQA-3D: Benchmarking 3D VLMs on Ancient Greek Pottery. In ICLR 2026, Rio de Janeiro, Brazil
[16] Ting Huang, Zeyu Zhang, Yemin Wang, Hao Tang*. 3D Coca: Contrastive Learners Are 3D Captioners. In 3DV 2026, Vancouver, Canada
[17] Fanhu Zeng, Haiyang Guo, Fei Zhu*, Li Shen, Hao Tang*. RobustMerge: Parameter-Efficient Model Merging for MLLMs with Direction Robustness. In NeurIPS 2025, San Diego, USA
[18] Qinhua Xie, Hao Tang*. TTTFusion: A Test-Time Training-Based Strategy for Multimodal Medical Image Fusion in Surgical Robots. In IROS 2025, Hangzhou, China
[19] Xiaoyi Liu, Hao Tang*. DiffFNO: Diffusion Fourier Neural Operator. In CVPR 2025, Nashville, USA
[20] Renkai Wu, Xianjin Wang, Pengchen Liang, Zhenyu Zhang, Qing Chang*, Hao Tang*. Toward Zero-Shot Learning for Visual Dehazing of Urological Surgical Robots. In ICRA 2025, Atlanta, USA