Hao Tang (唐浩)

Title: Assistant Professor / Researcher

Institute: Institute of Video and Vision Technology

Research areas: artificial intelligence, embodied intelligence

Email: haotangpku.edu.cn

Homepage: https://ha0tang.github.io/


Main Research Directions

Generative models, world models, spatial intelligence, large language models, multimodal large models, computer vision


Research and Education Experience

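Hao Tang, Ph.D., is an Assistant Professor/Researcher and doctoral supervisor at the School of Computer Science, Peking University, where he is also a Boya Young Scholar and a Weiming Young Scholar. He heads the PKU Embodied and Generative Intelligence Lab and serves as a research mentor for the PKU Turing Class. He carried out postdoctoral research at Carnegie Mellon University (CMU) in the United States and at ETH Zurich in Switzerland, and received his Ph.D. from the University of Trento in Italy. He has also held academic visits and research internships at the University of Oxford (UK), the National University of Singapore, Northeastern University (USA), IIAI (UAE), and other institutions. In addition, he actively promotes collaboration between industry, academia, and research, and has served as a senior technical advisor to startups in the United States, the United Kingdom, Romania, and China.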


Major Honors and Awards

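Selected for a national-level overseas high-level talent program; recipient of the National Outstanding Overseas Student Award (returnee category); and listed among Stanford University's World's Top 2% Scientists for three consecutive years (2023–2025).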


Selected Publications (* Corresponding Author(s))

[1] Jiawei Mao, Yu Yang, Xuesong Yin, Ling Shao, Hao Tang*. AllRestorer: All-in-One Transformer for Image Restoration under Composite Degradations. IEEE TPAMI, 2026

[2] Hao Tang, Ling Shao, Zhenyu Zhang, Luc Van Gool, Nicu Sebe. Spatial-Temporal Graph Mamba for Music-Guided Dance Video Synthesis. IEEE TPAMI, 2025

[3] Hao Tang, Ling Shao, Nicu Sebe, Luc Van Gool. Enhanced Multi-Scale Cross-Attention for Person Image Generation. IEEE TPAMI, 2025

[4] Hao Tang, Ling Shao, Nicu Sebe, Luc Van Gool. Graph Transformer GANs with Graph Masked Modeling for Architectural Layout Generation. IEEE TPAMI, 2024

[5] Hui Wei, Hao Tang, Xuemei Jia, Zhixiang Wang, Hanxun Yu, Zhubo Li, Shin'ichi Satoh, Luc Van Gool, Zheng Wang. Physical Adversarial Attack Meets Computer Vision: A Decade Survey. IEEE TPAMI, 2024

[6] Hao Tang, Guolei Sun, Nicu Sebe, Luc Van Gool. Edge Guided GANs with Multi-Scale Contrastive Learning for Semantic Image Synthesis. IEEE TPAMI, 2023

[7] Hao Tang, Philip HS Torr, Nicu Sebe. Multi-Channel Attention Selection GANs for Guided Image-to-Image Translation. IEEE TPAMI, 2022

[8] Hao Tang, Ling Shao, Philip HS Torr, Nicu Sebe. Local and Global GANs with Semantic-Aware Upsampling for Image Generation. IEEE TPAMI, 2022

[9] Songtao Li, Hao Tang*. Multimodal Alignment and Fusion: A Survey. Springer IJCV, 2025

[10] Hao Tang, Ling Shao, Philip HS Torr, Nicu Sebe. Bipartite Graph Reasoning GANs for Person Pose and Facial Image Synthesis. Springer IJCV, 2022

[11] Hongpeng Wang, Zeyu Zhang, Wenhao Li, Hao Tang*. MoRL: Reinforced Reasoning for Unified Motion Understanding and Generation. In ACL 2026, San Diego, USA

[12] Yuxuan Fan, Jing Hao, Hong Chen, Jiahao Bao, Yihua Shao, Yuci Liang, Kuo Feng Hung, Hao Tang*. OralGPT-Plus: Learning to Use Visual Tools via Reinforcement Learning for Panoramic X-ray Analysis. In CVPR 2026, Denver, USA

[13] Jun Liu, Zhenglun Kong, Peiyan Dong, Changdi Yang, Tianqi Li, Hao Tang*, et al. Structured Agent Distillation for Large Language Model Agents. In AAMAS 2026, Paphos, Cyprus

[14] Zhengri Wu, Yiran Wang, Yu Wen, Zeyu Zhang, Biao Wu, Hao Tang*. StereoAdapter: Adapting Stereo Depth Estimation to Underwater Scenes. In ICRA 2026, Vienna, Austria

[15] Nonghai Zhang, Zeyu Zhang, Jiazi Wang, Yang Zhao, Hao Tang*. VaseVQA-3D: Benchmarking 3D VLMs on Ancient Greek Pottery. In ICLR 2026, Rio de Janeiro, Brazil

[16] Ting Huang, Zeyu Zhang, Yemin Wang, Hao Tang*. 3D Coca: Contrastive Learners Are 3D Captioners. In 3DV 2026, Vancouver, Canada

[17] Fanhu Zeng, Haiyang Guo, Fei Zhu*, Li Shen, Hao Tang*. RobustMerge: Parameter-Efficient Model Merging for MLLMs with Direction Robustness. In NeurIPS 2025, San Diego, USA

[18] Qinhua Xie, Hao Tang*. TTTFusion: A Test-Time Training-Based Strategy for Multimodal Medical Image Fusion in Surgical Robots. In IROS 2025, Hangzhou, China

[19] Xiaoyi Liu, Hao Tang*. DiffFNO: Diffusion Fourier Neural Operator. In CVPR 2025, Nashville, USA

[20] Renkai Wu, Xianjin Wang, Pengchen Liang, Zhenyu Zhang, Qing Chang*, Hao Tang*. Toward Zero-Shot Learning for Visual Dehazing of Urological Surgical Robots. In ICRA 2025, Atlanta, USA