Pu Cao (曹朴)

Email: caopu@bupt.edu.cn

I received my bachelor’s degree from the University of Science and Technology Beijing (USTB), Beijing, China, in 2022. Since 2022, I have been a Ph.D. candidate at the School of Intelligent Engineering and Automation, Beijing University of Posts and Telecommunications (BUPT), and I expect to graduate in summer 2027.

My research interests include multimodal understanding and generation, with a particular focus on multimodal large language models and diffusion models.

Experience
2022.09–2027.06
Beijing University of Posts and Telecommunications Ph.D.
School of Intelligent Engineering & Automation · Control Science and Engineering
2018.09–2022.06
University of Science and Technology Beijing B.S.
School of Mathematics and Physics · Information and Computing Science

News

Publications

IEEE TPAMI 2025

Controllable Generation with Text-to-Image Diffusion Models: A Survey

Pu Cao, Feng Zhou, Qing Song, Lu Yang

PDF · Code

arXiv:2512.01426

ResDiT: Evoking the Intrinsic Resolution Scalability in Diffusion Transformers

Yiyang Ma, Feng Zhou, Xuedan Yin, Pu Cao, Yonghao Dang, Jianqin Yin

PDF · arXiv

AAAI 2025 (Oral)

Exploring Position Encoding in Diffusion U-Net for Training-free High-resolution Image Generation

Feng Zhou*, Pu Cao*, Yiyang Ma, Lu Yang, Jianqin Yin

PDF

arXiv:2510.17479

Initialize to Generalize: A Stronger Initialization Pipeline for Sparse-View 3DGS

Feng Zhou, Wenkai Guo, Pu Cao, Zhicheng Zhang, Jianqin Yin

PDF · arXiv

IEEE TCSVT 2025

OMEGAS: Object Mesh Extraction from Large Scenes Guided by Gaussian Segmentation

Lizhi Wang*, Feng Zhou*, Bo Yu, Pu Cao, Jianqin Yin

PDF · Code

Pattern Recognition 2026

Quality Transformer for Human Parsing

Yao Guo, Lu Yang, Pu Cao, Shan Li, Yilin Zhou, Qing Song

PDF

arXiv:2505.05501

Preliminary Explorations with GPT-4o(mni) Native Image Generation

Pu Cao, Feng Zhou*, Junyi Ji*, Qingye Kong*, Zhixiang Lv*, Mingjian Zhang*, Xuekun Zhao*, Siqi Wu, Yinghui Lin, Qing Song, Lu Yang

PDF

CVPR 2025

Image is All You Need to Empower Large-scale Diffusion Models for In-Domain Generation

Pu Cao*, Feng Zhou*, Lu Yang, Tianrui Huang, Qing Song

PDF · Code

IEEE TCSVT 2025

E4C: Enhance Editability for Text-Based Image Editing by Harnessing Efficient CLIP Guidance

Tianrui Huang*, Pu Cao*, Lu Yang, Chun Liu, Mengjie Hu, Zhiwei Liu, Qing Song

PDF

IEEE TMM 2024

Frequency-Based Matcher for Long-Tailed Semantic Segmentation

Shan Li, Lu Yang, Pu Cao, Liulei Li, Huadong Ma

PDF

WACV 2024

What Decreases Editing Capability? Domain-Specific Hybrid Refinement for Improved GAN Inversion

Pu Cao, Lu Yang, Dongxv Liu, Xiaoya Yang, Tianrui Huang, Qing Song

PDF · Code

Projects

Service

Reviewer
Conferences
  • ICLR 2026
  • CVPR 2025–2026
  • ICCV 2025
  • ECCV 2024 (Outstanding Reviewer Award)
  • WACV 2024–2026
Journals
  • TPAMI
  • TIP
  • TCSVT
  • TMM
  • TNNLS

Contact

Send me a quick email

caopu@bupt.edu.cn