About Me
I earned my Master's degree in Artificial Intelligence from Tsinghua University, where I conducted research under the supervision of Prof. Haoqian Wang and collaborated closely with Prof. Yebin Liu on 3D computer vision. Prior to this, I completed my B.Eng. in Measurement and Control Technology & Instruments at Southeast University. During my graduate studies, I also had the privilege of visiting Harvard University as a research intern, working with Prof. Hanspeter Pfister on computational imaging projects.
I am currently a Researcher at ByteDance, focusing on cutting-edge challenges in generative AI and embodied intelligence. My work bridges 3D vision with real-world applications, particularly in dynamic scene understanding and human-AI interaction.
Research Directions
Core Expertise: 3D computer vision (NeRF, 3D Gaussian Splatting, multi-view reconstruction).
Emerging Focus: Embodied AI-driven video generation, robot-scene interaction, and physics-aware simulation.
Technical Vision: Building scalable frameworks that connect 3D reconstruction, generative models (video/3D assets), and embodied agents for industrial applications.
Open Opportunities
I am actively recruiting research interns to collaborate on:
- 3D Content Creation: 3D Reconstruction, Video Generation, 3D Generation
- 3D Scene Perception: 3D Foundation Model
- Embodied AI: LLM/Vision-Language models for robot interaction, simulation environments
If you are seeking any form of academic cooperation, please feel free to email me at qinminghan1999@gmail.com.
News
- 2025.02: 2 papers accepted to CVPR 2025!!!
- 2024.09: 1 paper accepted to NeurIPS 2024!!!
- 2024.07: 1 paper accepted to ACM MM 2024!!!
- 2024.02: 2 papers accepted to ECCV 2024!!!
- 2024.02: LangSplat has been selected as a CVPR 2024 Highlight!!!
- 2024.02: 1 paper accepted to CVPR 2024!!!
- 2023.11: 1 paper accepted to AAAI 2024!!!
Selected Publications

LangSplat: 3D Language Gaussian Splatting
Minghan Qin*, Wanhua Li*†, Jiawei Zhou*, Haoqian Wang†, Hanspeter Pfister
- We introduce LangSplat, which constructs a 3D language field that enables precise and efficient open-vocabulary querying within 3D spaces.

4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models
Wanhua Li, Renping Zhou, Jiawei Zhou, Yingwei Song, Johannes Herter, Minghan Qin, Gao Huang, Hanspeter Pfister
- We present 4D LangSplat, an approach to constructing a dynamic 4D language field in evolving scenes, leveraging Multimodal Large Language Models.

HRAvatar: High-Quality and Relightable Gaussian Head Avatar
Dongbin Zhang, Yunfei Liu, Lijian Lin, Ye Zhu, Kangjie Chen, Minghan Qin, Yu Li†, Haoqian Wang†
- With monocular video input, HRAvatar reconstructs a high-quality, animatable 3D head avatar that enables realistic relighting effects and simple material editing.

HDR-GS: Efficient High Dynamic Range Novel View Synthesis at 1000x Speed via Gaussian Splatting
Yuanhao Cai, Zihao Xiao, Yixun Liang, Minghan Qin, Yulun Zhang, Xiaokang Yang, Yaoyao Liu, Alan Yuille
- The first 3D Gaussian Splatting-based method for high dynamic range imaging.

Animatable 3D Gaussian: Fast and High-Quality Reconstruction of Multiple Human Avatars
Yang Liu, Xiang Huang, Minghan Qin, Qinwei Lin, Haoqian Wang (* indicates equal contribution)
- We propose Animatable 3D Gaussian, a novel neural representation for fast and high-fidelity reconstruction of multiple animatable human avatars, which can be animated and rendered at interactive rates.

Gaussian in the Wild: 3D Gaussian Splatting for Unconstrained Image Collections
Dongbin Zhang*, Chuming Wang*, Weitao Wang, Peihao Li, Minghan Qin, Haoqian Wang†
- We extend 3D Gaussian Splatting with separated intrinsic and dynamic appearance to reconstruct scenes from uncontrolled images, achieving high-quality results and a 1000× increase in rendering speed.

Category-level Object Detection, Pose Estimation and Reconstruction from Stereo Images
Chuanrui Zhang*, Yonggen Ling*†, Minglei Lu, Minghan Qin, Haoqian Wang†
- We present CODERS, a one-stage approach for Category-level Object Detection, pose Estimation and Reconstruction from Stereo images.

Minghan Qin*, Yifan Liu*, Yuelang Xu, Xiaochen Zhao, Yebin Liu†, Haoqian Wang†
- We introduce a novel Spatially-Varying Expression (SVE) conditioning, encompassing both spatial positional features and global expression information.
Honors and Awards
- Scholarship, Tsinghua University, 2023.
- National 1st Award, the 10th BD-CASTIC, 2019.
Research Experience
- 2023.09 - 2024.04, Harvard University - VCG Lab - Computer Vision Group. I had a great time working with Prof. Hanspeter Pfister.
Academic Service
Reviewer for: CVPR, ECCV, ICCV, NeurIPS, ACM MM, AAAI