Yichen Li
I am a final-year Ph.D. candidate in EECS at MIT CSAIL, advised by Prof. Antonio Torralba.
My Ph.D. research focuses on generative foundation models for multimodal perception and interactive world understanding.
I enjoy principled development in the following areas:
Multimodal and World Models: approaches drawing on video models, interactions, and multimodality.
Post-training and RL: generic RL algorithms for faster convergence with zeroth-order methods and dense backpropagation.
Model Architecture: effective learning mechanisms and hardware-inspired architecture design.
Recognizing the difficulty of academic ML research, I started the Open Research Seeds effort to help junior students. Before coming to MIT, I worked with Prof. Leonidas Guibas and Prof. Gordon Wetzstein at Stanford.
Email: yichenl [at] mit [dot] edu
Google Scholar / Twitter / GitHub
Photo credit: Jiayuan Mao
Multimodal & World Model
RL Post-Training
Model Architecture
Open Research Seeds
The Open Research Seeds effort offers deliberately unconventional research ideas to help junior students get started in resource-scarce academia.
Technical and Perspective Blogs
Publications
Advantage Weighted Matching: Aligning RL with Pretraining in Diffusion Models
rl
Shuchen Xue,
Chongjian Ge,
Shilong Zhang,
Yichen Li,
Zhi-Ming Ma
ICML, 2026
[paper]
[code]
MultiModal Action Conditioned Video Generation
video
multimodal
Yichen Li,
Antonio Torralba
ICCV, 2025
[paper]
[project page]
[code]
Generalized Dynamics Generation towards Physical World Model
physics
video
robotics
Yichen Li,
Zhiyi Li,
Brandon Feng,
Antonio Torralba
Preprint, 2025
[paper]
[project page]
Learning to Jointly Understand Visual and Tactile Signals
multimodal
Yichen Li,
Yilun Du,
Chao Liu,
Chao Liu,
Mike Foshey,
Francis Williams,
Joshua B. Tenenbaum,
Wojciech Matusik,
Antonio Torralba
ICLR, 2024
[paper]
[project page]
[dataset]
Category-Level Multi-Part Multi-Joint 3D Shape Assembly
robotics
Yichen Li,
Kaichun Mo,
Yueqi Duan,
He Wang,
Jiequan Zhuang,
Lin Shao,
Wojciech Matusik,
Leonidas Guibas
CVPR, 2024
[paper]
[project page]
[data]
[code]
Learning Preconditioners for Conjugate Gradient PDE Solver
physics
Yichen Li,
Peter Yichen Chen,
Tao Du,
Wojciech Matusik
ICML, 2023
[paper]
[video]
[project page]
[code]
Revisiting Image-Language for Open-ended Phrase Detection
multimodal
Bryan Plummer,
Kevin Shih,
Yichen Li,
Ke Xu,
Svetlana Lazebnik,
Stan Sclaroff,
Kate Saenko
TPAMI, 2019
[paper]
ASAP: Automated Sequence Planning for Complex Assembly with Physical Feasibility
robotics
Yunsheng Tian,
Karl D.D. Willis,
Bassel Al Omari,
Jieliang Luo,
Pingchuan Ma,
Yichen Li,
Farhad Javid,
Edward Gu,
Joshua Jacob,
Shinjiro Sueda,
Hui Li,
Sachin Chitta,
Wojciech Matusik
ICRA, 2024
[paper]
[project page]
[dataset]
[code]
Assemble Them All: Physics-Based Planning for Generalizable Assembly by Disassembly
robotics
Yunsheng Tian,
Jie Xu,
Yichen Li,
Jieliang Luo,
Shinjiro Sueda,
Hui Li,
Karl D.D. Willis,
Wojciech Matusik
SIGGRAPH Asia, 2022
[paper]
[project page]
[code]
3D Part Assembly from A Single Image
other
Yichen Li*,
Kaichun Mo*,
Lin Shao,
Minhyuk Sung,
Leonidas Guibas
ECCV, 2020
[paper]
[project page]
[code]
Domain2Vec: Domain Embedding for Unsupervised Domain Adaptation
other
Xingchao Peng*,
Yichen Li*,
Kate Saenko
ECCV, 2020
[paper]
[project page]
[code]
Professional Experience
NVIDIA Research — Summer 2024
Built a unified physics simulation framework for soft, articulated, and rigid body dynamics.
Designed anisotropic Young's modulus learning across physics regimes.
Achieved a 51% error reduction over the team baseline.
Adobe Research — Summer 2025
Built RL post-training systems for video diffusion models, including zeroth-order evolutionary optimization and dense per-frame reward methods.
NVIDIA Research — Summer 2023
Built a Gaussian kernel-based architecture for fast point cloud processing as a PointNet replacement.
Adobe Research — Summer 2021
Built a video layer decomposition system using source separation methods.
NVIDIA Research — Summer 2020
Built a point cloud completion system utilizing raycasting-based data generation. US Patent filed.
Academic Services
Workshop Organizer: CVPR 2026 Multimodal Learning, Sense of Space, RSS 2025: Multimodal & MultiSensory Robotics, ECCV 2024: Geometry in the Large Model Era
Conference Reviewer: CVPR, ICCV, ECCV, ICML, ICLR, NeurIPS, ACM SIGGRAPH
Journal Reviewer: ACM TOG, IEEE TPAMI
Awards
Robert J. Shillman Fellowship
College Prize For Excellence in Computer Science (GPA Rank: 1st)