Short Bio
I am building AI agents at Yutori. Previously, I worked at Meta for 7.5 years, where I built multi-modality Llama models and worked on face tracking in AR/VR and the creation of stylized and photorealistic avatars.
I obtained my Ph.D. in Electronic Engineering at The Chinese University of Hong Kong, advised by Prof. Xiaogang Wang. I graduated from Tsinghua University with a B.Eng. degree in Computer Science.
I am passionate about developing multi-modality foundation models and applying them to extend human capabilities, build autonomous machines, and, ultimately, engineer humanoids with general intelligence.
Recent News
- (2025/11) We introduced Yutori Navigator, a state-of-the-art AI web agent.
- (2025/02) I joined Yutori to build AI agents for the web.
Publications
Geometric Correspondence Fields: Learned Differentiable Rendering for 3D Pose Refinement in the Wild
European Conference on Computer Vision (ECCV), 2020
Video Person Re-Identification With Competitive Snippet-Similarity Aggregation and Co-Attentive Snippet Embedding
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018
Identity-Aware Textual-Visual Matching with Latent Co-attention
IEEE International Conference on Computer Vision (ICCV), 2017
Learning Deep Neural Networks for Vehicle Re-ID with Visual-spatio-temporal Path Proposals
IEEE International Conference on Computer Vision (ICCV), 2017
Object Detection in Videos with Tubelet Proposal Networks
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017
















