I am a Research Scientist at Luma AI, working on multimodal foundation models.
I obtained my Ph.D. in Computer Vision at the University of Michigan, where I was
advised by Prof. Andrew Owens. Prior to that, I earned my Master's degree at the University of Michigan and my Bachelor's degree at Shanghai Jiao Tong University.
My primary research interests lie in multimodal learning and computer vision. I spent wonderful years in the computer vision group @ Michigan. During my Ph.D., I was fortunate to intern at the Codec Avatars Lab at Meta and the SODA (Sound Design AI) Group at Adobe Research. I am honored to collaborate with Prof. David Fouhey and Prof. Alex Wong.
My research so far has focused on audio-visual learning, multimodal learning (vision, audio, language, touch), and self-supervised computer vision. I am open to other topics as well.
(* indicates equal contribution)