Ziyang Chen

About

I am a Research Scientist at Luma AI, working on multimodal foundation models.
I obtained my Ph.D. in Computer Vision at the University of Michigan, where I was advised by Prof. Andrew Owens. Prior to that, I earned my Master's degree at the University of Michigan and my Bachelor's degree at Shanghai Jiao Tong University.

My primary research interests lie in multimodal learning and computer vision. I spent wonderful years in the computer vision group at Michigan. During my Ph.D., I was fortunate to intern at the Codec Avatars Lab at Meta and the SODA (Sound Design AI) Group at Adobe Research. I am honored to have collaborated with Prof. David Fouhey and Prof. Alex Wong.

Google Scholar · GitHub · CV · Twitter · Bluesky · LinkedIn

News

[2025/05] I am joining Luma AI as a full-time Research Scientist!
[2025/05] I defended my Ph.D. and graduated!
[2025/02] Three papers were accepted to CVPR 2025! See you in Nashville!
[2024/11] Check out our new work MultiFoley for creative Foley sound generation!

Research

Multimodal Learning

Self-Supervision

Audio in Metaverse

My research so far has focused on audio-visual learning, multimodal learning (vision, audio, language, touch), and computer vision with self-supervision. I am open to other topics as well.

Work Experience

Luma AI, June 2025 ~ Now
Research Scientist

Adobe Research, May 2024 ~ Nov. 2024
Research Scientist Intern
Host: Prem Seetharaman, Bryan Russell, and Justin Salamon

Codec Avatars Lab, Pittsburgh, Meta, May 2023 ~ Nov. 2023
Research Scientist Intern
Host: Israel D. Gebru, Christian Richardt, and Alexander Richard

Publications

(* indicates equal contribution)

Video-Guided Foley Sound Generation with Multimodal Controls
Ziyang Chen, Prem Seetharaman, Bryan Russell, Oriol Nieto, David Bourgin, Andrew Owens, Justin Salamon
CVPR 2025
project page · paper · bibtex

GPS as a Control Signal for Image Generation
Chao Feng, Ziyang Chen, Aleksander Holynski, Alexei A. Efros, Andrew Owens
CVPR 2025
project page · paper · code · bibtex

Supervising Sound Localization Using In-the-wild Ego-motion
Anna Min, Ziyang Chen, Hang Zhao, Andrew Owens
CVPR 2025 (Highlight)

Images that Sound: Composing Images and Sounds on a Single Canvas
Ziyang Chen, Daniel Geng, Andrew Owens
NeurIPS 2024
CVPR 2024 AI Art Gallery
project page · paper · code · bibtex

Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark
Ziyang Chen, Israel D. Gebru, Christian Richardt, Anurag Kumar, William Laney, Andrew Owens, Alexander Richard
CVPR 2024 (Highlight -- 2.8% accept rate)
project page · paper · dataset · bibtex

Binding Touch to Everything: Learning Unified Multimodal Tactile Representations
Fengyu Yang*, Chao Feng*, Ziyang Chen*, ...... , Andrew Owens, Alex Wong
CVPR 2024
project page · paper · code · bibtex

Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation
Ziyang Chen, Shengyi Qian, Andrew Owens
ICCV 2023
project page · paper · code · bibtex

Conditional Generation of Audio from Video via Foley Analogies
Yuexi Du, Ziyang Chen, Justin Salamon, Bryan Russell, Andrew Owens
CVPR 2023
project page · paper · code · bibtex

Self-Supervised Video Forensics by Audio-Visual Anomaly Detection
Chao Feng, Ziyang Chen, Andrew Owens
CVPR 2023 (Highlight -- 2.5% accept rate)
project page · paper · code · bibtex

Sound Localization by Self-Supervised Time Delay Estimation
Ziyang Chen, David F. Fouhey, Andrew Owens
ECCV 2022
project page · paper · slides · code · bibtex

Mix and Localize: Localizing Sound Sources in Mixtures
Xixi Hu*, Ziyang Chen*, Andrew Owens
CVPR 2022
project page · paper · slides · code · bibtex

Structure from Silence: Learning Scene Structure from Ambient Sound
Ziyang Chen*, Xixi Hu*, Andrew Owens
CoRL 2021 (Oral)
project page · paper · slides · code · bibtex

Professional Service

Reviewer:

Conference: WACV 2023, CVPR 2023, ICCV 2023, CVPR 2024, SIGGRAPH 2024, ECCV 2024, NeurIPS 2024, CVPR 2025, ICCV 2025
Journal: IJCV

Teaching Assistant:

VE312 Digital Integrated Circuits, Fall 2018, UM-SJTU Joint Institute. With Yaping Dan.
VE311 Electronic Circuits, Summer 2019, UM-SJTU Joint Institute. With Chang-Ching Tu.