Physical AI Research
About
We are dedicated to advancing the next generation of mobility systems through the integration of artificial intelligence (AI), digital twins, and engineering-driven methodologies. Our mission is to build intelligent, efficient, and resilient transportation systems by applying AI techniques such as deep learning, reinforcement learning, and multi-modal foundation models, together with system-level engineering and mixed-reality environments, to address critical functions including perception, prediction, decision-making, and control.
Our research spans both foundational and application-driven challenges across a diverse range of intelligent mobility platforms, including autonomous vehicles, cooperative driving automation (CDA) systems, mobile robots, and smart infrastructure. We focus on problems such as optimal sensor placement, scenario engineering, and mixed-reality experimentation to enable robust, scalable, and interpretable mobility solutions. These efforts are powered by extensive GPU computing resources and supported by a comprehensive ecosystem of digital twin technologies, CDA software systems, and real-world testing platforms, bridging simulation and deployment to accelerate development and validation in complex and dynamic environments.
The following highlights our main research areas and representative ongoing projects. For a comprehensive list of our work, please refer to our publications, which include over 20 peer-reviewed papers and technical reports on these topics.
AOI 1 – Foundation Models for Autonomous Driving
Understanding diverse, dynamic driving scenes and interacting with them safely remain central challenges in autonomous driving. The extensive world knowledge and strong reasoning capabilities of large language models (LLMs) and vision-language models (VLMs) introduce a new paradigm for scene understanding and safe driving. This project aims to develop a foundation model for autonomous driving with adaptive reasoning and high-quality action generation, enhancing generalization and driving performance, particularly in long-tail and challenging scenarios. Below are some of our works:
- AutoVLA: A Vision-Language-Action Model for End-to-End Autonomous Driving with Adaptive Reasoning and Reinforcement Fine-Tuning (Under Review)
- Driving with Regulation: Interpretable Decision-Making for Autonomous Vehicles with Retrieval-Augmented Reasoning via LLM (Under Review)
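To make the retrieval-augmented reasoning idea above more concrete, here is a minimal sketch of the general pattern: retrieve the regulations relevant to the current scene, fold them into a prompt, and let a foundation model choose an action. The regulation snippets, function names, and the stubbed-out model call are illustrative assumptions, not the project's actual implementation.

```python
# Minimal sketch of retrieval-augmented decision-making (hypothetical names throughout;
# the LLM call is stubbed and the "regulation database" is a toy dictionary).
from dataclasses import dataclass

REGULATIONS = {
    "school_zone": "Reduce speed when children are present.",
    "stop_sign": "Come to a complete stop and yield before proceeding.",
    "crosswalk": "Yield to pedestrians within or entering a crosswalk.",
}

@dataclass
class SceneDescription:
    objects: list[str]        # detected semantic elements of the scene
    ego_speed_mph: float

def retrieve_regulations(scene: SceneDescription) -> list[str]:
    """Keyword-style retrieval: return regulations matching detected scene elements."""
    return [text for key, text in REGULATIONS.items() if key in scene.objects]

def query_llm(prompt: str) -> str:
    """Placeholder for a foundation-model (LLM/VLM) call; returns a canned answer here."""
    return "DECELERATE  # rationale: stop sign detected, regulation requires a full stop"

def decide(scene: SceneDescription) -> str:
    context = "\n".join(retrieve_regulations(scene))
    prompt = (
        f"Scene objects: {scene.objects}; ego speed: {scene.ego_speed_mph} mph.\n"
        f"Relevant regulations:\n{context}\n"
        "Choose one action from {ACCELERATE, KEEP, DECELERATE, STOP} and explain why."
    )
    return query_llm(prompt)

if __name__ == "__main__":
    print(decide(SceneDescription(objects=["stop_sign", "crosswalk"], ego_speed_mph=30.0)))
```

Because the retrieved regulations appear verbatim in the prompt, the resulting decision can be traced back to specific rules, which is the interpretability benefit the project targets.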
AOI 2 – Multi-Agent Perception, Prediction, and Planning
Safe and efficient autonomous driving hinges on accurate and comprehensive scene understanding. To overcome the limited perception range and occlusions inherent to single-agent systems, multi-agent systems share information across agents and frames, enabling holistic spatio-temporal scene understanding. This project focuses on constructing a real-world multi-agent sequential dataset, designing an end-to-end multi-agent fusion framework, and developing cooperative perception, prediction, and planning models. Some of our works are listed below:
- V2XPnP: Vehicle-to-Everything Spatio-Temporal Fusion for Multi-Agent Perception and Prediction (ICCV 2025)
- TurboTrain: Towards Efficient and Balanced Multi-Task Learning for Multi-Agent Perception and Prediction (ICCV 2025)
- CooPre: Cooperative Pretraining for V2X Cooperative Perception (IROS 2025 & Best Paper Award at the CVPR 2025 DriveX Workshop)
- CooperRisk: A Driving Risk Quantification Pipeline with Multi-Agent Cooperative Perception and Prediction (IROS 2025)
- Risk Map As Middleware: Towards Interpretable Cooperative End-to-end Autonomous Driving for Risk-Aware Planning (Under Review)
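As a rough illustration of the multi-agent fusion described above, the sketch below performs the basic geometric step: warping detections shared by a neighboring agent into the ego vehicle's coordinate frame before merging. The poses, detections, and naive union are made-up assumptions standing in for the learned spatio-temporal fusion used in the actual models.

```python
# Minimal sketch (not the V2XPnP pipeline): transform a neighbor's detections into the
# ego frame using 2D homogeneous transforms, then take a naive union with ego detections.
import numpy as np

def pose_to_matrix(x: float, y: float, yaw: float) -> np.ndarray:
    """2D homogeneous transform from an agent's local frame to the world frame."""
    c, s = np.cos(yaw), np.sin(yaw)
    return np.array([[c, -s, x],
                     [s,  c, y],
                     [0,  0, 1]])

def to_ego_frame(points: np.ndarray, agent_pose: np.ndarray, ego_pose: np.ndarray) -> np.ndarray:
    """Map Nx2 points from a neighboring agent's frame into the ego frame."""
    T = np.linalg.inv(ego_pose) @ agent_pose          # neighbor -> world -> ego
    homog = np.hstack([points, np.ones((len(points), 1))])
    return (homog @ T.T)[:, :2]

# Ego at the origin; neighbor 20 m ahead, rotated 90 degrees (illustrative values).
ego_pose = pose_to_matrix(0.0, 0.0, 0.0)
neighbor_pose = pose_to_matrix(20.0, 0.0, np.pi / 2)

# Detection centers reported by the neighbor in its own frame.
neighbor_dets = np.array([[5.0, 0.0], [10.0, -2.0]])

fused = np.vstack([
    np.array([[3.0, 1.0]]),                           # ego's own detections (ego frame)
    to_ego_frame(neighbor_dets, neighbor_pose, ego_pose),
])
print(fused)  # a naive union; a real system also deduplicates and fuses features over time
```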
AOI 3 – Mixed-Reality Testing
Reliable and scalable autonomous driving requires rigorous testing across diverse scenarios and system configurations. Traditional validation approaches often lack the flexibility, realism, and interoperability needed to evaluate complex cooperative behaviors. Mixed-reality testing addresses these limitations by combining simulation, digital twins, and real-world testing assets into a unified framework. This project focuses on building an end-to-end testing pipeline that supports both isolated and collaborative evaluation of CDA systems and autonomous driving algorithms. It enables testing across hybrid environments, facilitates distributed multi-agent coordination, and supports systematic validation of perception, decision-making, and control under real-world constraints. Some of our works in the area are:
- OpenCDA Full-Stack CDA System (ITSC 2022)
- Integrated Highway Prototype with Platooning Algorithm (Transportation Research Part C 2023)
- Traffic Regulation-Aware Path Planning (ICRA 2025)
- VOICES Distributed Testing Initiative (ITSC 2025)
- UCLA Smart Intersection Digital Twin (ITSC 2025)
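To give a sense of how a mixed-reality pipeline keeps its virtual and physical sides in step, below is a minimal lockstep loop in which a simulated world and a real-world data stream advance on a shared clock. The SimulatedWorld and RealWorldFeed interfaces are hypothetical stand-ins and do not reflect the OpenCDA or VOICES APIs.

```python
# Minimal sketch of a mixed-reality synchronization loop (all interfaces are assumptions).
import time

class SimulatedWorld:
    """Stand-in for a simulator / digital-twin backend stepped at a fixed dt."""
    def __init__(self):
        self.t = 0.0
    def step(self, dt: float) -> dict:
        self.t += dt
        return {"sim_time": self.t, "virtual_vehicle_x": 5.0 * self.t}

class RealWorldFeed:
    """Stand-in for a live sensor / vehicle-state stream from the physical test asset."""
    def poll(self) -> dict:
        return {"wall_time": time.time(), "real_vehicle_x": 0.0}

def run_mixed_reality_episode(steps: int = 5, dt: float = 0.1) -> None:
    sim, real = SimulatedWorld(), RealWorldFeed()
    for _ in range(steps):
        sim_state = sim.step(dt)      # advance the virtual agents
        real_state = real.poll()      # sample the physical side
        # Inject the real vehicle into the virtual scene so virtual agents can react to it;
        # a full pipeline would also feed virtual agents back to the real vehicle's planner.
        merged = {**sim_state, **real_state}
        print(merged)
        time.sleep(dt)                # keep simulated time roughly aligned with wall clock

if __name__ == "__main__":
    run_mixed_reality_episode()
```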
AOI 4 – Infrastructure Multi-Modal Perception
Infrastructure-based perception is essential for improving safety and situational awareness in complex traffic environments such as intersections. However, challenges such as diverse camera layouts, occlusions, and environmental variability remain under-explored. This project aims to develop a scalable infrastructure perception framework that enables effective sensor placement evaluation and robust multi-camera fusion. It introduces a set of perception-driven metrics for evaluating sensor configurations and a transformer-based BEV perception model with graph-based fusion to handle diverse camera views. The system is designed to be resilient to sensor degradation and to perform consistently across diverse intersection layouts and traffic scenarios. Our works in the area include:
- InSPE: Rapid Evaluation of Heterogeneous Multi-Modal Infrastructure Sensor Placement: Infrastructure sensing is critical for traffic monitoring and cooperative perception at safety-critical locations such as intersections but remains under-explored due to its unique challenges. This project introduces a scalable evaluation framework that leverages multi-modal sensors, surrogate metrics, and simulation.
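As a simplified example of a perception-driven surrogate metric for sensor placement, the sketch below scores candidate layouts by the fraction of a discretized intersection area covered by at least one sensor. The metric, layouts, and parameters are assumptions for illustration only and are not the scoring used in InSPE.

```python
# Minimal sketch of a coverage-style surrogate for comparing sensor placements
# (hypothetical metric and values; a perception-driven framework would use richer scores).
import numpy as np

def coverage_ratio(sensor_xy: np.ndarray, area_size: float = 60.0,
                   cell: float = 1.0, sensor_range: float = 40.0) -> float:
    """Fraction of grid cells within range of at least one sensor."""
    xs = np.arange(0.0, area_size, cell)
    gx, gy = np.meshgrid(xs, xs)
    cells = np.stack([gx.ravel(), gy.ravel()], axis=1)                 # (N, 2) cell centers
    d = np.linalg.norm(cells[:, None, :] - sensor_xy[None, :, :], axis=2)
    return float(np.mean(d.min(axis=1) <= sensor_range))

# Two candidate layouts for a 60 m x 60 m intersection area (corner- and center-mounted).
layout_a = np.array([[0.0, 0.0], [60.0, 60.0]])
layout_b = np.array([[0.0, 60.0], [60.0, 0.0], [30.0, 30.0]])

for name, layout in [("A", layout_a), ("B", layout_b)]:
    print(name, coverage_ratio(layout))   # prefer the layout with the higher surrogate score
```

Surrogate scores like this one let many candidate configurations be ranked cheaply in simulation before committing to a physical installation.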