Physical AI Research

About


We are dedicated to advancing the next generation of mobility systems through the integration of artificial intelligence (AI), digital twins, and engineering-driven methodologies. Our mission is to build intelligent, efficient, and resilient transportation systems by applying AI techniques such as deep learning, reinforcement learning, and multi-modal foundation models, together with system-level engineering approaches and mixed-reality environments, to address critical functions including perception, prediction, decision-making, and control.

Our research spans both foundational and application-driven challenges across a diverse range of intelligent mobility platforms, including autonomous vehicles, cooperative driving automation (CDA) systems, mobile robots, and smart infrastructure. We focus on problems such as optimal sensor placement, scenario engineering, and mixed-reality experimentation to enable robust, scalable, and interpretable mobility solutions. These efforts are powered by extensive GPU computing resources and supported by a comprehensive ecosystem of digital twin technologies, CDA software systems, and real-world testing platforms, which together bridge simulation and deployment to accelerate development and validation in complex, dynamic environments.

The following highlights our main research areas and representative ongoing projects. For a comprehensive list of our work, please refer to our publications, which include over 20 peer-reviewed papers and technical reports on these topics.


AOI 1 – Foundation Models for Autonomous Driving

Understanding diverse, dynamic driving scenes and interacting with them safely remains a central challenge in autonomous driving. The extensive world knowledge and strong reasoning capabilities of large language models (LLMs) and vision-language models (VLMs) introduce a new paradigm for scene understanding and safe driving. This project aims to develop a foundation model for autonomous driving with adaptive reasoning and high-quality action generation, improving generalization and driving performance, particularly in long-tail and challenging scenarios. Representative works in this area are listed in our publications.
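To make the idea concrete, here is a minimal, illustrative sketch of how a structured scene description might be serialized into a prompt for a language or vision-language model, and how a structured driving action could be parsed from its reply. The object fields, prompt wording, action schema, and the query_model placeholder are assumptions for illustration, not our actual model interface.

# Hedged sketch: serialize a driving scene into a text prompt for an LLM/VLM and
# parse a structured action from its reply. The model call is a placeholder; the
# scene fields and action schema are illustrative assumptions.
import json
from dataclasses import dataclass

@dataclass
class SceneObject:
    category: str       # e.g., "vehicle", "pedestrian"
    distance_m: float   # range from the ego vehicle
    bearing_deg: float  # angle relative to ego heading

def build_prompt(objects, ego_speed_mps):
    lines = [f"Ego speed: {ego_speed_mps:.1f} m/s. Detected objects:"]
    for o in objects:
        lines.append(f"- {o.category} at {o.distance_m:.0f} m, bearing {o.bearing_deg:.0f} deg")
    lines.append('Reply with JSON: {"maneuver": ..., "target_speed_mps": ..., "reason": ...}')
    return "\n".join(lines)

def query_model(prompt: str) -> str:
    # Placeholder for an LLM/VLM call; returns a canned reply so the sketch runs.
    return '{"maneuver": "yield", "target_speed_mps": 2.0, "reason": "pedestrian ahead"}'

if __name__ == "__main__":
    scene = [SceneObject("pedestrian", 12.0, 5.0), SceneObject("vehicle", 40.0, -10.0)]
    action = json.loads(query_model(build_prompt(scene, ego_speed_mps=8.3)))
    print(action["maneuver"], action["target_speed_mps"])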


AOI 2 – Multi-Agent Perception, Prediction, and Planning

Safe and efficient autonomous driving hinges on accurate and comprehensive scene understanding. Single-agent systems are limited by perception range and occlusions; multi-agent systems overcome these limits by sharing information across agents and frames, enabling holistic spatio-temporal scene understanding. This project focuses on constructing a real-world multi-agent sequential dataset, designing an end-to-end multi-agent fusion framework, and developing cooperative perception, prediction, and planning models. Representative works in this area are listed in our publications.
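As a rough illustration of multi-agent feature fusion, the sketch below weights and combines bird's-eye-view (BEV) feature maps shared by several agents, assuming the maps have already been warped into the ego frame. The per-cell attention weighting and the BEVAttentionFusion module are generic assumptions, not the specific architecture used in our framework.

# Hedged sketch: fuse BEV feature maps from several cooperating agents with a
# learned per-cell attention weight. Assumes maps are already ego-aligned.
import torch
import torch.nn as nn

class BEVAttentionFusion(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # 1x1 conv scores how informative each agent's features are at every BEV cell
        self.score = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (num_agents, C, H, W) BEV features aligned to the ego frame
        scores = self.score(feats)              # (A, 1, H, W)
        weights = torch.softmax(scores, dim=0)  # normalize across agents at each cell
        fused = (weights * feats).sum(dim=0)    # (C, H, W) fused BEV map
        return fused

if __name__ == "__main__":
    fusion = BEVAttentionFusion(channels=64)
    shared = torch.randn(3, 64, 128, 128)  # features from ego + two cooperating agents
    print(fusion(shared).shape)            # torch.Size([64, 128, 128])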


AOI 3 – Mixed-Reality Testing

Reliable and scalable autonomous driving requires rigorous testing across diverse scenarios and system configurations. Traditional validation approaches often lack the flexibility, realism, and interoperability needed to evaluate complex cooperative behaviors. Mixed-reality testing addresses these limitations by combining simulation, digital twins, and real-world testing assets into a unified framework. This project focuses on building an end-to-end testing pipeline that supports both isolated and collaborative evaluation of CDA systems and autonomous driving algorithms. It enables testing across hybrid environments, facilitates distributed multi-agent coordination, and supports systematic validation of perception, decision-making, and control under real-world constraints. Representative works in this area are listed in our publications.
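A minimal sketch of the underlying idea follows, under the assumption that simulated and physical agents can be exposed through a common state interface so the system under test observes a single merged traffic scene. The AgentSource classes, state format, and run_mixed_reality_episode loop are hypothetical placeholders rather than our actual pipeline.

# Hedged sketch: a mixed-reality test loop where simulated and physical agents share
# one interface; class names and the state format are illustrative assumptions.
from abc import ABC, abstractmethod

class AgentSource(ABC):
    @abstractmethod
    def get_state(self, t: float) -> dict:
        """Return the agent pose at time t, whether the agent is simulated or real."""

class SimulatedVehicle(AgentSource):
    def get_state(self, t: float) -> dict:
        return {"id": "sim-1", "x": 5.0 * t, "y": 0.0, "virtual": True}

class RealVehicleFeed(AgentSource):
    def get_state(self, t: float) -> dict:
        # In practice this would read a localization message from the physical vehicle.
        return {"id": "real-1", "x": 0.0, "y": -3.5 + 2.0 * t, "virtual": False}

def run_mixed_reality_episode(sources, system_under_test, steps=10, dt=0.1):
    for k in range(steps):
        t = k * dt
        scene = [s.get_state(t) for s in sources]  # merged virtual + physical scene
        system_under_test(scene, t)                # e.g., a CDA planner being evaluated

if __name__ == "__main__":
    run_mixed_reality_episode(
        [SimulatedVehicle(), RealVehicleFeed()],
        system_under_test=lambda scene, t: print(f"t={t:.1f}s", [a["id"] for a in scene]),
    )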

AOI 4 – Infrastructure Multi-Modal Perception

Infrastructure-based perception is essential for improving safety and situational awareness in complex traffic environments such as intersections. However, challenges such as diverse camera layouts, occlusions, and environmental variability remain under-explored. This project aims to develop a scalable infrastructure perception framework that enables effective sensor placement evaluation and robust multi-camera fusion. It introduces a set of perception-driven metrics for evaluating sensor configurations and a transformer-based BEV perception model with graph-based fusion to handle diverse camera views. The system is designed to be resilient to sensor degradation and to perform consistently across diverse intersection layouts and traffic scenarios. Representative works in this area are listed in our publications.
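As one illustration of a perception-driven placement metric, the sketch below scores a candidate camera layout by the fraction of ground sample points that fall inside at least one camera's field of view and range. The camera parameters, sample grid, and coverage_ratio function are illustrative assumptions; a full evaluation would also account for occlusion and detection quality rather than FOV geometry alone.

# Hedged sketch: a simple coverage metric for comparing candidate infrastructure
# camera placements at an intersection. Parameters and grid are illustrative.
import numpy as np

def covered(points, cam_xy, yaw_deg, fov_deg, max_range):
    # points: (N, 2) ground positions; returns a boolean mask of points this camera sees
    d = points - np.asarray(cam_xy)
    dist = np.linalg.norm(d, axis=1)
    bearing = np.degrees(np.arctan2(d[:, 1], d[:, 0]))
    rel = (bearing - yaw_deg + 180.0) % 360.0 - 180.0  # wrap angle difference to [-180, 180)
    return (dist <= max_range) & (np.abs(rel) <= fov_deg / 2.0)

def coverage_ratio(points, cameras):
    mask = np.zeros(len(points), dtype=bool)
    for cam in cameras:
        mask |= covered(points, **cam)
    return mask.mean()

if __name__ == "__main__":
    xs, ys = np.meshgrid(np.linspace(-30, 30, 61), np.linspace(-30, 30, 61))
    grid = np.stack([xs.ravel(), ys.ravel()], axis=1)  # sample points over the intersection
    layout = [
        {"cam_xy": (-25, -25), "yaw_deg": 45.0,   "fov_deg": 90.0, "max_range": 50.0},
        {"cam_xy": (25, 25),   "yaw_deg": -135.0, "fov_deg": 90.0, "max_range": 50.0},
    ]
    print(f"coverage: {coverage_ratio(grid, layout):.2%}")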