XPENG, in collaboration with Peking University, has had a research paper accepted at AAAI 2026 introducing FastDriveVLA, a visual token pruning framework tailored for end-to-end Vision-Language-Action (VLA) models in autonomous driving. The framework significantly reduces computational demands while preserving planning accuracy, enabling more efficient onboard processing in electric vehicles equipped with advanced driver-assistance systems. The development highlights XPENG’s progress in optimizing large AI models for real-world deployment in intelligent EVs.
Highlights
- FastDriveVLA delivers a nearly 7.5x reduction in computational load by pruning visual tokens from 3,249 to 812 on the nuScenes benchmark.
- Paper accepted at AAAI 2026, which had a selective 17.6% acceptance rate across 23,680 submissions.
- Introduces reconstruction-based token pruning inspired by human driver focus on essential foreground elements.
- Supports efficient end-to-end autonomous driving VLA models critical for scalable L4 deployment in EVs.
- Demonstrates XPENG’s full-stack in-house AI capabilities from architecture design to vehicle integration.
Research Acceptance and Technical Significance
The paper, titled “FastDriveVLA: Efficient End-to-End Driving via Plug-and-Play Reconstruction-based Token Pruning,” earned acceptance at AAAI 2026, a leading global artificial intelligence conference. From 23,680 submissions, only 4,167 papers were selected, resulting in a 17.6% acceptance rate. This recognition underscores the framework’s contributions to efficient AI processing in autonomous systems.
FastDriveVLA addresses key challenges in VLA models increasingly adopted for end-to-end autonomous driving. These models encode images into numerous visual tokens to enable scene understanding and action reasoning. However, high token volumes elevate onboard computational requirements, constraining inference speed and real-time performance in vehicle environments.
FastDriveVLA Framework and Methodology
Existing visual token pruning techniques, reliant on text-visual attention or token similarity, exhibit limitations in complex driving scenarios. FastDriveVLA introduces a reconstruction-based approach that mimics human driving behavior by prioritizing relevant foreground information—such as lanes, vehicles, and pedestrians—while discarding irrelevant background data.
The framework employs an adversarial foreground-background reconstruction strategy to enhance token selection accuracy. This plug-and-play method integrates seamlessly with existing VLA architectures, facilitating efficient inference without compromising decision-making quality.
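To make the idea concrete, here is a minimal sketch of plug-and-play token pruning: each visual token gets a relevance score (in FastDriveVLA this comes from a learned, reconstruction-trained foreground scorer; here a placeholder scoring function stands in), and only the top-scoring tokens are passed on to the VLA backbone. Function names, dimensions, and the scoring logic are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def prune_visual_tokens(tokens: np.ndarray, scores: np.ndarray, keep: int) -> np.ndarray:
    """Keep the `keep` highest-scoring visual tokens, preserving spatial order.

    tokens: (N, D) array of visual token embeddings.
    scores: (N,) per-token relevance scores -- hypothetical stand-in for a
            foreground-reconstruction scoring head.
    """
    # Top-k selection, then sort indices so token order is preserved.
    keep_idx = np.sort(np.argpartition(scores, -keep)[-keep:])
    return tokens[keep_idx]

# Toy example mirroring the nuScenes setting: 3,249 tokens pruned to 812.
rng = np.random.default_rng(0)
tokens = rng.standard_normal((3249, 768))
scores = rng.random(3249)  # placeholder scores; a real scorer is learned
pruned = prune_visual_tokens(tokens, scores, keep=812)
print(pruned.shape)  # (812, 768)
```

Because pruning happens before the tokens reach the language-action backbone, the method can wrap an existing VLA model without retraining it, which is what makes it "plug-and-play."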
Performance on nuScenes Benchmark
Evaluations on the nuScenes autonomous driving dataset demonstrate state-of-the-art results across multiple pruning ratios. Notably, reducing visual tokens from 3,249 to 812 yields a nearly 7.5x decrease in computational load. The framework maintains high planning accuracy under these conditions, validating its effectiveness for resource-constrained onboard computing in electric vehicles.
Broader Context in XPENG’s AI Advancements
This marks the second major recognition for XPENG at a top-tier AI conference this year. In June, XPENG was the only Chinese automaker invited to present at the CVPR Workshop on Autonomous Driving (WAD), where it discussed foundation models for autonomous driving. In November, the company unveiled its VLA 2.0 architecture, which eliminates the intermediate language-translation step to enable direct vision-to-action generation.
These milestones reflect XPENG’s comprehensive in-house expertise, encompassing model design, training, distillation, and deployment. The company continues to prioritize investments in AI large model technology to advance L4 autonomous driving capabilities.
Company Overview and Strategic Focus
XPENG positions itself as an explorer of future mobility through technological innovation in electric vehicles and intelligent systems. Headquartered in Guangzhou, China, the company maintains R&D centers in multiple domestic locations and an international presence, including a U.S. R&D center and European subsidiaries.
XPENG emphasizes full-stack in-house development of intelligent driver-assistance software and core hardware. The company achieved dual primary listings on the New York Stock Exchange (NYSE: XPEV) in August 2020 and the Hong Kong Stock Exchange (HKEX: 9868) in July 2021.
For more information, please visit https://www.xpeng.com/.