DriveLM: Driving with Graph Visual Question Answering [ECCV 2024 Oral]
https://arxiv.org/abs/2312.14150

We study how vision-language models (VLMs) trained on web-scale data can be integrated into end-to-end driving systems to boost generalization and enable interactivity with human users. While recent approaches adapt VLMs to driving via single..