Towards World Models for Robot Navigation

Oct 26, 2023 · 1 min read

Autonomously navigating a robot with images as input presents a significant challenge in perception and planning. Images capture potentially important details, such as complex geometry, body movement, and other visual cues. In order to successfully solve the navigation task from only images, algorithms must be able to model the scene and its dynamics using only this channel of information. The world model has shown success in the camera-based navigation problem. We investigate the connection between the quality of a world model and the navigation task’s performance and improve the world model’s performance based on our discovery. To this end, we propose a systematic world model evaluation method and a feature-based world model. We find that the deep features captured by a deep neural network can be used to evaluate a world model’s quality better, and the feature-based world model performs better than the image-based world model.