Self-Supervised Monocular Depth Estimation

Completed 2022–2024

Learning to predict depth from single images without ground-truth supervision, with a focus on dynamic scenes and challenging conditions.

This project develops self-supervised approaches to estimating depth from a single image, without ground-truth depth for training — a capability of direct relevance to robotics and autonomous driving. A guiding theme is that state-of-the-art accuracy can come from improving the learning process rather than simply increasing network complexity.

Concrete contributions include dynamic-object-aware training, which disregards small potentially moving objects and separately estimates the pose of genuinely dynamic ones; robustness to real-world conditions, using computer graphics and generative models to augment fair-weather data so that models generalise across weather, time of day and image quality; and curriculum-learning strategies that exploit larger camera baselines — normally harmful to self-supervised training — to improve depth quality.

Collaborators

Related publications