
Completed
Vision, Language and Visual Retrieval
Multimodal methods connecting images and language: large-scale visual retrieval, semantic art understanding, and image caption generation.
Current and past research projects.

Learning to predict depth from single images without ground-truth supervision, with a focus on dynamic scenes and challenging conditions.

Recovering surface normals, shape, and reflectance from images captured under varying illumination — including coloured-light, multi-spectral, and shadow-robust methods.

Recovering accurate 3D shape from collections of images, using multi-view stereo, volumetric graph-cuts, and probabilistic depth-map fusion.