
Completed
Vision, Language and Visual Retrieval
Multimodal methods connecting images and language: large-scale visual retrieval, semantic art understanding, and image caption generation.
Current and past research projects. Filter by status or research theme.

Multimodal methods connecting images and language: large-scale visual retrieval, semantic art understanding, and image caption generation.

Learning to predict depth from single images without ground-truth supervision, with a focus on dynamic scenes and challenging conditions.

Recovering accurate 3D shape from collections of images, using multi-view stereo, volumetric graph-cuts, and probabilistic depth-map fusion.