|
|
I am a senior research scientist at the Toshiba Research Laboratory in Cambridge, UK, where I am a member of the Computer Vision Group. Prior to joining Toshiba Research, I was a PhD student in the Machine Intelligence Laboratory in the Department of Engineering, University of Cambridge. As an undergraduate I studied Mathematics and Computer Science at Imperial College, London. I also currently hold a junior research fellowship at Wolfson College. My research is focused on algorithms for obtaining a complete, detailed, three-dimensional model of a real object from a collection of digital photographs. 3D computer models are a vital part of a wide range of disciplines, from the study of sculpture and architecture to archaeology, structural engineering and computer graphics. The owner of such a digital copy of an object is able to inspect details of the object from any viewpoint, measure geometric properties such as angles, lengths and volumes, or reproduce it in a different material. |
The great importance of 3D computer models leads to the question: is there a practical, simple way of acquiring them? Inspired by the human visual system, which effortlessly converts sequences of 2D images into complex 3D representations, my research is about using the rich information contained in a set of photographs to automatically acquire 3D models of objects.
|
Using Multiple Hypotheses to Improve Depth-Maps for Multi-View Stereo
This paper proposes an improvement to a large class of Multi-View Stereo algorithms that fuse stereo depth-maps. We show that if individual depth-maps are filtered for outliers prior to the fusion stage, good performance can be maintained on sparse data-sets. Our strategy is to collect a list of good hypotheses for the depth of each pixel. We then choose the optimal depth for each pixel by enforcing consistency between neighbouring pixels in a depth-map. A crucial element of the filtering stage is the introduction of an additional "unknown depth" hypothesis for each pixel, which is selected by the algorithm when no consistent depth can be chosen. This pre-processing of the depth-maps allows the global fusion stage to operate on fewer outliers and consequently improves performance when data is sparse. |
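The selection step described above can be illustrated with a minimal sketch. It is not the paper's formulation: it assumes a single scanline instead of a full depth-map, uses dynamic programming (Viterbi) as the optimiser, and the cost values, the `unknown_cost` and `jump_cost` parameters and the function name `select_depths` are all hypothetical.

```python
UNKNOWN = None  # label meaning "no reliable depth for this pixel"


def select_depths(hypotheses, unknown_cost=0.5, smooth_weight=1.0, jump_cost=0.2):
    """Pick one label per pixel along a scanline.

    hypotheses: one list per pixel of (depth, data_cost) candidates.
    Every pixel also gets an UNKNOWN label with a fixed cost (assumed value).
    """
    labels = [list(h) + [(UNKNOWN, unknown_cost)] for h in hypotheses]

    def pairwise(a, b):
        # Assumed smoothness term: absolute depth difference between
        # neighbours, or a fixed penalty when switching to/from UNKNOWN.
        if a is UNKNOWN or b is UNKNOWN:
            return 0.0 if a is b else jump_cost
        return smooth_weight * abs(a - b)

    # Forward pass: best accumulated cost for each label of each pixel.
    best = [cost for _, cost in labels[0]]
    back = []
    for i in range(1, len(labels)):
        new_best, pointers = [], []
        for depth, cost in labels[i]:
            totals = [best[k] + pairwise(labels[i - 1][k][0], depth)
                      for k in range(len(labels[i - 1]))]
            k_min = min(range(len(totals)), key=totals.__getitem__)
            new_best.append(totals[k_min] + cost)
            pointers.append(k_min)
        best = new_best
        back.append(pointers)

    # Backward pass: follow the stored pointers from the cheapest end label.
    k = min(range(len(best)), key=best.__getitem__)
    result = [labels[-1][k][0]]
    for i in range(len(back) - 1, -1, -1):
        k = back[i][k]
        result.append(labels[i][k][0])
    result.reverse()
    return result


scanline = [
    [(1.0, 0.1)],
    [(1.1, 0.1), (5.0, 0.05)],
    [(9.0, 0.9), (3.0, 0.95)],  # only poor, inconsistent candidates
    [(1.2, 0.1)],
]
print(select_depths(scanline))
```

In this toy scanline the third pixel has no candidate that agrees with its neighbours, so the unknown label is cheaper than either depth and the pixel is left for the fusion stage to fill in.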
|
|
Shadows in Three-Source Photometric Stereo
Shadows present a significant challenge for Photometric Stereo methods. When four or more images are available, local surface orientation is overdetermined and the shadowed pixels can be discarded. In this paper we look at the challenging case when only three images under three different illuminations are available. In this case, when one of the three pixel intensity constraints is missing due to shadow, a one-degree-of-freedom ambiguity per pixel arises. We show that, using integrability, one can resolve this ambiguity and use the remaining two constraints to reconstruct the geometry in the shadow regions. As the problem becomes ill-posed in the presence of noise, we describe a regularization scheme that improves the numerical performance of the algorithm while preserving data. |
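The per-pixel ambiguity can be made concrete with a small sketch. It assumes a Lambertian model in which each unshadowed light `l` gives a linear constraint `l . b = c` on the albedo-scaled normal `b`; the function name and the example numbers are hypothetical, and the integrability step that the paper uses to pick a single solution is not implemented here.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def two_light_solutions(l1, l2, c1, c2):
    """Return (b0, direction) such that every b = b0 + t * direction
    satisfies l1 . b = c1 and l2 . b = c2.

    b0 is the minimum-norm solution (it lies in the span of l1 and l2);
    direction = l1 x l2 spans the remaining one degree of freedom.
    """
    g11, g12, g22 = dot(l1, l1), dot(l1, l2), dot(l2, l2)
    det = g11 * g22 - g12 * g12
    if abs(det) < 1e-12:
        raise ValueError("light directions must not be parallel")
    alpha = (c1 * g22 - c2 * g12) / det
    beta = (c2 * g11 - c1 * g12) / det
    b0 = tuple(alpha * x + beta * y for x, y in zip(l1, l2))
    return b0, cross(l1, l2)


# Hypothetical pixel: the third light is shadowed, two constraints remain.
l1, l2 = (0.0, 0.0, 1.0), (0.6, 0.0, 0.8)
b0, direction = two_light_solutions(l1, l2, 0.5, 0.58)
for t in (-1.0, 0.0, 1.0):
    b = tuple(x + t * d for x, d in zip(b0, direction))
    print(t, dot(l1, b), dot(l2, b))  # both constraints hold for every t
```

Every value of `t` explains the two measured intensities equally well, which is why an extra constraint such as integrability is needed in shadowed regions.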
|
|
Non-rigid Photometric Stereo with Colored Lights
When three lights illuminate a surface from three different angles and with three different colors, there is a one-to-one mapping between the RGB color measured by a camera and the surface orientation. If we illuminate a complex object under this setup, we can invert the mapping to get surface orientations from an RGB image, then integrate those to get a depth-map. In this paper, this idea, previously used only with static objects, is applied to the reconstruction of a deforming object, such as a moving cloth. We capture color videos of complex motions of fabrics, from which we extract sequences of depth-maps. We propose a simple scheme with which these depth-maps can be registered to a canonical pose, which allows complex applications such as texture mapping or avatar skinning. A video showing the system in action can be found at the following links: short avi, longer version, and on YouTube part 1 and part 2. |
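Inverting the color-to-orientation mapping can be sketched as follows. The sketch assumes the mapping is linear, `rgb = M n`, with a known, invertible 3x3 matrix `M` that bundles light directions, light colors and surface albedo; the matrix values and the name `normal_from_rgb` are hypothetical, and the integration of normals into a depth-map is left out.

```python
import math


def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(m, c):
    """Solve the 3x3 system m x = c with Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("mixing matrix is singular")
    solution = []
    for j in range(3):
        # Replace column j of m with the right-hand side c.
        mj = [[c[i] if k == j else m[i][k] for k in range(3)] for i in range(3)]
        solution.append(det3(mj) / d)
    return solution


def normal_from_rgb(mixing, rgb):
    """Recover a unit surface normal from one RGB measurement."""
    b = solve3(mixing, rgb)
    norm = math.sqrt(sum(x * x for x in b))
    if norm == 0.0:
        raise ValueError("zero measurement has no defined normal")
    return [x / norm for x in b]


# Hypothetical calibration matrix, one row per color channel.
M = [[0.8, 0.1, 0.5],
     [-0.3, 0.7, 0.6],
     [-0.2, -0.6, 0.7]]
true_normal = [0.6, 0.0, 0.8]
rgb = [sum(M[i][k] * true_normal[k] for k in range(3)) for i in range(3)]
print(normal_from_rgb(M, rgb))
```

Because one RGB frame is enough to recover every normal, the same inversion can be applied independently to each frame of a video.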
|
|
Multi-view Photometric Stereo
This is an extension and consolidation of our CVPR 2006 work on multi-view photometric stereo. The main difference in this paper is that significant albedo variations in the surface of the reconstructed object can be tolerated. In the case where albedo variation is present on the object, we can usually obtain reconstructions with classic multi-view dense stereo. We show, however, that our work produces results of much higher geometric detail than multi-view stereo by exploiting the change in illumination. An earlier version of this work appeared in "Lighting-up geometry: accurate 3D modelling of museum artifacts with a torch and a camera".
|
|
|
Automatic 3D Object Segmentation in Multiple Views using Volumetric Graph-Cuts
Many multi-view stereo methods are faced with the problem of segmenting 20-100 calibrated images of a 3D object. These segmentations are used to create a visual hull, which is a first approximation to the object's geometry. In this paper we propose a simple technique for automatically segmenting these images. Our idea is based on two observations: (1) in each image the camera will usually fixate on the object of interest, and (2) the segmentations are not independent because of the silhouette coherence constraint. We use (1) to initialise an object color model. We then perform a series of simultaneous segmentations using (2). In each iteration we update the color model based on previous results. The process converges to the correct segmentations after just a few iterations. |
|
|
Multi-view Stereo via Volumetric Graph-cuts and Occlusion Robust Photo-Consistency
Here we revisit our CVPR 2005 work and develop a much improved formulation. The object surface is defined as a partition of 3D space into 'inside' and 'outside' regions. The cost functional, which we optimise using Graph-cuts, is a combination of a simple ballooning force and an occlusion-independent Normalised Cross Correlation cost. The advantages of our approach are the following: (1) Objects of arbitrary topology can be fully represented and computed as a single surface with no self-intersections. (2) The representation and geometric regularisation are image and viewpoint independent. (3) Global optimisation is computationally tractable, using existing max-flow algorithms. |
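For readers unfamiliar with the matching score, here is a minimal sketch of plain Normalised Cross Correlation between two image patches given as flat lists of intensities. It shows only the standard score; the occlusion-robust way in which the paper combines scores across views is not reproduced, and the function name is an assumption.

```python
import math


def ncc(patch_a, patch_b):
    """Normalised Cross Correlation of two equally sized patches.

    Returns a value in [-1, 1]; 1 means the patches agree up to an
    affine change of intensity (gain and offset).
    """
    n = len(patch_a)
    if n == 0 or n != len(patch_b):
        raise ValueError("patches must be non-empty and of equal size")
    mean_a = sum(patch_a) / n
    mean_b = sum(patch_b) / n
    da = [x - mean_a for x in patch_a]
    db = [y - mean_b for y in patch_b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        return 0.0  # a constant patch carries no texture to correlate
    return sum(x * y for x, y in zip(da, db)) / denom


patch = [10.0, 20.0, 35.0, 15.0]
brighter = [2.0 * v + 5.0 for v in patch]  # same texture, different exposure
print(ncc(patch, brighter))
```

Subtracting the mean and dividing by the patch energy makes the score insensitive to exposure differences between photographs, which is one reason it is a common choice for photo-consistency.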
|
|
Probabilistic visibility for multi-view stereo
In this work we explore how the photo-consistency criterion can be used to obtain a likelihood for a given 3D location being 'empty'. We observe that if a 3D point is considered as photo-consistent from a certain camera, then all 3D locations between that point and the camera are likely to be empty. In fact the degree of likelihood for 'emptiness' is related to the degree of photo-consistency. We formalise this observation probabilistically and show how it can be used to reconstruct difficult concavities in objects. |
|
|
Reconstruction in the Round Using Photometric Normals and Silhouettes
In this work we have obtained full 3D reconstructions of single-albedo, near-Lambertian objects such as white porcelain from 36 views under changing but unknown lighting (a single distant light source is assumed). This work is the first to generalise uncalibrated photometric stereo to the multi-view setting. For a more detailed look at some other models we reconstructed using this technique, look here. All you will need is a Java-enabled browser. |
|
|
Using Frontier Points to Recover Shape, Reflectance and Illumination
Frontier Points are a robust geometrical feature extracted from the silhouettes. They are points on the surface of the object with a known 3D location and known local surface orientation. In this paper we have shown how they can be used to recover information about the surface reflectance of the object as well as illumination. |
|
|
Multi-view stereo via Volumetric Graph-cuts
The object surface is defined as a boundary separating the Visual Hull surface from an inner surface at a constant offset from and inside the Visual Hull. The volume between these two surfaces is discretized into voxels and for each voxel we compute a photo-consistency cost. Using Graph-Cuts and a specially defined weighted graph, we compute the surface that optimally separates voxels inside and outside the scene. |
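The role of the graph cut can be illustrated on a toy problem. The sketch below assumes a single chain of voxels running from the inner surface (treated as the source) to the visual hull (treated as the sink), with edge capacities standing in for photo-consistency costs; the weights, the function name and the small Edmonds-Karp max-flow routine are illustrative assumptions, not the weighted graph defined in the paper.

```python
from collections import deque


def min_cut(num_nodes, edges, source, sink):
    """Edmonds-Karp max-flow on an undirected weighted graph.

    edges: list of (u, v, capacity).
    Returns (cut_value, set of nodes on the source side of the cut).
    """
    cap = [[0.0] * num_nodes for _ in range(num_nodes)]
    for u, v, c in edges:
        cap[u][v] += c
        cap[v][u] += c

    flow = 0.0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = [-1] * num_nodes
        parent[source] = source
        queue = deque([source])
        while queue and parent[sink] == -1:
            u = queue.popleft()
            for v in range(num_nodes):
                if parent[v] == -1 and cap[u][v] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        if parent[sink] == -1:
            break  # no path left: the current flow is maximal

        # Find the bottleneck capacity along the path, then push flow.
        bottleneck = float("inf")
        v = sink
        while v != source:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = sink
        while v != source:
            cap[parent[v]][v] -= bottleneck
            cap[v][parent[v]] += bottleneck
            v = parent[v]
        flow += bottleneck

    # Nodes still reachable from the source lie on its side of the cut.
    source_side = {v for v in range(num_nodes) if parent[v] != -1}
    return flow, source_side


# Chain of voxels: node 0 = inner surface, node 4 = visual hull.
# A low capacity marks a photo-consistent location (hypothetical values).
edges = [(0, 1, 5.0), (1, 2, 4.0), (2, 3, 0.5), (3, 4, 3.0)]
print(min_cut(5, edges, source=0, sink=4))
```

The cut severs the cheapest link, between nodes 2 and 3, so the recovered surface passes through the most photo-consistent location along the chain. In a full 3D grid the same computation also trades this data term against the area of the surface.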
|
|
Reconstructing Relief Surfaces
The object surface is defined as a height field on top of a coarse approximation of the scene surface (typically the visual hull). The height field is formulated as a Markov Random Field incorporating photo-consistency and surface smoothness constraints. The resulting cost function is optimized using Loopy Belief Propagation. |
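The kind of cost function involved can be sketched by evaluating an energy for a candidate height field. The sketch assumes discrete height labels on a 4-connected grid, a per-label data cost standing in for photo-consistency, and an absolute-difference smoothness term; these choices and all names are assumptions, and the Loopy Belief Propagation optimiser itself is not shown.

```python
def mrf_energy(heights, data_cost, smooth_weight=1.0):
    """Energy of a labelled height field on a 4-connected grid.

    heights[r][c]      : chosen height label (integer index) at a grid site
    data_cost[r][c][k] : assumed photo-consistency cost of label k there
    """
    rows, cols = len(heights), len(heights[0])
    energy = 0.0
    for r in range(rows):
        for c in range(cols):
            h = heights[r][c]
            energy += data_cost[r][c][h]
            # Count each neighbouring pair once (right and down).
            if c + 1 < cols:
                energy += smooth_weight * abs(h - heights[r][c + 1])
            if r + 1 < rows:
                energy += smooth_weight * abs(h - heights[r + 1][c])
    return energy


# Two sites, two labels each (hypothetical costs).
costs = [[[0.1, 0.9], [0.8, 0.2]]]
print(mrf_energy([[0, 1]], costs, smooth_weight=0.5))  # follows the data
print(mrf_energy([[0, 0]], costs, smooth_weight=0.5))  # perfectly smooth
```

An optimiser such as Loopy Belief Propagation searches for the labelling with the lowest value of an energy of this form, balancing agreement with the images against smoothness of the surface.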
|
2002-2006, PhD in Computer Vision from Trinity College, Cambridge
1997-2001, MSci in Mathematics and Computer Science from Imperial College, London
My CV (pdf)
Wolfson College
Cambridge, CB3 9BB
E-mail : gv215 [at] cam ac uk