Zygmunt L. Szpak
Computer Vision & Machine Learning

Multiple View Geometry

\( \newcommand{\T}{\mathsf{T}} \newcommand{\ve}[1]{\mathbf{#1}} \newcommand{\M}[1]{\mathbf{#1}} \newcommand{\costAML}{J_{\mathrm{AML}}} \newcommand{\costALS}{J_{\mathrm{ALS}}} \newcommand{\de}{\mathbf{\theta}} \newcommand{\estALS}{\widehat{\de}_{\mathrm{ALS}}} \newcommand{\costDIR}{J_{\mathrm{DIR}}} \newcommand{\Lambdab}{\mathbf{\Lambda}} \newcommand{\cov}[2][]{\Lambdab_{#2}^{#1}} \newcommand{\estDIR}{\widehat{\de}_{\mathrm{DIR}}} \newcommand{\estGAML}{\widehat{\de}_{\mathrm{GAML}}} \newcommand{\estAML}{\widehat{\de}_{\mathrm{AML}}} \newcommand{\etab}{\mathbf{\eta}} \newcommand{\kappab}{\mathbf{\kappa}} \)

A classic problem in computer vision, called multiple view geometry, is to recover the three-dimensional structure of a scene given several images taken from different vantage points. Working primarily with Wojciech Chojnacki, I am studying various compelling unsolved problems. What makes multiple view geometry so challenging is that the underlying projective models lead to non-linear errors-in-variables parameter estimation problems. Hence Wojciech and I find ourselves working at the interface of various disciplines such as numerical analysis, projective geometry, statistics and computer science.

We are particularly interested in:

  1. devising new parameter estimation techniques which result in estimates with desirable properties (e.g. unbiased and consistent);
  2. devising new unconstrained optimisation methods which are simpler to implement and faster than established schemes such as Levenberg--Marquardt;
  3. deriving and imposing constraints associated with various mathematical models in multiple view geometry;
  4. creating new constrained optimisation methods which naturally and reliably enforce constraints.

Hence our research endeavours can be characterised as both theoretical and practical.

1 Robust Multiple Homography Estimation: An Ill-Solved Problem

Images in two views of world points lying on a planar surface are related by a homography matrix. Planar surfaces are ubiquitous in urban environments, and this makes estimating multiple homography matrices from image measurements between two views an important step in many applications such as augmenting reality, stitching and warping images, calibrating cameras, finding a metric reconstruction, and detecting non-rigid motion. Because of the diverse utility of homography matrices, researchers often exploit the task of estimating multiple homographies to demonstrate the merits of robust multi-structure estimation methods. In fact, some robust multi-structure estimation methods, such as multiRANSAC [1], were specifically designed to address the multi-homography estimation problem. However, an inadvertent oversight has crept into multi-structure estimation methods, one that persists in all state-of-the-art methods that we are familiar with, namely the failure to recognise that a set of homographies that each of these schemes produces is actually not a genuine set of homographies between two views of the same scene. A collection of homography matrices forms a valid set only if the matrices satisfy consistency constraints implied by the rigidity of the motion and the scene. If the constraints are not deliberately enforced, they are not satisfied in typical scenarios. Hence, one of the fundamental problems in estimating multiple homography matrices is to find a way to enforce the consistency constraints---a task reminiscent of that of enforcing the rank-two constraint in the case of the fundamental matrix estimation.

The consequence of homographies being incompatible can clearly be appreciated by examining the extent to which a fundamental matrix constructed from an incompatible set of homographies can represent the epipolar geometry of the scene. As is well known, given a homography \( \M{H} \) between the first and second views induced by a plane in the scene, the underlying fundamental matrix \( \M{F} \) can be written as $$ \begin{align} \label{eq:12} \M{F} = \M{H}^{-\T}[\ve{e}]_{\times},
\end{align} $$ where \( \ve{e} \) represents the epipole in the first image. If \( {\M{H}_i}_{i=1}^{I} \) is a set of homographies, then, as it turns out, the epipole in the first image is a fixed point of any homology with matrix of the form \( \M{H}^{-1}_j\M{H}_i \) and can be retrieved as the eigenvector corresponding to a non-repeated eigenvalue of any of the \( \M{H}^{-1}_j\M{H}_i \)'s. If the homography matrices \( \M{H}_i \) are incompatible, the various homology matrices \( \M{H}^{-1}_j\M{H}_i \) will also be incompatible and will lead to different estimates of the epipole in the first image. This in turn will lead to different estimates of the fundamental matrix, as exemplified in Figure X. Moreover, if a fundamental matrix is constructed by selecting a particular homography matrix \( \M{H}_{i_0} \) in \eqref{eq:12} , then even though the resulting fundamental matrix may be compatible with \( \M{H}_{i_0} \), it will not be compatible with the remaining homographies \( \M{H}_i \), \( i \neq i_0 \). Hence, given multiple incompatible homographies, it is possible to construct multiple incompatible fundamental matrices. This subtlety is overlooked in all robust multi-structure estimation methods that we are familiar with.

Explicit formulae for all constraints that must be satisfied in order for a collection of homographies to form a valid set have so far eluded the vision community. It was only as recently as 2011 that a decisive answer pertaining to even just the number of constraints was given [2] [3]. There are several reasons why knowledge of explicit formulae for homography constraints would be advantageous:

  1. with explicit constraints it would be possible to devise new homography estimation methods that enforce full compatibility without recourse to latent variables;
  2. one could investigate new global optimisation methods for multi-homography estimation, analogous to what has recently been achieved for fundamental matrix estimation [4] [5]; and
  3. one could establish a new generation of robust multi-structure fitting methods that, unlike existing methods, yield estimates with consistent epipolar geometry.

In this paper we take a step toward achieving the goal of determining all homography constraints. We derive two new sets of consistency constraints and use them to measure, for the first time, the extent to which separately estimated homographies are mutually incompatible. Our findings indicate that none of the state-of-the-art multi-structure estimation methods adequately address the problem of multiple homography estimation, as none of them enforce consistency constraints. Consequently, robust multiple homography estimation continues to be an ill-solved problem.


Fig 1. Comparison of epipolar lines associated with two fundamental matrices computed from two separately estimated homographies. The epipolar lines associated with the two fundamental matrices do not overlap, and thereby demonstrate that the two homographies do not share the same epipolar geometry and hence are incompatible.

2 Enforcing Consistency Constraints in Uncalibrated Multiple Homography Estimation using Latent Variables.

An approach is presented to estimating a set of interdependent homography matrices linked together by latent variables. The approach allows enforcement of all underlying consistency constraints while accounting for the arbitrariness of the scale of each individual matrix. The input data is assumed to be in the form of a set of homography matrices individually estimated from image data with no regard to the consistency constraints, appended by a set of error covariances, each associated with a corresponding matrix from the previous set. A statistically-motivated cost function is introduced for upgrading, via optimisation, the input data to a set of homography matrices satisfying the constraints. The function is invariant to a change of any of the individual scales of the input matrices. The proposed approach is applied to the particular problem of estimating a set of homography matrices induced by multiple planes in the 3D scene between two views. An optimisation algorithm for this problem is developed that operates on natural underlying latent variables, with the use of those variables ensuring that all consistency constraints are satisfied. Experimental results indicate that the algorithm outperforms previous schemes proposed for the same task and is fully comparable in accuracy with the `gold standard' bundle adjustment technique, rendering the whole approach both of practical and theoretical interest. With a view to practical application, it is shown that the proposed algorithm can be incorporated into the familiar RANSAC technique so that the resulting modified scheme is capable of robust fitting of fully consistent homographies to data with outliers.

3 Sampson Distance Based Joint Estimation of Multiple Homographies

Two images of a scene consisting of multiple flat surfaces are related by a collection of homography matrices. Practitioners typically estimate these homographies separately thereby violating inherent inter-homography constraints that arise naturally out of the rigid geometry of the scene. We demonstrate that through a suitable choice of parametrisation multiple homographies can be jointly estimated in a manner so as to satisfy all inter-homography constraints. Unlike the cost functions used previously for solving this problem, our cost function does not correspond to fitting one set of homography matrices to another set of homography matrices. Instead, we utilise the Sampson distance for homography matrix estimation and operate directly on image data points. By using the Sampson distance and working directly on data points, we expedite the application of a vast amount of knowledge that already exists for Sampson-distance-based single homography or fundamental matrix estimation. The estimation framework reported in this paper establishes a new baseline for joint multiple homography estimation and at the same time raises intriguing new research questions. The work may be of interest to a broad range of researchers who require the estimation of homography matrices as part of their solution.


4 References

[1]: M. Zuliani, C. S. Kenney, and B. S. Manjunath, “The multiRANSAC algorithm and its application to detect planar homographies,” in Proc. int. conf. image processing, 2005, vol. 3, pp. 153–156.

[2]: W. Chojnacki and A. van den Hengel, “A dimensionality result for multiple homography matrices,” in Proc. 13th int. conf. computer vision, 2011, pp. 2104–2109.

[3]: W. Chojnacki and A. van den Hengel, “On the dimension of the set of two-view multi-homography matrices,” Complex Anal. Oper. Theory, vol. 7, no. 2, pp. 465–484, 2013.

[4]: Y. Zheng, S. Sugimoto, and M. Okutomi, “A branch and contract algorithm for globally optimal fundamental matrix estimation,” in Proc. IEEE conf. computer vision and pattern recognition, 2011, pp. 2953–2960.

[5]: Y. Zheng, S. Sugimoto, and M. Okutomi, “A practical rank-constrained eight-point algorithm for fundamental matrix estimation,” in Proc. IEEE conf. computer vision and pattern recognition, 2013, pp. 1546–1553.


last update: Tue Oct 06, 2015