 Astronomy
 Biomedical Optics & Medical Imaging
 Defense & Security
 Electronic Imaging & Signal Processing
 Illumination & Displays
 Lasers & Sources
 Micro/Nano Lithography
 Nanotechnology
 Optical Design & Engineering
 Optoelectronics & Communications
 Remote Sensing
 Sensing & Measurement
 Solar & Alternative Energy
 Sign up for Newsroom EAlerts
 Information for:
Advertisers




Electronic Imaging & Signal Processing
Enhancing cameramotion estimation
Michel Sarkis and Klaus Diepold
A novel projective Newtononthemanifold algorithm provides accurate computation of camera position and orientation with reduced complexity.
12 May 2009, SPIE Newsroom. DOI: 10.1117/2.1200905.1610
Estimating a moving camera's motion parameters is a challenging subfield in computer vision and robotics. In general, we first calculate and track a number of feature points for a given sequence of images, which we can then use to reconstruct the scene's initial 3D model, as well as the corresponding camera positions and orientations (‘poses’). Nevertheless, estimates of both the motion and the 3D model are susceptible to a variety of errors. If P is a point reconstructed in 3D space, it will be reprojected on the image, using these erroneous values, at a position , which is different from its real locus p (see Figure 1). To reduce the reprojection error, defined by the distance p and , we must apply nonlinear minimization to simultaneously adjust these errors for all points in all images. This is known as the ‘bundle adjustment’ (BA).^{1–5} For motion refinement only, BA reduces errors to estimate the pose of a moving camera. The most common applicable minimization scheme is the LevenbergMarquardt (LM) approach,^{6,7} which is based on a Newtontype iterative algorithm that finds successively better approximations. With some initial estimates of the camera parameters, LM refines the reprojection errors to reach a global solution. Convergence is guaranteed, provided the initial values are sufficiently close to the global solution (otherwise instability problems might occur). As in any Newtontype algorithm, for each iteration LM computes the gradient and Hessian of the cost function defined by the reprojection errors. In addition, the Hessian used is approximate only, since LM ignores the secondorder derivatives of the cost function. To enhance the convergence rate (and thus the accuracy of the results), one can use a general Newton algorithm that computes the full numerical form of the Hessian instead of LM.^{8} However, this leads to a significant increase in computational complexity. We, therefore, developed a projective Newtononthemanifold approach to enhance the accuracy of the motion estimate while not requiring significant increases in computational power.^{9} Figure 1. The problem setup. O: Camera center, P: Real object, p: Real position, : Erroneous reprojection. Designing an optimization scheme on a manifold is equivalent to making the algorithm walk iteratively on the surface of the curve described by the motion variables. To do so, the manifold must be differentiable and smooth, so that we can define the distances and angles required to determine the optimization direction. Relevant examples include the Riemannian manifolds, which are equipped with tangent spaces. These allow smooth transitions from one point to another.^{10–12} The variables defining 3D rigid motion are described by the special Euclidean group SE_{3}, a Riemannian manifold. Its tangent space is defined by the associated Lie algebra (se_{3}). Our projected Newton method exploits the local parametrization of the motion variables on the manifold using the associated Lie algebra. Therefore, we determine the optimization direction (i.e., the gradient and Hessian computation of the cost function) for each iteration on the tangent space. We subsequently project the result back onto SE_{3}to update the motion variables (see Figure 2). This back projection is possible since se_{3} is related to SE_{3} through exponential mapping. Finally, we repeat these steps until the minimization converges to the global solution. By analyzing the projected Newton algorithm, we found its computational requirements are 60% lower than for a general Newton algorithm (since the evaluation of both the Hessian and the gradient is simpler).^{9} We can now use the full numerical form of the Hessian with a numerical complexity close to that of LM. This will also render the results of the projective Newton scheme more accurate. To visualize this, in Figure 3 we show an example application of the two algorithms to the Dinosaur sequence, provided by the visual geometry group at the University of Oxford (UK). Figure 2. Projective Newton algorithm on the manifold. SE _{3}: Special Euclidian group (a Riemannian manifold), se _{3}: Associated Lie algebra, M: Cameramotion estimate, : 6D Euclidean space. In conclusion, application of a projective Newtononthemanifold algorithm to refine a camera's motion parameters is more accurate than using LM, since the former uses the full numerical form of the Hessian in the optimization. In addition, the complexity involved in the Hessian computation is 60% lower than for a general Newton algorithm and close to that required by LM. We will next integrate our technique in structurefrommotion problems and explore the underlying manifold structure of 3D objects. Figure 3. Example application of (left) LM and (right) the projective Newtononthemanifold algorithm to the Dinosaur sequence.
Michel Sarkis, Klaus Diepold Lehrstuhl für Datenverarbeitung Technical University Munich Munich, Germany
References: 3. M. Pollefeys, L. Van Gool, M. Vergauwen, F. Verbiest, K. Cornelis, J. Tops, R. Koch, Visual modeling with a handheld camera, Int'l J. Comput. Vis. 59, no. 3, pp. 207232, 2004.


