Automatic Dense Reconstruction from Uncalibrated Video Sequences. David Nistér, KTH. A system aimed at completely automatic Euclidean reconstruction from uncalibrated handheld amateur video, demonstrated on a number of sequences grabbed directly from a low-end video camera; the views are calibrated and a dense graphical model is produced.
Published (last): 19 November 2007
The proposed approach first compresses the feature points of each image into three principal component points by using the principal component analysis method.
In order to select the key images suitable for 3D reconstruction, the principal component points are used to estimate the interrelationships between images. Second, these key images are inserted into a fixed-length image queue. The positions and orientations of the images are calculated, and the 3D coordinates of the feature points are estimated using weighted bundle adjustment.
With this structural information, the depth maps of these images can be calculated. Next, we update the image queue by deleting some of the old images and inserting some new images into the queue, and a structural calculation of all the images can be performed by repeating the above steps. Finally, a dense 3D point cloud can be obtained using the depth-map fusion method.
The experimental results indicate that when the texture of the images is complex and the number of images is large, the proposed method can improve the calculation speed by more than a factor of four with almost no loss of precision.
Furthermore, as the number of images increases, the improvement in the calculation speed becomes more noticeable. Because of the rapid development of the unmanned aerial vehicle (UAV) industry in recent years, civil UAVs have been used in agriculture, energy, environment, public safety, infrastructure, and other fields. By carrying a digital camera on a UAV, two-dimensional (2D) images can be obtained.
However, as the requirements have grown and matured, 2D images are no longer able to meet the needs of many applications, such as three-dimensional (3D) terrain and scene understanding. Thus, there is an urgent need to reconstruct 3D structures from the 2D images collected by a UAV camera.
Rapid 3D Reconstruction for Image Sequence Acquired from UAV Camera
The study of the methods by which 3D structures are generated from 2D images is an important branch of computer vision. In this field, many researchers have proposed several methods and theories [ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17 ]. Among these theories and methods, the three most important categories are the simultaneous localization and mapping (SLAM) [ 1, 2, 3 ], structure from motion (SfM) [ 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 ] and multiple view stereo (MVS) algorithms [ 15, 16, 17 ], which have been implemented in many practical applications.
As the number of images and their resolution increase, the computational times of the algorithms increase significantly, limiting their use in some high-speed reconstruction applications. The two major contributions of this paper are a method for selecting key images and an SfM calculation for sequence images.
Key-image selection is very important to the success of 3D reconstruction. In this paper, a fully automatic approach to key-frame extraction without initial pose information is proposed.
Principal component analysis (PCA) is used to analyze the correlation of features over frames to automate key-frame selection. Considering the continuity of the images taken by a UAV camera, this paper proposes a 3D reconstruction method based on an image queue. To ensure smoothness between two consecutive point clouds, an improved bundle adjustment, named weighted bundle adjustment, is used in this paper. By using a fixed-size image queue, the global structure calculation is divided into several local structure calculations, thus improving the speed of the algorithm with almost no loss of accuracy.
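The idea behind weighted bundle adjustment can be sketched as a reprojection-error cost in which each observation carries its own weight, so that points already reconstructed in the previous queue can be weighted more heavily to keep consecutive point clouds consistent. The following is a minimal sketch of that cost; the camera model and the weighting scheme are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

def project(camera, point3d):
    """Pinhole projection of a 3-D point. Illustrative camera model:
    camera = (rotation matrix R, translation t, focal length f)."""
    R, t, f = camera
    p = R @ point3d + t
    return f * p[:2] / p[2]

def weighted_ba_residuals(cameras, points3d, observations, weights):
    """Stacked weighted reprojection residuals.
    observations: list of (camera_index, point_index, observed_xy).
    weights: one weight per observation; larger weights pull the
    optimizer toward preserving those (already-reconstructed) points,
    which is the role of the weights in weighted bundle adjustment."""
    residuals = []
    for (ci, pi, xy), w in zip(observations, weights):
        residuals.append(w * (project(cameras[ci], points3d[pi]) - xy))
    return np.concatenate(residuals)
```

A nonlinear least-squares solver (e.g. Levenberg–Marquardt) would then minimize the squared norm of this residual vector over the camera and point parameters.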
The general 3D reconstruction algorithm without a priori positions and orientation information can be roughly divided into two steps. The first step involves recovering the 3D structure of the scene and the camera motion from the images. The problem addressed in this step is generally referred to as the SfM problem. The second step involves obtaining the 3D topography of the scene captured by the images. This step is usually completed by generating a dense point data cloud or mesh data cloud from multiple images.
The problem addressed in this step is generally referred to as the MVS problem. In addition, research into real-time simultaneous localization and mapping (SLAM) and 3D reconstruction of the environment has become popular over the past few years.
The positions and orientations of a monocular camera and a sparse point map can be obtained from the images by using a SLAM algorithm. The SfM algorithm is used to obtain the structure of the 3D scene and the camera motion from images of stationary objects.
They both estimate the locations and orientations of the camera along with sparse features. Researchers have proposed improved algorithms for different situations based on early SfM algorithms [ 4, 5, 6 ]. A variety of SfM strategies have emerged, including incremental [ 7, 8 ], hierarchical [ 9 ], and global [ 10, 11, 12 ] approaches. Among these methods, a very typical one was proposed by Snavely [ 13 ], who used it in the 3D reconstruction of real-world objects.
With the help of feature point matching, bundle adjustment, and other technologies, Snavely completed the 3D reconstruction of objects by using images of famous landmarks and cities.
The SfM algorithm is limited in many applications because of its time-consuming calculations.
With the continuous development of computer hardware, multicore technologies, and GPU technologies, the SfM algorithm can now be used in several areas.
In many applications, the SfM algorithm faces higher requirements for computing speed and accuracy. There are several improved SfM methods, such as the method proposed by Wu [ 8, 14 ]. These methods can improve the speed of the structure calculation without loss of accuracy.
Among incremental, hierarchical, and global SfM, incremental SfM is the most popular strategy for the reconstruction of unordered images. Two important steps in incremental SfM are feature-point matching between images and bundle adjustment. As the resolution and number of images increase, the number of matching points and the number of parameters optimized by bundle adjustment increase dramatically. This results in a significant increase in the computational complexity of the algorithm and makes it difficult to use in many applications.
When the positions and orientations of the cameras are known, the MVS algorithm can reconstruct the 3D structure of a scene by using multiple-view images. One of the most representative methods was proposed by Furukawa [ 15 ].
This method estimates the 3D coordinates of the initial points by matching the difference of Gaussians and Harris corner points between different images, followed by patch expansion, point filtering, and other processing. The patch-based matching method is used to match other pixels between images. After that, a dense point data cloud and mesh data cloud can be obtained. These algorithms can obtain reconstruction results with an even higher density and accuracy.
The method proposed by Shen [ 16 ] is one of the most representative approaches. The estimated depth maps are obtained from the mesh data generated by the sparse feature points. Then, after depth-map refinement and depth-map fusion, a dense 3D point data cloud can be obtained.
An implementation of this method can be found in the open-source software openMVS [ 16 ]. When processing weakly textured images, it is difficult for the patch-based method to generate a dense point cloud; in addition, that algorithm must repeat the patch expansion and point-cloud filtering several times, resulting in a significant increase in the calculation time. The depth-map-based method, by contrast, can easily and rapidly obtain a dense point cloud.
SLAM mainly consists of the simultaneous estimation of the localization of the robot and a map of the environment. The map obtained by SLAM is often required to support other tasks. The popularity of SLAM is connected with the need for indoor applications of mobile robotics. Without priors, MAP estimation reduces to maximum-likelihood estimation. Most SLAM algorithms are based on iterative nonlinear optimization [ 1, 2 ]. The biggest problem of SLAM is that some algorithms easily converge to a local minimum, which usually returns a completely wrong estimate. Convex relaxation has been proposed by some authors to avoid convergence to local minima; these contributions include the work of Liu et al.
Various kinds of SLAM algorithms have been proposed to adapt to different applications; some of them are used for vision-based navigation and mapping. The first step of our method involves building a fixed-length image queue: selecting the key images from the video image sequence and inserting them into the image queue until it is full.
A structural calculation is then performed for the images in the queue. Next, the image queue is updated: several images are deleted from the front of the queue, and the same number of images is placed at the end of the queue. The structural calculation of the images in the queue is then repeated until all images are processed.
In an independent step, the depth maps of the images are calculated and saved in the depth-map set. Finally, all depth maps are fused to generate dense 3D point-cloud data.
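The queue-based pipeline described above can be sketched as a sliding-window loop. The helper names (`solve_structure`, `compute_depth_maps`, `fuse`) are hypothetical placeholders for the paper's SfM, depth-map, and fusion stages, and the choice of which images receive depth maps per window is an assumption of this sketch.

```python
from collections import deque

def reconstruct_with_queue(key_images, queue_len, step,
                           solve_structure, compute_depth_maps, fuse):
    """Sliding-window reconstruction over a fixed-length image queue.
    queue_len: fixed queue size; step: how many old images are deleted
    from the front (and replaced at the end) per update."""
    queue = deque()
    depth_maps = []
    for image in key_images:
        queue.append(image)
        if len(queue) == queue_len:
            # Local structure calculation on the images in the queue.
            poses, sparse_points = solve_structure(list(queue))
            # Drop `step` images from the front and compute their depth maps.
            dropped = [queue.popleft() for _ in range(step)]
            depth_maps.extend(compute_depth_maps(dropped, poses, sparse_points))
    # Handle the images remaining in the final window.
    if queue:
        poses, sparse_points = solve_structure(list(queue))
        depth_maps.extend(compute_depth_maps(list(queue), poses, sparse_points))
    # Fuse all depth maps into a dense point cloud.
    return fuse(depth_maps)
```

Because each local structure calculation touches only `queue_len` images, the cost of bundle adjustment stays bounded as the sequence grows, which is the source of the speed-up claimed for the method.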
Without the use of ground control points, the result of our method loses the accurate scale of the model. The algorithm flowchart is outlined in Figure 1. In order to complete the dense reconstruction of the point cloud and improve the computational speed, the key images that are suitable for the structural calculation must first be selected from the large number of UAV video images captured by the camera.
The selected key images should have a good overlap of the captured scenes. Two consecutive key images meet the key-image constraint, denoted as R(I1, I2), if they have a sufficient overlap area.
The overlap area between images can be estimated by the correspondence between the feature points of the images. In order to reduce the computational complexity of feature-point matching, we propose a method of compressing the feature points based on principal component analysis (PCA).
It is assumed that the images used for reconstruction are rich in texture. Three principal component points (PCPs) can be generated from PCA, each reflecting the distribution of the feature points in a given image. If two images are captured at almost the same position, their PCPs almost coincide in the same place; otherwise, the PCPs move and are located at different positions in the image. The processing steps are as follows.
First, we use the scale-invariant feature transform (SIFT) [ 19 ] feature-detection algorithm to detect the feature points of each image (Figure 2a). There must be at least four feature points, and the centroid of these feature points can then be calculated as follows:
The following matrix is formed by the image coordinates of the feature points. The singular value decomposition (SVD) of matrix A then yields two principal component vectors, and the principal component points (PCPs) are obtained from these vectors (Equations (3) and (4)). In this way, a large number of feature points is compressed into three PCPs (Figure 2b).
The PCPs can reflect the distribution of the feature points in the image. After that, by calculating the positional relationship of the corresponding PCPs between two consecutive images, we can estimate the overlap area between images.
The average displacement d_p between PCPs, as expressed in Equation (5), can then be calculated. The result is presented in Figure 2c.
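The PCP computation described above (centroid, centered coordinate matrix, SVD, three PCPs, and the average displacement d_p between consecutive images) can be sketched as follows. The exact scaling of the principal-axis points is an assumption of this sketch; here each axis point is offset from the centroid by the standard deviation of the features along that axis.

```python
import numpy as np

def principal_component_points(points):
    """Compress an (n, 2) array of feature-point image coordinates into
    three principal component points (PCPs): the centroid plus one point
    along each principal axis. The per-axis scaling (standard deviation
    of the points along that axis) is an illustrative assumption."""
    points = np.asarray(points, dtype=float)
    centroid = points.mean(axis=0)       # centroid of the feature points
    A = points - centroid                # centered coordinate matrix
    # SVD of the centered matrix yields the two principal component vectors.
    _, s, vt = np.linalg.svd(A, full_matrices=False)
    n = len(points)
    pcp1 = centroid + (s[0] / np.sqrt(n)) * vt[0]
    pcp2 = centroid + (s[1] / np.sqrt(n)) * vt[1]
    return np.stack([centroid, pcp1, pcp2])

def average_displacement(pcps_a, pcps_b):
    """Average displacement d_p between the corresponding PCPs of two
    consecutive images; a large d_p indicates a small overlap area."""
    return float(np.mean(np.linalg.norm(pcps_a - pcps_b, axis=1)))
```

Note that a pure translation of the camera shifts all three PCPs by the same amount, so d_p directly tracks the image motion; matching three PCPs is far cheaper than matching thousands of SIFT features, which is the point of the compression.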