- Mastering OpenCV 4
- Roy Shilkrot David Millán Escrivá
- 2021-07-02 14:47:41
Stereo reconstruction and SfM
In SfM, we would like to recover both the poses of the cameras and the positions of the 3D feature points. We have just seen how simple 2D pair matches of points can help us estimate the essential matrix and thus encode the rigid geometric relationship between views: $[R|t]$. The essential matrix can be decomposed into $R$ and $t$ by way of SVD, and having found $R$ and $t$, we proceed with finding the 3D points and fulfilling the SfM task for the two images.
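To make the relationship between the pose and the essential matrix concrete, here is a minimal sketch of the forward direction: building an essential matrix from a known rotation and translation and checking that a matched pair of normalized image points satisfies the epipolar constraint. It assumes the convention $X_R = R X_L + t$ and $E = [t]_\times R$. The pose values, the 3D point, and all function names are hypothetical and chosen only for illustration. It does not perform the SVD decomposition itself.

```python
import math

def cross_matrix(t):
    """Skew-symmetric matrix [t]x, so that [t]x v equals the cross product t x v."""
    tx, ty, tz = t
    return [[0.0, -tz, ty],
            [tz, 0.0, -tx],
            [-ty, tx, 0.0]]

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(a, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def essential_from_pose(R, t):
    """E = [t]x R, assuming points map between cameras as X_R = R X_L + t."""
    return matmul(cross_matrix(t), R)

# Hypothetical pose: 10 degree rotation about the y axis, translation mostly along x.
angle = math.radians(10.0)
R = [[math.cos(angle), 0.0, math.sin(angle)],
     [0.0, 1.0, 0.0],
     [-math.sin(angle), 0.0, math.cos(angle)]]
t = [-1.0, 0.0, 0.1]
E = essential_from_pose(R, t)

# Hypothetical 3D point in the left camera frame and its image in both views
# (normalized image coordinates, i.e. divided by depth).
X_left = [0.3, -0.2, 4.0]
X_right = [a + b for a, b in zip(matvec(R, X_left), t)]
x_left = [c / X_left[2] for c in X_left]
x_right = [c / X_right[2] for c in X_right]

# Epipolar constraint: x_R^T E x_L should vanish up to rounding error.
residual = dot(x_right, matvec(E, x_left))
print(f"epipolar residual: {residual:.2e}")
```

In practice, the decomposition in the other direction is usually delegated to a library routine rather than written by hand.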
We have seen the geometric relationship between two 2D views and the 3D world; however, we have yet to see how to recover 3D shape from the 2D views. One insight we had is that, given two views of the same point, we can cast two rays, each from a camera's optical center through the corresponding 2D point on its image plane, and they will converge on the 3D point. This is the basic idea of triangulation. One simple way to solve for the 3D point is to write the projection equation for each view and equate them, since the 3D point ($X$) is common: $x_L \simeq P_L X$ and $x_R \simeq P_R X$, where $x_L$ and $x_R$ are the matched 2D points in homogeneous coordinates, $\simeq$ denotes equality up to scale, and the $P$ matrices are the $3 \times 4$ projection matrices. The equations can be worked into a homogeneous system of linear equations and solved, for example, by SVD. This is known as the direct linear method for triangulation; however, it is severely sub-optimal since it minimizes an algebraic residual rather than a geometrically meaningful error such as the reprojection error. Several other methods have been suggested, including the mid-point method, which takes the point closest to both rays, since the rays generally do not intersect exactly.
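The mid-point method mentioned above can be sketched in a few lines. This is an illustrative version under stated assumptions: each ray is given by a camera center and a direction in a common world frame, and the function name `midpoint_triangulate` and the sample values are hypothetical. It finds the closest point on each ray to the other ray and returns the point halfway between them.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def midpoint_triangulate(c1, d1, c2, d2, eps=1e-12):
    """Mid-point triangulation of two rays c1 + s*d1 and c2 + t*d2.

    Returns (midpoint, gap), where gap is the distance between the two
    closest points on the rays. Raises ValueError for (near-)parallel rays.
    """
    w = [a - b for a, b in zip(c1, c2)]
    a = dot(d1, d1)
    b = dot(d1, d2)
    c = dot(d2, d2)
    d = dot(d1, w)
    e = dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < eps:
        raise ValueError("rays are parallel; no unique closest point")
    # Parameters minimizing |(c1 + s*d1) - (c2 + t*d2)|^2.
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = [ci + s * di for ci, di in zip(c1, d1)]
    p2 = [ci + t * di for ci, di in zip(c2, d2)]
    midpoint = [(u + v) / 2.0 for u, v in zip(p1, p2)]
    gap = sum((u - v) ** 2 for u, v in zip(p1, p2)) ** 0.5
    return midpoint, gap

# Two skew rays that miss each other by one unit along the y axis.
point, gap = midpoint_triangulate(
    [0.0, 0.0, 0.0], [0.0, 0.0, 1.0],
    [-1.0, 1.0, 2.0], [1.0, 0.0, 0.0],
)
print(point, gap)
```

The returned gap gives a rough sense of how well the two rays agree; a large gap relative to the scene scale suggests a bad match or an inaccurate pose.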
After getting a baseline 3D reconstruction from two views, we can proceed with adding more views. This is usually done with a different method, one that matches existing 3D points to incoming 2D points. This class of algorithms is called Perspective-n-Point (PnP), which we will not discuss here. Another method is to perform pairwise stereo reconstruction, as we've seen already, and calculate a scaling factor for each pair, since each reconstructed image pair may result in a different scale, as discussed earlier.
Another interesting method for recovering depth information is to further utilize the epipolar lines. We know that a point in image L will lie on a line in image R, and we can also calculate the line precisely using the essential matrix, $l_R = E x_L$. The task is, therefore, to find the point on the epipolar line in image R that best matches the point in image L. This line matching method may be called stereo depth reconstruction, and since we can recover the depth information for almost every pixel in the image, it is usually a dense reconstruction. In practice, the images are first rectified so that the epipolar lines are completely horizontal, mimicking a pure horizontal translation between the images. This reduces the problem to matching along the x axis only.
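The search along a rectified epipolar line can be illustrated with a toy one-dimensional matcher. The sketch below is not the algorithm used later in the chapter: it assumes each image row is a plain list of intensities, that a point at column `x_left` in image L appears at column `x_left - d` in image R for some disparity `d >= 0`, and it swaps in a sum of absolute differences (SAD) window cost as one simple choice of matching score. The function name and sample rows are hypothetical.

```python
def match_along_scanline(left_row, right_row, x_left, half_window=2, max_disparity=16):
    """Find the disparity that best aligns a window around x_left in left_row
    with a window in right_row, using a sum of absolute differences cost.

    Returns (best_disparity, best_cost).
    """
    lo = x_left - half_window
    hi = x_left + half_window + 1
    if lo < 0 or hi > len(left_row):
        raise ValueError("window around x_left falls outside the row")
    patch = left_row[lo:hi]

    best_disparity, best_cost = None, float("inf")
    for d in range(max_disparity + 1):
        r_lo = lo - d
        if r_lo < 0:
            break  # candidate window would leave the image
        candidate = right_row[r_lo:r_lo + len(patch)]
        cost = sum(abs(a - b) for a, b in zip(patch, candidate))
        if cost < best_cost:
            best_disparity, best_cost = d, cost
    return best_disparity, best_cost

# Toy rectified rows: the right row is the left row shifted by 3 pixels.
left_row = [10, 10, 10, 10, 10, 10, 50, 90, 200, 90, 50, 10, 10, 10, 10, 10]
right_row = left_row[3:] + [10, 10, 10]

disparity, cost = match_along_scanline(left_row, right_row, x_left=8)
print(disparity, cost)
```

Real dense stereo matchers add smoothness constraints and sub-pixel refinement on top of a per-pixel cost like this one, since a purely local window is ambiguous in flat or repetitive regions.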

The major appeal of horizontal translation is disparity, which describes the distance an interest point travels horizontally between the two images. In the preceding diagram, we can notice that, due to the overlapping right triangles, $\frac{x_L}{f} = \frac{X}{Z}$ and $\frac{x_R}{f} = \frac{X - b}{Z}$, where $X$ and $Z$ are the horizontal position and depth of the 3D point in the left camera's frame. This leads to $d = x_L - x_R = \frac{f b}{Z}$. The baseline $b$ (horizontal motion) and the focal length $f$ are constant with respect to the particular 3D point and its distance from the camera. Therefore, the insight is that the disparity is inversely proportional to depth. The smaller the disparity, the farther the point is from the camera. When we look at the horizon from a moving train's window, the faraway mountains move very slowly, while the nearby trees move very fast. This effect is also known as parallax. Using disparity for 3D reconstruction is at the base of all stereo algorithms.
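The inverse relationship between disparity and depth is easy to check numerically. The following is a minimal sketch, assuming the focal length and disparity are both expressed in pixels and the baseline in meters, so that depth comes out in meters. The function name and the camera values are hypothetical.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth Z = f * b / d for a rectified stereo pair.

    disparity_px and focal_length_px are in pixels; baseline_m is in meters,
    so the returned depth is in meters.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_length_px * baseline_m / disparity_px

# Hypothetical rig: 800 px focal length, 12.5 cm baseline.
focal_length_px = 800.0
baseline_m = 0.125

for disparity_px in (50.0, 25.0, 5.0):
    depth_m = depth_from_disparity(disparity_px, focal_length_px, baseline_m)
    print(f"disparity {disparity_px:5.1f} px -> depth {depth_m:5.1f} m")
```

Halving the disparity doubles the depth, which also means that a fixed one-pixel matching error costs far more depth accuracy for distant points than for near ones.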
Another widely researched topic is Multi-View Stereo (MVS), which utilizes the epipolar constraint to find matching points from multiple views at once. Scanning the epilines in multiple images simultaneously imposes further constraints on the matching features. A match is accepted only if it satisfies all of these constraints. Once we have recovered multiple camera positions, we can employ MVS to get a dense reconstruction, which is what we will do later in this chapter.