Multi-view stereopsis generalizes Two-View Stereopsis to multiple views; given these images and their relative rotations and translations, our goal is to reconstruct the 3D world model. To do so, we check correspondences across the views and infer the occupancy and color of each point in space.

There are many approaches to this problem. A few of the more common ones are below.

Depth Map Fusion

The most straightforward approach is to simply fuse the two-view stereopsis depth maps together. That is, for each pair of the views, we can generate the depth map by computing disparities; then, we can project them all into world coordinates and merge them together, giving us a more complete model.
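The projection and merge steps can be sketched as follows. This is a minimal illustration, not a reference implementation: the pinhole intrinsics `fx, fy, cx, cy`, the camera-to-world pose `R, t`, and the voxel-averaging merge in `fuse` are all assumptions made for the example.

```python
from collections import defaultdict


def backproject(depth, fx, fy, cx, cy, R, t):
    """Lift a depth map (list of rows, None = no depth) to world-space points.

    Assumption: R (3x3 nested lists) and t (length 3) map camera coordinates
    to world coordinates, X_world = R @ X_cam + t.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z is None:
                continue
            # Pinhole back-projection into the camera frame.
            xc = ((u - cx) / fx * z, (v - cy) / fy * z, z)
            # Rigid transform into the shared world frame.
            points.append(tuple(
                sum(R[i][j] * xc[j] for j in range(3)) + t[i]
                for i in range(3)
            ))
    return points


def fuse(point_sets, voxel=0.05):
    """Merge point sets by averaging all points that fall in the same voxel."""
    cells = defaultdict(list)
    for pts in point_sets:
        for p in pts:
            key = tuple(int(c // voxel) for c in p)
            cells[key].append(p)
    return [tuple(sum(c) / len(ps) for c in zip(*ps)) for ps in cells.values()]
```

Averaging per voxel is only one way to merge; it removes duplicates where depth maps overlap but does nothing about outliers, which a real pipeline would need to filter.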

Plane Sweep Stereo

Plane-sweep stereo constructs a robust depth map for a single view by scanning the world plane by plane. That is, we pick a view as the reference, then sweep a plane parallel to its image plane along the optical axis; for each plane depth, we can compute the homographies that map the other views onto this plane; the depth for each pixel in the reference is the depth of the plane where the pixel's patches are most similar.
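A minimal sketch of the sweep, under several assumptions that are not part of the notes: all views share the intrinsics `f, cx, cy`; each source pose satisfies `X_src = R @ X_ref + t`; images are grayscale nested lists; patches are compared by a sum of absolute differences with nearest-neighbor sampling. For the plane `z = depth` in the reference frame, the induced homography is `K (R + t n^T / depth) K^-1` with `n = (0, 0, 1)`, which `warp` applies directly to one pixel.

```python
def warp(u, v, depth, cam):
    """Map reference pixel (u, v) into a source view through the homography
    induced by the plane z = depth. cam = (f, cx, cy, R, t)."""
    f, cx, cy, R, t = cam
    ray = ((u - cx) / f, (v - cy) / f, 1.0)  # K^-1 @ [u, v, 1]
    # (R + t n^T / depth) @ ray, where n^T @ ray = 1 for n = (0, 0, 1).
    x = [sum(R[i][j] * ray[j] for j in range(3)) + t[i] / depth
         for i in range(3)]
    if x[2] <= 0:
        return None  # the plane point is behind the source camera
    return f * x[0] / x[2] + cx, f * x[1] / x[2] + cy


def plane_sweep(ref, sources, cams, depths, radius=1, penalty=255.0):
    """Return a depth map for `ref`; border pixels are left as None."""
    h, w = len(ref), len(ref[0])
    depth_map = [[None] * w for _ in range(h)]
    for v in range(radius, h - radius):
        for u in range(radius, w - radius):
            best_cost, best_depth = float("inf"), None
            for d in depths:
                cost = 0.0
                for img, cam in zip(sources, cams):
                    for dv in range(-radius, radius + 1):
                        for du in range(-radius, radius + 1):
                            pix = warp(u + du, v + dv, d, cam)
                            if pix is not None:
                                ui, vi = round(pix[0]), round(pix[1])
                            if (pix is not None and 0 <= vi < len(img)
                                    and 0 <= ui < len(img[0])):
                                cost += abs(ref[v + dv][u + du] - img[vi][ui])
                            else:
                                cost += penalty  # sample fell outside the view
                if cost < best_cost:
                    best_cost, best_depth = cost, d
            depth_map[v][u] = best_depth
    return depth_map
```

The fixed out-of-view penalty is a crude choice: near the image borders it can favor the wrong plane, so a more careful version would normalize the cost by the number of valid samples.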

Space Carving

Space carving builds a 3D world model by iteratively "carving" a voxel grid. That is, we start out with a full grid. Then, we choose a voxel, project it onto the views, and remove it if it's not photo-consistent, meaning the colors it projects to disagree across the views. We repeat this until no more voxels can be removed.
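The carving loop can be sketched as below. This is a simplified illustration that ignores occlusion: every view is assumed to see every voxel, whereas a full implementation would only compare views in which the voxel is actually visible. The `(image, project)` view representation, the grayscale images, and the `tol` threshold are assumptions made for the example.

```python
def photo_consistent(voxel, views, tol=10.0):
    """True if the voxel projects to similar intensities in all views.

    Each view is (image, project), where project(voxel) returns integer
    pixel coordinates (u, v) or None if the voxel is not in front of it.
    """
    samples = []
    for image, project in views:
        pix = project(voxel)
        if pix is None:
            continue
        u, v = pix
        if 0 <= v < len(image) and 0 <= u < len(image[0]):
            samples.append(image[v][u])
    # With fewer than two observations there is nothing to contradict.
    return len(samples) < 2 or max(samples) - min(samples) <= tol


def carve(voxels, views, tol=10.0):
    """Remove inconsistent voxels, repeating until a full pass removes none."""
    kept = set(voxels)
    changed = True
    while changed:
        changed = False
        for voxel in sorted(kept):  # iterate over a copy while we carve
            if not photo_consistent(voxel, views, tol):
                kept.discard(voxel)
                changed = True
    return kept
```

Because visibility is not modeled here, the first pass already removes everything it ever will; the outer loop is kept to show where visibility would be recomputed after each round of carving.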

Visual Hull

The visual hull follows a similar idea to space carving but in the reverse direction: for each view, we segment out the object and then project its silhouette into 3D space. The final shape is the intersection of the back-projected volumes.
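On a voxel grid, intersecting the back-projected volumes is equivalent to keeping every voxel that lands inside the silhouette in all views. A minimal sketch, assuming binary silhouette masks and a hypothetical `project` function per view that returns integer pixel coordinates:

```python
def visual_hull(voxels, views):
    """Keep voxels whose projection lies inside the silhouette in every view.

    Each view is (mask, project): mask is a nested list of 0/1 values and
    project(voxel) returns integer pixel coordinates (u, v).
    """
    hull = []
    for voxel in voxels:
        inside_all = True
        for mask, project in views:
            u, v = project(voxel)
            in_bounds = 0 <= v < len(mask) and 0 <= u < len(mask[0])
            if not (in_bounds and mask[v][u]):
                inside_all = False  # outside one silhouette is enough
                break
        if inside_all:
            hull.append(voxel)
    return hull
```

Unlike the photo-consistency check in space carving, this test uses only the segmentation, so the result cannot recover concavities that never show up in any silhouette.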

Surface Expansion

Surface expansion revisits the feature matching idea from two-view stereopsis but adds robustness by changing our strategy for selecting matches. We start out with a sparse set of confident matches, then "expand" them to nearby locations, iteratively growing our matches from the most confident locations toward nearby semi-confident ones.
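The best-first growth can be sketched with a priority queue. This is an image-space simplification for a rectified pair, not a specific published algorithm: matches are stored as pixel-to-disparity entries, and the `score(u, v, d)` function, the `min_score` threshold, and the assumption that a neighbor's disparity is within one pixel of the current one are all hypothetical.

```python
import heapq


def expand_matches(seeds, score, width, height, min_score=0.5):
    """Grow a sparse set of matches, most confident first.

    seeds: dict mapping pixel (u, v) -> disparity for the confident matches.
    score(u, v, d): similarity for matching pixel (u, v) at disparity d;
    higher is better.
    """
    matches = dict(seeds)
    # heapq is a min-heap, so store negated scores to pop the best match first.
    heap = [(-score(u, v, d), u, v, d) for (u, v), d in seeds.items()]
    heapq.heapify(heap)
    while heap:
        _, u, v, d = heapq.heappop(heap)
        for nu, nv in ((u + 1, v), (u - 1, v), (u, v + 1), (u, v - 1)):
            if not (0 <= nu < width and 0 <= nv < height):
                continue
            if (nu, nv) in matches:
                continue
            # Search only disparities close to the one we are expanding from.
            s, nd = max((score(nu, nv, c), c) for c in (d - 1, d, d + 1))
            if s >= min_score:
                matches[(nu, nv)] = nd
                heapq.heappush(heap, (-s, nu, nv, nd))
    return matches
```

Pixels whose best nearby candidate stays below `min_score` are simply left unmatched, which is what keeps low-texture regions from polluting the result.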