Fully convolutional networks (FCNs) are simply ๐Ÿ‘๏ธ Convolutional Neural Networks without flattening or fully connected layers. Our output is thus a structure feature map (specifically, a 3-dimensional tensor), which is helpful for dense prediction tasks like segmentation or depth prediction.

Segmentation

Notably, FCNs are excellent for semantic segmentation. To output a high-resolution segmentation map and incorporate coarse and fine information, we can collect feature activations from different depths in our FCN, as shown below. At the end, we use convolutions to map our final activations to per-pixel class probabilities.