Generative cellular automata (GCA) is a 3D reconstruction method that takes inspiration from PixelCNN to “grow” a conditioned 3D model in voxel space. Like its namesake, the cellular automaton, this model mutates its state according to local rules, here formalized as a Markov chain.

Formally, we can represent a 3D shape in voxel space as $s = \{c_1, c_2, \dots\}$, a set of occupied cells $c_i \in \mathbb{Z}^3$. Then, to generate a shape, we follow the process

$$s^0 \rightarrow s^1 \rightarrow \cdots \rightarrow s^T, \qquad s^{t+1} \sim p_\theta(s^{t+1} \mid s^t),$$

starting from a seed state $s^0$ (e.g., the partial input) and ending at the completed shape $s^T$.
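To make the chain concrete, here is a minimal sketch of the outer sampling loop in PyTorch, assuming a `transition` module that maps the current occupancy grid to per-voxel occupancy probabilities (the names `transition`, `seed`, and `num_steps` are illustrative, not from the paper):

```python
import torch

@torch.no_grad()
def sample_shape(transition, seed, num_steps=100):
    """Run the GCA Markov chain s^0 -> s^1 -> ... -> s^T.

    seed: (1, 1, D, H, W) binary occupancy grid (the conditioning input).
    transition: module returning per-voxel occupancy probabilities.
    """
    state = seed
    for _ in range(num_steps):
        probs = transition(state)          # p_theta(s^{t+1} | s^t)
        state = torch.bernoulli(probs)     # sample each cell independently
    return state
```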

Within each transition, we assume independence across occupancies and generate the next state with local updates,

$$p_\theta(s^{t+1} \mid s^t) = \prod_{c \,\in\, \mathcal{N}(s^t)} p_\theta(s^{t+1}_c \mid s^t),$$

where $\mathcal{N}(s^t)$ is the set of cells within a fixed radius of the occupied cells of $s^t$, and $s^{t+1}_c$ is the occupancy of cell $c$.

Specifically, we apply (sparse) convolutions to each occupied cell $c \in s^t$ to get occupancy probabilities for its neighboring voxels, $p_\theta(s^{t+1}_{c'} \mid s^t)$ for $c'$ near $c$. After this, the product above runs over all cells in the neighborhood, and when several occupied cells predict the same voxel, we sample from the cell-wise average of their predicted probabilities.
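As a sketch of one transition, the module below uses ordinary dense 3D convolutions as a stand-in for the paper's sparse convolutions; the per-cell predictions and their averaging are folded into the CNN, and locality is enforced by masking probabilities to the dilated neighborhood $\mathcal{N}(s^t)$ (sizes and names are illustrative assumptions):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Transition(nn.Module):
    """One GCA transition: occupancy probabilities on the neighborhood
    N(s^t) of the current state (dense stand-in for sparse convolutions)."""

    def __init__(self, hidden=32, radius=1):
        super().__init__()
        self.radius = radius
        self.net = nn.Sequential(
            nn.Conv3d(1, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv3d(hidden, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv3d(hidden, 1, 3, padding=1),
        )

    def forward(self, state):
        logits = self.net(state)
        # N(s^t): cells within `radius` of an occupied cell (binary dilation).
        k = 2 * self.radius + 1
        neighborhood = (F.max_pool3d(state, k, stride=1, padding=self.radius) > 0).float()
        # Probabilities are exactly zero outside the reachable neighborhood.
        return torch.sigmoid(logits) * neighborhood
```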

To train our transitions, we seek to maximize

$$\log p_\theta\!\left(x \cap \mathcal{N}(s^t) \,\middle|\, s^t\right),$$

the probability that our next generation (within the reachable neighborhood $\mathcal{N}(s^t)$) is close to the ground-truth shape $x$. However, naively optimizing this using the sampling method above runs into difficulties, since a freely sampled chain can wander away from the target and we might have $x \cap \mathcal{N}(s^t) = \emptyset$, leaving no learning signal. The solution is to sample from an infusion chain instead during training,

$$q(\tilde{s}^{t+1}_c \mid \tilde{s}^t) = (1 - \alpha^t)\, p_\theta(\tilde{s}^{t+1}_c \mid \tilde{s}^t) + \alpha^t\, \mathbb{1}[c \in x], \qquad c \in \mathcal{N}(\tilde{s}^t),$$

where ๐Ÿ™ encourages the sampling to get closer to . By โ€œpushingโ€ our current shape toward the desired one, we can then maximize our next-generation probability above.

Continuous GCA

We can combine GCA with an encoder and decoder to form continuous GCA (cGCA), which generates smooth surfaces via an implicit function. Specifically, the encoder encodes the input into a sparse voxel latent representation; the GCA completes the shape, whose state is now defined by both occupancy and latent codes, $s = \{(c_i, z_{c_i})\}$; and the decoder predicts occupancy for a query point $q$ conditioned on the nearby latents, $p(o \mid q, s)$.
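As a rough sketch, the decoder can be read as a small MLP taking a query point together with the latent code of the nearest occupied voxel (nearest-voxel lookup here is a simplification of the paper's local aggregation; all names and sizes are illustrative):

```python
import torch
import torch.nn as nn

class ImplicitDecoder(nn.Module):
    """Predict occupancy at continuous query points from per-voxel latents."""

    def __init__(self, latent_dim=16, hidden=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 + latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, query, cells, latents):
        """query: (Q, 3) points; cells: (V, 3) occupied voxel centers;
        latents: (V, latent_dim) their codes."""
        d = torch.cdist(query, cells)      # (Q, V) distances to voxel centers
        idx = d.argmin(dim=1)              # nearest occupied voxel per query
        local = query - cells[idx]         # offset of the query inside that voxel
        feat = torch.cat([local, latents[idx]], dim=-1)
        return torch.sigmoid(self.mlp(feat)).squeeze(-1)   # occupancy in [0, 1]
```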