Theory
ICA is an algorithm for source separation, also known as disentanglement. For example, if we have
Info
Note that although ICA shares a similar name with Principal Component Analysis, their objectives are unrelated.
Let
First, assume independence across sources, and let them come from some non-Gaussian distribution.
Converting to
Next, we assume that the cdf of the sources is a sigmoid, so the pdf is the derivative of the sigmoid.
Then, our log likelihood is
Finally, this can be maximized via Gradient Descent with the gradient
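Written out, the steps above correspond to the standard maximum-likelihood formulation of ICA. The notation here is an assumption, not taken from the text: $x \in \mathbb{R}^n$ is an observation, $s \in \mathbb{R}^n$ the sources, $A$ the mixing matrix, $W = A^{-1}$ the unmixing matrix with rows $w_j^\top$, $g$ the sigmoid, and $x^{(1)}, \dots, x^{(m)}$ the training examples.

$$
\begin{aligned}
x &= A s, \qquad s = W x \\
p(s) &= \prod_{j=1}^{n} p_s(s_j) \\
p(x) &= \prod_{j=1}^{n} p_s(w_j^\top x) \cdot |\det W| \\
g(s) &= \frac{1}{1 + e^{-s}}, \qquad p_s(s) = g'(s) = g(s)\,\bigl(1 - g(s)\bigr) \\
\ell(W) &= \sum_{i=1}^{m} \left( \sum_{j=1}^{n} \log g'\bigl(w_j^\top x^{(i)}\bigr) + \log |\det W| \right) \\
\nabla_W \ell(W) &= \sum_{i=1}^{m} \left( \bigl(1 - 2 g(W x^{(i)})\bigr)\, x^{(i)\top} + (W^\top)^{-1} \right)
\end{aligned}
$$

The $1 - 2g$ term comes from $\frac{d}{ds} \log g'(s) = 1 - 2 g(s)$, with $g$ applied elementwise to $W x^{(i)}$, and $(W^\top)^{-1}$ is the gradient of $\log |\det W|$.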
Training
Our loss function is the negative of the log likelihood above, and we optimize with gradient descent.
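A minimal NumPy sketch of this training loop, assuming the sigmoid-cdf source prior from the Theory section. The names `neg_log_likelihood` and `fit_ica`, the learning rate, the iteration count, the identity initialization, and the toy mixing matrix are all illustrative assumptions, and the loss is averaged over examples rather than summed.

```python
import numpy as np


def neg_log_likelihood(W, X):
    """Mean negative log likelihood of X (m x n) under a sigmoid-cdf source prior."""
    S = X @ W.T  # row i holds the estimated sources W x^(i)
    # log g'(s) = log g(s) + log(1 - g(s)), written with logaddexp for stability
    log_pdf = -np.logaddexp(0.0, -S) - np.logaddexp(0.0, S)
    _, logabsdet = np.linalg.slogdet(W)
    return -(log_pdf.sum(axis=1).mean() + logabsdet)


def fit_ica(X, lr=0.05, n_iters=300):
    """Full-batch gradient descent on the negative log likelihood."""
    m, n = X.shape
    W = np.eye(n)  # assumed initialization
    losses = [neg_log_likelihood(W, X)]
    for _ in range(n_iters):
        S = X @ W.T
        # 1 - 2 g(s) equals -tanh(s / 2), which avoids overflow in exp
        grad_ll = (-np.tanh(S / 2.0)).T @ X / m + np.linalg.inv(W).T
        W = W + lr * grad_ll  # ascent on the likelihood = descent on the loss
        losses.append(neg_log_likelihood(W, X))
    return W, losses


rng = np.random.default_rng(0)
S_true = rng.laplace(size=(2000, 2))     # independent non-Gaussian sources
A = np.array([[1.0, 0.5], [0.5, 1.0]])   # hypothetical mixing matrix
X = S_true @ A.T                         # observed mixtures x = A s
W, losses = fit_ica(X)
print(losses[0], losses[-1])
```

Full-batch updates keep the sketch short; a stochastic variant would apply the same gradient one example at a time.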
Prediction
To get the sources from the observed mixtures, multiply them by the learned unmixing matrix.
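As a small illustration of the prediction step, the sketch below applies an unmixing matrix `W` to observations stored as rows of `X`. The helper name `recover_sources` and the toy matrices are hypothetical, and the exact inverse of the mixing matrix stands in for a learned `W`.

```python
import numpy as np


def recover_sources(W, X):
    """Apply the unmixing matrix to each row of X (m x n): s^(i) = W x^(i)."""
    return X @ W.T


A = np.array([[1.0, 0.5], [0.5, 1.0]])        # hypothetical mixing matrix
S_true = np.array([[1.0, -2.0], [0.5, 3.0]])  # two toy source vectors (rows)
X = S_true @ A.T                              # observed mixtures x = A s
W = np.linalg.inv(A)                          # stand-in for a learned W
S_hat = recover_sources(W, X)
```

With a learned `W`, the recovered sources match the true ones only up to permutation and scaling, since ICA cannot identify the order or the magnitude of the sources.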