Stochastic Optimization
Chapters
- General Terms and tools
- PCA
- PCA
- Hebbian Learning
- Kernel-PCA
- Source Separation
- ICA
- Infomax ICA
- Second Order Source Separation
- FastICA
- Stochastic Optimization
- Clustering
- k-means Clustering
- Pairwise Clustering
- Self-Organising Maps
- Locally Linear Embedding
- Estimation Theory
- Density Estimation
- Kernel Density Estimation
- Parametric Density Estimation
- Mixture Models - Estimation Models
- Density Estimation
Simulated Annealing
Simulated annealing is oriented in crystallization procedures in nature where the lowest energy state is achieved only when the temperature is lowered very slowly.
The temperature in nature is equivalent to fast the change in the system decreases. Parameters
- Cost function \(E : \underline s \mapsto E_{(\underline s)}\in |\!R\)
Pros
- Easy to implement
- Converges
Drawbacks
- expensive (takes forever)
Algorithm
Note that the original algorithm has an inner loop where \(\beta_t\) is not changed. I do not see a reason for it.
- while true
- choose new state randomly
- calculate difference in energy levels: \(\Delta E = E_{(\underline s)} - E_{(\underline s_t)}\)
- change state with probability: \(\frac{1}{1+e^{(\beta_t\Delta E)}}\)
- Update \(\beta_t = \tau\beta_{t-1}\) occasionally
Mean-field Annealing
The idea of mean-field annealing is to estimate \(P\) with \(Q\). \(Q\) is updated with every temperature level \(\beta\).
The transition probabilities between to states are symmetric.
Is Markov Process.
Gibbs-Boltzmann-distribution (is symmetric)
\[P_{(\underline s)} = \frac{1}{Z}exp(-\beta E)\]Factorizing distribution
\[\frac{1}{Z_Q}e^{-\beta\sum_k e_k s_k}\]Algorithm
- while true
- calculate mean-fields \(e_k\text{, }k=1,...,N\)
- calculate moments \((s_k)_Q\text{, }k=1,...,N\)
- until \(\lvert e_k^{old} -e_k^{new} \rvert <\varepsilon\)