Covariance Matrix Adaption Evolution Strategy

Covariance Matrix Adaptation Evolution Strategy (CMA-ES) is a robust and efficient evolutionary algorithm for solving difficult non-linear, non-convex optimization problems. It was developed by Nikolaus Hansen and Andreas Ostermeier in the late 1990s [1]. It leverages covariance matrix adaptation to efficiently explore and exploit the search space.

[1] N. Hansen and A. Ostermeier, “Completely Derandomized Self-Adaptation in Evolution Strategies”, in Evolutionary Computation, vol. 9, no. 2, pp. 159-195, June 2001, https://doi.org/10.1162/106365601750190398.

Basic Concepts

  • Population-Based Search: CMA-ES operates on a population of candidate solutions, evolving them over generations to improve the overall fitness.

  • Covariance Matrix: A key feature of CMA-ES is the adaptation of the covariance matrix, which models the distribution of the population and its correlations, allowing for effective search in the solution space.

Key Components

1. Initialization

  • Initialize a population of candidate solutions with a mean vector \(m\), step size \(\sigma\), and covariance matrix \(C\).

  • The initial mean vector is set based on prior knowledge or randomly.

2. Sampling

  • Generate new candidate solutions by sampling from a multivariate normal distribution:

    \[x_k = m + \sigma \cdot N\left(0, C\right)\]

    where:

    • \(x_k\) is the k-th candidate solution.

    • \(m\) is the mean vector.

    • \(\sigma\) is the step size.

    • \(N\left(0, C\right)\) is a normal distribution with mean 0 and covariance matrix \(C\).

3. Evaluation

  • Evaluate the fitness (or loss) of each candidate solution using a predefined fitness (loss) function.

4. Selection

  • Select the best-performing candidates based on their fitness values.

5. Recombination and Adaptation

  • Update the mean vector \(m\) to the weighted average of the selected candidates:

    \[m_\text{new} = \sum_{i=1}^{\mu} w_i \cdot x_i\]

    where:

    • \(w_i\) are the recombination weights.

    • \(x_i\) are the selected candidate solutions.

  • Adapt the covariance matrix \(C\) using the evolution path and selected candidates:

    \[C_\text{new} = (1 - c_1 - c_\mu) \cdot C + c_1 \cdot p_c \cdot p_c^T + c_\mu \cdot \sum_{i=1}^{\mu} w_i \cdot (x_i - m)(x_i - m)^T\]

    where:

    • \(c_1\) and \(c_\mu\) are learning rates.

    • \(p_c\) is the evolution path.

6. Step Size Adaptation

  • Adapt the step size \(\sigma\) based on the length of the evolution path:

    \[\sigma_\text{new} = \sigma \cdot \exp \left( \frac{c_\sigma}{d_\sigma} \left( \frac{||p_\sigma||}{E(||N(0,I)||)} - 1 \right) \right)\]

    where:

    • \(c_\sigma\) and \(d_\sigma\) are learning rates.

    • \(p_\sigma\) is the evolution path for the step size.

    • \(E(||N(0,I)||)\) is the expected length of a random vector from the standard normal distribution.

The process of sampling, evaluating, selecting, and adapting continues iteratively until a stopping criterion is met, such as a maximum number of generations or a satisfactory fitness level.

Advantages

  • CMA-ES is highly effective for solving complex, multi-modal optimization problems.

  • It requires minimal parameter tuning and adapts dynamically to the search landscape.

  • The algorithm’s robustness and adaptability make it suitable for a wide range of applications.