ยท 1 min read

For a 0th-order (derivative-free) optimization, you only need a way to generate samples around the current point (sampling) and a way to estimate the gradient from those samples. Several aspects will affect the performance of your optimizer:

  • Estimating gradient and Hessian to form a step
  • Handling constraints
  • Efficient sampling
  • Parametrization of the space or the landscape over the parameters
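The sampling-plus-gradient-estimation idea above can be sketched with a standard Gaussian-smoothing estimator (the names `zeroth_order_gradient` and the test function are illustrative, not from any particular library):

```python
import random

def zeroth_order_gradient(f, x, sigma=1e-2, n_samples=64):
    """Estimate grad f(x) from function values only.

    Gaussian smoothing with antithetic pairs:
    E[ (f(x + s*u) - f(x - s*u)) / (2s) * u ] = grad f(x) + O(s^2),
    for u ~ N(0, I). For a quadratic f the bias term vanishes.
    """
    d = len(x)
    grad = [0.0] * d
    for _ in range(n_samples):
        u = [random.gauss(0.0, 1.0) for _ in range(d)]
        f_plus = f([xi + sigma * ui for xi, ui in zip(x, u)])
        f_minus = f([xi - sigma * ui for xi, ui in zip(x, u)])
        coeff = (f_plus - f_minus) / (2.0 * sigma * n_samples)
        for i in range(d):
            grad[i] += coeff * u[i]
    return grad

# sanity check on f(x) = sum(x_i^2), whose gradient is 2x
random.seed(0)  # for reproducibility
f = lambda x: sum(xi * xi for xi in x)
g = zeroth_order_gradient(f, [1.0, -2.0], n_samples=500)
# g should be close to [2.0, -4.0]
```

More samples reduce the variance of the estimate; the antithetic (plus/minus) pair also cancels the constant `f(x)` baseline, which otherwise dominates the variance.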

Boundary Handling

Sampling from a bounded region in a high-dimensional space can be a challenge: rejection-based methods can be extremely inefficient, since a large fraction of the density may reside outside the support.

There are other strategies that help with this; see, for example:

http://dobigeon.perso.enseeiht.fr/papers/Dobigeon_TechReport_2007b.pdf
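As a toy illustration of both points: rejection sampling a standard Gaussian into a box has an acceptance rate that decays exponentially with dimension, while for an axis-aligned box the coordinates stay independent, so each can be drawn exactly as a 1-D truncated normal by inverse-CDF (the report above covers more general truncated-Gaussian samplers; this sketch assumes the simple box case):

```python
import random
from statistics import NormalDist

nd = NormalDist()

def truncated_normal_box(d, lo=-1.0, hi=1.0):
    """Exact sample from N(0, I) restricted to the box [lo, hi]^d.

    Inverse-CDF per coordinate: map a uniform draw into the CDF mass
    between lo and hi, then invert. No rejection needed.
    """
    a, b = nd.cdf(lo), nd.cdf(hi)
    return [nd.inv_cdf(a + random.random() * (b - a)) for _ in range(d)]

random.seed(0)
# rejection sampling: P(accept) = P(|z| <= 1)^d ~ 0.6827^d,
# which is already around 5e-4 at d = 20
d, trials = 20, 20000
accepted = sum(
    all(-1.0 <= random.gauss(0.0, 1.0) <= 1.0 for _ in range(d))
    for _ in range(trials)
)

x = truncated_normal_box(d)  # always succeeds in O(d) work
```

The inverse-CDF trick only works because the box factorizes over coordinates; for general linear or nonlinear constraints one needs the Gibbs-style samplers discussed in the linked report.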

Design Optimizers for BigGAN

Estimating Gradient and Hessian

Thanks to Dr. Holy for pointing out a line of research on derivative-free optimization: Powell's COBYLA, UOBYQA, NEWUOA, BOBYQA, and LINCOA algorithms.

The idea behind them is closely related to the L-BFGS algorithm (see the L-BFGS notes): update the Hessian approximation little by little, with a minimal change at each step.
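The core of these model-based methods is that the gradient and curvature are read off a quadratic model interpolating sampled function values. A minimal 1-D sketch of that idea (the helper names are mine, and real Powell methods maintain the model across iterations rather than rebuilding it):

```python
def quadratic_model_1d(f, x, h=1e-3):
    """Interpolate f at x-h, x, x+h with a quadratic; read off g and H.

    The model's slope and curvature come purely from function values,
    which is the 1-D core of model-based derivative-free optimization.
    """
    fm, f0, fp = f(x - h), f(x), f(x + h)
    g = (fp - fm) / (2 * h)            # model slope at x
    H = (fp - 2 * f0 + fm) / (h * h)   # model curvature at x
    return g, H

def newton_step_df(f, x, h=1e-3):
    """One Newton-like step using only the interpolation model."""
    g, H = quadratic_model_1d(f, x, h)
    return x - g / H if H > 0 else x - g  # fall back if not convex here

# minimize f(x) = (x - 3)^2 + 1 starting from x = 0: one model step
# lands at the minimizer, since the model is exact for quadratics
f = lambda x: (x - 3.0) ** 2 + 1.0
x1 = newton_step_df(f, 0.0)  # ~3.0
```

In practice you would not reimplement this: SciPy exposes Powell's COBYLA via `scipy.optimize.minimize(..., method="COBYLA")`, which handles constraints as well.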