Krylov Subspace, Lanczos Iteration, QR and Conjugate Gradient Motivation In practice, many numerical algorithms involve repeatedly multiplying by a matrix, such as the power method and the QR algorithm. All these algorithms share a single core construct, the Krylov subspace, and a single operation, the Lanczos iteration. So this note aims to understand this core.
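As a quick illustration of the repeated matrix-vector products these methods build on, here is a minimal power-iteration sketch in NumPy (the function name and iteration count are illustrative assumptions, not from the post):

```python
import numpy as np

def power_method(A, num_iters=100):
    """Estimate the dominant eigenpair of A by repeated multiplication.

    The iterates v, Av, A^2 v, ... span a Krylov subspace; power
    iteration keeps only the most recent (normalized) vector.
    """
    v = np.random.default_rng(0).standard_normal(A.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(num_iters):
        v = A @ v
        v /= np.linalg.norm(v)
    eigval = v @ A @ v  # Rayleigh quotient
    return eigval, v
```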
Jan 1, 2020
Note on Online Regression Algorithm Least Squares Problem Classical least-squares linear regression is $$ \hat \beta_{ls}=\arg\min_\beta\|y-X\beta\|^2_2 $$ With regularization it becomes a ridge or lasso regression problem, as written out below.
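For concreteness, the two regularized variants (standard textbook definitions, stated here for reference rather than quoted from the post) are

$$
\hat\beta_{ridge}=\arg\min_\beta\|y-X\beta\|_2^2+\lambda\|\beta\|_2^2,
\qquad
\hat\beta_{lasso}=\arg\min_\beta\|y-X\beta\|_2^2+\lambda\|\beta\|_1
$$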
Dec 15, 2019
Motivation Sometimes we want to examine the Hessian or Jacobian of a function w.r.t. some variables. For that purpose, automatic differentiation (autograd) can help us. Autograd mechanism In essence, autograd requires a computational graph (a directed acyclic graph). For each computational node (e.g. $z=f(x,y)$), we define a forward computation $(x,y)\mapsto z,\ z=f(x,y)$ mapping bottom to top, and a backward computation mapping the partial derivative w.r.t. the top into the partial derivatives w.r.t. the bottom: $g_z\mapsto (g_x,g_y),\ (g_x,g_y)=g(g_z;x,y)$.
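A minimal sketch of such a node for $z=xy$ (the class name `MulNode` is illustrative, not taken from any particular framework):

```python
class MulNode:
    """Computational node z = x * y with forward and backward passes."""

    def forward(self, x, y):
        # Cache the inputs; the backward pass needs them.
        self.x, self.y = x, y
        return x * y

    def backward(self, gz):
        # Chain rule: dL/dx = dL/dz * dz/dx = gz * y, symmetrically for y.
        gx = gz * self.y
        gy = gz * self.x
        return gx, gy
```

Chaining such nodes along the graph, forward from leaves to the output and backward from the output to the leaves, is exactly what frameworks like PyTorch automate.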
Dec 9, 2019
L-BFGS algorithm Motivation L-BFGS is one of the not-so-simple optimization algorithms that we may encounter in large-scale optimization problems. Not so simple means it is not simply a first-order algorithm, and the deviation from that is well motivated by theoretical arguments. So this note aims to understand this algorithm.
Nov 11, 2019
Note on Optimization on Manifolds A manifold is locally homeomorphic to a Euclidean space $\R^n$ of the same dimension, but globally it need not be flat. Besides, the Riemannian logarithm map connects the local tangent space $T_p$ to the neighbourhood of $p$ on the manifold. Thus, many local optimization algorithms that work on flat $\R^n$ space can work the same way on a neighborhood of a manifold. What is unique about working on a manifold is how to transport local direction information from one neighborhood into another.
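As a concrete sketch, here are the exponential and logarithm maps on the unit sphere (an assumed example manifold, not one the post singles out):

```python
import numpy as np

def sphere_exp(p, v):
    """Exponential map on the unit sphere: tangent vector v at p -> point."""
    norm_v = np.linalg.norm(v)
    if norm_v < 1e-12:
        return p
    return np.cos(norm_v) * p + np.sin(norm_v) * (v / norm_v)

def sphere_log(p, q):
    """Logarithm map on the unit sphere: point q -> tangent vector at p."""
    theta = np.arccos(np.clip(p @ q, -1.0, 1.0))  # geodesic distance
    if theta < 1e-12:
        return np.zeros_like(p)
    u = q - (p @ q) * p  # project q onto the tangent space at p
    return theta * u / np.linalg.norm(u)
```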
Sep 29, 2019
Problem Setting The original problem of non-negative matrix factorization is simple: if the dissimilarity $D(A\|HW)$ between the original matrix and the reconstructed one is the L2 (squared Frobenius) distance, then $$ \arg\min_{H,W} \|A-HW\|_F^2, \\ \text{s.t.}\ W\succeq0, H\succeq0 $$ The non-negativity constraint applies element-wise.
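One standard way to solve this objective is the Lee-Seung multiplicative update rule; a minimal NumPy sketch of that approach (a common baseline, not necessarily the method the post develops):

```python
import numpy as np

def nmf_multiplicative(A, rank, num_iters=200, eps=1e-10):
    """Factor A ~ H @ W with H, W >= 0 via multiplicative updates."""
    rng = np.random.default_rng(0)
    m, n = A.shape
    H = rng.random((m, rank))
    W = rng.random((rank, n))
    for _ in range(num_iters):
        # Each update multiplies by a non-negative ratio, so entries stay >= 0.
        W *= (H.T @ A) / (H.T @ H @ W + eps)
        H *= (A @ W.T) / (H @ W @ W.T + eps)
    return H, W
```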
Jun 25, 2019
Constrained CMA-ES Algorithm Target CMA-ES was originally used in unconstrained optimization. To adapt it to constrained optimization, we have to handle the boundary in some way. So how could it handle this geometric boundary?
Jun 20, 2019
Recently I have been reading [1]; these are my notes. Objective of the Algorithm The belief propagation algorithm aims to estimate marginal probabilities in graphical models such as Markov random fields and Bayesian networks, and to solve for the most probable states. This general algorithm goes by many names, such as sum-product, max-product, min-sum, and message passing, and belongs to the broader family of message-passing algorithms. At the same time, this algorithm can be seen as a general framework or philosophy, so it has many well-known special cases in models of different structures, and these concrete algorithms have their own names (such as the forward-backward algorithm, the Kalman filter, etc.). For statistical learning problems, we usually distinguish between model and algorithm: the model makes assumptions, abstracts some aspect of reality, and sets up the structure of the problem, while the algorithm solves the problem (often by converting it into an optimization problem). The belief propagation algorithm introduced in this post belongs to the latter, but to understand it we first need to understand its corresponding model, the probabilistic graphical model. Graphical Models: What relates graph to probability? Anyone encountering probabilistic graphical models for the first time (like me) will ask: what do probability and graphs have to do with each other? We know that a graph is an intuitive way to represent binary relations between objects, usually consisting of vertices and edges $(\mathcal V, \mathcal E)$. In a probabilistic graphical model, vertices usually represent random variables, while edges represent relations between the random variables.
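As a minimal illustration of message passing, here is a sum-product computation of one marginal on a three-variable chain MRF (the potentials are made-up toy values, not from the post):

```python
import numpy as np

# A tiny chain MRF x1 - x2 - x3 over binary variables, with assumed potentials.
phi1 = np.array([0.7, 0.3])   # unary potential for x1
phi2 = np.array([0.5, 0.5])   # unary potential for x2
phi3 = np.array([0.2, 0.8])   # unary potential for x3
psi = np.array([[1.0, 0.5],   # pairwise potential psi(x_i, x_j)
                [0.5, 1.0]])

# Sum-product messages into x2 from both neighbors:
# m_{1->2}(x2) = sum_{x1} phi1(x1) * psi(x1, x2), and likewise from x3.
m12 = psi.T @ phi1
m32 = psi.T @ phi3

# The belief at x2 is its unary potential times the incoming messages.
belief2 = phi2 * m12 * m32
marginal2 = belief2 / belief2.sum()
print(marginal2)
```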
May 18, 2019