29 items tagged
Motivation Despite the popularity and power of diffusion models, their theoretical formulations are not unified. Because multiple groups derived these models from different backgrounds, there exist multiple formulations: SDE, ODE, Markov chain, non-Markov chain, etc.
HiPPO: Recurrent Memory with Optimal Polynomial Projections Motivation The hidden state in an RNN represents a form of memory of the past. For a sequence, a natural way to represent the past is to project it onto an orthonormal basis set. Depending on which parts of the past we emphasize, we can define different measures on the time axis and define the basis set with respect to that measure. Then we can keep track of the projection coefficients onto this basis as new data points arrive.
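As a toy illustration (not the HiPPO online recurrence itself, just the batch projection it is designed to track): under a uniform measure, the memory of a signal on $[0, T]$ is its coefficient vector on the Legendre basis. A minimal numpy sketch, where the test signal, quadrature, and coefficient count are illustrative choices:

```python
import numpy as np
from numpy.polynomial import legendre

# Project a signal observed on [0, T] onto Legendre polynomials
# under a uniform measure; trapezoidal quadrature is illustrative.
def legendre_coefficients(ts, signal, n_coef=8):
    # Map observation times [0, T] onto the Legendre domain [-1, 1].
    x = 2 * ts / ts[-1] - 1
    coefs = []
    for n in range(n_coef):
        c = np.zeros(n + 1); c[n] = 1.0
        Pn = legendre.legval(x, c)                 # P_n evaluated at x
        # <f, P_n> / <P_n, P_n>, with <P_n, P_n> = 2 / (2n + 1) on [-1, 1]
        coefs.append((2 * n + 1) / 2 * np.trapz(signal * Pn, x))
    return np.array(coefs)

ts = np.linspace(0, 1, 500)
f = np.sin(6 * ts) + 0.3 * ts
c = legendre_coefficients(ts, f)   # the "memory" of the past as 8 numbers
```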
Motivation The S4 sequence model is rising in the sequence-modelling field. It dominates long-sequence modelling over RNNs, LSTMs, and transformers. It is both mathematically elegant and useful, and it is trending, so why not write about it.
Motivation Consider a distribution $p(x)$; we can “convolve” it with a kernel $p(\tilde{x}\mid x)=q(\tilde{x}-x)$. Denote the marginal distribution of $\tilde{x}$ as $p_\sigma(\tilde{x})$. We want to model the score of this convolved distribution, $\nabla\log p_\sigma(\tilde{x})$, and relate it to that of the original distribution.
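As a worked example, assuming a Gaussian kernel $q=\mathcal N(0,\sigma^2 I)$ (which the subscript $\sigma$ suggests), the marginal and its score are

$$
p_\sigma(\tilde{x})=\int q(\tilde{x}-x)\,p(x)\,dx,\qquad
\nabla_{\tilde{x}}\log p_\sigma(\tilde{x})=\mathbb{E}_{p(x\mid \tilde{x})}\!\left[\frac{x-\tilde{x}}{\sigma^2}\right],
$$

i.e. the score of the convolved distribution points in the posterior-expected direction back toward clean samples.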
Motivation How can we understand the EM algorithm from a theoretical perspective? This post views EM as a form of alternating ascent on a lower bound of the likelihood. The Key Trick of EM The key trick to remember is applying Jensen's inequality to the logarithm, so we can swap expectation and logarithm and obtain a lower bound on the likelihood. Generally, given a positive function $q(z)$ that sums to $1$ (a probability density), we have such an inequality,
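Concretely, the bound being referred to is

$$
\log p(x)=\log\sum_z q(z)\,\frac{p(x,z)}{q(z)}\;\ge\;\sum_z q(z)\log\frac{p(x,z)}{q(z)},
$$

with equality when $q(z)=p(z\mid x)$, which is exactly the E-step choice.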
Motivation Recently, a line of research emerged in generative image models, diffusion models, which showed competitive performance with GANs [^1]. More recently, a larger-scale version of it gave rise to the groundbreaking model DALL-E 2 and its precursor GLIDE.
Motivation Here we summarize a few common probabilistic neural population models. Adapted from reading notes and class presentations from Neuro QC316 taught by Jan Drugowitsch. LNP, GLM These are the simplest models of neurons.
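As a quick illustration of the LNP (linear-nonlinear-Poisson) cascade, here is a minimal simulation sketch; the filter shape, nonlinearity, and bin size are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

stimulus = rng.standard_normal(1000)           # white-noise stimulus
k = np.exp(-np.arange(20) / 5.0)               # linear receptive-field filter
drive = np.convolve(stimulus, k, mode="same")  # L: linear filtering
rate = np.exp(0.5 * drive - 1.0)               # N: pointwise nonlinearity
spikes = rng.poisson(rate * 0.01)              # P: Poisson spike counts per 10 ms bin
```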
Motivation Simply put, the “kernel trick” is the observation that sometimes only inner products appear in the formulation of an algorithm. Because of this, we can substitute the inner product with some fancier kernel function, i.e. an inner product in some other space. This post is about another usage of the kernel trick: kernel (ridge) regression.
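A minimal kernel ridge regression sketch in numpy, assuming an RBF kernel (the bandwidth and regularizer are illustrative); note that both fitting and prediction touch the data only through inner products, i.e. kernel evaluations:

```python
import numpy as np

def rbf(A, B, bandwidth=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * bandwidth ** 2))

def fit_predict(X, y, X_test, lam=1e-2):
    K = rbf(X, X)
    alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)  # dual coefficients
    return rbf(X_test, X) @ alpha                          # prediction via kernels only

X = np.linspace(0, 6, 50)[:, None]
y = np.sin(X[:, 0]) + 0.1 * np.random.default_rng(0).standard_normal(50)
y_hat = fit_predict(X, y, X)
```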
Motivation Understand the use of kernels in regression problems. For usage in unsupervised learning / dimensionality reduction, see the notes on Kernel PCA. Kernel in Classification Kernels are usually introduced in SVM classification problems. The rationale is that a linearly non-separable dataset can become separable in a high-dimensional feature space via a mapping $\phi:\mathcal X\to\mathcal F$.
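To see this rationale concretely, a small scikit-learn sketch (the dataset and kernel choices are illustrative): concentric circles defeat a linear SVM, but an RBF kernel, an implicit $\phi$ into an infinite-dimensional feature space, separates them easily.

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Concentric circles: not linearly separable in the plane.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

linear_acc = SVC(kernel="linear").fit(X, y).score(X, y)  # poor
rbf_acc = SVC(kernel="rbf").fit(X, y).score(X, y)        # near perfect
print(linear_acc, rbf_acc)
```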
Motivation There has been a resurgence of interest in investigating and developing Hopfield networks in recent years. This development is quite exciting in that it connects classic models in physics and machine learning to modern techniques like transformers.
Rationale The Hopfield network can be viewed as an energy-based model: all its properties derive from an energy function. A general RNN has many complex behaviors, but setting symmetric connections prohibits them: no oscillation is possible with a symmetric weight matrix.
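A minimal numpy sketch of this energy view (the random symmetric weights are illustrative): each asynchronous sign update can only decrease the energy $E(s)=-\tfrac12 s^\top W s$, so the dynamics settle into a fixed point rather than oscillating.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
W = rng.standard_normal((n, n))
W = (W + W.T) / 2                     # symmetric connections
np.fill_diagonal(W, 0)
s = rng.choice([-1.0, 1.0], size=n)

def energy(s):
    return -0.5 * s @ W @ s

for _ in range(10 * n):
    i = rng.integers(n)               # asynchronous: update one unit at a time
    s[i] = 1.0 if W[i] @ s >= 0 else -1.0
# energy(s) is non-increasing along this trajectory and converges to a fixed point
```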
Motivation Word2Vec is a very famous method that I have heard of since my freshman year in college (yeah, it came out in 2013). Recently, a reviewer reminded us of the similarity between the “analogies” learnt by the vector representations of words and the vector analogies in the image space of a GAN or VAE.
Problem Statement Given a bunch of noisy data, you want a smooth curve going through the cloud. As the points are noisy, there is no need to go through each point.
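A minimal smoothing-spline sketch using scipy (the noise level and smoothing heuristic are illustrative); the factor `s` trades fidelity against smoothness, and `s=0` would force interpolation through every noisy point:

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 100)
y = np.sin(x) + 0.3 * rng.standard_normal(100)

spline = UnivariateSpline(x, y, s=len(x) * 0.3 ** 2)  # s ~ n * noise variance
y_smooth = spline(x)   # smooth curve through the cloud, not through every point
```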
Philosophy The spirit of variational inference is to solve Bayesian inference problems with optimization. In the latent-factor scenario, it is not trying to use Bayes' rule directly, but to fit the posterior within a class of distributions $q(z;\nu)$, by minimizing the KL divergence between the two distributions.
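The standard identity behind this, for observed $x$ and latent $z$, is

$$
\log p(x)=\underbrace{\mathbb{E}_{q(z;\nu)}\!\left[\log\frac{p(x,z)}{q(z;\nu)}\right]}_{\mathrm{ELBO}(\nu)}+\mathrm{KL}\big(q(z;\nu)\,\|\,p(z\mid x)\big),
$$

so maximizing the ELBO over $\nu$ is the same as minimizing the KL divergence to the intractable posterior.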
Note on Gaussian Process A Gaussian process can be thought of as a Gaussian distribution in function space (or over infinite-dimensional vectors). One of its major usages is to tackle nonlinear regression problems and provide a mean estimate and an error bar around it.
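A minimal numpy sketch of GP regression with an RBF kernel (the bandwidth and noise level are illustrative), producing exactly that mean estimate and error bar:

```python
import numpy as np

def rbf(A, B, bw=1.0):
    return np.exp(-((A[:, None] - B[None, :]) ** 2) / (2 * bw ** 2))

rng = np.random.default_rng(0)
X = rng.uniform(0, 6, 20)
y = np.sin(X) + 0.1 * rng.standard_normal(20)
Xs = np.linspace(0, 6, 200)

noise = 0.1 ** 2
K = rbf(X, X) + noise * np.eye(len(X))
Ks = rbf(Xs, X)
mean = Ks @ np.linalg.solve(K, y)                  # posterior mean
cov = rbf(Xs, Xs) - Ks @ np.linalg.solve(K, Ks.T)  # posterior covariance
err = np.sqrt(np.clip(np.diag(cov), 0, None))      # error bar around the mean
```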
Note on Bayesian Optimization Related to the Gaussian Process model Philosophy Bayesian optimization applies to black-box functions, and it employs the active-learning philosophy. Use Case and Limitation BO is preferred in such cases
Motivation Sometimes the matrix (of samples) to be correlated is too large, so you need to compute the correlation while the data is pouring in, i.e. compute the correlation online.
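A minimal sketch of one common solution, Welford-style running updates (the class interface is illustrative): maintain running means and centered co-moments, and read the correlation off at any time.

```python
class OnlineCorrelation:
    """Online correlation between two scalar streams."""
    def __init__(self):
        self.n = 0
        self.mx = self.my = 0.0               # running means
        self.sxx = self.syy = self.sxy = 0.0  # centered (co)moment sums

    def update(self, x, y):
        self.n += 1
        dx = x - self.mx
        self.mx += dx / self.n
        dy = y - self.my
        self.my += dy / self.n
        # dx uses the old mean of x; (x - mx), (y - my) use the new means
        self.sxx += dx * (x - self.mx)
        self.syy += dy * (y - self.my)
        self.sxy += dx * (y - self.my)

    @property
    def corr(self):
        return self.sxy / (self.sxx * self.syy) ** 0.5
```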
Note on MiniMax (Updating) Motivation This is a very traditional way of solving turn-based games like chess or tic-tac-toe. Its high point was the Deep Blue AI playing chess. Note that some people think of the GAN training procedure as a min-max game between G and D, which is also interesting.
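A minimal sketch of the recursion (`game` here is a hypothetical interface, not any particular library): the maximizing player picks the child with the largest value, the minimizing player the smallest, down to terminal positions.

```python
def minimax(game, state, maximizing=True):
    """Exhaustive minimax over a game tree; no pruning for clarity."""
    if game.is_terminal(state):
        return game.score(state)   # e.g. +1 / 0 / -1 from the maximizer's view
    values = (minimax(game, game.play(state, m), not maximizing)
              for m in game.moves(state))
    return max(values) if maximizing else min(values)
```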
Note on Laplace-Beltrami (Diffusion) Operator Motivation The Laplacian on graphs and on discrete geometry (meshes) is a very useful tool. One core intuition: just like the Laplacian in $\mathbb{R}^n$ space, it is related to diffusion and the heat equation. Recall the diffusion equation is $\partial_t u = \Delta u$.
Spectral Graph Theory and Segmentation Motivation Spectral graph theory is a powerful tool as it sits at the center of multiple representations. It connects graphs, manifolds, and linear algebra. It relates to dynamics on graphs: Markov chains, random walks (diffusion). It can be applied to any point cloud; images and meshes are well suited. It can be used to perform clustering, segmentation, etc. Linear Algebra Review There are several ways to see an eigenvalue problem
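A minimal spectral-clustering sketch on a point cloud (the affinity bandwidth and the two-blob data are illustrative): build an RBF affinity graph, form the unnormalized Laplacian $L = D - W$, and split by the sign of the Fiedler vector (the eigenvector of the second-smallest eigenvalue).

```python
import numpy as np

rng = np.random.default_rng(0)
pts = np.vstack([rng.normal(0, 0.3, (30, 2)), rng.normal(3, 0.3, (30, 2))])

d2 = ((pts[:, None] - pts[None, :]) ** 2).sum(-1)
W = np.exp(-d2 / 0.5)                    # affinity graph on the point cloud
L = np.diag(W.sum(1)) - W                # unnormalized graph Laplacian
eigvals, eigvecs = np.linalg.eigh(L)     # ascending eigenvalues
labels = (eigvecs[:, 1] > 0).astype(int) # Fiedler vector separates the blobs
```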
Motivation Major Reference Zeroth-order optimization, or derivative-free optimization, is also known as the oracle problem. It is nothing new to the optimization community. Interest in ZOO algorithms resurged partly because they can be used in black-box adversarial attacks when the softmax probabilities are given; they can also be used to optimize experimental outputs, and for many design problems where the result has no analytical relationship with the parameters.
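A minimal sketch of one standard ZOO building block, the two-point random-direction gradient estimator (the smoothing radius, direction count, step size, and quadratic oracle are all illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def zo_gradient(f, x, mu=1e-3, n_dirs=20):
    """Estimate grad f(x) from function values only."""
    g = np.zeros_like(x)
    for _ in range(n_dirs):
        u = rng.standard_normal(x.shape)
        g += (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u
    return g / n_dirs

f = lambda x: ((x - 1.0) ** 2).sum()   # black-box oracle: values only, no gradients
x = np.zeros(5)
for _ in range(200):
    x -= 0.05 * zo_gradient(f, x)      # drifts toward the optimum x = 1
```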
Krylov Subspace, Lanczos Iteration, QR and Conjugate Gradient Motivation In practice, many numerical algorithms involve iteratively multiplying by a matrix, like the power method and the QR algorithm. All these algorithms have their core connected to a single construct, the Krylov subspace, and an operation, the Lanczos iteration. So this note is motivated to understand this core.
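A minimal Lanczos sketch for a symmetric matrix, kept deliberately bare (no reorthogonalization, so it is numerically fragile): it builds an orthonormal basis of the Krylov subspace $\mathrm{span}\{b, Ab, A^2b, \dots\}$ with a three-term recurrence.

```python
import numpy as np

def lanczos(A, b, k):
    """k steps of Lanczos on symmetric A; returns basis Q and tridiagonal T."""
    n = len(b)
    Q = np.zeros((n, k))
    alpha, beta = np.zeros(k), np.zeros(k - 1)
    q, q_prev = b / np.linalg.norm(b), np.zeros(n)
    for j in range(k):
        Q[:, j] = q
        w = A @ q
        alpha[j] = q @ w
        w -= alpha[j] * q + (beta[j - 1] * q_prev if j > 0 else 0)
        if j < k - 1:
            beta[j] = np.linalg.norm(w)
            q_prev, q = q, w / beta[j]
    # eigenvalues of T approximate the extreme eigenvalues of A
    T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    return Q, T
```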
Note on Online Regression Algorithms Least Squares Problem The classical least-squares linear regression is $$ \hat \beta_{ls}=\arg\min_\beta\|y-X\beta\|^2_2 $$ With regularization it becomes a ridge or lasso regression problem
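The display above states the batch problem; a minimal sketch of the online counterpart, recursive least squares, assuming samples $(x_t, y_t)$ arrive one at a time (the class interface is illustrative):

```python
import numpy as np

class RLS:
    """Recursive least squares via the Sherman-Morrison identity."""
    def __init__(self, dim, lam=1.0):
        self.P = np.eye(dim) / lam    # running (X^T X + lam I)^{-1}
        self.beta = np.zeros(dim)

    def update(self, x, y):
        Px = self.P @ x
        k = Px / (1.0 + x @ Px)              # gain vector
        self.beta += k * (y - x @ self.beta) # correct by the prediction error
        self.P -= np.outer(k, Px)            # rank-one downdate of the inverse
```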
Motivation Sometimes we want to examine the Hessian or Jacobian of a function w.r.t. some variables. For that purpose, automatic differentiation (autograd) can help us. Autograd Mechanism In essence, autograd requires a computational graph (a directed acyclic graph). For each computational node (e.g. $z=f(x,y)$), we define a forward computation $(x,y)\mapsto z,\ z=f(x,y)$ mapping bottom to top, and a backward computation mapping the partial derivative w.r.t. the top to the partial derivatives w.r.t. the bottom: $\partial_z\mapsto (\partial_x,\partial_y),\ (g_x,g_y)=g(g_z;x,y)$.
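A minimal reverse-mode sketch of exactly this node structure (in the style of micrograd; the class and method names are illustrative): each node stores its forward value, its parents, and a backward rule mapping $g_z$ to $(g_x, g_y)$.

```python
class Value:
    def __init__(self, data, parents=(), backward_rule=lambda gz: ()):
        self.data, self.grad = data, 0.0
        self.parents, self.backward_rule = parents, backward_rule

    def __mul__(self, other):
        # forward: z = x * y; backward: (g_x, g_y) = (g_z * y, g_z * x)
        return Value(self.data * other.data, (self, other),
                     lambda gz: (gz * other.data, gz * self.data))

    def __add__(self, other):
        # forward: z = x + y; backward: (g_x, g_y) = (g_z, g_z)
        return Value(self.data + other.data, (self, other),
                     lambda gz: (gz, gz))

    def backward(self):
        # reverse topological order: each node's grad is complete
        # before its backward rule fires
        topo, seen = [], set()
        def build(v):
            if id(v) not in seen:
                seen.add(id(v))
                for p in v.parents:
                    build(p)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for node in reversed(topo):
            for parent, g in zip(node.parents, node.backward_rule(node.grad)):
                parent.grad += g

x, y = Value(2.0), Value(3.0)
z = x * y + x      # dz/dx = y + 1 = 4, dz/dy = x = 2
z.backward()
```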
L-BFGS algorithm Motivation L-BFGS is one of the not-so-simple optimization algorithms that we may encounter in large-scale optimization problems. Not so simple means it is not simply a first-order algorithm, and the deviation from that is well motivated by theoretical arguments. So this note targets understanding this algorithm
Note on Optimization on Manifolds A manifold is locally similar to flat $\mathbb{R}^n$ space, but globally not: it is locally homeomorphic to a Euclidean space of the same dimension. Besides, there is a Riemannian logarithm map that connects the local tangent space $T_p$ to the neighbourhood of $p$ on the manifold. Thus, many local optimization algorithms that work on flat $\mathbb{R}^n$ space can work the same way on a neighborhood of a manifold. The unique thing about working on a manifold is how to transport the local direction information from one neighborhood to another.
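As a concrete example, a minimal sketch of the exponential and logarithm maps on the unit sphere, the simplest manifold where these maps have closed forms:

```python
import numpy as np

def exp_map(p, v):
    """Walk from p along tangent vector v (v orthogonal to p)."""
    norm_v = np.linalg.norm(v)
    if norm_v < 1e-12:
        return p
    return np.cos(norm_v) * p + np.sin(norm_v) * v / norm_v

def log_map(p, q):
    """Tangent vector at p pointing toward q, with length dist(p, q)."""
    theta = np.arccos(np.clip(p @ q, -1.0, 1.0))  # geodesic distance
    if theta < 1e-12:
        return np.zeros_like(p)
    u = q - (p @ q) * p                            # project q onto T_p
    return theta * u / np.linalg.norm(u)

p = np.array([1.0, 0.0, 0.0])
q = np.array([0.0, 1.0, 0.0])
v = log_map(p, q)
np.allclose(exp_map(p, v), q)   # True: exp and log invert each other locally
```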
Problem Setting The original problem of non-negative matrix factorization is simple: if the dissimilarity $D(A\|HW)$ between the original matrix and the reconstructed one is the L2 distance, then $$ \arg\min_{H,W} \|A-HW\|_F^2, \quad s.t.\ W\ge0,\ H\ge0 $$ The non-negativity constraint applies element-wise.
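A minimal sketch of the classic Lee-Seung multiplicative updates for this objective (the rank, iteration count, and the division guard `eps` are illustrative); the multiplicative form is what preserves element-wise non-negativity:

```python
import numpy as np

def nmf(A, rank, n_iter=200, eps=1e-9):
    """Factorize A (m x n) as H (m x rank) @ W (rank x n), all non-negative."""
    rng = np.random.default_rng(0)
    m, n = A.shape
    H = rng.uniform(size=(m, rank))
    W = rng.uniform(size=(rank, n))
    for _ in range(n_iter):
        H *= (A @ W.T) / (H @ W @ W.T + eps)  # multiplicative: stays >= 0
        W *= (H.T @ A) / (H.T @ H @ W + eps)
    return H, W
```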
Constrained CMA-ES Algorithm Target CMA-ES is originally used for unconstrained optimization. To adapt it to constrained optimization, we have to handle the boundary in some way. So how could it handle this geometric boundary?
Recently I have been reading 1; this post is my notes on it. Objective of Algorithm The Belief Propagation algorithm aims to estimate marginal probabilities in graphical models such as Markov random fields and Bayesian networks, and to find the most likely state. This general algorithm goes by many names, such as sum-product, max-product, min-sum, and Message Passing, and belongs to the broader family of Message Passing algorithms. It can also be regarded as a general framework or philosophy, so it has many famous special cases in models of different structures, each with its own name (e.g. the forward-backward algorithm, the Kalman filter, etc.) For statistical learning problems, one usually distinguishes between models and algorithms: a model makes assumptions, abstracts some aspect of reality, and sets up the structure of the problem, while an algorithm solves the problem (often by converting it into an optimization problem). The Belief Propagation algorithm introduced in this post belongs to the latter, but to understand it we first need to understand the model it operates on, i.e. probabilistic graphical models. Graphical Models: What relates graph to probability? Anyone encountering probabilistic graphical models for the first time (like me) will ask: what do probability and graphs have to do with each other? We know that a graph is an intuitive way to represent binary relations between things, usually consisting of vertices and edges $(\mathcal V, \mathcal E)$. In a probabilistic graphical model, vertices usually represent random variables, while edges represent relationships between the random variables.
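As a minimal concrete instance (with illustrative random potentials on a three-variable chain $x_1 - x_2 - x_3$): sum-product messages passed inward reproduce the exact marginal of $x_2$, which a brute-force sum over the joint confirms.

```python
import numpy as np

rng = np.random.default_rng(0)
k = 3                                # each variable takes k states
psi12 = rng.uniform(size=(k, k))     # potential on edge (x1, x2)
psi23 = rng.uniform(size=(k, k))     # potential on edge (x2, x3)

m1_to_2 = psi12.sum(axis=0)          # message: sum over x1 of psi12(x1, x2)
m3_to_2 = psi23.sum(axis=1)          # message: sum over x3 of psi23(x2, x3)
belief2 = m1_to_2 * m3_to_2
belief2 /= belief2.sum()             # belief = marginal p(x2) on a tree

# brute-force check against the full joint
joint = psi12[:, :, None] * psi23[None, :, :]
p2 = joint.sum(axis=(0, 2))
p2 /= p2.sum()
np.allclose(belief2, p2)             # True
```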