33 items tagged
Motivation Given the popularity and power of diffusion models, it is striking that their theoretical formulations are not in unison. Because multiple groups have derived these models from different backgrounds, there exist multiple formulations: SDE, ODE, Markov chain, non-Markov chain, etc.
HiPPO: Recurrent Memory with Optimal Polynomial Projections Motivation The hidden state in an RNN represents a form of memory of the past. For a sequence, a natural way to represent the past is to project it onto an orthonormal basis set. Depending on how we want to emphasize different parts of the past, we can define different measures on the time axis and build the basis set from that measure. Then we can keep track of the projection coefficients onto this basis as new data points arrive.
[TOC] Motivation The S4 sequence model is on the rise in the sequence-modelling field. It dominates RNNs, LSTMs, and Transformers on long-sequence modelling. It is both mathematically elegant and useful, and it is trending, so why not write about it.
Motivation How can we understand the EM algorithm from a theoretical perspective? This post tries to understand EM as a form of alternating ascent on a lower bound of the likelihood. The Key Trick of EM The key trick to remember is the use of Jensen's inequality on the logarithm, which lets us swap the expectation and the logarithm and obtain a lower bound on the likelihood. Generally, given a positive function $q(z)$ that sums to $1$ (a probability density), we have the following inequality,
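presumably the standard evidence lower bound (a reconstruction of the truncated formula, in the usual notation):

$$\log p(x) = \log \int q(z)\,\frac{p(x,z)}{q(z)}\,dz \;\ge\; \int q(z)\,\log\frac{p(x,z)}{q(z)}\,dz,$$

which follows from Jensen's inequality because $\log$ is concave.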
Motivation There has been a resurgence of interest in investigating and developing Hopfield networks in recent years. This development is quite exciting in that it connects classic models from physics and machine learning to modern techniques like Transformers.
Rationale The Hopfield network can be viewed as an energy-based model: all of its properties can be derived from the energy function. A general RNN has many complex behaviors, but imposing symmetric connections prohibits them: a symmetric weight matrix guarantees an energy function that decreases along the dynamics, so no oscillation is possible.
Motivations Many CNN models have become the bread and butter of modern deep learning pipelines. Here I summarize some famous CNN architectures and their key innovations as I use them.
Motivation Word2Vec is a very famous method that I have heard about since my freshman year in college (yes, it came out in 2013). Recently, a reviewer reminded us of the similarity between the "analogies" learnt by the vector representations of words and the vector analogies in the image space of a GAN or VAE.
Problem Statement Given a bunch of noisy data, you want a smooth curve going through the cloud. Since the points are noisy, there is no need to go through each point exactly.
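One common way to do this (a minimal sketch, not necessarily the method the post develops) is a smoothing spline, where the parameter `s` trades off fidelity against smoothness:

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 100)
y = np.sin(x) + 0.3 * rng.standard_normal(100)  # noisy cloud of points
# s controls smoothness: larger s -> smoother curve, farther from the points
spline = UnivariateSpline(x, y, s=len(x) * 0.3 ** 2)
y_smooth = spline(np.linspace(0, 10, 500))
```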
Note on Compiling Torch C Extensions Motivation Sometimes fusing operations in a C library rather than in Python can accelerate your model, especially for key operations that occur often and through which lots of data pass.
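As a minimal sketch of the standard build setup (the names `fused_op` and `fused_op.cpp` are placeholders, not from the note), PyTorch's `torch.utils.cpp_extension` can compile such an extension:

```python
# setup.py
from setuptools import setup
from torch.utils.cpp_extension import CppExtension, BuildExtension

setup(
    name='fused_op',                                         # placeholder package name
    ext_modules=[CppExtension('fused_op', ['fused_op.cpp'])],
    cmdclass={'build_ext': BuildExtension},                  # injects torch include/lib paths
)
```

After `pip install .` (or `python setup.py install`), the compiled module can be imported from Python like any other.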
Note on Gaussian Process A Gaussian process can be thought of as a Gaussian distribution in function space (or over infinite-dimensional vectors). One of its major uses is to tackle nonlinear regression problems and provide a mean estimate with an error bar around it.
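A minimal numpy sketch of that posterior mean and error bar, assuming an RBF kernel and a noise level `sigma_n` (both choices are illustrative, not from the note):

```python
import numpy as np

def rbf(A, B, length=1.0):
    """Squared-exponential (RBF) kernel between two 1-D point sets."""
    d = A[:, None] - B[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

rng = np.random.default_rng(0)
X = np.linspace(0, 5, 20)                      # training inputs
y = np.sin(X) + 0.1 * rng.standard_normal(20)  # noisy observations
Xs = np.linspace(0, 5, 100)                    # test inputs
sigma_n = 0.1                                  # assumed observation-noise std

K = rbf(X, X) + sigma_n ** 2 * np.eye(len(X))  # train-train covariance + noise
Ks = rbf(Xs, X)                                # test-train covariance
mean = Ks @ np.linalg.solve(K, y)              # posterior mean
cov = rbf(Xs, Xs) - Ks @ np.linalg.solve(K, Ks.T)
std = np.sqrt(np.clip(np.diag(cov), 0, None))  # error bar around the mean
```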
Note on Bayesian Optimization Related to the Gaussian process model. Philosophy Bayesian optimization applies to black-box functions, and it employs the active-learning philosophy. Use Case and Limitation BO is preferred in such cases
Environment Bug https://github.com/rosinality/stylegan2-pytorch/issues/70 Compiler-not-found bug We need to change compiler_bindir_search_path in ./stylegan2/dnnlib/tflib/custom_ops.py so that it points to the C compiler on the machine. Note: Visual Studio 2019 is not supported, so you have to use 2017!
Motivation This is a simple example: https://github.com/ProGamerGov/pytorch-old-tensorflow-models

```python
if pretrained:
    self.load_state_dict(
        torch.hub.load_state_dict_from_url(model_urls['inceptionv1'], progress=progress)
    )
```

The official blog about how to use this is here. Hosting Weights The major challenge is to publish the weights online. For that you need a public file-hosting service, which Google Drive and OneDrive can provide.
Note on MiniMax (Updating) Motivation This is a very traditional way of solving turn-based games like chess or tic-tac-toe. Its climax was the Deep Blue AI playing chess. Note that some people think of the GAN training procedure as a min-max game between G and D, which is also interesting.
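For reference, the bare minimax recursion looks like the sketch below (the game interface `is_terminal`, `score`, `moves`, `apply` is hypothetical, not from the note):

```python
def minimax(state, maximizing):
    # hypothetical game interface: is_terminal(), score(), moves(), apply(move)
    if state.is_terminal():
        return state.score()  # e.g. +1 / 0 / -1 from the maximizer's point of view
    values = [minimax(state.apply(m), not maximizing) for m in state.moves()]
    return max(values) if maximizing else min(values)
```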
Reinforcement learning deals with an environment and rewards. Agents have a set of actions $a_j$ with which to interact with the environment (state $s_i$); the environment is changed by these actions, and from time to time a reward comes from the environment!
Note on GAN Notes with reference to the YouTube lecture series by Hongyi Li. Architecture Developments Self Attention Used in Self-Attention GAN and BigGAN

```python
import torch
import torch.nn as nn

class Self_Attn(nn.Module):
    """Self-attention layer."""
    def __init__(self, in_dim, activation):
        super(Self_Attn, self).__init__()
        self.channel_in = in_dim
        self.activation = activation
        self.query_conv = nn.Conv2d(in_channels=in_dim, out_channels=in_dim // 8, kernel_size=1)
        self.key_conv = nn.Conv2d(in_channels=in_dim, out_channels=in_dim // 8, kernel_size=1)
        self.value_conv = nn.Conv2d(in_channels=in_dim, out_channels=in_dim, kernel_size=1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learnable gate on the attention branch
        self.softmax = nn.Softmax(dim=-1)

    def forward(self, x):
        """
        inputs:
            x: input feature maps (B x C x W x H)
        returns:
            out: self-attention value + input feature
            attention: B x N x N (N = W * H)
        """
        m_batchsize, C, width, height = x.size()
        proj_query = self.query_conv(x).view(m_batchsize, -1, width * height).permute(0, 2, 1)  # B x N x C'
        proj_key = self.key_conv(x).view(m_batchsize, -1, width * height)                       # B x C' x N
        energy = torch.bmm(proj_query, proj_key)                                                # B x N x N
        attention = self.softmax(energy)
        proj_value = self.value_conv(x).view(m_batchsize, -1, width * height)                   # B x C x N
        out = torch.bmm(proj_value, attention.permute(0, 2, 1))
        out = out.view(m_batchsize, C, width, height)
        out = self.gamma * out + x  # residual connection with learnable gate
        return out, attention
```

Style GAN BigGAN Conditional GAN Text Conditioning Text is processed and combined with the noise vector.
Understanding of Adversarial Attack Why should we care about adversarial examples? Why are they surprising? Adversarial examples and feature visualization are intrinsically related: the basic algorithm underlying both is to generate an image that maximally activates a unit. For an adversarial attack, the point is to change the image in an imperceptible way while changing the network output dramatically. Feature visualization, on the other hand, generates a perceptually meaningful pattern representing a natural image that the unit likes.
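One concrete instance of this gradient-based idea is the fast gradient sign method (FGSM, Goodfellow et al. 2015); a minimal sketch, where `model` and `loss_fn` are placeholders for any classifier and its loss:

```python
import torch

def fgsm(model, loss_fn, x, y, eps=0.01):
    """One-step attack: move x by eps in the sign of the loss gradient."""
    x_adv = x.detach().clone().requires_grad_(True)
    loss = loss_fn(model(x_adv), y)
    loss.backward()
    # imperceptible perturbation that maximally increases the loss locally
    return (x_adv + eps * x_adv.grad.sign()).detach()
```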
TOC {:toc} Deep Learning Environment Currently we find that multiple versions of CUDA can be installed on Windows, and different frameworks can nicely use different CUDA versions together. PyTorch Tensorflow Co-environment Currently, we can have
Objective Here I want to compare several common deep learning frameworks and make sense of their workflows. Core Logic Tensorflow General Comments: TF is more like a library, in which many low-level operations are defined and programs are long. In contrast, Keras, which can use TensorFlow as a backend, has a similar level of abstraction to PyTorch, i.e. it is a higher-level deep learning package. TFLearn may also be a higher-level wrapper.
Motivation Sometimes we want to examine the Hessian or Jacobian of a function w.r.t. some variables. For that purpose, the autograd algorithm can help us. Autograd mechanism In essence, autograd requires a computational graph (a directed acyclic graph). For each computational node (e.g. $z=f(x,y)$), we define a forward computation $(x,y)\mapsto z,\ z=f(x,y)$ mapping bottom to top, and a backward computation mapping the partial derivative w.r.t. the top to the partial derivatives w.r.t. the bottom: $g_z \mapsto (g_x, g_y),\ (g_x, g_y) = g(g_z; x, y)$, where $g_v$ denotes the gradient of the final output w.r.t. $v$.
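In PyTorch, `torch.autograd.functional` exposes this machinery directly; a minimal sketch on a toy scalar function:

```python
import torch
from torch.autograd.functional import jacobian, hessian

def f(x):
    # scalar-valued test function: f(x) = sum(x^3)
    return (x ** 3).sum()

x = torch.randn(4)
J = jacobian(f, x)  # gradient of f, shape (4,): equals 3 * x**2
H = hessian(f, x)   # Hessian of f, shape (4, 4): equals diag(6 * x)
```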
Installation Official note on installation: https://caffe.berkeleyvision.org/installation.html Installing the CPU version on CHPC Install Miniconda, then install caffe using conda: conda install -c intel caffe. The system (lsb_release -d) reports: CentOS release 6.10 (Final). Building the GPU version on CHPC (not succeeded yet… aborted)
Note on CNN Interpretability Two major ways of interpreting CNNs Feature visualization: see what a hidden neuron is interested in. Attribution: see what part of an image activates a filter or detector. Activation Atlas These works try to build a toolkit for visualizing deep NNs and a human-computer interface to them.
Note on Feature Visualization Motivation We want to understand what the hidden units "represent": What are they tuned to? What are their favorite stimuli? Why should we find the stimuli that excite them most? Resources DeepDream.ipynb Tensorflow
Deep Unsupervised Learning Lecture notes from Berkeley cs294: https://sites.google.com/view/berkeley-cs294-158-sp19/home Lec 1 Category Generative models; non-generative representation learning. Motivation for Unsupervised Learning Application Generate/predict fancy samples. Detect anomalies / deviations from the distribution, which humans can do quite well without training. Data compression (because of predictability). Use the inner representation to do other tasks! Pre-training. Type of Question Core question: modeling a distribution
Using Google Cloud Service for Large-Scale Image Labelling Installing Google SDK https://cloud.google.com/sdk/docs/quickstart-windows Create a new Google Cloud Platform project and download the Google Cloud SDK. After installation, run gcloud init and log in to your account there! Select the GCP project and the computing zone to finish the SDK configuration! Installing the Google API for different programs (like Vision, which we use) https://cloud.google.com/python/
TOC {:toc} Objective Build the software environment for scientific computing, data analysis, and deep learning on a GPU-enabled Linux workstation. This post mainly summarizes the tools and references for building up a Linux working environment. I'll update the errors and troubleshooting notes as I encounter them.
DeepLabCut Trouble Shooting @(Ponce Lab) TOC {:toc} Install DLC On a Windows machine, follow the steps in the install tutorial to set up the whole conda environment. Fail at first step Many of us just fail at the first step, with error messages like
Motivation Although there are a myriad of methods for neural and behavioral signal recording, the questions asked about the neural data are usually less diverse. Ultimately, everything is a number, and we process numbers with algorithms.
Note on Automatic 3D Instance Segmentation Pipeline In this note I try to summarize several recent works on automatic 3D instance segmentation, with the most direct application being the saturated reconstruction of neural morphology in an imaging volume (mostly scanning electron microscopy, though it seems it can be generalized to other imaging modalities), which is one of the most important methods of high-throughput connectomics1.
How to automatically analyze behavior videos? DeepLabCut is a powerful tool to rapidly1 train a neural network (based on ResNet) to track keypoints in movement videos, esp. those of moving humans or animals. Thus it is a game-changing tool for all kinds of behavior quantification for neuroscience and psychology researchers (it can be applied to nearly any behavioral-science topic, e.g. motor learning, motor control, facial expression, social interaction…). The workflow is relatively simple, it scarcely takes any time once the network has been trained, and the video analysis can be done automatically. Because of this, it is really favorable for researchers doing long-term ecological video recording.
TOC {:toc} Task description Finding/generating the stimuli that evoke the strongest response from a neuron in a visual system is in essence an optimization problem. But the optimization task at hand has several unique features that are essential to the choice of optimization algorithm, for example,
Preface Since starting my PhD, I have planned to regularly write notes recording the interesting and beautiful things I have recently learned. If I were doing a PhD in mathematical physics or theoretical neuroscience, such notes would read like an expedition journal through the mathematical world and could be called "This week's findings": roughly, what books I read this week, what mathematics I learned, what models I played with, what tricks or math games I discovered, what beautiful figures I made. (See This Week's Finds in Mathematical Physics, a weekly mathematics note that a mathematical-physics professor at UCR kept up for over a decade and that unearthed many interesting things.) In reality, however, I am doing a PhD in neuroscience, so I can probably only write about learning rather than findings. (And as for the brain, one can hardly learn anything new in a single week…) The content of these notes will therefore be more of a mixture: part of it technical (newly learned mathematics, statistical methods, machine-learning methods, perhaps new experimental techniques and the physics behind them), and part of it conceptual (neural or psychological experimental results heard at recent seminars, or related philosophical discussions that interest me). I expect I will gradually discover which content is better suited for sharing, and which content is more helpful to write down for myself and for readers; after a period of adjustment, this post series should develop a style of its own.