33 posts in this category
Motivation Despite the popularity and power of diffusion models, their theoretical formulations are not unified. Because multiple groups derived these models from different backgrounds, there exist several formulations: SDE, ODE, Markov chain, non-Markov chain, etc.
Jan 16, 2023
HiPPO: Recurrent Memory with Optimal Polynomial Projection Motivation The hidden state of an RNN is a form of memory of the past. For a sequence, a natural way to represent the past is to project it onto an orthonormal basis set. Depending on how we want to emphasize different parts of the past, we can define different measures on the time axis and define the basis set with respect to that measure. Then we keep track of the projection coefficients on this basis as new data points arrive.
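As a toy illustration of the projection step (my own sketch, assuming a uniform measure and a Legendre basis on a fixed window; HiPPO's actual contribution, an online recurrence for updating these coefficients as time grows, is not shown here):

```python
# My own toy sketch of the projection idea, not HiPPO's online recurrence:
# project a signal observed on [0, T] onto the first N Legendre polynomials
# (an orthogonal basis for the uniform measure on the window).
import numpy as np
from numpy.polynomial import legendre

def legendre_coeffs(f_vals, t_grid, N=8):
    """Projection coefficients c_n so that f(t) ~ sum_n c_n P_n(x(t))."""
    T = t_grid[-1]
    x = 2.0 * t_grid / T - 1.0            # map [0, T] onto [-1, 1]
    dx = x[1] - x[0]
    coeffs = []
    for n in range(N):
        e = np.zeros(n + 1); e[n] = 1.0
        Pn = legendre.legval(x, e)        # evaluate P_n on the grid
        # <f, P_n> / <P_n, P_n>, with <P_n, P_n> = 2 / (2n + 1) on [-1, 1]
        coeffs.append((2 * n + 1) / 2.0 * np.sum(f_vals * Pn) * dx)
    return np.array(coeffs)

# Example: a handful of coefficients summarizing a noisy ramp on [0, 1]
t = np.linspace(0.0, 1.0, 500)
f = t + 0.05 * np.random.default_rng(0).standard_normal(t.size)
print(legendre_coeffs(f, t))
```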
Jul 25, 2022
Motivation The S4 sequence model is rising in the sequence-modeling field. It dominates RNNs, LSTMs, and transformers on long-sequence modeling. It's both mathematically elegant and useful, and it's trending, so why not write about it.
Jul 17, 2022
Motivation How can we understand the EM algorithm from a theoretical perspective? This post views EM as a form of alternating ascent on a lower bound of the likelihood. The Key Trick of EM The key trick to remember is the use of Jensen's inequality on the logarithm, which lets us swap the expectation and the logarithm to obtain a lower bound on the likelihood. Generally, given a positive function $q(z)$ that sums to $1$ (a probability density), we have the following inequality.
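Spelling the bound out (using $p(x, z)$ for the joint distribution, a notational choice assumed here), Jensen's inequality for the concave logarithm gives

$$
\log p(x) \;=\; \log \sum_z q(z)\,\frac{p(x, z)}{q(z)} \;\ge\; \sum_z q(z)\,\log \frac{p(x, z)}{q(z)},
$$

with equality when $q(z) = p(z \mid x)$, which is exactly the choice the E step makes.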
May 13, 2022
Motivation There has been a resurgence of interest in investigating and developing Hopfield networks in recent years. This development is exciting in that it connects classic models from physics and machine learning to modern techniques like transformers.
Nov 15, 2021
Rationale The Hopfield network can be viewed as an energy-based model: all of its properties can be derived from the energy function. A general RNN has many complex behaviors, but making the connections symmetric prohibits most of them: with a symmetric weight matrix, no oscillation is possible (under asynchronous updates).
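To see that claim numerically, here is a minimal sketch (my own, assuming the standard binary Hopfield model with asynchronous sign updates): with symmetric weights and zero self-connections, the energy $E(s) = -\tfrac{1}{2} s^\top W s$ never increases, so the dynamics settle into a fixed point rather than oscillating.

```python
# Symmetric binary Hopfield net: energy is non-increasing under asynchronous updates.
import numpy as np

rng = np.random.default_rng(0)
n = 50
W = rng.standard_normal((n, n))
W = (W + W.T) / 2.0                  # symmetric connections
np.fill_diagonal(W, 0.0)             # no self-connections

s = rng.choice([-1.0, 1.0], size=n)  # random initial state

def energy(state):
    return -0.5 * state @ W @ state

E_prev = energy(s)
for sweep in range(10):
    for i in rng.permutation(n):     # asynchronous single-unit updates
        s[i] = 1.0 if W[i] @ s >= 0 else -1.0
    E = energy(s)
    assert E <= E_prev + 1e-9        # energy never increases, so no cycles
    E_prev = E
print("final energy:", E_prev)
```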
Motivations Many CNN models have become the bread and butter of the modern deep learning pipeline. Here I summarize some famous CNN architectures and their key innovations as I use them.
Jan 17, 2021
Motivation Word2Vec is a very famous method that I have heard of since my freshman year in college (yes, it came out in 2013). Recently, a reviewer reminded us of the similarity between the "analogies" learned by the vector representations of words and the vector analogies in the image space of GANs or VAEs.
Nov 27, 2020
Problem Statement Given a bunch of noisy data, you want a smooth curve going through the cloud. As the points are noisy, there is no need to go through each point.
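For concreteness, here is a minimal sketch of one common way to do this, a smoothing spline via SciPy's `UnivariateSpline` (the post's own method may differ):

```python
# Minimal smoothing-spline sketch: fit a smooth curve to a noisy cloud
# without forcing it through every point.
import numpy as np
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 200)
y = np.sin(x) + 0.3 * rng.standard_normal(x.size)   # noisy observations

# s is the smoothing factor: larger s -> smoother curve, looser fit
spline = UnivariateSpline(x, y, s=x.size * 0.3**2)
y_smooth = spline(x)
print("RMS error vs. the true curve:", np.sqrt(np.mean((y_smooth - np.sin(x))**2)))
```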
Oct 13, 2020
Note on Compiling Torch C Extensions Motivation Sometimes fusing operations in a C library, without going through Python, can accelerate your model, especially for key operations that occur often and process a lot of data.
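For a flavor of the workflow, here is a minimal sketch using PyTorch's documented JIT route, `torch.utils.cpp_extension.load_inline`, with a toy op (the post may instead build through `setup.py`):

```python
# Minimal JIT-compiled C++ extension (toy op; needs a C++ toolchain at runtime).
import torch
from torch.utils.cpp_extension import load_inline

cpp_source = """
// One C++ call in place of two separate Python-level ops.
at::Tensor sin_add(at::Tensor x, at::Tensor y) {
  return x.sin() + y.sin();
}
"""

ext = load_inline(
    name="toy_fused_ext",
    cpp_sources=[cpp_source],
    functions=["sin_add"],   # pybind11 bindings are generated for these names
)

x, y = torch.randn(4), torch.randn(4)
print(ext.sin_add(x, y))
```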
Sep 15, 2020