Computer Vision

Archetypal SAE: Adaptive and Stable Dictionary Learning for Concept Extraction in Large Vision Models

Jan 1, 2025

Understanding Learning Dynamics of Neural Representations via Feature Visualization at Scale

Nov 1, 2023

Note on Classic CNNs

Motivation: Many CNN models have become the bread and butter of modern deep learning pipelines. Here I summarize some famous CNN architectures and their key innovations as I use them.

Jan 17, 2021

Compiling C Extensions for PyTorch

Note on Compiling Torch C Extensions. Motivation: Sometimes fusing operations in a C library, without going through Python, can accelerate your model, especially for key operations that occur often and that a lot of data passes through.

Sep 15, 2020

Debugging StyleGAN2 in PyTorch

Environment bug (https://github.com/rosinality/stylegan2-pytorch/issues/70): compiler not found. We need to change compiler_bindir_search_path in ./stylegan2/dnnlib/tflib/custom_ops.py so that it points to the C compiler on the machine. Note that Visual Studio 2019 is not supported, so we have to use 2017!

Jun 10, 2020

Note on Laplacian Operator (Diffusion) in Geometry Processing

Note on the Laplace-Beltrami (Diffusion) Operator. Motivation: The Laplacian on graphs and on discrete geometry (meshes) is a very useful tool. One core intuition: just like the Laplacian in $\mathbb{R}^n$, it is related to diffusion and the heat equation. Recall the diffusion equation is

May 8, 2020

Spectral Graph Theory and Segmentation

Motivation: Spectral graph theory is a powerful tool because it sits at the intersection of multiple representations. It connects graphs and manifolds with linear algebra. It is related to dynamics on graphs, such as Markov chains and random walks (diffusion). It can be applied to any point cloud; images and meshes are well suited. It can be used to perform clustering, segmentation, etc. Linear Algebra Review: There are several ways to see an eigenvalue problem
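To make the clustering use case concrete, here is a minimal sketch (an illustration, not code from the post): it splits a small graph in two using the sign of the Fiedler vector, the eigenvector of the graph Laplacian $L = D - A$ with the second-smallest eigenvalue. The function name `fiedler_partition`, the toy graph, and the pure-Python power iteration are all assumptions chosen to keep the example dependency-free; in practice one would more likely call an eigensolver such as `numpy.linalg.eigh`.

```python
def fiedler_partition(edges, n, iters=500):
    """Split an undirected graph into two groups by the sign of the Fiedler vector.

    Hypothetical helper for illustration; edges is a list of (i, j) pairs
    over nodes 0..n-1, and the graph is assumed to be connected.
    """
    # Build the graph Laplacian L = D - A as a dense matrix.
    L = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        L[i][j] -= 1.0
        L[j][i] -= 1.0
        L[i][i] += 1.0
        L[j][j] += 1.0

    # Power iteration on M = c*I - L. Since c is at least the largest
    # eigenvalue of L (bounded by twice the max degree), the dominant
    # eigenvector of M orthogonal to the constant vector is the Fiedler vector.
    c = 2.0 * max(L[i][i] for i in range(n))
    v = [float(i) for i in range(n)]  # deterministic, non-constant start
    for _ in range(iters):
        # Project out the constant vector (the eigenvalue-0 mode of L).
        mean = sum(v) / n
        v = [x - mean for x in v]
        # Apply M to v.
        v = [c * v[i] - sum(L[i][j] * v[j] for j in range(n)) for i in range(n)]
        # Normalize to avoid overflow.
        norm = sum(x * x for x in v) ** 0.5
        v = [x / norm for x in v]
    return [x > 0 for x in v]


# Toy graph: two triangles (0,1,2) and (3,4,5) joined by the bridge 2-3.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
labels = fiedler_partition(edges, 6)
print(labels)
```

The two triangles end up with opposite labels, which is the cut through the single bridging edge. Which side gets `True` depends on the sign of the eigenvector, which is arbitrary in general.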

Apr 22, 2020

Note on Photometric Reasoning

Shape $\hat n$, lighting $l$, and reflectance $\rho$ affect image appearance $I$. Can we infer them back? $$ I = \rho \langle \hat n, l \rangle $$ How much do shading and photometric effects tell us about shape in natural settings?
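A minimal sketch of the forward model above, evaluated for a single surface point. The function name `lambertian_intensity` is hypothetical, and two details are assumptions beyond the equation as written: the normal is normalized inside the function, and negative dot products are clamped to zero so that surfaces facing away from the light render black.

```python
import math


def lambertian_intensity(albedo, normal, light):
    """Evaluate I = rho * <n_hat, l> for one surface point.

    albedo: reflectance rho (scalar)
    normal: surface normal (need not be unit length; normalized here)
    light:  lighting vector l (direction scaled by light strength)
    """
    # Normalize the normal so that it matches n_hat in the equation.
    length = math.sqrt(sum(c * c for c in normal))
    n_hat = [c / length for c in normal]
    # Inner product <n_hat, l>.
    dot = sum(n * l for n, l in zip(n_hat, light))
    # Assumption: clamp at zero for points facing away from the light.
    return albedo * max(0.0, dot)


# Surface facing the light head-on, then tilted 45 degrees away from it.
head_on = lambertian_intensity(0.5, (0.0, 0.0, 2.0), (0.0, 0.0, 1.0))
tilted = lambertian_intensity(
    0.5, (0.0, 0.0, 1.0), (math.sin(math.pi / 4), 0.0, math.cos(math.pi / 4))
)
print(head_on, tilted)
```

The inverse problem asked about in the note is harder than this forward pass: a single intensity constrains only the product of albedo and the cosine term, so shape, lighting, and reflectance cannot be separated from one pixel alone.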

Mar 17, 2020

Note on Generative Adversarial Network

Note on GAN. Note with reference to the YouTube lecture series by Hongyi Li. Architecture Developments: Self Attention. Used in Self-Attention GAN and BigGAN.

```python
import torch
import torch.nn as nn


class Self_Attn(nn.Module):
    """Self-attention layer."""

    def __init__(self, in_dim, activation):
        super(Self_Attn, self).__init__()
        self.chanel_in = in_dim
        self.activation = activation

        # 1x1 convolutions: query and key are projected down to in_dim // 8 channels.
        self.query_conv = nn.Conv2d(in_channels=in_dim, out_channels=in_dim // 8, kernel_size=1)
        self.key_conv = nn.Conv2d(in_channels=in_dim, out_channels=in_dim // 8, kernel_size=1)
        self.value_conv = nn.Conv2d(in_channels=in_dim, out_channels=in_dim, kernel_size=1)
        # Learned residual weight, initialized to zero so the layer starts as the identity.
        self.gamma = nn.Parameter(torch.zeros(1))

        self.softmax = nn.Softmax(dim=-1)

    def forward(self, x):
        """
        inputs :
            x : input feature maps (B x C x W x H)
        returns :
            out : self attention value + input feature
            attention : B x N x N (N is Width*Height)
        """
        m_batchsize, C, width, height = x.size()
        proj_query = self.query_conv(x).view(m_batchsize, -1, width * height).permute(0, 2, 1)  # B x N x C//8
        proj_key = self.key_conv(x).view(m_batchsize, -1, width * height)  # B x C//8 x N
        energy = torch.bmm(proj_query, proj_key)  # B x N x N
        attention = self.softmax(energy)  # B x N x N, each row sums to 1
        proj_value = self.value_conv(x).view(m_batchsize, -1, width * height)  # B x C x N

        out = torch.bmm(proj_value, attention.permute(0, 2, 1))  # B x C x N
        out = out.view(m_batchsize, C, width, height)

        out = self.gamma * out + x
        return out, attention
```

Style GAN. BigGAN. Conditional GAN. Text Conditioning: Text is processed and combined with the noise vector.

Mar 7, 2020

Note on Hardware Based Computational Photography

Now we have far more computational power than before, and many images already go through complex post-processing algorithms. But we can also optimize the camera measurement itself, so that results look even better.

Feb 27, 2020