Jan 1, 2025
Nov 1, 2023
Motivations Many CNN models have become the bread and butter of modern deep learning pipelines. Here I summarize some famous CNN architectures and their key innovations as I use them.
Jan 17, 2021
Note on Compiling Torch C Extensions Motivation Sometimes fusing operations in a C library, bypassing Python, can accelerate your model, especially for key operations that occur often and that lots of data pass through.
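A minimal sketch of the idea, assuming PyTorch's `torch.utils.cpp_extension.load_inline`; the `fused_mul_add` op and the module name are illustrative, not from the post:

```python
import torch
from torch.utils.cpp_extension import load_inline

# Toy fused op in C++: one pass over the data instead of two separate kernels.
cpp_source = """
#include <torch/extension.h>

torch::Tensor fused_mul_add(torch::Tensor a, torch::Tensor b, torch::Tensor c) {
    return a * b + c;
}
"""

# load_inline compiles the source on the fly and auto-generates pybind11
# bindings for the listed functions.
module = load_inline(
    name="fused_ops",             # hypothetical module name
    cpp_sources=cpp_source,
    functions=["fused_mul_add"],
)

a, b, c = (torch.randn(1024) for _ in range(3))
assert torch.allclose(module.fused_mul_add(a, b, c), a * b + c)
```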
Sep 15, 2020
Environment Bug https://github.com/rosinality/stylegan2-pytorch/issues/70 Compiler-not-found bug: compiler_bindir_search_path in ./stylegan2/dnnlib/tflib/custom_ops.py needs to be changed to point at the C compiler installed on the machine. Note: Visual Studio 2019 is not supported, so you have to use 2017!
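A sketch of the change, assuming MSVC 2017 Community is installed (the exact 14.xx.xxxxx version directory differs per machine):

```python
# In ./stylegan2/dnnlib/tflib/custom_ops.py -- point the search path at the
# MSVC 2017 toolchain actually installed locally (VS 2019 is not supported).
compiler_bindir_search_path = [
    'C:/Program Files (x86)/Microsoft Visual Studio/2017/Community/VC/Tools/MSVC/14.16.27023/bin/Hostx64/x64',
]
```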
Jun 10, 2020
Note on Laplacian-Beltrami (Diffusion) Operator Motivation The Laplacian on graphs and on discrete geometry (meshes) is a very useful tool. One core intuition: just like the Laplacian in $\mathbb{R}^n$, it is related to diffusion and the heat equation. Recall that the diffusion equation is $$ \frac{\partial u}{\partial t} = \Delta u $$
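A toy sketch of that intuition on a graph (an illustration, not from the post): with the graph Laplacian $L = D - W$, which discretizes $-\Delta$, explicit Euler steps of $du/dt = -Lu$ diffuse heat along the edges.

```python
import numpy as np

# Toy graph: a path of 5 nodes with unit edge weights.
W = np.zeros((5, 5))
for i in range(4):
    W[i, i + 1] = W[i + 1, i] = 1.0
L = np.diag(W.sum(axis=1)) - W  # graph Laplacian L = D - W

u = np.zeros(5)
u[0] = 1.0                      # all the heat starts at node 0
dt = 0.1
for _ in range(100):
    u = u - dt * (L @ u)        # explicit Euler step of du/dt = -L u

print(u)  # heat spreads out toward uniform; the total (mean) is conserved
```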
May 8, 2020
Spectral Graph Theory and Segmentation Motivation Spectral graph theory is a powerful tool because it sits at the center of multiple representations: it connects graphs, manifolds, and linear algebra. It is related to dynamics on graphs, Markov chains, and random walks (diffusion). It can be applied to any point cloud; images and meshes are well suited. It can be used to perform clustering, segmentation, etc. Linear Algebra Review There are several ways to view an eigenvalue problem
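A concrete sketch of the clustering angle (a toy example, not from the post): build a similarity graph over the points, take the Laplacian's second-smallest eigenvector (the Fiedler vector), and threshold it to split two clusters.

```python
import numpy as np

# Two obvious clusters of 1-D points.
pts = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])

# Fully connected similarity graph with a Gaussian kernel.
d2 = (pts[:, None] - pts[None, :]) ** 2
W = np.exp(-d2 / (2 * 0.5 ** 2))
np.fill_diagonal(W, 0.0)
L = np.diag(W.sum(axis=1)) - W   # unnormalized graph Laplacian

# Eigenvectors of L in ascending eigenvalue order; the 2nd-smallest
# ("Fiedler") vector encodes the cut between weakly connected groups.
vals, vecs = np.linalg.eigh(L)
fiedler = vecs[:, 1]
labels = (fiedler > 0).astype(int)
print(labels)  # e.g. [0 0 0 1 1 1] -- the two point groups separate
```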
Apr 22, 2020
Note on Photometric Reasoning Shape $\hat n$, lighting $l$, and reflectance $\rho$ affect image appearance $I$. Can we infer them back? $$ I=\rho\langle\hat n,l\rangle $$ How much do shading and photometric effects tell us about shape in natural settings?
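A tiny numerical sketch of that forward model, assuming a Lambertian surface with the dot product clamped at zero for back-facing light:

```python
import numpy as np

rho = 0.8                         # reflectance (albedo)
n_hat = np.array([0.0, 0.0, 1.0]) # surface normal, unit length
l = np.array([1.0, 0.0, 1.0])
l = l / np.linalg.norm(l)         # light direction, unit length

I = rho * max(0.0, n_hat @ l)     # I = rho * <n_hat, l>, clamped below at 0
print(I)                          # ~0.566: cos(45 degrees) scaled by albedo
```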
Mar 17, 2020
Note on GAN Notes with reference to the YouTube lecture series by Hongyi Li. Architecture Developments Self-Attention Used in Self-Attention GAN and BigGAN:

```python
import torch
import torch.nn as nn

class Self_Attn(nn.Module):
    """Self attention Layer"""

    def __init__(self, in_dim, activation):
        super(Self_Attn, self).__init__()
        self.channel_in = in_dim
        self.activation = activation
        # 1x1 convolutions project the input into query/key/value feature maps.
        self.query_conv = nn.Conv2d(in_channels=in_dim, out_channels=in_dim // 8, kernel_size=1)
        self.key_conv = nn.Conv2d(in_channels=in_dim, out_channels=in_dim // 8, kernel_size=1)
        self.value_conv = nn.Conv2d(in_channels=in_dim, out_channels=in_dim, kernel_size=1)
        # Learnable residual weight, initialized to 0 so the layer starts as identity.
        self.gamma = nn.Parameter(torch.zeros(1))
        self.softmax = nn.Softmax(dim=-1)

    def forward(self, x):
        """
        inputs:
            x: input feature maps (B x C x W x H)
        returns:
            out: self attention value + input feature
            attention: B x N x N (N is Width*Height)
        """
        m_batchsize, C, width, height = x.size()
        proj_query = self.query_conv(x).view(m_batchsize, -1, width * height).permute(0, 2, 1)  # B x N x C'
        proj_key = self.key_conv(x).view(m_batchsize, -1, width * height)                       # B x C' x N
        energy = torch.bmm(proj_query, proj_key)                                                # B x N x N
        attention = self.softmax(energy)                 # softmax over the last dimension
        proj_value = self.value_conv(x).view(m_batchsize, -1, width * height)                   # B x C x N
        out = torch.bmm(proj_value, attention.permute(0, 2, 1))                                 # B x C x N
        out = out.view(m_batchsize, C, width, height)
        out = self.gamma * out + x                       # residual connection
        return out, attention
```

StyleGAN BigGAN Conditional GAN Text Conditioning Text is processed and combined with the noise vector.
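A quick usage sketch for the Self_Attn module above (the shapes and arguments are illustrative):

```python
import torch

# Apply the layer to a batch of 8 feature maps with 64 channels at 32x32.
layer = Self_Attn(in_dim=64, activation='relu')
x = torch.randn(8, 64, 32, 32)
out, attn = layer(x)
print(out.shape)   # torch.Size([8, 64, 32, 32]) -- same shape as the input
print(attn.shape)  # torch.Size([8, 1024, 1024]) -- N x N with N = 32 * 32
```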
Mar 7, 2020
Note on Hardware Based Computational Photography Now we have far more computational power than before! Moreover, many images go through complex algorithms in post-processing. But we can also optimize the camera measurement itself, so that the results look even better.
Feb 27, 2020