Spectral Graph Theory and Segmentation Motivation Spectral graph theory is a powerful tool because it sits at the intersection of multiple representations: it connects graphs, manifolds, and linear algebra. It is related to dynamics on graphs, such as Markov chains and random walks (diffusion). It can be applied to any point cloud; images and meshes are well suited. It can be used to perform clustering, segmentation, etc. Linear Algebra Review There are several ways to see an eigenvalue problem
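To make the link between graphs, linear algebra, and random walks concrete, here is a minimal sketch in plain Python. The four-node graph and all names (`A`, `transition_matrix`, `step`, `laplacian`) are illustrative assumptions, not taken from the original note. It builds the random-walk matrix $P = D^{-1}A$ and checks two standard facts: the distribution proportional to node degree is stationary for an undirected graph, and every row of the Laplacian $L = D - A$ sums to zero (so the constant vector is an eigenvector with eigenvalue 0).

```python
# Toy undirected graph (an assumption for illustration):
# a path 0-1-2-3 with one extra edge 1-3.
A = [
    [0, 1, 0, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 1],
    [0, 1, 1, 0],
]


def degrees(adj):
    """Degree of each node = row sum of the adjacency matrix."""
    return [sum(row) for row in adj]


def transition_matrix(adj):
    """Random-walk matrix P = D^{-1} A (each row sums to 1)."""
    deg = degrees(adj)
    return [[a / d for a in row] for row, d in zip(adj, deg)]


def step(dist, P):
    """One diffusion step: new[j] = sum_i dist[i] * P[i][j]."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def laplacian(adj):
    """Combinatorial graph Laplacian L = D - A."""
    deg = degrees(adj)
    n = len(adj)
    return [[(deg[i] if i == j else 0) - adj[i][j] for j in range(n)]
            for i in range(n)]


P = transition_matrix(A)
deg = degrees(A)
pi = [d / sum(deg) for d in deg]      # candidate stationary distribution
pi_next = step(pi, P)                 # should equal pi again
L_row_sums = [sum(row) for row in laplacian(A)]  # L applied to the all-ones vector
```

Here `pi_next` reproduces `pi` up to floating-point error, which is the random-walk view of the eigenvalue problem $\pi P = \pi$.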
Note on Hyperbolic Geometry Reference Notes 2018 Lec Note 2015 Lecture note Ch5-3 Measurement in Hyperbolic Geometry [Cheatsheet / Note](http://home.iiserb.ac.in/~kashyap/MTH 520/lp.pdf) Motivation Hyperbolic geometry is a great source of inspiration for math art. It is also used to model hierarchical data structures. Here I collected a few models
Reinforcement learning deals with environments and rewards. An agent has a set of actions $a_j$ with which it interacts with the environment (state $s_i$). These actions change the environment, and from time to time the environment returns a reward!
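The agent-environment loop described here can be sketched in a few lines of plain Python. Everything below is a hypothetical toy for illustration: the `CorridorEnv` class, its `reset`/`step` interface, the reward of 1 at the last cell, and the `always_right` policy are assumptions, not part of the original note or of any particular library.

```python
class CorridorEnv:
    """Toy environment: a 1-D corridor with cells 0..length-1.

    The state is the agent's cell. Actions are -1 (left) or +1 (right).
    A reward of 1 is given only when the last cell is reached.
    """

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # The action changes the environment state (clamped to the corridor).
        self.state = min(max(self.state + action, 0), self.length - 1)
        done = self.state == self.length - 1
        reward = 1.0 if done else 0.0   # reward arrives only occasionally
        return self.state, reward, done


def run_episode(env, policy, max_steps=100):
    """Run one episode; return total reward and the visited states."""
    state = env.reset()
    total_reward = 0.0
    trajectory = [state]
    for _ in range(max_steps):
        action = policy(state)                  # agent picks an action
        state, reward, done = env.step(action)  # environment responds
        total_reward += reward
        trajectory.append(state)
        if done:
            break
    return total_reward, trajectory


def always_right(state):
    """A fixed (non-learning) policy used only to exercise the loop."""
    return +1


total_reward, trajectory = run_episode(CorridorEnv(), always_right)
```

A learning algorithm would replace `always_right` with a policy that is updated from the observed rewards; the loop itself stays the same.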
Note on Photometric Reasoning Shape $\hat n$, lighting $l$, and reflectance $\rho$ all affect image appearance $I$. Can we infer them back from the image? $$ I=\rho\langle\hat n,l\rangle $$ How much do shading and photometric effects tell us about shape in natural settings?
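As a small illustration of the shading equation $I=\rho\langle\hat n,l\rangle$, the sketch below evaluates it for a single surface point. The function names and the example values are assumptions made for illustration. Clamping negative shading to zero (surfaces facing away from the light) is an added convention that is not in the formula above.

```python
import math


def normalize(v):
    """Scale a 3-vector to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def lambertian_intensity(albedo, normal, light):
    """I = rho * <n_hat, l>, with negative values clamped to 0 (assumption)."""
    n_hat = normalize(normal)
    shading = sum(n * l for n, l in zip(n_hat, light))
    return albedo * max(shading, 0.0)


light = [0.0, 0.0, 1.0]  # unit-strength light along +z (illustrative)

# A dark surface facing the light head-on ...
frontal = lambertian_intensity(0.5, [0.0, 0.0, 1.0], light)

# ... and a bright surface tilted 60 degrees away from it.
tilt = math.radians(60)
tilted = lambertian_intensity(1.0, [math.sin(tilt), 0.0, math.cos(tilt)], light)
```

Both points produce the same intensity (up to floating-point error), which shows why inferring shape, lighting, and reflectance back from $I$ alone is ambiguous without further priors.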
Note on GAN Note with reference to the YouTube lecture series by Hongyi Li. Architecture Developments Self Attention Used in Self-Attention GAN and BigGAN

```python
import torch
import torch.nn as nn


class Self_Attn(nn.Module):
    """Self-attention layer."""

    def __init__(self, in_dim, activation):
        super(Self_Attn, self).__init__()
        self.chanel_in = in_dim
        self.activation = activation  # stored but not used in forward()

        # 1x1 convolutions produce the query, key and value feature maps
        self.query_conv = nn.Conv2d(in_channels=in_dim, out_channels=in_dim // 8, kernel_size=1)
        self.key_conv = nn.Conv2d(in_channels=in_dim, out_channels=in_dim // 8, kernel_size=1)
        self.value_conv = nn.Conv2d(in_channels=in_dim, out_channels=in_dim, kernel_size=1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learned residual weight, starts at 0

        self.softmax = nn.Softmax(dim=-1)

    def forward(self, x):
        """
        inputs:
            x: input feature maps (B x C x W x H)
        returns:
            out: self-attention value + input feature
            attention: B x N x N (N = W * H)
        """
        m_batchsize, C, width, height = x.size()
        proj_query = self.query_conv(x).view(m_batchsize, -1, width * height).permute(0, 2, 1)  # B x N x C//8
        proj_key = self.key_conv(x).view(m_batchsize, -1, width * height)  # B x C//8 x N
        energy = torch.bmm(proj_query, proj_key)  # B x N x N
        attention = self.softmax(energy)  # B x N x N, each row sums to 1
        proj_value = self.value_conv(x).view(m_batchsize, -1, width * height)  # B x C x N

        out = torch.bmm(proj_value, attention.permute(0, 2, 1))  # B x C x N
        out = out.view(m_batchsize, C, width, height)

        out = self.gamma * out + x  # residual connection
        return out, attention
```

Style GAN BigGAN Conditional GAN Text Conditioning The text is processed and combined with the noise vector.
Note on Hardware Based Computational Photography We now have far more computational power than before! Moreover, many images already go through complex post-processing algorithms. But we can also optimize the camera measurement itself, so that the results look even better.
Computational Photography Basically, enhance images by computation! It sits at the intersection of 3 fields: optics, vision, and graphics. There are mainly two kinds of work: co-designing the camera and the image processing (optics + vision), and using vision to help graphics generate better images faster! CG2REAL CG rendering is very computationally intensive!
Motivation This is a brief analytical note about how physical self-motion of the eye / camera induces optic flow in a static environment. It then discusses how a system can separate these two components instantaneously.
Stereo The basic stereo algorithm can be formulated as a Markov Random Field (MRF), so any MRF inference method can be used. Prior Planar Prior Natural scenes are usually piecewise planar! How can this idea be imposed on the depth map?
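To show what "stereo as an MRF" means in practice, here is a minimal sketch of the energy that MRF inference would minimize, written for a single scanline. The absolute-difference data term, the absolute-difference smoothness term, the `smooth_weight` parameter, and the toy intensities are all assumptions chosen for illustration, not the formulation used in the original note.

```python
def data_cost(left, right, x, d):
    """Unary term: mismatch between left[x] and its candidate match right[x - d]."""
    xr = x - d
    if xr < 0:
        return float("inf")  # the match would fall outside the right scanline
    return abs(left[x] - right[xr])


def mrf_energy(left, right, disparity, smooth_weight=1.0):
    """Energy = sum of unary data costs + weighted pairwise smoothness costs."""
    unary = sum(data_cost(left, right, x, d) for x, d in enumerate(disparity))
    # Pairwise term: neighbouring pixels are encouraged to share a disparity.
    pairwise = sum(abs(d0 - d1) for d0, d1 in zip(disparity, disparity[1:]))
    return unary + smooth_weight * pairwise


# Toy scanlines: the left view is the right view shifted by one pixel.
right = [10, 20, 30, 40, 50]
left = [10, 10, 20, 30, 40]

shifted = [0, 1, 1, 1, 1]   # labeling that matches the shift
flat = [0, 0, 0, 0, 0]      # labeling that ignores it

energy_shifted = mrf_energy(left, right, shifted)
energy_flat = mrf_energy(left, right, flat)
```

The labeling that follows the true shift has the lower energy. Standard MRF inference methods such as graph cuts or belief propagation search for the labeling that minimizes an energy of this kind; a planar prior would replace the simple pairwise term used here.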
Semantics Vision Task Note that semantic and geometric reasoning are conceptually similar to each other. Stereo and optical flow are about finding correspondences / matches. Object recognition is, in some sense, finding a correspondence w.r.t. a template and making the template match the observation. Semantic Vision before CNN Older semantic detectors worked like this