Motivation
Here we summarize a few common probabilistic neural population models. Adapted from reading notes and class presentations from Neuro QC316 taught by Jan Drugowitsch.
LNP, GLM
These are the simplest models of neurons.
\[z = \exp(k^T x)\\ y\sim \mathrm{Poisson}(z)\]
To fit this model, we use maximum likelihood (MLE), or maximum a posteriori (MAP) given a prior on $k$ (a minimal fitting sketch follows the remarks below):
\[\arg\max_k\mathcal L(k\mid y,x)\]
Remarks
- This generalized linear model has a concave log-likelihood (equivalently, a convex negative log-likelihood), so the MLE has a unique solution.
- If $x$ is distributed isotropically, then the MLE estimate of $k$ aligns with the spike-triggered average $\sum_i y_ix_i$.
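Here is a minimal sketch of fitting this Poisson GLM by MLE with generic numerical optimization; the simulated data, helper names, and settings are illustrative, not from the notes:

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_likelihood(k, X, y):
    """Poisson negative log-likelihood up to a constant: sum_i exp(k^T x_i) - y_i k^T x_i."""
    eta = X @ k                       # linear drive k^T x_i per trial
    return np.sum(np.exp(eta) - y * eta)

def grad(k, X, y):
    """Gradient of the negative log-likelihood: X^T (exp(Xk) - y)."""
    return X.T @ (np.exp(X @ k) - y)

def fit_lnp(X, y):
    k0 = np.zeros(X.shape[1])
    res = minimize(neg_log_likelihood, k0, args=(X, y), jac=grad, method="L-BFGS-B")
    return res.x

# Simulate isotropic Gaussian stimuli and Poisson spike counts
rng = np.random.default_rng(0)
k_true = 0.5 * rng.normal(size=5)
X = rng.normal(size=(2000, 5))
y = rng.poisson(np.exp(X @ k_true))

k_hat = fit_lnp(X, y)
sta = X.T @ y / y.sum()   # spike-triggered average, aligned with k_hat here
```

Because the negative log-likelihood is convex, the optimizer reaches the unique solution from any initialization.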
Latent Variable Models
In the next few sections, the models all share a similar structure, $x\to z\to y$ or $z\to y$: there are latent variables $z$ that are neither input nor output, i.e. not observed.
For these models, a common strategy is Expectation Maximization (EM). It is a way to perform MLE over the latent states $z$ and the model parameters $\theta$ jointly. The optimization is performed in alternation, a.k.a. a coordinate-ascent or maximization-maximization procedure.
Formal Procedure
Consider the model $z\to y$: we have a parametric generative model $p_\theta(y\mid z)$ with parameters $\theta$.
Expectation: given an observation $y$, we estimate the distribution of the latent, assuming a fixed parameter $\theta$,
\[p_\theta(z\mid y)=\frac{p_\theta(y\mid z)p_\theta(z)}{p_\theta(y)}\]
Maximization: using this distribution $p_\theta(z\mid y)$, we estimate the parameters by MLE. This step proceeds as if we knew the latent inputs that generated $y$, so the usual fully observed method can be used to estimate $\theta$.
Formally, this means maximizing the “expected” log-likelihood (a minimal numerical sketch follows the remarks below).
\[\mathbb E_{p_\theta(z\mid y)} [\log\mathcal L(\theta;y,z)]\]
Remarks
- The exact EM algorithm monotonically increases the marginal likelihood of the observed data $p_\theta(y)$, so it will converge, but not necessarily to a global maximum. Thus multiple restarts are needed, and a good initialization helps!
- An approximate EM algorithm is not guaranteed to converge, e.g. if you use a Gaussian approximation or other approximations for $p(z\mid y)$.
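To make the two steps concrete, here is a minimal sketch of exact EM for a toy latent variable model, a two-component 1-D Gaussian mixture, where $z$ is the component assignment; the data and all names are illustrative:

```python
import numpy as np

def em_gmm(y, n_iter=100):
    """Exact EM for a two-component 1-D Gaussian mixture."""
    # Initialize theta = (weights, means, variances)
    w = np.array([0.5, 0.5])
    mu = np.array([y.min(), y.max()])
    var = np.array([y.var(), y.var()])
    for _ in range(n_iter):
        # E-step: posterior p(z | y) over component assignments (responsibilities)
        dens = w * np.exp(-0.5 * (y[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: maximize the expected complete-data log-likelihood in closed form
        n_k = r.sum(axis=0)
        w = n_k / len(y)
        mu = (r * y[:, None]).sum(axis=0) / n_k
        var = (r * (y[:, None] - mu) ** 2).sum(axis=0) / n_k
    return w, mu, var

rng = np.random.default_rng(1)
y = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 0.5, 700)])
print(em_gmm(y))   # recovers weights ~(0.3, 0.7) and means ~(-2, 3)
```

Each iteration provably does not decrease the marginal likelihood $p_\theta(y)$, in line with the first remark above.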
State Space Model
Latent state: a $p$-dimensional linear dynamical system
Observation: a linear readout passed through a generalized linear model with Poisson noise
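A minimal generative sketch of this model; the dynamics $A$, readout $C$, baseline $d$, and noise scale below are illustrative hand-picked values:

```python
import numpy as np

rng = np.random.default_rng(2)
T, p, n = 200, 2, 10                   # time steps, latent dimension, neurons
A = 0.95 * np.eye(p)                   # stable linear latent dynamics
C = 0.5 * rng.normal(size=(n, p))      # linear readout
d = np.full(n, -1.0)                   # baseline log-rate

z = np.zeros((T, p))
y = np.zeros((T, n), dtype=int)
for t in range(1, T):
    z[t] = A @ z[t - 1] + rng.normal(scale=0.3, size=p)   # latent LDS step
    y[t] = rng.poisson(np.exp(C @ z[t] + d))              # Poisson GLM observation
```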
GPFA
Latent state: Gaussian processes
- $p$ independent GPs with different time scales
Observation: a linear readout $C$ with an individual per-neuron noise variance $R$
Thus the whole model is a joint multivariate Gaussian.
To learn these models, we use Expectation Maximization.
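A minimal generative sketch of GPFA, assuming squared-exponential GP kernels; the timescales, $C$, and $R$ below are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(3)
T, p, n = 100, 3, 20
t = np.arange(T, dtype=float)[:, None]
taus = [5.0, 15.0, 40.0]               # one timescale per latent GP

# Draw p independent GP latents, each with its own squared-exponential kernel
X = np.column_stack([
    rng.multivariate_normal(
        np.zeros(T),
        np.exp(-0.5 * (t - t.T) ** 2 / tau ** 2) + 1e-6 * np.eye(T))
    for tau in taus
])

C = rng.normal(size=(n, p))                  # linear readout
R = np.diag(rng.uniform(0.1, 0.5, size=n))   # per-neuron noise variances
Y = X @ C.T + rng.multivariate_normal(np.zeros(n), R, size=T)
```

Since the latents, readout, and noise are all Gaussian, the latents and observations are indeed jointly multivariate Gaussian.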
LFADS
Latent state: a recurrent neural network
Observation: a linear readout
To learn these models, we train with (stochastic) gradient descent on the data.
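A minimal sketch of the generative direction only, an RNN latent state with a linear readout to Poisson rates; the tanh update, sizes, and weights are illustrative stand-ins, not the actual LFADS architecture:

```python
import numpy as np

rng = np.random.default_rng(4)
T, h, n = 100, 16, 10
W = rng.normal(size=(h, h)) / np.sqrt(h)   # recurrent weights
C = 0.3 * rng.normal(size=(n, h))          # linear readout
d = np.full(n, -1.0)                       # baseline log-rate

z = rng.normal(size=h)                     # initial state (inferred from data in LFADS)
y = np.zeros((T, n), dtype=int)
for t in range(T):
    z = np.tanh(W @ z)                     # recurrent latent dynamics
    y[t] = rng.poisson(np.exp(C @ z + d))  # Poisson spikes from the linear readout
```

In the actual LFADS model, the initial state is inferred by an encoder network and the whole system is trained end-to-end with stochastic gradient descent.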