March 29, 2015
It turns out that I had a wrong understanding of how the REML estimate can be computed; it was confusing to see how the reference paper arrives at its expression. It took me a while to find the original derivations, but here they are.

The simplest form of the equation (conceptually)

The advantage of the REML estimate of the variance components is that it is unbiased. To obtain it, we get rid of $\beta$, the fixed-effect coefficients, by taking a matrix $S$ such that $SX = 0$. The idea behind it is this: if $y$ has a Gaussian distribution with mean $X\beta + Gu$ and variance-covariance matrix $V = \sigma_g^2 GG^T + \sigma^2 I$, the transformed phenotype vector $Sy$ will still have a Gaussian distribution (because the transformation is linear), but with mean $0$ and variance $SVS^T$, containing no trace of $\beta$. Then the ML objective for $Sy$ is maximized to obtain the optimal variance ratio $\gamma = \frac{\sigma_g^2}{\sigma^2}$.

Now, $SX = 0$ means that the rows of $S$ lie in the orthogonal complement of $\mathrm{colspace}(X)$, and if $X$ has $D$ linearly independent columns, the rank of $S$ can't be more than $N - D$. It makes sense to take $S$ with this maximal rank. Because of this, the number of individuals is effectively reduced from $N$ to $N - D$.

Thus the REML estimate of $\gamma$ is just the ML estimate with $Sy$ instead of $y$ and $N - D$ instead of $N$, because now we effectively have $N - D$ individuals:

$$\log L_{REML}(\gamma, \sigma^2) = -\frac{1}{2}\left[(N - D)\log 2\pi + \log\det(SVS^T) + (Sy)^T (SVS^T)^{-1} Sy\right].$$

In our case, $V$ is $\sigma^2(\gamma K + I)$, therefore we can pull $\sigma^2$ out of the determinant and rewrite the expression in the brackets as

$$(N - D)\log 2\pi + (N - D)\log\sigma^2 + \log\det\big(S(\gamma K + I)S^T\big) + \frac{1}{\sigma^2}(Sy)^T \big(S(\gamma K + I)S^T\big)^{-1} Sy.$$

Instead of the unknown $\sigma^2$ we put its ML estimate, which is

$$\hat\sigma^2 = \frac{(Sy)^T \big(S(\gamma K + I)S^T\big)^{-1} Sy}{N - D},$$

and get

$$\log L_{REML}(\gamma) = -\frac{1}{2}\left[(N - D)\log 2\pi + (N - D)\log\hat\sigma^2 + \log\det\big(S(\gamma K + I)S^T\big) + (N - D)\right].$$

This expression is used in the FaST-LMM code.
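The conceptual form above can be computed literally. Here is a minimal NumPy sketch (the function name and the way I construct $S$ are my own choices, not FaST-LMM's): the rows of $S$ are taken to be the last $N - D$ left singular vectors of $X$, which form an orthonormal basis of the orthogonal complement of $\mathrm{colspace}(X)$, so $SX = 0$ and $S$ has the maximal rank $N - D$.

```python
import numpy as np

def reml_loglik_via_S(y, X, K, gamma):
    """REML log-likelihood of the variance ratio gamma, computed literally:
    project out the fixed effects with a matrix S (SX = 0) and evaluate the
    ML log-likelihood of Sy with N - D effective samples."""
    N, D = X.shape
    # Rows of S: orthonormal basis of the orthogonal complement of colspace(X);
    # the last N - D left singular vectors of X span exactly that complement.
    U, _, _ = np.linalg.svd(X, full_matrices=True)
    S = U[:, D:].T                       # (N - D) x N, satisfies S @ X ~ 0
    H = gamma * K + np.eye(N)            # V = sigma^2 * H
    SHS = S @ H @ S.T
    Sy = S @ y
    quad = Sy @ np.linalg.solve(SHS, Sy)
    sigma2_hat = quad / (N - D)          # ML estimate of sigma^2 given gamma
    _, logdet = np.linalg.slogdet(SHS)
    return -0.5 * ((N - D) * (np.log(2 * np.pi) + np.log(sigma2_hat) + 1)
                   + logdet)
```

Note that forming $S$ explicitly costs an $O(N^3)$ SVD, which is exactly why the $S$-free form derived below is the one used in practice.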
Removing $S$ from the equation

Remarkably, that happens to be possible. Details can be found in Searle on pages 89–90, where the following facts are obtained with the aid of formulas derived on pages 29–30 (stated here for $S$ with orthonormal rows):

$$\log\det(SVS^T) = \log\det V + \log\det(X^T V^{-1} X) - \log\det(X^T X),$$

$$S^T (SVS^T)^{-1} S = P := V^{-1} - V^{-1}X\,(X^T V^{-1} X)^{-1} X^T V^{-1}.$$
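Both identities are easy to check numerically; the following sketch (my own, with $S$ built from the SVD of $X$ as before, and an arbitrary well-conditioned SPD matrix standing in for $V$) verifies them with assertions:

```python
import numpy as np

rng = np.random.default_rng(1)
N, D = 20, 3
X = rng.normal(size=(N, D))
A = rng.normal(size=(N, N))
V = A @ A.T + N * np.eye(N)            # a generic SPD covariance matrix

# S: orthonormal basis of the orthogonal complement of colspace(X)
U, _, _ = np.linalg.svd(X, full_matrices=True)
S = U[:, D:].T

Vi = np.linalg.inv(V)
XtViX = X.T @ Vi @ X
P = Vi - Vi @ X @ np.linalg.inv(XtViX) @ X.T @ Vi

# Identity 1: log|S V S^T| = log|V| + log|X^T V^-1 X| - log|X^T X|
lhs = np.linalg.slogdet(S @ V @ S.T)[1]
rhs = (np.linalg.slogdet(V)[1] + np.linalg.slogdet(XtViX)[1]
       - np.linalg.slogdet(X.T @ X)[1])
assert np.isclose(lhs, rhs)

# Identity 2: S^T (S V S^T)^{-1} S = P
assert np.allclose(S.T @ np.linalg.inv(S @ V @ S.T) @ S, P)
```

In particular, identity 2 means that the quadratic form $(Sy)^T(SVS^T)^{-1}Sy$ equals $y^T P y$, with no reference to $S$ at all.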
Therefore we obtain the following expression for the REML log-likelihood:

$$\log L_{REML}(\gamma, \sigma^2) = -\frac{1}{2}\left[(N - D)\log 2\pi + \log\det V + \log\det(X^T V^{-1} X) - \log\det(X^T X) + y^T P y\right].$$

Again, after pulling out $\sigma^2$ (writing $V = \sigma^2 H$ with $H = \gamma K + I$, so that $y^T P y = \frac{1}{\sigma^2}\, y^T P_H\, y$ with $P_H = H^{-1} - H^{-1}X(X^T H^{-1} X)^{-1}X^T H^{-1}$) and ignoring the constant terms, it gets transformed into

$$(N - D)\log\sigma^2 + \log\det H + \log\det(X^T H^{-1} X) + \frac{1}{\sigma^2}\, y^T P_H\, y,$$

which, after substituting $\hat\sigma^2 = \frac{y^T P_H\, y}{N - D}$, simplifies to

$$\log L_{REML}(\gamma) = -\frac{1}{2}\left[(N - D)\log\hat\sigma^2 + \log\det(\gamma K + I) + \log\det\big(X^T(\gamma K + I)^{-1}X\big) + (N - D)\right] + \mathrm{const}.$$

Compare with this line from the FaST-LMM source.
