Start from a 2-way contingency table \(X\) with \(\sum_{i,j} X_{i,j}=N\)
Normalize \(P = \frac{1}{N}X\) (correspondence matrix)
Let \(r\) (resp. \(c\)) be the vector of row (resp. column) sums of \(P\)
Let \(D_r=\text{diag}(r)\) denote the diagonal matrix with row sums of \(P\) as coefficients
Let \(D_c=\text{diag}(c)\) denote the diagonal matrix with column sums of \(P\) as coefficients
The row profiles matrix is \(D_r^{-1} \times P\)
The standardized residuals matrix is \(S = D_r^{-1/2} \times \left(P - r c^T\right) \times D_c^{-1/2}\)
CA consists of computing the SVD of the standardized residuals matrix, \(S = U \times D \times V^T\), where \(D\) is the diagonal matrix of singular values
From the SVD, we get

- \(D_r^{-1/2} \times U\): standardized coordinates of rows
- \(D_c^{-1/2} \times V\): standardized coordinates of columns
- \(D_r^{-1/2} \times U \times D\): principal coordinates of rows
- \(D_c^{-1/2} \times V \times D\): principal coordinates of columns
- Squared singular values: the principal inertias
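
The steps above can be sketched numerically. The snippet below is a minimal illustration, not a reference implementation: it assumes NumPy and `numpy.linalg.svd` (the notes do not name a library), and the contingency table `X` is hypothetical, made up purely for the example. It relies on the fact that multiplying by \(D_r^{-1/2}\) on the left and \(D_c^{-1/2}\) on the right amounts to dividing entry \((i,j)\) by \(\sqrt{r_i c_j}\).

```python
import numpy as np

# Hypothetical 3x4 contingency table (illustrative counts, not from the notes)
X = np.array([[20.0, 10.0, 5.0, 15.0],
              [10.0, 25.0, 10.0, 5.0],
              [5.0, 10.0, 30.0, 5.0]])

N = X.sum()
P = X / N                 # correspondence matrix
r = P.sum(axis=1)         # row sums of P (row masses)
c = P.sum(axis=0)         # column sums of P (column masses)

# Standardized residuals: S = D_r^{-1/2} (P - r c^T) D_c^{-1/2},
# computed entrywise as (P_ij - r_i c_j) / sqrt(r_i c_j)
rc = np.outer(r, c)
S = (P - rc) / np.sqrt(rc)

# Thin SVD: S = U diag(d) V^T, with d the vector of singular values
U, d, Vt = np.linalg.svd(S, full_matrices=False)
V = Vt.T

row_std = U / np.sqrt(r)[:, None]   # D_r^{-1/2} U
col_std = V / np.sqrt(c)[:, None]   # D_c^{-1/2} V
row_principal = row_std * d         # D_r^{-1/2} U D (column k scaled by d[k])
col_principal = col_std * d         # D_c^{-1/2} V D
principal_inertia = d ** 2          # squared singular values

print(principal_inertia)
```

As a sanity check under these assumptions, the principal inertias should sum to the Pearson chi-square statistic of `X` divided by `N`, and the mass-weighted mean of the row principal coordinates (`r @ row_principal`) should be numerically zero.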
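When calling svd(.), the argument should be \[D_r^{1/2}\times \left(D_r^{-1} \times P \times D_c^{-1}- \mathbf{1}\times \mathbf{1}^T \right)\times D_c^{1/2}\] where \(\mathbf{1}\) denotes a column vector of ones (of length matching the rows, resp. columns, of \(P\))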
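
A short derivation may help to see that this argument is just another way of writing the standardized residuals matrix \(S\). It reads the subtracted term as the outer product of two all-ones column vectors, written \(\mathbf{1}\) here, so that \(D_r \times \mathbf{1} = r\) and \(\mathbf{1}^T \times D_c = c^T\):

\[
\begin{aligned}
D_r^{1/2}\times \left(D_r^{-1} \times P \times D_c^{-1} - \mathbf{1}\times \mathbf{1}^T \right)\times D_c^{1/2}
&= D_r^{-1/2} \times P \times D_c^{-1/2} - D_r^{1/2} \times \mathbf{1}\times \mathbf{1}^T \times D_c^{1/2} \\
&= D_r^{-1/2} \times \left(P - D_r \times \mathbf{1}\times \mathbf{1}^T \times D_c\right) \times D_c^{-1/2} \\
&= D_r^{-1/2} \times \left(P - r c^T\right) \times D_c^{-1/2} = S
\end{aligned}
\]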