require(tidyverse)
require(patchwork)
require(httr)
require(glue)
require(broom)
old_theme <- theme_set(theme_minimal())
The dataset swiss (from datasets::swiss) connects fertility with social and economic data for 47 French-speaking districts in Switzerland.
Fertility: fertility index
Agriculture: jobs in agricultural sector
Examination: literacy index (military examination)
Education: proportion of people with successful secondary education
Catholic: proportion of Catholics
Infant.Mortality: mortality quotient at age 0
The fertility index (Fertility) is considered as the response variable.
The social and economic variables are covariates (explanatory variables).
See European Fertility Project for more on this dataset.
PCA (Principal Component Analysis) is concerned with covariates.
data("swiss")
swiss %>%
  glimpse(50)
Rows: 47
Columns: 6
$ Fertility <dbl> 80.2, 83.1, 92.5, 85.8,…
$ Agriculture <dbl> 17.0, 45.1, 39.7, 36.5,…
$ Examination <int> 15, 6, 5, 12, 17, 9, 16…
$ Education <int> 12, 9, 5, 7, 15, 7, 7, …
$ Catholic <dbl> 9.96, 84.84, 93.40, 33.…
$ Infant.Mortality <dbl> 22.2, 22.2, 20.2, 20.3,…
Have a look at the documentation of the dataset (?swiss).
Compute, display and comment on the sample correlation matrix.
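One possible way to get started on the correlation matrix is sketched below. It uses base R only and keeps all six columns, response included; the names cor_mat, off_diag and strongest are illustrative and not part of the original lab.

```r
# Sample correlation matrix of the swiss variables (base R only).
data("swiss")

# Pearson correlations between all six columns, rounded for display.
cor_mat <- cor(swiss)
print(round(cor_mat, 2))

# Locate the most strongly correlated pair of distinct variables.
# The diagonal (always 1) is zeroed out so it cannot win.
off_diag <- cor_mat
diag(off_diag) <- 0
strongest <- which(abs(off_diag) == max(abs(off_diag)), arr.ind = TRUE)[1, ]
cat("Strongest pair:",
    rownames(cor_mat)[strongest["row"]], "/",
    colnames(cor_mat)[strongest["col"]], "\n")
```

Reading the signs and magnitudes off cor_mat is the starting point for the comment the exercise asks for.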
Display jointplots for each pair of variables.
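As a rough stand-in for jointplots, a base R scatterplot matrix shows every pair of variables at once. This is only a sketch: it draws the scatterplots with a smoother but not the marginal distributions that a true jointplot would add, and n_panels is an illustrative name.

```r
# Scatterplot matrix: one panel per pair of variables (base graphics).
data("swiss")

# Number of distinct pairs among the six variables.
n_panels <- choose(ncol(swiss), 2)

# panel.smooth overlays a lowess smoother on each scatterplot.
pairs(swiss, panel = panel.smooth,
      main = "swiss: pairwise scatterplots")
```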
Pairwise analysis did not provide us with a clear and simple picture of the French-speaking districts.
Play with centering and scaling
Project the dataset on the first two principal components (perform dimension reduction) and build a scatterplot. Colour the points according to the value of original covariates.
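The projection step might look like the following sketch. It assumes centered, unscaled PCA through stats::prcomp, uses base graphics rather than ggplot2, and colours points by a threshold on Catholic, which is an arbitrary choice made for illustration.

```r
# Project the covariates on the first two principal components.
data("swiss")
covariates <- swiss[, names(swiss) != "Fertility"]

# Centered, unscaled PCA (prcomp's defaults, written out explicitly).
pca <- prcomp(covariates, center = TRUE, scale. = FALSE)

# Scores on the first two principal components: one row per district.
scores <- pca$x[, 1:2]

# Same projection by hand: centered data times the first two loading vectors.
X <- scale(covariates, center = TRUE, scale = FALSE)
scores_by_hand <- X %*% pca$rotation[, 1:2]

# Scatterplot in the first principal plane, coloured by one covariate.
point_col <- ifelse(covariates$Catholic > 50, "firebrick", "steelblue")
plot(scores[, 1], scores[, 2], col = point_col, pch = 19,
     xlab = "PC1", ylab = "PC2",
     main = "swiss covariates, first principal plane")
legend("topright", legend = c("Catholic > 50", "Catholic <= 50"),
       col = c("firebrick", "steelblue"), pch = 19)
```

Repeating the plot with another covariate driving the colour helps to see which variables each axis separates.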
Centering without scaling is obtained with scale(., center = TRUE, scale = FALSE); the resulting matrix is denoted \(X\).
Checking Orthogonality of \(V\)
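A minimal numerical check, assuming \(V\) denotes the matrix of right singular vectors in the singular value decomposition \(X = U D V^\top\) of the centered covariate matrix (the decomposition itself is an assumption here; only the name \(V\) appears in the text). The names gram, max_dev and same_up_to_sign are illustrative.

```r
# Check that the right singular vectors of the centered data are orthonormal.
data("swiss")
covariates <- as.matrix(swiss[, names(swiss) != "Fertility"])
X <- scale(covariates, center = TRUE, scale = FALSE)

# Singular value decomposition X = U D t(V).
sv <- svd(X)
V <- sv$v

# t(V) %*% V should be the identity matrix up to rounding error.
gram <- crossprod(V)
max_dev <- max(abs(gram - diag(ncol(V))))
print(max_dev)

# V should agree with prcomp's rotation matrix, up to the sign of each column.
pca <- prcomp(covariates, center = TRUE, scale. = FALSE)
same_up_to_sign <- all.equal(abs(V), abs(unname(pca$rotation)))
print(same_up_to_sign)
```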
Pay attention to the correlation circles.
Explain the contrast between the two correlation circles.
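To compare the two circles numerically, one can compute the coordinates of each covariate as its correlations with the first two principal component scores, once without and once with scaling. The helper circle_coords and its argument standardize are hypothetical names introduced for this sketch.

```r
# Correlation-circle coordinates for unscaled and standardized PCA.
data("swiss")
covariates <- swiss[, names(swiss) != "Fertility"]

# Coordinates of each variable: correlation with the first two PC scores.
circle_coords <- function(data, standardize) {
  pca <- prcomp(data, center = TRUE, scale. = standardize)
  cor(data, pca$x[, 1:2])
}

circle_raw <- circle_coords(covariates, standardize = FALSE)
circle_std <- circle_coords(covariates, standardize = TRUE)

print(round(circle_raw, 2))
print(round(circle_std, 2))

# Squared distance to the origin: share of each variable's variance
# captured by the first principal plane (never more than 1).
quality_raw <- rowSums(circle_raw^2)
quality_std <- rowSums(circle_std^2)
print(round(cbind(quality_raw, quality_std), 2))
```

Without scaling, the variables with the largest variances dominate the first axes, which is one thing to look for when contrasting the two tables.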
In the sequel we focus on standardized PCA.
How many axes should we keep?
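One common way to approach this question is to look at the share of variance carried by each axis. The sketch below does so for standardized PCA; the eigenvalue-greater-than-one rule it applies is a heuristic chosen for illustration, not a rule prescribed by the lab.

```r
# Variance explained by each principal axis (standardized PCA).
data("swiss")
covariates <- swiss[, names(swiss) != "Fertility"]
pca_std <- prcomp(covariates, center = TRUE, scale. = TRUE)

# Eigenvalues of the correlation matrix are the squared standard deviations.
eigenvalues <- pca_std$sdev^2
prop_var <- eigenvalues / sum(eigenvalues)

print(round(data.frame(eigenvalue = eigenvalues,
                       proportion = prop_var,
                       cumulative = cumsum(prop_var)), 3))

# Heuristic: keep the axes whose eigenvalue exceeds 1, the average
# eigenvalue when the variables are standardized.
n_kaiser <- sum(eigenvalues > 1)
cat("Axes with eigenvalue > 1:", n_kaiser, "\n")

# Scree plot of the eigenvalues.
barplot(eigenvalues, names.arg = paste0("PC", seq_along(eigenvalues)),
        ylab = "Eigenvalue", main = "Scree plot (standardized PCA)")
```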
Fertility variable
Plot again the correlation circle using the same principal axes as before, but add the Fertility variable. How does Fertility relate to the covariates? To the principal axes?
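A possible sketch of this last step: Fertility is treated as a supplementary variable, meaning it plays no role in building the axes and is only placed on the circle through its correlations with the principal component scores. Standardized PCA and base graphics are assumed, and the names circle, fertility_coords and all_coords are illustrative.

```r
# Add Fertility to the correlation circle as a supplementary variable.
data("swiss")
covariates <- swiss[, names(swiss) != "Fertility"]
pca_std <- prcomp(covariates, center = TRUE, scale. = TRUE)
scores <- pca_std$x[, 1:2]

# Coordinates of the covariates: correlations with the first two PC scores.
circle <- cor(covariates, scores)

# Coordinates of Fertility, which was not used to build the axes.
fertility_coords <- cor(swiss$Fertility, scores)
rownames(fertility_coords) <- "Fertility"
all_coords <- rbind(circle, fertility_coords)
print(round(all_coords, 2))

# Direct correlations between Fertility and each covariate, for comparison.
print(round(cor(swiss$Fertility, covariates), 2))

# Draw the unit circle and one arrow per variable.
theta <- seq(0, 2 * pi, length.out = 200)
plot(cos(theta), sin(theta), type = "l", asp = 1,
     xlab = "PC1", ylab = "PC2", main = "Correlation circle with Fertility")
arrow_col <- ifelse(rownames(all_coords) == "Fertility", "firebrick", "black")
arrows(0, 0, all_coords[, 1], all_coords[, 2], length = 0.08, col = arrow_col)
text(all_coords[, 1], all_coords[, 2], labels = rownames(all_coords),
     pos = 3, col = arrow_col)
```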
https://scholar.google.com/citations?user=xbCKOYMAAAAJ&hl=fr&oi=ao