We investigate life tables describing countries from Western Europe (France, Great Britain –actually England and Wales–, Italy, the Netherlands, Spain, and Sweden) and the United States.
Life tables used here have been doctored and merged so as to simplify discussion.
We will use the following lookup tables to recode some factors.
```r
# ISO-like codes used in the raw data for each population
country_code <- list(fr_t = 'FRATNP',
                     fr_c = 'FRACNP',
                     be = 'BEL',
                     gb_t = 'GBRTENW',
                     gb_c = 'GBRCENW',
                     nl = 'NLD',
                     it = 'ITA',
                     swe = 'SWE',
                     sp = 'ESP',
                     us = 'USA')

# Populations retained in this lab
countries <- c('fr_t', 'gb_t', 'nl', 'it', 'sp', 'swe', 'us')

# Human-readable country names
country_names <- list(fr_t = 'France',           # total population
                      fr_c = 'France',           # civilian population
                      be = 'Belgium',
                      gb_t = 'England & Wales',  # total population
                      gb_c = 'England & Wales',  # civilian population
                      nl = 'Netherlands',
                      it = 'Italy',
                      swe = 'Sweden',
                      sp = 'Spain',
                      us = 'USA')

gender_names <- list('b' = 'Both',
                     'f' = 'Female',
                     'm' = 'Male')
```
```r
datafile <- 'full_life_table.Rds'
fpath <- stringr::str_c("./DATA/", datafile)
# here::here('DATA', datafile)   # check getwd() if problem

# Download the data once, then read the local copy
if (!file.exists(fpath)) {
  download.file("https://stephane-v-boucheron.fr/data/full_life_table.Rds",
                fpath,
                mode = "wb")
}

life_table <- readr::read_rds(fpath)
```
Two kinds of life tables can be distinguished: tables du moment (period tables), which contain, for each calendar year, the mortality risks at different ages for that very year; and tables de génération (cohort tables), which contain, for a given birth year, the mortality risks to which an individual born during that year has been exposed.
The life tables investigated in this lab are tables du moment. According to the document by Vallin and Meslé, building the life tables required decisions and doctoring.
Definitions can be obtained from www.lifeexpectancy.org. We translate them into mathematical (rather than demographic) language. Recall that, for a given year, the quantities in a life table define a probability distribution over \(\mathbb{N}\) (the possible ages at death). This probability distribution is a construction that reflects the health situation in a population at a given time (year). It does not describe the sequence of health situations experienced by a cohort (people born during a specific year).
We work with a period, or current, life table (table du moment). It summarizes the mortality experience of persons across all ages over a short period, typically one year or three years. More precisely, the death probabilities \(q(x)\) for every age \(x\) are computed for that short period, often using census information gathered at regular intervals. These \(q(x)\)'s are then applied to a hypothetical cohort of \(100\,000\) people over their life span to produce a life table.
In the sequel, we denote by \(F_{t}\) the cumulative distribution function for year \(t\): \(F_t(x)\) is the probability of dying at an age not larger than \(x\).
By convention, \(\overline{F}_t = 1 - F_t\) and \(F_t(-1)=0\).
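To make these conventions concrete, here is a minimal sketch in R. It assumes a made-up toy distribution of ages at death on ages 0 to 4; the vector `p` and the names `F_t`, `F_bar` and `F_bar_ext` are hypothetical and are not taken from the life tables.

```r
# Hypothetical toy distribution of age at death on ages 0..4 (not real data)
ages <- 0:4
p <- c(0.1, 0.2, 0.3, 0.3, 0.1)   # P(death at age x); sums to 1

# Cumulative distribution function: F_t(x) = P(age at death <= x)
F_t <- cumsum(p)

# Survival (tail) function: F_bar(x) = 1 - F_t(x) = P(age at death > x)
F_bar <- 1 - F_t

# Convention F_t(-1) = 0, hence F_bar(-1) = 1.
# Entries of F_bar_ext correspond to ages -1, 0, 1, ..., 4.
F_bar_ext <- c(1, F_bar)

# Successive differences of the extended tail function give back p
p_recovered <- -diff(F_bar_ext)
```

Prepending the value at age \(-1\) is only a bookkeeping device: it lets the probability of dying at age \(0\) be written as a difference of tail values, like every other age.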
The life tables are highly redundant. Provided we adopt the right conventions, we can derive almost all columns from column qx.
```
# A tibble: 21 × 7
# Groups:   Country, Gender [21]
    Year Country         Gender    m1    m2    m3      m4
   <int> <fct>           <fct>  <int> <dbl> <dbl>   <dbl>
 1  1948 Spain           Both       1 0.874 2.20  0.00838
 2  1948 Spain           Female     1 0.789 1.56  0.00816
 3  1952 Spain           Male       1 0.802 5.5   0.0119
 4  2004 Italy           Both       1 0.836 0.968 0.0150
 5  2004 Italy           Female     1 0.875 1.03  0.0149
 6  1984 Italy           Male       1 0.774 5.56  0.0146
 7  2007 France          Both       1 0.887 0.976 0.0152
 8  2007 France          Female     1 0.890 0.980 0.0151
 9  1979 France          Male       1 0.764 4.97  0.0161
10  1992 England & Wales Both       1 0.898 2.42  0.0135
# ℹ 11 more rows
```
qx
(age-specific) risk of death at age \(x\), or mortality quotient at given age \(x\) for given year \(t\), that is, the probability of dying at age \(x\) given survival beyond age \(x-1\): \(q_{t,x} = \frac{\overline{F}_t(x-1) - \overline{F}_t(x)}{\overline{F}_t(x-1)}\).
For each year, each age, \(q_{t,x}\) is determined by data from that year.
We also have \[\overline{F}_{t}(x+1) = \overline{F}_{t}(x) \times (1-q_{t,x+1})\, .\]
mx
central death rate at age \(x\) during year \(t\). This is connected with \(q_{t,x}\) by \[m_{t,x} = -\log(1- q_{t,x}) \,, \] or equivalently \(q_{t,x} = 1 - \exp(-m_{t,x})\).
lx
the so-called survival function: the number of persons still alive at age \(x\) in the hypothetical cohort. These values are computed recursively from the \(q_{t,x}\) values using the formula \[l_{t,x+1} = l_{t,x} \times (1-q_{t,x}) \, ,\] with \(l_{t,0}\), the “radix” of the table, arbitrarily set to \(100\,000\). Functions \(l_{t,\cdot}\) and \(\overline{F}_t\) are connected by \[l_{t,x + 1} = l_{t,0} \times \overline{F}_t(x)\,.\] Note that in probability theory, \(\overline{F}\) is also called the survival or tail function.
dx
number of deaths at age \(x\) in the hypothetical cohort: \(d_{t,x} = q_{t,x} \times l_{t,x}\).
Lx
Total number of person-years lived by the cohort from age \(x\) to \(x+1\). This is the sum of the years lived by the \(l_{t, x+1}\) persons who survive the interval and by the \(d_{t,x}\) persons who die during the interval. The former contribute exactly \(1\) year each, while the latter contribute, on average, approximately half a year, so that \(L_{t,x} = l_{t,x+1} + 0.5 \times d_{t,x}\). This approximation assumes that deaths occur, on average, halfway through the age interval \(x\) to \(x+1\). It is satisfactory except at age 0 and at the oldest age, where other approximations are often used. We will stick to the simplified convention \(L_{t,x}= l_{t,x+1}\).
ex
Residual life expectancy at age \(x\) during year \(t\)
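The relations listed above can be chained to rebuild the other columns from the mortality quotients alone. The following is a minimal sketch in R, not the procedure used to build the actual tables. The quotients `qx` are made-up illustrative values, the last age is closed with a quotient of 1 so that the whole cohort eventually dies, and the residual life expectancy is assumed here to be the person-years still to be lived from age \(x\) onwards divided by \(l_{t,x}\) (a standard definition that the text above does not spell out). The variable name `years_remaining` is hypothetical.

```r
# Hypothetical mortality quotients for ages 0..4 (illustrative values only).
# Assumption: the table is closed with q = 1 at the last age.
age   <- 0:4
qx    <- c(0.05, 0.01, 0.02, 0.10, 1.00)
radix <- 100000

# lx: survivors at age x, with l_0 = radix and l_{x+1} = l_x * (1 - q_x)
lx <- radix * cumprod(c(1, 1 - qx[-length(qx)]))

# dx: deaths at age x, d_x = q_x * l_x
dx <- qx * lx

# mx: central death rate, m_x = -log(1 - q_x) (infinite where q_x = 1)
mx <- -log(1 - qx)

# Tail function at ages 0..4: F_bar(x) = l_{x+1} / l_0
F_bar <- cumprod(1 - qx)

# Lx under the simplified convention L_x = l_{x+1}
Lx <- radix * F_bar

# Assumed definition of ex: person-years lived from age x onwards,
# divided by the number of survivors at age x
years_remaining <- rev(cumsum(rev(Lx)))
ex <- years_remaining / lx

toy_table <- data.frame(age, qx, mx, lx, dx, Lx, ex)
```

Under the simplified convention \(L_{t,x} = l_{t,x+1}\), the value of `ex` computed this way counts complete years only, so it is lower than what the half-year correction would give.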
Western countries in 1948
Several pictures share a common canvas: we plot central death rates against ages using a logarithmic scale on the \(y\) axis. Countries are identified by aesthetics (shape, color, linetype). Abiding by the DRY (Don't Repeat Yourself) principle, we define a prototype ggplot (alternatively plotly) object. The prototype will be fed with different datasets, then decorated and arranged for the different figures.
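One possible way to organise such a prototype is sketched below. This is a function-based variant, not necessarily the construction used in this lab: the common canvas is wrapped in a hypothetical helper `proto_plot()` that takes a data frame and returns a ggplot object. The toy data frame and its values are made up, and the column names `Age`, `mx` and `Country` are assumptions about the layout of the data.

```r
library(ggplot2)

# Hypothetical toy data: central death rates for two made-up countries
toy <- data.frame(
  Age     = rep(0:4, times = 2),
  mx      = c(0.05, 0.01, 0.02, 0.03, 0.05,
              0.04, 0.008, 0.015, 0.025, 0.04),
  Country = rep(c("A", "B"), each = 5)
)

# Prototype: the shared canvas, reusable with any data frame
# that has columns Age, mx and Country
proto_plot <- function(df) {
  ggplot(df, aes(x = Age, y = mx, colour = Country, linetype = Country)) +
    geom_line() +
    scale_y_log10() +
    labs(x = "Age", y = "Central death rate (log scale)")
}

# Feed the prototype with a dataset, then decorate it for a given figure
p <- proto_plot(toy) + ggtitle("Toy example")
```

Each figure then differs only by the data frame passed to `proto_plot()` and by the decorations added afterwards, so the mapping and the scale are written once.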