Chapter 2 Modeling Lifetimes

2.1 Mortality data: Life Expectancies, Deaths, Counts, & Lifetimes

2.1.1 Life Expectancies

Perhaps the most relevant question to any individual when it comes to their future lifetime is: how long am I expected to live? As we will discuss, there are many angles to this question. However, as a first and proximate answer, we may rely on public statistics. Government agencies around the world publish vital statistics such as the life expectancies for their population, which are typically separated by age and gender – and potentially other attributes such as race.

The file CDCLifeExp.csv is provided with the supplemental information of this text. It contains an excerpt of the U.S. national vital statistics table “Expectation of life, by age, race, […] and sex: United States, 2017”, which is shown in Table 2.1.

library(knitr)
us_les <- read.csv("Data/CDCLifeExp.csv")
kable(us_les, caption="Life Expectancies From 2017 U.S. National Vital Statistics")

Table 2.1: Life Expectancies From 2017 U.S. National Vital Statistics
Age	Total	Male	Female	Hispanic..Total	Hispanic..Male	Hispanic..Female
0	78.6	76.1	81.1	81.8	79.1	84.3
20	59.4	57.0	61.8	62.5	59.9	64.9
40	40.7	38.7	42.6	43.5	41.2	45.5
60	23.3	21.7	24.7	25.5	23.6	27.0
80	9.2	8.4	9.8	10.5	9.4	11.1

This excerpt provides the life expectancy at ages 0 (newborn), 20, 40, 60, and 80 observed within a population, for males and females, with separate figures for the Hispanic subpopulation. There are a few immediate observations.

First, females generally seem to have a longer life expectancy than males, whereas the aggregate “Total” life expectancy is in between the two figures. This is intuitive as the aggregate population is (largely) made up of male and female individuals, so that the “Total” life expectancy is a weighted average of the gender-specific life expectancies, relative to the composition of the population.
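The weighted-average reading can be checked directly from Table 2.1. The following is a minimal sketch: the values are transcribed from the table, and the variable names (such as w_male) are chosen here for illustration. It solves Total = w * Male + (1 - w) * Female for the implied male weight w at each age. Because the published figures are rounded to one decimal, the resulting weights are only approximate.

```r
# Values transcribed from Table 2.1 (U.S., 2017)
age    <- c(0, 20, 40, 60, 80)
total  <- c(78.6, 59.4, 40.7, 23.3, 9.2)
male   <- c(76.1, 57.0, 38.7, 21.7, 8.4)
female <- c(81.1, 61.8, 42.6, 24.7, 9.8)

# Solve total = w * male + (1 - w) * female for the male weight w
w_male <- (female - total) / (female - male)

# Implied male weight by age, rounded for display
print(round(data.frame(age, w_male), 3))
```

Every implied weight lies between 0 and 1, as a weighted average requires, and the male weight declines at older ages, in line with the higher survival of females.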

Second, life expectancy is decreasing in age, which again is intuitive: older individuals will have shorter life expectancies, ceteris paribus. It may be somewhat less obvious that the differences in life expectancies are smaller than the differences in age; subtracting consecutive rows of Table 2.1 gives the decrease in life expectancy over each 20-year age gap.

kable(us_les[1:4,2:7]-us_les[2:5,2:7])

Total	Male	Female	Hispanic..Total	Hispanic..Male	Hispanic..Female
19.2	19.1	19.3	19.3	19.2	19.4
18.7	18.3	19.2	19.0	18.7	19.4
17.4	17.0	17.9	18.0	17.6	18.5
14.1	13.3	14.9	15.0	14.2	15.9

Hence, while a 40-year-old male is twenty years older than a 20-year-old male, the 20-year-old’s life expectancy is only 18.3 years higher. The difference of 1.7 years arises because a 20-year-old may not survive to age 40. To see this, note that 40-year-old males have already lived 20 years since age 20 and are expected to live another 38.7 years, for a total of 58.7 years; in contrast, 20-year-old males have a life expectancy of 57.0 years, which is 1.7 years less. In other words, 40-year-old males have a higher expected age at death than 20-year-old males, because we view them conditionally on having already survived to age 40. As the table reveals, this effect is more pronounced when comparing a 40-year-old with a 60-year-old, or a 60-year-old with an 80-year-old.
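The same arithmetic can be carried out for all ages at once. The sketch below uses the male column of Table 2.1, transcribed by hand; the names expected_age_at_death and gain are illustrative only. It adds the attained age to the remaining life expectancy and then takes differences between consecutive ages.

```r
# Male life expectancies transcribed from Table 2.1 (U.S., 2017)
age  <- c(0, 20, 40, 60, 80)
male <- c(76.1, 57.0, 38.7, 21.7, 8.4)

# Expected age at death, conditional on having survived to each age
expected_age_at_death <- age + male

# Increase in expected age at death from surviving another 20 years
gain <- diff(expected_age_at_death)

print(data.frame(age, expected_age_at_death))
print(gain)
```

The second entry of gain is the 1.7 years discussed above, and the entries grow with age, which is the "more pronounced" effect at older ages.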

Third, the life expectancies for the Hispanic subpopulation exceed the figures for the total population, which suggests that other subpopulations must exhibit a lower life expectancy. The potential reasons for this difference raise many questions, although these fall more in the demographic or even sociological realm. For instance, there are many interesting studies on the dependence of life expectancies on socio-economic factors, including some concerning recent trends related to so-called “deaths of despair” in the U.S.

From an actuarial perspective, a relevant question may be how we could model the mortality data. In other words, is there a simple parametric model that may describe the progression of life expectancy across ages, at least in the context of one particular population? We will return to this question in the context of our mortality models, particularly in Section 2.3.

As an early caveat to the question raised at the beginning of this section, it is not necessarily accurate to take these figures as estimates of a given individual’s future lifetime or even its expectation. The reason is that life expectancies are usually computed from recent mortality experience rather than from forecasts. As we will discuss in Section 2.4, this is the difference between the so-called period and cohort life expectancies.

2.1.2 Population Mortality Counts

We now bring into consideration mortality experience for populations that have been observed over time, which is available from the Human Mortality Database (HMD) for a wide range of countries. The available data include Exposures by age, sex, and calendar year period, i.e. how many people of a given age and sex lived in the country’s population during a given period of time, and corresponding Deaths, i.e. how many of these individuals died.

In the supplemental information to this text, we provide exposures and deaths for the U.S. population, downloaded from the HMD, as HMD_Expo.csv and HMD_Deaths.csv. We use the data over five-year intervals from 1935 to 2015. Let us take a look at the exposures:

us_exp <- read.csv("Data/HMD_Expo.csv")

kable(head(us_exp), align = "cccrrr", digits = 2, format.args = list(big.mark = ","))

Year_start	Year_end	Age	Female	Male	Total
1935	1939	0	4,869,267	5,057,569	9,926,836
1935	1939	1	4,802,597	4,936,238	9,738,835
1935	1939	2	5,119,574	5,244,634	10,364,208
1935	1939	3	5,159,494	5,287,402	10,446,896
1935	1939	4	5,189,350	5,307,754	10,497,104
1935	1939	5	5,359,159	5,531,967	10,891,126

and deaths:

us_deaths <- read.csv("Data/HMD_Deaths.csv")

kable(head(us_deaths), align = "cccrrr", digits = 2, format.args = list(big.mark = ","))

Year_start	Year_end	Age	Female	Male	Total
1935	1939	0	253,145.89	335,492.00	588,637.89
1935	1939	1	36,010.02	42,169.20	78,179.22
1935	1939	2	17,718.83	21,208.22	38,927.05
1935	1939	3	12,450.36	14,852.17	27,302.53
1935	1939	4	10,154.85	11,771.25	21,926.10
1935	1939	5	8,678.43	10,339.57	19,018.00
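Deaths and exposures are most informative when combined. As a minimal sketch, the rows displayed above for females in 1935-1939 can be divided to give a crude death rate (deaths per unit of exposure) at each age. The numbers are transcribed from the two tables; the names expo_f, deaths_f, and rate_f are illustrative and not part of the supplied data files.

```r
# Female exposures and deaths for 1935-1939, ages 0 to 5,
# transcribed from the two tables above
age      <- 0:5
expo_f   <- c(4869267, 4802597, 5119574, 5159494, 5189350, 5359159)
deaths_f <- c(253145.89, 36010.02, 17718.83, 12450.36, 10154.85, 8678.43)

# Crude death rate: deaths divided by exposure at each age
rate_f <- deaths_f / expo_f

print(data.frame(age, rate_f = signif(rate_f, 4)))
```

With the full data frames, the same ratio could be formed as us_deaths$Female / us_exp$Female, assuming the rows of the two files are in the same order; merging on Year_start and Age first would avoid relying on that assumption.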

To illustrate, let us plot the exposures and deaths for 70-year-old U.S. females over time, which is shown in Figure 2.1.
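A plot of this kind could be produced along the following lines. This is a sketch under stated assumptions, not necessarily the code behind Figure 2.1: the helper plot_exposure_deaths is a hypothetical name introduced here, and it assumes the column layout shown in the tables above (Year_start, Age, Female, Male, Total).

```r
# Hypothetical helper: plot exposures and deaths over time for one age and sex.
# Assumes columns Year_start, Age, and one column per sex, as in the tables above.
plot_exposure_deaths <- function(expo, deaths, age = 70, sex = "Female") {
  e <- expo[expo$Age == age, c("Year_start", sex)]
  d <- deaths[deaths$Age == age, c("Year_start", sex)]
  names(e)[2] <- "Exposures"
  names(d)[2] <- "Deaths"

  # Align the two series by period start year
  out <- merge(e, d, by = "Year_start")

  # Two panels side by side; restore graphics settings on exit
  op <- par(mfrow = c(1, 2))
  on.exit(par(op))
  plot(out$Year_start, out$Exposures, type = "b",
       xlab = "Start of five-year period", ylab = "Exposures",
       main = paste("Exposures, age", age))
  plot(out$Year_start, out$Deaths, type = "b",
       xlab = "Start of five-year period", ylab = "Deaths",
       main = paste("Deaths, age", age))

  invisible(out)
}

# Usage with the supplied files, if they are available locally
if (file.exists("Data/HMD_Expo.csv") && file.exists("Data/HMD_Deaths.csv")) {
  us_exp    <- read.csv("Data/HMD_Expo.csv")
  us_deaths <- read.csv("Data/HMD_Deaths.csv")
  plot_exposure_deaths(us_exp, us_deaths, age = 70, sex = "Female")
}
```

The function returns the merged series invisibly, so the plotted values can also be inspected as a table.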

Month_of_Sale	Age	Sex	BMI	BloodPressure	Claim	Time_of_death
1	27	0	25.8	117	YES	55.63
1	51	1	17.6	109	YES	18.53
1	59	1	22.5	132	YES	15.88
1	37	1	22.9	109	YES	57.40
1	62	0	30.9	147	YES	27.83
1	31	0	17.3	91	YES	64.06

Month_of_Sale	Age	Sex	Smoking	BMI	BloodPressure	Claim	Time_of_death
780	43	0	0	17.10	110	NO	NA
780	57	1	1	19.30	118	NO	NA
780	40	0	0	20.10	117	NO	NA
780	27	1	0	20.60	90	NO	NA
780	55	1	1	20.10	118	NO	NA
780	23	1	0	19.03	82	NO	NA

	vars	n	mean	sd	median	trimmed	mad	min	max	range	skew	kurtosis	se
Month_of_Sale	1	160,781	394.58	216.72	382.00	393.08	277.25	1.0	780.0	779.0	0.08	-1.17	0.54
Age	2	160,781	39.99	11.61	39.00	39.65	11.86	19.0	65.0	46.0	0.22	-0.73	0.03
Sex	3	160,781	0.70	0.46	1.00	0.75	0.00	0.0	1.0	1.0	-0.87	-1.25	0.00
Smoking	4	160,781	0.30	0.46	0.00	0.25	0.00	0.0	1.0	1.0	0.88	-1.22	0.00
BMI	5	160,781	22.79	4.55	21.70	22.20	3.85	16.1	69.6	53.5	1.46	3.26	0.01
BloodPressure	6	160,781	114.74	15.75	114.00	114.36	16.31	57.0	208.0	151.0	0.26	0.09	0.04
Claim*	7	160,781	1.37	0.48	1.00	1.34	0.00	1.0	2.0	1.0	0.54	-1.71	0.00
Time_of_death	8	59,382	28.27	13.69	28.36	28.24	15.21	0.0	64.2	64.2	0.02	-0.73	0.06

Sex	A	B	c
Female	0.0005386	0.0000112	0.1031558
Male	0.0008564	0.0000354	0.0927685

Year_start	Age	mx US Females	qx US Females
2010	0	0.005461737	0.005446862
2010	1	0.000379927	0.000379855
2010	2	0.000224890	0.000224865
2010	3	0.000169394	0.000169380
2010	4	0.000138322	0.000138313
2010	5	0.000119244	0.000119237

Year_start	Age	mx US Males	qx US Males
2010	0	0.006554279	0.006532870
2010	1	0.000441664	0.000441567
2010	2	0.000298869	0.000298824
2010	3	0.000225097	0.000225072
2010	4	0.000184733	0.000184715
2010	5	0.000145992	0.000145981
