Chapter 12 Experience Rating Using Credibility Theory

Chapter Preview. This chapter introduces credibility theory as an important actuarial tool for estimating pure premiums, frequencies, and severities for individual risks or classes of risks. Credibility theory provides a convenient framework for combining the experience for an individual risk or class with other data to produce more stable and accurate estimates. Several models for calculating credibility estimates will be discussed including Bühlmann, Bühlmann-Straub, limited fluctuation, and nonparametric and semiparametric credibility methods. The chapter will also show a connection between credibility theory and Bayesian estimation which was introduced in Chapter 9, Bayesian Inference and Modeling.

12.1 Introduction to Applications of Credibility Theory

What premium should be charged to provide insurance? The answer depends upon the exposure to the risk of loss. A common method to compute an insurance premium is to rate an insured using a classification rating plan. A classification plan is used to select an insurance rate based on an insured’s rating characteristics such as geographic territory, age, etc. All classification rating plans use a limited set of criteria to group insureds into a “class” and there will be variation in the risk of loss among insureds within the class.

An experience rating plan attempts to capture some of the variation in the risk of loss among insureds within a rating class by using the insured’s own loss experience to complement the rate from the classification rating plan. One way to do this is to use a credibility weight \(Z\) with \(0\leq Z \leq 1\), the weight assigned to an insured’s historical loss experience for the purposes of determining their premium in an experience rating plan, to compute \[ \hat{R}=Z\bar{X}+(1-Z)M, \] where \[\begin{eqnarray*} \hat{R}&=&\textrm{credibility weighted rate for risk,}\\ \bar{X}&=&\textrm{average loss for the risk over a specified time period,}\\ M&=&\textrm{the rate for the classification group, often called the manual rate.}\\ \end{eqnarray*}\] For a risk whose loss experience is stable from year to year, \(Z\) might be close to 1. For a risk whose losses vary widely from year to year, \(Z\) may be close to 0.
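As a minimal sketch of the weighting formula above, the following Python snippet blends a risk's own average loss with the manual rate. The function name and the numeric inputs (a credibility of 0.25, an average loss of 800, a manual rate of 1,000) are hypothetical values chosen only for illustration.

```python
def credibility_weighted_rate(z, x_bar, manual_rate):
    """Return R = Z * x_bar + (1 - Z) * M for a credibility weight Z in [0, 1]."""
    if not 0.0 <= z <= 1.0:
        raise ValueError("credibility weight Z must lie between 0 and 1")
    return z * x_bar + (1.0 - z) * manual_rate


# Hypothetical inputs: Z = 0.25, average loss 800, manual rate 1000.
rate = credibility_weighted_rate(0.25, 800.0, 1000.0)
print(rate)
```

With a low \(Z\) the result stays close to the manual rate; raising \(Z\) moves it toward the risk's own experience.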

Credibility theory is also used for computing rates for individual classes within a classification rating plan. When classification plan rates are being determined, some or many of the groups may not have sufficient data to produce stable and reliable rates. The actual loss experience for a group will be assigned a credibility weight \(Z\) and the complement of credibility \(1-Z\) (the remainder of the weight, not assigned to the historical loss experience) may be given to the average experience for risks across all classes. Or, if a class rating plan is being updated, the complement of credibility may be assigned to the current class rate (the average rate per exposure for an insured in a particular classification group). Credibility theory can also be applied to the calculation of expected frequencies and severities.

Computing numeric values for \(Z\) requires analysis and understanding of the data. What are the variances in the number of losses and sizes of losses for risks? What is the variance between expected values across risks?


12.2 Bühlmann Credibility


In this section, you learn how to:

  • Compute a credibility-weighted estimate for the expected loss for a risk or group of risks.
  • Determine the credibility \(Z\) assigned to observations.
  • Calculate the values required in Bühlmann credibility including the Expected Value of the Process Variance (\(EPV\)), Variance of the Hypothetical Means (\(VHM\)) and collective mean \(\mu\).

A classification rating plan groups policyholders together into classes based on risk characteristics. Although policyholders within a class have similarities, they are not identical and their expected losses will not be exactly the same. An experience rating plan can supplement a class rating plan by credibility weighting an individual policyholder’s loss experience with the class rate to produce a more accurate rate for the policyholder. Chapter 15 Experience Rating using Bonus-Malus provides examples of rating plans that adjust a policyholder’s rate to recognize their loss experience.

The Bühlmann credibility model introduced in this section is often called greatest accuracy credibility, least-squares credibility, or Bayesian credibility.

In this presentation a risk parameter \(\theta\), a parameter in a distribution whose value reflects the risk categorization, will be assigned to each policyholder. Losses \(X\) for the policyholder with parameter \(\theta\) will have a pdf \(f_{X|\Theta=\theta}(x)\) and mean \[\begin{equation} \mu(\theta)=\mathrm{E}_{X}(X|\theta)=\int xf_{X|\Theta=\theta}(x) \, dx \end{equation}\] and variance \[\begin{equation} \sigma^2(\theta)=\mathrm{Var}_{X}(X|\theta)=\int (x-\mu(\theta))^2 f_{X|\Theta=\theta}(x) \, dx. \end{equation}\] The integrals are over the support for the distributions. Losses \(X\) can represent pure premiums, aggregate losses, number of claims, claim severities, or some other measure of loss for a period of time, often one year. Risk parameter \(\theta\) may be continuous or discrete and may be multivariate depending on the model. For a randomly selected risk the risk parameter \(\theta\) is unknown but the probability density function for \(\theta\) is modeled with \(f_{\Theta}(\theta)\). Averaging across the policyholders in the class the collective mean loss is \[\begin{equation} \mu=\mathrm{E_{\Theta}}[\mathrm{E_{X}}(X|\theta)]=\int f_{\Theta}(\theta)\mu(\theta) \, d\theta=\int f_{\Theta}(\theta) \int xf_{X|\Theta=\theta}(x) \, dx \, d\theta. \end{equation}\]

Example 12.2.1. The number of claims \(X\) for an insured in a class has a Poisson distribution with mean \(\theta>0\). The risk parameter \(\theta\) is exponentially distributed within the class with pdf \(f(\theta)=e^{-\theta}\). What is the expected number of claims for an insured chosen at random from the class?


In the prior example the risk parameter \(\theta\) is a continuous random variable with an exponential distribution. In the next example there are three types of risks and the risk parameter has a discrete distribution.

Example 12.2.2. For any risk (policyholder) in a population the number of losses \(N\) in a year has a Poisson distribution with parameter \(\lambda\). Individual loss amounts \(X_i\) for a risk are independent of \(N\) and are iid with Type II Pareto distribution \(F(x)=1-[\theta/(x+\theta)]^{\alpha}\). There are three types of risks in the population as follows:

\[ \small{ \begin{array}{|c|c|c|c|} \hline \text{Risk } & \text{Percentage} & \text{Poisson} & \text{Pareto} \\ \text{Type} & \text{of Population} & \text{Parameter} & \text{Parameters} \\ \hline A & 50\% & \lambda=0.5 & \theta=1000, \alpha=2.0 \\ B & 30\% & \lambda=1.0 & \theta=1500, \alpha=2.0 \\ C & 20\% & \lambda=2.0 & \theta=2000, \alpha=2.0 \\ \hline \end{array} } \]

If a risk is selected at random from the population, what is the expected aggregate loss in a year?
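One way to check this kind of calculation numerically is sketched below. It assumes the standard mean of the Type II Pareto distribution, \(\theta/(\alpha-1)\) for \(\alpha>1\), and uses the fact that expected aggregate loss for a risk is expected frequency times expected severity when \(N\) is independent of the loss amounts. The variable and function names are illustrative only.

```python
# (population share, Poisson lambda, Pareto theta, Pareto alpha) from the table
risk_types = {
    "A": (0.5, 0.5, 1000.0, 2.0),
    "B": (0.3, 1.0, 1500.0, 2.0),
    "C": (0.2, 2.0, 2000.0, 2.0),
}


def pareto_mean(theta, alpha):
    """Mean of the Type II Pareto distribution; finite only when alpha > 1."""
    if alpha <= 1.0:
        raise ValueError("the Pareto mean is infinite when alpha <= 1")
    return theta / (alpha - 1.0)


def expected_aggregate_loss(types):
    """Average lambda * E(X) over risk types, weighted by population share."""
    return sum(
        share * lam * pareto_mean(theta, alpha)
        for share, lam, theta, alpha in types.values()
    )


collective_mean = expected_aggregate_loss(risk_types)
print(round(collective_mean, 2))
```

The hypothetical means for types A, B, and C are 500, 1,500, and 4,000, so the population-weighted average comes to 1,500.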


What is the risk parameter for a risk (policyholder) in the prior example? One could say that the risk parameter has three components \((\lambda,\theta,\alpha)\) with possible values (0.5,1000,2.0), (1.0,1500,2.0), and (2.0,2000,2.0) depending on the type of risk.

Note that in both of the examples the risk parameter is a random quantity with its own probability distribution. We do not know the value of the risk parameter for a randomly chosen risk.

12.2.1 Credibility-Weighted Estimate for the Expected Loss

If a policyholder with risk parameter \(\theta\) has losses \(x_1, \ldots, x_n\) during \(n\) time periods then the goal is to find \(\mathrm{E}_{\Theta}(\mu(\theta)|x_1,\ldots, x_n)\), the conditional expectation of \(\mu(\theta)\) given observations \(x_1,\ldots, x_n\). Section 12.3, Bayesian Inference and Bühlmann Credibility explains how to evaluate \(\mathrm{E}_{\Theta}(\mu(\theta)|x_1,\ldots, x_n)\) using Bayesian inference.

The Bühlmann credibility model calculates a linear approximation \(\hat{\mu}(\theta)=Z\bar{x}+(1-Z)\mu\) to estimate \(\mathrm{E}_{\Theta}(\mu(\theta)|x_1,\ldots,x_n)\) with \(\bar{x}=(x_1+\ldots+x_n)/n\). We can rewrite this as \(\hat{\mu}(\theta)=a+b\bar{x}\) which makes it obvious that the credibility estimate is a linear function of the mean.

In the Bühlmann model, \(\mathrm{E}_{\Theta}(\mu(\theta)|X_1,\ldots,X_n)\) is approximated by the linear function \(a+b\bar{X}\) and constants \(a\) and \(b\) are calculated to minimize the square of the difference between these two quantities \[\begin{equation} G(a,b)=\mathrm{E}_{X}([\mathrm{E}_{\Theta}(\mu(\theta)|X_1,\ldots,X_n)-(a+b\bar{X})]^2), \end{equation}\] hence the alternative name least-squares credibility. Minimizing the expectation yields \(b=n/(n+K)\) and \(a=(1-b)\mu\). Quantity \(n\) is the number of observations and \(\mu=\mathrm{E}_{\Theta}(\mu(\theta))\) is the population mean. For the moment we will assign \(K\) the mysterious equation \(K\) = (Expected Value of the Process Variance) / (Variance of the Hypothetical Means)=\(EPV/VHM\) and will clarify the meaning at the beginning of the next section. More details about this model and calculation of \(a\) and \(b\) can be found in references (Bühlmann 1967), (Bühlmann and Gisler 2005), (Klugman, Panjer, and Willmot 2012), and (Tse 2009).

The Bühlmann credibility-weighted estimate for \(\mathrm{E}_{\Theta}(\mu(\theta)|x_1,\ldots, x_n)\) for the policyholder is \[\begin{equation} \hat{\mu}(\theta)=Z\bar{x}+(1-Z)\mu \tag{12.1} \end{equation}\] with \[\begin{eqnarray*} \theta&=&\textrm{a risk parameter that identifies a policyholder's risk level}\\ \hat{\mu}(\theta)&=&\textrm{estimated expected loss for a policyholder with parameter }\theta\\ & & \textrm{and loss experience } \bar{x}\\ \bar{x}&=&(x_1+\cdots+x_n)/n \textrm{ is the average of $n$ observations of the policyholder } \\ Z&=&\textrm{credibility assigned to $n$ observations } = n/(n+K) \\ K&=&EPV/VHM \\ \mu&=&\textrm{the expected loss for a randomly chosen policyholder in the class.}\\ \end{eqnarray*}\]
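Formula (12.1) can be sketched in a few lines of Python. The function name, the loss history, and the values of \(K\) and \(\mu\) below are hypothetical and serve only to show how the pieces fit together; in practice \(K\) and \(\mu\) come from the model or from data.

```python
def buhlmann_estimate(losses, k, mu):
    """Return (Z, estimate) with Z = n / (n + K) and estimate = Z*x_bar + (1-Z)*mu."""
    n = len(losses)
    if n == 0:
        # No experience: all weight goes to the collective mean.
        return 0.0, mu
    x_bar = sum(losses) / n
    z = n / (n + k)
    return z, z * x_bar + (1.0 - z) * mu


# Hypothetical inputs: four annual claim counts, K = 4, collective mean 0.5.
z, estimate = buhlmann_estimate([0, 2, 1, 1], k=4.0, mu=0.5)
print(z, estimate)
```

Here the policyholder's average of 1.0 claims per year receives half the weight, so the estimate falls midway between the experience and the collective mean.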

For a selected policyholder, random variables \(X_j\) are assumed to be iid for \(j=1,\ldots,n\) because it is assumed that the policyholder’s exposure to loss is not changing through time and \(\mathrm{E}_{X}(\bar{X}|\theta)=\mathrm{E}_{X}(X_j|\theta)=\mu(\theta)\).

If a policyholder is randomly chosen from the class and there is no loss information about the risk then the expected loss is \(\mu=\mathrm{E}_{\Theta}(\mu(\theta))\) where the expectation is taken over all \(\theta\)’s in the class. In this situation \(Z=0\) and the expected loss is \(\hat\mu(\theta)=\mu\) for the risk. The quantity \(\mu\) can also be written as \(\mu=\mathrm{E}(X_j)\) or \(\mu=\mathrm{E}(\bar{X})\) and is referred to as the overall mean, population mean, or collective mean. Note that \(\mathrm{E}(X_j)\) is evaluated with the law of total expectation (the expected value of the conditional expected value of \(X\) given \(Y\) equals the expected value of \(X\)): \(\mathrm{E}(X_j)=\mathrm{E_\Theta}[\mathrm{E_X}(X_j|\theta)]\).

Although formula (12.1) was introduced using experience rating as an example, the Bühlmann credibility model has wider application. Suppose that a rating plan has multiple classes. Credibility formula (12.1) can be used to determine individual class rates. The overall mean \(\mu\) would be the average loss for all classes combined, \(\bar{x}\) would be the experience for the individual class, and \(\hat{\mu}(\theta)\) would be the estimated loss for the class.

12.2.2 Credibility Z, EPV, and VHM

When computing the credibility estimate \(\hat{\mu}(\theta)=Z\bar{X}+(1-Z)\mu\), how much weight \(Z\) should go to experience \(\bar{X}\) and how much weight \((1-Z)\) to the overall mean \(\mu\)? In Bühlmann credibility there are three factors that need to be considered:

  1. How much variation is there in a single observation \(X_j\) for a selected risk? With \(\bar{X}=(X_1+\cdots+X_n)/n\) and assuming that the observations are iid conditional on \(\theta\), it follows that \(\mathrm{Var}_{X}(\bar{X}|\theta)\) = \(\mathrm{Var}_{X}(X_j|\theta)/n\). For larger \(\mathrm{Var}_{X}(\bar{X}|\theta)\) less credibility weight \(Z\) should be given to experience \(\bar{X}\). The Expected Value of the Process Variance, abbreviated \(EPV\), is the average of the natural variability of observations within each risk, that is, the expected value of \(\mathrm{Var}_{X}(X_j|\theta)\) across all risks: \[ EPV = \mathrm{E}_{\Theta}(\mathrm{Var}_{X}(X_j|\theta)). \] Because \(\mathrm{Var}_{X}(\bar{X}|\theta)\) = \(\mathrm{Var}_{X}(X_j|\theta)/n\) it follows that \(\mathrm{E}_{\Theta}(\mathrm{Var}_{X}(\bar{X}|\theta))=EPV/n\).
  2. How homogeneous is the population of risks whose experience was combined to compute the overall mean \(\mu\)? If all the risks are similar in loss potential then more weight \((1-Z)\) would be given to the overall mean \(\mu\) because \(\mu\) is the average for a group of similar risks whose means \(\mu(\theta)\) are not far apart. The homogeneity or heterogeneity of the population is measured by the Variance of the Hypothetical Means, abbreviated \(VHM\), which is the variance of the means across the different risks or classes and indicates how similar or different they are from one another: \[ VHM=\mathrm{Var}_{\Theta}(\mathrm{E}_{X}(X_j|\theta))=\mathrm{Var}_{\Theta}(\mathrm{E}_{X}(\bar{X}|\theta)). \] Note that we used \(\mathrm{E}_{X}(\bar{X}|\theta)=\mathrm{E}_{X}(X_j|\theta)\) for the second equality.
  3. How many observations \(n\) were used to compute \(\bar{X}\)? A larger sample would imply a larger \(Z\).

Example 12.2.3. The number of claims \(N\) in a year for a risk in a population has a Poisson distribution with mean \(\lambda>0\). The risk parameter \(\lambda\) is uniformly distributed over the interval \((0,2)\). Calculate the \(EPV\) and \(VHM\) for the population.
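The quantities in this example can be checked with exact arithmetic. The sketch below relies on two standard facts: a Poisson distribution has mean and variance both equal to \(\lambda\), and a uniform distribution on \((a,b)\) has mean \((a+b)/2\) and variance \((b-a)^2/12\). The variable names are illustrative.

```python
from fractions import Fraction

# Uniform(0, 2) distribution for the risk parameter lambda.
low, high = Fraction(0), Fraction(2)
mean_lambda = (low + high) / 2
var_lambda = (high - low) ** 2 / 12

# Poisson claims: Var(N | lambda) = lambda, so EPV = E(lambda).
epv = mean_lambda
# Poisson claims: E(N | lambda) = lambda, so VHM = Var(lambda).
vhm = var_lambda
k = epv / vhm

print(epv, vhm, k)
```

Under these assumptions the \(EPV\) is 1, the \(VHM\) is 1/3, and their ratio is 3.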


The Bühlmann credibility formula includes values for \(n\), \(EPV\), and \(VHM\): \[\begin{equation} Z=\frac{n}{n+K} \quad , \quad K =\frac{EPV}{VHM}. \tag{12.2} \end{equation}\] If the \(VHM\) increases then \(Z\) increases. If the \(EPV\) increases then \(Z\) gets smaller. Credibility \(Z\) asymptotically approaches 1 as the number of observations \(n\) goes to infinity.

If you multiply the numerator and denominator of the \(Z\) formula by (\(VHM\)/\(n\)) then \(Z\) can be rewritten as \[ Z=\frac{VHM}{VHM+(EPV/n)} . \] The number of observations \(n\) is captured in the term (\(EPV/n\)).
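The two forms of \(Z\) can be compared numerically. In this sketch the values \(EPV=5\) and \(VHM=2\) are hypothetical; the loop shows that both expressions agree and that \(Z\) rises toward 1 as \(n\) grows.

```python
def z_from_k(n, epv, vhm):
    """Z = n / (n + K) with K = EPV / VHM, as in formula (12.2)."""
    return n / (n + epv / vhm)


def z_from_variances(n, epv, vhm):
    """Equivalent form Z = VHM / (VHM + EPV / n)."""
    return vhm / (vhm + epv / n)


# Hypothetical variance components chosen for illustration.
epv, vhm = 5.0, 2.0
z_values = [z_from_k(n, epv, vhm) for n in (1, 5, 25, 125)]
for n, z in zip((1, 5, 25, 125), z_values):
    print(n, round(z, 4), round(z_from_variances(n, epv, vhm), 4))
```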

Example 12.2.4. The law of total variance, which decomposes the variance of a random variable into conditional components, can be written as \(\mathrm{Var}(Y)=\mathrm{E}(\mathrm{Var}[Y|X])+\mathrm{Var}(\mathrm{E}[Y|X])\). Show that \(\mathrm{Var}(\bar{X})\) = \(VHM + (EPV/n)\) and derive a formula for \(Z\) in terms of \(\bar{X}\).


The following long example and solution demonstrate how to compute the credibility-weighted estimate with frequency and severity data.

Example 12.2.5. For any risk in a population the number of losses \(N\) in a year has a Poisson distribution with parameter \(\lambda\). Individual loss amounts \(X\) for a selected risk are independent of \(N\) and are iid with exponential distribution \(F(x)=1-e^{-x/\beta}\). There are three types of risks in the population as shown below. A risk was selected at random from the population and all losses were recorded over a five-year period. The total amount of losses over the five-year period was 5,000. Use Bühlmann credibility to estimate the annual expected aggregate loss for the risk.
\[ \small{ \begin{array}{|c|c|c|c|} \hline \text{Risk } & \text{Percentage} & \text{Poisson} & \text{Exponential} \\ \text{Type} & \text{of Population} & \text{Parameter} & \text{Parameter} \\ \hline A & 50\% & \lambda=0.5 & \beta=1000 \\ B & 30\% & \lambda=1.0 & \beta=1500 \\ C & 20\% & \lambda=2.0 & \beta=2000 \\ \hline \end{array} } \]
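The calculation for this example can be sketched as follows. It assumes two standard results: for a compound Poisson aggregate loss the mean is \(\lambda\mathrm{E}(X)\) and the variance is \(\lambda\mathrm{E}(X^2)\), and for an exponential distribution with mean \(\beta\) the second moment is \(2\beta^2\). The variable and function names are illustrative.

```python
# (population share, Poisson lambda, exponential mean beta) from the table
risk_types = [
    (0.5, 0.5, 1000.0),
    (0.3, 1.0, 1500.0),
    (0.2, 2.0, 2000.0),
]


def hypothetical_mean(lam, beta):
    """Expected annual aggregate loss for one risk type: lambda * E(X)."""
    return lam * beta


def process_variance(lam, beta):
    """Compound Poisson variance lambda * E(X^2), with E(X^2) = 2 * beta^2."""
    return lam * 2.0 * beta ** 2


mu = sum(p * hypothetical_mean(lam, beta) for p, lam, beta in risk_types)
epv = sum(p * process_variance(lam, beta) for p, lam, beta in risk_types)
vhm = sum(p * (hypothetical_mean(lam, beta) - mu) ** 2 for p, lam, beta in risk_types)

k = epv / vhm
n_years = 5
x_bar = 5000.0 / n_years  # observed average annual aggregate loss
z = n_years / (n_years + k)
estimate = z * x_bar + (1.0 - z) * mu

print(round(mu, 2), round(epv, 2), round(vhm, 2))
print(round(k, 4), round(z, 4), round(estimate, 2))
```

Under these assumptions the collective mean is 1,500, \(K\) is about 2.89, \(Z\) is about 0.634, and the credibility estimate is roughly 1,183 per year.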


In real world applications of Bühlmann credibility the value of \(K=EPV/VHM\) must be estimated. Sometimes a value for \(K\) is selected using judgment. A smaller \(K\) makes estimator \(\hat{\mu}(\theta)\) more responsive to actual experience \(\bar{X}\) whereas a larger \(K\) produces a more stable estimate by giving more weight to \(\mu\). Judgment may be used to balance responsiveness and stability. Section 12.5 in this chapter will discuss methods for determining \(K\) from data.


12.3 Bayesian Inference and Bühlmann Credibility


In this section, you learn how to:

  • Calculate formulas for expected outcomes for beta-binomial and gamma-Poisson models using Bayes Theorem or Bühlmann credibility.
  • Understand the connection between the Bayesian and Bühlmann estimates for conjugate families.

Chapter 9 presents Bayesian inference and modeling and it is assumed that the reader is familiar with that material, in particular, Section 9.3 which discusses conjugate families. This section will compare Bayesian inference with Bühlmann credibility and show connections between the two models.

First we will look at a Bayesian model. Suppose a risk has \(n\) observed losses \(x_1, x_2, ..., x_n\). These losses will be represented by the vector \(\mathbf{x} = (x_1, x_2, ..., x_n)\), which is a realization of the random vector \(\mathbf{X} = (X_1, X_2, ..., X_n)\) whose components we will assume are iid.

A risk with risk parameter \(\theta\) has expected loss \(\mu(\theta)=\mathrm{E}_{X}(X|\theta)\). If the risk had losses \(\mathbf{x}\) then \(\mathrm{E}_{\Theta}(\mu(\theta)|\mathbf{x})\) is the conditional expectation of \(\mu(\theta)\) given outcomes \(\mathbf{x}\). The expected loss is updated to reflect the observations.

The expectation \(\mathrm{E}_{\Theta}(\mu(\theta)|\mathbf{x})\) can be calculated using the conditional density function \(f_{X|\Theta=\theta}(x|\theta)\) and the posterior distribution \(f_{\Theta|\mathbf{X}=\mathbf{x}}(\theta|\mathbf{x})\): \[\begin{eqnarray*} \mu(\theta)&=&\mathrm{E}_{X}(X|\theta)=\int xf_{X|\Theta=\theta}(x|\theta)dx \\ \mathrm{E}_{\Theta}(\mu(\theta)|\mathbf{x})&=&\int \mu(\theta) f_{\Theta|\mathbf{X}=\mathbf{x}}(\theta|\mathbf{x}) d\theta. \end{eqnarray*}\] The integrations are over the support of the distributions. The posterior distribution comes from Bayes theorem, which expresses the conditional distribution of \(\theta\) given \(\mathbf{x}\) in terms of the conditional distribution of \(\mathbf{x}\) given \(\theta\) and the unconditional distribution of \(\theta\): \[\begin{equation*} f_{\Theta|\mathbf{X}=\mathbf{x}}(\theta) = \frac{f_{\mathbf{X}|\Theta=\theta}(\mathbf{x})\, f_{\Theta}(\theta)}{f_{\mathbf{X}}(\mathbf{x})}. \end{equation*}\] The first function \(f_{\mathbf{X}|\Theta=\theta}(\mathbf{x})\) in the numerator is the likelihood function and the second term \(f_{\Theta}(\theta)\) is the prior distribution. The denominator \(f_{\mathbf{X}}(\mathbf{x})\) is the joint density function for \(n\) losses \(\mathbf{x}=(x_1,\ldots,x_n)\).

Now we turn to the Bühlmann model. The Bühlmann credibility estimate for \(\mathrm{E}_{\Theta}(\mu(\theta)|\mathbf{x})\) is \(\hat{\mu}(\theta)=Z\bar{x}+(1-Z)\mu\). This model requires credibility \(Z\) and collective mean \(\mu\) which can be computed from the distributions used in the Bayesian model described above, if the distributions are known.

Example 12.3.1. Using \(n\), conditional density function \(f_{X|\Theta=\theta}(x|\theta)\), and prior distribution \(f_{\Theta}(\theta)\), calculate credibility \(Z\) and collective mean \(\mu\) for the Bühlmann credibility estimate \(\hat{\mu}(\theta)\).


12.3.1 Beta-Binomial Model

Section 9.3.1 of the chapter Bayesian Inference and Modeling analyzes the beta-binomial model.

The number of successes \(x\) in \(m\) Bernoulli trials with unknown probability of success \(q\) is given by the binomial distribution \[\begin{equation*} p_{X|Q=q}(x) = \binom{m}{x} q^x (1-q)^{m-x}, \quad x \in \{0,1,...,m\}. \end{equation*}\] The probability of success \(q\) is modeled with the conjugate prior for the binomial distribution: the beta distribution with parameters \(a\) and \(b\). The pdf of the beta distribution is \[ f_{Q}(q) = \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)} q^{a-1} (1-q)^{b-1}, \quad q \in [0,1]. \] Given \(x\) successes in \(m\) Bernoulli trials the posterior distribution for \(q\) was shown in 9.3.1 to be \[ f_{Q|X=x}(q) = \frac{\Gamma(a+b+m)}{\Gamma(a+x)\Gamma(b+m-x)} q^{a+x-1} (1-q)^{b+m-x-1}, \] which is a beta distribution with parameters \(a+x\) and \(b+m-x\).

The mean for the beta distribution with parameters \(a\) and \(b\) is \(\mathrm{E}(Q)=a/(a+b)\). Given \(x\) successes in \(m\) trials in the beta-binomial model the mean of the posterior distribution is \[\begin{equation*} \mathrm{E}_{Q}(Q|x)=\frac{a+x}{a+b+m}. \end{equation*}\]
The Bühlmann credibility estimate for \(\mathrm{E}_{Q}(Q|x)\) exactly matches the Bayesian estimate as demonstrated in the following example.

Example 12.3.2. The probability that a coin toss will yield heads is \(q\). The prior distribution for probability \(q\) is beta with parameters \(a\) and \(b\). On \(m\) tosses of the coin there were exactly \(x\) heads. Use Bühlmann credibility to estimate the expected value of \(q\).
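The agreement between the two estimates can be verified with exact arithmetic. The sketch below treats each toss as one Bernoulli observation, so the process variance is \(q(1-q)\) and the hypothetical mean is \(q\), and it uses the standard beta variance \(ab/[(a+b)^2(a+b+1)]\). The parameter values \(a=2\), \(b=3\), \(m=10\), \(x=4\) are hypothetical.

```python
from fractions import Fraction


def beta_binomial_check(a, b, m, x):
    """Return (K, Z, Buhlmann estimate, Bayesian posterior mean) for a Beta(a, b) prior."""
    mean_q = Fraction(a, a + b)
    var_q = Fraction(a * b, (a + b) ** 2 * (a + b + 1))
    second_moment_q = var_q + mean_q ** 2

    epv = mean_q - second_moment_q  # E[q(1 - q)] for a single Bernoulli trial
    vhm = var_q                     # Var(q), since E(X | q) = q
    k = epv / vhm

    z = Fraction(m) / (m + k)
    buhlmann = z * Fraction(x, m) + (1 - z) * mean_q
    bayes = Fraction(a + x, a + b + m)
    return k, z, buhlmann, bayes


# Hypothetical prior and data: Beta(2, 3) prior, 4 heads in 10 tosses.
k, z, buhlmann, bayes = beta_binomial_check(a=2, b=3, m=10, x=4)
print(k, z, buhlmann, bayes)
```

In this sketch \(K\) works out to \(a+b\), which is why the Bühlmann estimate reduces to \((a+x)/(a+b+m)\).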


12.3.2 Gamma-Poisson Model

The chapter Bayesian Inference and Modeling also analyzes the gamma-Poisson conjugate family. The results are summarized below.

Let \(\mathbf{X} = (X_1, X_2, ..., X_n)\) be a sample of iid Poisson random variables with \[ p_{X_i|\Lambda=\lambda}(x_i) = \frac{\lambda^{x_i}\, e^{-\lambda}}{x_i!}, \quad x_i \in \{0,1,2,\ldots\} . \] Define the prior distribution for \(\Lambda\) to be gamma with parameters \(\alpha\) and \(\theta\), \[ f_{\Lambda}(\lambda) = \frac{1}{\Gamma(\alpha)\theta^{\alpha}} \lambda^{\alpha-1} \, e^{-\frac{\lambda}{\theta}}, \quad \lambda \in \mathbb{R}_+. \] Given a sample of \(n\) observations \(\mathbf{x} = (x_1, x_2, ..., x_n)\), the posterior distribution of \(\Lambda\) is \[ f_{\Lambda|\mathbf{X}=\mathbf{x}}(\lambda) = \frac{1}{\Gamma(\alpha+x)\left(\frac{\theta}{n\theta+1} \right)^{\alpha+x}} \lambda^{\alpha+x-1} \, e^{-\frac{\lambda\,(n\theta+1)}{\theta}}, \] where \(x = \sum_{i=1}^n x_i\), which is a gamma distribution with parameters \(\alpha+x\) and \(\tfrac{\theta}{n\theta+1}\).

We are going to make a minor change to the formulas above. Instead of a scale parameter \(\theta\), we will substitute a rate parameter \(\beta=1/\theta\). The posterior distribution becomes \[ f_{\Lambda|\mathbf{X}=\mathbf{x}}(\lambda) = \frac{(\beta+n)^{(\alpha+x)}}{\Gamma(\alpha+x)} \lambda^{\alpha+x-1} \, e^{-\lambda(\beta+n)}. \] The posterior distribution is gamma and the expected value for \(\Lambda\) given observations \(\mathbf{x}\) is easy to calculate: \[\begin{equation*} \mathrm{E}_{\Lambda}(\Lambda|x_1,\ldots,x_n) = \frac{\alpha+x}{\beta+n}. \end{equation*}\] Prior to collecting a sample, \(\mathrm{E}(\Lambda)=\alpha/\beta\) using parameters from the prior distribution.

The Bühlmann credibility model will give the same result as seen in the following example.

Example 12.3.3. The number of claims \(X\) each year for a risk has a Poisson distribution \(p(x)=\lambda^{x} e^{-\lambda} /x!\). Each risk in a class has a constant risk parameter \(\lambda\). Parameter \(\lambda\) is gamma distributed across the class with pdf \(f(\lambda) = \beta^{\alpha}\lambda^{\alpha-1}e^{-\lambda\beta}/\Gamma(\alpha)\). A risk was selected at random from the population and observed for \(n\) years. The claim counts were \(\mathbf{x} = (x_1, x_2, ..., x_n)\). Use Bühlmann credibility to calculate the expected value of \(\lambda\) for the selected risk.
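A numerical check of this example is sketched below with exact arithmetic. It uses the standard gamma moments in the rate parameterization, \(\mathrm{E}(\Lambda)=\alpha/\beta\) and \(\mathrm{Var}(\Lambda)=\alpha/\beta^2\), together with the Poisson property that the conditional mean and variance both equal \(\lambda\). The prior parameters and claim counts are hypothetical.

```python
from fractions import Fraction


def gamma_poisson_check(alpha, beta, claims):
    """Return (K, Z, Buhlmann estimate, Bayesian posterior mean) for a gamma(alpha, rate beta) prior."""
    alpha, beta = Fraction(alpha), Fraction(beta)
    n = len(claims)
    total = sum(claims)

    mu = alpha / beta        # collective mean E(lambda)
    epv = alpha / beta       # E[Var(X | lambda)] = E(lambda)
    vhm = alpha / beta ** 2  # Var[E(X | lambda)] = Var(lambda)
    k = epv / vhm

    z = Fraction(n) / (n + k)
    buhlmann = z * Fraction(total, n) + (1 - z) * mu
    bayes = (alpha + total) / (beta + n)
    return k, z, buhlmann, bayes


# Hypothetical inputs: alpha = 3, beta = 2, four years of claim counts.
k, z, buhlmann, bayes = gamma_poisson_check(3, 2, [1, 0, 2, 1])
print(k, z, buhlmann, bayes)
```

Here \(K\) equals the rate parameter \(\beta\), so \(Z=n/(n+\beta)\) and the Bühlmann estimate equals the posterior mean \((\alpha+x)/(\beta+n)\).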


We will leave it to the reader to compare the Bayesian and Bühlmann models for the normal-normal conjugate family.


12.3.3 Exact Credibility

As demonstrated in the prior section, the Bühlmann credibility estimates for the beta-binomial and gamma-Poisson models exactly match the Bayesian analysis results. The term exact credibility, meaning that the Bayesian estimate matches the Bühlmann credibility estimate, is applied in these situations. Exact credibility may occur if the probability distribution for \(X_j\) is in the linear exponential family and the prior distribution is a conjugate prior. Besides these two models, examples of exact credibility also include gamma-exponential and normal-normal models.

If the conditional mean \(\mathrm{E}_{\Theta}(\mu(\theta)|X_1,...,X_n)\) is linear in the mean of the observations, then the Bühlmann credibility estimate will coincide with the Bayesian estimate. More information about exact credibility can be found in (Bühlmann and Gisler 2005), (Klugman, Panjer, and Willmot 2012), and (Tse 2009).

12.4 Bühlmann-Straub Credibility


In this section, you learn how to:

  • Compute a credibility-weighted estimate for the expected loss for a risk or group of risks using the Bühlmann-Straub model.
  • Determine the credibility \(Z\) assigned to observations.
  • Calculate required values including the Expected Value of the Process Variance (\(EPV\)), Variance of the Hypothetical Means (\(VHM\)) and collective mean \(\mu\).
  • Recognize situations when the Bühlmann-Straub model is appropriate.

With standard Bühlmann credibility as described in the prior section, losses \(X_1,\ldots,X_n\) arising from a selected policyholder are assumed to be iid. If the subscripts indicate year 1, year 2 and so on up to year \(n\), then the iid assumption means that the policyholder has the same exposure to loss every year. For commercial insurance this assumption is frequently violated.

Consider a commercial policyholder that uses a fleet of vehicles in its business. In year 1 there are \(m_1\) vehicles in the fleet, \(m_2\) vehicles in year 2, \(\ldots\), and \(m_n\) vehicles in year \(n\). The exposure to loss from ownership and use of this fleet is not constant from year to year. The annual losses for the fleet are not iid.

Define \(Y_{jk}\) to be the loss for the \(k^{th}\) vehicle in the fleet for year \(j\). Then, the total losses for the fleet in year \(j\) are \(Y_{j1}+\cdots+Y_{jm_j}\) where we are adding up the losses for each of the \(m_j\) vehicles. In the Bühlmann-Straub model it is assumed that random variables \(Y_{jk}\) are iid across all vehicles and years for the policyholder. With this assumption the means \(\mathrm{E}_{Y}(Y_{jk}|\theta)=\mu(\theta)\) and variances \(\mathrm{Var}_{Y}(Y_{jk}|\theta)=\sigma^2(\theta)\) are the same for all vehicles and years. The quantity \(\mu(\theta)\) is the expected loss and \(\sigma^2(\theta)\) is the variance in the loss for one year for one vehicle for a policyholder with risk parameter \(\theta\).

If \(X_j\) is the average loss per unit of exposure in year \(j\), \(X_j=(Y_{j1}+\cdots+Y_{jm_j})/m_j\), then \(\mathrm{E}_{Y}(X_j|\theta)=\mu(\theta)\) and \(\mathrm{Var}_{Y}(X_j|\theta)=\sigma^2(\theta)/m_j\) for a policyholder with risk parameter \(\theta\). Note that we used the fact that the \(Y_{jk}\) are iid for a given policyholder. The average loss per vehicle for the entire \(n\)-year period is \[\begin{equation*} \bar{X}= \frac{1}{m} \sum_{j=1}^{n} m_j X_{j} \quad , \quad m=\sum_{j=1}^{n} m_j. \end{equation*}\] It follows that \(\mathrm{E}_{Y}(\bar{X}|\theta)=\mu(\theta)\) and \(\mathrm{Var}_{Y}(\bar{X}|\theta)=\sigma^2(\theta)/m\) where \(\mu(\theta)\) and \(\sigma^2(\theta)\) are the mean and variance for a single vehicle for one year for the policyholder.

Example 12.4.1. Prove that \(\mathrm{Var}_{Y}(\bar{X}|\theta)=\sigma^2(\theta)/m\) for a risk with risk parameter \(\theta\).


The Bühlmann-Straub credibility estimate, from an extension of the Bühlmann credibility model that allows for varying exposure by year, is: \[\begin{equation}\hat{\mu}(\theta)=Z\bar{x}+(1-Z)\mu \tag{12.3} \end{equation}\] with \[\begin{eqnarray*} \theta&=&\textrm{a risk parameter that identifies a policyholder's risk level}\\ \hat{\mu}(\theta)&=&\textrm{estimated expected loss for one exposure for the policyholder}\\ & & \textrm{with loss experience } \bar{x}\\ \bar{x}&=& \frac{1}{m} \sum_{j=1}^{n} m_j x_j \textrm{ is the average loss per exposure for $m$ exposures.}\\ & & \textrm{$x_j$ is the average loss per exposure and $m_j$ is the number of exposures in year $j$.} \\ Z&=&\textrm{credibility assigned to $m$ exposures } \\ \mu&=&\textrm{expected loss for one exposure for randomly chosen}\\ & & \textrm{ policyholder from population.}\\ \end{eqnarray*}\]

Note that \(\hat{\mu}(\theta)\) is the estimator for the expected loss for one exposure. If the policyholder has \(m_j\) exposures then the expected loss is \(m_j\hat{\mu}(\theta)\).

In Example 12.2.4, it was shown that \(Z=\mathrm{Var}_{\Theta}(\mathrm{E}_{X}(\bar{X}|\theta))/\mathrm{Var}(\bar{X})\) where \(\bar{X}\) is the average loss for \(n\) observations. In equation (12.3) the \(\bar{X}\) is the average loss for \(m\) exposures and the same \(Z\) formula can be used: \[ Z=\frac{\mathrm{Var}_{\Theta}(\mathrm{E}_{Y}(\bar{X}|\theta))}{\mathrm{Var}(\bar{X})}= \frac{\mathrm{Var}_{\Theta}(\mathrm{E}_{Y}(\bar{X}|\theta))}{\mathrm{E}_{\Theta}(\mathrm{Var}_{Y}(\bar{X}|\theta))+\mathrm{Var}_{\Theta}(\mathrm{E}_{Y}(\bar{X}|\theta))}. \] (Note that \(X_{j}\) is an average of the \(Y_{jk}\)’s for year \(j\) and \(\bar{X}\) is an average of all the \(Y_{jk}\)’s.) The denominator was expanded using the law of total variance. As noted above \(\mathrm{E}_{Y}(\bar{X}|\theta)=\mu(\theta)\) so \(\mathrm{Var}_{\Theta}(\mathrm{E}_{Y}(\bar{X}|\theta))=\mathrm{Var}_{\Theta}(\mu(\theta))=VHM\). Because \(\mathrm{Var}_{Y}(\bar{X}|\theta)=\sigma^2(\theta)/m\) it follows that \(\mathrm{E}_{\Theta}(\mathrm{Var}_{Y}(\bar{X}|\theta))=\mathrm{E}_{\Theta}(\sigma^2(\theta))/m\) = \(EPV/m\). Making these substitutions and using a little algebra gives \[\begin{equation} Z=\frac{m}{m+K} \quad , \quad K =\frac{EPV}{VHM}. \tag{12.4} \end{equation}\] This is the same \(Z\) as for Bühlmann credibility except number of exposures \(m\) replaces number of years or observations \(n\).

Example 12.4.2. A commercial automobile policyholder had the following exposures and claims over a three-year period: \[ \small{ \begin{array}{|c|c|c|} \hline \text{Year} & \text{Number of Vehicles} & \text{Number of Claims} \\ \hline 1 & 9 & 5 \\ 2 & 12 & 4 \\ 3 & 15 & 4 \\ \hline \end{array} } \]

  • The number of claims in a year for each vehicle in the policyholder’s fleet is Poisson distributed with the same mean (parameter) \(\lambda\).
  • Parameter \(\lambda\) is distributed among the policyholders in the population with pdf \(f(\lambda)=6\lambda(1-\lambda)\) with \(0<\lambda<1\).

The policyholder has 18 vehicles in its fleet in year 4. Use Bühlmann-Straub credibility to estimate the expected number of policyholder claims in year 4.
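The Bühlmann-Straub calculation for this example can be sketched with exact arithmetic. The sketch relies on the observation that \(f(\lambda)=6\lambda(1-\lambda)\) on \((0,1)\) is the beta density with parameters \(a=2\) and \(b=2\), so the standard beta mean and variance formulas apply, and on the Poisson property that the conditional mean and variance per vehicle both equal \(\lambda\). The variable names are illustrative.

```python
from fractions import Fraction

vehicles = [9, 12, 15]  # exposures m_j in years 1 to 3
claims = [5, 4, 4]      # claim counts in years 1 to 3

# f(lambda) = 6 * lambda * (1 - lambda) is the Beta(2, 2) density.
a, b = 2, 2
prior_mean = Fraction(a, a + b)
prior_var = Fraction(a * b, (a + b) ** 2 * (a + b + 1))

mu = prior_mean   # collective mean claims per vehicle
epv = prior_mean  # E[Var(claims per vehicle | lambda)] = E(lambda)
vhm = prior_var   # Var[E(claims per vehicle | lambda)] = Var(lambda)
k = epv / vhm

m = sum(vehicles)                 # total exposures
x_bar = Fraction(sum(claims), m)  # claims per vehicle-year
z = Fraction(m) / (m + k)         # formula (12.4)

per_vehicle = z * x_bar + (1 - z) * mu
year4_claims = 18 * per_vehicle

print(k, z, per_vehicle, float(year4_claims))
```

Under these assumptions \(K=10\), \(Z=36/46\), the estimated frequency is \(9/23\) claims per vehicle, and the year 4 estimate for 18 vehicles is \(162/23\), or about 7.04 claims.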


12.5 Estimating Credibility Parameters


In this section, you learn how to:

  • Perform nonparametric estimation with the Bühlmann and Bühlmann-Straub credibility models.
  • Identify situations when semiparametric estimation is appropriate.
  • Use data to approximate the \(EPV\) and \(VHM\).

The examples in this chapter have provided assumptions for calculating credibility parameters. In actual practice the actuary must use real world data and judgment to determine credibility parameters.

12.5.1 Nonparametric Estimation for Bühlmann and Bühlmann-Straub Models

Bayesian analysis as described previously requires assumptions about a prior distribution and likelihood. It is possible to produce estimates without these assumptions and these methods are often referred to as empirical Bayes methods: credibility methods that estimate the credibility weight without any assumptions about prior distributions or likelihoods, relying only on empirical data. Bühlmann and Bühlmann-Straub credibility with parameters estimated from the data are included in the category of empirical Bayes methods.

Bühlmann Model. First we will address the simpler Bühlmann model. Assume that there are \(r\) risks in a population. For risk \(i\) with risk parameter \(\theta_i\) the losses for \(n\) periods are \(X_{i1},\ldots, X_{in}\). The losses for a given risk are iid across periods as assumed in the Bühlmann model. For risk \(i\) the sample mean is \(\bar{X}_i=\sum_{j=1}^{n}X_{ij}/n\) and the unbiased sample process variance is \(s_i^2=\sum_{j=1}^{n}(X_{ij}-\bar{X}_i)^2/(n-1)\). An unbiased estimator for the \(EPV\) can be calculated by taking the average of \(s_i^2\) for the \(r\) risks in the population: \[\begin{equation} \widehat{EPV}=\frac{1}{r}\sum_{i=1}^{r} s_i^2 = \frac{1}{r(n-1)} \sum_{i=1}^{r} \sum_{j=1}^{n}(X_{ij}-\bar{X}_i)^2 . \tag{12.5} \end{equation}\] The individual risk means \(\bar{X}_i\) for \(i=1,\ldots, r\) can be used to estimate the \(VHM\). An unbiased estimator of \(\mathrm{Var}(\bar{X}_i)\) is \[\begin{equation*} \widehat{\mathrm{Var}}(\bar{X}_i)=\frac{1}{r-1} \sum_{i=1}^{r}(\bar{X}_i-\bar{X})^2 \textrm{ and } \bar{X}=\frac{1}{r}\sum_{i=1}^{r} \bar{X}_i, \end{equation*}\] but \(\mathrm{Var}(\bar{X}_i)\) is not the \(VHM\). The total variance formula or unconditional variance formula is \[ \mathrm{Var}(\bar{X}_i)=\mathrm{E}_{\Theta}(\mathrm{Var}_{X}(\bar{X}_i|\theta_i))+\mathrm{Var}_{\Theta}(\mathrm{E}_{X}(\bar{X}_i|\theta_i)). \] The \(VHM\) is the second term on the right because \(\mu(\theta_i)=\mathrm{E}_{X}(\bar{X}_i|\theta_i)\) is the hypothetical mean for risk \(i\). So, \[\begin{equation*} VHM= \mathrm{Var}(\bar{X}_i) - \mathrm{E}_{\Theta}(\mathrm{Var}_{X}(\bar{X}_i|\theta_i)). \end{equation*}\] As discussed previously in Section 12.2.2, \(EPV/n\) = \(\mathrm{E}_{\Theta}(\mathrm{Var}_{X}(\bar{X}_i|\theta_i))\) and using the above estimators gives an estimator for the \(VHM\): \[\begin{equation} \widehat{VHM} = \frac{1}{r-1} \sum_{i=1}^{r}(\bar{X}_i-\bar{X})^2 - \frac{\widehat{EPV}}{n} . \tag{12.6} \end{equation}\] Although the expected loss for a risk with parameter \(\theta_i\) is \(\mu(\theta_i)=\mathrm{E}_{X}(\bar{X}_i|\theta_i)\), the variance of the sample mean \(\bar{X}_i\) is greater than or equal to the variance of the hypothetical means: \(\mathrm{Var}(\bar{X}_i)\geq\mathrm{Var}(\mu(\theta_i))\). The variance in the sample means \(\mathrm{Var}(\bar{X}_i)\) includes both the variance in the hypothetical means and a process variance term.

In some cases formula (12.6) can produce a negative value for \(\widehat{VHM}\) because of the subtraction of \(\widehat{EPV}/n\), but a variance cannot be negative. This happens when the process variance within risks is so large that it overwhelms the measurement of the variance in means between risks. In this case we cannot use this method to determine the values needed for Bühlmann credibility.

Example 12.5.1. Two policyholders had claims over a three-year period as shown in the table below. Estimate the expected number of claims for each policyholder using Bühlmann credibility and calculating necessary parameters from the data. \[ \small{ \begin{array}{|c|c|c|} \hline \text{Year} & \text{Risk A} & \text{Risk B} \\ \hline 1 & 0 & 2 \\ 2 & 1 & 1 \\ 3 & 0 & 2 \\ \hline \end{array} } \]
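
The estimators (12.5) and (12.6) can be applied to this table with a few lines of Python. The sketch below is illustrative only; the function name `buhlmann_nonparametric` and its return layout are assumptions made for this example, and it presumes every risk has the same number of periods.

```python
from fractions import Fraction

def buhlmann_nonparametric(losses):
    """Return (EPV, VHM, overall mean, risk means) for equal-length loss histories."""
    r, n = len(losses), len(losses[0])
    data = [[Fraction(x) for x in row] for row in losses]
    risk_means = [sum(row) / n for row in data]
    overall = sum(risk_means) / r
    # (12.5): average of the unbiased within-risk sample variances
    epv = sum((x - xb) ** 2
              for row, xb in zip(data, risk_means)
              for x in row) / (r * (n - 1))
    # (12.6): between-risk variance of the sample means less EPV/n
    vhm = sum((xb - overall) ** 2 for xb in risk_means) / (r - 1) - epv / n
    return epv, vhm, overall, risk_means

claims = [[0, 1, 0], [2, 1, 2]]          # Risk A, Risk B
epv, vhm, overall, risk_means = buhlmann_nonparametric(claims)
if vhm <= 0:
    raise ValueError("VHM estimate is not positive; the method cannot be used")

n = len(claims[0])
K = epv / vhm
Z = n / (n + K)
estimates = [Z * xb + (1 - Z) * overall for xb in risk_means]
print(f"EPV = {epv}, VHM = {vhm}, Z = {Z}, estimates = {estimates}")
```

The explicit check on `vhm` reflects the caveat above: when the estimate is not positive, the credibility weight cannot be computed this way.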


Example 12.5.2. Two policyholders had claims over a three-year period as shown in the table below. Calculate the nonparametric estimate for the \(VHM\).

\[ \small{ \begin{array}{|c|c|c|} \hline \text{Year} & \text{Risk A} & \text{Risk B} \\ \hline 1 & 3 & 3 \\ 2 & 0 & 0 \\ 3 & 0 & 3 \\ \hline \end{array} } \]
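
A quick numerical check of formula (12.6) for this table, using only Python's standard `statistics` module, is sketched below. The variable names are illustrative. The data are chosen so that the within-risk variation is large relative to the difference between the two risk means.

```python
from statistics import mean, variance

risk_a = [3, 0, 0]
risk_b = [3, 0, 3]
n = len(risk_a)  # number of periods per risk

# EPV estimate: average of the unbiased sample variances of each risk
epv_hat = mean([variance(risk_a), variance(risk_b)])

# Unbiased sample variance of the two risk means
var_of_means = variance([mean(risk_a), mean(risk_b)])

# Formula (12.6)
vhm_hat = var_of_means - epv_hat / n
print(f"EPV estimate = {epv_hat}, VHM estimate = {vhm_hat}")
```

The resulting \(\widehat{VHM}\) is negative, which is the situation described in the paragraph preceding Example 12.5.1.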


Bühlmann-Straub Model. Empirical formulas for \(EPV\) and \(VHM\) in the Bühlmann-Straub model are more complicated because a risk’s number of exposures can change from one period to another. Also, the number of experience periods does not have to be constant across the population. First some definitions:

  • \(X_{ij}\) is the loss per exposure for risk \(i\) in period \(j\). Losses can refer to number of claims or amount of loss. There are \(r\) risks so \(i=1,\ldots,r\).
  • \(n_i\) is the number of observation periods for risk \(i\).
  • \(m_{ij}\) is the number of exposures for risk \(i\) in period \(j\) for \(j=1,\ldots,n_i\).

Risk \(i\) with risk parameter \(\theta_i\) has \(m_{ij}\) exposures in period \(j\) which means that the losses per exposure random variable can be written as \(X_{ij}=(Y_{i1}+\cdots+Y_{im_{ij}})/m_{ij}\). Random variable \(Y_{ik}\) is the loss for one exposure. For risk \(i\) losses \(Y_{ik}\) are iid with mean \(\mathrm{E}_{Y}(Y_{ik}|\theta_{i})=\mu(\theta_i)\) and process variance \(\mathrm{Var}_{Y}(Y_{ik}|\theta_{i}) =\sigma^2(\theta_{i})\). It follows that \(\mathrm{Var}_{Y}(X_{ij}|\theta_{i})\) = \(\sigma^2(\theta_i)/m_{ij}\).

Two more important definitions are:

  • \(\bar{X}_i=\frac{1}{m_i}\sum_{j=1}^{n_i} m_{ij}X_{ij}\) with \(m_i = \sum_{j=1}^{n_i} m_{ij}\). \(\bar{X}_i\) is the average loss per exposure for risk \(i\) for all observation periods combined.
  • \(\bar{X}=\frac{1}{m}\sum_{i=1}^{r} m_i \bar{X}_i\) with \(m=\sum_{i=1}^r m_i\). \(\bar{X}\) is the average loss per exposure for all risks for all observation periods combined.

An unbiased estimator for the process variance \(\sigma^2(\theta_i)\) of one exposure for risk \(i\) is \[\begin{equation*} {s_i}^2=\frac{\sum_{j=1}^{n_i} m_{ij}(X_{ij}-\bar{X}_i)^2}{n_i-1}. \end{equation*}\] The weights \(m_{ij}\) are applied to the squared differences because the \(X_{ij}\) are the averages of \(m_{ij}\) exposures. The weighted average of the sample variances \({s_i}^2\) for each risk \(i\) in the population, with weights proportional to \((n_i-1)\), the number of observation periods less one, produces the expected value of the process variance (\(EPV\)) estimate \[\begin{equation*} \widehat{EPV}=\frac{\sum_{i=1}^r (n_i-1){s_i}^2}{\sum_{i=1}^r (n_i-1)}=\frac{\sum_{i=1}^r \sum_{j=1}^{n_i} m_{ij}(X_{ij}-\bar{X}_i)^2}{\sum_{i=1}^r (n_i-1)}. \end{equation*}\] The quantity \(\widehat{EPV}\) is an unbiased estimator for the expected value of the process variance of one exposure for a risk chosen at random from the population.

To calculate an estimator for the variance in the hypothetical means (\(VHM\)) the squared differences of the individual risk sample means \(\bar{X}_i\) and population mean \(\bar{X}\) are used. An unbiased estimator for the \(VHM\) is \[\begin{equation*} \widehat{VHM}=\frac{\sum_{i=1}^r m_i(\bar{X}_i-\bar{X})^2 - (r-1)\widehat{EPV}}{m-\frac{1}{m}\sum_{i=1}^r m_i^2}. \end{equation*}\] This complicated formula is necessary because of the varying number of exposures. Proofs that the \(EPV\) and \(VHM\) estimators shown above are unbiased can be found in several references mentioned at the end of this chapter including (Bühlmann and Gisler 2005), (Klugman, Panjer, and Willmot 2012), and (Tse 2009).

Example 12.5.3. Two policyholders had claims shown in the table below. Estimate the expected number of claims per vehicle for each policyholder using Bühlmann-Straub credibility and calculating parameters from the data.

\[ \small{ \begin{array}{|c|c|c|c|c|c|} \hline \text{Policyholder} & & \text{Year 1} & \text{Year 2} & \text{Year 3} & \text{Year 4} \\ \hline \text{A} & \text{Number of claims} & 0 & 2 & 2 & 3 \\ \hline \text{A} & \text{Insured vehicles} & 1 & 2 & 2 & 2\\ \hline & & & & & \\ \hline \text{B} & \text{Number of claims} & 0 & 0 & 1 & 2\\ \hline \text{B} & \text{Insured vehicles} & 0 & 2 & 3 & 4\\ \hline \end{array} } \]
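
The Bühlmann-Straub estimators defined above can be sketched in Python as follows. This is an illustration, not a reference implementation: the function name and argument layout are assumptions, and the sketch drops any period with zero exposure (policyholder B in Year 1), so that \(n_B=3\).

```python
from fractions import Fraction

def buhlmann_straub_nonparametric(losses, exposures):
    """losses[i][j] and exposures[i][j] are totals for risk i in period j.
    Returns (EPV, VHM, overall mean, risk means, risk exposures)."""
    risks = []
    for loss_row, exp_row in zip(losses, exposures):
        # Keep (m_ij, X_ij) pairs only for periods with exposure.
        risks.append([(Fraction(e), Fraction(s, e))
                      for s, e in zip(loss_row, exp_row) if e > 0])
    m_i = [sum(e for e, _ in periods) for periods in risks]
    xbar_i = [sum(e * x for e, x in periods) / mi
              for periods, mi in zip(risks, m_i)]
    m = sum(m_i)
    xbar = sum(mi * xb for mi, xb in zip(m_i, xbar_i)) / m
    r = len(risks)
    epv = (sum(e * (x - xb) ** 2
               for periods, xb in zip(risks, xbar_i)
               for e, x in periods)
           / sum(len(periods) - 1 for periods in risks))
    vhm = ((sum(mi * (xb - xbar) ** 2 for mi, xb in zip(m_i, xbar_i))
            - (r - 1) * epv)
           / (m - sum(mi ** 2 for mi in m_i) / m))
    return epv, vhm, xbar, xbar_i, m_i

claims = [[0, 2, 2, 3], [0, 0, 1, 2]]      # policyholders A and B
vehicles = [[1, 2, 2, 2], [0, 2, 3, 4]]
epv, vhm, xbar, xbar_i, m_i = buhlmann_straub_nonparametric(claims, vehicles)

K = epv / vhm
Z = [mi / (mi + K) for mi in m_i]
estimates = [z * xb + (1 - z) * xbar for z, xb in zip(Z, xbar_i)]
for name, z, est in zip("AB", Z, estimates):
    print(f"{name}: Z = {float(z):.4f}, claims per vehicle = {float(est):.4f}")
```

Here the complement of credibility is applied to the exposure-weighted overall mean \(\bar{X}\), as in the definitions above.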


12.5.2 Semiparametric Estimation for Bühlmann and Bühlmann-Straub Models

In the prior section on nonparametric estimation, a statistical approach that fits the data without an assumed distribution, constraints, or parameters, there were no assumptions about the distribution of the losses per exposure \(X_{ij}\). Assuming that the \(X_{ij}\) have a particular distribution and using properties of the distribution along with the data to determine credibility parameters is referred to as semiparametric estimation: a credibility method that assumes a distribution for the loss per exposure random variable and otherwise uses empirical data.

An example of semiparametric estimation would be the assumption of a Poisson distribution when estimating claim frequencies. The Poisson distribution has the property that the mean and variance are identical and this property can simplify calculations. The following simple example comes from the prior section but now includes a Poisson assumption about claim frequencies.

Example 12.5.4. Two policyholders had claims over a three-year period as shown in the table below. Assume that the number of claims for each risk has a Poisson distribution. Estimate the expected number of claims for each policyholder using Bühlmann credibility and calculating necessary parameters from the data. \[ \small{ \begin{array}{|c|c|c|} \hline \text{Year} & \text{Risk A} & \text{Risk B} \\ \hline 1 & 0 & 2 \\ 2 & 1 & 1 \\ 3 & 0 & 2 \\ \hline \end{array} } \]
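
Under the Poisson assumption the mean equals the variance, so the \(EPV\) can be estimated by the overall sample mean instead of by the within-risk sample variances. The sketch below illustrates that calculation for this table; the variable names are illustrative and the \(VHM\) estimate still follows formula (12.6).

```python
from fractions import Fraction

claims = {"A": [0, 1, 0], "B": [2, 1, 2]}
n = 3                      # periods per risk
r = len(claims)            # number of risks

risk_means = {k: Fraction(sum(v), n) for k, v in claims.items()}
overall = sum(risk_means.values()) / r

# Semiparametric step: Poisson mean = variance, so EPV is estimated by the mean.
epv = overall

# Formula (12.6) with the Poisson-based EPV estimate.
var_of_means = sum((xb - overall) ** 2 for xb in risk_means.values()) / (r - 1)
vhm = var_of_means - epv / n

K = epv / vhm
Z = n / (n + K)
estimates = {k: Z * xb + (1 - Z) * overall for k, xb in risk_means.items()}
print(f"EPV = {epv}, VHM = {vhm}, Z = {Z}, estimates = {estimates}")
```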


Although we assumed that the number of claims for each risk was Poisson distributed in the prior example, we did not need this additional assumption because there was enough information to use nonparametric estimation. In fact, the Poisson assumption might not be appropriate because for risk B the sample mean is not equal to the sample variance: \(\bar{x}_B=\frac{5}{3}\neq s_B^2=\frac{1}{3}\).

The following example is commonly used to demonstrate a situation where semiparametric estimation is needed. There is insufficient information for nonparametric estimation but with the Poisson assumption, estimates can be calculated.

Example 12.5.5. A portfolio of 2,000 policyholders generated the following claims profile during a five-year period: \[ \small{ \begin{array}{|c|c|} \hline \text{Number of Claims} & \\ \text{In 5 Years} & \text{Number of policies}\\ \hline 0 & 923 \\ 1 & 682 \\ 2 & 249 \\ 3 & 70 \\ 4 & 51 \\ 5 & 25 \\ \hline \end{array} } \]

In your model you assume that the number of claims for each policyholder has a Poisson distribution and that a policyholder’s expected number of claims is constant through time. Use Bühlmann credibility to estimate the annual expected number of claims for policyholders with 3 claims during the five-year period.
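
A sketch of the semiparametric calculation for this portfolio is given below. It works on an annual basis, treating each policyholder's five-year count divided by 5 as the sample mean \(\bar{X}_i\), and it uses the Poisson property to set the \(EPV\) estimate equal to the overall annual mean. Variable names are illustrative.

```python
policies_by_count = {0: 923, 1: 682, 2: 249, 3: 70, 4: 51, 5: 25}
years = 5

r = sum(policies_by_count.values())                       # number of policyholders
total_claims = sum(k * c for k, c in policies_by_count.items())

mean_5yr = total_claims / r          # mean five-year claim count per policy
annual_mean = mean_5yr / years       # overall annual claim frequency

# Poisson assumption: expected process variance equals the mean.
epv = annual_mean

# Unbiased sample variance of the five-year counts, rescaled to annual means.
var_5yr = sum(c * (k - mean_5yr) ** 2
              for k, c in policies_by_count.items()) / (r - 1)
var_annual_means = var_5yr / years ** 2

# Formula (12.6) with n = 5 years of observations per policyholder.
vhm = var_annual_means - epv / years

K = epv / vhm
Z = years / (years + K)

observed_annual = 3 / years          # policyholder with 3 claims in 5 years
estimate = Z * observed_annual + (1 - Z) * annual_mean
print(f"K = {K:.3f}, Z = {Z:.4f}, annual estimate = {estimate:.4f}")
```

Nonparametric estimation of the \(EPV\) is not possible here because the table does not show how each policyholder's claims were spread across the five years.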


12.6 Limited Fluctuation Credibility


In this section, you learn how to:

  • Calculate full credibility standards for number of claims, average size of claims, and aggregate losses.
  • Learn how the relationship between means and variances of underlying distributions affects full credibility standards.
  • Determine credibility-weight \(Z\) using the square-root partial credibility formula.

Limited fluctuation credibility, also called “classical credibility” and “American credibility,” was given this name because the method explicitly attempts to limit fluctuations in estimates for claim frequencies, severities, or losses. For example, suppose that you want to estimate the expected number of claims \(N\) for a group of risks in an insurance rating class. How many risks are needed in the class to ensure that a specified level of accuracy is attained in the estimate? First the question will be considered from the perspective of how many claims are needed.

12.6.1 Full Credibility for Claim Frequency

Let \(N\) be a random variable representing the number of claims for a group of risks, for example, risks within a particular rating classification. The observed number of claims will be used to estimate \(\mu_N=\mathrm{E}[N]\), the expected number of claims. How big does \(\mu_N\) need to be to get a good estimate? One way to quantify the accuracy of the estimate would be with a statement like: “The observed value of \(N\) should be within 5\(\%\) of \(\mu_N\) at least 90\(\%\) of the time.” Writing this as a mathematical expression would give \(\Pr[0.95 \mu_N \leq N \leq 1.05 \mu_N] \geq 0.90\). Generalizing this statement by letting the range parameter \(k\) replace 5\(\%\) and probability level \(p\) replace 0.90 gives the equation \[\begin{equation} \Pr[(1-k) \mu_N \leq N \leq (1+k) \mu_N] \geq p . \tag{12.7} \end{equation}\] The expected number of claims required for the probability on the left-hand side of (12.7) to equal \(p\) is called the full credibility standard, the threshold of experience necessary to assign 100% credibility to the insured’s own experience.

If the expected number of claims is greater than or equal to the full credibility standard then full credibility can be assigned to the data so \(Z=1\). Usually the expected value \(\mu_N\) is not known so full credibility will be assigned to the data if the actual observed number of claims \(n\) is greater than or equal to the full credibility standard. The \(k\) and \(p\) values must be selected and the actuary may rely on experience, judgment, and other factors in making the choices.

Subtracting \(\mu_N\) from each term in (12.7) and dividing by the standard deviation \(\sigma_N\) of \(N\) gives \[\begin{equation} \Pr\left[\frac{-k\mu_N}{\sigma_N}\leq \frac{N-\mu_N}{\sigma_N} \leq \frac{k\mu_N}{\sigma_N}\right] \geq p. \tag{12.8} \end{equation}\] In limited fluctuation credibility the standard normal distribution is used to approximate the distribution of \((N-\mu_N)/\sigma_N\). If \(N\) is the sum of many claims from a large group of similar risks and the claims are independent, then the approximation may be reasonable.

Let \(y_p\) be the value such that \[ \Pr\left[-y_p\leq \frac{N-\mu_N}{\sigma_N} \leq y_p\right]=\Phi(y_p)-\Phi(-y_p)=p \] where \(\Phi( )\) is the cumulative distribution function of the standard normal distribution, that is, the normal distribution with mean 0 and standard deviation 1. Because \(\Phi(-y_p)=1-\Phi(y_p)\), the equality can be rewritten as \(2\Phi(y_p)-1=p\). Solving for \(y_p\) gives \(y_p=\Phi^{-1}((p+1)/2)\) where \(\Phi^{-1}( )\) is the inverse of \(\Phi( )\).

Equation (12.8) will be satisfied if \(k\mu_N/\sigma_N \geq y_p\) assuming the normal approximation. First we will consider this inequality for the case when \(N\) has a Poisson distribution: \(\Pr[N=n] = \lambda^n\textrm{e}^{-\lambda}/n!\). Because \(\lambda=\mu_N=\sigma_N^2\) for the Poisson, taking square roots yields \(\mu_N^{1/2}=\sigma_N\). So, \(k\mu_N/\mu_N^{1/2} \geq y_p\) which is equivalent to \(\mu_N \geq (y_p/k)^2\). Let’s define \(\lambda_{kp}\) to be the value of \(\mu_N\) for which equality holds. Then the full credibility standard for the Poisson distribution is \[\begin{equation} \lambda_{kp} = \left(\frac{y_p}{k}\right)^2 \textrm{with } y_p=\Phi^{-1}((p+1)/2). \tag{12.9} \end{equation}\] If the expected number of claims \(\mu_N\) is greater than or equal to \(\lambda_{kp}\) then equation (12.7) is assumed to hold and full credibility can be assigned to the data. As noted previously, because \(\mu_N\) is usually unknown, full credibility is given if the observed number of claims \(n\) satisfies \(n \geq \lambda_{kp}.\)

Example 12.6.1. The full credibility standard is set so that the observed number of claims is to be within 5% of the expected value with probability \(p=0.95\). If the number of claims has a Poisson distribution find the number of claims needed for full credibility.
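
Formula (12.9) is easy to evaluate with the standard library. The sketch below uses `statistics.NormalDist().inv_cdf` for \(\Phi^{-1}\); the function name `poisson_full_credibility` is an illustrative choice.

```python
from statistics import NormalDist

def poisson_full_credibility(k, p):
    """Full credibility standard (12.9) for a Poisson claim count."""
    y_p = NormalDist().inv_cdf((p + 1) / 2)   # two-sided normal quantile
    return (y_p / k) ** 2

standard = poisson_full_credibility(k=0.05, p=0.95)
print(f"lambda_kp = {standard:.2f}")
```

With the commonly tabled value \(y_p=1.960\) the result is \((1.960/0.05)^2=1536.64\); the unrounded quantile gives a value a few hundredths lower. Either way the standard would usually be rounded up to a whole number of claims.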


If claims are not Poisson distributed then equation (12.8) does not imply (12.9). Setting the upper bound of \((N-\mu_N)/\sigma_N\) in (12.8) equal to \(y_p\) gives \(k\mu_N/\sigma_N=y_p\). Squaring both sides and moving everything to the right side except for one of the \(\mu_N\)’s gives \(\mu_N=(y_p/k)^2(\sigma_N^2/\mu_N)\). This is the full credibility standard for frequency and will be denoted by \(n_f\), \[\begin{equation} n_f=\left(\frac{y_p}{k}\right)^2\left(\frac{\sigma_N^2}{\mu_N}\right)=\lambda_{kp}\left(\frac{\sigma_N^2}{\mu_N}\right). \tag{12.10} \end{equation}\] This is the same equation as the Poisson full credibility standard except for the \((\sigma_N^2/\mu_N)\) multiplier. When the claims distribution is Poisson this extra term is one because the variance equals the mean.

Example 12.6.2. The full credibility standard is set so that the observed number of claims is to be within 5\(\%\) of the expected value with probability \(p=0.95\). The number of claims has a negative binomial distribution, \[ \Pr(N=x)={x+r-1\choose x} \left(\frac{1}{1+\beta}\right)^r \left(\frac{\beta}{1+\beta}\right)^x , \] with \(\beta=1\). Calculate the full credibility standard.


We see that the negative binomial distribution with \((\sigma_N^2/\mu_N)>1\) requires more claims for full credibility than a Poisson distribution for the same \(k\) and \(p\) values. The next example shows that a binomial distribution which has \((\sigma_N^2/\mu_N)<1\) will need fewer claims for full credibility.
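
The effect of the variance-to-mean multiplier in (12.10) can be seen side by side in the sketch below. It relies on the standard moments of these distributions in the parametrizations used here: the negative binomial has mean \(r\beta\) and variance \(r\beta(1+\beta)\), so the ratio is \(1+\beta\); the binomial has mean \(mq\) and variance \(mq(1-q)\), so the ratio is \(1-q\). The values \(\beta=1\) and \(q=1/4\) are those of the surrounding examples, and the function name is illustrative.

```python
from statistics import NormalDist

def full_credibility_frequency(k, p, variance_to_mean):
    """Full credibility standard (12.10) for claim frequency."""
    y_p = NormalDist().inv_cdf((p + 1) / 2)
    return (y_p / k) ** 2 * variance_to_mean

beta, q = 1.0, 0.25
ratios = {
    "Poisson": 1.0,
    "negative binomial": 1 + beta,   # variance / mean = 1 + beta
    "binomial": 1 - q,               # variance / mean = 1 - q
}
standards = {name: full_credibility_frequency(0.05, 0.95, ratio)
             for name, ratio in ratios.items()}
for name, n_f in standards.items():
    print(f"{name}: n_f = {n_f:.1f}")
```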

Example 12.6.3. The full credibility standard is set so that the observed number of claims is to be within 5\(\%\) of the expected value with probability \(p=0.95\). The number of claims has a binomial distribution \[ \Pr(N=x)={m\choose x}q^x(1-q)^{m-x}. \] Calculate the full credibility standard for \(q=1/4\).


Rather than using expected number of claims to define the full credibility standard, the number of exposures can be used for the full credibility standard. An exposure is a measure of risk. For example, one car insured for a full year would be one car-year. Two cars each insured for exactly one-half year would also result in one car-year. Car-years attempt to quantify exposure to loss. Two car-years would be expected to generate twice as many claims as one car-year if the vehicles have the same risk of loss. To translate a full credibility standard denominated in terms of number of claims to a full credibility standard denominated in exposures one needs a reasonable estimate of the expected number of claims per exposure.

Example 12.6.4. The full credibility standard should be selected so that the observed number of claims will be within 5\(\%\) of the expected value with probability \(p=0.95\). The number of claims has a Poisson distribution. If one exposure is expected to have about 0.20 claims per year, find the number of exposures needed for full credibility.
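
As a worked step, the conversion from claims to exposures divides the Poisson standard (12.9) by the expected claims per exposure. Using the rounded quantile \(y_p\approx 1.960\) for \(p=0.95\), \[ m=\frac{\lambda_{kp}}{0.20}=\frac{(1.960/0.05)^2}{0.20}=\frac{1536.64}{0.20}\approx 7{,}683 \textrm{ exposures}. \]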


Frequency can be defined as the number of claims per exposure. Let \(m\) denote the number of exposures. If the observed claim frequency \(N/m\) is used to estimate \(\mathrm{E}(N/m)\), the accuracy requirement is
\[ \Pr[(1-k)\mathrm{E}(N/m)\leq N/m \leq(1+k)\mathrm{E}(N/m)] \geq p. \] Because the number of exposures is not a random variable, \(\mathrm{E}(N/m)=\mathrm{E}(N)/m=\mu_N/m\) and the prior equation becomes \[ \Pr\left[(1-k)\frac{\mu_N}{m}\leq \frac{N}{m} \leq(1+k)\frac{\mu_N}{m}\right] \geq p. \] Multiplying through by \(m\) results in equation (12.7) at the beginning of the section. The full credibility standards that were developed for estimating expected number of claims also apply to frequency.

12.6.2 Full Credibility for Aggregate Losses and Pure Premium

Aggregate losses are the total of all loss amounts for a risk or group of risks. Let \(S\) represent aggregate losses, so that \[ S=X_1+X_2+\cdots+X_N. \] The random variable \(N\) represents the number of losses and random variables \(X_1, X_2,\ldots,X_N\) are the individual loss amounts. In this section it is assumed that \(N\) is independent of the loss amounts and that \(X_1, X_2,\ldots,X_N\) are iid (independent and identically distributed).

The mean and variance of \(S\) are \[ \mu_S=\mathrm{E}(S)=\mathrm{E}(N)\mathrm{E}(X)=\mu_N\mu_X \] and \[ \sigma^{2}_S=\mathrm{Var}(S)=\mathrm{E}(N)\mathrm{Var}(X)+[\mathrm{E}(X)]^{2}\mathrm{Var}(N)=\mu_N\sigma^{2}_X+\mu^{2}_X\sigma^{2}_N , \] where \(X\) is the amount of a single loss. See Section 7.3 on collective risk models for more discussion of this framework.

Observed losses \(S\) will be used to estimate expected losses \(\mu_S=\mathrm{E}(S)\). As with the frequency model in the previous section, the observed losses must be close to the expected losses as quantified in the equation \[ \Pr[(1-k)\mu_S\leq S \leq(1+k)\mu_S] \geq p. \] After subtracting the mean and dividing by the standard deviation, \[ \Pr\left[\frac{-k\mu_S}{\sigma_S}\leq (S-\mu_S)/\sigma_S \leq \frac{k\mu_S}{\sigma_S}\right] \geq p . \] As done in the previous section the distribution for \((S-\mu_S)/\sigma_S\) is assumed to be standard normal and \(k\mu_S/\sigma_S=y_p=\Phi^{-1}((p+1)/2)\). This equation can be rewritten as \(\mu_S^2=(y_p/k)^2\sigma_S^2\). Using the prior formulas for \(\mu_S\) and \(\sigma_{S}^2\) gives \((\mu_N\mu_X)^2=(y_p/k)^2(\mu_N\sigma^{2}_X+\mu^{2}_X\sigma^{2}_N)\). Dividing both sides by \(\mu_N\mu_X^2\) and reordering terms on the right side results in a full credibility standard \(n_S\) for aggregate losses \[\begin{equation} n_S=\left(\frac{y_p}{k}\right)^2\left[\left(\frac{\sigma_N^2}{\mu_N}\right)+\left(\frac{\sigma_X}{\mu_X}\right)^2\right]=\lambda_{kp}\left[\left(\frac{\sigma_N^2}{\mu_N}\right)+\left(\frac{\sigma_X}{\mu_X}\right)^2\right]. \tag{12.11} \end{equation}\]
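
The division step that produces (12.11) can be written out explicitly. Dividing both sides of \((\mu_N\mu_X)^2=(y_p/k)^2(\mu_N\sigma^{2}_X+\mu^{2}_X\sigma^{2}_N)\) by \(\mu_N\mu_X^2\) gives \[ \mu_N=\left(\frac{y_p}{k}\right)^2\left[\frac{\mu_N\sigma_X^2}{\mu_N\mu_X^2}+\frac{\mu_X^2\sigma_N^2}{\mu_N\mu_X^2}\right]=\left(\frac{y_p}{k}\right)^2\left[\left(\frac{\sigma_X}{\mu_X}\right)^2+\left(\frac{\sigma_N^2}{\mu_N}\right)\right], \] and the value of \(\mu_N\) satisfying this equation is the standard \(n_S\), expressed as an expected number of claims.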

Example 12.6.5. The number of claims has a Poisson distribution. Individual loss amounts are independently and identically distributed with a Pareto distribution \(F(x)=1-[\theta/(x+\theta)]^{\alpha}\). The number of claims and loss amounts are independent. If observed aggregate losses should be within 5\(\%\) of the expected value with probability \(p=0.95\), how many losses are required for full credibility?


When the number of claims is Poisson distributed then equation (12.11) can be simplified using \((\sigma_N^2/\mu_N)=1\). It follows that \[ [(\sigma_N^2/\mu_N)+(\sigma_X/\mu_X)^2]=[1+(\sigma_X/\mu_X)^2]=[(\mu_X^2+\sigma_X^2)/\mu_X^2]=\mathrm{E}(X^2)/\mathrm{E}(X)^2 \] using the relationship \(\mu_X^2+\sigma_X^2=\mathrm{E}(X^2)\). The full credibility standard is \(n_S=\lambda_{kp}~\mathrm{E}(X^2)/\mathrm{E}(X)^2\).

The pure premium \(PP\) is equal to aggregate losses \(S\) divided by exposures \(m\): \(PP=S/m\). The full credibility standard for pure premium will require \[ \Pr\left[(1-k)\mu_{PP}\leq PP \leq(1+k)\mu_{PP}\right] \geq p. \] The number of exposures \(m\) is assumed fixed and not a random variable so \(\mu_{PP}=\mathrm{E}(S/m)=\mathrm{E}(S)/m=\mu_S/m\), and the requirement becomes \[ \Pr\left[(1-k)\left(\frac{\mu_S}{m}\right)\leq \left(\frac{S}{m}\right) \leq(1+k)\left(\frac{\mu_S}{m}\right)\right] \geq p. \] Multiplying through by \(m\) returns the bounds for losses \[ \Pr[(1-k)\mu_S\leq S \leq(1+k)\mu_S] \geq p. \] This means that the full credibility standard \(n_{PP}\) for the pure premium is the same as that for aggregate losses \[ n_{PP}=n_S=\lambda_{kp}\left[\left(\frac{\sigma_N^2}{\mu_N}\right)+\left(\frac{\sigma_X}{\mu_X}\right)^2\right]. \]

12.6.3 Full Credibility for Severity

Let \(X\) be a random variable representing the size of one claim. Claim severity is \(\mu_X=\mathrm{E}(X)\). Suppose that \(X_1,X_2, \ldots, X_n\) is a random sample of \(n\) claims that will be used to estimate claim severity \(\mu_X\). The claims are assumed to be iid. The average value of the sample is \[ \bar{X}=\frac{1}{n}\left(X_1+X_2+\cdots+X_n\right). \] How big does \(n\) need to be to get a good estimate? Note that \(n\) is not a random variable here, whereas the number of claims is random in the aggregate loss model.

In Section 12.6.1 the accuracy of an estimator for frequency was defined by requiring that the number of claims lie within a specified interval about the mean number of claims with a specified probability. For severity this requirement is \[ \Pr[(1-k)\mu_X\leq \bar{X} \leq(1+k)\mu_X ]\geq p , \] where \(k\) and \(p\) need to be specified. Following the steps in Section 12.6.1, the mean claim severity \(\mu_X\) is subtracted from each term and each term is divided by the standard deviation of the claim severity estimator \(\sigma_{\bar{X}}\), yielding \[ \Pr\left[\frac{-k~\mu_X}{\sigma_{\bar{X}}}\leq (\bar{X}-\mu_X)/\sigma_{\bar{X}} \leq \frac{k~\mu_X}{\sigma_{\bar{X}}}\right] \geq p . \] As in prior sections, it is assumed that \((\bar{X}-\mu_X)/\sigma_{\bar{X}}\) is approximately normally distributed and the prior equation is satisfied if \(k\mu_X/\sigma_{\bar{X}}\geq y_p\) with \(y_p=\Phi^{-1}((p+1)/2)\). Because \(\bar{X}\) is the average of individual claims \(X_1, X_2,\dots, X_n\), its standard deviation is equal to the standard deviation of an individual claim divided by \(\sqrt{n}\): \(\sigma_{\bar{X}}=\sigma_X/\sqrt{n}\). So, \(k\mu_X/(\sigma_X/\sqrt{n})\geq y_p\) and with a little algebra this can be rewritten as \(n \geq (y_p/k)^2(\sigma_X/\mu_X)^2\). The full credibility standard for severity is \[\begin{equation} n_X=\left(\frac{y_p}{k}\right)^2\left(\frac{\sigma_X}{\mu_X}\right)^2=\lambda_{kp}\left(\frac{\sigma_X}{\mu_X}\right)^2. \tag{12.12} \end{equation}\] Note that the term \(\sigma_X/\mu_X\) is the coefficient of variation for an individual claim, that is, the standard deviation divided by the mean, which measures variability in units of the mean. Even though \(\lambda_{kp}\) is the full credibility standard for frequency given a Poisson distribution, there is no assumption about the distribution for the number of claims.

Example 12.6.6. Individual loss amounts are independently and identically distributed with a Type II Pareto distribution \(F(x)=1-[\theta/(x+\theta)]^{\alpha}\). How many claims are required for the average severity of observed claims to be within 5\(\%\) of the expected severity with probability \(p=0.95\)?


12.6.4 Partial Credibility

In prior sections full credibility standards were calculated for estimating frequency (\(n_f\)), pure premium (\(n_{PP}\)), and severity (\(n_X\)). In this section these full credibility standards will be denoted by \(n_{0}\). In each case the full credibility standard was the expected number of claims required to achieve a defined level of accuracy when using empirical data to estimate an expected value. If the observed number of claims is greater than or equal to the full credibility standard then a full credibility weight \(Z=1\) is given to the data.

In limited fluctuation credibility, credibility weights \(Z\) assigned to data are \[ Z= \left\{ \begin{array}{ll} \sqrt{n /n_{0}} &\textrm{if } n < n_{0} \\ 1 & \textrm{if } n \ge n_{0} , \end{array} \right. \]

where \(n_0\) is the full credibility standard. The quantity \(n\) is the number of claims for the data that is used to estimate the expected frequency, severity, or pure premium.

Example 12.6.7. The number of claims has a Poisson distribution. Individual loss amounts are independently and identically distributed with a Type II Pareto distribution \(F(x)=1-[\theta/(x+\theta)]^{\alpha}\). Assume that \(\alpha=3\). The number of claims and loss amounts are independent. The full credibility standard is that the observed pure premium should be within 5\(\%\) of the expected value with probability \(p=0.95\). What credibility \(Z\) is assigned to a pure premium computed from 1,000 claims?
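
The calculation for this example can be sketched as follows. The sketch relies on a standard result not derived in this section: for the Type II Pareto with \(\alpha>2\), \(\mathrm{E}(X)=\theta/(\alpha-1)\) and \(\mathrm{Var}(X)=\alpha\theta^2/[(\alpha-1)^2(\alpha-2)]\), so the squared coefficient of variation is \(\alpha/(\alpha-2)\) and \(\theta\) cancels. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

alpha = 3.0          # Pareto shape parameter from the example
k, p = 0.05, 0.95    # range parameter and probability level
n_claims = 1000      # claims underlying the observed pure premium

y_p = NormalDist().inv_cdf((p + 1) / 2)
lambda_kp = (y_p / k) ** 2               # Poisson standard (12.9)

# Squared coefficient of variation of a Type II Pareto; requires alpha > 2.
cv_squared = alpha / (alpha - 2)

# Formula (12.11) with Poisson frequency, so variance/mean of N equals 1.
n_full = lambda_kp * (1 + cv_squared)

# Square-root rule for partial credibility, capped at 1.
Z = min(1.0, sqrt(n_claims / n_full))
print(f"n_0 = {n_full:.1f}, Z = {Z:.4f}")
```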


Limited fluctuation credibility uses the formula \(Z=\sqrt{n/n_0}\) to limit the fluctuation in the credibility-weighted estimate to match the fluctuation allowed for data with expected claims at the full credibility standard. Variance or standard deviation is used as the measure of fluctuation. Next we show an example to explain why the square-root formula is used.

Suppose that average claim severity is being estimated from a sample of size \(n\) that is less than the full credibility standard \(n_0=n_X\). Applying credibility theory, the estimate \(\hat{\mu}_X\) would be \[ \hat{\mu}_X=Z\bar{X}+(1-Z)M_X , \] with \(\bar{X}=(X_1+X_2+\cdots+X_n)/n\) and \(iid\) random variables \(X_i\) representing the sizes of individual claims. The complement of credibility is applied to \(M_X\) which could be last year’s estimated average severity adjusted for inflation, the average severity for a much larger pool of risks, or some other relevant quantity selected by the actuary. It is assumed that the variance of \(M_X\) is zero or negligible. With this assumption \[ \mathrm{Var}(\hat{\mu}_X)=\mathrm{Var}(Z\bar{X})=Z^2\mathrm{Var}(\bar{X})=\frac{n}{n_0}\mathrm{Var}(\bar{X}). \] Because \(\bar{X}=(X_1+X_2+\cdots+X_n)/n\) it follows that \(\mathrm{Var}(\bar{X})=\mathrm{Var}(X_i)/n\) where random variable \(X_i\) is one claim. So, \[ \mathrm{Var}(\hat{\mu}_X)=\frac{n}{n_0}\mathrm{Var}(\bar{X})=\frac{n}{n_0}\frac{\mathrm{Var}(X_i)}{n}=\frac{\mathrm{Var}(X_i)}{n_0}. \] The last term is exactly the variance of a sample mean \(\bar{X}\) when the sample size is equal to the full credibility standard \(n_0=n_X\).

12.6.5 Full Credibility Standard for Limited Fluctuation Credibility

Limited fluctuation credibility requires a full credibility standard. The general formula for aggregate losses or pure premium, as obtained in formula (12.11), is \[ n_S=\left(\frac{y_p}{k}\right)^2\left[\left(\frac{\sigma_N^2}{\mu_N}\right)+\left(\frac{\sigma_X}{\mu_X}\right)^2\right] , \] with \(N\) representing number of claims and \(X\) the size of claims. If one assumes \(\sigma_X=0\) then the full credibility standard for frequency results. If \(\sigma_N=0\) then the full credibility formula for severity follows. The probability \(p\) and range parameter \(k\) are often selected using judgment and experience.

In practice it is often assumed that the number of claims is Poisson distributed so that \(\sigma_N^2/\mu_N=1\). In this case the formula can be simplified to \[\begin{equation*} n_S=\left(\frac{y_p}{k}\right)^2\left[\frac{\mathrm{E}(X^2)}{(\mathrm{E}(X))^2}\right]. \end{equation*}\] An empirical mean and second moment for the sizes of individual claim losses can be computed from past data, if available.
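
As a sketch of how the empirical moments might be used, the function below estimates \(\mathrm{E}(X)\) and \(\mathrm{E}(X^2)\) from a sample of claim sizes and plugs them into the Poisson-frequency formula above. The function name and the claim amounts are hypothetical and serve only to illustrate the mechanics.

```python
from statistics import NormalDist

def full_credibility_poisson_aggregate(claim_sizes, k, p):
    """Full credibility standard for aggregate losses with Poisson frequency,
    using empirical first and second moments of the claim sizes."""
    n = len(claim_sizes)
    first_moment = sum(claim_sizes) / n
    second_moment = sum(x * x for x in claim_sizes) / n
    y_p = NormalDist().inv_cdf((p + 1) / 2)
    return (y_p / k) ** 2 * second_moment / first_moment ** 2

# Hypothetical past claim amounts (not taken from the chapter).
sample_claims = [500.0, 1200.0, 800.0, 4500.0, 300.0, 2700.0]
n_s = full_credibility_poisson_aggregate(sample_claims, k=0.05, p=0.95)
print(f"Estimated full credibility standard: {n_s:.0f} claims")
```

If every claim had the same size, the moment ratio would equal one and the standard would reduce to the Poisson frequency standard \(\lambda_{kp}\); any variation in claim size increases it.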


12.7 Balancing Credibility Estimators

The credibility weighted model \(\hat{\mu}(\theta_i)=Z_i\bar{X}_i+(1-Z_i)\bar{X}\), where \(\bar{X}_i\) is the loss per exposure for risk \(i\) and \(\bar{X}\) is loss per exposure for the population, can be used to estimate the expected loss for risk \(i\). The overall mean is \(\bar{X}=\sum_{i=1}^r(m_i/m) \bar{X}_i\) where \(m_i\) and \(m\) are number of exposures for risk \(i\) and population, respectively.

For the credibility weighted estimators to be in balance we want \[ \bar{X}=\sum_{i=1}^r(m_i/m) \bar{X}_i=\sum_{i=1}^r(m_i/m) \hat{\mu}(\theta_i). \] If this equation is satisfied then the estimated losses for each risk will add up to the population total, an important goal in ratemaking, but this may not happen if the complement of credibility is applied to \(\bar{X}\).

To achieve balance, we will set \(\hat{M}_X\) as the amount that is applied to the complement of credibility and thus analyze the following equation: \[ \sum_{i=1}^r(m_i/m) \bar{X}_i=\sum_{i=1}^r(m_i/m) \left\{Z_i\bar{X}_i+(1-Z_i) \cdot \hat{M}_X\right\} . \] A little algebra gives \[ \sum_{i=1}^r m_i \bar{X}_i=\sum_{i=1}^r m_i Z_i\bar{X}_i + \hat{M}_X\sum_{i=1}^r m_i(1-Z_i), \] and \[ \hat{M}_X=\frac{\sum_{i=1}^r m_i(1-Z_i)\bar{X}_i}{\sum_{i=1}^r m_i(1-Z_i)}. \] Using this value for \(\hat{M}_X\) will bring the credibility weighted estimators into balance.

If credibilities \(Z_i\) were computed using the Bühlmann-Straub model, then \(Z_i=m_i/(m_i+K)\). The prior formula can be simplified using the following relationship \[ m_i(1-Z_i)=m_i\left(1-\frac{m_i}{m_i+K}\right)=m_i\left(\frac{(m_i+K)-m_i}{m_i+K}\right)=KZ_i . \] Therefore, an amount that, when applied to the complement of credibility, brings the credibility-weighted estimators into balance with the overall mean loss per exposure is \[ \hat{M}_X=\frac{\sum_{i=1}^r Z_i \bar{X}_i}{\sum_{i=1}^r Z_i}. \]

Example 12.7.1. An example from the nonparametric Bühlmann-Straub section had the following data for two risks. Find an amount for the complement of credibility \(\hat{M}_X\) that will produce credibility-weighted estimates that are in balance. \[ \small{ \begin{array}{|c|c|c|c|c|c|} \hline \text{Policyholder} & & \text{Year 1} & \text{Year 2} & \text{Year 3} & \text{Year 4} \\ \hline \text{A} & \text{Number of claims} & 0 & 2 & 2 & 3 \\ \hline \text{A} & \text{Insured vehicles} & 1 & 2 & 2 & 2\\ \hline & & & & & \\ \hline \text{B} & \text{Number of claims} & 0 & 0 & 1 & 2\\ \hline \text{B} & \text{Insured vehicles} & 0 & 2 & 3 & 4\\ \hline \end{array} } \]
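
The balancing calculation can be sketched as below. The value \(K=693/332\approx 2.087\) is taken as given: it is \(\widehat{EPV}/\widehat{VHM}=(11/30)/(166/945)\), the result of applying the nonparametric Bühlmann-Straub formulas of Section 12.5.1 to this table with policyholder B's zero-exposure year excluded. Variable names are illustrative.

```python
from fractions import Fraction

# K = EPV / VHM from the nonparametric Buhlmann-Straub estimators (assumed input).
K = Fraction(11, 30) / Fraction(166, 945)

exposures = {"A": 7, "B": 9}   # total insured vehicles with exposure
claims = {"A": 7, "B": 3}      # total claims

xbar = {k: Fraction(claims[k], exposures[k]) for k in exposures}
Z = {k: Fraction(m) / (m + K) for k, m in exposures.items()}

# Complement of credibility that brings the estimates into balance.
M_hat = sum(Z[k] * xbar[k] for k in Z) / sum(Z.values())
estimates = {k: Z[k] * xbar[k] + (1 - Z[k]) * M_hat for k in Z}

# Balance check: exposure-weighted estimates reproduce the overall mean.
m = sum(exposures.values())
balanced_mean = sum(exposures[k] * estimates[k] for k in exposures) / m
overall_mean = Fraction(sum(claims.values()), m)

print(f"M_hat = {float(M_hat):.4f}")
print(f"balanced mean = {balanced_mean}, overall mean = {overall_mean}")
```

Because exact fractions are used, the exposure-weighted average of the two estimates matches the overall mean of \(10/16\) claims per vehicle exactly rather than only to rounding.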



12.8 Further Resources and Contributors

Contributor

  • Gary Dean, Ball State University is the author of the initial version of this chapter. Email: for chapter comments and suggested improvements.
  • Chapter reviewers include: Liang (Jason) Hong, Ambrose Lo, Ranee Thiagarajah, Hongjuan Zhou.