Chapter 12 Experience Rating Using Credibility Theory
Chapter Preview. This chapter introduces credibility theory as an important actuarial tool for estimating pure premiums, frequencies, and severities for individual risks or classes of risks. Credibility theory provides a convenient framework for combining the experience for an individual risk or class with other data to produce more stable and accurate estimates. Several models for calculating credibility estimates will be discussed including Bühlmann, Bühlmann-Straub, limited fluctuation, and nonparametric and semiparametric credibility methods. The chapter will also show a connection between credibility theory and Bayesian estimation which was introduced in Chapter 9, Bayesian Inference and Modeling.
12.1 Introduction to Applications of Credibility Theory
What premium should be charged to provide insurance? The answer depends upon the exposure to the risk of loss. A common method to compute an insurance premium is to rate an insured using a classification rating plan. A classification plan is used to select an insurance rate based on an insured’s rating characteristics such as geographic territory, age, etc. All classification rating plans use a limited set of criteria to group insureds into a “class” and there will be variation in the risk of loss among insureds within the class.
An experience rating plan attempts to capture some of the variation in the risk of loss among insureds within a rating class by using the insured’s own loss experience to complement the rate from the classification rating plan. One way to do this is to use a credibility weight $Z$ with $0 \le Z \le 1$, the weight assigned to the insured’s historical loss experience, to compute
$$\hat{R} = Z\bar{X} + (1-Z)M,$$
where
- $\hat{R}$ = credibility-weighted rate for the risk,
- $\bar{X}$ = average loss for the risk over a specified time period,
- $M$ = the rate for the classification group, often called the manual rate.
For a risk whose loss experience is stable from year to year, $Z$ might be close to 1. For a risk whose losses vary widely from year to year, $Z$ may be close to 0.
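The credibility-weighted rate can be sketched in a few lines of Python. The function name and the numeric inputs below are hypothetical and chosen only for illustration; they do not come from a real rating plan.

```python
def credibility_weighted_rate(z: float, avg_loss: float, manual_rate: float) -> float:
    """Return Z * X_bar + (1 - Z) * M for a credibility weight Z in [0, 1]."""
    if not 0.0 <= z <= 1.0:
        raise ValueError("credibility weight Z must lie in [0, 1]")
    return z * avg_loss + (1.0 - z) * manual_rate


# Hypothetical inputs: 40% credibility, average loss 800, manual rate 1000.
rate = credibility_weighted_rate(z=0.4, avg_loss=800.0, manual_rate=1000.0)
print(rate)
```

With $Z=0$ the function returns the manual rate and with $Z=1$ it returns the insured's own average loss, which matches the two limiting cases described above.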
Credibility theory is also used for computing rates for individual classes within a classification rating plan. When classification plan rates are being determined, some or many of the groups may not have sufficient data to produce stable and reliable rates. The actual loss experience for a group will be assigned a credibility weight $Z$ and the complement of credibility $1-Z$ may be given to the average experience for risks across all classes. Or, if a class rating plan is being updated, the complement of credibility may be assigned to the current class rate. Credibility theory can also be applied to the calculation of expected frequencies and severities.
Computing numeric values for Z requires analysis and understanding of the data. What are the variances in the number of losses and sizes of losses for risks? What is the variance between expected values across risks?
12.2 Bühlmann Credibility
In this section, you learn how to:
- Compute a credibility-weighted estimate for the expected loss for a risk or group of risks.
- Determine the credibility Z assigned to observations.
- Calculate the values required in Bühlmann credibility including the Expected Value of the Process Variance (EPV), Variance of the Hypothetical Means (VHM) and collective mean μ.
A classification rating plan groups policyholders together into classes based on risk characteristics. Although policyholders within a class have similarities, they are not identical and their expected losses will not be exactly the same. An experience rating plan can supplement a class rating plan by credibility weighting an individual policyholder’s loss experience with the class rate to produce a more accurate rate for the policyholder. Chapter 15 Experience Rating using Bonus-Malus provides examples of rating plans that adjust a policyholder’s rate to recognize their loss experience.
The Bühlmann credibility model introduced in this section is often called greatest accuracy credibility, least-squares credibility, or Bayesian credibility.
In this presentation a risk parameter $\theta$ will be assigned to each policyholder. Losses $X$ for the policyholder with parameter $\theta$ will have a pdf $f_{X|\Theta=\theta}(x)$, mean
$$\mu(\theta) = \mathrm{E}_X(X|\theta) = \int x\, f_{X|\Theta=\theta}(x)\,dx,$$
and variance
$$\sigma^2(\theta) = \mathrm{Var}_X(X|\theta) = \int (x-\mu(\theta))^2 f_{X|\Theta=\theta}(x)\,dx.$$
The integrals are over the support for the distributions. Losses $X$ can represent pure premiums, aggregate losses, number of claims, claim severities, or some other measure of loss for a period of time, often one year. Risk parameter $\theta$ may be continuous or discrete and may be multivariate depending on the model. For a randomly selected risk the risk parameter $\theta$ is unknown but the probability density function for $\theta$ is modeled with $f_\Theta(\theta)$. Averaging across the policyholders in the class, the collective mean loss is
$$\mu = \mathrm{E}_\Theta[\mathrm{E}_X(X|\theta)] = \int f_\Theta(\theta)\,\mu(\theta)\,d\theta = \int f_\Theta(\theta) \int x\, f_{X|\Theta=\theta}(x)\,dx\,d\theta.$$
Example 12.2.1. The number of claims $X$ for an insured in a class has a Poisson distribution with mean $\theta>0$. The risk parameter $\theta$ is exponentially distributed within the class with pdf $f(\theta)=e^{-\theta}$. What is the expected number of claims for an insured chosen at random from the class?
In the prior example the risk parameter θ is a continuous random variable with an exponential distribution. In the next example there are three types of risks and the risk parameter has a discrete distribution.
Example 12.2.2. For any risk (policyholder) in a population the number of losses $N$ in a year has a Poisson distribution with parameter $\lambda$. Individual loss amounts $X_i$ for a risk are independent of $N$ and are iid with Type II Pareto distribution $F(x)=1-[\theta/(x+\theta)]^{\alpha}$. There are three types of risks in the population as follows:
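| Risk Type | Percentage of Population | Poisson Parameter | Pareto Parameters |
|---|---|---|---|
| A | 50% | $\lambda=0.5$ | $\theta=1000$, $\alpha=2.0$ |
| B | 30% | $\lambda=1.0$ | $\theta=1500$, $\alpha=2.0$ |
| C | 20% | $\lambda=2.0$ | $\theta=2000$, $\alpha=2.0$ |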
If a risk is selected at random from the population, what is the expected aggregate loss in a year?
What is the risk parameter for a risk (policyholder) in the prior example? One could say that the risk parameter has three components (λ,θ,α) with possible values (0.5,1000,2.0), (1.0,1500,2.0), and (2.0,2000,2.0) depending on the type of risk.
Note that in both of the examples the risk parameter is a random quantity with its own probability distribution. We do not know the value of the risk parameter for a randomly chosen risk.
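Returning to Example 12.2.2, the collective mean can be sketched numerically. The snippet assumes two standard facts that the example does not state: the Type II Pareto mean is $\theta/(\alpha-1)$ when $\alpha>1$, and the expected aggregate loss of a compound Poisson model is $\lambda$ times the mean severity. All function and variable names are illustrative.

```python
# Risk types from Example 12.2.2:
# (population share, Poisson lambda, Pareto theta, Pareto alpha)
risk_types = {
    "A": (0.5, 0.5, 1000.0, 2.0),
    "B": (0.3, 1.0, 1500.0, 2.0),
    "C": (0.2, 2.0, 2000.0, 2.0),
}


def pareto_mean(theta: float, alpha: float) -> float:
    """Mean of a Type II Pareto with cdf 1 - (theta / (x + theta))**alpha."""
    if alpha <= 1.0:
        raise ValueError("the mean exists only for alpha > 1")
    return theta / (alpha - 1.0)


def expected_aggregate_loss(lam: float, theta: float, alpha: float) -> float:
    """E[S] = E[N] * E[X] when N is independent of the iid severities."""
    return lam * pareto_mean(theta, alpha)


# Hypothetical mean for each risk type, then the population-weighted average.
hypothetical_means = {
    name: expected_aggregate_loss(lam, theta, alpha)
    for name, (_, lam, theta, alpha) in risk_types.items()
}
collective_mean = sum(
    share * hypothetical_means[name]
    for name, (share, _, _, _) in risk_types.items()
)
print(hypothetical_means, collective_mean)
```

The three hypothetical means are the values that $\mu(\theta)$ takes for the three possible risk parameters, and the collective mean is their average weighted by the population shares.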
12.2.1 Credibility-Weighted Estimate for the Expected Loss
If a policyholder with risk parameter $\theta$ has losses $x_1,\ldots,x_n$ during $n$ time periods then the goal is to find $\mathrm{E}_\Theta(\mu(\theta)|x_1,\ldots,x_n)$, the conditional expectation of $\mu(\theta)$ given observations $x_1,\ldots,x_n$. Section 12.3, Bayesian Inference and Bühlmann Credibility, explains how to evaluate $\mathrm{E}_\Theta(\mu(\theta)|x_1,\ldots,x_n)$ using Bayesian inference.
The Bühlmann credibility model calculates a linear approximation $\hat{\mu}(\theta)=Z\bar{x}+(1-Z)\mu$ to estimate $\mathrm{E}_\Theta(\mu(\theta)|x_1,\ldots,x_n)$ with $\bar{x}=(x_1+\cdots+x_n)/n$. We can rewrite this as $\hat{\mu}(\theta)=a+b\bar{x}$, which makes it obvious that the credibility estimate is a linear function of the sample mean $\bar{x}$.
In the Bühlmann model, $\mathrm{E}_\Theta(\mu(\theta)|X_1,\ldots,X_n)$ is approximated by the linear function $a+b\bar{X}$ and constants $a$ and $b$ are calculated to minimize the expected squared difference between these two quantities,
$$G(a,b)=\mathrm{E}_X\left(\left[\mathrm{E}_\Theta(\mu(\theta)|X_1,\ldots,X_n)-(a+b\bar{X})\right]^2\right),$$
hence the alternative name least-squares credibility. Minimizing the expectation yields $b=n/(n+K)$ and $a=(1-b)\mu$. Quantity $n$ is the number of observations and $\mu=\mathrm{E}_\Theta(\mu(\theta))$ is the population mean. For the moment we simply define $K$ = (Expected Value of the Process Variance) / (Variance of the Hypothetical Means) = EPV/VHM and will clarify the meaning at the beginning of the next section. More details about this model and calculation of $a$ and $b$ can be found in references (Bühlmann 1967), (Bühlmann and Gisler 2005), (Klugman, Panjer, and Willmot 2012), and (Tse 2009).
The Bühlmann credibility-weighted estimate for $\mathrm{E}_\Theta(\mu(\theta)|x_1,\ldots,x_n)$ for the policyholder is
$$\hat{\mu}(\theta)=Z\bar{x}+(1-Z)\mu \tag{12.1}$$
with
- $\theta$ = a risk parameter that identifies a policyholder's risk level,
- $\hat{\mu}(\theta)$ = estimated expected loss for a policyholder with parameter $\theta$ and loss experience $\bar{x}$,
- $\bar{x}=(x_1+\cdots+x_n)/n$, the average of $n$ observations of the policyholder,
- $Z$ = credibility assigned to $n$ observations $=n/(n+K)$,
- $K=\mathrm{EPV}/\mathrm{VHM}$,
- $\mu$ = the expected loss for a randomly chosen policyholder in the class.
For a selected policyholder, random variables $X_j$ are assumed to be iid for $j=1,\ldots,n$ because it is assumed that the policyholder’s exposure to loss is not changing through time, and $\mathrm{E}_X(\bar{X}|\theta)=\mathrm{E}_X(X_j|\theta)=\mu(\theta)$.
If a policyholder is randomly chosen from the class and there is no loss information about the risk then the expected loss is $\mu=\mathrm{E}_\Theta(\mu(\theta))$ where the expectation is taken over all $\theta$’s in the class. In this situation $Z=0$ and the expected loss is $\hat{\mu}(\theta)=\mu$ for the risk. The quantity $\mu$ can also be written as $\mu=\mathrm{E}(X_j)$ or $\mu=\mathrm{E}(\bar{X})$ and is referred to as the overall mean, population mean, or collective mean. Note that $\mathrm{E}(X_j)$ is evaluated with the law of total expectation: $\mathrm{E}(X_j)=\mathrm{E}_\Theta[\mathrm{E}_X(X_j|\theta)]$.
Although formula (12.1) was introduced using experience rating as an example, the Bühlmann credibility model has wider application. Suppose that a rating plan has multiple classes. Credibility formula (12.1) can be used to determine individual class rates. The overall mean $\mu$ would be the average loss for all classes combined, $\bar{x}$ would be the experience for the individual class, and $\hat{\mu}(\theta)$ would be the estimated loss for the class.
12.2.2 Credibility Z, EPV, and VHM
When computing the credibility estimate $\hat{\mu}(\theta)=Z\bar{X}+(1-Z)\mu$, how much weight $Z$ should go to experience $\bar{X}$ and how much weight $(1-Z)$ to the overall mean $\mu$? In Bühlmann credibility there are three factors that need to be considered:
- How much variation is there in a single observation $X_j$ for a selected risk? With $\bar{X}=(X_1+\cdots+X_n)/n$ and assuming that the observations are iid conditional on $\theta$, it follows that $\mathrm{Var}_X(\bar{X}|\theta)=\mathrm{Var}_X(X_j|\theta)/n$. For larger $\mathrm{Var}_X(\bar{X}|\theta)$ less credibility weight $Z$ should be given to experience $\bar{X}$. The Expected Value of the Process Variance, abbreviated EPV, is the expected value of $\mathrm{Var}_X(X_j|\theta)$ across all risks: $\mathrm{EPV}=\mathrm{E}_\Theta(\mathrm{Var}_X(X_j|\theta))$. It measures the average natural variability of observations within each risk. Because $\mathrm{Var}_X(\bar{X}|\theta)=\mathrm{Var}_X(X_j|\theta)/n$ it follows that $\mathrm{E}_\Theta(\mathrm{Var}_X(\bar{X}|\theta))=\mathrm{EPV}/n$.
- How homogeneous is the population of risks whose experience was combined to compute the overall mean $\mu$? If all the risks are similar in loss potential then more weight $(1-Z)$ would be given to the overall mean $\mu$ because $\mu$ is the average for a group of similar risks whose means $\mu(\theta)$ are not far apart. The homogeneity or heterogeneity of the population is measured by the Variance of the Hypothetical Means, abbreviated VHM: $\mathrm{VHM}=\mathrm{Var}_\Theta(\mathrm{E}_X(X_j|\theta))=\mathrm{Var}_\Theta(\mathrm{E}_X(\bar{X}|\theta))$. Note that we used $\mathrm{E}_X(\bar{X}|\theta)=\mathrm{E}_X(X_j|\theta)$ for the second equality.
- How many observations $n$ were used to compute $\bar{X}$? A larger sample implies a larger $Z$.
Example 12.2.3. The number of claims N in a year for a risk in a population has a Poisson distribution with mean λ>0. The risk parameter λ is uniformly distributed over the interval (0,2). Calculate the EPV and VHM for the population.
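A short worked calculation for Example 12.2.3 is sketched below. It relies only on two standard facts: a Poisson distribution has variance equal to its mean, and a uniform distribution on $(a,b)$ has mean $(a+b)/2$ and variance $(b-a)^2/12$.

$$
\begin{aligned}
\mathrm{EPV} &= \mathrm{E}_\Lambda\big(\mathrm{Var}(N|\lambda)\big) = \mathrm{E}_\Lambda(\lambda) = \frac{0+2}{2} = 1,\\
\mathrm{VHM} &= \mathrm{Var}_\Lambda\big(\mathrm{E}(N|\lambda)\big) = \mathrm{Var}_\Lambda(\lambda) = \frac{(2-0)^2}{12} = \frac{1}{3}.
\end{aligned}
$$

Under these values the ratio $\mathrm{EPV}/\mathrm{VHM}$ equals 3.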
The Bühlmann credibility formula includes values for $n$, EPV, and VHM:
$$Z=\frac{n}{n+K},\qquad K=\frac{\mathrm{EPV}}{\mathrm{VHM}}.$$
If the VHM increases then $Z$ increases. If the EPV increases then $Z$ gets smaller. Credibility $Z$ asymptotically approaches 1 as the number of observations $n$ goes to infinity.
If you multiply the numerator and denominator of the $Z$ formula by $(\mathrm{VHM}/n)$ then $Z$ can be rewritten as
$$Z=\frac{\mathrm{VHM}}{\mathrm{VHM}+(\mathrm{EPV}/n)}.$$
The number of observations $n$ is captured in the term $(\mathrm{EPV}/n)$.
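The behavior of $Z$ can be checked with a small sketch. The function names are illustrative, and the EPV and VHM values are hypothetical inputs (they happen to equal the values for a Poisson mean that is uniform on $(0,2)$).

```python
def buhlmann_z(n: int, epv: float, vhm: float) -> float:
    """Credibility Z = n / (n + K) with K = EPV / VHM (requires VHM > 0)."""
    if vhm <= 0:
        raise ValueError("VHM must be positive")
    return n / (n + epv / vhm)


def buhlmann_z_alt(n: int, epv: float, vhm: float) -> float:
    """Equivalent form Z = VHM / (VHM + EPV / n)."""
    return vhm / (vhm + epv / n)


epv, vhm = 1.0, 1.0 / 3.0  # hypothetical values, so K = 3
for n in (1, 3, 10, 100, 1000):
    z = buhlmann_z(n, epv, vhm)
    # Both forms should agree up to floating-point rounding.
    assert abs(z - buhlmann_z_alt(n, epv, vhm)) < 1e-12
    print(n, round(z, 4))
```

Running the loop shows $Z$ rising toward 1 as $n$ grows. Doubling the VHM raises $Z$ for every $n$, while doubling the EPV lowers it.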
Example 12.2.4. The law of total variance can be written as $\mathrm{Var}(Y)=\mathrm{E}(\mathrm{Var}[Y|X])+\mathrm{Var}(\mathrm{E}[Y|X])$. Show that $\mathrm{Var}(\bar{X})=\mathrm{VHM}+(\mathrm{EPV}/n)$ and derive a formula for $Z$ in terms of $\bar{X}$.
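One way to carry out the derivation in Example 12.2.4 is sketched below. It conditions on the risk parameter $\theta$ and uses $\mathrm{E}_\Theta(\mathrm{Var}_X(\bar{X}|\theta))=\mathrm{EPV}/n$ and $\mathrm{Var}_\Theta(\mathrm{E}_X(\bar{X}|\theta))=\mathrm{VHM}$ from Section 12.2.2.

$$
\begin{aligned}
\mathrm{Var}(\bar{X}) &= \mathrm{E}_\Theta\big(\mathrm{Var}_X(\bar{X}|\theta)\big) + \mathrm{Var}_\Theta\big(\mathrm{E}_X(\bar{X}|\theta)\big) = \frac{\mathrm{EPV}}{n} + \mathrm{VHM},\\
Z &= \frac{\mathrm{VHM}}{\mathrm{VHM} + (\mathrm{EPV}/n)} = \frac{\mathrm{Var}_\Theta\big(\mathrm{E}_X(\bar{X}|\theta)\big)}{\mathrm{Var}(\bar{X})}.
\end{aligned}
$$

Read this way, $Z$ is the share of the total variance of $\bar{X}$ that comes from differences between risks rather than from random fluctuation within a risk.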
The following long example and solution demonstrate how to compute the credibility-weighted estimate with frequency and severity data.
Example 12.2.5. For any risk in a population the number of losses $N$ in a year has a Poisson distribution with parameter $\lambda$. Individual loss amounts $X$ for a selected risk are independent of $N$ and are iid with exponential distribution $F(x)=1-e^{-x/\beta}$. There are three types of risks in the population as shown below. A risk was selected at random from the population and all losses were recorded over a five-year period. The total amount of losses over the five-year period was 5,000. Use Bühlmann credibility to estimate the annual expected aggregate loss for the risk.
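| Risk Type | Percentage of Population | Poisson Parameter | Exponential Parameter |
|---|---|---|---|
| A | 50% | $\lambda=0.5$ | $\beta=1000$ |
| B | 30% | $\lambda=1.0$ | $\beta=1500$ |
| C | 20% | $\lambda=2.0$ | $\beta=2000$ |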
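One way to work Example 12.2.5 numerically is sketched below. It assumes the standard compound Poisson results $\mathrm{E}(S)=\lambda\,\mathrm{E}(X)$ and $\mathrm{Var}(S)=\lambda\,\mathrm{E}(X^2)$, together with $\mathrm{E}(X)=\beta$ and $\mathrm{E}(X^2)=2\beta^2$ for the exponential distribution. Function and variable names are illustrative.

```python
# (population share, Poisson lambda, exponential mean beta) from Example 12.2.5
risk_types = [(0.5, 0.5, 1000.0), (0.3, 1.0, 1500.0), (0.2, 2.0, 2000.0)]


def hypothetical_mean(lam: float, beta: float) -> float:
    """Annual expected aggregate loss for one risk type: lambda * E[X]."""
    return lam * beta


def process_variance(lam: float, beta: float) -> float:
    """Annual aggregate loss variance: lambda * E[X^2] = lambda * 2 * beta^2."""
    return lam * 2.0 * beta ** 2


mu = sum(p * hypothetical_mean(lam, b) for p, lam, b in risk_types)
epv = sum(p * process_variance(lam, b) for p, lam, b in risk_types)
vhm = sum(p * hypothetical_mean(lam, b) ** 2 for p, lam, b in risk_types) - mu ** 2

n_years = 5
x_bar = 5000.0 / n_years          # observed average annual aggregate loss
k = epv / vhm
z = n_years / (n_years + k)
estimate = z * x_bar + (1.0 - z) * mu
print(mu, epv, vhm, round(k, 4), round(z, 4), round(estimate, 2))
```

Each year is treated as one observation of the annual aggregate loss, so $n=5$ and $\bar{x}=1000$. The estimate lies between the observed average and the collective mean because $Z$ is strictly between 0 and 1.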
In real world applications of Bühlmann credibility the value of $K=\mathrm{EPV}/\mathrm{VHM}$ must be estimated. Sometimes a value for $K$ is selected using judgment. A smaller $K$ makes estimator $\hat{\mu}(\theta)$ more responsive to actual experience $\bar{X}$ whereas a larger $K$ produces a more stable estimate by giving more weight to $\mu$. Judgment may be used to balance responsiveness and stability. Section 12.5 in this chapter will discuss methods for determining $K$ from data.
12.3 Bayesian Inference and Bühlmann Credibility
In this section, you learn how to:
- Calculate formulas for expected outcomes for beta-binomial and gamma-Poisson models using Bayes' theorem or Bühlmann credibility.
- Understand the connection between the Bayesian and Bühlmann estimates for conjugate families.
Chapter 9 presents Bayesian inference and modeling and it is assumed that the reader is familiar with that material, in particular, Section 9.3 which discusses conjugate families. This section will compare Bayesian inference with Bühlmann credibility and show connections between the two models.
First we will look at a Bayesian model. Suppose a risk has $n$ observed losses $x_1,x_2,\ldots,x_n$. These losses will be represented by the vector $\mathbf{x}=(x_1,x_2,\ldots,x_n)$, which is a realization of the random vector $\mathbf{X}=(X_1,X_2,\ldots,X_n)$ whose components we will assume are iid.
A risk with risk parameter $\theta$ has expected loss $\mu(\theta)=\mathrm{E}_X(X|\theta)$. If the risk had losses $\mathbf{x}$ then $\mathrm{E}_\Theta(\mu(\theta)|\mathbf{x})$ is the conditional expectation of $\mu(\theta)$ given outcomes $\mathbf{x}$. The expected loss is updated to reflect the observations.
The expectation $\mathrm{E}_\Theta(\mu(\theta)|\mathbf{x})$ can be calculated using the conditional density function $f_{X|\Theta=\theta}(x)$ and the posterior distribution $f_{\Theta|\mathbf{X}=\mathbf{x}}(\theta)$:
$$\mu(\theta)=\mathrm{E}_X(X|\theta)=\int x\, f_{X|\Theta=\theta}(x)\,dx,\qquad \mathrm{E}_\Theta(\mu(\theta)|\mathbf{x})=\int \mu(\theta)\, f_{\Theta|\mathbf{X}=\mathbf{x}}(\theta)\,d\theta.$$
The integrations are over the support of the distributions. The posterior distribution comes from Bayes' theorem:
$$f_{\Theta|\mathbf{X}=\mathbf{x}}(\theta)=\frac{f_{\mathbf{X}|\Theta=\theta}(\mathbf{x})\, f_\Theta(\theta)}{f_{\mathbf{X}}(\mathbf{x})}.$$
The first function $f_{\mathbf{X}|\Theta=\theta}(\mathbf{x})$ in the numerator is the likelihood function and the second term $f_\Theta(\theta)$ is the prior distribution. The denominator $f_{\mathbf{X}}(\mathbf{x})$ is the joint density function for $n$ losses $\mathbf{x}=(x_1,\ldots,x_n)$.
Now we turn to the Bühlmann model. The Bühlmann credibility estimate for $\mathrm{E}_\Theta(\mu(\theta)|\mathbf{x})$ is $\hat{\mu}(\theta)=Z\bar{x}+(1-Z)\mu$. This model requires credibility $Z$ and collective mean $\mu$ which can be computed from the distributions used in the Bayesian model described above, if the distributions are known.
Example 12.3.1. Using $n$, conditional density function $f_{X|\Theta=\theta}(x)$, and prior distribution $f_\Theta(\theta)$, calculate credibility $Z$ and collective mean $\mu$ for the Bühlmann credibility estimate $\hat{\mu}(\theta)$.
12.3.1 Beta-Binomial Model
Section 9.3.1 of the chapter Bayesian Inference and Modeling analyzes the beta-binomial model.
The number of successes $x$ in $m$ Bernoulli trials with unknown probability of success $q$ is given by the binomial distribution
$$p_{X|Q=q}(x)=\binom{m}{x}q^{x}(1-q)^{m-x},\quad x\in\{0,1,\ldots,m\}.$$
The probability of success $q$ is modeled with the conjugate prior for the binomial distribution: the beta distribution with parameters $a$ and $b$. The pdf of the beta distribution is
$$f_Q(q)=\frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}q^{a-1}(1-q)^{b-1},\quad q\in[0,1].$$
Given $x$ successes in $m$ Bernoulli trials the posterior distribution for $q$ was shown in Section 9.3.1 to be
$$f_{Q|X=x}(q)=\frac{\Gamma(a+b+m)}{\Gamma(a+x)\Gamma(b+m-x)}q^{a+x-1}(1-q)^{b+m-x-1},$$
which is a beta distribution with parameters $a+x$ and $b+m-x$.
The mean for the beta distribution with parameters $a$ and $b$ is $\mathrm{E}(Q)=a/(a+b)$. Given $x$ successes in $m$ trials in the beta-binomial model the mean of the posterior distribution is
$$\mathrm{E}_Q(Q|x)=\frac{a+x}{a+b+m}.$$
The Bühlmann credibility estimate for EQ(Q|x) exactly matches the Bayesian estimate as demonstrated in the following example.
Example 12.3.2. The probability that a coin toss will yield heads is q. The prior distribution for probability q is beta with parameters a and b. On m tosses of the coin there were exactly x heads. Use Bühlmann credibility to estimate the expected value of q.
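The agreement claimed for Example 12.3.2 can be checked with exact arithmetic. The sketch treats each toss as one Bernoulli observation, so there are $m$ observations with sample mean $x/m$, and it uses the standard beta moments $\mathrm{E}(Q)=a/(a+b)$ and $\mathrm{E}(Q^2)=a(a+1)/[(a+b)(a+b+1)]$. The function names and the test values $a=2$, $b=3$, $m=10$, $x=4$ are hypothetical.

```python
from fractions import Fraction


def buhlmann_beta_binomial(a: int, b: int, m: int, x: int) -> Fraction:
    """Buhlmann estimate of q after x heads in m tosses with a beta(a, b) prior."""
    a_f, b_f = Fraction(a), Fraction(b)
    mean_q = a_f / (a_f + b_f)
    second_moment_q = a_f * (a_f + 1) / ((a_f + b_f) * (a_f + b_f + 1))
    epv = mean_q - second_moment_q       # E[q(1 - q)], variance of one toss
    vhm = second_moment_q - mean_q ** 2  # Var(q), variance of the toss mean
    k = epv / vhm                        # simplifies to a + b
    z = Fraction(m) / (m + k)
    return z * Fraction(x, m) + (1 - z) * mean_q


def bayes_posterior_mean(a: int, b: int, m: int, x: int) -> Fraction:
    """Mean of the beta(a + x, b + m - x) posterior."""
    return Fraction(a + x, a + b + m)


buhlmann = buhlmann_beta_binomial(2, 3, 10, 4)
bayes = bayes_posterior_mean(2, 3, 10, 4)
print(buhlmann, bayes)
```

For these inputs $K=a+b=5$ and $Z=m/(m+a+b)=2/3$, and the two estimates coincide.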
12.3.2 Gamma-Poisson Model
The chapter Bayesian Inference and Modeling also analyzes the gamma-Poisson conjugate family. The results are summarized below.
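Let $\mathbf{X}=(X_1,X_2,\ldots,X_n)$ be a sample of iid Poisson random variables with
$$p_{X_i|\Lambda=\lambda}(x_i)=\frac{\lambda^{x_i}e^{-\lambda}}{x_i!},\quad x_i\in\{0,1,2,\ldots\}.$$
Define the prior distribution for $\Lambda$ to be gamma with parameters $\alpha$ and $\theta$,
$$f_\Lambda(\lambda)=\frac{1}{\Gamma(\alpha)\theta^{\alpha}}\lambda^{\alpha-1}e^{-\lambda/\theta},\quad \lambda\in\mathbb{R}^+.$$
Given a sample of $n$ observations $\mathbf{x}=(x_1,x_2,\ldots,x_n)$, the posterior distribution of $\Lambda$ is
$$f_{\Lambda|\mathbf{X}=\mathbf{x}}(\lambda)=\frac{1}{\Gamma(\alpha+x)\left(\frac{\theta}{n\theta+1}\right)^{\alpha+x}}\lambda^{\alpha+x-1}e^{-\lambda(n\theta+1)/\theta},$$
where $x=\sum_{i=1}^{n}x_i$, which is a gamma distribution with parameters $\alpha+x$ and $\theta/(n\theta+1)$.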
We are going to make a minor change to the formulas above. Instead of a scale parameter $\theta$, we will substitute a rate parameter $\beta=1/\theta$. The posterior distribution becomes
$$f_{\Lambda|\mathbf{X}=\mathbf{x}}(\lambda)=\frac{(\beta+n)^{\alpha+x}}{\Gamma(\alpha+x)}\lambda^{\alpha+x-1}e^{-\lambda(\beta+n)}.$$
The posterior distribution is gamma and the expected value for $\Lambda$ given observations $\mathbf{x}$ is easy to calculate:
$$\mathrm{E}_\Lambda(\Lambda|x_1,\ldots,x_n)=\frac{\alpha+x}{\beta+n}.$$
Prior to collecting a sample, $\mathrm{E}(\Lambda)=\alpha/\beta$ using parameters from the prior distribution.
The Bühlmann credibility model will give the same result as seen in the following example.
Example 12.3.3. The number of claims $X$ each year for a risk has a Poisson distribution $p(x)=\lambda^{x}e^{-\lambda}/x!$. Each risk in a class has a constant risk parameter $\lambda$. Parameter $\lambda$ is gamma distributed across the class with pdf $f(\lambda)=\beta^{\alpha}\lambda^{\alpha-1}e^{-\lambda\beta}/\Gamma(\alpha)$. A risk was selected at random from the population and observed for $n$ years. The claims counts were $\mathbf{x}=(x_1,x_2,\ldots,x_n)$. Use Bühlmann credibility to calculate the expected value of $\lambda$ for the selected risk.
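The result of Example 12.3.3 can be checked with exact arithmetic. The sketch uses the gamma moments $\mathrm{E}(\Lambda)=\alpha/\beta$ and $\mathrm{Var}(\Lambda)=\alpha/\beta^2$ for the rate parameterization, and the fact that a Poisson variance equals its mean. The function names and the inputs $\alpha=3$, $\beta=2$, and the claim counts are hypothetical.

```python
from fractions import Fraction


def buhlmann_gamma_poisson(alpha: int, beta: int, claims: list) -> Fraction:
    """Buhlmann estimate of lambda with a gamma(alpha, rate beta) prior."""
    alpha_f, beta_f = Fraction(alpha), Fraction(beta)
    n = len(claims)
    mu = alpha_f / beta_f            # collective mean E[lambda]
    epv = alpha_f / beta_f           # E[Var(X | lambda)] = E[lambda]
    vhm = alpha_f / beta_f ** 2      # Var(E[X | lambda]) = Var(lambda)
    k = epv / vhm                    # simplifies to beta
    z = Fraction(n) / (n + k)
    x_bar = Fraction(sum(claims), n)
    return z * x_bar + (1 - z) * mu


def bayes_posterior_mean(alpha: int, beta: int, claims: list) -> Fraction:
    """Mean of the gamma(alpha + sum(x), rate beta + n) posterior."""
    return Fraction(alpha + sum(claims), beta + len(claims))


observed = [1, 0, 2, 1]  # hypothetical claim counts for n = 4 years
buhlmann = buhlmann_gamma_poisson(3, 2, observed)
bayes = bayes_posterior_mean(3, 2, observed)
print(buhlmann, bayes)
```

Here $K=\beta$ and $Z=n/(n+\beta)$, so the credibility-weighted estimate reduces algebraically to $(\alpha+\sum x_i)/(\beta+n)$.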
We will leave it to the reader to compare the Bayesian and Bühlmann models for the normal-normal conjugate family.
12.3.3 Exact Credibility
As demonstrated in the prior section, the Bühlmann credibility estimates for the beta-binomial and gamma-Poisson models exactly match the Bayesian analysis results. The term exact credibility is applied in these situations. Exact credibility may occur if the probability distribution for $X_j$ is in the linear exponential family and the prior distribution is a conjugate prior. Besides these two models, examples of exact credibility also include Gamma-Exponential and Normal-Normal models.
If the conditional mean $\mathrm{E}_\Theta(\mu(\theta)|X_1,\ldots,X_n)$ is linear in the mean of the observations, then the Bühlmann credibility estimate will coincide with the Bayesian estimate. More information about exact credibility can be found in (Bühlmann and Gisler 2005), (Klugman, Panjer, and Willmot 2012), and (Tse 2009).
12.4 Bühlmann-Straub Credibility
In this section, you learn how to:
- Compute a credibility-weighted estimate for the expected loss for a risk or group of risks using the Bühlmann-Straub model.
- Determine the credibility Z assigned to observations.
- Calculate required values including the Expected Value of the Process Variance (EPV), Variance of the Hypothetical Means (VHM) and collective mean μ.
- Recognize situations when the Bühlmann-Straub model is appropriate.
With standard Bühlmann credibility as described in the prior section, losses X1,…,Xn arising from a selected policyholder are assumed to be iid. If the subscripts indicate year 1, year 2 and so on up to year n, then the iid assumption means that the policyholder has the same exposure to loss every year. For commercial insurance this assumption is frequently violated.
Consider a commercial policyholder that uses a fleet of vehicles in its business. In year 1 there are $m_1$ vehicles in the fleet, $m_2$ vehicles in year 2, and so on up to $m_n$ vehicles in year $n$. The exposure to loss from ownership and use of this fleet is not constant from year to year. The annual losses for the fleet are not iid.
Define $Y_{jk}$ to be the loss for the $k$th vehicle in the fleet for year $j$. Then, the total losses for the fleet in year $j$ are $Y_{j1}+\cdots+Y_{jm_j}$ where we are adding up the losses for each of the $m_j$ vehicles. In the Bühlmann-Straub model it is assumed that random variables $Y_{jk}$ are iid across all vehicles and years for the policyholder. With this assumption the means $\mathrm{E}_Y(Y_{jk}|\theta)=\mu(\theta)$ and variances $\mathrm{Var}_Y(Y_{jk}|\theta)=\sigma^2(\theta)$ are the same for all vehicles and years. The quantity $\mu(\theta)$ is the expected loss and $\sigma^2(\theta)$ is the variance in the loss for one year for one vehicle for a policyholder with risk parameter $\theta$.
If $X_j$ is the average loss per unit of exposure in year $j$, $X_j=(Y_{j1}+\cdots+Y_{jm_j})/m_j$, then $\mathrm{E}_Y(X_j|\theta)=\mu(\theta)$ and $\mathrm{Var}_Y(X_j|\theta)=\sigma^2(\theta)/m_j$ for a policyholder with risk parameter $\theta$. Note that we used the fact that the $Y_{jk}$ are iid for a given policyholder. The average loss per vehicle for the entire $n$-year period is
$$\bar{X}=\frac{1}{m}\sum_{j=1}^{n}m_jX_j,\qquad m=\sum_{j=1}^{n}m_j.$$
It follows that $\mathrm{E}_Y(\bar{X}|\theta)=\mu(\theta)$ and $\mathrm{Var}_Y(\bar{X}|\theta)=\sigma^2(\theta)/m$ where $\mu(\theta)$ and $\sigma^2(\theta)$ are the mean and variance for a single vehicle for one year for the policyholder.
Example 12.4.1. Prove that $\mathrm{Var}_Y(\bar{X}|\theta)=\sigma^2(\theta)/m$ for a risk with risk parameter $\theta$.
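A sketch of one possible proof for Example 12.4.1 follows. It uses the model assumption that the $Y_{jk}$ are iid given $\theta$, so the yearly averages $X_j$ are conditionally independent with $\mathrm{Var}_Y(X_j|\theta)=\sigma^2(\theta)/m_j$.

$$
\begin{aligned}
\mathrm{Var}_Y(\bar{X}|\theta) &= \mathrm{Var}_Y\Big(\frac{1}{m}\sum_{j=1}^{n}m_jX_j \,\Big|\, \theta\Big) = \frac{1}{m^2}\sum_{j=1}^{n}m_j^2\,\mathrm{Var}_Y(X_j|\theta)\\
&= \frac{1}{m^2}\sum_{j=1}^{n}m_j^2\,\frac{\sigma^2(\theta)}{m_j} = \frac{\sigma^2(\theta)}{m^2}\sum_{j=1}^{n}m_j = \frac{\sigma^2(\theta)}{m}.
\end{aligned}
$$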
The Bühlmann-Straub credibility model extends the Bühlmann model to allow for varying exposure by year. Its estimate is
$$\hat{\mu}(\theta)=Z\bar{x}+(1-Z)\mu \tag{12.3}$$
with
- $\theta$ = a risk parameter that identifies a policyholder's risk level,
- $\hat{\mu}(\theta)$ = estimated expected loss for one exposure for the policyholder with loss experience $\bar{x}$,
- $\bar{x}=\frac{1}{m}\sum_{j=1}^{n}m_jx_j$, the average loss per exposure for $m$ exposures, where $x_j$ is the average loss per exposure and $m_j$ is the number of exposures in year $j$,
- $Z$ = credibility assigned to $m$ exposures,
- $\mu$ = expected loss for one exposure for a randomly chosen policyholder from the population.
Note that $\hat{\mu}(\theta)$ is the estimator for the expected loss for one exposure. If the policyholder has $m_j$ exposures then the expected loss is $m_j\hat{\mu}(\theta)$.
In Example 12.2.4, it was shown that $Z=\mathrm{Var}_\Theta(\mathrm{E}_X(\bar{X}|\theta))/\mathrm{Var}(\bar{X})$ where $\bar{X}$ is the average loss for $n$ observations. In equation (12.3) the $\bar{X}$ is the average loss for $m$ exposures and the same $Z$ formula can be used:
$$Z=\frac{\mathrm{Var}_\Theta(\mathrm{E}_Y(\bar{X}|\theta))}{\mathrm{Var}(\bar{X})}=\frac{\mathrm{Var}_\Theta(\mathrm{E}_Y(\bar{X}|\theta))}{\mathrm{E}_\Theta(\mathrm{Var}_Y(\bar{X}|\theta))+\mathrm{Var}_\Theta(\mathrm{E}_Y(\bar{X}|\theta))}.$$
(Note that $X_j$ is the average of the $Y_{jk}$’s for year $j$ and $\bar{X}$ is the average of all of the $Y_{jk}$’s.) The denominator was expanded using the law of total variance. As noted above $\mathrm{E}_Y(\bar{X}|\theta)=\mu(\theta)$ so $\mathrm{Var}_\Theta(\mathrm{E}_Y(\bar{X}|\theta))=\mathrm{Var}_\Theta(\mu(\theta))=\mathrm{VHM}$. Because $\mathrm{Var}_Y(\bar{X}|\theta)=\sigma^2(\theta)/m$ it follows that $\mathrm{E}_\Theta(\mathrm{Var}_Y(\bar{X}|\theta))=\mathrm{E}_\Theta(\sigma^2(\theta))/m=\mathrm{EPV}/m$. Making these substitutions and using a little algebra gives
$$Z=\frac{m}{m+K},\qquad K=\frac{\mathrm{EPV}}{\mathrm{VHM}}.$$
This is the same $Z$ as for Bühlmann credibility except number of exposures $m$ replaces number of years or observations $n$.
Example 12.4.2. A commercial automobile policyholder had the following exposures and claims over a three-year period:

| Year | Number of Vehicles | Number of Claims |
|---|---|---|
| 1 | 9 | 5 |
| 2 | 12 | 4 |
| 3 | 15 | 4 |

- The number of claims in a year for each vehicle in the policyholder’s fleet is Poisson distributed with the same mean (parameter) λ.
- Parameter λ is distributed among the policyholders in the population with pdf f(λ)=6λ(1−λ) with 0<λ<1.
The policyholder has 18 vehicles in its fleet in year 4. Use Bühlmann-Straub credibility to estimate the expected number of policyholder claims in year 4.
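One way to work Example 12.4.2 is sketched below with exact fractions. It relies on recognizing $f(\lambda)=6\lambda(1-\lambda)$ as a beta density with parameters $a=b=2$, whose mean is $a/(a+b)$ and variance is $ab/[(a+b)^2(a+b+1)]$, and on the Poisson variance equaling its mean. Variable names are illustrative.

```python
from fractions import Fraction

vehicles = [9, 12, 15]   # exposures m_j for years 1-3
claims = [5, 4, 4]       # claim counts for years 1-3

# Prior f(lambda) = 6 * lambda * (1 - lambda) on (0, 1) is a beta(2, 2) density.
a, b = Fraction(2), Fraction(2)
mu = a / (a + b)                                   # collective mean per vehicle
var_lambda = a * b / ((a + b) ** 2 * (a + b + 1))  # Var(lambda)

epv = mu           # Poisson: process variance per vehicle-year equals lambda
vhm = var_lambda
k = epv / vhm

m = sum(vehicles)                     # total exposures over three years
x_bar = Fraction(sum(claims), m)      # observed claims per vehicle-year
z = Fraction(m) / (m + k)

per_vehicle = z * x_bar + (1 - z) * mu
year4_claims = 18 * per_vehicle       # 18 vehicles in year 4
print(k, z, per_vehicle, float(year4_claims))
```

The credibility is driven by the 36 vehicle-years of exposure rather than by the three calendar years, which is the point of the Bühlmann-Straub extension.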
12.5 Estimating Credibility Parameters
In this section, you learn how to:
- Perform nonparametric estimation with the Bühlmann and Bühlmann-Straub credibility models.
- Identify situations when semiparametric estimation is appropriate.
- Use data to approximate the EPV and VHM.
The examples in this chapter have provided assumptions for calculating credibility parameters. In actual practice the actuary must use real world data and judgment to determine credibility parameters.
12.5.1 Nonparametric Estimation for Bühlmann and Bühlmann-Straub Models
Bayesian analysis as described previously requires assumptions about a prior distribution and likelihood. It is possible to produce estimates without these assumptions and these methods are often referred to as empirical Bayes methods. Bühlmann and Bühlmann-Straub credibility with parameters estimated from the data are included in the category of empirical Bayes methods.
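Bühlmann Model. First we will address the simpler Bühlmann model. Assume that there are $r$ risks in a population. For risk $i$ with risk parameter $\theta_i$ the losses for $n$ periods are $X_{i1},\ldots,X_{in}$. The losses for a given risk are iid across periods as assumed in the Bühlmann model. For risk $i$ the sample mean is $\bar{X}_i=\sum_{j=1}^{n}X_{ij}/n$ and the unbiased sample process variance is $s_i^2=\sum_{j=1}^{n}(X_{ij}-\bar{X}_i)^2/(n-1)$. An unbiased estimator for the EPV can be calculated by taking the average of $s_i^2$ for the $r$ risks in the population:
$$\widehat{\mathrm{EPV}}=\frac{1}{r}\sum_{i=1}^{r}s_i^2=\frac{1}{r(n-1)}\sum_{i=1}^{r}\sum_{j=1}^{n}(X_{ij}-\bar{X}_i)^2.$$
The individual risk means $\bar{X}_i$ for $i=1,\ldots,r$ can be used to estimate the VHM. An unbiased estimator of $\mathrm{Var}(\bar{X}_i)$ is
$$\widehat{\mathrm{Var}}(\bar{X}_i)=\frac{1}{r-1}\sum_{i=1}^{r}(\bar{X}_i-\bar{X})^2\quad\text{with}\quad\bar{X}=\frac{1}{r}\sum_{i=1}^{r}\bar{X}_i,$$
but $\mathrm{Var}(\bar{X}_i)$ is not the VHM. The total variance formula or unconditional variance formula is
$$\mathrm{Var}(\bar{X}_i)=\mathrm{E}_\Theta(\mathrm{Var}_X(\bar{X}_i|\theta_i))+\mathrm{Var}_\Theta(\mathrm{E}_X(\bar{X}_i|\theta_i)).$$
The VHM is the second term on the right because $\mu(\theta_i)=\mathrm{E}_X(\bar{X}_i|\theta_i)$ is the hypothetical mean for risk $i$. So,
$$\mathrm{VHM}=\mathrm{Var}(\bar{X}_i)-\mathrm{E}_\Theta(\mathrm{Var}_X(\bar{X}_i|\theta_i)).$$
As discussed previously in Section 12.2.2, $\mathrm{EPV}/n=\mathrm{E}_\Theta(\mathrm{Var}_X[\bar{X}_i|\theta_i])$ and using the above estimators gives an estimator for the VHM:
$$\widehat{\mathrm{VHM}}=\frac{1}{r-1}\sum_{i=1}^{r}(\bar{X}_i-\bar{X})^2-\frac{\widehat{\mathrm{EPV}}}{n}. \tag{12.6}$$
Although the expected loss for a risk with parameter $\theta_i$ is $\mu(\theta_i)=\mathrm{E}_X(\bar{X}_i|\theta_i)$, the variance of the sample mean $\bar{X}_i$ is greater than or equal to the variance of the hypothetical means: $\mathrm{Var}(\bar{X}_i)\ge\mathrm{Var}(\mu(\theta_i))$. The variance in the sample means $\mathrm{Var}(\bar{X}_i)$ includes both the variance in the hypothetical means and a process variance term.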
In some cases formula (12.6) can produce a negative value for $\widehat{\mathrm{VHM}}$ because of the subtraction of $\widehat{\mathrm{EPV}}/n$, but a variance cannot be negative. This happens when the process variance within risks is so large that it overwhelms the measurement of the variance in means between risks. In this case we cannot use this method to determine the values needed for Bühlmann credibility.
Example 12.5.1. Two policyholders had claims over a three-year period as shown in the table below. Estimate the expected number of claims for each policyholder using Bühlmann credibility and calculating necessary parameters from the data.

| Year | Risk A | Risk B |
|---|---|---|
| 1 | 0 | 2 |
| 2 | 1 | 1 |
| 3 | 0 | 2 |
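The nonparametric estimators can be sketched as a small function and applied to the data of Example 12.5.1. The function name is hypothetical, and the sketch assumes that every risk has the same number of periods and that the collective mean $\mu$ is estimated by the overall sample mean $\bar{X}$.

```python
from fractions import Fraction


def nonparametric_buhlmann(losses: dict):
    """Return (estimates, EPV, VHM, Z) from a dict of risk -> n observations."""
    r = len(losses)
    n = len(next(iter(losses.values())))
    means = {i: Fraction(sum(x), n) for i, x in losses.items()}
    overall = sum(means.values()) / r
    # EPV estimate: average of the unbiased sample variances of the r risks.
    epv = sum(
        (Fraction(v) - means[i]) ** 2 for i, x in losses.items() for v in x
    ) / (r * (n - 1))
    # VHM estimate: sample variance of the risk means minus EPV / n.
    vhm = sum((m - overall) ** 2 for m in means.values()) / (r - 1) - epv / n
    if vhm <= 0:
        raise ValueError("VHM estimate is not positive; method does not apply")
    z = n / (n + epv / vhm)
    estimates = {i: z * means[i] + (1 - z) * overall for i in losses}
    return estimates, epv, vhm, z


claims = {"A": [0, 1, 0], "B": [2, 1, 2]}
estimates, epv, vhm, z = nonparametric_buhlmann(claims)
print(epv, vhm, z, estimates)
```

Exact fractions are used so the intermediate quantities can be compared with a hand calculation. The function raises an error when the VHM estimate is not positive, which is the situation described just before this example.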
Example 12.5.2. Two policyholders had claims over a three-year period as shown in the table below. Calculate the nonparametric estimate for the VHM.
| Year | Risk A | Risk B |
|---|---|---|
| 1 | 3 | 3 |
| 2 | 0 | 0 |
| 3 | 0 | 3 |
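For Example 12.5.2 the same estimators can be evaluated directly. This is a minimal sketch with illustrative variable names; it shows what the formulas return for this data set.

```python
from fractions import Fraction

claims = {"A": [3, 0, 0], "B": [3, 0, 3]}
r = len(claims)
n = 3

means = {i: Fraction(sum(x), n) for i, x in claims.items()}
overall = sum(means.values()) / r

# Average of the unbiased within-risk sample variances.
epv_hat = sum(
    (Fraction(v) - means[i]) ** 2 for i, x in claims.items() for v in x
) / (r * (n - 1))

# Sample variance of the risk means minus EPV / n.
vhm_hat = sum((m - overall) ** 2 for m in means.values()) / (r - 1) - epv_hat / n
print(epv_hat, vhm_hat)
```

The resulting estimate is negative, so this data set illustrates the case where the process variance within risks swamps the variation between the risk means.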
Bühlmann-Straub Model. Empirical formulas for EPV and VHM in the Bühlmann-Straub model are more complicated because a risk’s number of exposures can change from one period to another. Also, the number of experience periods does not have to be constant across the population. First some definitions:
- $X_{ij}$ is the losses per exposure for risk $i$ in period $j$. Losses can refer to number of claims or amount of loss. There are $r$ risks so $i=1,\ldots,r$.
- $n_i$ is the number of observation periods for risk $i$.
- $m_{ij}$ is the number of exposures for risk $i$ in period $j$ for $j=1,\ldots,n_i$.
Risk $i$ with risk parameter $\theta_i$ has $m_{ij}$ exposures in period $j$ which means that the losses per exposure random variable can be written as $X_{ij}=(Y_{i1}+\cdots+Y_{im_{ij}})/m_{ij}$. Random variable $Y_{ik}$ is the loss for one exposure. For risk $i$ losses $Y_{ik}$ are iid with mean $\mathrm{E}_Y(Y_{ik}|\theta_i)=\mu(\theta_i)$ and process variance $\mathrm{Var}_Y(Y_{ik}|\theta_i)=\sigma^2(\theta_i)$. It follows that $\mathrm{Var}_Y(X_{ij}|\theta_i)=\sigma^2(\theta_i)/m_{ij}$.
Two more important definitions are:
- $\bar{X}_i=\frac{1}{m_i}\sum_{j=1}^{n_i}m_{ij}X_{ij}$ with $m_i=\sum_{j=1}^{n_i}m_{ij}$. $\bar{X}_i$ is the average loss per exposure for risk $i$ for all observation periods combined.
- $\bar{X}=\frac{1}{m}\sum_{i=1}^{r}m_i\bar{X}_i$ with $m=\sum_{i=1}^{r}m_i$. $\bar{X}$ is the average loss per exposure for all risks for all observation periods combined.
An unbiased estimator for the process variance $\sigma^2(\theta_i)$ of one exposure for risk $i$ is
$$s_i^2=\frac{\sum_{j=1}^{n_i}m_{ij}(X_{ij}-\bar{X}_i)^2}{n_i-1}.$$
The weights $m_{ij}$ are applied to the squared differences because the $X_{ij}$ are the averages of $m_{ij}$ exposures. The weighted average of the sample variances $s_i^2$ for each risk $i$ in the population, with weights proportional to $(n_i-1)$, one less than the number of observation periods, will produce the expected value of the process variance (EPV) estimate
$$\widehat{\mathrm{EPV}}=\frac{\sum_{i=1}^{r}(n_i-1)s_i^2}{\sum_{i=1}^{r}(n_i-1)}=\frac{\sum_{i=1}^{r}\sum_{j=1}^{n_i}m_{ij}(X_{ij}-\bar{X}_i)^2}{\sum_{i=1}^{r}(n_i-1)}.$$
The quantity $\widehat{\mathrm{EPV}}$ is an unbiased estimator for the expected value of the process variance of one exposure for a risk chosen at random from the population.
To calculate an estimator for the variance in the hypothetical means (VHM) the squared differences of the individual risk sample means $\bar{X}_i$ and population mean $\bar{X}$ are used. An unbiased estimator for the VHM is
$$\widehat{\mathrm{VHM}}=\frac{\sum_{i=1}^{r}m_i(\bar{X}_i-\bar{X})^2-(r-1)\widehat{\mathrm{EPV}}}{m-\frac{1}{m}\sum_{i=1}^{r}m_i^2}.$$
This complicated formula is necessary because of the varying number of exposures. Proofs that the EPV and VHM estimators shown above are unbiased can be found in several references mentioned at the end of this chapter including (Bühlmann and Gisler 2005), (Klugman, Panjer, and Willmot 2012), and (Tse 2009).
Example 12.5.3. Two policyholders had claims shown in the table below. Estimate the expected number of claims per vehicle for each policyholder using Bühlmann-Straub credibility and calculating parameters from the data.
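| Policyholder | | Year 1 | Year 2 | Year 3 | Year 4 |
|---|---|---|---|---|---|
| A | Number of claims | 0 | 2 | 2 | 3 |
| A | Insured vehicles | 1 | 2 | 2 | 2 |
| B | Number of claims | 0 | 0 | 1 | 2 |
| B | Insured vehicles | 0 | 2 | 3 | 4 |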
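The Bühlmann-Straub estimators can be sketched in code and applied to the data of Example 12.5.3. Two choices here are assumptions rather than part of the example statement: a period with zero exposures (policyholder B in year 1) is dropped, so $n_B=3$, and the collective mean $\mu$ is estimated by the exposure-weighted average $\bar{X}$. The function name is hypothetical.

```python
from fractions import Fraction


def buhlmann_straub(data: dict):
    """data maps risk -> list of (exposures, total losses) per period."""
    # Keep only periods with positive exposure (assumption stated above).
    obs = {
        i: [(Fraction(m), Fraction(c)) for m, c in rows if m > 0]
        for i, rows in data.items()
    }
    r = len(obs)
    m_i = {i: sum(m for m, _ in rows) for i, rows in obs.items()}
    xbar_i = {i: sum(c for _, c in rows) / m_i[i] for i, rows in obs.items()}
    m = sum(m_i.values())
    xbar = sum(m_i[i] * xbar_i[i] for i in obs) / m

    # EPV estimate: exposure-weighted squared deviations within each risk.
    within = sum(
        mm * (c / mm - xbar_i[i]) ** 2 for i, rows in obs.items() for mm, c in rows
    )
    epv = within / sum(len(rows) - 1 for rows in obs.values())

    # VHM estimate with the correction for varying exposures.
    between = sum(m_i[i] * (xbar_i[i] - xbar) ** 2 for i in obs)
    vhm = (between - (r - 1) * epv) / (m - sum(v ** 2 for v in m_i.values()) / m)
    if vhm <= 0:
        raise ValueError("VHM estimate is not positive; method does not apply")

    k = epv / vhm
    z = {i: m_i[i] / (m_i[i] + k) for i in obs}
    estimates = {i: z[i] * xbar_i[i] + (1 - z[i]) * xbar for i in obs}
    return estimates, epv, vhm, z


# (insured vehicles, number of claims) for years 1-4
fleet = {
    "A": [(1, 0), (2, 2), (2, 2), (2, 3)],
    "B": [(0, 0), (2, 0), (3, 1), (4, 2)],
}
estimates, epv, vhm, z = buhlmann_straub(fleet)
print(float(epv), float(vhm), {i: round(float(v), 4) for i, v in estimates.items()})
```

Other conventions for $\mu$ exist, for example a credibility-weighted average of the $\bar{X}_i$, and they would give slightly different estimates.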
12.5.2 Semiparametric Estimation for Bühlmann and Bühlmann-Straub Models
In the prior section on nonparametric estimation, there were no assumptions about the distribution of the losses per exposure $X_{ij}$. Assuming that the $X_{ij}$ have a particular distribution and using properties of the distribution along with the data to determine credibility parameters is referred to as semiparametric estimation.
An example of semiparametric estimation would be the assumption of a Poisson distribution when estimating claim frequencies. The Poisson distribution has the property that the mean and variance are identical and this property can simplify calculations. The following simple example comes from the prior section but now includes a Poisson assumption about claim frequencies.
Example 12.5.4. Two policyholders had claims over a three-year period as shown in the table below. Assume that the number of claims for each risk has a Poisson distribution. Estimate the expected number of claims for each policyholder using Bühlmann credibility and calculating necessary parameters from the data.

| Year | Risk A | Risk B |
|---|---|---|
| 1 | 0 | 2 |
| 2 | 1 | 1 |
| 3 | 0 | 2 |
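A sketch of the semiparametric calculation for Example 12.5.4 follows. Under the Poisson assumption the process variance of a risk equals its mean, so the EPV is estimated by the overall sample mean instead of the sample variances; the VHM estimator keeps the same form as in the nonparametric case. Variable names are illustrative.

```python
from fractions import Fraction

claims = {"A": [0, 1, 0], "B": [2, 1, 2]}
r = len(claims)
n = 3

means = {i: Fraction(sum(x), n) for i, x in claims.items()}
overall = sum(means.values()) / r

epv = overall  # Poisson: E[Var(X | theta)] = E[mu(theta)] = mu
vhm = sum((m - overall) ** 2 for m in means.values()) / (r - 1) - epv / n
z = n / (n + epv / vhm)

estimates = {i: z * means[i] + (1 - z) * overall for i in claims}
print(epv, vhm, z, estimates)
```

Compared with a purely nonparametric treatment of the same data, the larger EPV estimate lowers $Z$ and pulls both estimates closer to the overall mean.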
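A short Python sketch of the semiparametric calculation for Example 12.5.4 follows. It assumes the usual Bühlmann estimators with an equal number of years n for every risk: under the Poisson assumption the EPV is estimated by the overall sample mean, and the VHM by the sample variance of the risk means less \widehat{EPV}/n. The function name is illustrative.

```python
from statistics import mean, variance


def buhlmann_poisson(counts_by_risk):
    """Semiparametric Buhlmann estimates; counts_by_risk[i] holds the
    annual claim counts for risk i (same number of years for each risk)."""
    n = len(counts_by_risk[0])
    risk_means = [mean(counts) for counts in counts_by_risk]
    overall = mean(risk_means)
    epv = overall                           # Poisson: variance equals mean
    vhm = variance(risk_means) - epv / n    # sample variance uses r - 1
    k = epv / vhm
    z = n / (n + k)
    return z, [z * xb + (1 - z) * overall for xb in risk_means]


z, estimates = buhlmann_poisson([[0, 1, 0], [2, 1, 2]])
print(z, estimates)
```

With the data in the table this gives Z = 0.625 and estimates of roughly 0.58 claims per year for risk A and 1.42 for risk B.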
Although we assumed that the number of claims for each risk was Poisson distributed in the prior example, we did not need this additional assumption because there was enough information to use nonparametric estimation. In fact, the Poisson assumption might not be appropriate because for risk B the sample mean is not equal to the sample variance: \bar{x}_B=\frac{5}{3}\neq s^2_B=\frac{1}{3}.
The following example is commonly used to demonstrate a situation where semiparametric estimation is needed. There is insufficient information for nonparametric estimation but with the Poisson assumption, estimates can be calculated.
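Example 12.5.5. A portfolio of 2,000 policyholders generated the following claims profile during a five-year period:
\small{ \begin{array}{|c|c|} \hline \text{Number of Claims in 5 Years} & \text{Number of Policies} \\ \hline 0 & 923 \\ \hline 1 & 682 \\ \hline 2 & 249 \\ \hline 3 & 70 \\ \hline 4 & 51 \\ \hline 5 & 25 \\ \hline \end{array} }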
In your model you assume that the number of claims for each policyholder has a Poisson distribution and that a policyholder’s expected number of claims is constant through time. Use Bühlmann credibility to estimate the annual expected number of claims for policyholders with 3 claims during the five-year period.
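One way to carry out the calculation for Example 12.5.5 is sketched below in Python. The sketch works in annual units and treats each policy's five-year average as a sample mean of n = 5 annual observations. It assumes the Poisson property \widehat{EPV}=\hat{\mu} and estimates the VHM as the sample variance of the policy averages less \widehat{EPV}/n. The function and variable names are illustrative.

```python
def buhlmann_poisson_grouped(policy_counts, years):
    """policy_counts maps 'claims in the period' to 'number of policies'."""
    n_policies = sum(policy_counts.values())
    total_claims = sum(c * n for c, n in policy_counts.items())
    annual_mean = total_claims / (n_policies * years)
    epv = annual_mean                       # Poisson: variance equals mean
    # Sample variance of each policy's average annual claim count.
    var_of_means = sum(n * (c / years - annual_mean) ** 2
                       for c, n in policy_counts.items()) / (n_policies - 1)
    vhm = var_of_means - epv / years
    k = epv / vhm
    z = years / (years + k)

    def estimate(claims_observed):
        # Credibility-weighted annual claim frequency for one policyholder.
        return z * claims_observed / years + (1 - z) * annual_mean

    return z, estimate


profile = {0: 923, 1: 682, 2: 249, 3: 70, 4: 51, 5: 25}
z, estimate = buhlmann_poisson_grouped(profile, years=5)
annual_estimate = estimate(3)
print(z, annual_estimate)
```

For a policyholder with 3 claims in five years the sketch gives a credibility of about 0.24 and an annual expected claim count of about 0.28, compared with the portfolio average of about 0.17.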
12.6 Limited Fluctuation Credibility
In this section, you learn how to:
- Calculate full credibility standards for number of claims, average size of claims, and aggregate losses.
- Explain how the relationship between means and variances of underlying distributions affects full credibility standards.
- Determine the credibility weight Z using the square-root partial credibility formula.
Limited fluctuation credibility, also called “classical credibility” and “American credibility,” was given this name because the method explicitly attempts to limit fluctuations in estimates for claim frequencies, severities, or losses. For example, suppose that you want to estimate the expected number of claims N for a group of risks in an insurance rating class. How many risks are needed in the class to ensure that a specified level of accuracy is attained in the estimate? First the question will be considered from the perspective of how many claims are needed.
12.6.1 Full Credibility for Claim Frequency
Let N be a random variable representing the number of claims for a group of risks, for example, risks within a particular rating classification. The observed number of claims will be used to estimate \mu_N=\mathrm{E}[N], the expected number of claims. How big does \mu_N need to be to get a good estimate? One way to quantify the accuracy of the estimate would be with a statement like: “The observed value of N should be within 5\% of \mu_N at least 90\% of the time.” Writing this as a mathematical expression would give \Pr[0.95 \mu_N \leq N \leq 1.05 \mu_N] \geq 0.90. Generalizing this statement by letting the range parameter k replace 5\% and probability level p replace 0.90 gives the equation \begin{equation} \Pr[(1-k) \mu_N \leq N \leq (1+k) \mu_N] \geq p . \tag{12.7} \end{equation} The expected number of claims required for the probability on the left-hand side of (12.7) to equal p is called the full credibility standard, the threshold of experience necessary to assign 100\% credibility to the insured’s own experience.
If the expected number of claims is greater than or equal to the full credibility standard then full credibility can be assigned to the data so Z=1. Usually the expected value \mu_N is not known so full credibility will be assigned to the data if the actual observed number of claims n is greater than or equal to the full credibility standard. The k and p values must be selected and the actuary may rely on experience, judgment, and other factors in making the choices.
Subtracting \mu_N from each term in (12.7) and dividing by the standard deviation \sigma_N of N gives \begin{equation} \Pr\left[\frac{-k\mu_N}{\sigma_N}\leq \frac{N-\mu_N}{\sigma_N} \leq \frac{k\mu_N}{\sigma_N}\right] \geq p. \tag{12.8} \end{equation} In limited fluctuation credibility the standard normal distribution is used to approximate the distribution of (N-\mu_N)/\sigma_N. If N is the sum of many claims from a large group of similar risks and the claims are independent, then the approximation may be reasonable.
Let y_p be the value such that \Pr[-y_p\leq \frac{N-\mu_N}{\sigma_N} \leq y_p]=\Phi(y_p)-\Phi(-y_p)=p where \Phi( ) is the cumulative distribution function of the standard normal distribution (mean 0 and standard deviation 1). Because \Phi(-y_p)=1-\Phi(y_p), the equality can be rewritten as 2\Phi(y_p)-1=p. Solving for y_p gives y_p=\Phi^{-1}((p+1)/2) where \Phi^{-1}( ) is the inverse of \Phi( ).
Equation (12.8) will be satisfied if k\mu_N/\sigma_N \geq y_p assuming the normal approximation. First we will consider this inequality for the case when N has a Poisson distribution: \Pr[N=n] = \lambda^n\textrm{e}^{-\lambda}/n!. Because \lambda=\mu_N=\sigma_N^2 for the Poisson, taking square roots yields \mu_N^{1/2}=\sigma_N. So, k\mu_N/\mu_N^{1/2} \geq y_p which is equivalent to \mu_N \geq (y_p/k)^2. Let’s define \lambda_{kp} to be the value of \mu_N for which equality holds. Then the full credibility standard for the Poisson distribution is \begin{equation} \lambda_{kp} = \left(\frac{y_p}{k}\right)^2 \textrm{with } y_p=\Phi^{-1}((p+1)/2). \tag{12.9} \end{equation} If the expected number of claims \mu_N is greater than or equal to \lambda_{kp} then equation (12.7) is assumed to hold and full credibility can be assigned to the data. As noted previously, because \mu_N is usually unknown, full credibility is given if the observed number of claims n satisfies n \geq \lambda_{kp}.
Example 12.6.1. The full credibility standard is set so that the observed number of claims is to be within 5% of the expected value with probability p=0.95. If the number of claims has a Poisson distribution find the number of claims needed for full credibility.
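Equation (12.9) is easy to evaluate numerically. The Python sketch below uses `statistics.NormalDist` from the standard library for \Phi^{-1}; the function name `full_credibility_poisson` is illustrative.

```python
from statistics import NormalDist


def full_credibility_poisson(k, p):
    """Full credibility standard (y_p / k)^2 for Poisson claim counts."""
    y_p = NormalDist().inv_cdf((p + 1) / 2)   # two-sided normal quantile
    return (y_p / k) ** 2


lambda_kp = full_credibility_poisson(k=0.05, p=0.95)
print(round(lambda_kp, 1))
```

For k = 0.05 and p = 0.95 the standard is about 1,537 claims. Rounding y_p to 1.96 before squaring changes the result only slightly.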
If claims are not Poisson distributed then equation (12.8) does not imply (12.9). Setting the upper bound of (N-\mu_N)/\sigma_N in (12.8) equal to y_p gives k\mu_N/\sigma_N=y_p. Squaring both sides and moving everything to the right side except for one of the \mu_N’s gives \mu_N=(y_p/k)^2(\sigma_N^2/\mu_N). This is the full credibility standard for frequency and will be denoted by n_f, \begin{equation} n_f=\left(\frac{y_p}{k}\right)^2\left(\frac{\sigma_N^2}{\mu_N}\right)=\lambda_{kp}\left(\frac{\sigma_N^2}{\mu_N}\right). \tag{12.10} \end{equation} This is the same equation as the Poisson full credibility standard except for the (\sigma_N^2/\mu_N) multiplier. When the claims distribution is Poisson this extra term is one because the variance equals the mean.
Example 12.6.2. The full credibility standard is set so that the observed number of claims is to be within 5\% of the expected value with probability p=0.95. The number of claims has a negative binomial distribution, \Pr(N=x)={x+r-1\choose x} \left(\frac{1}{1+\beta}\right)^r \left(\frac{\beta}{1+\beta}\right)^x , with \beta=1. Calculate the full credibility standard.
We see that the negative binomial distribution with (\sigma_N^2/\mu_N)>1 requires more claims for full credibility than a Poisson distribution for the same k and p values. The next example shows that a binomial distribution which has (\sigma_N^2/\mu_N)<1 will need fewer claims for full credibility.
Example 12.6.3. The full credibility standard is set so that the observed number of claims is to be within 5\% of the expected value with probability p=0.95. The number of claims has a binomial distribution \Pr(N=x)={m\choose x}q^x(1-q)^{m-x}. Calculate the full credibility standard for q=1/4.
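The effect of the variance-to-mean multiplier in (12.10) can be seen in a short Python sketch covering the negative binomial and binomial examples. It relies on the standard moments of these distributions: for the negative binomial in the parameterization above, \sigma_N^2/\mu_N=1+\beta, and for the binomial, \sigma_N^2/\mu_N=1-q. The function name is illustrative.

```python
from statistics import NormalDist


def full_credibility_frequency(k, p, variance_to_mean):
    """Full credibility standard n_f = (y_p / k)^2 * (variance / mean)."""
    y_p = NormalDist().inv_cdf((p + 1) / 2)
    return (y_p / k) ** 2 * variance_to_mean


beta = 1.0   # negative binomial: variance / mean = 1 + beta
n_f_negbin = full_credibility_frequency(0.05, 0.95, 1 + beta)

q = 0.25     # binomial: variance / mean = 1 - q
n_f_binom = full_credibility_frequency(0.05, 0.95, 1 - q)

print(round(n_f_negbin), round(n_f_binom))
```

The negative binomial standard is twice the Poisson standard (about 3,073 claims), while the binomial standard is three quarters of it (about 1,152 claims).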
Rather than using expected number of claims to define the full credibility standard, the number of exposures can be used for the full credibility standard. An exposure is a measure of risk. For example, one car insured for a full year would be one car-year. Two cars each insured for exactly one-half year would also result in one car-year. Car-years attempt to quantify exposure to loss. Two car-years would be expected to generate twice as many claims as one car-year if the vehicles have the same risk of loss. To translate a full credibility standard denominated in terms of number of claims to a full credibility standard denominated in exposures one needs a reasonable estimate of the expected number of claims per exposure.
Example 12.6.4. The full credibility standard should be selected so that the observed number of claims will be within 5\% of the expected value with probability p=0.95. The number of claims has a Poisson distribution. If one exposure is expected to have about 0.20 claims per year, find the number of exposures needed for full credibility.
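Converting a claims-based standard into an exposure-based one is a single division, as the following Python sketch for Example 12.6.4 shows. The function name and argument names are illustrative.

```python
from statistics import NormalDist


def full_credibility_exposures(k, p, claims_per_exposure):
    """Exposures needed so that expected claims reach the Poisson standard."""
    y_p = NormalDist().inv_cdf((p + 1) / 2)
    claims_needed = (y_p / k) ** 2
    return claims_needed / claims_per_exposure


exposures_needed = full_credibility_exposures(0.05, 0.95, claims_per_exposure=0.20)
print(round(exposures_needed))
```

With 0.20 expected claims per exposure, roughly 7,700 exposures are needed. The result is only as reliable as the assumed frequency per exposure.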
Frequency can be defined as the number of claims per exposure. Let m denote the number of exposures. If the observed claim frequency N/m is used to estimate \mathrm{E}(N/m), the accuracy requirement becomes
\Pr[(1-k)\mathrm{E}(N/m)\leq N/m \leq(1+k)\mathrm{E}(N/m)] \geq p.
Because the number of exposures is not a random variable, \mathrm{E}(N/m)=\mathrm{E}(N)/m=\mu_N/m and the prior equation becomes
\Pr\left[(1-k)\frac{\mu_N}{m}\leq \frac{N}{m} \leq(1+k)\frac{\mu_N}{m}\right] \geq p.
Multiplying through by m results in equation (12.7) at the beginning of the section. The full credibility standards that were developed for estimating expected number of claims also apply to frequency.
12.6.3 Full Credibility for Severity
Let X be a random variable representing the size of one claim. Claim severity is \mu_X=\mathrm{E}(X). Suppose that \{X_1,X_2, \ldots, X_n\} is a random sample of n claims that will be used to estimate claim severity \mu_X. The claims are assumed to be iid. The average value of the sample is \bar{X}=\frac{1}{n}\left(X_1+X_2+\cdots+X_n\right). How big does n need to be to get a good estimate? Note that n is not a random variable here, whereas it is in the aggregate loss model.
In Section 12.6.1 the accuracy of an estimator for frequency was defined by requiring that the number of claims lie within a specified interval about the mean number of claims with a specified probability. For severity this requirement is \Pr[(1-k)\mu_X\leq \bar{X} \leq(1+k)\mu_X ]\geq p , where k and p need to be specified. Following the steps in Section 12.6.1, the mean claim severity \mu_X is subtracted from each term and the standard deviation of the claim severity estimator \sigma_{\bar{X}} is divided into each term yielding \Pr\left[\frac{-k~\mu_X}{\sigma_{\bar{X}}}\leq (\bar{X}-\mu_X)/\sigma_{\bar{X}} \leq \frac{k~\mu_X}{\sigma_{\bar{X}}}\right] \geq p . As in prior sections, it is assumed that (\bar{X}-\mu_X)/\sigma_{\bar{X}} is approximately normally distributed and the prior equation is satisfied if k\mu_X/\sigma_{\bar{X}}\geq y_p with y_p=\Phi^{-1}((p+1)/2). Because \bar{X} is the average of individual claims X_1, X_2,\dots, X_n, its standard deviation is equal to the standard deviation of an individual claim divided by \sqrt{n}: \sigma_{\bar{X}}=\sigma_X/\sqrt{n}. So, k\mu_X/(\sigma_X/\sqrt{n})\geq y_p and with a little algebra this can be rewritten as n \geq (y_p/k)^2(\sigma_X/\mu_X)^2. The full credibility standard for severity is \begin{equation} n_X=\left(\frac{y_p}{k}\right)^2\left(\frac{\sigma_X}{\mu_X}\right)^2=\lambda_{kp}\left(\frac{\sigma_X}{\mu_X}\right)^2. \tag{12.12} \end{equation} Note that the term \sigma_X/\mu_X is the coefficient of variation for an individual claim, that is, the standard deviation divided by the mean, which measures variability in units of the mean. Even though \lambda_{kp} is the full credibility standard for frequency given a Poisson distribution, there is no assumption about the distribution for the number of claims.
Example 12.6.6. Individual loss amounts are independently and identically distributed with a Type II Pareto distribution F(x)=1-[\theta/(x+\theta)]^{\alpha}. How many claims are required for the average severity of observed claims to be within 5\% of the expected severity with probability p=0.95?
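The severity standard (12.12) can be sketched in Python for the Pareto case. The sketch uses the standard Type II Pareto moments, which give a squared coefficient of variation (\sigma_X/\mu_X)^2=\alpha/(\alpha-2) when \alpha>2, independent of \theta. Example 12.6.6 leaves \alpha unspecified, so the value \alpha=3 below is purely illustrative, as are the function names.

```python
from statistics import NormalDist


def pareto_cv_squared(alpha):
    """Squared coefficient of variation of a Type II Pareto; needs alpha > 2."""
    if alpha <= 2:
        raise ValueError("the variance is not finite unless alpha > 2")
    return alpha / (alpha - 2)


def full_credibility_severity(k, p, cv_squared):
    """Full credibility standard n_X = (y_p / k)^2 * (sigma_X / mu_X)^2."""
    y_p = NormalDist().inv_cdf((p + 1) / 2)
    return (y_p / k) ** 2 * cv_squared


alpha = 3.0   # illustrative value only; not specified in the example
n_X = full_credibility_severity(0.05, 0.95, pareto_cv_squared(alpha))
print(round(n_X))
```

Because the multiplier \alpha/(\alpha-2) grows without bound as \alpha approaches 2, heavier-tailed severity distributions need many more claims for full credibility.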
12.6.4 Partial Credibility
In prior sections full credibility standards were calculated for estimating frequency (n_f), pure premium (n_{PP}), and severity (n_X). In this section these full credibility standards will be denoted by n_{0}. In each case the full credibility standard was the expected number of claims required to achieve a defined level of accuracy when using empirical data to estimate an expected value. If the observed number of claims is greater than or equal to the full credibility standard then a full credibility weight Z=1 is given to the data.
In limited fluctuation credibility, credibility weights Z assigned to data are Z= \left\{ \begin{array}{ll} \sqrt{n /n_{0}} &\textrm{if } n < n_{0} \\ 1 & \textrm{if } n \ge n_{0} , \end{array} \right.
where n_0 is the full credibility standard. The quantity n is the number of claims for the data that is used to estimate the expected frequency, severity, or pure premium.
Example 12.6.7. The number of claims has a Poisson distribution. Individual loss amounts are independently and identically distributed with a Type II Pareto distribution F(x)=1-[\theta/(x+\theta)]^{\alpha}. Assume that \alpha=3. The number of claims and loss amounts are independent. The full credibility standard is that the observed pure premium should be within 5\% of the expected value with probability p=0.95. What credibility Z is assigned to a pure premium computed from 1,000 claims?
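The square-root rule applied to Example 12.6.7 can be sketched in Python as follows. The sketch assumes the pure premium standard has the form n_{PP}=\lambda_{kp}\left[\sigma_N^2/\mu_N+(\sigma_X/\mu_X)^2\right], with \sigma_N^2/\mu_N=1 for Poisson counts and (\sigma_X/\mu_X)^2=\alpha/(\alpha-2) for the Type II Pareto. The function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def partial_credibility(n, n0):
    """Square-root rule: Z = sqrt(n / n0), capped at 1."""
    return min(1.0, sqrt(n / n0))


y_p = NormalDist().inv_cdf((0.95 + 1) / 2)
lambda_kp = (y_p / 0.05) ** 2

alpha = 3.0
cv_squared = alpha / (alpha - 2)           # Pareto severity
n_pp = lambda_kp * (1.0 + cv_squared)      # Poisson frequency: variance / mean = 1

Z = partial_credibility(1000, n_pp)
print(round(n_pp), round(Z, 3))
```

Under these assumptions the full credibility standard is about 6,146 claims, so 1,000 claims receive a credibility of about 0.40.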
Limited fluctuation credibility uses the formula Z=\sqrt{n/n_0} to limit the fluctuation in the credibility-weighted estimate to match the fluctuation allowed for data with expected claims at the full credibility standard. Variance or standard deviation is used as the measure of fluctuation. Next we show an example to explain why the square-root formula is used.
Suppose that average claim severity is being estimated from a sample of size n that is less than the full credibility standard n_0=n_X. Applying credibility theory, the estimate \hat{\mu}_X would be \hat{\mu}_X=Z\bar{X}+(1-Z)M_X , with \bar{X}=(X_1+X_2+\cdots+X_n)/n and iid random variables X_i representing the sizes of individual claims. The complement of credibility is applied to M_X which could be last year’s estimated average severity adjusted for inflation, the average severity for a much larger pool of risks, or some other relevant quantity selected by the actuary. It is assumed that the variance of M_X is zero or negligible. With this assumption \mathrm{Var}(\hat{\mu}_X)=\mathrm{Var}(Z\bar{X})=Z^2\mathrm{Var}(\bar{X})=\frac{n}{n_0}\mathrm{Var}(\bar{X}). Because \bar{X}=(X_1+X_2+\cdots+X_n)/n it follows that \mathrm{Var}(\bar{X})=\mathrm{Var}(X_i)/n where random variable X_i is one claim. So, \mathrm{Var}(\hat{\mu}_X)=\frac{n}{n_0}\mathrm{Var}(\bar{X})=\frac{n}{n_0}\frac{\mathrm{Var}(X_i)}{n}=\frac{\mathrm{Var}(X_i)}{n_0}. The last term is exactly the variance of a sample mean \bar{X} when the sample size is equal to the full credibility standard n_0=n_X.
12.6.5 Full Credibility Standard for Limited Fluctuation Credibility
Limited-fluctuation credibility requires a full credibility standard. The general formula for aggregate losses or pure premium, as obtained in formula (12.11), is n_S=\left(\frac{y_p}{k}\right)^2\left[\left(\frac{\sigma_N^2}{\mu_N}\right)+\left(\frac{\sigma_X}{\mu_X}\right)^2\right] , with N representing number of claims and X the size of claims. If one assumes \sigma_X=0 then the full credibility standard for frequency results. If \sigma_N=0 then the full credibility formula for severity follows. Probability p and k value are often selected using judgment and experience.
In practice it is often assumed that the number of claims is Poisson distributed so that \sigma_N^2/\mu_N=1. In this case the formula can be simplified to \begin{equation*} n_S=\left(\frac{y_p}{k}\right)^2\left[\frac{\mathrm{E}(X^2)}{(\mathrm{E}(X))^2}\right]. \end{equation*} An empirical mean and second moment for the sizes of individual claim losses can be computed from past data, if available.
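When claim sizes are available, the simplified formula can be evaluated directly from empirical moments. The Python sketch below does this for a small set of claim amounts that are entirely hypothetical and serve only to show the mechanics; the function name is also illustrative.

```python
from statistics import NormalDist, mean


def full_credibility_aggregate_poisson(claim_sizes, k, p):
    """n_S = (y_p / k)^2 * E(X^2) / E(X)^2 using empirical moments."""
    y_p = NormalDist().inv_cdf((p + 1) / 2)
    first_moment = mean(claim_sizes)
    second_moment = mean(x * x for x in claim_sizes)
    return (y_p / k) ** 2 * second_moment / first_moment ** 2


# Hypothetical claim amounts, not taken from the text.
claims = [1200.0, 350.0, 8700.0, 560.0, 2300.0, 410.0, 15000.0, 980.0]
n_S = full_credibility_aggregate_poisson(claims, k=0.05, p=0.95)
print(round(n_S))
```

Because \mathrm{E}(X^2)\geq(\mathrm{E}(X))^2, the result is never below the Poisson frequency standard \lambda_{kp}. In practice a much larger sample would be needed for the empirical second moment to be stable.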
12.7 Balancing Credibility Estimators
The credibility weighted model \hat{\mu}(\theta_i)=Z_i\bar{X}_i+(1-Z_i)\bar{X}, where \bar{X}_i is the loss per exposure for risk i and \bar{X} is loss per exposure for the population, can be used to estimate the expected loss for risk i. The overall mean is \bar{X}=\sum_{i=1}^r(m_i/m) \bar{X}_i where m_i and m are number of exposures for risk i and population, respectively.
For the credibility weighted estimators to be in balance we want \bar{X}=\sum_{i=1}^r(m_i/m) \bar{X}_i=\sum_{i=1}^r(m_i/m) \hat{\mu}(\theta_i). If this equation is satisfied then the estimated losses for each risk will add up to the population total, an important goal in ratemaking, but this may not happen if the complement of credibility is applied to \bar{X}.
To achieve balance, we will set \hat{M}_X as the amount that is applied to the complement of credibility and thus analyze the following equation: \sum_{i=1}^r(m_i/m) \bar{X}_i=\sum_{i=1}^r(m_i/m) \left\{Z_i\bar{X}_i+(1-Z_i) \cdot \hat{M}_X\right\} . A little algebra gives \sum_{i=1}^r m_i \bar{X}_i=\sum_{i=1}^r m_i Z_i\bar{X}_i + \hat{M}_X\sum_{i=1}^r m_i(1-Z_i), and \hat{M}_X=\frac{\sum_{i=1}^r m_i(1-Z_i)\bar{X}_i}{\sum_{i=1}^r m_i(1-Z_i)}. Using this value for \hat{M}_X will bring the credibility weighted estimators into balance.
If credibilities Z_i were computed using the Bühlmann-Straub model, then Z_i=m_i/(m_i+K). The prior formula can be simplified using the following relationship m_i(1-Z_i)=m_i\left(1-\frac{m_i}{m_i+K}\right)=m_i\left(\frac{(m_i+K)-m_i}{m_i+K}\right)=KZ_i . Therefore, an amount that, when applied to the complement of credibility, will bring the credibility-weighted estimators into balance with the overall mean loss per exposure is \hat{M}_X=\frac{\sum_{i=1}^r Z_i \bar{X}_i}{\sum_{i=1}^r Z_i}.
Example 12.7.1. An example from the nonparametric Bühlmann-Straub section had the following data for two risks. Find an amount for the complement of credibility \hat{M}_X that will produce credibility-weighted estimates that are in balance. \small{ \begin{array}{|c|c|c|c|c|c|} \hline \text{Policyholder} & & \text{Year 1} & \text{Year 2} & \text{Year 3} & \text{Year 4} \\ \hline \text{A} & \text{Number of claims} & 0 & 2 & 2 & 3 \\ \hline \text{A} & \text{Insured vehicles} & 1 & 2 & 2 & 2\\ \hline & & & & & \\ \hline \text{B} & \text{Number of claims} & 0 & 0 & 1 & 2\\ \hline \text{B} & \text{Insured vehicles} & 0 & 2 & 3 & 4\\ \hline \end{array} }
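The balancing calculation for Example 12.7.1 can be sketched in Python. The sketch takes the exposures and sample means from the table (dropping policyholder B's zero-exposure year) and assumes a value K \approx 2.0874, which is what the nonparametric Bühlmann-Straub estimators appear to give for this data. The balance property itself does not depend on the value of K. Function and variable names are illustrative.

```python
def balanced_complement(z, xbar):
    """M_X = sum(Z_i * Xbar_i) / sum(Z_i) for Buhlmann-Straub credibilities."""
    return sum(zi * xi for zi, xi in zip(z, xbar)) / sum(z)


m = [7, 9]              # insured vehicle-years for A and B
xbar = [7 / 7, 3 / 9]   # claims per vehicle for A and B
K = 2.0874              # assumed value of EPV / VHM for this data

z = [mi / (mi + K) for mi in m]
M_X = balanced_complement(z, xbar)
estimates = [zi * xi + (1 - zi) * M_X for zi, xi in zip(z, xbar)]

# Exposure-weighted averages of the raw means and of the balanced estimates.
overall_mean = sum(mi * xi for mi, xi in zip(m, xbar)) / sum(m)
balanced_mean = sum(mi * ei for mi, ei in zip(m, estimates)) / sum(m)
print(M_X, estimates, overall_mean, balanced_mean)
```

With these inputs \hat{M}_X is about 0.658, slightly above the overall mean of 0.625, and the exposure-weighted average of the two credibility estimates reproduces the overall mean.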
12.8 Further Resources and Contributors
Contributor
- Gary Dean, Ball State University is the author of the initial version of this chapter. Email: cgdean@bsu.edu for chapter comments and suggested improvements.
- Chapter reviewers include: Liang (Jason) Hong, Ambrose Lo, Ranee Thiagarajah, Hongjuan Zhou.