Chapter 2 Frequency Modeling

Chapter Description

A primary focus for insurers is estimating the magnitude of the aggregate claims they must bear under their insurance contracts. Aggregate claims are affected by both the frequency and the severity of the insured event. Decomposing aggregate claims into these two components, each of which warrants significant attention, is essential for analysis and pricing. This chapter discusses frequency distributions, summary measures, and parameter estimation techniques.

2.1 Basic Frequency Distributions


In this section, you learn how to:

  • Determine quantities that summarize a distribution, such as the (cumulative) distribution function and moments such as the mean and variance.
  • Define and compute the moment and probability generating functions.
  • Describe and understand relationships among three important frequency distributions: the binomial, Poisson, and negative binomial distributions.

Video: Basic Frequency Distributions

Overheads: Basic Frequency Distributions (Click Tab to View)


2.1.1 Exercise. Representing the Number of Cyber Events with a Binomial Distribution

Assignment Text

Cyber risk for a firm is based on its liability for a data breach involving sensitive customer information, such as Social Security numbers, credit card numbers, account numbers, driver’s license numbers and health records. A company models its cyber risk using the following assumptions:

  1. In any calendar quarter, there can be at most one cyber event.
  2. In any calendar quarter, the probability of a cyber event is 0.1.
  3. The numbers of cyber events in different calendar quarters are mutually independent.

Based on these assumptions, you represent the total number of cyber events using a binomial distribution.

Instructions

  • Identify the binomial distribution parameters for the number of cyber events in a 12-quarter (3-year) period.
  • Calculate the probability that there are \(k\) cyber events for \(k = 0, 1, \ldots, 12\) using the function dbinom().
  • Create a data frame to present your results. All values within a specific column should be rounded to the same number of decimal places. Display the data frame.
  • Graph the probability mass function of the number of cyber events using the function barplot(). Include a descriptive title and axis labels for the graph.
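
The steps above can be sketched as follows. This is a minimal illustration rather than the course's reference solution; the object names `size`, `cyber_prob`, `outcomes`, `pmf`, and `pmf_table` are chosen here for readability and are assumptions, as is the plot title.

```r
# Binomial model: 12 independent quarters, each with event probability 0.1
size <- 12
cyber_prob <- 0.1

# Probability of k cyber events for k = 0, 1, ..., 12
outcomes <- 0:size
pmf <- dbinom(x = outcomes, size = size, prob = cyber_prob)

# Present the results in a data frame, rounding the probability column consistently
pmf_table <- data.frame(events = outcomes, probability = round(pmf, digits = 6))
print(pmf_table)

# Bar plot of the probability mass function
barplot(pmf, names.arg = outcomes, col = "lightgreen",
        main = "Number of Cyber Events in 12 Quarters",
        xlab = "Cyber Events", ylab = "Probability")
```

Using `data.frame()` keeps the outcomes and probabilities in labeled columns, which makes per-column rounding straightforward.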



2.1.2 Exercise. Representing the Number of Cyber Events with a Poisson Distribution

Assignment Text

Another company is also concerned with cyber risk. Compared to the company in the prior exercise, this company is larger and does not wish to assume at most one cyber event in a quarter. Moreover, it believes that the distribution of cyber events depends on the size of its technical support staff, which has increased over time. Thus, it wishes to model the number of cyber events in each quarter with a Poisson distribution, with the expected number of events given by:

\[ {\small \begin{array}{l|cccccc} \hline \text{Quarter} & 1 & 2 & 3& 4& 5& 6 \\\hline \text{Expected Number}& 0.1 & 0.1 & 0.1 & 0.1 & 0.2 & 0.2 \\ \hline \text{Quarter} & 7 & 8 & 9& 10& 11& 12 \\ \hline \text{Expected Number}& 0.2 & 0.2 & 0.3 & 0.3 & 0.4 & 0.5 \\ \hline \end{array} } \]

Assuming that the numbers of cyber events in different calendar quarters are mutually independent, the total number of cyber events over the three-year period (12 quarters) has a Poisson distribution with expected number \(\lambda = 2.7\), the sum of the twelve quarterly expected numbers. (Recall that the sum of independent Poisson random variables has a Poisson distribution.)

Instructions

  • Graph the probability mass function (pmf) of the number of cyber events using the function barplot().
  • Calculate the pmf and the cumulative probability distribution function for \(k = 0, 1, \ldots, 12\) cyber events using the functions dpois(), ppois(). Create a data frame to present your results and display the data frame.
  • From your data frame, identify the 95th percentile. Confirm your result using the qpois() function.
  • How are the probabilities changing over time? Plot the probability of zero cyber events versus quarter number \(k = 1, 2, \ldots, 12\).
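
One way to work through these steps is sketched below. The object names (`lambda_vec`, `cyber_lambda`, `cyber_pmf`, `cyber_cdf`, `cyber_table`, and so on) are illustrative assumptions, not names required by the exercise.

```r
# Quarterly expected numbers of events, taken from the table above
lambda_vec <- c(rep(0.1, 4), rep(0.2, 4), 0.3, 0.3, 0.4, 0.5)
cyber_lambda <- sum(lambda_vec)   # total expected number over 12 quarters

# pmf and cumulative distribution function for k = 0, 1, ..., 12
outcomes <- 0:12
cyber_pmf <- dpois(outcomes, lambda = cyber_lambda)
cyber_cdf <- ppois(outcomes, lambda = cyber_lambda)

barplot(cyber_pmf, names.arg = outcomes, col = "lightgreen",
        xlab = "Cyber Events", ylab = "Probability")

cyber_table <- data.frame(events = outcomes,
                          pmf = round(cyber_pmf, 4),
                          cdf = round(cyber_cdf, 4))
print(cyber_table)

# 95th percentile: the smallest k whose cumulative probability is at least 0.95
pct95_from_table <- min(outcomes[cyber_cdf >= 0.95])
pct95_from_qpois <- qpois(0.95, lambda = cyber_lambda)

# Probability of zero events in each quarter; dpois() is vectorized over lambda
prob_zero <- dpois(0, lambda = lambda_vec)
plot(1:12, prob_zero, type = "l", xlab = "Quarter", ylab = "Prob of Zero")
```

Because the quarterly means increase, the probability of zero events, \(e^{-\lambda_k}\), decreases from quarter to quarter.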



2.1.3 Exercise. Comparing Basic Count Distributions

Assignment Text

Your supervisor would like to have a better understanding of relationships among three important count distributions, the binomial, Poisson, and negative binomial. You could develop a mathematical appendix, demonstrating how:

  • A binomial distribution with parameters \(m\) and \(q\) converges to a Poisson distribution with mean \(\lambda\) as \(m \to \infty\) and \(mq \to \lambda\).
  • A negative binomial distribution with mean \(r \beta = \lambda\) and dispersion parameter \(r\) converges to a Poisson distribution with mean \(\lambda\) as \(r \to \infty\).

Instead, you decide to demonstrate these relationships graphically.

Instructions

  • Plot the probability mass function (pmf) of the binomial distribution with \(m=12\) and \(q=0.1\) over \(k = 0, 1, \ldots, 12\) potential outcomes. Superimpose on this plot a Poisson pmf with the same mean using the lines() function.
  • Repeat this step with the same Poisson distribution but, for the binomial distribution, multiply \(m\) by 5 and divide \(q\) by 5. (You should see how the binomial becomes a better approximation to the Poisson.)
  • Determine the pmf of the negative binomial distribution with mean parameter \(r \beta\) and dispersion parameter \(r=1\) using the function dnbinom(). Use the same mean as for the binomial distribution.
  • Demonstrate the convergence of the negative binomial to the Poisson by creating side-by-side graphical comparisons. That is, using the par(mfrow = …) syntax, compare:
    • A plot of this negative binomial distribution pmf, with the baseline Poisson distribution (with the same mean) superimposed.
    • A plot of the negative binomial distribution pmf with the same mean and dispersion parameter \(r=100\), with the baseline Poisson distribution superimposed.
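
The comparison can be sketched as below. Object names and plot titles are assumptions made for this illustration. The sketch uses the `mu` and `size` arguments of dnbinom(), where `mu` is the mean \(r \beta\) and `size` is \(r\).

```r
outcomes <- 0:12

# Binomial pmfs with common mean m * q = 1.2, and the matching Poisson pmf
binom1_pmf  <- dbinom(outcomes, size = 12,     prob = 0.1)
binom2_pmf  <- dbinom(outcomes, size = 12 * 5, prob = 0.1 / 5)
poisson_pmf <- dpois(outcomes, lambda = 12 * 0.1)

# Negative binomial pmfs with the same mean and dispersion r = 1 and r = 100
negbin1_pmf   <- dnbinom(outcomes, mu = 1.2, size = 1)
negbin100_pmf <- dnbinom(outcomes, mu = 1.2, size = 100)

# Binomial (points) versus Poisson (line), side by side
old_par <- par(mfrow = c(1, 2))
plot(outcomes, binom1_pmf, main = "Binomial, m = 12", xlab = "k", ylab = "Probability")
lines(outcomes, poisson_pmf)
plot(outcomes, binom2_pmf, main = "Binomial, m = 60", xlab = "k", ylab = "Probability")
lines(outcomes, poisson_pmf)

# Negative binomial (points) versus Poisson (line), side by side
plot(outcomes, negbin1_pmf, main = "Neg. binomial, r = 1", xlab = "k", ylab = "Probability")
lines(outcomes, poisson_pmf)
plot(outcomes, negbin100_pmf, main = "Neg. binomial, r = 100", xlab = "k", ylab = "Probability")
lines(outcomes, poisson_pmf)
par(old_par)   # restore the previous layout

# Largest pointwise gap from the Poisson pmf, as a numerical summary of each panel
gaps <- c(binom_12   = max(abs(binom1_pmf - poisson_pmf)),
          binom_60   = max(abs(binom2_pmf - poisson_pmf)),
          negbin_1   = max(abs(negbin1_pmf - poisson_pmf)),
          negbin_100 = max(abs(negbin100_pmf - poisson_pmf)))
print(gaps)
```

The `gaps` vector is an addition to the exercise; it gives a quick numerical check that the gap shrinks as \(m\) or \(r\) grows.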



2.2 The (a,b,0) Class


In this section, you learn how to:

  • Define the \((a,b,0)\) class of frequency distributions.
  • Discuss the importance of the recursive relationship underpinning this class of distributions.
  • Identify conditions under which this general class reduces to each of the binomial, Poisson, and negative binomial distributions.

Video: The (a,b,0) Class

Overheads: The (a,b,0) Class (Click Tab to View)


2.2.1 Exercise. Determining Probabilities Recursively

Assignment Text

The \((a,b,0)\) class can be expressed through the recursion

\[ \Pr(N=k) = p_k = p_{k-1} \left( a+ \frac{b}{k}\right) , \quad k\ge 1 , \]

where \(N\) is a count random variable. From Section 2.3 of the text, we know that:

  • if \(a=0\) and \(b=\lambda\), then the recursion yields a Poisson distribution with parameter \(\lambda\);
  • if \(a=-q/(1-q)\) and \(b=(m+1)q/(1-q)\), then the recursion yields a binomial distribution with parameters \(m\) and \(q\);
  • if \(a=\beta/(1+\beta)\) and \(b=(r-1)\beta/(1+\beta)\), then the recursion yields a negative binomial distribution with parameters \(r\) and \(\beta\).

The \((a,b,0)\) class is a foundation for other, more complex, distributions, so let us check that we understand the recursions.

Instructions

  • For \(k=0, \ldots, 20\), using \(\lambda = 1.24\) obtain \(p_k\) values using dpois().
  • With the starting value \(p_0 = \exp(-\lambda)\), use the recursive \((a,b,0)\) formula to obtain these probability values.
  • Check your code by summing the absolute values of the differences between the dpois() and the \((a,b,0)\) generated values.
  • For \(k=0, \ldots, 20\), obtain \(p_k\) values for the negative binomial distribution using the function dnbinom(). Use the same mean as for the Poisson distribution but let the variance be 1.1 times the mean. Hint: See the Loss Data Analytics Summary of Distributions for the parameterization used in this short course. It differs from that used by the R package.
  • Use the recursive \((a,b,0)\) formula to obtain these probability values.
  • Check your code by summing the absolute values of the differences between the dnbinom() and the \((a,b,0)\) generated values.
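
A minimal sketch of these checks follows. The helper `ab0_probs()` and the other object names are hypothetical, introduced only for this illustration. For the negative binomial, matching the mean \(r\beta = 1.24\) and variance \(r\beta(1+\beta) = 1.1 \times 1.24\) gives \(\beta = 0.1\) and \(r = 12.4\); in dnbinom() this corresponds to `size = r` and `prob = 1/(1 + beta)`.

```r
# Hypothetical helper: (a, b, 0) recursion p_k = p_{k-1} * (a + b / k), k = 1, ..., kmax
ab0_probs <- function(p0, a, b, kmax) {
  p <- numeric(kmax + 1)
  p[1] <- p0                       # R vectors are 1-indexed: p[k + 1] holds p_k
  for (k in seq_len(kmax)) {
    p[k + 1] <- p[k] * (a + b / k)
  }
  p
}

kvec <- 0:20
lambda <- 1.24

# Poisson: a = 0, b = lambda, starting value p_0 = exp(-lambda)
poisson_pk  <- dpois(kvec, lambda = lambda)
ab0_poisson <- ab0_probs(p0 = exp(-lambda), a = 0, b = lambda, kmax = 20)
poisson_err <- sum(abs(poisson_pk - ab0_poisson))

# Negative binomial with mean 1.24 and variance 1.1 * mean: beta = 0.1, r = 12.4
beta <- 0.1
r <- lambda / beta
negbin_pk  <- dnbinom(kvec, size = r, prob = 1 / (1 + beta))
ab0_negbin <- ab0_probs(p0 = (1 + beta)^(-r),
                        a = beta / (1 + beta),
                        b = (r - 1) * beta / (1 + beta),
                        kmax = 20)
negbin_err <- sum(abs(negbin_pk - ab0_negbin))

# Both sums should be zero up to floating-point rounding
print(c(poisson_err, negbin_err))
```

Pre-allocating the vector with numeric() avoids growing it inside the loop, which is a common R idiom for recursions of known length.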



2.2.2 Exercise. Reverse Engineering and Recursive R Functions

Assignment Text

You want to generate probabilities from the \((a,b,0)\) class so that later on you will be able to modify your code to produce alternative distributions (the subject of Section 2.5), a bit of so-called “reverse engineering.” In the first part of this problem, from a known distribution (e.g., the binomial), you will compute the ratio

\[ \frac{k ~p_k}{p_{k-1}} = a k + b , \quad k\ge 1 , \]

to determine values of \(a\) and \(b\). The second part of this problem uses recursive R functions. A recursive function is one defined in terms of itself, evaluated at a prior iteration. The classic example is the factorial function \(f(n) = n!\), which satisfies \(f(n) = n f(n-1)\) with \(f(0) = 1\). For example, you can define the function

recursive.factorial <- function(n) {
  if (n == 0)    return (1)
  else           return (n * recursive.factorial(n-1))
  }

to determine that recursive.factorial(5) = 120. In this problem, we use a recursive R function to generate \((a,b,0)\) probabilities.

Instructions

  • For \(k=0, \ldots, 4\), using \(\lambda = 1.24\) obtain \(p_k\) values from the dpois() function.
  • Compute the ratio to identify values of \(a\) and \(b\). Hint: In R, use of negative indexing is permitted. For example, pop[-c(3, 7)] removes the third and seventh elements of pop.
  • For \(k=0, \ldots, 4\), using \(q =0.1\) and \(m=10\) obtain \(p_k\) values from the dbinom() function.
  • Compute the ratio to identify values of \(a\) and \(b\).
  • To check your work, develop the recursive \((a,b,0)\) function to determine \(p_4 = \Pr(N=4)\) based on the binomial distribution in the prior part.
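
The following sketch illustrates one way to carry out these steps. Object names such as `poisson_ratio`, `binom_ratio`, and `recursive_ab0` are assumptions made for this example. Because the ratio \(k\,p_k/p_{k-1} = ak + b\) is linear in \(k\), the slope \(a\) is the difference between consecutive ratios and the intercept \(b\) follows from the first ratio.

```r
kvec <- 0:4

# Poisson: the ratio k * p_k / p_{k-1} is constant, so a = 0 and b = lambda
poisson_pk <- dpois(kvec, lambda = 1.24)
poisson_ratio <- kvec[-1] * poisson_pk[-1] / poisson_pk[-length(kvec)]
print(poisson_ratio)

# Binomial with m = 10 and q = 0.1: the ratio is linear in k
binom_pk <- dbinom(kvec, size = 10, prob = 0.1)
binom_ratio <- kvec[-1] * binom_pk[-1] / binom_pk[-length(kvec)]

a <- binom_ratio[2] - binom_ratio[1]   # slope: (2a + b) - (a + b)
b <- binom_ratio[1] - a                # intercept: (a + b) - a
print(c(a = a, b = b))

# Recursive (a, b, 0) function, starting from p_0 of the binomial distribution
recursive_ab0 <- function(k) {
  if (k == 0) return(binom_pk[1])
  (a + b / k) * recursive_ab0(k - 1)
}

# These two values should agree up to rounding
recursive_ab0(4)
binom_pk[5]
```

The recovered values can be compared with the closed forms \(a = -q/(1-q)\) and \(b = (m+1)q/(1-q)\) for the binomial distribution.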



2.3 Estimating Frequency Distributions


In this section, you learn how to:

  • Define a likelihood for a sample of observations from a discrete distribution.
  • Define the maximum likelihood estimator (mle) for a random sample of observations from a discrete distribution.
  • Calculate the mle for the binomial, Poisson, and negative binomial distributions.

Video: Estimating Frequency Distributions

Overheads: Estimating Frequency Distributions (Click Tab to View)


2.3.1 Exercise. Count Data Compression - 1

Assignment Text

Raw count data can often be compressed without loss of any statistical information. Typically, the compression is to the sequence of values \(\{m_k\}_{k\geq 0}\), where \(m_k\) is the number of observations equal to \(k\), that is, \(m_k=\sum_{i= 1}^n I(x_i=k).\) In this and the following exercise, we discuss two data structures for this compressed data and suggest implementations in R.

Instructions

  • Store the count data \(\{3, 6, 0, 2, 3, 4, 4, 2, 4, 4, 6, 2, 3, 0, 2, 3, 1, 2, 4, 2\}\) into an array, say \(\bf x\), using the c() (for concatenate) function.
  • Use the table() function to generate a frequency table of the data.
  • Initialize an array (say \(\bf m.vec\)) to hold the frequency counts \(m_0, m_1, \ldots\). Do this using the rep() function, which replicates the values of its first argument.
  • Fill the appropriate values into the array \(\bf m.vec\). Some illustrative code uses the functions names() (gets or sets the names of an object), as.integer() (creates or tests for objects of type “integer”), and as.vector() (attempts to coerce its argument into a vector).
  • Use the array \(\bf m.vec\) to determine the mean frequency.
  • With the array \(\bf m.vec\), graph the count distribution using the function barplot() (it creates a bar plot with vertical or horizontal bars).
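
These steps might be sketched as follows. The names `freq_table`, `m_vec`, `k_values`, and `mean_freq` are assumptions for this illustration. Position \(k+1\) of `m_vec` holds \(m_k\), because R vectors are indexed from 1.

```r
x <- c(3, 6, 0, 2, 3, 4, 4, 2, 4, 4, 6, 2, 3, 0, 2, 3, 1, 2, 4, 2)

# Frequency table of the observed counts (values that never occur are omitted)
freq_table <- table(x)
print(freq_table)

# Dense storage: one slot for every count from 0 to max(x), initialized to zero
m_vec <- rep(0, max(x) + 1)
m_vec[as.integer(names(freq_table)) + 1] <- as.vector(freq_table)

# Mean frequency from the compressed data: sum of k * m_k divided by n
k_values <- 0:(length(m_vec) - 1)
mean_freq <- sum(k_values * m_vec) / sum(m_vec)
print(mean_freq)

barplot(m_vec, names.arg = k_values, xlab = "Count", ylab = "Frequency")
```

As a cross-check, `tabulate(x + 1, nbins = max(x) + 1)` produces the same counts in a single call.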



2.3.2 Exercise. Count Data Compression - 2

Assignment Text

In Exercise 2.3.1, the range of the count data was narrow: counts ranged from 0 to 6. In that case, the suggested way of storing the \(m_k\)’s worked well. Section 1.3 of this short course introduced the Wisconsin Property Fund data, which consists of claim experience for fund members over the years 2006-2010, inclusive. It includes the frequency of claims Freq as well as the claim year Year. The Wisconsin Property Fund data has already been read into a data frame called Insample. In this assignment, we will compress the claims frequency data using a data structure that differs from that presented in the preceding exercise.

Instructions

  • Using the data frame Insample, create a smaller data set based on year 2007 experience.
  • From the 2007 experience, create a frequency table.
  • Store the distinct claim counts observed into an array named \(\bf values\).
  • Store the frequency of claim counts into an array named \(\bf m.vec\).
  • Use the array \(\bf m.vec\) to determine the mean frequency.
  • The frequency table shows one observation with 157 claims. Use the match() function to verify this. (This function returns a vector of the positions of (first) matches of its first argument in its second.)
  • With the array \(\bf m.vec\), graph the count distribution.
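
The sparse storage scheme can be sketched on a small made-up vector. The vector `freq` below is hypothetical and merely stands in for the 2007 claim counts; with the real data, the corresponding first step would be something like `subset(Insample, Year == 2007)` followed by extracting the Freq column.

```r
# Hypothetical claim counts (NOT the Wisconsin Property Fund data)
freq <- c(0, 0, 1, 0, 2, 0, 0, 1, 157, 0, 3, 1, 0, 0, 2)

freq_table <- table(freq)
print(freq_table)

# Sparse storage: keep only the distinct counts observed and how often each occurs
values <- as.integer(names(freq_table))
m_vec  <- as.vector(freq_table)

# Mean frequency from the compressed data
mean_freq <- sum(values * m_vec) / sum(m_vec)
print(mean_freq)

# Number of observations with exactly 157 claims
n_157 <- m_vec[match(157, values)]
print(n_157)

barplot(m_vec, names.arg = values, xlab = "Count", ylab = "Frequency")
```

Compared with the dense array of the preceding exercise, this pair of arrays avoids storing a long run of zeros between the small counts and a single large count such as 157.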



Video: Fitting a Poisson Distribution

2.3.3 Exercise. Graphing Likelihoods

Assignment Text

In this assignment you are asked to plot the Poisson likelihood and the log-likelihood for the data from Exercise 2.3.1. It is instructive to see that both curves attain their maximum at the same point, the sample mean. These data have already been read into an array \(\bf x\) with the counts in the array \(\bf m.vec\). Once you have working code, scroll through the plots to understand what each piece of graph code does.

Instructions

  • Write the likelihood as a function of the parameter \(\theta\).
  • Create an array with values of \(\theta = 1, 1.01, 1.02, \ldots, 5\).
  • Plot the likelihood over values of \(\theta\). You may find the function sapply() useful (it applies a function over a list or vector).
  • Plot the log-likelihood over values of \(\theta\).
  • Superimpose a vertical line at the mean to emphasize that both the likelihood and the log-likelihood reach their maximum values at the mean. For this, use the function abline() (it adds one or more straight lines to the current plot).
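
The steps above can be sketched on a small example. The following is a minimal illustration, not the exercise solution: the data in `x_demo` are hypothetical, and the names `lik`, `loglik`, and `theta_grid` are introduced here only for illustration. It evaluates the Poisson likelihood and log-likelihood on the grid \(\theta = 1, 1.01, \ldots, 5\) and locates the grid point at which each is largest.

```r
# Hypothetical count data (an assumption for illustration, not the course data)
x_demo <- c(2, 0, 3, 1, 2, 4, 2, 2)

# Poisson likelihood and log-likelihood as functions of theta
lik    <- function(theta) prod(dpois(x_demo, lambda = theta))
loglik <- function(theta) sum(dpois(x_demo, lambda = theta, log = TRUE))

# Grid of candidate parameter values
theta_grid  <- seq(1, 5, by = 0.01)
lik_vals    <- sapply(theta_grid, lik)
loglik_vals <- sapply(theta_grid, loglik)

# Grid points at which each curve is largest
theta_at_max_lik    <- theta_grid[which.max(lik_vals)]
theta_at_max_loglik <- theta_grid[which.max(loglik_vals)]

# Plot the log-likelihood and mark the sample mean
plot(theta_grid, loglik_vals, type = "l",
     xlab = expression(theta), ylab = "Log-likelihood")
abline(v = mean(x_demo), col = "green")
```

Because the logarithm is an increasing function, the two curves peak at the same value of \(\theta\), which for the Poisson distribution is the sample mean.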


eyJsYW5ndWFnZSI6InIiLCJwcmVfZXhlcmNpc2VfY29kZSI6Ing8LWMoMywgNiwgMCwgMiwgMywgNCwgNCwgMiwgNCwgNCwgNiwgMiwgMywgMCwgMiwgMyxcbiAgICAgMSwgMiwgNCwgMilcbm0udmVjIDwtIHJlcCgwLG1heCh4KSsxKVxubS52ZWNbYXMuaW50ZWdlcihuYW1lcyh0YWJsZSh4KSkpKzFdPWFzLnZlY3Rvcih0YWJsZSh4KSkiLCJzYW1wbGUiOiIjIExpa2VsaWhvb2QgRnVuY3Rpb25cbkxpa2UgIDwtICBmdW5jdGlvbih0aGV0YSl7XG4gIHByb2QoZHBvaXMoMDptYXgoeCksdGhldGEpXj8/KVxuICB9XG50aGV0YSA8LSA/P1xuICBcbnBhcihtYXI9Yyg1LCA1LCA0LCA1KSArIDAuMSkgIyBQbG90IE1hcmdpbnNcbiMgUGxvdCBMaWtlbGlob29kXG5wbG90KHRoZXRhLHNhcHBseSg/PyxMaWtlKSx0eXBlPVwibFwiLGF4ZXM9RkFMU0UseGxhYj1cIlwiLHlsYWI9XCJcIilcbmF4aXMoMiwgeWxpbT1jKDAsNWUtMTcpLGNvbD1cImJsYWNrXCIsbGFzPTEpICMgRmlyc3QgeS1heGlzXG5tdGV4dChcIkxpa2VsaWhvb2RcIixzaWRlPTIsbGluZT0zLjc1KVxuXG5wYXIobmV3PVRSVUUpICMgUmV1c2UgR3JhcGhcbiMgUGxvdCBMb2ctbGlrZWxpaG9vZFxucGxvdCg/Pyxsb2coc2FwcGx5KHRoZXRhLD8/KSksYXhlcz1GQUxTRSx4bGFiPVwiXCIseWxhYj1cIlwiLHlsaW09YygtNjUsLTM1KSxjb2w9XCJyZWRcIix0eXBlPVwibFwiKVxubXRleHQoXCJMb2ctbGlrZWxpaG9vZFwiLHNpZGU9NCxjb2w9XCJyZWRcIixsaW5lPTMpXG5heGlzKDQsIHlsaW09YygtMjUsLTE2KSwgY29sPVwicmVkXCIsY29sLmF4aXM9XCJyZWRcIiwgbGFzPTEpICMgU2Vjb25kIHktYXhpc1xuYXhpcygxLCBjKDEsMS41LDIsMi41LG1lYW4oeCksMy41LDQsNC41LDUpICkgIyBDb21tb24geC1heGlzXG5tdGV4dChleHByZXNzaW9uKHRoZXRhKSxzaWRlPTEsY29sPVwiYmxhY2tcIixsaW5lPTIuNSlcbiMgVmVydGljYWwgbGluZSBhdCB0aGUgY29tbW9uIGFyZ21heCAobWxlKVxuYWJsaW5lKHY9bWVhbig/PyksY29sPVwiZ3JlZW5cIikiLCJzb2x1dGlvbiI6IiMgTGlrZWxpaG9vZCBGdW5jdGlvblxuTGlrZSA8LSBmdW5jdGlvbih0aGV0YSl7XG4gIHByb2QoZHBvaXMoMDptYXgoeCksdGhldGEpXm0udmVjKVxuICB9XG50aGV0YSA8LSAoMTAwOjUwMCkvMTAwXG5cbnBhcihtYXI9Yyg1LCA1LCA0LCA1KSArIDAuMSkgIyBQbG90IE1hcmdpbnNcbiMgUGxvdCBMaWtlbGlob29kXG5wbG90KHRoZXRhLHNhcHBseSh0aGV0YSxMaWtlKSx0eXBlPVwibFwiLGF4ZXM9RkFMU0UseGxhYj1cIlwiLHlsYWI9XCJcIilcbmF4aXMoMiwgeWxpbT1jKDAsNWUtMTcpLGNvbD1cImJsYWNrXCIsbGFzPTEpICMgRmlyc3QgeS1heGlzXG5tdGV4dChcIkxpa2VsaWhvb2RcIixzaWRlPTIsbGluZT0zLjc1KVxuXG5wYXIobmV3PVRSVUUpICMgUmV1c2UgR3JhcGhcbiMgUGxvdCBMb2ctbGlrZWxpaG9vZFxucGxvdCh0aGV0YSxsb2coc2FwcGx5KHRoZXRhLExpa2UpKSxheGVzPUZBTFNFLHhs
YWI9XCJcIix5bGFiPVwiXCIseWxpbT1jKC02NSwtMzUpLGNvbD1cInJlZFwiLHR5cGU9XCJsXCIpXG5tdGV4dChcIkxvZy1saWtlbGlob29kXCIsc2lkZT00LGNvbD1cInJlZFwiLGxpbmU9MylcbmF4aXMoNCwgeWxpbT1jKC0yNSwtMTYpLCBjb2w9XCJyZWRcIixjb2wuYXhpcz1cInJlZFwiLGxhcz0xKSAjIFNlY29uZCB5LWF4aXNcbmF4aXMoMSxjKDEsMS41LDIsMi41LG1lYW4oeCksMy41LDQsNC41LDUpKSAjIENvbW1vbiB4LWF4aXNcbm10ZXh0KGV4cHJlc3Npb24odGhldGEpLHNpZGU9MSxjb2w9XCJibGFja1wiLGxpbmU9Mi41KVxuIyBWZXJ0aWNhbCBsaW5lIGF0IHRoZSBjb21tb24gYXJnbWF4IChtbGUpXG5hYmxpbmUodj1tZWFuKHgpLGNvbD1cImdyZWVuXCIpIiwic2N0IjoidGhldGFtc2cgPC0gXCJEaWQgeW91IGNvcnJlY3RseSBzcGVjaWZ5IHRoZSBvYmplY3QgYHRoZXRhYD9cIlxuZXgoKSAlPiUgY2hlY2tfb2JqZWN0KFwidGhldGFcIiwgdW5kZWZpbmVkX21zZyA9IFwiTWFrZSBzdXJlIHRvIG5vdCByZW1vdmUgYHRoZXRhYCFcIikgJT4lY2hlY2tfZXF1YWwoaW5jb3JyZWN0X21zZz10aGV0YW1zZylcbnN1Y2Nlc3NfbXNnKFwiRXhjZWxsZW50IGpvYiEgSW4gY29tcGxleCBhY3R1YXJpYWwgYXBwbGljYXRpb25zLCBtYXhpbXVtIGxpa2VsaWhvb2QgaXMgdXNlZCBleHRlbnNpdmVseSB0byBjYWxpYnJhdGUgbW9kZWxzIGJ5IGVzdGltYXRpbmcgdGhlIG1vc3QgbGlrZWx5IHBhcmFtZXRlciB2YWx1ZXMuIE9uZSBnZXRzIGluc2lnaHRzIGludG8gdGhpcyBwcm9jZXNzIGJ5IHZpc3VhbGl6aW5nIGZ1bmN0aW9ucyB0aGF0IGFyZSBiZWluZyBtYXhpbWl6ZWQuXCIpIiwiaGludCI6IlRvIHNpbXBsaWZ5IG1hdHRlcnMgZXZlbiBtb3JlLCB5b3UgY291bGQgdHJ5IHdpdGgganVzdCB0aHJlZSBwb2ludHMuIn0=

Video: Fitting Binomial and Negative Binomial Distributions

Overheads: Fitting Binomial and Negative Binomial Distributions (Click Tab to View)


2.3.4 Exercise. Fitting a Binomial Distribution

Assignment Text

In this assignment you are asked to fit the binomial model to a small set of count data, \(\{1, 3, 3, 3, 5, 3, 0, 2, 4, 3, 4\}\). Recall that in this parameterization of the binomial model, \(m\) is the potential number of 1’s (the number of “trials”). You will recall from Section 2.4.2 of Loss Data Analytics that when \(m\) is known, the mle of \(q\) is simply the sample average divided by \(m\). Because \(m\) is restricted to integer values, it is convenient to resort to brute force maximization of the reduced likelihood in order to find the mle for \(m\). It is then essential to confirm the solution visually by plotting the reduced log-likelihood.

Note that it is important to check that the sample mean exceeds the sample variance for the mle of \(m\) to be finite; otherwise, the Poisson is the better model. The data are available in an array \(\bf x\), and the corresponding array of counts \(\bf m.vec\), as in Exercise 2.3.1, is also made available.

Instructions

  • Develop the reduced likelihood function based on the compressed data in the array \(\bf m.vec\).
  • The value of \(m\) must be at least as large as the largest observed value in the sample. Calculate potential values of the reduced likelihood over a range of \(m\) beginning from the maximum value in \(\bf x\).
  • Determine the maximum likelihood estimators of the parameters \(m\) and \(q\) by selecting the largest likelihood. For this, you will find helpful the function match() (it returns a vector of the positions of (first) matches of its first argument in its second).
  • To check your results, plot the reduced log-likelihood function over a range of \(m\). Superimpose a vertical line at the mle.
  • Check to see whether the sample mean is greater than the sample variance. To do so, you can use the function var() (it computes the variance using a denominator of \(n-1\)). To obtain a denominator of \(n\) instead, the sample code uses var(c(x,mean(x))) rather than var(x). Ask yourself why this works.
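
The brute force search described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the course's sample code: it works with the raw data listed in the assignment text instead of the compressed counts, it uses the log-likelihood directly, and the upper search limit of 50 and the names `reduced_loglik`, `m_grid`, `m_hat`, and `q_hat` are choices made here for illustration.

```r
# Data as listed in the assignment text
x <- c(1, 3, 3, 3, 5, 3, 0, 2, 4, 3, 4)

# Reduced log-likelihood: for a fixed m, q is replaced by its mle, mean(x)/m
reduced_loglik <- function(m) {
  sum(dbinom(x, size = m, prob = mean(x) / m, log = TRUE))
}

# Brute force search over integer values of m (upper limit 50 is an assumption)
m_grid <- max(x):50
ll     <- sapply(m_grid, reduced_loglik)
m_hat  <- m_grid[which.max(ll)]
q_hat  <- mean(x) / m_hat

# Visual check of the reduced log-likelihood
plot(m_grid, ll, type = "l", xlab = "m", ylab = "Reduced log-likelihood")
abline(v = m_hat, col = "green")

# Variance with denominator n, computed two ways
var_n_direct <- mean((x - mean(x))^2)
var_n_trick  <- var(c(x, mean(x)))
c(mean(x), var_n_direct, var_n_trick)
```

Appending the sample mean to the data leaves the sum of squared deviations unchanged while increasing the number of observations by one, so var() divides by \(n\) rather than \(n-1\).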


eyJsYW5ndWFnZSI6InIiLCJwcmVfZXhlcmNpc2VfY29kZSI6InNldC5zZWVkKFwiMTIzNFwiKVxueCA8LSByYmlub20oMTEsOSwxLzMpXG4jIHggPC0gYygxLCAzLCAzLCAzLCA1LCAzLCAwLCAyLCA0LCAzLCA0KVxubS52ZWMgPC0gcmVwKDAsbWF4KHgpKzEpXG5tLnZlY1thcy5pbnRlZ2VyKG5hbWVzKHRhYmxlKHgpKSkrMV0gPSBhcy52ZWN0b3IodGFibGUoeCkpIiwic2FtcGxlIjoiIyBSZWR1Y2VkIExpa2VsaWhvb2QgRnVuY3Rpb24gXG5yZWR1Y2VkX2xpa2U8LWZ1bmN0aW9uKHBhcmFfbSl7XG4gIGlmIChwYXJhX20+PW1heCh4KSlcbiAgICBwcm9kKGRiaW5vbSgwOm1heCh4KSxwYXJhX20sbWVhbih4KS9jZWlsaW5nKHBhcmFfbSkpXm0udmVjKVxuICBlbHNlICAgIDBcbiAgfVxuIyBDb21wdXRlIHRoZSByZWR1Y2VkIGxpa2VsaWhvb2QgYWNyb3NzIGEgcmFuZ2Ugb2YgdmFsdWVzIGZvciBtXG5vYmogPC0gc2FwcGx5KG1heCh4KTooMyptYXgoeCkpLD8/KVxuXG4jIG1sZSBmb3IgbSBhbmQgcVxubV9NTEUgPC0gbWF0Y2gobWF4KG9iaiksb2JqKSttYXgoeCktMVxucV9NTEUgPC0gbWVhbih4KS8/P1xuIyBWaXN1YWxseSBjaGVjayB0aGUgcmVkdWNlZCBsb2ctbGlrZWxpaG9vZFxucGxvdChtYXgoeCk6KDMqbWF4KHgpKSxsb2coPz8pLHR5cGU9XCJsXCIseGxhYj1cIm1cIix5bGFiPWV4cHJlc3Npb24obChtLGJhcih4KS9tKSkpXG5hYmxpbmUodj1tX01MRSxjb2w9XCJncmVlblwiKVxuXG4jIENoZWNrIGlmIHNhbXBsZSBtZWFuID4gc2FtcGxlIHZhcmlhbmNlXG5jKG1lYW4oPz8pLCB2YXIoYyh4LG1lYW4oeCkpKSkiLCJzb2x1dGlvbiI6IiMgTGlrZWxpaG9vZCBGdW5jdGlvbiBcbnJlZHVjZWRfbGlrZTwtZnVuY3Rpb24ocGFyYV9tKXtcbiAgaWYgKHBhcmFfbT49bWF4KHgpKVxuICAgIHByb2QoZGJpbm9tKDA6bWF4KHgpLHBhcmFfbSxtZWFuKHgpL2NlaWxpbmcocGFyYV9tKSlebS52ZWMpXG4gIGVsc2UgICAgICAwXG4gIH1cbiMgIENvbXB1dGUgdGhlIHJlZHVjZWQgbGlrZWxpaG9vZCBhY3Jvc3MgYSByYW5nZSBvZiB2YWx1ZXMgZm9yIG1cbm9iaiA8LSBzYXBwbHkobWF4KHgpOigzKm1heCh4KSkscmVkdWNlZF9saWtlKVxuXG4jIE1MRSBmb3IgbSBhbmQgcVxubV9NTEUgPC0gbWF0Y2gobWF4KG9iaiksb2JqKSttYXgoeCktMVxucV9NTEUgPC0gbWVhbih4KS9tX01MRVxuIyBJbXBvcnRhbnQgdG8gdmlzdWFsbHkgY2hlY2sgdGhlIHJlZHVjZWQgbG9nLWxpa2VsaWhvb2RcbnBsb3QobWF4KHgpOigzKm1heCh4KSksbG9nKG9iaiksdHlwZT1cImxcIix4bGFiPVwibVwiLHlsYWI9ZXhwcmVzc2lvbihsKG0sYmFyKHgpL20pKSlcbmFibGluZSh2PW1fTUxFLGNvbD1cImdyZWVuXCIpXG5cbiMgQ2hlY2sgaWYgc2FtcGxlIG1lYW4gPiBzYW1wbGUgdmFyaWFuY2VcbmMobWVhbih4KSwgdmFyKGMoeCxtZWFuKHgpKSkpIiwic2N0Ijoib2JqbXNnIDwtIFwiRGlkIHlvdSBjb3JyZWN0bHkgc3BlY2lmeSB0aGUgb2JqZWN0IGBvYmpgP1wi
XG5leCgpICU+JSBjaGVja19vYmplY3QoXCJvYmpcIiwgdW5kZWZpbmVkX21zZyA9IFwiTWFrZSBzdXJlIHRvIG5vdCByZW1vdmUgYG9iamAhXCIpICU+JWNoZWNrX2VxdWFsKGluY29ycmVjdF9tc2c9b2JqbXNnKVxubV9NTEVtc2cgPC0gXCJEaWQgeW91IGNvcnJlY3RseSBzcGVjaWZ5IHRoZSBvYmplY3QgYG1fTUxFYD9cIlxuZXgoKSAlPiUgY2hlY2tfb2JqZWN0KFwibV9NTEVcIiwgdW5kZWZpbmVkX21zZyA9IFwiTWFrZSBzdXJlIHRvIG5vdCByZW1vdmUgYG1fTUxFYCFcIikgJT4lY2hlY2tfZXF1YWwoaW5jb3JyZWN0X21zZz1tX01MRW1zZylcbnFfTUxFbXNnIDwtIFwiRGlkIHlvdSBjb3JyZWN0bHkgc3BlY2lmeSB0aGUgb2JqZWN0IGBxX01MRWA/XCJcbmV4KCkgJT4lIGNoZWNrX29iamVjdChcInFfTUxFXCIsIHVuZGVmaW5lZF9tc2cgPSBcIk1ha2Ugc3VyZSB0byBub3QgcmVtb3ZlIGBxX01MRWAhXCIpICU+JWNoZWNrX2VxdWFsKGluY29ycmVjdF9tc2c9cV9NTEVtc2cpXG5zdWNjZXNzX21zZyhcIkV4Y2VsbGVudCBqb2IhIFdoZW4gY29tcGFyaW5nIHRoZSBtZWFuIHRvIHRoZSB2YXJpYW5jZSBmb3IgdGhlIGJhc2ljIGNvdW50IGRpc3RyaWJ1dGlvbnMsIHRoZSB2YXJpYW5jZSBpcyBzbWFsbGVyIGZvciB0aGUgYmlub21pYWwsIGVxdWFsIGZvciB0aGUgUG9pc3NvbiwgYW5kIGxhcmdlciBmb3IgdGhlIG5lZ2F0aXZlIGJpbm9taWFsLiBTbyBjb21wYXJpbmcgdGhlIG1lYW4gdG8gdGhlIHZhcmlhbmNlIGlzIGEgcXVpY2sgc2lnbmFsIHRoYXQgaGVscHMgdG8gc3BlY2lmeSBhIGNvdW50IGRpc3RyaWJ1dGlvbi5cIikiLCJoaW50IjoiV2hlbiBpbnRlcnByZXRpbmcgdGhlIGNvZGUsIHRoaXMgZXhlcmNpc2UgdXNlcyB0aGUgbW9yZSBiYXNpYyBsaWtlbGlob29kLCBpbiBjb250cmFzdCB0byB0aGUgbW9yZSBjb21tb24gbG9nLWxpa2VsaWhvb2QgZnVuY3Rpb24uIEZ1cnRoZXIsIGZvciBhIGZpeGVkIDxlbT5tPC9lbT4sIHRoZSA8ZW0+bWxlPC9lbT4gb2YgPGVtPnE8L2VtPiBpcyBzaW1wbHkgdGhlIGF2ZXJhZ2UuIFNvLCB3ZSBvbmx5IG5lZWQgdG8gZG8gb25lIGRpbWVuc2lvbmFsIG9wdGltaXphdGlvbiBldmVuIHRob3VnaCB0aGVyZSBhcmUgdHdvIHBhcmFtZXRlcnMuIn0=

2.3.5 Exercise. Fitting a Negative Binomial Distribution

Assignment Text

In this assignment, you are asked to fit the negative binomial model to a small set of count data, \(\{1, 1, 2, 6, 1, 1, 2, 5, 11\}\). Because \(r\), a parameter of the negative binomial distribution, is a positive real number, it is convenient to use the function optimize() to maximize the reduced likelihood in order to find its mle. It is always advisable to confirm the solution visually by plotting the reduced log-likelihood.

Note that it is important to check that the sample mean is lower than the sample variance for the mle of \(r\) to be finite; otherwise, the Poisson is the better model. The data are available in an array \(\bf x\), and the corresponding array of counts \(\bf m.vec\), as in Exercise 2.3.1, is also made available.

Instructions

  • Develop the reduced log-likelihood function based on the compressed data in the array \(\bf m.vec\).
  • The optimization routine will need a range of potential values of \(r\). The illustrative code sets this range using a moment estimator (based on techniques that we will cover formally later in Section 4.1.1 of Loss Data Analytics).
  • Determine the value of \(r\) that maximizes the reduced log-likelihood using the function optimize() (a one dimensional optimization function).
  • To check your results, plot the reduced log-likelihood function over a range of \(r\). Superimpose a vertical line at the mle.
  • Check to see whether the sample mean is smaller than the sample variance. Is this consistent with the negative binomial distribution?
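
As a rough sketch of these steps, the code below fits the negative binomial to the data listed in the assignment text. It is an illustration under stated assumptions, not the course's sample code: it uses the raw data rather than the compressed counts, the lower search bound of 0.01 is an arbitrary choice made here, and the names `reduced_loglik`, `fit`, `r_hat`, and `beta_hat` are introduced for illustration. At the mle the product \(r \beta\) equals the sample mean, so the mean is fixed at mean(x) and only \(r\) is optimized.

```r
# Data as listed in the assignment text
x <- c(1, 1, 2, 6, 1, 1, 2, 5, 11)

# Reduced log-likelihood in r, with the mean fixed at the sample mean
reduced_loglik <- function(r) {
  sum(dnbinom(x, size = r, mu = mean(x), log = TRUE))
}

# Moment estimator of r, used only to set the search range
r_moment <- mean(x)^2 / (var(x) - mean(x))

# One dimensional maximization (lower bound 0.01 is an assumption)
fit      <- optimize(reduced_loglik, interval = c(0.01, 3 * r_moment),
                     maximum = TRUE)
r_hat    <- fit$maximum
beta_hat <- mean(x) / r_hat
c(r_hat, beta_hat)

# Visual check of the reduced log-likelihood
r_grid <- seq(0.01, 3 * r_moment, length.out = 300)
plot(r_grid, sapply(r_grid, reduced_loglik), type = "l",
     xlab = "r", ylab = "Reduced log-likelihood")
abline(v = r_hat, col = "green")

# Sample mean versus sample variance
c(mean(x), var(x))
```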


eyJsYW5ndWFnZSI6InIiLCJwcmVfZXhlcmNpc2VfY29kZSI6InNldC5zZWVkKFwiMTIzNFwiKVxueCA8LSBybmJpbm9tKDksMi4zLG11PTQpXG4jeCA8LSBjKDEsICAxLCAgMiwgIDYsICAxLCAgMSwgIDIsICA1LCAxMSlcbm0udmVjIDwtIHJlcCgwLG1heCh4KSsxKVxubS52ZWNbYXMuaW50ZWdlcihuYW1lcyh0YWJsZSh4KSkpKzFdID0gYXMudmVjdG9yKHRhYmxlKHgpKSIsInNhbXBsZSI6InJlZHVjZWRfbG9nbGlrZTwtZnVuY3Rpb24ocil7XG4gICAgc3VtKD8/KmxvZyhkbmJpbm9tKDA6bWF4KHgpLD8/LG11PW1lYW4oeCkpKSlcbn1cblxuIyBFc3RpbWF0b3IgZm9yIHJcbnJfbW9tZW50IDwtIG1lYW4oeCleMi8odmFyKHgpLW1lYW4oeCkpXG5yX01MRSA8LSBvcHRpbWl6ZSg/Pyxsb3dlcj0wLHVwcGVyPTMqcl9tb21lbnQsbWF4aW11bT1UUlVFKSRtYXhpbXVtXG5iZXRhX01MRSA8LSBtZWFuKHgpLz8/XG5jKHJfTUxFLGJldGFfTUxFKVxuXG4gICMgSW1wb3J0YW50IHRvIHZpc3VhbGx5IGNoZWNrIHRoZSBsb2ctbGlrZWxpaG9vZFxucGxvdChyPC0oMTooMzAwKnJfbW9tZW50KSkvMTAwLHNhcHBseSg/Pyw/PyksdHlwZT1cImxcIix4bGFiPVwibVwiLHlsYWI9ZXhwcmVzc2lvbihsKG0sYmFyKHgpL20pKSlcbmFibGluZSh2PT8/LGNvbD1cImdyZWVuXCIpXG5cbmMobWVhbig/PyksIHZhcihjKHgsPz8pKSkiLCJzb2x1dGlvbiI6InJlZHVjZWRfbG9nbGlrZTwtZnVuY3Rpb24ocil7XG4gICAgc3VtKG0udmVjKmxvZyhkbmJpbm9tKDA6bWF4KHgpLHIsbXU9bWVhbih4KSkpKVxufVxuXG4jIEVzdGltYXRvciBmb3Igclxucl9tb21lbnQgPC0gbWVhbih4KV4yLyh2YXIoeCktbWVhbih4KSlcbnJfTUxFIDwtIG9wdGltaXplKHJlZHVjZWRfbG9nbGlrZSxsb3dlcj0wLHVwcGVyPTMqcl9tb21lbnQsbWF4aW11bT1UUlVFKSRtYXhpbXVtXG5iZXRhX01MRSA8LSBtZWFuKHgpL3JfTUxFXG5jKHJfTUxFLGJldGFfTUxFKVxuXG4jIEltcG9ydGFudCB0byB2aXN1YWxseSBjaGVjayB0aGUgbG9nLWxpa2VsaWhvb2RcbnBsb3QocjwtKDE6KDMwMCpyX21vbWVudCkpLzEwMCxzYXBwbHkocixyZWR1Y2VkX2xvZ2xpa2UpLHR5cGU9XCJsXCIseGxhYj1cIm1cIix5bGFiPWV4cHJlc3Npb24obChtLGJhcih4KS9tKSkpXG5hYmxpbmUodj1yX01MRSxjb2w9XCJncmVlblwiKVxuXG5jKG1lYW4oeCksIHZhcihjKHgsbWVhbih4KSkpKSIsInNjdCI6InJfTUxFbXNnIDwtIFwiRGlkIHlvdSBjb3JyZWN0bHkgc3BlY2lmeSB0aGUgb2JqZWN0IGByX01MRWA/XCJcbmV4KCkgJT4lIGNoZWNrX29iamVjdChcInJfTUxFXCIsIHVuZGVmaW5lZF9tc2cgPSBcIk1ha2Ugc3VyZSB0byBub3QgcmVtb3ZlIGByX01MRWAhXCIpICU+JWNoZWNrX2VxdWFsKGluY29ycmVjdF9tc2c9cl9NTEVtc2cpXG5iZXRhX01MRW1zZyA8LSBcIkRpZCB5b3UgY29ycmVjdGx5IHNwZWNpZnkgdGhlIG9iamVjdCBgYmV0YV9NTEVgP1wiXG5leCgpICU+JSBjaGVja19vYmplY3QoXCJiZXRh
X01MRVwiLCB1bmRlZmluZWRfbXNnID0gXCJNYWtlIHN1cmUgdG8gbm90IHJlbW92ZSBgYmV0YV9NTEVqYCFcIikgJT4lY2hlY2tfZXF1YWwoaW5jb3JyZWN0X21zZz1iZXRhX01MRW1zZylcbnN1Y2Nlc3NfbXNnKFwiU3VwZXJiISBUaGUgaW1wb3J0YW5jZSBvZiBtYXhpbXVtIGxpa2VsaWhvb2QgZXN0aW1hdGlvbiBpcyB2YWx1YWJsZSBpbiBhY3R1YXJpYWwgYXBwbGljYXRpb25zLiBIYXZpbmcgZmFtaWxpYXJpdHkgd2l0aCBmdW5kYW1lbnRhbCBjb25jZXB0cyBzdWNoIGFzIHdvcmtpbmcgd2l0aCByZWR1Y2VkIGxpa2VsaWhvb2RzIGFuZCBwbG90dGluZyBsaWtlbGlob29kIGZ1bmN0aW9ucyBoZWxwcyB0byBkZXZlbG9wIGEgZGVlcCBhcHByZWNpYXRpb24gb2YgdGhpcyBlc3RpbWF0aW9uIGFwcHJvYWNoLlwiKSIsImhpbnQiOiJGb3IgdGhlIG5lZ2F0aXZlIGJpbm9taWFsLCBhdCB0aGUgPGVtPm1sZTwvZW0+LCB3ZSBoYXZlIG9mIDxlbT5yIGJldGE8L2VtPiBpcyBzaW1wbHkgdGhlIGF2ZXJhZ2UuIFNvLCB3ZSBvbmx5IG5lZWQgdG8gZG8gb25lIGRpbWVuc2lvbmFsIG9wdGltaXphdGlvbiBldmVuIHRob3VnaCB0aGVyZSBhcmUgdHdvIHBhcmFtZXRlcnMuIFJldmlldyA8YSBocmVmPVwiaHR0cHM6Ly9vcGVuYWN0dGV4dHMuZ2l0aHViLmlvL0xvc3MtRGF0YS1BbmFseXRpY3MvQy1GcmVxdWVuY3ktTW9kZWxpbmcuaHRtbCNTOmZyZXF1ZW5jeS1kaXN0cmlidXRpb25zLW1sZVwiPlNlY3Rpb24gMi40LjIgb2YgPGVtPkxvc3MgRGF0YSBBbmFseXRpY3M8L2VtPjwvYT4gZm9yIGEgcmVtaW5kZXIuIn0=

2.4 Other Frequency Distributions


In this section, you learn how to:

  • Define the \((a,b,1)\) class of frequency distributions and discuss the importance of the recursive relationship underpinning this class of distributions.
  • Interpret zero truncated and modified versions of the binomial, Poisson, and negative binomial distributions.
  • Compute probabilities using the recursive relationship.

Video: Other Frequency Distributions

Overheads: Other Frequency Distributions (Click Tab to View)


2.4.1 Exercise. The (a,b,1) Distribution and its Moments

(As a reminder, when you see this symbol, it means that this exercise is challenging and you may wish to skip it on your first pass through the course.)

Assignment Text

An earlier exercise used recursions for the \((a,b,0)\) class of distributions. The \((a,b,1)\) class features the same recursion, but it starts at \(k=2\). In this assignment you are given \(p_1\), \(p_2\) and \(p_3\) (in an array \(\bf p\)) from an \((a,b,1)\) distribution. You are tasked with identifying the distribution and computing its mean and variance.

Instructions. For this exercise, you may find it useful to review matrix operations in R. For one nice resource click here.

  • Set up the equations in matrix form and solve for \(a\) and \(b\). Consider a \(2 \times 2\) coefficient matrix \(\bf C\) such that \({\bf C} \left(\begin{array}{c}a \\ b \end{array}\right) = \left(\begin{array}{c}p_3/p_2 \\ p_2/p_1 \end{array}\right)\). Write an expression for \(\bf C\).
  • Invert the matrix \(\bf C\) and solve for the vector of coefficients \(\left(\begin{array}{c}a \\ b \end{array}\right)\).
  • Use the sign of the coefficients to identify the \((a,b,1)\) distribution.
  • Use the functional form of this distribution to compute the pmf up to \(p_{100}\) directly.
  • From these generated probabilities, determine the mean and the variance.
  • From the functional form, use the closed form expressions for this distribution to check your mean and the variance calculations in the prior step. See in particular Section 18.2 of Loss Data Analytics.
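
The workflow in these steps can be illustrated on a hypothetical case. The sketch below assumes a logarithmic distribution with parameter 0.5 (the value and the names `q_demo`, `p_demo`, `ab`, and `p_k` are chosen here for illustration and are not the course data). It relies on the recursion \(p_k/p_{k-1} = a + b/k\) for \(k \ge 2\), which gives one linear equation in \(a\) and \(b\) for each of \(k=3\) and \(k=2\).

```r
# Hypothetical starting probabilities from a logarithmic distribution
q_demo <- 0.5
k3     <- 1:3
p_demo <- -q_demo^k3 / (k3 * log(1 - q_demo))

# Recursion p_k / p_{k-1} = a + b / k, written for k = 3 and k = 2
C  <- matrix(c(1, 1/3,
               1, 1/2), nrow = 2, byrow = TRUE)
ab <- solve(C, c(p_demo[3] / p_demo[2], p_demo[2] / p_demo[1]))
a  <- ab[1]
b  <- ab[2]

# a > 0 with b / a = -1 points to the logarithmic distribution
b / a

# pmf up to p_100 from the logarithmic functional form
k   <- 1:100
p_k <- -a^k / (k * log(1 - a))

# Mean and variance from the generated probabilities
mean_num <- sum(k * p_k)
var_num  <- sum(k^2 * p_k) - mean_num^2

# Closed form mean and variance of the logarithmic distribution
mean_closed <- -a / ((1 - a) * log(1 - a))
var_closed  <- -a * (a + log(1 - a)) / ((1 - a)^2 * log(1 - a)^2)
c(mean_num, mean_closed, var_num, var_closed)
```

Truncating the sums at \(k=100\) is harmless in this example because the probabilities decay geometrically; for a parameter close to 1, a longer range would be needed.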


eyJsYW5ndWFnZSI6InIiLCJwcmVfZXhlcmNpc2VfY29kZSI6ImR1bSA8LSAzLzRcbnAgPC0gYyhkdW0sZHVtXjIvMixkdW1eMy8zKS9sb2coMS1kdW0pIiwic2FtcGxlIjoiQyA8LSBtYXRyaXgoYygxLD8/LD8/LDEvMiksbnJvdz0yLGJ5cm93PVRSVUUpIFxuXG5hYiA8LSBzb2x2ZShDKSUqJWMoPz8scFsyXS9wWzFdKVxuIyBVc2luZyB0aGUgc2lnbiBvZiB0aGUgY29lZmZpY2llbnRzIGlkZW50aWZ5IHRoZSAoYSxiLDEpIGRpc3RyaWJ1dGlvblxuYWJbMl0vYWJbMV1cbiMgVXNlIHRoZSBmdW5jdGlvbmFsIGZvcm0gb2YgdGhpcyBkaXN0cmlidXRpb24gdG8gY29tcHV0ZSBwbWYgdXAgdG8gcF8xMDAgZGlyZWN0bHkgXG5wX2NvbXAgPC0gPz9cbiAgIyBDaGVjayBmb3IgemVyby1tb2RpZmljYXRpb25cbnBbPz9dLXBfY29tcFs/P11cbiMgTWVhbiBhbmQgVmFyaWFuY2VcbmMobXU8LXN1bShwX2NvbXAqKD8/OjEwMCkpLHN1bShwX2NvbXAqKD8/OjEwMCleMiktbXVeMilcbiMgQ2hlY2sgdXNpbmcgY2xvc2VkIGZvcm0gZm9ybXVsYWUgZm9yIHRoZSBmaXJzdCB0d28gbW9tZW50c1xuYyg/Pyw/PykiLCJzb2x1dGlvbiI6IkMgPC0gbWF0cml4KGMoMSwxLzMsMSwxLzIpLG5yb3c9MixieXJvdz1UUlVFKSBcblxuYWIgPC0gc29sdmUoQyklKiVjKHBbM10vcFsyXSxwWzJdL3BbMV0pXG4jIFVzaW5nIHRoZSBzaWduIG9mIHRoZSBjb2VmZmljaWVudHMgaWRlbnRpZnkgdGhlIChhLGIsMSkgZGlzdHJpYnV0aW9uXG4jIFNpbmNlIGE+MCBhbmQgYjwwIGl0IGlzIEVUTkIgb3IgTG9nYXJpdGhtaWM7IGhlbmNlIGNvbXB1dGUgYi9hXG5hYlsyXS9hYlsxXVxuIyBTaW5jZSBhYm92ZSBlcXVhbHMgLTEsIHI9MCBhbmQgaXQgaXMgbG9nYXJpdGhtaWNcbiMgcF9rID0gLTEvbG9nKDEtYSkgYV5rL2tcbnBfY29tcCA8LSAtMS9sb2coMS1hYlsxXSkgKmFiWzFdXihrPC0oMToxMDApKS8oaylcbiMgQ2hlY2sgZm9yIHplcm8tbW9kaWZpY2F0aW9uXG5wWzFdLXBfY29tcFsxXVxuIyBNZWFuIGFuZCBWYXJpYW5jZVxuYyhtdSA8LSBzdW0ocF9jb21wKmspLHN1bShwX2NvbXAqa14yKS1tdV4yKVxuIyBDaGVjayB1c2luZyBmb3JtdWxhZSBvZiBsb2dhcml0aG1pYyBtb21lbnRzXG5jKC0xL2xvZygxLWFiWzFdKSAqYWJbMV0vKDEtYWJbMV0pLC0oYWJbMV1eMithYlsxXSpsb2coMS1hYlsxXSkpLygoMS1hYlsxXSleMioobG9nKDEtYWJbMV0pKV4yKSkiLCJzY3QiOiJDbXNnIDwtIFwiRGlkIHlvdSBjb3JyZWN0bHkgc3BlY2lmeSB0aGUgb2JqZWN0IGBDYD9cIlxuZXgoKSAlPiUgY2hlY2tfb2JqZWN0KFwiQ1wiLCB1bmRlZmluZWRfbXNnID0gXCJNYWtlIHN1cmUgdG8gbm90IHJlbW92ZSBgQ2AhXCIpICU+JWNoZWNrX2VxdWFsKGluY29ycmVjdF9tc2c9Q21zZylcbmFibXNnIDwtIFwiRGlkIHlvdSBjb3JyZWN0bHkgc3BlY2lmeSB0aGUgb2JqZWN0IGBhYmA/XCJcbmV4KCkgJT4lIGNoZWNrX29iamVjdChcImFiXCIsIHVuZGVmaW5lZF9tc2cgPSBcIk1ha2Ugc3Vy
ZSB0byBub3QgcmVtb3ZlIGBhYmAhXCIpICU+JWNoZWNrX2VxdWFsKGluY29ycmVjdF9tc2c9YWJtc2cpXG5wX2NvbXBtc2cgPC0gXCJEaWQgeW91IGNvcnJlY3RseSBzcGVjaWZ5IHRoZSBvYmplY3QgYHBfY29tcGA/XCJcbmV4KCkgJT4lIGNoZWNrX29iamVjdChcInBfY29tcFwiLCB1bmRlZmluZWRfbXNnID0gXCJNYWtlIHN1cmUgdG8gbm90IHJlbW92ZSBgcF9jb21wYCFcIikgJT4lY2hlY2tfZXF1YWwoaW5jb3JyZWN0X21zZz1wX2NvbXBtc2cpXG5zdWNjZXNzX21zZyhcIlRlcnJpZmljISBXb3JraW5nIHdpdGggbWF0cmljZXMgY2FuIHNhdmUgY29uc2lkZXJhYmxlIHRpbWUgaW4gaGlnaC1kaW1lbnNpb25hbCBhbmFseXRpY3MgcHJvYmxlbXMuXCIpIiwiaGludCI6IldpdGggc21hbGwgbWF0cmljZXMgYW5kIHZlY3RvcnMsIHRyeSB2YXJpb3VzIG9wZXJhdGlvbnMuIFlvdSB3aWxsIGZpbmQgc3VtbWFyaWVzIG9mICg8ZW0+YSxiPC9lbT4sMSkgZGlzdHJpYnV0aW9ucyBpbiA8YSBocmVmPVwiaHR0cHM6Ly9vcGVuYWN0dGV4dHMuZ2l0aHViLmlvL0xvc3MtRGF0YS1BbmFseXRpY3MvQy1TdW1tYXJ5RGlzdHJpYnV0aW9ucy5odG1sI3RoZS1hYjEtY2xhc3NzXCI+U2VjdGlvbiAxOC4xLnMgb2YgPGVtPkxvc3MgRGF0YSBBbmFseXRpY3M8L2VtPjwvYT4ifQ==

2.5 Mixture Distributions


In this section, you learn how to:

  • Define a mixture distribution when the mixing component is based on a finite number of sub-groups.
  • Compute mixture distribution probabilities from mixing proportions and knowledge of the distribution of each subgroup.
  • Define a mixture distribution when the mixing component is continuous.

Video: Mixture Distributions

Overheads: Mixture Distributions (Click Tab to View)


2.5.1 Exercise. Mixtures of Workers’ Compensation Claims

Assignment Text

You are analyzing a set of workers’ compensation claims (claims that pay in the event of injury at a work-place) and focus on the frequency portion. Suppose it is known that if claims arise from a low-risk class, such as accountants and actuaries working within “four walls,” the number of claims follows a Poisson distribution with parameter \(\lambda=4\). However, if claims arise from a high-risk class, such as roofers and lumberjacks, then the number follows a negative binomial distribution with parameters \(r=4\) and \(\beta=3\). For a particular firm, you do not know whether it is low or high risk, but you do know that the probability of being low-risk is \(\alpha=0.6\).

In this exercise, we will compare the shape of the mixture distribution to the low and high risk distributions.

Instructions

  • Determine the probability mass functions for the low and high risk populations for \(k=0, \ldots, 20\) possible claim outcomes.
  • Compute the corresponding probability mass function for the mixture distribution.
  • Plot the mixture distribution with superimposed lines for the low and high risk populations. Use different colors and plotting symbols for the three distributions to help viewers distinguish among them.
  • Determine distribution functions for the low, high, and mixture distributions.
  • Plot the mixture distribution function with superimposed lines for the low and high risk populations.
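
A quick numerical check of the mixture can be sketched as follows. The parameter values come from the assignment text; the long range \(k = 0, \ldots, 500\) and the names `low_pmf`, `high_pmf`, `mix_pmf`, and `mix_cdf` are assumptions made here so that the truncated tail is negligible. Note that R's dnbinom() uses a success probability, which corresponds to \(1/(1+\beta)\) in this parameterization.

```r
alpha <- 0.6; lambda <- 4   # low-risk weight and Poisson mean
r <- 4; beta <- 3           # negative binomial parameters (high risk)

# Long range of outcomes so that the truncated tail is negligible
k <- 0:500

low_pmf  <- dpois(k, lambda = lambda)
high_pmf <- dnbinom(k, size = r, prob = 1 / (1 + beta))
mix_pmf  <- alpha * low_pmf + (1 - alpha) * high_pmf

# The mixture distribution function is the same weighted average
mix_cdf <- alpha * ppois(k, lambda = lambda) +
  (1 - alpha) * pnbinom(k, size = r, prob = 1 / (1 + beta))

# The mixture mean is the weighted average of the component means
mix_mean <- sum(k * mix_pmf)
c(mix_mean, alpha * lambda + (1 - alpha) * r * beta)
```

Both the probability mass function and the distribution function of the mixture are weighted averages of the component functions, with the same weights \(\alpha\) and \(1-\alpha\).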


eyJsYW5ndWFnZSI6InIiLCJzYW1wbGUiOiJhbHBoYSA8LSAwLjY7IGxhbWJkYSA8LSA/PyAgXG5yIDwtIDQ7ICAgICAgIGJldGEgPC0gPz8gXG5rdmVjIDwtIDA6MjBcblxubG93cmlzayA8LSBkcG9pcyhrdmVjLCBsYW1iZGE9bGFtYmRhKVxuaGlnaHJpc2sgPC0gZG5iaW5vbShrdmVjLCBwcm9iPT8/ICwgc2l6ZSA9ID8/IClcbnBvcHJpc2sgPC0gPz8gXG5cbnBsb3Qoa3ZlYywgcG9wcmlzaywgeWxpbSA9IGMoMCwgLjIpLCB4bGFiID0gXCJOdW1iZXIgb2YgQ2xhaW1zXCIsIHlsYWIgPSBcIlByb2JhYmlsaXR5XCIsIHR5cGUgPSBcImJcIiwgcGNoID0gMTkpXG5saW5lcyhrdmVjLCBsb3dyaXNrLCBjb2wgPSBcImJsdWVcIiwgdHlwZSA9IFwiYlwiKVxubGluZXMoa3ZlYywgaGlnaHJpc2ssIGNvbCA9IFwicmVkXCIsIHR5cGUgPSBcImJcIiwgcGNoID0gMjMpXG5cbmxvd3Jpc2sucCA8LSBwcG9pcyhrdmVjLCBsYW1iZGE9bGFtYmRhKVxuaGlnaHJpc2sucCA8LSBwbmJpbm9tKGt2ZWMsIHByb2I9Pz8gLCBzaXplID0gPz8gKVxucG9wcmlzay5wIDwtID8/IFxuXG5wbG90KGt2ZWMsIHBvcHJpc2sucCwgeWxpbSA9IGMoMCwgMSksIHhsYWIgPSBcIk51bWJlciBvZiBDbGFpbXNcIiwgeWxhYiA9IFwiRGlzdHJpYnV0aW9uIEZ1bmN0aW9uXCIsIHR5cGUgPSBcImJcIiwgcGNoID0gMTkpXG5saW5lcyhrdmVjLCBsb3dyaXNrLnAsIGNvbCA9IFwiYmx1ZVwiLCB0eXBlID0gXCJiXCIpXG5saW5lcyhrdmVjLCBoaWdocmlzay5wLCBjb2wgPSBcInJlZFwiLCB0eXBlID0gXCJiXCIsIHBjaCA9IDIzKSIsInNvbHV0aW9uIjoiYWxwaGEgPC0gMC42OyBsYW1iZGEgPC0gNDsgXG5yIDwtIDQ7ICAgICAgIGJldGEgPC0gM1xua3ZlYyA8LSAwOjIwXG5cbmxvd3Jpc2sgPC0gZHBvaXMoa3ZlYywgbGFtYmRhPWxhbWJkYSlcbmhpZ2hyaXNrIDwtIGRuYmlub20oa3ZlYywgcHJvYj0xLygxK2JldGEpLCBzaXplID0gcilcbnBvcHJpc2sgPC0gYWxwaGEqbG93cmlzayArICgxLWFscGhhKSpoaWdocmlza1xuXG5wbG90KGt2ZWMsIHBvcHJpc2ssIHlsaW0gPSBjKDAsIC4yKSwgeGxhYiA9IFwiTnVtYmVyIG9mIENsYWltc1wiLCB5bGFiID0gXCJQcm9iYWJpbGl0eVwiLCB0eXBlID0gXCJiXCIsIHBjaCA9IDE5KVxubGluZXMoa3ZlYywgbG93cmlzaywgY29sID0gXCJibHVlXCIsIHR5cGUgPSBcImJcIilcbmxpbmVzKGt2ZWMsIGhpZ2hyaXNrLCBjb2wgPSBcInJlZFwiLCB0eXBlID0gXCJiXCIsIHBjaCA9IDIzKVxuXG5sb3dyaXNrLnAgPC0gcHBvaXMoa3ZlYywgbGFtYmRhPWxhbWJkYSlcbmhpZ2hyaXNrLnAgPC0gcG5iaW5vbShrdmVjLCBwcm9iPTEvKDErYmV0YSksIHNpemUgPSByKVxucG9wcmlzay5wIDwtIGFscGhhKmxvd3Jpc2sucCArICgxLWFscGhhKSpoaWdocmlzay5wXG5cbnBsb3Qoa3ZlYywgcG9wcmlzay5wLCB5bGltID0gYygwLCAxKSwgeGxhYiA9IFwiTnVtYmVyIG9mIENsYWltc1wiLCB5bGFiID0gXCJEaXN0cmlidXRpb24gRnVuY3Rpb25cIiwg
dHlwZSA9IFwiYlwiLCBwY2ggPSAxOSlcbmxpbmVzKGt2ZWMsIGxvd3Jpc2sucCwgY29sID0gXCJibHVlXCIsIHR5cGUgPSBcImJcIilcbmxpbmVzKGt2ZWMsIGhpZ2hyaXNrLnAsIGNvbCA9IFwicmVkXCIsIHR5cGUgPSBcImJcIiwgcGNoID0gMjMpIiwic2N0IjoibGFtYmRhbXNnIDwtIFwiRGlkIHlvdSBjb3JyZWN0bHkgc3BlY2lmeSB0aGUgb2JqZWN0IGBsYW1iZGFgP1wiXG5leCgpICU+JSBjaGVja19vYmplY3QoXCJsYW1iZGFcIiwgdW5kZWZpbmVkX21zZyA9IFwiTWFrZSBzdXJlIHRvIG5vdCByZW1vdmUgYGxhbWJkYWAhXCIpICU+JWNoZWNrX2VxdWFsKGluY29ycmVjdF9tc2c9bGFtYmRhbXNnKVxuYmV0YW1zZyA8LSBcIkRpZCB5b3UgY29ycmVjdGx5IHNwZWNpZnkgdGhlIG9iamVjdCBgYmV0YWA/XCJcbmV4KCkgJT4lIGNoZWNrX29iamVjdChcImJldGFcIiwgdW5kZWZpbmVkX21zZyA9IFwiTWFrZSBzdXJlIHRvIG5vdCByZW1vdmUgYGJldGFgIVwiKSAlPiVjaGVja19lcXVhbChpbmNvcnJlY3RfbXNnPWJldGFtc2cpXG5oaWdocmlza21zZyA8LSBcIkRpZCB5b3UgY29ycmVjdGx5IHNwZWNpZnkgdGhlIG9iamVjdCBgaGlnaHJpc2tgP1wiXG5leCgpICU+JSBjaGVja19vYmplY3QoXCJoaWdocmlza1wiLCB1bmRlZmluZWRfbXNnID0gXCJNYWtlIHN1cmUgdG8gbm90IHJlbW92ZSBgaGlnaHJpc2tgIVwiKSAlPiVjaGVja19lcXVhbChpbmNvcnJlY3RfbXNnPWhpZ2hyaXNrbXNnKVxucG9wcmlza21zZyA8LSBcIkRpZCB5b3UgY29ycmVjdGx5IHNwZWNpZnkgdGhlIG9iamVjdCBgcG9wcmlza2A/XCJcbmV4KCkgJT4lIGNoZWNrX29iamVjdChcInBvcHJpc2tcIiwgdW5kZWZpbmVkX21zZyA9IFwiTWFrZSBzdXJlIHRvIG5vdCByZW1vdmUgYHBvcHJpc2tgIVwiKSAlPiVjaGVja19lcXVhbChpbmNvcnJlY3RfbXNnPXBvcHJpc2ttc2cpXG5oaWdocmlzay5wbXNnIDwtIFwiRGlkIHlvdSBjb3JyZWN0bHkgc3BlY2lmeSB0aGUgb2JqZWN0IGBoaWdocmlzay5wYD9cIlxuZXgoKSAlPiUgY2hlY2tfb2JqZWN0KFwiaGlnaHJpc2sucFwiLCB1bmRlZmluZWRfbXNnID0gXCJNYWtlIHN1cmUgdG8gbm90IHJlbW92ZSBgaGlnaHJpc2sucGAhXCIpICU+JWNoZWNrX2VxdWFsKGluY29ycmVjdF9tc2c9aGlnaHJpc2sucG1zZylcbnBvcHJpc2sucG1zZyA8LSBcIkRpZCB5b3UgY29ycmVjdGx5IHNwZWNpZnkgdGhlIG9iamVjdCBgcG9wcmlzay5wYD9cIlxuZXgoKSAlPiUgY2hlY2tfb2JqZWN0KFwicG9wcmlzay5wXCIsIHVuZGVmaW5lZF9tc2cgPSBcIk1ha2Ugc3VyZSB0byBub3QgcmVtb3ZlIGBwb3ByaXNrLnBgIVwiKSAlPiVjaGVja19lcXVhbChpbmNvcnJlY3RfbXNnPXBvcHJpc2sucG1zZylcblxuc3VjY2Vzc19tc2coXCJTdXBlcmIhIERldGVybWluaW5nIG1peHR1cmUgZGlzdHJpYnV0aW9ucyBhcmUgdXN1YWxseSBkaWZmaWN1bHQgdG8gZG8gYnkgaGFuZCBidXQgYXJlIHN0cmFpZ2h0Zm9yd2FyZCB3aXRoIGNvbXB1dGF0aW9uYWwgdG9v
bHMgc3VjaCBhcyAnUicuIEluc3VyYW5jZSBhbmFseXN0cyBjb250aW51YWxseSBmcmV0IGFib3V0IHVub2JzZXJ2ZWQgY2hhcmFjdGVyaXN0aWNzIChzdWNoIGFzIGxvdyB2ZXJzdXMgaGlnaCByaXNrKSBhbmQgbWl4dHVyZSBkaXN0cmlidXRpb25zIGlzIGEgdG9vbCBvZnRlbiB1c2VkIHRvIGhlbHAgcXVhbnRpZnkgdGhlc2UgdW5vYnNlcnZlZCBwaWVjZXMgb2YgaW5mb3JtYXRpb24uIFwiKSIsImhpbnQiOiJTZWUgdGhlIDxhIGhyZWY9XCJodHRwczovL29wZW5hY3R0ZXh0cy5naXRodWIuaW8vTG9zcy1EYXRhLUFuYWx5dGljcy9DLVN1bW1hcnlEaXN0cmlidXRpb25zLmh0bWwjZGlzY3JldGUtZGlzdHJpYnV0aW9uc1wiPmFwcGVuZGl4IG9mIHRoZSBMb3NzIERhdGEgQW5hbHl0aWNzPC9hPiBcbiAgIGZvciBjb2RlIG9uIHVzaW5nIHBhcmFtZXRlcnMgaW4gUi4ifQ==

2.5.2 Exercise. Finite Number of Mixture Distributions

Assignment Text

The following describes a “classic” actuarial exam problem. We use this problem to motivate an introduction of more complex techniques for calculating mixture distributions. Unlike classic exam problems designed for hand calculations, these techniques can readily be extended to a large number of unobserved sub-populations.

In a certain town the number of common colds an individual will get in a year follows a Poisson distribution that depends on the individual’s age and smoking status:

\[ {\small \begin{array}{l|cc} \hline & \text{Proportion of population} & \text{Mean number of colds} \\ \hline \text{Children} & 0.3 & 3 \\ \text{Adult Non-Smokers} & 0.6 & 1 \\ \text{Adult Smokers} & 0.1 & 4 \\\hline \end{array} } \]

In this exercise, we will use R to calculate the probabilities that a randomly drawn person has a given number of colds in a year.

Instructions

  • Create a vector of proportions \(\alpha\) and a vector of Poisson parameters \(\lambda\).
  • Use the function dpois() to obtain a vector of Poisson probability mass function (pmf) values, one for each mean, for \(k=3\) colds. Then, use the matrix operation %*% to obtain the mixture pmf as the inner product of the two vectors containing the Poisson pmfs and population proportions.
  • In the same way, use the ppois() function to compute the probability of at most 3 colds within a year.
  • Now, consider \(k=0, \ldots, 8\) colds during a year. For each value of \(k\), determine the probability of \(k\) colds within a year.
  • Provide a barplot() of the distribution of number of colds during a year over the range \(k=0, \ldots, 8\).
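
One way to compute all of these probabilities at once, without a loop, is sketched below. This is an alternative to looping over \(k\) and is offered only as an illustration; the names `lambda_vec`, `pmf_by_class`, and `probs` are introduced here. The proportions and means are taken from the table above.

```r
alpha      <- c(0.3, 0.6, 0.1)   # population proportions
lambda_vec <- c(3, 1, 4)         # mean number of colds by class
kvec       <- 0:8

# Matrix of Poisson pmfs: one row per k, one column per class
pmf_by_class <- outer(kvec, lambda_vec, dpois)

# Mixture pmf: each row weighted by the population proportions
probs <- as.vector(pmf_by_class %*% alpha)
names(probs) <- kvec

barplot(probs, xlab = "Number of Colds", ylab = "Probability")
```

Each row of `pmf_by_class` holds the three class-specific probabilities of \(k\) colds, so the matrix product with `alpha` gives the mixture probability for every \(k\) in one step. The same pattern extends to any finite number of sub-populations.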


eyJsYW5ndWFnZSI6InIiLCJzYW1wbGUiOiJhbHBoYSA8LSBjKDAuMywgMC42LCAwLjEpXG5sYW1iZGEudmVjPC0gPz9cblxuIyBQcm9iYWJpbGl0eSBvZiBoYXZpbmcgMyBjb21tb24gY29sZHMgaW4gYSB5ZWFyXG5ieXJpc2sgPC0gZHBvaXMoPz8sIGxhbWJkYT1sYW1iZGEudmVjKVxuYnlyaXNrICAlKiUgYWxwaGFcblxuIyBQcm9iYWJpbGl0eSBvZiBhdCBtb3N0IDMgY29tbW9uIGNvbGRzIGluIGEgeWVhclxucHBvaXMocT0zLCBsYW1iZGE9bGFtYmRhLnZlYykgJSolID8/XG5cbmt2ZWMgPSAwOjhcbnByb2JzID0gcmVwKDAsbGVuZ3RoKGt2ZWMpKVxuZm9yIChpbmRleCBpbiBrdmVjKSB7cHJvYnNbaW5kZXgrMV0gPSA/PyhpbmRleCwgbGFtYmRhPWxhbWJkYS52ZWMpICUqJSBhbHBoYX1cblxuYmFycGxvdChwcm9icywgeGxhYiA9IFwiTnVtYmVyIG9mIENsYWltc1wiLCBuYW1lcy5hcmcgPSBrdmVjKSIsInNvbHV0aW9uIjoiYWxwaGEgPC0gYygwLjMsIDAuNiwgMC4xKVxubGFtYmRhLnZlYzwtIGMoMywgMSwgNClcbmJ5cmlzayA8LSBkcG9pcygzLCBsYW1iZGE9bGFtYmRhLnZlYylcbiMgUHJvYmFiaWxpdHkgb2YgaGF2aW5nIDMgY29tbW9uIGNvbGRzIGluIGEgeWVhclxuYnlyaXNrICAlKiUgYWxwaGFcblxuIyBQcm9iYWJpbGl0eSBvZiBhdCBtb3N0IDMgY29tbW9uIGNvbGRzIGluIGEgeWVhclxucHBvaXMocT0zLCBsYW1iZGE9bGFtYmRhLnZlYykgJSolIGFscGhhXG5cbmt2ZWMgPSAwOjhcbnByb2JzID0gcmVwKDAsbGVuZ3RoKGt2ZWMpKVxuZm9yIChpbmRleCBpbiBrdmVjKSB7cHJvYnNbaW5kZXgrMV0gPSAgZHBvaXMoaW5kZXgsIGxhbWJkYT1sYW1iZGEudmVjKSAlKiUgYWxwaGF9XG5cbmJhcnBsb3QocHJvYnMsIHhsYWIgPSBcIk51bWJlciBvZiBDbGFpbXNcIiwgbmFtZXMuYXJnID0ga3ZlYykiLCJzY3QiOiJsYW1iZGEudmVjbXNnIDwtIFwiRGlkIHlvdSBjb3JyZWN0bHkgc3BlY2lmeSB0aGUgb2JqZWN0IGBsYW1iZGEudmVjYD9cIlxuZXgoKSAlPiUgY2hlY2tfb2JqZWN0KFwibGFtYmRhLnZlY1wiLCB1bmRlZmluZWRfbXNnID0gXCJNYWtlIHN1cmUgdG8gbm90IHJlbW92ZSBgbGFtYmRhLnZlY2AhXCIpICU+JWNoZWNrX2VxdWFsKGluY29ycmVjdF9tc2c9bGFtYmRhLnZlY21zZylcbmJ5cmlza21zZyA8LSBcIkRpZCB5b3UgY29ycmVjdGx5IHNwZWNpZnkgdGhlIG9iamVjdCBgYnlyaXNrYD9cIlxuZXgoKSAlPiUgY2hlY2tfb2JqZWN0KFwiYnlyaXNrXCIsIHVuZGVmaW5lZF9tc2cgPSBcIk1ha2Ugc3VyZSB0byBub3QgcmVtb3ZlIGBieXJpc2tgIVwiKSAlPiVjaGVja19lcXVhbChpbmNvcnJlY3RfbXNnPWJ5cmlza21zZylcblxuc3VjY2Vzc19tc2coXCJTdXBlcmIhIFRoaXMgZXhlcmNpc2UgZXhwbGljaXRseSBpbmNsdWRlcyBvbmx5IHRocmVlIHN1Yi1wb3B1bGF0aW9ucyBidXQgaG9wZWZ1bGx5IGl0IGlzIGFwcGFyZW50IGhvdyBpdCBjb3VsZCBiZSBleHRlbmRlZCB0byBhIGxhcmdlIG51bWJlciBvZiBzdWItcG9w
dWxhdGlvbnMuIEluIHRoZSBuZXh0IGV4ZXJjaXNlLCB3ZSBjb25zaWRlciBhbiBpbmZpbml0ZSBudW1iZXIhXCIpIiwiaGludCI6IlRha2Ugc29tZSB0aW1lIHRvIGV4cGxvcmUgdGhlIG9ubGluZSA8Y29kZT5SPC9jb2RlPiBkb2N1bWVudGF0aW9uLiJ9

2.5.3 Exercise. Gamma Mixture of Poissons

Assignment Text

For a population, suppose that each risk has a Poisson number of claims with a parameter \(\lambda\) that is specific to that risk (an infinite number of risk classes). We can think of the risk parameter as following a distribution, so that it is itself random, denoted by a capital \(\Lambda\). A mathematically convenient assumption is that the risk parameter follows a gamma distribution. As we have learned from the text, a gamma mixture of Poissons turns out to have a negative binomial distribution. More precisely, if \(N|\Lambda \sim\) Poisson\((\Lambda)\) and \(\Lambda \sim \text{gamma}(\alpha, \theta)\), then \(N \sim \text{Negative Binomial}\) \((r = \alpha, \beta = \theta)\). For example, one can determine the probability mass function of \(N\) as

\[ \Pr(N=k) = \int^{\infty}_0 e^{-\lambda} \frac{\lambda^k}{k!} ~ g(\lambda;\alpha, \beta = \theta )~ d \lambda, \]

where \(g(\cdot;\alpha, \beta = \theta )\) is a gamma density. The proof of this result is in the text; here, we check it using R, in the special case of \(k=3\), \(\alpha =3\), and \(\theta = 4\).
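
As a reminder of why the result holds, here is a brief sketch of the calculation, assuming the gamma density is parameterized with shape \(\alpha\) and scale \(\theta\), that is, \(g(\lambda) = \lambda^{\alpha-1} e^{-\lambda/\theta} / \left(\Gamma(\alpha)\theta^{\alpha}\right)\):

\[ \Pr(N=k) = \int^{\infty}_0 e^{-\lambda} \frac{\lambda^k}{k!} ~ \frac{\lambda^{\alpha-1} e^{-\lambda/\theta}}{\Gamma(\alpha)\theta^{\alpha}} ~ d \lambda = \frac{1}{k! ~\Gamma(\alpha)\theta^{\alpha}} \int^{\infty}_0 \lambda^{k+\alpha-1} e^{-\lambda (1+\theta)/\theta} ~ d \lambda = \frac{\Gamma(k+\alpha)}{k! ~\Gamma(\alpha)} \left(\frac{1}{1+\theta}\right)^{\alpha} \left(\frac{\theta}{1+\theta}\right)^{k} . \]

The last expression is the negative binomial probability mass function with \(r = \alpha\) and \(\beta = \theta\).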

Instructions

  • Establish the parameter values for \(\alpha =3\) and \(\theta = 4\).
  • Express the conditional Poisson mass function as a function of the parameter \(\lambda\) (not the number of outcomes \(k\)). The sample code calls this argument “lambda.arg”, for the lambda argument.
  • Express the product of the conditional Poisson mass function and the gamma density as a function of \(\lambda\).
  • Use integrate() to integrate this product over values of \(\lambda\). Check the result by using the negative binomial probability mass function.
  • Repeat this process using distribution functions in lieu of probability mass functions. Specifically, express the product of the conditional Poisson distribution function and the gamma density as a function of \(\lambda\). Integrate this and check the result using the negative binomial distribution function.
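
The steps above can be sketched as follows. This is a minimal illustration rather than the exercise solution: the values shape.ex, scale.ex, and k.ex, and the function names, are hypothetical choices that deliberately differ from the \(k=3\), \(\alpha=3\), \(\theta=4\) case of the exercise.

```r
# Illustrative values only (not the exercise's alpha = 3, theta = 4, k = 3)
shape.ex <- 2      # gamma shape, plays the role of alpha
scale.ex <- 0.5    # gamma scale, plays the role of theta
k.ex     <- 2      # number of claims at which to evaluate

# Poisson pmf at k.ex times the gamma density, as a function of lambda
pmf.integrand <- function(lambda.arg) {
  dpois(k.ex, lambda = lambda.arg) *
    dgamma(lambda.arg, shape = shape.ex, scale = scale.ex)
}
mix.pmf <- integrate(pmf.integrand, lower = 0, upper = Inf)$value
nb.pmf  <- dnbinom(k.ex, size = shape.ex, prob = 1 / (1 + scale.ex))

# Same idea with the Poisson distribution function in place of the pmf;
# the gamma factor stays a density because we still integrate over lambda
cdf.integrand <- function(lambda.arg) {
  ppois(k.ex, lambda = lambda.arg) *
    dgamma(lambda.arg, shape = shape.ex, scale = scale.ex)
}
mix.cdf <- integrate(cdf.integrand, lower = 0, upper = Inf)$value
nb.cdf  <- pnbinom(k.ex, size = shape.ex, prob = 1 / (1 + scale.ex))

# Each numerically integrated value should agree with its negative binomial counterpart
c(mix.pmf = mix.pmf, nb.pmf = nb.pmf, mix.cdf = mix.cdf, nb.cdf = nb.cdf)
```

Because only the gamma density would need to be swapped out, the same pattern applies to other mixing distributions for which no closed form is available.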


eyJsYW5ndWFnZSI6InIiLCJzYW1wbGUiOiJhbHBoYSA9ID8/OyAgICAgdGhldGEgPSA/P1xuXG5wZGZQb2lzc29uIDwtIGZ1bmN0aW9uKGxhbWJkYS5hcmcpe2Rwb2lzKDMsIGxhbWJkYT1sYW1iZGEuYXJnKX1cbnBkZmdhbVBvaSA8LSBmdW5jdGlvbihsYW1iZGEuYXJnKXtkZ2FtbWEobGFtYmRhLmFyZywgc2hhcGUgPSBhbHBoYSwgc2NhbGUgPSB0aGV0YSkqcGRmUG9pc3NvbihsYW1iZGEuYXJnKX1cbmludGVncmF0ZShwZGZnYW1Qb2ksIGxvd2VyID0gMCwgdXBwZXIgPSBJbmYpJHZhbHVlIFxuXG5kbmJpbm9tKDMsIHByb2I9MS8oMSt0aGV0YSksIHNpemUgPSBhbHBoYSlcblxucGRmZ2FtUG9pLnAgPC0gZnVuY3Rpb24obGFtYmRhLmFyZyl7ZGdhbW1hKD8/LCBzaGFwZSA9IGFscGhhLCBzY2FsZSA9IHRoZXRhKSpwcG9pcygzLCBsYW1iZGE9Pz8pfVxuaW50ZWdyYXRlKHBkZmdhbVBvaS5wLCBsb3dlciA9IDAsIHVwcGVyID0gPz8pJHZhbHVlIFxuXG5wbmJpbm9tKD8/LCBwcm9iPT8/LCBzaXplID0gPz8pIiwic29sdXRpb24iOiJhbHBoYSA9IDNcbnRoZXRhID0gNFxuXG5wZGZQb2lzc29uIDwtIGZ1bmN0aW9uKGxhbWJkYS5hcmcpe2Rwb2lzKDMsIGxhbWJkYT1sYW1iZGEuYXJnKX1cbnBkZmdhbVBvaSA8LSBmdW5jdGlvbihsYW1iZGEuYXJnKXtkZ2FtbWEobGFtYmRhLmFyZywgc2hhcGUgPSBhbHBoYSwgc2NhbGUgPSB0aGV0YSkqcGRmUG9pc3NvbihsYW1iZGEuYXJnKX1cbmludGVncmF0ZShwZGZnYW1Qb2ksIGxvd2VyID0gMCwgdXBwZXIgPSBJbmYpJHZhbHVlIFxuXG5kbmJpbm9tKDMsIHByb2I9MS8oMSt0aGV0YSksIHNpemUgPSBhbHBoYSlcblxucGRmZ2FtUG9pLnAgPC0gZnVuY3Rpb24obGFtYmRhLmFyZyl7ZGdhbW1hKGxhbWJkYS5hcmcsIHNoYXBlID0gYWxwaGEsIHNjYWxlID0gdGhldGEpKnBwb2lzKDMsIGxhbWJkYT1sYW1iZGEuYXJnKX1cbmludGVncmF0ZShwZGZnYW1Qb2kucCwgbG93ZXIgPSAwLCB1cHBlciA9IEluZikkdmFsdWUgXG5cbnBuYmlub20oMywgcHJvYj0xLygxK3RoZXRhKSwgc2l6ZSA9IGFscGhhKSIsInNjdCI6ImFscGhhbXNnIDwtIFwiRGlkIHlvdSBjb3JyZWN0bHkgc3BlY2lmeSB0aGUgb2JqZWN0IGBhbHBoYWA/XCJcbmV4KCkgJT4lIGNoZWNrX29iamVjdChcImFscGhhXCIsIHVuZGVmaW5lZF9tc2cgPSBcIk1ha2Ugc3VyZSB0byBub3QgcmVtb3ZlIGBhbHBoYWAhXCIpICU+JWNoZWNrX2VxdWFsKGluY29ycmVjdF9tc2c9YWxwaGFtc2cpXG50aGV0YW1zZyA8LSBcIkRpZCB5b3UgY29ycmVjdGx5IHNwZWNpZnkgdGhlIG9iamVjdCBgdGhldGFgP1wiXG5leCgpICU+JSBjaGVja19vYmplY3QoXCJ0aGV0YVwiLCB1bmRlZmluZWRfbXNnID0gXCJNYWtlIHN1cmUgdG8gbm90IHJlbW92ZSBgdGhldGFgIVwiKSAlPiVjaGVja19lcXVhbChpbmNvcnJlY3RfbXNnPXRoZXRhbXNnKVxucGRmZ2FtUG9pLnBtc2cgPC0gXCJEaWQgeW91IGNvcnJlY3RseSBzcGVjaWZ5IHRoZSBvYmplY3QgYHBkZmdhbVBvaS5w
YD9cIlxuZXgoKSAlPiUgY2hlY2tfb2JqZWN0KFwicGRmZ2FtUG9pLnBcIiwgdW5kZWZpbmVkX21zZyA9IFwiTWFrZSBzdXJlIHRvIG5vdCByZW1vdmUgYHBkZmdhbVBvaS5wYCFcIikgJT4lY2hlY2tfZXF1YWwoaW5jb3JyZWN0X21zZz1wZGZnYW1Qb2kucG1zZylcbnBuYmlub21tc2cgPC0gXCJDaGVjayB0aGUgcGFyYW1ldGVycyBvZiB0aGUgbmVnYXRpdmUgYmlub21pYWwgZGlzdHJpYnV0aW9uLlwiXG5leCgpICU+JSBjaGVja19mdW5jdGlvbihcInBuYmlub21cIikgJT4lIGNoZWNrX3Jlc3VsdCgpICU+JSBjaGVja19lcXVhbChpbmNvcnJlY3RfbXNnPXBuYmlub21tc2cpXG5cbnN1Y2Nlc3NfbXNnKFwiRXhjZWxsZW50ISBBbiBpbXBvcnRhbnQgc3RyZW5ndGggb2YgdGhpcyBjb21wdXRhdGlvbmFsIGFwcHJvYWNoIChhcyB3ZSB3aWxsIHNlZSBtb3JlIGluIHRoZSBCYXllc2lhbiBzZWN0aW9uKSBpcyB0aGF0IHdlIG5vIGxvbmdlciBhcmUgbGltaXRlZCB0byBzaW1wbHkgZ2FtbWEgbWl4aW5nIGRpc3RyaWJ1dGlvbnMuIEdhbW1hcyBhcmUgdGVycmlmaWMgZm9yIGdldHRpbmcgY2xvc2VkIGZvcm0gbmVnYXRpdmUgYmlub21pYWwgZGlzdHJpYnV0aW9ucyBidXQsIGlmIHdlIG9ubHkgd2FudCBudW1lcmljYWwgcmVzdWx0cywgdGhlbiB3ZSBoYXZlIG1hbnkgbW9yZSBjaG9pY2VzLlwiKSIsImhpbnQiOiJGb3IgdGhlIGxhc3QgcGFydCBvZiB0aGlzIHByb2JsZW0sIGl0IGlzIGVhc3kgdG8gZ2V0IGNvbmZ1c2VkIGFib3V0IHdoaWNoIGlzIGEgZGYgYW5kIHdoaWNoIGlzIGEgPGVtPnBtZjwvZW0+IChvciA8ZW0+cGRmPC9lbT4pLiBSZW1lbWJlciwgd2UgYXJlIGludGVncmF0aW5nIG92ZXIgZGlmZmVyZW50IHZhbHVlcyBvZiBsYW1iZGEgYW5kIHRoaXMgZGlzdHJpYnV0aW9uIGlzIGRldGVybWluZWQgYnkgdGhlIGdhbW1hIHByb2Nlc3MuIFNvLCB0aGUgZ2FtbWEgc3RheXMgYXMgYSA8ZW0+cGRmPC9lbT4uIn0=

2.6 Goodness of Fit


In this section, you learn how to:

  • Calculate a goodness of fit statistic to compare a hypothesized discrete distribution to a sample of discrete observations.
  • Compare the statistic to a reference distribution to assess the adequacy of the fit.

Video: Goodness of Fit

Overheads: Goodness of Fit (Click Tab to View)


2.6.1 Exercise. Goodness of Fit: Zero-Modified Poisson

Assignment Text

The General Insurance Association of Singapore provides a dataset on a 1993 portfolio of 7,483 automobile insurance policies from a major Singaporean insurance company. The dataset contains several characteristics that help explain automobile claim frequency. The claim frequency is contained in the field Clm_Count, and the data set has already been read into a data frame called InSample and made available to you. You can learn more about the data set at Singapore Auto Claims (see description of Table 19: Singapore Auto Claims on page 21).

In this assignment, you are asked to fit a zero-modified Poisson distribution by maximum likelihood estimation (mle) and test the goodness of fit. In a sense, it is a continuation of the example discussed in Section 2.7 of Loss Data Analytics, where the inadequacy of the Poisson model was observed.

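The mle of \(p_0\) is simply the number of zeros divided by the sample size. Let \(m_k\) denote the number of observations equal to \(k\) and let \(n = \sum_{k\geq0} m_k\) be the sample size. The mle of \(\lambda\) turns out to be the solution of the equation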

\[ \frac{{\lambda}}{1-\exp({-\lambda})}=\frac{\sum_{k\geq0} k \cdot m_k}{n-m_0}. \]
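
As a small illustration of solving this equation numerically, the sketch below uses a hypothetical frequency table (counts.ex and m.ex are made-up values, not the Singapore data). It relies on the fact that \(\lambda/(1-e^{-\lambda})\) tends to 1 as \(\lambda \to 0\) and exceeds \(\lambda\) for \(\lambda > 0\), so the root lies between 0 and the right-hand side of the equation.

```r
# Hypothetical frequency table: m.ex[k + 1] policies reported k claims
counts.ex <- 0:3
m.ex      <- c(700, 200, 80, 20)
n.ex      <- sum(m.ex)

# MLE of p_0: proportion of zeros in the sample
p0.hat <- m.ex[1] / n.ex

# Right-hand side of the equation: average claim count among nonzero observations
rhs <- sum(counts.ex * m.ex) / (n.ex - m.ex[1])

# Left-hand side minus right-hand side; the lam == 0 case uses the limiting value 1
mle.eqn <- function(lam) {
  ifelse(lam == 0, 1, lam / (1 - exp(-lam))) - rhs
}

# The function is non-positive at 0 and positive at rhs, so this interval brackets the root
lambda.hat <- uniroot(mle.eqn, interval = c(0, rhs))$root
c(p0.hat = p0.hat, lambda.hat = lambda.hat)
```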

You can learn more about the development of the mle below.

Verify Development of the MLE derivation for zero-modified Poisson

Instructions

  • To get a feel for the data, start by generating the frequency table.
  • Store the distinct observed claim counts in an array named values.
  • Store the frequency of claim counts in \(\bf m.vec\).
  • Calculate the sample mean of claim counts and the mle for \(p_0\).
  • Code the function that provides the framework for determining the mle of \(\lambda\).
  • Find the mle of \(\lambda\) by solving for the root of the function using uniroot(). (This function searches the interval from lower to upper for a root (i.e., zero) of the function f with respect to its first argument.)
  • Construct a 5 by 2 matrix whose two columns contain the observed and estimated probabilities, with one row for each of the five bins \(\{0\}, \{1\},\) \(\{2\}, \{3\}, \{4,5,...\}\).
  • Use a barplot to compare observed and estimated probabilities.
  • Compute the chi-square statistic and the 95th percentile of the appropriate chi-square distribution.
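
The last three steps can be sketched on made-up numbers as follows. The values p0.ex and lam.ex and the counts in obs.count are hypothetical stand-ins for the estimates and data of the exercise. The degrees of freedom are taken as the number of bins minus one, minus the two estimated parameters (\(p_0\) and \(\lambda\)).

```r
# Hypothetical parameter estimates and observed counts for bins {0},{1},{2},{3},{4,5,...}
p0.ex     <- 0.70
lam.ex    <- 0.6
obs.count <- c(700, 200, 80, 20, 0)
n.ex      <- sum(obs.count)

# Zero-truncated Poisson probabilities for 1, 2, 3
trunc.p <- dpois(1:3, lam.ex) / (1 - exp(-lam.ex))

# Fitted zero-modified Poisson probabilities; the last bin takes the remaining mass
fit.p <- c(p0.ex, (1 - p0.ex) * trunc.p, (1 - p0.ex) * (1 - sum(trunc.p)))
obs.p <- obs.count / n.ex

# Five rows (bins) and two columns (observed, fitted)
data.fit.ex <- cbind(observed = obs.p, fitted = fit.p)

# Pearson chi-square statistic and the 95th percentile of the reference distribution
chi.stat <- n.ex * sum((obs.p - fit.p)^2 / fit.p)
df.ex    <- length(fit.p) - 1 - 2
crit.val <- qchisq(0.95, df = df.ex)
c(chi.stat = chi.stat, crit.val = crit.val)
```

A statistic larger than the critical value indicates that the fitted distribution does not describe the data adequately at the 5% level.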


eyJsYW5ndWFnZSI6InIiLCJwcmVfZXhlcmNpc2VfY29kZSI6IkluU2FtcGxlIDwtIHJlYWQuY3N2KFwiaHR0cHM6Ly9yYXcuZ2l0aHVidXNlcmNvbnRlbnQuY29tL09wZW5BY3RUZXh0cy9MREFDb3Vyc2UxL21haW4vRGF0YS9TaW5nYXBvcmVBdXRvLmNzdlwiLCBoZWFkZXI9VCxuYS5zdHJpbmdzPWMoXCIuXCIpLHN0cmluZ3NBc0ZhY3RvcnM9RkFMU0UpIiwic2FtcGxlIjoidGFibGUoSW5TYW1wbGUkPz8pXG4jIFN0b3JlIGRpc3RpbmN0IGNsYWltIGNvdW50cyBvYnNlcnZlZCBpbiBhcnJheSBuYW1lZCB2YWx1ZXNcbnZhbHVlcyA8LSBhcy5pbnRlZ2VyKG5hbWVzKHRhYmxlKD8/KSkpXG4jIFN0b3JlIHRoZSBmcmVxdWVuY3kgb2YgY2xhaW0gY291bnRzIGluIG0udmVjXG5tLnZlYyA8LSBhcy52ZWN0b3IodGFibGUoPz8pKVxuIyBDYWxjdWxhdGUgc2FtcGxlIG1lYW4gb2YgY2xhaW0gY291bnRzXG54YmFyIDwtIHN1bSg/Pyo/Pykvc3VtKD8/KVxuIyBNTEUgZm9yIHBfMFxucF8wX01MRSA8LSBtLnZlY1s/P10vc3VtKD8/KVxuIyBNTEUgZm9yIGxhbWJkYVxuTUxFX2VxbiA8LSBmdW5jdGlvbihsYW0pe1xuICBpZmVsc2UobGFtPT0wLD8/LGxhbS8oMS1leHAoLWxhbSkpKS14YmFyLygxLXBfMF9NTEUpXG59XG5sYW1iZGFfTUxFIDwtIHVuaXJvb3QoPz8sYygwLHhiYXIvKDEtcF8wX01MRSkpKSRyb290XG4jIENvbnN0cnVjdGluZyBhIDJ4MSB2ZWN0b3IgY29udGFpbmluZyB0aGUgb2JzZXJ2ZWQgYW5kIGVzdGltYXRlZCBwcm9iYWJpbGl0aWVzIFxuIyBUaGUgZml2ZSBiaW5zIGFyZSB7MH0sIHsxfSwgezJ9LCB7M30sIHs0LDUsLi4ufVxuZGF0YV9maXQgPC0gY2JpbmQoYyhtLnZlYy9zdW0obS52ZWMpLDApLCBjKHBfMF9NTEUsKDEtcF8wX01MRSkvKDEtZXhwKC1sYW1iZGFfTUxFKSkqZHBvaXModmFsdWVzWzI6bGVuZ3RoKHZhbHVlcyldLGxhbWJkYV9NTEUpLFxuKDEtcF8wX01MRSkvKDEtZXhwKC1sYW1iZGFfTUxFKSkqKDEtZXhwKC1sYW1iZGFfTUxFKS1zdW0oZHBvaXModmFsdWVzWzI6bGVuZ3RoKHZhbHVlcyldLGxhbWJkYV9NTEUpKSkgKSApXG4jYmFycGxvdFxuYmFycGxvdCh0KGRhdGFfZml0KSxuYW1lcy5hcmc9Yyh2YWx1ZXMsXCI+PTRcIikseWxhYj1cIkZyZXF1ZW5jeVwiLHhsYWI9XCJDb3VudFwiLGJlc2lkZT1ULGNvbD1jKFwiYmxhY2tcIixcImJsdWVcIikseWxpbT1jKDAsMSkpXG5sZWdlbmQoNSwwLjgsIGMoXCJPYnNlcnZlZFwiLCBcIkZpdHRlZFwiKSwgaG9yaXogPSBULCBjb2w9YyhcImJsYWNrXCIsXCJibHVlXCIpLCBmaWxsPWMoXCJibGFja1wiLFwiYmx1ZVwiKSlcbiMgQ29tcHV0ZSB0aGUgY2hpLXNxdWFyZSBzdGF0aXN0aWNcbiMgQW5kIDk1dGgtJWlsZSBvZiB0aGUgYXBwcm9wcmlhdGUgY2hpLSBzcXVhcmUgZGlzdHJpYnV0aW9uXG5jKHFjaGlzcSg/Pyw/Pyksc3VtKChkYXRhX2ZpdCUqJWMoMSw/PykpXj8/L2RhdGFfZml0Wyw/P10pKnN1bSg/PykpIiwic29sdXRpb24iOiJ0YWJsZShJblNhbXBsZSRD
bG1fQ291bnQpXG4jIFN0b3JlIGRpc3RpbmN0IGNsYWltIGNvdW50cyBvYnNlcnZlZCBpbiBhcnJheSBuYW1lZCB2YWx1ZXNcbnZhbHVlcyA8LSBhcy5pbnRlZ2VyKG5hbWVzKHRhYmxlKEluU2FtcGxlJENsbV9Db3VudCkpKVxuIyBTdG9yZSB0aGUgZnJlcXVlbmN5IG9mIGNsYWltIGNvdW50cyBpbiBtLnZlY1xubS52ZWMgPC0gYXMudmVjdG9yKHRhYmxlKEluU2FtcGxlJENsbV9Db3VudCkpXG4jIENhbGN1bGF0ZSBzYW1wbGUgbWVhbiBvZiBjbGFpbSBjb3VudHNcbnhiYXIgPC0gc3VtKHZhbHVlcyptLnZlYykvc3VtKG0udmVjKVxuIyBNTEUgZm9yIHBfMFxucF8wX01MRSA8LSBtLnZlY1sxXS9zdW0obS52ZWMpXG4jIE1MRSBmb3IgbGFtYmRhXG5NTEVfZXFuIDwtIGZ1bmN0aW9uKGxhbSl7XG4gIGlmZWxzZShsYW09PTAsMSxsYW0vKDEtZXhwKC1sYW0pKSkteGJhci8oMS1wXzBfTUxFKVxufVxubGFtYmRhX01MRSA8LSB1bmlyb290KE1MRV9lcW4sYygwLHhiYXIvKDEtcF8wX01MRSkpKSRyb290XG4jIENvbnN0cnVjdGluZyBhIDJ4MSB2ZWN0b3IgY29udGFpbmluZyB0aGUgb2JzZXJ2ZWQgYW5kIGVzdGltYXRlZCBwcm9iYWJpbGl0aWVzIFxuIyBUaGUgZml2ZSBiaW5zIGFyZSB7MH0sIHsxfSwgezJ9LCB7M30sIHs0LDUsLi4ufVxuZGF0YV9maXQgPC0gY2JpbmQoYyhtLnZlYy9zdW0obS52ZWMpLDApLCAgYyhwXzBfTUxFLCgxLXBfMF9NTEUpLygxLWV4cCgtbGFtYmRhX01MRSkpKmRwb2lzKHZhbHVlc1syOmxlbmd0aCh2YWx1ZXMpXSxsYW1iZGFfTUxFKSxcbigxLXBfMF9NTEUpLygxLWV4cCgtbGFtYmRhX01MRSkpKigxLWV4cCgtbGFtYmRhX01MRSktc3VtKGRwb2lzKHZhbHVlc1syOmxlbmd0aCh2YWx1ZXMpXSxsYW1iZGFfTUxFKSkpICApIClcbiNiYXJwbG90XG5iYXJwbG90KHQoZGF0YV9maXQpLG5hbWVzLmFyZz1jKHZhbHVlcyxcIj49NFwiKSx5bGFiPVwiRnJlcXVlbmN5XCIseGxhYj1cIkNvdW50XCIsYmVzaWRlPVQsY29sPWMoXCJibGFja1wiLFwiYmx1ZVwiKSx5bGltPWMoMCwxKSlcbmxlZ2VuZCg1LDAuOCwgYyhcIk9ic2VydmVkXCIsIFwiRml0dGVkXCIpLCBob3JpeiA9IFQsIGNvbD1jKFwiYmxhY2tcIixcImJsdWVcIiksIGZpbGw9YyhcImJsYWNrXCIsXCJibHVlXCIpKVxuIyBDb21wdXRlIHRoZSBjaGktc3F1YXJlIHN0YXRpc3RpY1xuIyBBbmQgOTV0aC0laWxlIG9mIHRoZSBhcHByb3ByaWF0ZSBjaGktc3F1YXJlIGRpc3RyaWJ1dGlvblxuYyhxY2hpc3EoMC45NSwyKSxzdW0oKGRhdGFfZml0JSolYygxLC0xKSleMi9kYXRhX2ZpdFssMl0pKnN1bShtLnZlYykpIiwic2N0IjoidmFsdWVzbXNnIDwtIFwiRGlkIHlvdSBjb3JyZWN0bHkgc3BlY2lmeSB0aGUgb2JqZWN0IGB2YWx1ZXNgP1wiXG5leCgpICU+JSBjaGVja19vYmplY3QoXCJ2YWx1ZXNcIiwgdW5kZWZpbmVkX21zZyA9IFwiTWFrZSBzdXJlIHRvIG5vdCByZW1vdmUgYHZhbHVlc2AhXCIpICU+JWNoZWNrX2VxdWFsKGluY29ycmVjdF9tc2c9dmFsdWVzbXNn
KVxubS52ZWNtc2cgPC0gXCJEaWQgeW91IGNvcnJlY3RseSBzcGVjaWZ5IHRoZSBvYmplY3QgYG0udmVjYD9cIlxuZXgoKSAlPiUgY2hlY2tfb2JqZWN0KFwibS52ZWNcIiwgdW5kZWZpbmVkX21zZyA9IFwiTWFrZSBzdXJlIHRvIG5vdCByZW1vdmUgYG0udmVjYCFcIikgJT4lY2hlY2tfZXF1YWwoaW5jb3JyZWN0X21zZz1tLnZlY21zZylcblxucF8wX01MRW1zZyA8LSBcIkRpZCB5b3UgY29ycmVjdGx5IHNwZWNpZnkgdGhlIG9iamVjdCBgcF8wX01MRWA/XCJcbmV4KCkgJT4lIGNoZWNrX29iamVjdChcInBfMF9NTEVcIiwgdW5kZWZpbmVkX21zZyA9IFwiTWFrZSBzdXJlIHRvIG5vdCByZW1vdmUgYHBfMF9NTEVgIVwiKSAlPiVjaGVja19lcXVhbChpbmNvcnJlY3RfbXNnPXBfMF9NTEVtc2cpXG5cbmxhbWJkYV9NTEVtc2cgPC0gXCJEaWQgeW91IGNvcnJlY3RseSBzcGVjaWZ5IHRoZSBvYmplY3QgYGxhbWJkYV9NTEVgP1wiXG5leCgpICU+JSBjaGVja19vYmplY3QoXCJsYW1iZGFfTUxFXCIsIHVuZGVmaW5lZF9tc2cgPSBcIk1ha2Ugc3VyZSB0byBub3QgcmVtb3ZlIGBsYW1iZGFfTUxFYCFcIikgJT4lY2hlY2tfZXF1YWwoaW5jb3JyZWN0X21zZz1sYW1iZGFfTUxFbXNnKVxuXG5kYXRhX2ZpdG1zZyA8LSBcIkRpZCB5b3UgY29ycmVjdGx5IHNwZWNpZnkgdGhlIG9iamVjdCBgZGF0YV9maXRgP1wiXG5leCgpICU+JSBjaGVja19vYmplY3QoXCJkYXRhX2ZpdFwiLCB1bmRlZmluZWRfbXNnID0gXCJNYWtlIHN1cmUgdG8gbm90IHJlbW92ZSBgZGF0YV9maXRgIVwiKSAlPiVjaGVja19lcXVhbChpbmNvcnJlY3RfbXNnPWRhdGFfZml0bXNnKVxuXG5zdWNjZXNzX21zZyhcIlN1Y2Nlc3MhIE5vdyBjb21wYXJlIHRoZSBhYm92ZSAqbWxlKiBmb3IgbGFtYmRhIHdpdGggdGhhdCBmb3IgdGhlIFBvaXNzb24gbW9kZWwgKHNlZSBTZWN0aW9uIDIuNylcIikiLCJoaW50IjoiUmVhZCB0aHJvdWdoIDxhIGhyZWY9XCJodHRwczovL29wZW5hY3R0ZXh0cy5naXRodWIuaW8vTG9zcy1EYXRhLUFuYWx5dGljcy9DLUZyZXF1ZW5jeS1Nb2RlbGluZy5odG1sI1M6Z29vZG5lc3Mtb2YtZml0dFwiPlNlY3Rpb24gMi43IG9mIDxlbT5Mb3NzIERhdGEgQW5hbHl0aWNzPC9lbT48L2E+IGFuZCB0aGUgIGluc3RydWN0aW9ucyBjYXJlZnVsbHkuIn0=

A Practicing Actuary’s Perspective

Here is a perspective on Chapter Two from Motoharu Dei, an actuary and data scientist with Accenture and part of an ASTIN research group with the Institute of Actuaries of Japan.

Chapter Contributors

  • Authors. N.D. Shyamalkumar, The University of Iowa, Michelle Xia, Northern Illinois University, and Edward (Jed) Frees, University of Wisconsin-Madison and Australian National University, are the principal authors of the initial version of this chapter.
  • Chapter Maintainers. Please contact Michelle and/or Jed with chapter comments and suggested improvements.