VIII Simposio Internacional de Actuaria
Modern machine learning tools are founded on classical regression principles. Participants in this course will gain both theoretical and practical exposure to regression tools. The course gives participants a springboard for further study toward a deep understanding of machine learning approaches.
The course alternates between presentations of the underlying principles and practical applications. In a typical block, the instructor will spend 45 minutes reviewing the insurance motivation and key mathematical underpinnings. This will be followed by a 45-minute block in which participants actively explore a selected case study. Participants are therefore expected to bring a laptop with the statistical package R installed to the workshop. (Sharing a laptop with a colleague is also a great way to learn.)
Target Audience: Practicing actuaries, students, and educators interested in exposure to the foundations of insurance analytics.
Keywords and Phrases: Insurance analytics, ratemaking, statistical distributions, estimation and model selection, regression analysis.
Workshop Preparation. To prepare for the workshop, many participants will wish to review R code available at the website:
This site employs interactive “Datacamp” style exercises.
See also below for additional suggestions on things you can do to prepare yourself for the workshop.
\[ {\small \begin{array}{l | l} \hline \text{9:00-9:20 am} &\text{Welcome, Introduction to the workshop,} \\ &~~~~\text{ How it will work, What you can get out of it.} \\ \hline \text{9:20-10:00 am} &\text{Topic 1. Introduction to Analytics} \\ \text{10:00-10:30 am} &\text{Participants will explore features that influence} \\ &~~~~\text{prices of term life policies} \\ \text{10:30-11:00 am} &\text{Coffee Break} \\\hline \text{11:00-11:45 am} &\text{Topic 2. Regression and Linear Models} \\ &~~~~\text{ with an Emphasis on Prediction} \\ \text{11:45 am-12:30 pm} &\text{Participants will explore health data} \\ \text{12:30-2:00 pm} &\text{Lunch} \\\hline \text{2:00-2:45 pm} &\text{Topic 3. Logistic Regression and Generalized Linear Models} \\ \text{2:45-3:30 pm} &\text{Participants will explore health expenditure data} \\ \text{3:30-3:50 pm} &\text{Break} \\\hline \text{3:50-4:20 pm} &\text{Topic 4. Predictive Modeling: Frequency-Severity Models} \\ \text{4:20-4:50 pm} &\text{Participants will explore automobile data for pricing purposes} \\ \text{4:50-5:00 pm} &\text{Wrap-Up} \\\hline \end{array} } \]
Summaries of the theory and the cases are drawn from three sources written by the instructor:
Participants need not purchase these books for the workshop. The sources are provided for your information.
A group of volunteers is working to make available a Spanish translation of the first book that will be free and available online. See our current progress at
You can contact Jed at jfrees@bus.wisc.edu. See the Frees Research Website for more information about his background.
Here are additional resources you can use to prepare yourself for the course.
Given enough time, the best way to learn the R code is to go to the original sources and work carefully through the motivation and the execution. However, in this workshop, you will not have a lot of time. So, here is a sequence of R Markdown (.Rmd) files that contain the code.
I will demonstrate how to use these .Rmd files if you haven’t seen them before. Essentially, you load each file into RStudio and then execute each block of code by clicking the green arrow at the upper right of that block.
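For readers who have not seen an .Rmd file before, here is a minimal sketch of what one block of code (a "chunk") looks like. It is an illustration only and is not taken from the workshop files: the chunk label `example-fit` is hypothetical, and `cars` is a small data set that ships with R, used here in place of the workshop data.

````markdown
Some explanatory text, written as ordinary prose.

```{r example-fit}
# Fit a simple linear regression on the built-in 'cars' data
# (a stand-in for the workshop data sets)
fit <- lm(dist ~ speed, data = cars)
summary(fit)  # coefficient estimates, standard errors, R-squared
```
````

In RStudio, the green arrow sits at the top right of each chunk. Clicking it runs only that chunk and, by default, shows the output directly beneath it.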
Time | Topic | Activity |
---|---|---|
9:00-9:20 am | Welcome | Introduction to the workshop, how it will work, what you can get out of it |
9:20-10:00 am | Topic 1. Introduction to Analytics | Instructor presentation of Chapter 2 of Loss Data Analytics, Second Edition. Instructor provides an overview of code from the Online Tutorial on Regression Modeling with Actuarial and Financial Applications. |
10:00-10:30 am | Explore features that influence prices of term life policies | Start with the Chapter 2 code from LDA. Then, participants can review term life data and code from Chapter 3 of Regression Modeling with Actuarial and Financial Applications. If there is sufficient time, participants can review code from Chapter 3 of the Online Tutorial. |
10:30-11:00 am | Coffee Break | |
11:00-11:45 am | Topic 2. Regression and Linear Models with an emphasis on Prediction | Instructor overview of Chapters 2-6 of Regression Modeling with Actuarial and Financial Applications |
11:45 am-12:30 pm | Explore health expenditures data | Start by reviewing the lecture BMI Modeling Code. Then, learn about the expenditures data in Regression Modeling with Actuarial and Financial Applications, Exercise 1.1. If there is sufficient time, participants can review the code in Chapter 5 of the Online Tutorial. |
12:30-2:00 pm | Lunch | |
2:00-2:45 pm | Topic 3. Logistic Regression and Generalized Linear Models | Chapters 11 and 13 of Regression Modeling with Actuarial and Financial Applications |
2:45-3:30 pm | Explore health expenditures data | Review the code in Regression Modeling with Actuarial and Financial Applications, Example 11.1, Section 11.4, and Section 13.4. Also see Exercise 12.5 (no code available). |
3:30-3:50 pm | Break | |
3:50-4:20 pm | Topic 4. Predictive Modeling: Frequency-Severity Models | Instructor presentation of Chapter 6 of Predictive Modeling Applications in Actuarial Science, Volume 1. |
4:20-4:50 pm | Explore automobile data for pricing purposes | Start by reviewing the lecture study based on the Massachusetts Automobile Claims. If there is sufficient time, participants can review the Chapter 6 Code and Data for Predictive Modeling Applications in Actuarial Science. |
4:50-5:00 pm | Wrap-Up | |