
SOA ASA Exam: Predictive Analytics (PA) – 4.3. Generalized Linear Models Case Study 2

Case Study 2: GLMs for Binary Target Variables

Learning Objectives
Compared to GLMs for numeric target variables, GLM-based classifiers enjoy some subtly unique features, which will be revealed in the course of this case study. At the completion of this section, you should be able to:
- Combine factor levels to reduce the dimension of the data.
- Select appropriate link functions for binary target variables.
- Implement different kinds of GLMs for binary target variables in R.
- Incorporate an offset into a logistic regression model.
- Interpret the results of a fitted logistic regression model.

Background
In this case study, we will examine the dataCar dataset in the insuranceData package. This dataset is based on a total of n = 67,856 one-year vehicle insurance policies taken out in 2004 or 2005. The variables in this dataset pertain to different characteristics of the policyholders and their vehicles. The target variable is clm, a binary variable equal to 1 if a claim occurred over the policy period and 0 otherwise.

Stage 1: Define the Business Problem
Objective: Our objective here is to construct appropriate GLMs to identify key factors associated with claim occurrence. Such factors will provide insurance companies offering vehicle insurance …
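As a preview of the kind of model this case study builds, here is a minimal sketch of fitting a logistic regression with glm(). It uses simulated stand-in data (the variable names veh_value and agecat are illustrative), not the actual dataCar dataset:

```r
# Minimal sketch: logistic regression for a binary target with glm().
# Simulated stand-in data; the case study itself uses dataCar from insuranceData.
set.seed(1)
n <- 1000
veh_value <- runif(n, 0.5, 5)                    # hypothetical numeric predictor
agecat <- factor(sample(1:6, n, replace = TRUE)) # hypothetical factor predictor
# Simulate a binary claim indicator whose log-odds depend on veh_value
p <- plogis(-2 + 0.3 * veh_value)
clm <- rbinom(n, size = 1, prob = p)
dat <- data.frame(clm, veh_value, agecat)

# Binomial family with the (default) logit link
fit <- glm(clm ~ veh_value + agecat, data = dat,
           family = binomial(link = "logit"))

# Predicted claim probabilities are on the response scale
phat <- predict(fit, newdata = dat, type = "response")
```

A useful interpretation device: exponentiating a fitted coefficient gives the multiplicative change in the odds of a claim per unit increase in that predictor.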

Read more

SOA ASA Exam: Predictive Analytics (PA) – 4.2. Generalized Linear Models Case Study 1

Case Study 1: GLMs for Continuous Target Variables

Learning Objectives
- Select appropriate distributions and link functions for a positive, continuous target variable with a right skew.
- Fit a GLM using the glm() function in R and specify the options of this function appropriately.
- Make predictions for GLMs using the predict() function and compare the predictive performance of different GLMs.
- Generate and interpret diagnostic plots for a GLM.

Preparatory Steps
Background: persinj contains the information of n = 22,036 settled personal injury insurance claims which were reported during the period from July 1989 to the end of 1999. Claims settled with zero payment were not included.

Objective
Our objective here is to build GLMs to predict the size of settled claims using related risk factors in the dataset, select the most promising GLM, and quantify its predictive accuracy. For claim size variables, which are continuous, positive-valued, and often highly skewed, common modeling options include:
- Apply a log transformation to claim size and fit a normal linear model to the log-transformed claim size.
- Build a GLM with the normal distribution and a link function such as the log link to ensure that the target mean is non-negative.
- Build a …
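The first two modeling options above can be sketched side by side. This is a minimal illustration on simulated right-skewed data (the factor inj and the data-generating numbers are invented for the example, not taken from persinj); the second fit uses a gamma GLM with a log link as one common choice for positive, skewed targets:

```r
# Sketch: two ways to model a positive, right-skewed "claim size" (simulated data)
set.seed(42)
n <- 500
inj <- factor(sample(1:3, n, replace = TRUE))   # hypothetical injury-severity factor
amt <- rlnorm(n, meanlog = 8 + 0.4 * as.numeric(inj), sdlog = 1)  # positive, skewed
dat <- data.frame(amt, inj)

# Option 1: normal linear model on the log-transformed claim size
fit_lognormal <- lm(log(amt) ~ inj, data = dat)

# Option 2 (one variant): gamma GLM with a log link, so the fitted mean stays positive
fit_gamma <- glm(amt ~ inj, data = dat, family = Gamma(link = "log"))

# Predictions from the log-link GLM are guaranteed to be positive
pred <- predict(fit_gamma, newdata = dat, type = "response")
```

Note the difference in targets: Option 1 models the mean of log(amt), while Option 2 models the log of the mean of amt, so the two are not equivalent.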

Read more

SOA ASA Exam: Predictive Analytics (PA) – 4.1. Generalized Linear Models

EXAM PA LEARNING OBJECTIVES
The Candidate will be able to describe and select a Generalized Linear Model (GLM) for a given data set and regression or classification problem.

Learning Outcomes
The Candidate will be able to:
- Understand the specifications of the GLM and the model assumptions.
- Create new features appropriate for GLMs.
- Interpret model coefficients, interaction terms, offsets, and weights.
- Select and validate a GLM appropriately.
- Explain the concepts of bias, variance, model complexity, and the bias-variance trade-off.

In Exam PA, there are often tasks that require you to describe, in high-level terms, what a GLM is and the pros and cons of a GLM relative to other predictive models, so the conceptual aspects of GLMs will be useful not only for understanding the practical implementations of GLMs in the next three sections, but also for tackling exam items. Because all of the feature generation techniques (e.g., binarization of categorical predictors, introduction of polynomial and interaction terms) and feature selection techniques (e.g., stepwise selection algorithms and regularization) for linear models generalize to GLMs in essentially the same way, and everything we learned about the bias-variance trade-off for linear models also applies here, our focus in this section is …

Read more

SOA ASA Exam: Predictive Analytics (PA) Case Studies

Regularization

What is regularization?
- Reduces model complexity: shrinks the magnitude of the coefficient estimates via the use of a penalty term and serves to prevent overfitting.
- An alternative to using stepwise selection for identifying useful features.

How does regularization work?
Variables with limited predictive power will receive a coefficient estimate that is small, if not exactly zero, and are therefore removed from the model.

Choice of α
If the objective is to identify key factors affecting the target variable, using α = 0 (ridge regression), which does not eliminate any variables, is not appropriate.

Interactions
Interpretation: There is a significant interaction between [A] and [B], meaning that the effect of [A] on [Y] varies for … with and without [B].

3.2 Linear Models Case Study 2: Feature Selection and Regularization
Learning Objectives
After completing this case study, you should be able to:
- Fit a multiple linear regression model with both numeric and categorical (factor) predictors.
- Detect and accommodate interactions between predictors, which can be quantitative or qualitative.
- Perform explicit binarization of categorical predictors and understand why doing so may be beneficial.
  library(caret)
  dummyVars()
- Perform stepwise selection and be familiar with the different options allowed by this function.
  library(MASS) …
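To make the shrinkage idea concrete, here is a small base-R sketch of ridge regression (the α = 0 end of the elastic-net family implemented by glmnet) using its closed-form solution. The data and the helper function ridge_coefs() are invented for illustration; in practice you would call glmnet() with the alpha argument:

```r
# Ridge regression sketch: the penalty shrinks coefficients toward zero as lambda grows.
set.seed(7)
n <- 100
x1 <- rnorm(n); x2 <- rnorm(n)
y <- 2 * x1 + 0 * x2 + rnorm(n, sd = 0.5)   # x2 has no real predictive power

X <- cbind(x1, x2)                          # no intercept, for simplicity
ridge_coefs <- function(X, y, lambda) {
  # Closed-form ridge solution: (X'X + lambda * I)^(-1) X'y
  solve(t(X) %*% X + lambda * diag(ncol(X)), t(X) %*% y)
}

b_ols   <- ridge_coefs(X, y, 0)    # lambda = 0 reproduces ordinary least squares
b_ridge <- ridge_coefs(X, y, 50)   # a larger penalty gives smaller coefficients

# Ridge shrinks coefficients but never sets them exactly to zero;
# the lasso (alpha = 1 in glmnet) is needed for exact variable elimination.
```

This is why, as noted above, ridge regression is a poor choice when the goal is to eliminate unimportant variables: every predictor keeps a (small) nonzero coefficient.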

Read more

SOA ASA Exam: Predictive Analytics (PA) – 3.2 Linear Models Case Study 2: Feature Selection and Regularization

Learning Objectives
After completing this case study, you should be able to:
- Fit a multiple linear regression model with both numeric and categorical (factor) predictors.
- Detect and accommodate interactions between predictors, which can be quantitative or qualitative.
- Perform explicit binarization of categorical predictors using the dummyVars() function from the caret package and understand why doing so may be beneficial.
- Perform stepwise selection using the stepAIC() function from the MASS package and be familiar with the different options allowed by this function.
- Generate and interpret diagnostic plots for a linear model.
- Implement regularized regression using the glmnet() and cv.glmnet() functions from the glmnet package.

Stage 1: Define the Business Problem
Objective: Our goal here is to identify and interpret key factors that relate to a higher or lower Balance with the aid of appropriate linear models.

Stage 2: Data Collection
Data Design and Relevance
Read in the data and remove irrelevant variables.

# CHUNK 1
library(ISLR)
data(Credit)
Credit$ID <- NULL

Data Description
The Credit dataset contains n = 400 observations and 11 variables. Numeric predictors are listed first, followed by categorical ones. The target variable is the last variable in the dataset, Balance, an integer-valued variable that ranges from …
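As a taste of the stepwise-selection step listed above, here is a minimal sketch using stepAIC() from the MASS package. It runs on simulated data (x1, x2, x3 are invented predictors), not on the Credit dataset used in the case study:

```r
# Backward stepwise selection with stepAIC() from MASS (simulated data).
library(MASS)
set.seed(123)
n <- 200
x1 <- rnorm(n); x2 <- rnorm(n); x3 <- rnorm(n)
y <- 1 + 3 * x1 + rnorm(n)            # only x1 truly matters
dat <- data.frame(y, x1, x2, x3)

full <- lm(y ~ x1 + x2 + x3, data = dat)

# direction = "backward" starts from the full model and drops a predictor
# whenever doing so lowers the AIC; trace = FALSE suppresses the step log.
best <- stepAIC(full, direction = "backward", trace = FALSE)
```

Here the genuinely predictive x1 should always survive the selection, while the pure-noise predictors are candidates for removal.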

Read more

SOA ASA Exam: Predictive Analytics (PA) – 3.1 Linear Models Case Study 1: Fitting Linear Models in R

Context
Suppose that we are statistical consultants hired by the company that offers the product. The company is interested in boosting sales of the product, but cannot directly do so (sales are determined by market demand). Instead, it has the liberty to control the advertising expenditure in each of the three advertising media: TV, radio, and newspaper. If we can construct a linear model that accurately predicts sales (the target variable) on the basis of the budgets spent on the three advertising media (the predictors), then such a model can provide the basis for a profitable marketing plan that specifies how much should be spent on each medium to maximize sales, a business issue of great interest to the company.

Learning Objectives
After completing this case study, you should be able to:
- Fit a multiple linear regression model using the lm() function and extract useful information from a fitted model using the summary() function.
- Appreciate why variable significance may change as a result of correlations between variables.
- Generate additional features such as interaction and polynomial terms in a linear model.
- Partition the data into training and test sets using the createDataPartition() function from the caret package.
- Generate predictions on …
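The core workflow described here, fit with lm(), inspect with summary(), predict with predict(), can be sketched in a few lines. The data below are simulated with advertising-style variable names for illustration; they are not the actual dataset used in the case study:

```r
# Multiple linear regression with lm() (simulated advertising-style data;
# variable names are illustrative, not the case study's actual dataset).
set.seed(2024)
n <- 200
TV <- runif(n, 0, 300); radio <- runif(n, 0, 50); newspaper <- runif(n, 0, 100)
sales <- 3 + 0.045 * TV + 0.19 * radio + rnorm(n, sd = 1.5)  # newspaper: no effect
ad <- data.frame(sales, TV, radio, newspaper)

fit <- lm(sales ~ TV + radio + newspaper, data = ad)
summary(fit)          # coefficient estimates, t-tests, R-squared

# Predict sales for a new combination of budgets
new_budgets <- data.frame(TV = 100, radio = 20, newspaper = 10)
pred <- predict(fit, newdata = new_budgets)
```

Because newspaper is simulated with no real effect, its t-test in summary(fit) will typically be insignificant, previewing the variable-significance discussion in the case study.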

Read more

SOA ASA Exam: Predictive Analytics (PA) – 3. Linear Models

Basic Terminology
Classification of Variables
There are two ways to classify variables in a predictive analytics context: by their role in the study (intended use) or by their nature (characteristics).

By role
The variable that we are interested in predicting is called the target variable (or response variable, dependent variable, output variable). The variables that are used to predict the target variable go by different names, such as predictors, explanatory variables, input variables, or sometimes simply variables if no confusion arises. In an actuarial context, predictors are also known as risk factors or risk drivers.

By nature
Variables can also be classified as numeric variables or categorical variables. This classification has important implications for developing an effective predictive model that aligns with the character of the target variable and predictors to produce realistic output.
- Numeric (a.k.a. quantitative) variables: Numeric variables take the form of numbers with an associated range. They can be further classified as discrete or continuous variables.
- Categorical (a.k.a. qualitative, factor) variables: As their name implies, categorical variables take predefined values, called levels or classes, out of a countable collection of “categories”. When a categorical variable takes only two possible levels, it is called a …
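The by-nature classification maps directly onto R's data types. A small illustrative sketch (the variable names and values are invented for the example):

```r
# Numeric vs. categorical variables in R (illustrative data)
age <- c(25, 40, 33)                           # numeric (continuous)
n_claims <- c(0L, 2L, 1L)                      # numeric (discrete, stored as integer)
smoker <- factor(c("yes", "no", "no"))         # categorical with two levels
region <- factor(c("north", "south", "east"))  # categorical with several levels

is.numeric(age)      # numeric variables are stored as numbers
is.factor(smoker)    # categorical variables are stored as factors
nlevels(smoker)      # 2 levels, i.e., a binary categorical variable
```

Representing categorical predictors as factors (rather than raw strings or codes) is what lets modeling functions such as lm() and glm() binarize them automatically.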

Read more

SOA ASA Exam: Predictive Analytics (PA) – 2. ggplot

Making ggplots
Basic Features
Load the library:

library(ggplot2)

The ggplot Function

ggplot(data = <DATA>,
       mapping = aes(<AESTHETIC_1> = <VARIABLE_1>,
                     <AESTHETIC_2> = <VARIABLE_2>,
                     ...)) +
    geom_<TYPE>(<...>) +
    geom_<TYPE>(<...>) +
    <OTHER_FUNCTIONS> +
    ...

The ggplot() function initializes the plot, defines the source of data using the data argument (almost always a data frame), and specifies which variables in the data are mapped to visual elements in the plot via the mapping argument. Mappings in a ggplot are specified using the aes() function, with aes standing for “aesthetics”.

The geom functions: Subsequent to the ggplot() function, we add geometric objects, or geoms for short, which include points, lines, bars, histograms, boxplots, and many other possibilities, by means of one or more geom functions. Placed layer by layer, these geoms determine what kind of plot is to be drawn and modify its visual characteristics, taking the data and aesthetic mappings specified in the ggplot() function as inputs. …
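One way to instantiate the template above is a scatterplot with a fitted-line layer. This sketch uses R's built-in iris data frame purely for illustration:

```r
# Instantiating the ggplot template: two geom layers plus a labeling function.
library(ggplot2)

p <- ggplot(data = iris,
            mapping = aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point() +                            # first layer: points
  geom_smooth(method = "lm", se = FALSE) +  # second layer: fitted straight line
  labs(title = "Sepal width vs. sepal length")  # a non-geom function

# print(p) (or just typing p at the console) draws the plot
```

Note how each `+` adds a layer or modifier on top of the data and aesthetic mappings declared once in ggplot().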

Read more

SOA ASA Exam: Predictive Analytics (PA) – 1. Basics of R

Data Types
Create an integer by appending “L” to the number: x <- 1L

Data Structures
Vectors
Create a vector with c(...):

a <- c(1:5)
b <- c(5:1)
c <- c("A", "B", "C")
d <- c(TRUE, FALSE, FALSE, TRUE, TRUE)

print(a)
[1] 1 2 3 4 5
print(b)
[1] 5 4 3 2 1
print(c)
[1] "A" "B" "C"
print(d)
[1] TRUE FALSE FALSE TRUE TRUE

Create a sequence of numbers with seq(from, to, by):

x <- seq(0, 5, 1)
print(x)
[1] 0 1 2 3 4 5

Extract subsets of vectors with []:

# Using positive integers
a[2]
[1] 2
a[c(2, 4)]
[1] 2 4
# Using negative integers
b[-1]
[1] 4 3 2 1
b[-(2:4)]
[1] 5 1
# Using logical vectors
a[d]
[1] 1 4 5

Remark on unequal lengths: For two vectors of unequal length, the shorter vector is recycled, repeating its elements to match the length of the longer vector.

> print(a + 1:3)
[1] 2 4 6 5 7

Factors
Create a factor with factor(...):

# define x as a character vector
x <- c("M", "F", "M", "O", "F")
# factorize x and assign to x.factor
x.factor <- factor(x)
x.factor
[1] M F M O F
Levels: F …

Read more

SOA FSA Module: Regulation and Taxation

Module Overview
Module Introduction
Many government and quasi-government agencies regulate life insurance companies. They exercise authority over both the life insurance industry as a whole and individual companies. Regulation and taxation affect product design, sometimes by incentive and sometimes by required standards. For example, the states in the United States have laws that govern the solvency of companies and also often levy state premium taxes.
- Regulation: laws and rules that govern financial services industries.
- Taxation: a system to raise revenue for governments.
This module addresses regulation and taxation separately. Through this module, you will:
- Understand the basis for those laws and how they serve the public’s interest.
- Become familiar with key international regulatory topics.
- Apply the legal and regulatory principles to realistic examples.
- Relate the regulatory environment to product design and management.
- Consider the effect that the conduct of people in various roles has on the solvency of life insurance companies.
- Be introduced to the various government agencies that regulate insurance and annuity products and companies in the United States and Canada, and the laws under which they operate.

Module Objectives
By the end of your module study you will be able …

Read more