2.3 Tasks
2.3.7 A Case Study for Analyze Data
Guide to Business Data Analytics
ABC Insurance Co., one of the largest issuers of life insurance in Japan, formed a new data science team composed of data scientists, actuaries, insurance underwriters, and business data analytics professionals. Their mandate is to challenge the status quo and address existing customer experience challenges. This team promotes new ways of working, including evidence-based decision-making, in the hope of helping ABC become more responsive to market demands and organizational priorities. In many ways, the goal is to introduce a "start-up culture" into a 100-year-old, traditionally structured insurance company.
.1 The Challenge
The team conducted a thorough current state analysis of the existing processes that shape customer experience and how ABC uses technology to support its underwriting, quoting, and policy issuing functions. The team identified a number of challenges:
- The application process is cumbersome: an application takes approximately a month to approve.
- Applicants are asked to provide extensive information, including demographic information, medical history, and employment information, which takes 90 minutes to enter.
- Applicants are asked to select among several insurance products or combinations of products.
The result? Customers are disengaging, as demonstrated by ABC Insurance's web analytics, which indicate 35% of individuals end their transaction prematurely during the application process.
The team's senior business data analytics expert, Haru Kobayashi, was asked to analyze the data and recommend actions. In addition to reviewing the data collected through the current state analysis, he analyzed the results of interviews the team conducted with both customers and individuals who abandoned their online applications. Haru concluded that consumers are accustomed to seamless online transactions. Respondents acknowledged the need to provide a large amount of information, but they expect the organizations they interact with to make transactions seamless; organizations that do not risk losing them as customers.
.2 The Way Forward
The team concluded that ABC's processes are antiquated and inefficient. They believed it was important to make it easier for applicants to submit their information and reduce the time it takes to issue a quote. They also identified the disconnect between an applicant entering information online and the manual processing that takes place to verify information, assess the risk, generate a quote, and respond to the applicant.
The team recommended ABC utilize a technology-based solution, one that delivers a predictive data analytics model and can be customized to accurately classify risk using ABC's standard approach. Using this type of technology can significantly reduce the time taken for approval. The team's immediate goal was to better understand the predictive power of the data from existing underwriting assessments and to enable ABC to use that information to improve the overall process.
.3 Working with Available Data
The team worked with approximately 80,000 customer applications and almost 130 predictor variables. Haru parsed through the data to understand what could be useful and how the data could be used. His initial analysis allowed the team to rationalize data items, understand reasons for missing data, determine what data needs to be input, develop rationale to be followed, and develop a more robust data set.
After transforming the available data, the team categorized it into business-relevant data elements such as product information, age, height, weight, employment information, insured information, insurance history, family history, and medical history.
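The review of missing data described above might begin with a simple per-field summary. The following is a minimal sketch only: the field names and sample records are hypothetical, and the case study does not describe the tooling Haru actually used.

```python
from collections import Counter

# Hypothetical application records (not ABC's data); None marks a missing value.
applications = [
    {"age": 34, "bmi": 22.5, "employment": "full-time", "family_history": None},
    {"age": 51, "bmi": None, "employment": "self-employed", "family_history": "cardiac"},
    {"age": 45, "bmi": 27.1, "employment": None, "family_history": None},
    {"age": 29, "bmi": 24.0, "employment": "full-time", "family_history": "none reported"},
]


def missing_rates(records):
    """Return the fraction of records in which each field is missing (None or absent)."""
    fields = sorted({name for record in records for name in record})
    missing = Counter()
    for record in records:
        for name in fields:
            if record.get(name) is None:
                missing[name] += 1
    return {name: missing[name] / len(records) for name in fields}


rates = missing_rates(applications)

# List the fields with the most missing values first, as candidates for follow-up.
for name, rate in sorted(rates.items(), key=lambda item: -item[1]):
    print(f"{name}: {rate:.0%} missing")
```

A summary like this only shows where values are missing. Understanding why they are missing, and whether a field should be dropped, collected differently, or filled in, still requires input from the underwriters.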
.4 Identifying Key Questions
Initial analysis of the business context suggested that most of the time was spent in risk classification. The team agreed that the biggest reduction in processing time would result from automatically and accurately classifying customer applications to appropriate risk classes. By doing so, the application processing time could be dramatically reduced. Data scientists on the team considered multiple algorithms to produce the desired results. They had some fundamental questions to better understand the data required by underwriters, including:
- How are risk classes related? Do risk classes depend on each other? Are they categorical in nature?
- Is the risk function monotonic?
- Is this a multinomial classification problem?
- What is the best metric to evaluate the performance of the predictive model: accuracy, MCC, or Cohen's kappa?
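The monotonicity question in the list above can be made concrete with a small check. This is a sketch under stated assumptions: the function name is hypothetical, and the numbers are made-up illustrative values, not ABC's data or real mortality figures.

```python
def is_monotonic_increasing(pairs):
    """Return True if the outcome never decreases as the input increases.

    `pairs` is an iterable of (input_value, outcome) tuples.
    """
    ordered = sorted(pairs)  # order the observations by input value
    outcomes = [outcome for _, outcome in ordered]
    return all(earlier <= later for earlier, later in zip(outcomes, outcomes[1:]))


# Made-up values for illustration only.
risk_by_age = [(30, 0.001), (40, 0.002), (50, 0.005), (60, 0.012)]
risk_by_bmi = [(17, 0.006), (22, 0.002), (27, 0.003), (35, 0.009)]

print(is_monotonic_increasing(risk_by_age))  # risk rises steadily with age
print(is_monotonic_increasing(risk_by_bmi))  # risk is U-shaped, so not monotonic
```

When an input behaves like the second example, a model that assumes risk always moves in one direction with that input would misclassify applicants at one end of the range. This is why the data scientists needed the underwriters' answer before choosing an algorithm.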
.5 Business Data Analytics Approach
Often, the underlying concepts and mathematical background required to understand data-related challenges turn out to be quite complex. The business stakeholders, in this case the underwriters on the team, did not understand the data science terminology that the data scientists used heavily. Likewise, the data scientists were struggling to understand business needs. Haru developed the following approach:
- Collaborate closely with data scientists to learn terminology.
- Understand the relevance of the questions asked by the data scientists.
- Translate this learning to business terms that would be meaningful to underwriters.
- Communicate the correct business implications so the team could develop a shared understanding of the proposed model.
Haru worked hard to socialize understanding of several key terms and business rules and helped others understand the impact of their decisions, including:
| Key Terms | Application to Predictive Models/Algorithms |
| --- | --- |
| Categorical Risk Classes: Insurance risk classes describe groups of individuals with similar risk characteristics. For example, applicants who are 20-40 years of age, new drivers, or smokers may be grouped into a higher risk class that costs more to insure. | With this understanding, the team identified categories and determined an effective algorithm to use in the predictive model, with output aligned to a single specific risk class (for example, 1 to 8). |
| Monotonic Risk Function: The risk classes are ordinal if the outcome (risk class) follows a specific order. In other words, does risk class 3 have a higher risk profile than risk classes 2 and 1? Similarly, does the risk consistently increase or decrease as an input increases or decreases? For example, the output (likelihood of death) may be monotonic if the input is age. | The behaviour of the expected output determines the modelling process. In this case, if an algorithm outputs the probability of the event, then the business stakeholders may have to qualify what the risk classes mean. For example, a probability of death of 1-50% may correspond to risk class 1, 50-60% may correspond to risk class 2, and so forth. The modelling must take these aspects into account, and that requires a shared understanding between business stakeholders and the team. |
| Multinomial Classification: Simply stated, it means that the output of the predictive model is one of the eight risk classes. | As with other classification problems, different metrics such as accuracy, precision, and others can be used, but these metrics may not be robust in evaluating the performance of the predictive model. |
| Accuracy, Matthews Correlation Coefficient (MCC), or Cohen's Kappa Coefficient: These measure algorithm performance. The mathematical background for these metrics needs to be applied to the business context. For example, Cohen's kappa may be explained in simple terms to the business stakeholder as follows: two underwriters (A and B) independently classify the same set of applications into two risk classes (for example, 1 and 2). Accuracy is the metric that describes the probability, or percentage of times, that both underwriters agree on the risk class for an application. However, underwriters may agree on the risk class by pure chance, say due to lack of information or uncertainty. Cohen's kappa corrects for this chance agreement. | The entire focus of modelling and optimization of an algorithm depends upon the definition of success for the algorithm, so it is important to accurately communicate modelling assumptions. The evaluation criteria adopted by the team need to be communicated to the business stakeholders, because the stakeholders may consider other criteria more relevant (for example, the actuaries may suggest a different one). The success of the entire initiative depends upon what is chosen as the evaluation parameter. Any measure of success must be consistent with the business problem and help create a shared understanding of its use. |
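The two-underwriter explanation of accuracy and Cohen's kappa can be worked through in a few lines. This is a minimal sketch using the standard definition of Cohen's kappa, (observed agreement - chance agreement) / (1 - chance agreement). The function names and the ten classifications are hypothetical and are not taken from ABC's data.

```python
from collections import Counter


def accuracy(labels_a, labels_b):
    """Fraction of applications on which both underwriters chose the same class."""
    return sum(a == b for a, b in zip(labels_a, labels_b)) / len(labels_a)


def cohens_kappa(labels_a, labels_b):
    """Agreement between two raters, corrected for agreement expected by chance."""
    n = len(labels_a)
    observed = accuracy(labels_a, labels_b)
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    classes = counts_a.keys() | counts_b.keys()
    # Chance agreement: probability that both raters pick the same class
    # if each assigned classes at random according to their own class frequencies.
    expected = sum(counts_a[c] * counts_b[c] for c in classes) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single identical class")
    return (observed - expected) / (1.0 - expected)


# Hypothetical classifications of ten applications into risk classes 1 and 2.
underwriter_a = [1, 1, 1, 1, 1, 1, 1, 1, 2, 2]
underwriter_b = [1, 1, 1, 1, 1, 1, 1, 2, 1, 2]

print(f"accuracy: {accuracy(underwriter_a, underwriter_b):.2f}")
print(f"kappa:    {cohens_kappa(underwriter_a, underwriter_b):.3f}")
```

In this example the underwriters agree on 8 of 10 applications, so accuracy is 0.80. Because both place most applications in class 1, agreement of 0.68 would be expected by chance alone, and kappa is only about 0.375. The gap between the two figures is the point to convey to business stakeholders: high accuracy can overstate how well a model, or a second underwriter, distinguishes risk classes when one class dominates.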
Haru's data analysis experience and his ability to bridge the two worlds of data science and business ensured the team achieved its desired outcome.
.7 Key Takeaways
- When data is analyzed from an algorithmic or modelling perspective, the challenge is to translate many of the technically challenging questions into a more accessible format for business stakeholders.
- Business data analytics experts combine their business analysis skills with their data analytics skills and leverage underlying competencies such as learning, systems thinking, business acumen, and teaching throughout their work.
- Business data analytics experts play a key role in helping develop shared understanding, including the ability to translate complex data analytics concepts and describe their potential impact on business results. A shared understanding between different teams often forms the first step towards managing the change implications from an analytics initiative.