
As actuaries, we are accustomed to Division of Insurance (DOI) scrutiny — from basic filing requirements to generalized linear model (GLM) questionnaires to responding to objections. Complying with regulations is old hat. Artificial intelligence (AI), however, is relatively new on the scene and moving quickly. Insurance regulations in this space are just now starting to catch up, giving us all a chance to continue to flex our compliance muscle.
The National Association of Insurance Commissioners (NAIC) first published its model bulletin on AI in December 2023 (as mentioned in the Actuarial Review article, “NAIC Model Bulletin Recommends NIST’s Approach,” and my Developing News article in the Jan/Feb AR). Since that time, just under half of all U.S. jurisdictions have adopted the bulletin. Some adopted the bulletin wholesale; others adjusted it to be either more or less prescriptive.
Other states decided to take a different route to the same intended destination. The Colorado (CO) DOI worked with O’Neil Risk Consulting & Algorithmic Auditing (ORCAA) to develop regulation for life insurers, with personal auto regulations in development as of the writing of this article. Similarly, the New York Department of Financial Services (NY DFS) created its own rules.
This accounting considers only state-level, insurance-specific regulations within the U.S. At the state, federal and global levels, AI legislation and regulation continue to be proposed, refined and approved. These regulations all aim for the same goal — namely, to protect consumers from potential adverse outcomes. Their means of doing so differ, however, setting us down a path of forked roads.
Same Ends, Different Means
What are these regulations asking insurers to do to reach their goal of consumer protection?
Definitions
The first step to understanding a given regulation is understanding the definitions: What shared lexicon will facilitate our compliance? See Table 1 for a comparison of how the NAIC, Colorado and New York regulations define key terms, including AI.
Table 1. Comparison of definitions of terms in NAIC, Colorado, and New York regulations regarding the use of AI in insurance.
| Defined Term | NAIC Model Bulletin on AI | CO DOI 3 CCR 702-10 (P&C regs in development) | NY DFS Insurance Circular Letter No. 7 |
| --- | --- | --- | --- |
Artificial Intelligence (AI) | Refers to a branch of computer science that uses data processing systems that perform functions normally associated with human intelligence, such as reasoning, learning and self-improvement, or the capability of a device to perform functions that are normally associated with human intelligence such as reasoning, learning and self-improvement. This definition considers machine learning to be a subset of artificial intelligence. | Not included | Not included |
AI System (AIS) | A machine-based system that can, for a given set of objectives, generate outputs such as predictions, recommendations, content (such as text, images, videos or sounds) or other output influencing decisions made in real or virtual environments. AI Systems are designed to operate with varying levels of autonomy. | Not included | Means any machine-based system designed to perform functions normally associated with human intelligence, such as reasoning, learning and self-improvement, that is used — in whole or in part — to supplement traditional health, life, property or casualty underwriting or pricing, as a proxy for traditional health, life, property or casualty underwriting or pricing, or to identify “lifestyle indicators” that may contribute to an underwriting or pricing assessment of an applicant for insurance coverage. |
Predictive Model | Refers to the mining of historic data using algorithms and/or machine learning to identify patterns and predict outcomes that can be used to make or support the making of decisions. | Means a process of using mathematical and computational methods that examine current and historical data sets for underlying patterns and calculate the probability of an outcome | Not included |
External Consumer Data and Information Source (ECDIS) | Not included | Means, for the purposes of this regulation, a data or an information source that is used by a life insurer to supplement or supplant traditional underwriting factors or other insurance practices or to establish lifestyle indicators that are used in insurance practices. This term includes credit scores, social media habits, locations, purchasing habits, home ownership, educational attainment, licensures, civil judgments, court records, occupation that does not have a direct relationship to mortality, morbidity or longevity risk, consumer-generated Internet of Things data, biometric data and any insurance risk scores derived by the insurer or third-party from the above listed or similar data and/or information sources. | Includes data or information used — in whole or in part — to supplement traditional medical, property or casualty underwriting or pricing, as a proxy for traditional medical, property or casualty underwriting or pricing, or to identify “lifestyle indicators” that may contribute to an underwriting or pricing assessment of an applicant for insurance coverage. ECDIS does not include an MIB Group, Inc. member information exchange service, a motor vehicle report, prescription drug data or a criminal history search. |
With these different definitions come different interpretations. Does Colorado really mean something different by predictive model than the NAIC? Why do Colorado and New York omit a definition of AI? Why doesn’t the NAIC include a definition of External Consumer Data and Information Source (ECDIS)? Carriers doing business in multiple states now need to grapple with these questions to determine how to comply.
Once these definitions are clarified — likely in partnership with legal and compliance departments — the scope of your company’s models for each state becomes clearer (even if your interpretation varies from your competitors’). Now the question becomes: What do insurers need to do to comply with these regulations?
Requirements
Each regulation has extensive requirements that touch on governance, risk-based frameworks, basic model testing, fairness testing and bias testing. How they approach each is highly summarized below; links are provided to dive deeper into the specific language used:
- The CO regulation requires a governance and risk management framework to be established to ensure ECDIS and predictive models are documented, tested and validated. Further, annual testing to detect unfair discrimination as well as steps taken to address unfairly discriminatory outcomes is required.
- Although the P&C regulations are in development, they are expected to have similar requirements.
- The NY DFS circular letter requires an insurer to establish that its underwriting or pricing guidelines are not unfairly or unlawfully discriminatory through a comprehensive assessment that must include, at minimum, three steps. A governance and risk management framework is also required, with appropriate documentation and oversight, similar to Colorado.
- These guidelines speak in generalities rather than mention specific tests or imputation methods.
- Several quantitative assessments are provided as examples, though none are prescribed.
- Testing must be done on a “regular cadence.”
- The NAIC model bulletin requires a governance and risk management framework around the use of AIS to ensure predictive models are documented, tested and validated, including testing for unfair discrimination in the insurance practices resulting from use of the model.
- No specific tests or imputation methods are listed.
- No frequency of testing is mentioned.
Yet again we see different means toward the shared end of consumer protection, this time in the descriptiveness and prescriptiveness of the regulations. What these regulations address to different degrees, and what we all need to be focused on, is bias testing.
Bias — The heart of the issue
As described, these regulations target models and their associated outputs. But the data is the driver of this vehicle, and it is likely riddled with biases before a model is even considered. From data selection to sampling to the values themselves, the data will reflect the biases unique to each carrier and the jurisdictions within which it does business.
For example, data sampling bias may be present if the historical batch of insurance policies used to develop a model contains a different mix of risk characteristics than the future policies to which the model-informed decisions are applied (as described in Part 1 — Practical Application of Bias Measurement and Mitigation Techniques in Insurance Rating). On the face of it, this is a common problem we often face when building pricing models. We want our model to generalize well on future business. We try to ensure our model predictions appropriately reflect what that future business looks like. Due to seemingly benign practices like underwriting risk selection guidelines and marketing strategies, however, certain protected classes may be inadvertently over- or under-represented in the modeling dataset.
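As a simplified illustration of how such a representation check might look, the sketch below compares a group's share of the modeling dataset against its share of the book the model will score. The column names and structure are hypothetical assumptions for illustration, not something prescribed by any of the regulations or papers cited here.

```python
import pandas as pd

# Hypothetical sketch of a sampling-bias check: compare each group's share of the
# modeling dataset to its share of the book of business the model will score.
# The column name "group" is illustrative only.
def representation_gap(model_data: pd.DataFrame,
                       future_book: pd.DataFrame,
                       group_col: str = "group") -> pd.DataFrame:
    train_share = model_data[group_col].value_counts(normalize=True).rename("train_share")
    future_share = future_book[group_col].value_counts(normalize=True).rename("future_share")
    out = pd.concat([train_share, future_share], axis=1).fillna(0.0)
    # A large gap for any group flags potential over- or under-representation
    # worth investigating before model development proceeds.
    out["share_gap"] = out["train_share"] - out["future_share"]
    return out
```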
Data values can also have biases. Some are seemingly driven by business practices, like differences in claims denials, payouts and fraud identification. Other data values may be biased due to decisions not within control of the business, such as the likelihood of certain groups to be pulled over for traffic stops. If violations are used in a model, these data biases will be amplified.
Beyond data, several aspects of the modeling process open the door to further bias. These biases all potentially contribute to the adverse consumer impacts these regulations are trying to avoid.
Complying with regulations: Bias identification
First, Label
To identify whether and how these biases are showing up in our data, we must first label our data with the variable of interest. Of primary interest to insurance regulators is race. Insurers have generally not collected this demographic data — most insurers don’t want even the semblance of discrimination. Yet now we are in a bind, since collecting that information would facilitate more appropriate measurements of bias. In the absence of accurate race data, we must look to imputation methods to comply with these AI regulations. The path to imputed race takes us further down our forking road; these decisions add to the complexity of compliance.
A few different methods have been developed to impute a policyholder’s race. The primary methods are the Bayesian Improved Surname Geocoding (BISG) and the Bayesian Improved First Name Surname Geocoding (BIFSG) approaches. Deciding which method of imputation to use is the first fork in the road for insurers attempting to comply with these new regulations. (Colorado explicitly calls out BIFSG in its regulations.)
Models must be built to leverage these methods. Reference datasets published by the United States Census Bureau are available to support them, though these datasets are not consistent representations of the U.S. population. Deciding which dataset to use to make the imputations is another fork.
More forks are encountered along every subsequent road. If using BIFSG, do you start with geocode, surname or first name? What programming package do you decide to use? How will you cleanse your policyholders’ data to be able to impute properly? How do you use the imputed probabilities: classify based on maximum probability, use the probability directly, randomly assign with likelihood based on the imputed probabilities? Every decision made leads to a potentially different outcome for the company’s bias identification.
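To make the Bayesian combination at the heart of these methods concrete, here is a minimal sketch that assumes you have already looked up surname-conditional and geography-conditional race probabilities from Census reference tables. The conditional-independence assumption, the category labels and the example numbers are all illustrative; production implementations (and the published BISG/BIFSG methodologies) involve considerably more data cleansing and care.

```python
import numpy as np
import pandas as pd

RACE_CATS = ["white", "black", "api", "hispanic", "other"]  # illustrative categories

def bisg_style_posterior(p_race_given_surname: pd.Series,
                         p_race_given_geo: pd.Series,
                         p_race_marginal: pd.Series) -> pd.Series:
    """Simplified BISG-style update: assuming surname and geography are
    conditionally independent given race, the posterior is proportional to
    P(race | surname) * P(race | geo) / P(race)."""
    unnormalized = p_race_given_surname * p_race_given_geo / p_race_marginal
    return unnormalized / unnormalized.sum()

# Illustrative inputs (not real Census figures)
posterior = bisg_style_posterior(
    pd.Series([0.70, 0.10, 0.05, 0.10, 0.05], index=RACE_CATS),  # from a surname table
    pd.Series([0.40, 0.30, 0.05, 0.20, 0.05], index=RACE_CATS),  # from geocoded block-group data
    pd.Series([0.60, 0.13, 0.06, 0.18, 0.03], index=RACE_CATS),  # population marginal
)

# Two of the "forks" described above: classify to the most likely category,
# or assign a label at random in proportion to the posterior probabilities.
max_prob_label = posterior.idxmax()
random_label = np.random.choice(RACE_CATS, p=posterior.values)
```

Either choice, along with the reference tables and cleansing rules used, can shift downstream bias measurements — which is exactly the forking-road problem described above.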
Then, Assess
Now that the modeling dataset has been labeled, you need to assess it for fairness, discrimination and bias. The regulations do not specify which tests to use for this purpose, though many are available. Testing should be done throughout the modeling lifecycle, from data collection and adjustments to model development, implementation and monitoring. A single test is unlikely to yield robust results, so using multiple tests is often advised; a simplified sketch of two such tests follows the list below. Note this list is not exhaustive:1
- From the Quantifying Discriminatory Effects paper:
- Demographic parity
- Conditional demographic parity
- Equal opportunity
- Equalized odds
- Calibration
- Well-calibration
- From the Practical Applications Part 1 paper:
- Premium parity
- Loss ratio parity
- Lift charts
- From the NY DFS circular letter Section C.18:
- Adverse impact ratio
- Denials odds ratios
- Marginal effects
- Standardized mean differences
- Z-tests and T-tests
- Drivers of disparity
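As a minimal sketch of what two of the listed tests might look like for a binary, model-informed decision (say, a declination or surcharge flag), the example below computes each group's favorable-outcome rate, its demographic parity gap and its adverse impact ratio. The data structures, the choice of reference group and the familiar 0.8 ("four-fifths") comparison point are illustrative assumptions, not prescriptions from any of these regulations.

```python
import pandas as pd

def parity_and_adverse_impact(decisions: pd.Series,
                              groups: pd.Series,
                              reference_group: str) -> pd.DataFrame:
    """decisions: 1 = favorable outcome (e.g., offered the standard rate), 0 = not.
    groups: imputed demographic label for each policyholder (same index as decisions).
    Returns each group's favorable-outcome rate, its demographic parity gap versus
    the reference group, and its adverse impact ratio (often compared to 0.8)."""
    rates = decisions.groupby(groups).mean()
    ref_rate = rates[reference_group]
    return pd.DataFrame({
        "favorable_rate": rates,
        "parity_gap": rates - ref_rate,
        "adverse_impact_ratio": rates / ref_rate,
    })
```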
Yet again we are making decisions that could differ significantly from our competitors. How many tests should we conduct? What is the appropriate threshold for a given test? How do we consider tests that provide different answers? We choose our forks in the road and drive on to the next stop: What do we do when we determine unfair discrimination is present in our models?
Doing the right thing: Mitigation
We know bias is not a new problem, and we know it is not possible to eliminate bias entirely. What these new regulations are asking of us as actuaries, though, is to do the right thing. What this means for each company may differ. Those doing business in Colorado will (eventually) be required to take steps to address unfair discrimination. For everyone else, this is another decision to be made. Here are some potential pathways toward mitigating the impacts of bias present in your data and models, with a simplified sketch of one technique after the list. Again, this list is not exhaustive.
- From the Quantifying Discriminatory Effects paper:
- Reweighting
- Disparate impact remover
- Prejudice remover
- Bayes optimal equalized odds predictor
- Reject option classification
- Calibrated equalized odds
- From the Practical Applications Part 2 paper:
- Remove linear dependence
- Equalize outcomes
- Perturb variables
- Control for protected class within the model
- Use a penalized fitting process
- Transform model estimates to align with fairness axioms
- Leverage adversarial debiasing
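As one concrete example, here is a minimal sketch of reweighting in the spirit of the techniques listed above: each training record is weighted so that the protected attribute and the outcome become statistically independent in the weighted data. The column names are hypothetical, and this is not a reproduction of the cited papers' exact implementations.

```python
import pandas as pd

def reweighting_weights(df: pd.DataFrame, group_col: str, outcome_col: str) -> pd.Series:
    """Weight each record by P(group) * P(outcome) / P(group, outcome), so the
    protected attribute and the outcome are independent in the weighted data.
    Column names are hypothetical; df[outcome_col] is assumed to be categorical
    (e.g., a binary claim or declination indicator)."""
    p_group = df[group_col].value_counts(normalize=True)
    p_outcome = df[outcome_col].value_counts(normalize=True)
    p_joint = df.groupby([group_col, outcome_col]).size() / len(df)
    return df.apply(
        lambda row: p_group[row[group_col]] * p_outcome[row[outcome_col]]
                    / p_joint[(row[group_col], row[outcome_col])],
        axis=1,
    )
```

These weights could then be passed as sample weights when refitting a model; like every fork described in this article, whether and how to do so is a decision each company must make and document.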
Depending on how proactive your company would like to be, you may or may not yet need to take steps to mitigate any observed disparity in fairness. This decision leads us further down different forked roads — do they lead us all to the same destination?
Unintended consequences?
As illustrated, these regulations are open to interpretations and assumptions along every step of the path. If implementation of and accountability to these regulations are not handled in an equitable way, there is significant potential for adverse selection. Suppose one carrier, Carrier A, was proactive in identifying bias, reported the results to a state DOI, and was advised to remove variable X due to its results across several tests. Another carrier, Carrier B, who has yet to file a new model, continues to use variable X in its models. Until Carrier B undertakes the necessary testing, and unless that testing shows similar evidence of discrimination from variable X, Carrier A faces significant potential to be adversely selected against. And they were trying to do the right thing!
With these regulations, fairness is yet another constraint to be considered in the modeling process. The challenge lies in the bifurcating decisions piling up within each company that lead to potentially different outcomes. Whereas existing state regulations restrict the use of specific variables, like gender, uniformly across carriers, under these new regulations different variables could be used or not used depending on the decisions made by each company individually, influenced by its historical and current decision-making.
Several outstanding questions remain that will hopefully be resolved prior to adverse consequences coming to bear.
- How will these regulations be enforced with timing that would allow for all insurers to be held to the same standards at the same time?
- If a given variable runs afoul of bias testing requirements inconsistently across carriers due to differences in underlying data, what is the outcome?
- If a given variable runs afoul of bias testing requirements inconsistently across different models at the same company, what is the outcome?
- How will regulators account for different standards being established by each company due to their unique decisions?
There is a real tradeoff between “doing the right thing” and keeping your company profitable. Do you proactively assess and address any bias within your company’s data and models? Or do you take your time, waiting for regulations to force your company’s hand? Making the correct decision could be critical to your company’s bottom line.
The worry about bias is real, and so are the potential unintended consequences for insurers and the insurance market more broadly. Collaboration between regulators and carriers is crucial to get this right without adversely impacting consumers — our shared goal in the first place.
Erin Lachen, FCAS, CPCU, is the vice president and senior director, data science at Liberty Mutual Insurance. She is a member of the AR Writing Subgroup.