Title: Subrogation Prediction Through Text Mining and Data Modeling
1 Subrogation Prediction Through Text Mining and Data Modeling
Sergei Ananyan, Ph.D.
Megaputer Intelligence
www.megaputer.com
2 Why Subrogate?
- While only a few percent of cases have subrogation potential, significant amounts of money can be recovered
- Estimate: missed subro opportunities in the USA total $15 billion annually
- Efficient subrogation helps keep insurance premiums low, providing an extra competitive edge
3 Challenges of Subrogation
- Overwhelming volume of claims
  - Over 5 million reported workplace injuries in the USA annually
  - Over 6 million auto insurance claims in the USA annually
- Subrogation opportunities comprise only a few percent of all claims
- Subro decisions involve manual analysis of textual notes in claims
- Thorough investigations can be lengthy and costly
- Missed subrogation opportunities can be even more costly
- Subro decisions should be made soon after the accident. Relevant evidence may disappear quickly.
4 Who makes a subro decision?
5 Traditional Way: Adjusters
- Individual adjusters determine subrogation cases
- Pros
  - Subro decisions can be made at early stages of claim handling
  - Investigation can be conducted on the spot
- Cons
  - Subrogation determination is at the bottom of a long list of actions
    - Verifying coverage
    - Determining compensation
    - Approving payments
    - Reporting
  - Adjusters differ in experience: no consistency across the organization
  - Either a lack of formal rules, or a set of rules too rigid to determine the subrogation potential of many cases
  - Looking for a needle in a haystack: opportunities are easily overlooked
6 Traditional Way: Recovery Teams
- Specialized recovery teams determine subrogation opportunities
- Pros
  - Highly trained professionals: better determination of opportunities
  - Consistency across the organization
- Cons
  - Small group of investigators overloaded with large numbers of claims
  - Located remotely: need to coordinate efforts with local adjusters
  - Delays in starting investigations
7 Recovery Teams are Overloaded
8 Subrogation Prediction Objectives
- A perfect solution for subrogation prediction should be
  - Accurate
  - Automated
  - Objective
  - Consistent
  - Fast
9 New Way: Automated Modeling
- New predictive modeling tools can identify subro opportunities
- They provide many benefits
  - Detect good new candidate claims for subrogation in a timely manner
  - Capture missed opportunities in closed cases
  - Focus the attention of investigators on cases with high potential
  - Eliminate wasted time and effort
  - Standardize subrogation prediction practice across the enterprise
  - Enhance customer satisfaction
10 Modeling and Text Mining
- Knowledge discovery tools for business users
- Easy-to-understand, actionable results
[Figure labels: Data Overload; Useful Knowledge]
11 What is Data Modeling?
- Computer models learn from historical data and predict outcomes of future situations
- Models are developed through training on data with known outcomes
- Training is based on machine learning and statistical algorithms
- The Megaputer solution, PolyAnalyst for Subrogation Prediction, offers a selection of modeling algorithms
  - Decision Trees
  - Neural Networks
  - CHAID
  - Bayesian Networks
  - Random Forest
- The best model can be selected automatically
- Developed models are used for scoring new data to predict
  - Probability of subrogation success
  - Potential recovered amount
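The idea that the best model can be selected automatically can be illustrated with a minimal sketch: train several candidate models on historical claims and keep the one that scores best on held-out claims. This is not PolyAnalyst's procedure, which the slides do not describe. The two toy candidates (a majority-class baseline and a one-feature decision stump), the feature names, and the data are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Claim = Dict[str, int]            # hypothetical binary features of one claim
Model = Callable[[Claim], int]    # returns 1 (subro) or 0 (no subro)
Labeled = List[Tuple[Claim, int]]

def train_majority(train: Labeled) -> Model:
    """Baseline: always predict the most common label in the training data."""
    positives = sum(label for _, label in train)
    majority = 1 if positives * 2 >= len(train) else 0
    return lambda claim: majority

def train_stump(train: Labeled) -> Model:
    """One-level decision tree: use the single feature that agrees with the label most often."""
    features = sorted(train[0][0])
    best = max(features, key=lambda f: sum(c[f] == y for c, y in train))
    return lambda claim: claim[best]

def accuracy(model: Model, data: Labeled) -> float:
    """Share of claims whose predicted label equals the known outcome."""
    return sum(model(c) == y for c, y in data) / len(data)

def select_best(trainers: Dict[str, Callable[[Labeled], Model]],
                train: Labeled, holdout: Labeled):
    """Train every candidate and return (name, model, holdout accuracy) of the best one."""
    scored = []
    for name, trainer in trainers.items():
        model = trainer(train)
        scored.append((accuracy(model, holdout), name, model))
    best_acc, best_name, best_model = max(scored, key=lambda t: t[0])
    return best_name, best_model, best_acc

# Toy historical claims, invented for illustration only.
train = [
    ({"third_party": 1, "weekend": 0}, 1),
    ({"third_party": 1, "weekend": 1}, 1),
    ({"third_party": 0, "weekend": 1}, 0),
    ({"third_party": 0, "weekend": 0}, 0),
    ({"third_party": 0, "weekend": 1}, 0),
]
holdout = [
    ({"third_party": 1, "weekend": 0}, 1),
    ({"third_party": 0, "weekend": 0}, 0),
    ({"third_party": 1, "weekend": 1}, 1),
]

TRAINERS = {"majority": train_majority, "stump": train_stump}
best_name, best_model, best_acc = select_best(TRAINERS, train, holdout)
print(best_name, best_acc)
```

Holdout accuracy is only one possible selection criterion; with subrogation cases making up a few percent of claims, a measure that weights the rare positive class more heavily would likely be a better fit in practice.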
12 Training and Applying the Model
- Model Training
  - Modeling is carried out on data collected from claim forms and notes
  - Successful past subrogation cases are considered positive examples
  - Cases without subrogation are negative examples
  - A model learns combinations of features that determine positive cases
  - Another model predicts the amount of possible subrogation
  - The developed model is stored for future use
- Model Application
  - Models are applied to new data to produce scores
  - Calculate
    - Subrogation probability
    - Subrogation amount
  - Claims with the highest scores on these two attributes are presented for investigation by a human
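The training and application steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the PolyAnalyst implementation: the probability model is a small naive Bayes classifier over binary features, the amount model is simply the mean past recovery per claim category, and claims are ranked by probability times amount. The slides only say that claims scoring highest on the two attributes are shown to a human, so the ranking rule, the feature names, and the data are all hypothetical.

```python
from collections import defaultdict

# Toy history, invented for illustration: recovered=None means no subrogation.
HISTORY = [
    {"features": {"third_party", "rear_ended"}, "category": "auto", "recovered": 8000.0},
    {"features": {"third_party"}, "category": "auto", "recovered": 4000.0},
    {"features": {"third_party", "defective_equipment"}, "category": "work_comp", "recovered": 20000.0},
    {"features": {"slipped"}, "category": "work_comp", "recovered": None},
    {"features": {"slipped", "own_tools"}, "category": "work_comp", "recovered": None},
    {"features": {"own_tools"}, "category": "auto", "recovered": None},
]

def train(history):
    """Count feature frequencies in positive and negative cases; average recoveries by category."""
    pos = [c for c in history if c["recovered"] is not None]
    neg = [c for c in history if c["recovered"] is None]
    counts = {"pos": defaultdict(int), "neg": defaultdict(int)}
    for label, cases in (("pos", pos), ("neg", neg)):
        for case in cases:
            for feature in case["features"]:
                counts[label][feature] += 1
    amounts = defaultdict(list)
    for case in pos:
        amounts[case["category"]].append(case["recovered"])
    return {
        "n_pos": len(pos),
        "n_neg": len(neg),
        "counts": counts,
        "vocab": set(counts["pos"]) | set(counts["neg"]),
        "mean_amount": {k: sum(v) / len(v) for k, v in amounts.items()},
    }

def subro_probability(model, features):
    """Naive Bayes over binary features with add-one smoothing."""
    total = model["n_pos"] + model["n_neg"]
    score = {"pos": model["n_pos"] / total, "neg": model["n_neg"] / total}
    for label, n in (("pos", model["n_pos"]), ("neg", model["n_neg"])):
        for feature in model["vocab"]:
            rate = (model["counts"][label].get(feature, 0) + 1) / (n + 2)
            score[label] *= rate if feature in features else 1 - rate
    return score["pos"] / (score["pos"] + score["neg"])

def score_claims(model, claims):
    """Score new claims and sort them by probability times amount, highest first."""
    scored = []
    for claim in claims:
        prob = subro_probability(model, claim["features"])
        amount = model["mean_amount"].get(claim["category"], 0.0)
        scored.append({"id": claim["id"], "probability": prob, "amount": amount})
    return sorted(scored, key=lambda s: s["probability"] * s["amount"], reverse=True)

model = train(HISTORY)  # in practice the trained model would be stored for future use
new_claims = [
    {"id": "A", "features": {"slipped"}, "category": "work_comp"},
    {"id": "B", "features": {"third_party", "rear_ended"}, "category": "auto"},
]
ranked = score_claims(model, new_claims)
for row in ranked:
    print(row["id"], round(row["probability"], 2), row["amount"])
```

In this toy run claim B ranks first: its features resemble past successful subrogations, even though the average recovery in its category is lower than in claim A's.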
13 Investigations involve data analysis
[Figure labels: Data Analyst; Visual analytic scenario]
14 Behind the Scenes
15 Output: Subrogation Prediction
- Probability of the subrogation success
- Estimated recovery amount
16 Data Integration
17 Data Cleansing
18 Aggregation keys and attributes
19 Aggregations: measures
20 Derivative Attributes
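The slides on aggregation keys, measures, and derivative attributes carry only titles, so the following is just one plausible reading, sketched under assumptions: rows of payment data are grouped by an aggregation key (the claim), measures are computed over each group, and a derivative attribute is calculated from existing attributes. All field names and values are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transaction-level rows; every field name is invented for this sketch.
ROWS = [
    {"claim_id": "C1", "incident": date(2010, 3, 1), "reported": date(2010, 3, 4), "paid": 1200.0},
    {"claim_id": "C1", "incident": date(2010, 3, 1), "reported": date(2010, 3, 4), "paid": 300.0},
    {"claim_id": "C2", "incident": date(2010, 5, 10), "reported": date(2010, 5, 10), "paid": 50.0},
]

def aggregate_by_claim(rows):
    """Group rows by the aggregation key (claim_id) and compute one record per claim."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["claim_id"]].append(row)

    result = {}
    for claim_id, items in groups.items():
        first = items[0]
        result[claim_id] = {
            # Measures: aggregates over all rows that share the key.
            "payment_count": len(items),
            "total_paid": sum(r["paid"] for r in items),
            # Derivative attribute: computed from two existing attributes.
            "report_delay_days": (first["reported"] - first["incident"]).days,
        }
    return result

claims = aggregate_by_claim(ROWS)
print(claims["C1"])
```

The result has one row per claim, which is the shape a predictive model typically expects as input.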
21 Complications of Text Analysis
- The need to analyze free-text notes further complicates things
- Statistical tools are good at processing structured data, but not text
- Human analysts had to read text notes to extract relevant features
22 Text Mining Technology
- Text Mining is an automated process of analyzing text to extract information from it for particular purposes
- Text Mining is different from traditional search technology
  - In search, the user is typically looking for something that is already known and has been written by someone else
  - Text Mining involves pushing aside irrelevant material in order to extract relevant information
- Text Mining extracts relevant features from natural language notes. These features are included in modeling.
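How features extracted from natural language notes can feed a model is sketched below, under simplifying assumptions: each feature is a keyword pattern, and the output is a set of binary flags that a data model can consume. The feature names and patterns are hypothetical; a real text mining engine would rely on much richer linguistic analysis than keyword matching.

```python
import re

# Hypothetical feature patterns, chosen only to illustrate the idea.
FEATURE_PATTERNS = {
    "animal_involved": re.compile(r"\b(dog|dogs|animal)\b", re.IGNORECASE),
    "third_party_person": re.compile(r"\b(shoplifter|pushed|assaulted)\b", re.IGNORECASE),
    "fall": re.compile(r"\b(slipped|tripped|fell)\b", re.IGNORECASE),
    "machinery": re.compile(r"\b(slicer|band saw|forklift)\b", re.IGNORECASE),
}

def extract_features(note: str) -> dict:
    """Map a free-text note to binary features usable as model inputs."""
    return {name: int(bool(pattern.search(note)))
            for name, pattern in FEATURE_PATTERNS.items()}

note = ("EMP WAS WALKING BACK TO PKG CAR WHEN 2 DOGS BEGAN TO CHASE HIM, "
        "HE RAN SLIPPED ON STEPS OF PKG CAR")
features = extract_features(note)
print(features)
```

Flags such as `animal_involved` or `third_party_person` are the kind of signal that may point to another party's liability, which is why they are worth passing to the subrogation model alongside the structured fields.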
23 Typical Text Mining Tasks
- Categorization
- Feature and entity extraction
- Summarization
24 Complications of Text Analysis
- Typical textual descriptions
  - SLIPPED OFF BACK OFVAN LOADING TOOLS
  - PUSHED WHILE CONFRONTING AN ALLEGED SHOPLIFTER
  - TRIPPED ON A SHEET OF WIRE MESH FELL ON PAKRING LOT
  - REACHING FOR PAKAGES ON BELT WHEN HE TRIPPED OVER PAKAGES THAT WERE IN FRONT OF BELT AND FELL
  - EE WAS CUTTING ONIONS ON THE SLICER AND HE CUT OFF THE TIP OF HIS RIGHT THUM
  - CLT WAS STRUCK ON HEAD WITH ICE IN THE FREEZER
  - EMP WAS WALKING BACK TO PKG CAR WHEN 2 DOGS BEGAN TO CHASE HIM, HE RAN SLIPPED ON STEPS OF PKG CAR
  - EE WAS USING A BAND SAW TO CUT IRON FOREIGN BODY ENTERED LT EYE
25 Intelligent Spell-Checking
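Misspellings such as PAKRING, PAKAGES, and THUM in the sample notes show why spell-checking matters before feature extraction. The sketch below is a minimal approach assuming a known domain vocabulary and fuzzy matching with Python's standard `difflib` module. It is not the spell-checker used in PolyAnalyst, and the vocabulary and similarity cutoff are hypothetical.

```python
import difflib

# Hypothetical domain vocabulary; in practice it could combine a general
# dictionary with terms that occur frequently in claim notes.
VOCABULARY = ["PARKING", "PACKAGES", "THUMB", "TRIPPED", "FELL",
              "LOT", "ON", "BELT", "RIGHT"]

def correct_word(word: str, cutoff: float = 0.8) -> str:
    """Return the closest vocabulary word, or the word unchanged if none is close enough."""
    if word in VOCABULARY:
        return word
    matches = difflib.get_close_matches(word, VOCABULARY, n=1, cutoff=cutoff)
    return matches[0] if matches else word

def correct_note(note: str) -> str:
    """Correct a note word by word, splitting on whitespace."""
    return " ".join(correct_word(word) for word in note.split())

fixed = correct_note("FELL ON PAKRING LOT")
print(fixed)
```

Word-by-word matching cannot repair merged tokens such as OFVAN, and a high cutoff leaves unfamiliar but correct words untouched; both are deliberate limits of this sketch.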
26 Categorization: V2 rear ended V1
Key points of the claim
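A category such as "V2 rear ended V1" can be approximated with a pattern rule, as in the sketch below. The assumptions are hypothetical and not taken from the slides: that V1 denotes the policyholder's vehicle, that vehicles are written as V followed by a single digit, and that the category labels are free to choose.

```python
import re

# Matches phrases such as "V2 rear ended V1" or "V2 rear-ended V1".
REAR_END = re.compile(r"\bV(\d)\s+rear[\s-]?ended\s+V(\d)\b", re.IGNORECASE)

def categorize_rear_end(note: str, insured_vehicle: str = "1"):
    """Return a category label, or None if no rear-end pattern is found.

    Assumption for this sketch: V1 is the policyholder's vehicle.
    """
    match = REAR_END.search(note)
    if match is None:
        return None
    striker, struck = match.group(1), match.group(2)
    if struck == insured_vehicle and striker != insured_vehicle:
        return "insured_rear_ended_by_other"   # likely subrogation candidate
    if striker == insured_vehicle:
        return "insured_rear_ended_other"
    return "rear_end_between_others"

print(categorize_rear_end("V2 rear ended V1 at a stop light"))
```

Under the stated assumption, the first category is the interesting one for subrogation, because the other driver struck the insured vehicle from behind.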
27 Categorization: policyholder arrested
Key points of the claim
28 Domain-specific Dictionaries
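One use of a domain-specific dictionary is expanding the abbreviations that adjusters use in notes. The sketch below assumes readings for the abbreviations seen in the sample descriptions (EE, EMP, CLT, LT, PKG). These readings are guesses made for illustration and do not come from the slides or from any published dictionary.

```python
# Assumed readings of abbreviations from the sample notes (hypothetical).
ABBREVIATIONS = {
    "EE": "EMPLOYEE",
    "EMP": "EMPLOYEE",
    "CLT": "CLAIMANT",
    "LT": "LEFT",
    "PKG": "PACKAGE",
}

def expand_abbreviations(note: str) -> str:
    """Replace each whitespace-separated token that is a known abbreviation."""
    return " ".join(ABBREVIATIONS.get(token, token) for token in note.split())

expanded = expand_abbreviations(
    "EE WAS USING A BAND SAW TO CUT IRON FOREIGN BODY ENTERED LT EYE"
)
print(expanded)
```

Normalizing EE and EMP to the same word means later categorization rules need only one pattern per concept.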
29 Patterns related to Pain
30 Predicted Subro Probability for a Claim
31 Predicted Subro Amount for a Claim
32 PolyAnalyst Subro Prediction flow
[Flow diagram elements: New claim; Historical claims data; Text Mining; Extracted Features; Modeling; Subrogation Model; Subrogation prediction]
33 Touch Points for Modeling
- First Report of Incident
  - Detect subro opportunities while evidence is still available
  - Focus efforts only on claims that have good subro potential
  - Perform timely and thorough investigations
- Retrospective Analysis of Claims
  - Check closed and still-open claims
  - Identify missed subro opportunities
  - Pursue recovery whenever still possible
34 First Report of Incident (work comp)
- Available data
  - Date
  - Injury Type
  - Body part injured
  - Textual description of the incident
- Build models based on historical data
- Use a pre-built model to score new claims
35 Retrospective Claims Analysis
- Extra data (new)
  - Claim notes
  - Financial results
  - Applicable legislation, arbitration notices, etc.
- Build models based on historical examples
- Discover missed subrogation opportunities
36 PolyAnalyst Benefits
- Dramatic time and cost reduction
- Increase in quality and speed of the analysis
- Objective and uniform data-driven analysis
- Discovery of even unexpected issues suggested by data
- Automated monitoring of known problems
- Timely discovery of newly developing issues
- Utilization of 100% of available data: structured and text
- Up-to-date reports for executives
- Solution that is easy to use and maintain
37 Data and Text Mining in Insurance
- Fraud Detection
- Subrogation Prediction
- Database Marketing
- Response Prediction
- Cross-sell Analysis
- Market Segmentation
- Text Analysis
- Call Center transcripts analysis
- Survey analysis
- Competitive intelligence
- Compliance analysis
38 Select Customers
Government, Insurance, Financial, High Tech, Pharmaceutical, Marketing, Manufacturing
39 Contacting Megaputer
Call (812) 330-0110 or email info_at_megaputer.com
120 W Seventh Street, Suite 314, Bloomington, IN 47404, USA