Predictive Tax Compliance - PowerPoint PPT Presentation

About This Presentation
Title:

Predictive Tax Compliance

Description:

IRS Refund Fraud Detection Project Case Study. Where Does Data Mining ... Organization, Location, Name, Phone Number, etc. Custom Built Subject Dictionaries ... – PowerPoint PPT presentation

Slides: 31
Provided by: simran5

Transcript and Presenter's Notes

Title: Predictive Tax Compliance


1
Predictive Tax Compliance
  • Presentation to the IRS

SPSS: Benjamin Chard, Senior Solution Engineer, bchard_at_spss.com
SPSS: Sarah Mattingly, IRS Account Executive, smattingly_at_spss.com
SRA: Ted Fischer, Project Manager, ted_fischer_at_sra.com or theodore.i.fischer_at_irs.gov, 301-731-3534
2
Agenda
  • Introduction to Data Mining
  • Predictive Tax Compliance
  • Using Clementine for Audit Selection
  • What's New in Clementine Version 11.1
  • IRS Refund Fraud Detection Project Case Study

3
Where Does Data Mining Fit?
  • Operational Setting
  • Reporting
  • Case Mgt
  • Claim Scoring

Build Models Data Mining Workbench
  • Existing Data
  • Historical Claims
  • Current Claims

4
Data Mining vs. Query/Reporting
  • Reporting (Tables, Graphics, OLAP)
  • Provide you with a very good view of what is
    happening, but within a limited view of the data
    and only in models defined by the user

5
Statistics vs. Data Mining
Statistics: Hypothesis Testing
6
  • What is Data Mining?
  • Three classes of data mining algorithms

What events occur together? Given a series of
actions, what action is likely to occur next?
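
The first question above ("What events occur together?") can be illustrated with a small co-occurrence count. This is only a sketch: the flag names and the four example returns are hypothetical, and support and confidence are the usual association measures rather than anything the slides specify.

```python
from collections import Counter
from itertools import combinations

# Hypothetical data: each set lists the flags raised on one return.
returns = [
    {"home_office", "large_charity", "cash_business"},
    {"home_office", "cash_business"},
    {"large_charity"},
    {"home_office", "cash_business", "round_numbers"},
]

item_counts = Counter()
pair_counts = Counter()
for flags in returns:
    item_counts.update(flags)
    # Sort so each pair is always stored in the same order.
    pair_counts.update(combinations(sorted(flags), 2))

def support(a, b):
    """Fraction of all returns on which both flags appear."""
    return pair_counts[tuple(sorted((a, b)))] / len(returns)

def confidence(a, b):
    """Of the returns with flag a, the fraction that also have flag b."""
    return pair_counts[tuple(sorted((a, b)))] / item_counts[a]
```

A pair with high support and high confidence is a candidate "these occur together" pattern; whether it matters is still a judgment for the analyst.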
7
Predictive Tax Compliance
8
Predictive Tax Compliance
Register
Assess
Collect
  • Tax Collection
  • Risk Models
  • Audit Selection
  • Audit Models
  • Non-Filer Discovery
  • Soft-Matching
  • Prioritization Models

DATA MINING PREDICTIVE ANALYTICS TOOLS
DATA WAREHOUSE
  • Right work to the right resources at the right
    time

9
Predictive Modeling
  • Building a predictive profile of the claim that,
    after investigation, was flagged as an improper
    payment, regardless of amount.
  • Select positive investigations: maximize those
    claims with the highest dollar adjustment found
    per audit hour.
  • Minimize the number of no-change audits.
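
One way to read the two goals above together is to rank claims by expected dollars per audit hour, which rewards both a high chance of any adjustment and a large adjustment. The sketch below assumes a model has already produced the three estimates per claim; the field names, the greedy selection rule, and the numbers are illustrative assumptions, not the project's method.

```python
from dataclasses import dataclass

@dataclass
class Claim:
    claim_id: str
    p_adjustment: float      # model's probability of a positive adjustment
    est_adjustment: float    # estimated dollars if an adjustment is found
    est_audit_hours: float   # estimated hours to work the case

def expected_dollars_per_hour(c: Claim) -> float:
    return c.p_adjustment * c.est_adjustment / c.est_audit_hours

def select_for_audit(claims, hours_available):
    """Greedy pick: best expected yield per hour first, while hours remain."""
    chosen, used = [], 0.0
    for c in sorted(claims, key=expected_dollars_per_hour, reverse=True):
        if used + c.est_audit_hours <= hours_available:
            chosen.append(c.claim_id)
            used += c.est_audit_hours
    return chosen

# Hypothetical claims.
claims = [
    Claim("A", 0.90, 2000.0, 10.0),
    Claim("B", 0.20, 50000.0, 40.0),
    Claim("C", 0.60, 1000.0, 20.0),
]
```

With 50 hours available, `select_for_audit(claims, 50)` picks B and then A; claim C has the lowest expected yield per hour and no longer fits.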

10
Anomaly Detection
  • Find emerging trends in claims data. Use data
    mining to show the emerging patterns in current
    year data. Reported results will present specific
    cases that either
  • Exhibit a common pattern or
  • Exhibit an unusual pattern
  • Unusual cases are deployed to the field
    investigators for further analysis.
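
As a rough illustration of separating common from unusual cases, the sketch below flags values whose modified z-score (based on the median and the median absolute deviation) is large. The technique, the 3.5 cutoff, and the sample figures are assumptions chosen for the example; the slides do not say which anomaly detection method was used.

```python
from statistics import median

def unusual_cases(values, threshold=3.5):
    """Return indices whose modified z-score exceeds the threshold."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread, so nothing can be called unusual
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold]

# Hypothetical claimed deductions for similar filers.
deductions = [1200, 1350, 1100, 1280, 1190, 9800, 1240]
```

Here only the 9800 entry (index 5) is flagged; the rest sit close to the median and would be treated as the common pattern.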

11
Case Study Audit Selection Goals
  • Build models to predict different outcomes.
  • Positive Adjustment (Y/N).
  • DPH group membership.
  • Actual Adjustment.
  • Historical Cases selected for model build
  • Cases with prior audit: prior audit and
    organizational data.
  • All cases: organizational data only.
  • Deployment
  • For each outcome, combine predictions for those
    with and without previous audit data.
  • For each outcome predict using organizational
    data only.
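
The deployment rule above (one model for cases with prior audit data, another that uses organizational data only) can be sketched as a simple dispatch. The two scoring functions below are hypothetical stand-ins for trained models, and the case fields are invented for illustration.

```python
def score_case(case, prior_audit_model, org_only_model):
    """Use the richer model when prior audit data exists, else fall back."""
    if case.get("prior_audit") is not None:
        return prior_audit_model(case), "prior_audit+org"
    return org_only_model(case), "org_only"

# Hypothetical stand-in models: any callable returning a score would do.
def prior_audit_model(case):
    return 0.8 if case["prior_audit"]["adjusted"] else 0.3

def org_only_model(case):
    return 0.5 if case["org_type"] == "partnership" else 0.2

cases = [
    {"id": 1, "org_type": "partnership", "prior_audit": {"adjusted": True}},
    {"id": 2, "org_type": "partnership", "prior_audit": None},
    {"id": 3, "org_type": "corporation", "prior_audit": None},
]
scores = {c["id"]: score_case(c, prior_audit_model, org_only_model)
          for c in cases}
```

Keeping the model name next to each score makes it clear afterwards which cases were scored on the smaller set of inputs.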

12
Clementine Workbench
13
Case Study Results
14
Text Mining and Linguistic Extraction
15
Text Mining Timeline Text Extraction
Mr. Smith aka Mr. Ahmed was seen on the corner
of Church St. and Magnolia Ave. on Nov 13th
Bag of Words extraction
Mr. Smith (Person) -> aka (Alias) -> Mr. Ahmed
(Person) was seen (location) -> Church and
Magnolia (address) -> November 13 (Date)
Expressions extraction
Mr. Ahmed in database wanted for questioning
Suspect -> send agent to this location
Mr. Smith aka was seen with Ahmed on the corner of
Church Etc.
Named Entities extraction
Mr. Smith was seen Mr. Ahmed corner Church
St. Magnolia Ave. Nov 13th
Events/Sentiment Extraction
Mr. Smith -> Person, Mr. Ahmed -> Person, aka ->
Alias, was seen -> location, Church St. ->
Address, Magnolia Ave. -> Address, Nov 13th -> Date
Combined with structured data
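
Two of the extraction styles named on this slide, bag of words and named entities, can be approximated on the slide's own example sentence with regular expressions. This is a minimal sketch: the stop list and the patterns are hand-written assumptions for this one sentence, not the linguistic resources a text mining product would ship.

```python
import re

sentence = ("Mr. Smith aka Mr. Ahmed was seen on the corner of "
            "Church St. and Magnolia Ave. on Nov 13th")

# Bag of words: drop a small, hand-picked stop list and keep the rest.
STOP = {"aka", "on", "the", "of", "and"}
tokens = re.findall(r"[A-Za-z0-9]+\.?", sentence)
bag = [t for t in tokens if t.lower() not in STOP]

# Named entities: hypothetical patterns, one per entity type.
PATTERNS = {
    "Person": r"Mr\. [A-Z][a-z]+",
    "Address": r"[A-Z][a-z]+ (?:St|Ave)\.",
    "Date": r"(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) "
            r"\d{1,2}(?:st|nd|rd|th)?",
    "Alias": r"\baka\b",
}
entities = [(m.group(), label)
            for label, pattern in PATTERNS.items()
            for m in re.finditer(pattern, sentence)]
```

The bag of words keeps the terms but loses the relationships; the entity list keeps a type for each match, which is what allows the result to be joined to structured data such as a persons table.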
16
Text Mining Management
  • General Dictionaries
  • Organization, Location, Name, Phone Number, etc
  • Custom Built Subject Dictionaries
  • Tax Code, Form Names, Commodity, Business, etc
  • Interactive Synonym Dictionaries
  • Exclude Dictionaries
  • NEW! Classification algorithms enable you to
    aggregate concepts from a wide variety of
    unstructured text data and group them into a
    small number of categories.
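
The dictionary types listed above can be pictured as three lookups applied in turn: synonyms are normalized, excluded words are dropped, and the remaining terms are matched against subject categories. The entries below are invented for illustration, and plain substring matching stands in for real linguistic matching.

```python
# Hypothetical dictionaries; the entries are illustrative only.
SYNONYMS = {"sched c": "schedule c", "form 1040": "1040", "biz": "business"}
EXCLUDE = {"the", "a", "of", "and", "see"}
CATEGORIES = {
    "Form Names": {"1040", "schedule c", "w-2"},
    "Business": {"business", "partnership"},
}

def normalize(text):
    """Lowercase, then replace each synonym with its canonical term."""
    text = text.lower()
    for variant, canonical in SYNONYMS.items():
        text = text.replace(variant, canonical)
    return text

def categorize(text):
    """Return the categories whose terms appear in the cleaned text."""
    words = [w for w in normalize(text).replace(",", " ").split()
             if w not in EXCLUDE]
    cleaned = " ".join(words)
    return sorted(cat for cat, terms in CATEGORIES.items()
                  if any(term in cleaned for term in terms))
```

For example, `categorize("See Sched C of the biz")` yields both categories, because the synonym dictionary maps "Sched C" and "biz" to their canonical terms first.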

17
What's New
18
Binary Classifier Automation of Many Models
  • Sophisticated users build hundreds of models
    (scripting)
  • Binary Classifier Node imitates this
  • but easily, with a pre-built node
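
The scripting approach this slide refers to, building many candidate models for a yes/no target and ranking them, can be sketched with very simple one-rule models. Everything here is an assumption for illustration: the training rows are made up, the candidates are single-threshold rules rather than real algorithms, and this is not a description of how the Binary Classifier node works internally.

```python
# Hypothetical training rows: (features, label), label 1 = positive adjustment.
rows = [
    ({"deductions": 9000, "income": 40000}, 1),
    ({"deductions": 8000, "income": 90000}, 1),
    ({"deductions": 1000, "income": 45000}, 0),
    ({"deductions": 1500, "income": 85000}, 0),
    ({"deductions": 7000, "income": 50000}, 1),
    ({"deductions": 2000, "income": 60000}, 0),
]

def make_stump(feature, cut):
    """A one-rule model: predict 1 when the feature exceeds the cut point."""
    return lambda x: 1 if x[feature] > cut else 0

def accuracy(model):
    return sum(model(x) == y for x, y in rows) / len(rows)

# Build one candidate per (feature, cut point) and rank them all.
candidates = []
for feature in ("deductions", "income"):
    for cut in sorted({x[feature] for x, _ in rows}):
        candidates.append((accuracy(make_stump(feature, cut)), feature, cut))
candidates.sort(reverse=True)
best_accuracy, best_feature, best_cut = candidates[0]
```

Accuracy is measured on the training rows only to keep the sketch short; a real comparison would score each candidate on held-out data.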

19
Time Series Algorithm
  • ARIMA, Exponential Smoothing
  • Expert Modeler finds best model automatically
  • Forecast Multiple Series at once
  • Data Preparation Tools
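
Of the two methods named above, simple exponential smoothing is short enough to show directly. The sketch below is the textbook recurrence, applied to several series at once; the function names and the fixed smoothing weight are assumptions, and it does not attempt the automatic model selection the slide attributes to the Expert Modeler.

```python
def exponential_smoothing(series, alpha):
    """Simple exponential smoothing; returns the smoothed level at each step."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]
    levels = [level]
    for y in series[1:]:
        # New level = weighted mix of the latest value and the old level.
        level = alpha * y + (1 - alpha) * level
        levels.append(level)
    return levels

def forecast_many(named_series, alpha=0.5):
    """One-step-ahead forecast (the last level) for several series at once."""
    return {name: exponential_smoothing(values, alpha)[-1]
            for name, values in named_series.items()}
```

A larger `alpha` follows recent values more closely; a smaller one gives a smoother forecast that reacts more slowly.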

20
Optimal Binning
  • Splitting up numeric data into sub-ranges
  • New capability to make this optimal for prediction

New Capability: Optimal bins
Existing Capability: Equal bins
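
The difference between equal bins and bins chosen for prediction can be shown with one cut point. The sketch below compares an equal-width cut with a cut chosen to minimize the weighted entropy of a yes/no target. Entropy-based splitting is one common supervised approach and is used here as an assumption; the slides do not state which criterion Clementine applies, and the data is made up.

```python
from math import log2

def equal_width_cuts(values, n_bins):
    """Cut points that split the value range into equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + width * i for i in range(1, n_bins)]

def entropy(labels):
    """Binary entropy of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return -sum(q * log2(q) for q in (p, 1 - p) if q > 0)

def best_supervised_cut(values, labels):
    """One cut point chosen to minimize the weighted entropy of the target."""
    pairs = sorted(zip(values, labels))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no cut between equal values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * entropy(left)
                 + len(right) * entropy(right)) / len(pairs)
        cut = (pairs[i - 1][0] + pairs[i][0]) / 2
        if best is None or score < best[0]:
            best = (score, cut)
    return best[1]

# Hypothetical amounts and whether each case was flagged.
amounts = [100, 200, 300, 400, 5000, 10000]
flagged = [0, 0, 0, 0, 1, 1]
```

On this data the equal-width cut falls at 5050 and puts a flagged case in the low bin, while the supervised cut at 2700 separates flagged from unflagged cases cleanly.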
21
SPSS Reporting
  • SPSS Statistics and Graphs Within Clementine

22
Configuration Management
  • Predictive Enterprise Services (PES): Top Four

23
Deployment and Integration
  • Configuration Management
  • Exporting Data, Models and Streams
  • Explore and Describe

24
1. Improve Collaboration
  • In a single project there is the potential to
    create a large number of models and versions of
    models
  • different out variables
  • different algorithms
  • different settings
  • different training samples.
  • X different data sets
  • X different users
  • X different locations.

25
2. Improve Transparency
  • Provide information on which models are run on
    which data.
  • For audit standards, track who has made changes
    to the model and when.

Your analytics team from their desktop can see
which models were most recently run on data, so
that they would be able to provide this for
internal audits.
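
The kind of record that supports this can be pictured as an append-only log of changes and runs. The sketch below is a hypothetical in-memory version; the field names, model names, users, and dates are invented, and it does not represent the Predictive Enterprise Services interface.

```python
from datetime import datetime, timezone

audit_log = []  # in practice this would live in a shared repository

def record(event, model, user, dataset=None, when=None):
    """Append one entry: who did what to which model, and when."""
    audit_log.append({
        "event": event, "model": model, "user": user, "dataset": dataset,
        "when": when or datetime.now(timezone.utc),
    })

def latest_model_per_dataset():
    """For each data set, the model most recently run on it."""
    latest = {}
    for entry in audit_log:
        if entry["event"] != "run":
            continue
        seen = latest.get(entry["dataset"])
        if seen is None or entry["when"] >= seen["when"]:
            latest[entry["dataset"]] = entry
    return {name: e["model"] for name, e in latest.items()}

def day(d):
    """Hypothetical timestamps for the example entries."""
    return datetime(2007, 1, d, tzinfo=timezone.utc)

record("change", "audit_v2", "analyst_a", when=day(3))
record("run", "audit_v1", "analyst_b", dataset="claims_2006", when=day(2))
record("run", "audit_v2", "analyst_a", dataset="claims_2006", when=day(4))
```

Calling `latest_model_per_dataset()` on these entries reports `audit_v2` for `claims_2006`, and the "change" entries answer the question of who modified a model and when.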
26
3. Automate Process
  • Combine Clementine, SPSS, SAS and other processes
  • Scheduling and notification

27
4. Centralize and Control Access
28
Contact information
  • Project personnel
  • Ted Fischer ted_fischer_at_sra.com or
    theodore.i.fischer_at_irs.gov, 301-731-3534
  • Anthony Colyandro anthony_colyandro_at_sra.com or
    anthony.colyandro_at_irs.gov, 301-731-3524
  • SRA Director of Business Intelligence
  • Dave Vennergrund dave_vennergrund_at_sra.com,
    703-803-1614

29
How do I get SPSS software?
IRS: Cathy J. Allen, Enterprise System Management, Software Management Section, Idea Branch - MS 5850
(304) 264-7279 - voice, (304) 279-5309 - cell, (304) 260-3033 - fax
cathy.j.allen_at_irs.gov
SPSS Contacts
Account Executive: Sarah Mattingly, Email smattingly_at_spss.com, W 703-740-2446, C 703-389-6485
Account Manager: Matt Madden, W 312 651 3894