Title: Name: Stuart Hamilton
1 Using Technology to Improve Compliance
State Compliance Conference
External
9 November 2006
UNCLASSIFIED
Optimising Compliance: The role of analytic techniques
- Presented by Stuart Hamilton
- Assistant Commissioner, Corporate Intelligence & Risk, Australian Taxation Office
Version 1.1
2 Contents
- Context: The Australian Taxation Office
- ATO business model
- Resource constrained optimisation
- Views on risk: tax gap or risk to budgeted revenue
- Understanding our clients
- Integrate intelligence (qualitative) and analytics (quantitative)
- Compliance model view and degree of personalisation
- What are we measuring
- Distribution of client scores
- Client risk profile
- Selecting the right treatment: champion / challenger treatment evaluation
- Selecting the right model
- Selecting the right mix
3 Context: The Australian Taxation Office
- Some highlights from our Annual Report for 2005/6:
- Net tax collections of $232.6b (principal revenue collection agency).
- $7.5b in transfers and payments (second largest payer of benefits).
- Operating expenditure of $2.5b with 21,500 staff.
- In midst of major ($453m) systems change program. Implemented Siebel CRM as our single case management system (down from over 100 separate systems).
- Processed 13.5m tax returns, 12.9m activity statements and 18.1m payments.
- Trialled pre-population of some aspects of client returns.
- Some 1.4m returns lodged online using e-Tax (an increase of 27% on the prior year).
- Some 11.6m log-ins to our tax agent portal. Some 9m phone calls received.
- Implemented around 100 new legislative measures.
- Raised $6.9b from compliance activities and collected $4.5b.
- Compliance activities (excl lodgment debt): 84k fieldwork, 331k phone, 1.1m letters.
- We are a significant business from any viewpoint.
4 ATO business model
- Business intent: To optimise voluntary compliance and make payments under the law in a way that builds community confidence
[Figure: ATO business model diagram; labelled element: Analytics]
5 Resource constrained optimisation
- Revenue authorities aren't resourced to go after every dollar, and even if they were, they couldn't in practice.
6 US IRS view of 2001 theoretical tax gap: does it help us?
Views on Risk: Tax Gap or Risk to Budgeted Revenue
- Compliance isn't black and white.
- The law often requires interpretation and views will differ.
- Clients may not comply for a variety of reasons, from ignorance of the law, to differing views of its application, to honest mistakes, to carelessness, negligence and deliberate intent.
7 Risk to budgeted revenue - from compliance movements
8 Community relationship model
9 Understanding our clients - discovery v detection
10 Understanding the data, understanding the client
- Exploratory data analysis
- "It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts."
- Sherlock Holmes in A Scandal in Bohemia (1891)
Tools: SAS JMP, SAS Insight, SAS EM, NCR WHM, Rattle
11 Integrate intelligence (qualitative) and analytics (quantitative)
12 Integrate intelligence (qualitative) and analytics (quantitative)
13 Intelligence risk management - analytics
- Analytic underpinning
14 Compliance model view and degree of personalisation
[Figure: compliance model, with Analytics informing the choice of treatment]
- Investigate: prosecute (civil / criminal)
- Audit: penalise (administrative), detect, deterrence
- Review: advise, assist to comply
- Market: educate, assist to comply / make it easy
15 What are we measuring: key client obligations
- OECD client obligations:
- Registering in the system (either with the revenue authority or with some other body)
- Lodging or filing the appropriate forms on time
- Providing accurate information on those forms
- Making any transfers or payments due on time
- Most revenue systems also require a client to maintain records of appropriate information for some set period, i.e.:
- Keeping records that allow verification of the information used to satisfy the above obligations.
16 What are we measuring: common measuring sticks
- Without a standard measuring stick, views on relative risk will be more subjective.
- ΔTax (Delta tax): The change in primary tax associated with the non-compliance, i.e. identifies those who may have the most tax wrong. An absolute amount. Example: Client A may have underpaid $5,500 in tax in year y.
- ΔTax/(ΔTax + Tax) (Severity): The relative severity of the non-compliance as a percentage of tax paid, i.e. identifies those who may have most of their tax wrong. A relative value. Example: Client A may have underpaid 15% of their tax in year y.
- Cf(ΔTax) (Confidence): The confidence interval associated with our estimate of ΔTax, i.e. identifies how confident we are of the estimate in ΔTax. Example: We are 90% confident that Client A underpaid $5,500 in tax in year y.
- Pf(ΔTax) (Proportion collectable): The proportion of ΔTax estimated to be collectable. A function of a client's propensity to pay and their capacity to pay. Example: We estimate that 80% of the $5,500 estimated to be underpaid by Client A will be collectable.
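The four measuring sticks above can be worked through for Client A in a few lines. This is only an illustrative sketch: the class and field names are hypothetical, and the tax paid figure is an assumption chosen so that the severity comes out at the slide's 15%.

```python
from dataclasses import dataclass


@dataclass
class ClientEstimate:
    """Hypothetical container for one client's risk estimates (illustrative names)."""
    delta_tax: float    # delta tax: estimated tax underpaid, in dollars
    tax_paid: float     # tax actually paid, in dollars (assumed figure below)
    confidence: float   # confidence in the delta tax estimate, 0..1
    collectable: float  # proportion of delta tax estimated to be collectable, 0..1

    def severity(self) -> float:
        """Relative severity: delta_tax / (delta_tax + tax_paid), between 0 and 1."""
        total = self.delta_tax + self.tax_paid
        return self.delta_tax / total if total > 0 else 0.0

    def expected_collectable(self) -> float:
        """Dollar amount of delta tax estimated to be collectable."""
        return self.delta_tax * self.collectable


# Client A from the slide. tax_paid is NOT from the slide: it is back-solved
# so that $5,500 underpaid corresponds to a severity of 15%.
client_a = ClientEstimate(
    delta_tax=5500.0,
    tax_paid=5500.0 * 0.85 / 0.15,
    confidence=0.90,
    collectable=0.80,
)

print(f"Severity: {client_a.severity():.0%}")
print(f"Expected collectable: ${client_a.expected_collectable():,.0f}")
```

Keeping the absolute amount, the relative severity and the collectable share as separate fields lets the same client record be ranked in different ways depending on the question being asked.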
17 Distribution of client scores that equate to revenue risk
We propose using ΔTax as a standard risk measure. Cases would be prioritised by ΔTax.
18 Distribution of client scores that fit with verification intensity
ΔTax: Who avoided or evaded the most tax?
ΔTax scores tell us who we predict evaded or avoided the most tax in absolute terms, a critical factor for a revenue collection agency. Using ΔTax for lodgement, reporting and account scores enables a consistent view of risk across obligations and products.
[Figure: Confidence distribution in ΔTax estimate. Vertical axis: n; horizontal axis: ΔTax.]
19 Distribution of client scores that fit with compliance model
Severity, ΔTax/(ΔTax + Tax): Who avoided or evaded most of their tax?
ΔTax/(ΔTax + Tax) scores tell us who we predict evaded or avoided most of their tax in relative terms, a critical factor for a revenue collection agency looking at serious non-compliance and aggressive tax planning.
[Figure: Confidence distribution in ΔTax/(ΔTax + Tax) estimate. Vertical axis: n; horizontal axis: ΔTax/(ΔTax + Tax), from 0 to 1.]
20 Client risk profile
21 Initial risk modelling
Initial modelling has focussed on the Income Tax and GST product obligations. This will be extended over time to cover all product and obligation types.
Initial modelling target areas
[Figure: matrix of Risk Score cells, with administrative products as rows and obligations as columns, rolling up to a Whole of Client Score]
Obligation -> propensity measured:
- Lodge: Propensity to Lodge On-time
- Register: Propensity to Register Correctly
- Report/Advise: Propensity for Correct Information
- Account: Propensity to Pay On-time, In full
- Fully Compliant: Propensity to Meet All Obligations
Administrative products (weighted scores):
- Income Tax
- GST
- Excise
- Super
- (other: FBT etc)
- All Products (weighted scores)
Risk attributes:
- Report/Advise: Assessment History, Label Analysis, Ratio Analysis, Refunds/Liabilities
- Register: Registration History, Proof of Identity
- Lodge: Lodgment History, Timeliness, Ageing, Predicted revenue
- Account: Payment History, Debt Level, Timeliness/Ageing, Capacity to pay
Whole of Client Score
NOTE: Client Scores can be further aggregated to support Industry, Occupation and Product Risk Scores for the whole client population.
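As a rough illustration of how weighted scores might roll up, the sketch below combines obligation scores into a product score and product scores into a whole of client score using a weighted average. The slides do not state the weighting scheme, so the weighted average, the weights and the scores here are all assumptions made for illustration.

```python
def weighted_score(scores, weights):
    """Weighted average of the scores, using the weight for each key in scores."""
    total_weight = sum(weights[key] for key in scores)
    return sum(scores[key] * weights[key] for key in scores) / total_weight


# Hypothetical risk scores (0..1) per obligation for one client's Income Tax product.
obligation_scores = {"register": 0.10, "lodge": 0.40, "report": 0.70, "account": 0.20}
# Hypothetical weights: the slides do not say how obligations are weighted.
obligation_weights = {"register": 1.0, "lodge": 2.0, "report": 4.0, "account": 3.0}

income_tax_score = weighted_score(obligation_scores, obligation_weights)

# Roll product scores up to a whole of client score (GST score and weights assumed).
product_scores = {"income_tax": income_tax_score, "gst": 0.25}
product_weights = {"income_tax": 3.0, "gst": 1.0}
whole_of_client_score = weighted_score(product_scores, product_weights)

print(f"Income Tax score: {income_tax_score:.3f}")
print(f"Whole of client score: {whole_of_client_score:.3f}")
```

The same function could be applied again across clients to build the industry, occupation or product level scores mentioned in the note.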
22 Champion / challenger treatment evaluation
- Champion treatment assigned to majority (80%) of target segment (similar clients)
- Challenger(s) treatment(s) assigned to minority (2 x 10%) of target segment (similar clients)
- >> Identify treatment that gives best long term outcome: make champion
- >> Invent new challenger treatments to test
- Note: Champion/challenger control groups give you the information needed to evaluate the effectiveness of your treatment strategies.
[Figure: return on investment (vertical axis) against time (horizontal axis) for the champion treatment, challenger treatment 1 and challenger treatment 2, marking break even, the current ROI trajectory, today and potential actions.]
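The 80 / 10 / 10 split described on this slide can be sketched as a random assignment followed by a per-treatment comparison. This is a minimal sketch only: the function names and the outcome measure are hypothetical, and a real evaluation would also need to check that differences between groups are statistically meaningful.

```python
import random
from collections import Counter, defaultdict


def assign_treatments(client_ids, seed=0):
    """Randomly split a segment: 80% champion, 10% to each of two challengers."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(client_ids)
    rng.shuffle(shuffled)
    n = len(shuffled)
    cut1, cut2 = n * 80 // 100, n * 90 // 100
    assignment = {}
    for i, cid in enumerate(shuffled):
        if i < cut1:
            assignment[cid] = "champion"
        elif i < cut2:
            assignment[cid] = "challenger_1"
        else:
            assignment[cid] = "challenger_2"
    return assignment


def mean_outcome(assignment, outcomes):
    """Average observed outcome (for example net revenue per case) by treatment."""
    totals, counts = defaultdict(float), Counter()
    for cid, treatment in assignment.items():
        if cid in outcomes:
            totals[treatment] += outcomes[cid]
            counts[treatment] += 1
    return {treatment: totals[treatment] / counts[treatment] for treatment in counts}


# Hypothetical segment of 1,000 similar clients.
assignment = assign_treatments(range(1000))
group_sizes = Counter(assignment.values())
print(group_sizes)
```

Random assignment is what makes the groups comparable: because clients in the segment are similar and allocated by chance, a sustained difference in outcome can be attributed to the treatment rather than to who received it.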
23 Client Scoring for treatment selection
So we can personalise our treatment strategies to the client.
[Figure: decision tree of rules derived from data to assign scores, with leaves leading to treatments: Letter X, Letter Y, Audit, Call, Review]
In fact scores are likely to be done via several models voting together: ensembles.
24 Simple decision tree model: now grow 500 and have them vote
- Models can be relatively simple conceptually, to more complex, such as:
- Random Forest approaches
- Support Vector Machines
- Neural Networks
- Even where a complex method is used, it is useful to have a simple decision tree for explanatory purposes (why was this client selected?)
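The idea of growing 500 trees and having them vote can be sketched in plain Python. To keep it short this sketch uses one-split trees (stumps) fitted to bootstrap samples rather than full decision trees, and the two features and labels are made-up data; a production model would more likely use an established random forest implementation.

```python
import random


def fit_stump(rows, labels, rng):
    """Fit a one-split tree on a bootstrap sample; returns (feature, threshold, high_label)."""
    n = len(rows)
    sample = [rng.randrange(n) for _ in range(n)]  # bootstrap: draw n rows with replacement
    best = None
    for feature in range(len(rows[0])):
        for i in sample:
            threshold = rows[i][feature]
            for high_label in (0, 1):  # class predicted when value > threshold
                errors = sum(
                    (high_label if rows[j][feature] > threshold else 1 - high_label)
                    != labels[j]
                    for j in sample
                )
                if best is None or errors < best[0]:
                    best = (errors, feature, threshold, high_label)
    return best[1:]


def stump_predict(stump, row):
    """Apply a single stump's rule to one client row."""
    feature, threshold, high_label = stump
    return high_label if row[feature] > threshold else 1 - high_label


def ensemble_predict(stumps, row):
    """Majority vote across all stumps: 1 = select client, 0 = do not select."""
    votes = sum(stump_predict(stump, row) for stump in stumps)
    return 1 if votes * 2 > len(stumps) else 0


# Made-up data: feature 0 drives the label, feature 1 is unrelated to it.
rows = [[(i + 0.5) / 20, ((i * 7) % 20) / 20] for i in range(20)]
labels = [1 if row[0] > 0.5 else 0 for row in rows]

rng = random.Random(42)
forest = [fit_stump(rows, labels, rng) for _ in range(500)]

print(ensemble_predict(forest, [0.9, 0.3]))
print(ensemble_predict(forest, [0.1, 0.3]))
```

Any single stump in `forest` is also a readable rule of the form "feature above threshold", which is the kind of simple explanation the slide suggests keeping alongside a more complex model.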
25 Revenue lift over methods that don't prioritise clients
Diagrams such as risk charts allow management to see the revenue / caseload trade-off that an analytic model provides. Often 40% to 50% of the caseload will provide 90% to 95% of the revenue when an analytic model is deployed.
If there is a mechanism to prioritise cases within a pool, then the revenue result will be higher at lower caseload levels. If cases are prioritised on revenue outcome, the mechanism lifts the revenue result above what would otherwise result from a random selection within the case pool. The strike rate line will be higher at lower caseload levels and fall off as more of the original caseload is done.
If there is no effective mechanism to prioritise cases within a case pool, then the revenue result will be linearly linked to the case numbers. With significant numbers of reasonably similar cases this line will be a 45° line, i.e. 20% of cases will give you 20% of the revenue. The strike rate line will be essentially flat across the pool, at a level equal to the number of productive cases in the pool over the total number of cases.
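The prioritised curve on a risk chart can be computed directly from scored cases. The sketch below orders cases by risk score and accumulates the share of revenue captured at each share of caseload; the ten cases and their revenue figures are invented for illustration and are not ATO data.

```python
def cumulative_revenue_curve(cases):
    """cases: list of (risk_score, revenue) pairs.

    Returns a list of (caseload_share, revenue_share) points, with cases
    worked in descending order of risk score.
    """
    ordered = sorted(cases, key=lambda case: case[0], reverse=True)
    total_revenue = sum(revenue for _, revenue in ordered)
    curve, running = [], 0.0
    for worked, (_, revenue) in enumerate(ordered, start=1):
        running += revenue
        curve.append((worked / len(ordered), running / total_revenue))
    return curve


# Invented pool of ten cases: (risk score, revenue raised if the case is worked).
cases = [
    (0.95, 500), (0.90, 300), (0.80, 100), (0.70, 50), (0.60, 30),
    (0.50, 10), (0.40, 10), (0.30, 0), (0.20, 0), (0.10, 0),
]

curve = cumulative_revenue_curve(cases)
for caseload_share, revenue_share in curve:
    # Under random selection the expected revenue share equals the caseload share.
    lift = revenue_share - caseload_share
    print(f"{caseload_share:.0%} of cases -> {revenue_share:.0%} of revenue (lift {lift:+.0%})")
```

In this invented pool the first 40% of cases capture 95% of the revenue, whereas random selection would be expected to capture about 40%; the gap between the two lines is the lift the prioritisation provides.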
26 Risk chart: performance v caseload
- Risk charts provide an easy to understand view to management of the trade-off between caseload and revenue, allowing more informed decisions to be made regarding resource use.
- Here 40% of the caseload yields 82% of the revenue, while 70% of cases gives 98% of the revenue.
27 The impact of strike rate changes varies
- Targeting: effectiveness or efficiency?
- Fixed staffing / fixed revenue impacts
Effort time differential can be overlooked, and it can make a real difference.
28 Understanding which model works best
29 Understanding which model works best: area under ROC curve
- A variety of risk scoring models can be compared by seeing where they outperform another model and by how much.
- Create ensembles that outperform a single model.
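Area under the ROC curve can be computed without plotting, as the probability that a randomly chosen non-compliant case is scored higher than a randomly chosen compliant one. The sketch below uses that pairwise definition to compare two models; the labels and both sets of scores are invented for illustration.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    Each (positive, negative) pair counts 1 if the positive is scored higher,
    0.5 if the scores tie, and 0 otherwise.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Invented outcomes for six audited cases: 1 = adjustment found, 0 = no adjustment.
labels = [1, 1, 0, 1, 0, 0]
# Invented scores from two hypothetical risk models for the same six cases.
model_a_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]
model_b_scores = [0.6, 0.4, 0.5, 0.3, 0.7, 0.2]

auc_a = auc(model_a_scores, labels)
auc_b = auc(model_b_scores, labels)
print(f"Model A AUC: {auc_a:.3f}")
print(f"Model B AUC: {auc_b:.3f}")
```

An AUC of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so a single number per model gives a quick first comparison before looking at where on the curve one model outperforms another.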
30 Operationalising the results
31 Optimising case mix: linear programming / simulation
- Decision support approaches such as linear programming can assist judgements regarding the numbers and types of cases to pursue.
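A very small case mix problem can illustrate the idea. With a single staff hours constraint and a cap on each case type, the linear programme of maximising revenue is solved exactly by working case types in order of revenue per hour, so the sketch below uses that greedy rule rather than a solver. The case types, hours, revenues and caps are all invented, and a realistic problem with several constraints (coverage targets, skills, budgeted revenue) would need a proper LP solver.

```python
def optimise_case_mix(case_types, staff_hours):
    """Choose how many cases of each type to work within a staff hours budget.

    Greedy by revenue per hour, which is optimal for the linear programme with
    one resource constraint and an upper bound on each case type.
    """
    mix, remaining = {}, float(staff_hours)
    ranked = sorted(case_types, key=lambda c: c["revenue"] / c["hours"], reverse=True)
    for case_type in ranked:
        cases = min(case_type["max_cases"], remaining / case_type["hours"])
        mix[case_type["name"]] = cases
        remaining -= cases * case_type["hours"]
    revenue = sum(mix[c["name"]] * c["revenue"] for c in case_types)
    return mix, revenue


# Invented case types: hours and expected revenue per case, and available case pool.
case_types = [
    {"name": "audit", "hours": 40.0, "revenue": 20000.0, "max_cases": 100},
    {"name": "review", "hours": 8.0, "revenue": 3000.0, "max_cases": 500},
    {"name": "letter", "hours": 0.5, "revenue": 100.0, "max_cases": 10000},
]

mix, revenue = optimise_case_mix(case_types, staff_hours=6000)
print(mix)
print(f"Expected revenue: ${revenue:,.0f}")
```

With these invented figures all available hours go to audits and reviews; adding a coverage constraint, for example a minimum number of letters, would change the answer, which is the kind of trade-off the decision support approach is meant to expose.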
32 End to end process
[Figure: end to end process flow, with the following elements]
- Modelling
- Optimise treatment candidate selection
- Coverage / revenue targets
- Optimise risk priority case mix selection
- Operationalise analytics
- Siebel work / case management
33 Applying results of data mining
Four steps, in increasing degree of sophistication:
1. Apply New Risk Segmentation: Instead of using value or market segment as proxy for risk, identify actual group and its characteristics. Create new language and awareness of risk.
2. Tune Screening Rules: Adjust screening rules (thresholds, ratios, exceptions) to reflect better understanding of risk. Look at adjusting, combining rules. Can be applied straight away.
3. Optimise a Treatment Strategy: Find the optimal point to maximise revenue collection, while minimising caseload and occurrence of fraud. Apply risk scores to case selection to get best overall outcomes.
4. Optimise Treatment Portfolio: Find the optimal point to maximise revenue collection, while minimising caseload and occurrence of fraud for the whole of treatment portfolio. Optimise the treatment mix.
Optimisation is more than picking the right clients: the right treatment and right work mix also need to be optimised.
34 Questions?
Techniques: Regression Models, K Nearest Neighbor, Neural Networks, Decision Trees, Self Organized Maps, Text Mining, Sampling, Outlier Filtering, Assessment