Title: Terminology and Knowledge Engineering in Fraud Detection
1. Terminology and Knowledge Engineering in Fraud Detection
Koen Kerremans, Rita Temmerman
Centrum voor Vaktaal en Communicatie (CVC), Department of Applied Linguistics, Erasmushogeschool Brussel
http://cvc.ehb.be
Gang Zhao
Semantics Technology and Applications Research Laboratory (STAR Lab), Department of Computer Science, Vrije Universiteit Brussel
http://www.starlab.vub.ac.be
2.
- How are terminology and knowledge engineering used in the fight against financial fraud?
- How can terminology and knowledge engineering methods be organised into a development process for technological solutions in the fight against financial fraud?
3. General outline
- FF POIROT
  - Aims
  - Cases
  - Partners
- Methodologies
  - AKEM (knowledge engineering)
  - Termontography (terminology engineering)
- Interaction of methodologies
- Future work
- Conclusion
4. FF POIROT
Financial Fraud Prevention Oriented Information Resources using Ontology Technology
- Aims
  - Apply Semantic Web technology to fraud detection and prevention, thereby showing the potential of ontologies in these areas
  - Construct multilingual terminological as well as formal knowledge repositories covering the domains of interest
  - Propose methods and guidelines in terminology and knowledge engineering
  - Develop new and/or improve existing tools to support terminology and knowledge engineering
5. FF POIROT cases
- VAT carousel fraud
  - VAT fraud in which fraudsters sell goods at VAT-inclusive prices and disappear without passing on to the tax authorities the VAT paid by their customers
  - Companies unwittingly involved in this type of fraud can be held responsible for the missing VAT
  - Each company has to find out whether or not it is safe to do business with a trader from another EU country
- On-line investment fraud
  - The selling of overpriced or worthless shares, bonds, or other financial instruments to the general public
  - In Italy, Consob searches for suspicious websites via traditional search engines such as Google and Altavista
6. FF POIROT
- Use of the ontology
  - Knowledge management of fraud investigative expertise
  - Information exchange between investigative bodies
  - Automation of parts of monitoring or investigative procedures with knowledge-based applications (e.g. information extraction)
- Use of multilingual terminology
  - Dictionary purposes
  - Multilingual keywords in information extraction
  - Explanation of reasoning in natural language
  - Knowledge resource consulted during ontology development
7. FF POIROT partners
8. AKEM
- Application Knowledge Engineering Methodology
- Development cycle
  - Knowledge scoping (result stories)
  - Knowledge analysis
  - Ontology development
  - Deployment
9. AKEM
- Based on DOGMA
  - Developing Ontology-Guided Mediation for Agents
  - Ontology: a set of lexons and their commitments in particular applications
  - Lexon: a grouping element stored in a lexon base and composed of terms and roles
  - <Context, Term_1, Role_1, Term_2, Role_2>
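A minimal sketch of how a lexon and a lexon base could be represented, assuming a lexon is simply the five-part tuple above. The class name, the helper `terms_in` and the sample lexons are hypothetical and only serve as illustration; they are not taken from DOGMA or from the project's lexon base.

```python
from typing import NamedTuple


class Lexon(NamedTuple):
    """One lexon: <Context, Term_1, Role_1, Term_2, Role_2>."""
    context: str
    term_1: str
    role_1: str
    term_2: str
    role_2: str


def terms_in(lexon_base: set[Lexon], context: str) -> set[str]:
    """Collect every term that occurs in a lexon of the given context."""
    return {
        term
        for lexon in lexon_base
        if lexon.context == context
        for term in (lexon.term_1, lexon.term_2)
    }


# Hypothetical lexons; contexts, terms and role names are invented here.
lexon_base = {
    Lexon("VAT", "Trader", "issues", "Invoice", "issued_by"),
    Lexon("VAT", "Invoice", "states", "VAT amount", "stated_on"),
    Lexon("Investment", "Company", "files", "Prospectus", "filed_by"),
}

vat_terms = terms_in(lexon_base, "VAT")
```

Keeping the context as part of each lexon means the same term can appear in several contexts without the entries colliding.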
10. AKEM
- Why Application Knowledge Engineering Methodology?
  - There is a need to organise a geographically distributed, multidisciplinary team of domain experts, knowledge analysts and engineers in a methodical, traceable development cycle
  - There is a need to examine how knowledge can be extracted from different perspectives on fraud to improve the quality of the fraud ontology
11. AKEM
- An example of a legal view: Wigmore chart
  - Blue: hypothesis
  - Red: claim
  - Purple: evidence
  - Green: fact
[Figure: Wigmore chart with nodes H1; 1.1, 1.2, 1.3; E1.1.1, E1.2.1, E1.3.1; F1.1.1.1, F1.1.1.2]
12. AKEM
- H1: Public offer of company X is unlawful
  - 1.1: X solicits investors on the WWW
    - E1.1.1: X manages website that solicits investors
      - F1.1.1.1: Website states name X
      - F1.1.1.2: Registration details indicate X as registrant of website
  - 1.2: No advance notice of solicitation to Consob
    - E1.2.1: X did not give a notification to Consob regarding public offer to purchase
  - 1.3: No related prospectus filed with Consob
    - E1.3.1: X did not draft or file a prospectus with Consob regarding public offer to purchase
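The argument structure above can be sketched as a small tree, assuming that the label prefixes correspond to the node types of a Wigmore chart (H: hypothesis, no prefix: claim, E: evidence, F: fact). The `Node` class and `walk` helper are hypothetical names introduced for illustration only.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    kind: str  # "hypothesis", "claim", "evidence" or "fact" (assumed mapping)
    text: str
    children: list["Node"] = field(default_factory=list)


def walk(node: Node):
    """Yield the node and all of its descendants, depth first."""
    yield node
    for child in node.children:
        yield from walk(child)


chart = Node("H1", "hypothesis", "Public offer of company X is unlawful", [
    Node("1.1", "claim", "X solicits investors on the WWW", [
        Node("E1.1.1", "evidence", "X manages website that solicits investors", [
            Node("F1.1.1.1", "fact", "Website states name X"),
            Node("F1.1.1.2", "fact",
                 "Registration details indicate X as registrant of website"),
        ]),
    ]),
    Node("1.2", "claim", "No advance notice of solicitation to Consob", [
        Node("E1.2.1", "evidence",
             "X did not give a notification to Consob regarding public offer to purchase"),
    ]),
    Node("1.3", "claim", "No related prospectus filed with Consob", [
        Node("E1.3.1", "evidence",
             "X did not draft or file a prospectus with Consob regarding public offer to purchase"),
    ]),
])

# Count nodes per type and list the facts that ground the argument.
kinds = Counter(node.kind for node in walk(chart))
fact_labels = [node.label for node in walk(chart) if node.kind == "fact"]
```

A tree of this kind keeps each claim traceable to the evidence and facts that support it.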
13. AKEM
- Extracting knowledge constituents and abstracting them into production rules allows knowledge modellers to identify and organise the abstract concepts and relations into a lexon base
- Example
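As a hypothetical illustration (not the project's own example), the sketch below abstracts the unlawful public offer argument into one production rule and then reads off the concepts and relations that a modeller could carry over to a lexon base. It assumes that the hypothesis holds when the solicitation is known and neither a notification nor a prospectus is known; the `Rule` class, the triple format and all concept and relation names are invented.

```python
from dataclasses import dataclass

Fact = tuple[str, str, str]  # (subject, relation, object), an assumed format


@dataclass(frozen=True)
class Rule:
    conclusion: str
    required: frozenset[Fact]  # facts that must be known
    absent: frozenset[Fact]    # facts that must not be known

    def fires(self, facts: set[Fact]) -> bool:
        """True if all required facts are known and no absent fact is known."""
        return self.required <= facts and not (self.absent & facts)

    def vocabulary(self) -> tuple[set[str], set[str]]:
        """Concepts and relations mentioned anywhere in the rule."""
        triples = self.required | self.absent
        concepts = {term for s, _, o in triples for term in (s, o)}
        relations = {r for _, r, _ in triples}
        return concepts, relations


rule = Rule(
    conclusion="Public offer is unlawful",
    required=frozenset({("Company", "solicits", "Investor")}),
    absent=frozenset({
        ("Company", "notifies", "Consob"),
        ("Company", "files", "Prospectus"),
    }),
)

concepts, relations = rule.vocabulary()
known = {("Company", "solicits", "Investor")}
is_unlawful = rule.fires(known)
```

The vocabulary returned by the rule is only a starting list; deciding contexts and roles for each lexon remains a modelling task.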
14. Termontography
- A terminological approach in which (multilingual) terminological knowledge, retrieved from texts, is structured according to a framework of knowledge (i.e. categorisation framework)
- Why Termontography?
  - Terminographers need a common reference framework to scope the terminology work
  - There are significant commonalities between terminology compilation and text-based ontology development
  - In our view, a terminological analysis can contribute to the formalisation of a given domain
15. Termontography
[Figure: Termontography workflow]
- Phases
  - Knowledge Analysis phase (1)
  - Information gathering phase (2)
  - Search phase (3)
  - Refinement phase (4)
  - Verification phase (5)
  - Validation phase (6)
- Other labels in the figure: TSR categorisation framework; (mono- or multilingual) domain-specific corpus; first version of termontological database; (mono- or multilingual) termontological database; Ontology; Dictionary; Domain experts
16. Interaction of methodologies
[Figure labels: AKEM; TERMONTOGRAPHY; KNOWLEDGE SCOPING]
17. Interaction of methodologies
- Knowledge scoping
  - Developing terminological resources and ontological repositories requires above all an insight into the domain of interest
  - Domain experts can support the knowledge acquisition process by pointing out the relevant categories/topics (given the envisaged tasks/applications)
  - Example: Transactions for which no VAT is required
18. Interaction of methodologies
- Terminology base → ontology development
  - Rationale
    - The AKEM extraction task looks for basic semantic elements and follows linguistic units in natural language texts
    - Experience shows that ontology engineers resort from time to time to terminological resources for background information or exact definitions
  - Characteristics of terminological analysis
    - Special emphasis on documenting semantic contexts by means of textual contexts
    - Entries also include linguistic semantic descriptions such as agent-predicate-patient/recipient links and cross-references among items of content
21. Interaction of methodologies
- Consequences
  - Productivity of ontology engineers is improved by suggestions from terminologists, who examine the same knowledge resources
  - Terminography adds a linguistic viewpoint to application-specific modelling
  - During ontology development, multilingual terminological information can help discover semantic gaps across languages due to social and cultural differences and facilitate consensus building in a multilingual team of developers
22. Future work
- How to represent meaning variations of lexicalisations referring to the same category?
  - E.g. an event on which VAT is to be paid
    - Irish legislation: chargeable event
      - VAT will be due on the date the invoice is issued
    - UK legislation: chargeable event
      - VAT will be due no later than the 15th day following the month in which the supply takes place
    - French legislation: fait générateur
      - VAT will be due at the moment the goods are supplied
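One possible way to make such variations visible is sketched below, assuming each lexicalisation is stored together with the legislation it comes from and its local meaning. The `Lexicalisation` class, its field names, the language codes and the helper `variants_by_term` are hypothetical; this is an illustration of the open question, not a proposed solution.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Lexicalisation:
    term: str
    language: str     # language code, added here as an assumption
    legislation: str
    due: str          # when VAT becomes due under that legislation


CATEGORY = "event on which VAT is to be paid"

entries = [
    Lexicalisation("chargeable event", "en", "Irish",
                   "on the date the invoice is issued"),
    Lexicalisation("chargeable event", "en", "UK",
                   "no later than the 15th day following the month "
                   "in which the supply takes place"),
    Lexicalisation("fait générateur", "fr", "French",
                   "at the moment the goods are supplied"),
]


def variants_by_term(items: list[Lexicalisation]) -> dict[str, set[str]]:
    """Return the terms that carry more than one meaning across legislations."""
    grouped: dict[str, set[str]] = defaultdict(set)
    for item in items:
        grouped[item.term].add(item.due)
    return {term: dues for term, dues in grouped.items() if len(dues) > 1}


ambiguous = variants_by_term(entries)
```

Here the same English term ends up with two distinct meanings, which is exactly the kind of variation a single category label would hide.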
24. Conclusion
- We have shown how terminology and knowledge engineering are used in the fight against financial fraud
- We have shown how the methods of experts with different backgrounds have been translated into a coherent and traceable workflow
26. Conclusion
- Project
  - http://www.ffpoirot.org
- Partners
  - STAR Lab: http://www.starlab.vub.ac.be
  - CVC: http://cvc.ehb.be
  - JBC: http://www.cfslr.ed.ac.uk/
  - RACAI: http://www.racai.ro
  - KS: http://www.knowledgestones.com/
  - LC: http://www.landcglobal.com/index.php
  - Consob: http://www.consob.it/main/index.html
  - VAT@: http://www.vatat.com