David Budet - PowerPoint PPT Presentation

About This Presentation
Title:

David Budet

Description:

Violent assaults against customers and employees. Delays ... Findings: Customer Assaults ... Findings: Employee Assaults ... – PowerPoint PPT presentation

Slides: 21
Provided by: Hec14
Learn more at: http://csis.pace.edu
Category:
Tags: assault | budet | david


Transcript and Presenter's Notes

Title: David Budet


1
Data Mining Customer and Employee-Related Subway
Incidents: Phase II
  • David Budet
  • Mariel Castro
  • Jason Jaworski
  • Yevgeny Khait
  • Florangel Marte
  • Client: Richard Washington, NYC Transit Authority

2
Presentation Summary
  • Project Description
  • Review
  • Progression
  • City Crime vs. Subway Crime
  • Results: Customer Assaults
  • Results: Employee Assaults
  • Results: Robberies (Simple Theft)
  • Results: Train Delays
  • Weka ID3 Decision Trees
  • Future Research Avenues

3
Project Description
  • Phase I concentrated on looking at incidents and
    identifying reasons for aggression, specifically
    what effects delays had on aggression incidents
  • Phase II is more specifically concentrated on
    subway assaults and possible correlations with
    the data's attributes
  • Main focus of both phases: analysis of a dataset
    of incidents which occurred in the New York City
    Subway system over multiple years and mining of
    the data to establish relationships and trends

4
Review
The first half of the study focused on mining
data with Microsoft SQL Server 2008 and the
program Weka. Utilizing these tools and team
methodologies, we determined which stations and
train lines had the most:
  • Violent assaults against customers and employees
  • Delays
  • Simple thefts (unarmed robberies, pick-pocketing,
    etc.)

5
Progression
The second half of the study had a more regional
focus. The team:
  • Acquired US Census data regarding crime and
    population in NYC
  • Normalized the Census crime data and subway crime
    data by population for Manhattan, Brooklyn,
    Queens and the Bronx 
  • Analyzed Subway crime as a microcosm of overall
    NYC crime for 2007
  • Created an interactive JavaScript map pinpointing
    stations with most violent incidents and delays
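
The population normalization step could be sketched roughly as below. This is only an illustration, not the team's actual procedure: the function name `rate_per_100k` and every figure are hypothetical placeholders, not values from the Census or subway datasets.

```python
def rate_per_100k(incidents, population):
    """Return incidents per 100,000 residents for each borough."""
    rates = {}
    for borough, count in incidents.items():
        rates[borough] = count / population[borough] * 100_000
    return rates


# Placeholder figures for illustration only (not real data).
incidents = {"Borough A": 500, "Borough B": 300}
population = {"Borough A": 2_000_000, "Borough B": 1_000_000}

rates = rate_per_100k(incidents, population)
# List boroughs from highest to lowest per-capita rate.
for borough, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{borough}: {rate:.1f} per 100,000 residents")
```

In this made-up example the borough with fewer raw incidents ends up with the higher per-capita rate, which is the kind of effect normalization is meant to expose.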

6
City Crime vs. Subway Crime
In comparing overall crime in New York City for
2007 to crime in the NYC Subway system:
  • We found that Manhattan, though the third largest
    borough in terms of population, accounted for
    over half the crime in NYC
  • The Bronx had the smallest population of the four
    boroughs studied, but the second highest rate of
    crime per resident
  • Subway crime accounts for a smaller percentage of
    overall crime in Manhattan than in the other three
    boroughs researched

7
City Crime vs. Subway Crime
8
City Crime vs. Subway Crime
  • When normalized for population, subway crime
    in Brooklyn and Queens accounts for a greater
    percentage of overall crime than in Manhattan and
    the Bronx, signaling that these boroughs may have
    more dangerous, or more incident-prone, stations
    than Manhattan or the Bronx.
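
A minimal sketch of the share comparison described above, assuming per-borough totals are available. The function name `subway_share` and the numbers are hypothetical placeholders, not figures from the study.

```python
def subway_share(subway_crimes, total_crimes):
    """Percentage of each borough's overall crime that occurred in the subway."""
    return {
        borough: 100.0 * subway_crimes[borough] / total_crimes[borough]
        for borough in subway_crimes
    }


# Placeholder counts for illustration only (not real data).
subway_crimes = {"Borough A": 50, "Borough B": 90}
total_crimes = {"Borough A": 1_000, "Borough B": 1_200}

shares = subway_share(subway_crimes, total_crimes)
for borough, pct in shares.items():
    print(f"{borough}: subway crime is {pct:.1f}% of overall crime")
```

Note that when subway crime and overall crime are both divided by the same borough population, the population cancels out, so this share can be computed from raw counts directly.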

9
Findings: Customer Assaults
The stations with the most assaults (all types of
assault) against customers from 2005 to 2007 were
59th Street, 14th Street and 125th Street.
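
A ranking like this can be produced by counting incident records per station. The sketch below assumes each record is a dictionary with hypothetical `station` and `victim` fields; the real dataset's schema is not shown in the slides, and the records here are placeholders.

```python
from collections import Counter


def top_stations(records, victim, n=3):
    """Return the n stations with the most assault records for a victim type."""
    counts = Counter(r["station"] for r in records if r["victim"] == victim)
    return counts.most_common(n)


# Placeholder records for illustration only (not real data).
records = [
    {"station": "Station A", "victim": "customer"},
    {"station": "Station B", "victim": "customer"},
    {"station": "Station A", "victim": "customer"},
    {"station": "Station C", "victim": "employee"},
    {"station": "Station A", "victim": "customer"},
    {"station": "Station B", "victim": "customer"},
    {"station": "Station C", "victim": "customer"},
]

ranking = top_stations(records, "customer", n=2)
print(ranking)
```

The same function applies to employee assaults or to train lines by changing the `victim` argument or the grouping field.
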
10
Findings: Customer Assaults
Between 2005 and 2007, the highest number of
assaults (all types) committed against customers
took place on the A, 2 and 4 lines.
11
Findings: Employee Assaults
Stations with more than 5 total assaults (all
types of assault) against employees between 2005
and 2007
12
Findings: Employee Assaults
Between 2005 and 2007, the highest number of
assaults (all types) committed against employees
took place on the 6, 2 and A lines.
13
Findings: Robberies (Simple Theft)
14
Findings: Robberies (Simple Theft)
15
Findings: Train Delays
Number of delays by month over a 3-year period
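
A monthly tally of this kind could be computed as sketched below, assuming each delay has a known calendar date. The function name `delays_by_month` and the dates are hypothetical placeholders, not values from the dataset.

```python
from collections import Counter
from datetime import date


def delays_by_month(delay_dates):
    """Count delays per (year, month), returned in chronological order."""
    counts = Counter((d.year, d.month) for d in delay_dates)
    return dict(sorted(counts.items()))


# Placeholder dates for illustration only (not real data).
delay_dates = [
    date(2006, 1, 14),
    date(2005, 3, 2),
    date(2005, 3, 19),
    date(2006, 1, 30),
    date(2005, 3, 27),
]

monthly = delays_by_month(delay_dates)
for (year, month), count in monthly.items():
    print(f"{year}-{month:02d}: {count} delays")
```
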
16
Findings: Train Delays
17
Findings: Train Delays
18
Weka ID3 Decision Tree
19
Weka ID3 Decision Tree
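
ID3 builds a decision tree by repeatedly splitting on the attribute with the highest information gain. The sketch below shows only that selection step in plain Python; it is not Weka's implementation, and the attribute names (`time_of_day`, `borough`), the target (`incident`) and the rows are hypothetical placeholders.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in target entropy obtained by splitting rows on attribute."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


def best_attribute(rows, attributes, target):
    """Attribute that ID3 would choose for the next split."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))


# Placeholder rows for illustration only (not real data).
rows = [
    {"time_of_day": "night", "borough": "A", "incident": "assault"},
    {"time_of_day": "night", "borough": "B", "incident": "assault"},
    {"time_of_day": "day", "borough": "A", "incident": "theft"},
    {"time_of_day": "day", "borough": "B", "incident": "theft"},
]

print(best_attribute(rows, ["time_of_day", "borough"], "incident"))
```

In these placeholder rows `time_of_day` separates the two incident types perfectly while `borough` carries no information, so `time_of_day` is selected.
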
20
Future Research Avenues
  • MTA and project team can separately mine an
    identical data set and introduce an objective
    methodology for determining the best results and
    techniques from both databases
  • Continue in-depth data mining
  • Identify and research other algorithms in Weka
    conducive to mining and correlating NYC Subway
    data (we propose the next team utilize clustering
    analysis via the algorithm SimpleKMeans)
  • Investigate possible correlations between
    neighborhood income levels and stations where
    subway crime is prevalent
  • Continue to expand and build on JavaScript map
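
To give a sense of what the proposed clustering analysis would do, here is a plain-Python sketch of the k-means procedure that Weka's SimpleKMeans is based on. It is not Weka's code: the deterministic choice of starting centroids, the feature pair (assaults, delays) per station, and all values are assumptions made for illustration.

```python
from math import dist


def k_means(points, k, iterations=100):
    """Plain k-means (Lloyd's algorithm) on tuples of numbers.

    Centroids start at the first k points, an arbitrary deterministic choice.
    """
    centroids = [tuple(p) for p in points[:k]]
    assignments = []
    for _ in range(iterations):
        # Assign every point to its nearest centroid (Euclidean distance).
        new_assignments = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        if new_assignments == assignments:
            break  # No point changed cluster, so the algorithm has converged.
        assignments = new_assignments
        # Move each centroid to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centroids, assignments


# Placeholder (assaults, delays) pairs per station, not real data.
stations = [(1, 1), (9, 9), (1, 2), (9, 8), (2, 1), (8, 9)]

centroids, assignments = k_means(stations, k=2)
print(assignments)
```

With these placeholder points the stations fall into a low-incident group and a high-incident group; on real data the number of clusters and the features used would need to be chosen by the next team.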