Sequence Models in Modern AI

Provided by: ginal5

1
Sequence Models in Modern AI
  • Probabilistic sequence models
  • HMMs, N-grams
  • Train from available data
  • Classification with contextual influence
  • Robust to noise/variability
  • E.g. Sentences vary in degrees of acceptability
  • Provides ranking of sequence quality
  • Exploits large scale data, storage, memory, CPU

2
Computer Vision
  • CMSC 25000
  • Artificial Intelligence
  • March 1, 2007

3
Roadmap
  • Motivation
  • Computer vision applications
  • Is a Picture worth a thousand words?
  • Low level features
  • Feature extraction: intensity, color
  • High level features
  • Top-down constraint: shape from stereo, motion, ...
  • Case Study: Vision as Modern AI
  • Fast, robust face detection (Viola & Jones 2002)

4
Perception
  • From observation to facts about world
  • Analogous to speech recognition
  • Stimulus (Percept) S, World W
  • $S = g(W)$
  • Recognition: Derive world from percept
  • $W = g^{-1}(S)$
  • Is this possible?

5
Key Perception Problem
  • Massive ambiguity
  • Optical illusions
  • Occlusion
  • Depth perception
  • Objects are closer than they appear
  • Is it full-sized or a miniature model?

6
Image Ambiguity
7
Handling Uncertainty
  • Identify single perfect correct solution
  • Impossible!
  • Noise, ambiguity, complexity
  • Solution
  • Probabilistic model
  • $P(W \mid S) = \alpha\, P(S \mid W)\, P(W)$
  • Maximize image probability and model probability
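
The probabilistic model on this slide can be sketched in a few lines. This is only an illustration: the two world hypotheses and all the probabilities below are made-up numbers, and `posterior` is a hypothetical helper name, not something from the lecture.

```python
def posterior(prior, likelihood):
    """Return P(W|S) for each world hypothesis W, given one observed percept S.

    prior[w]      = P(W = w)
    likelihood[w] = P(S | W = w) for the percept that was actually observed
    """
    unnormalized = {w: likelihood[w] * prior[w] for w in prior}
    alpha = 1.0 / sum(unnormalized.values())  # normalizing constant
    return {w: alpha * p for w, p in unnormalized.items()}


# Hypothetical numbers: is the object in the image full-sized or a miniature?
prior = {"full_sized": 0.9, "miniature": 0.1}
likelihood = {"full_sized": 0.2, "miniature": 0.6}

post = posterior(prior, likelihood)
best_world = max(post, key=post.get)  # most probable world given the percept
```

Although the percept is three times more likely under the miniature hypothesis, the strong prior keeps the full-sized hypothesis on top, which is the sense in which both the image probability and the model probability matter.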

8
Handling Complexity
  • Don't solve the whole problem
  • Don't recover every object/position/color
  • Solve restricted problem
  • Find all the faces
  • Recognize a person
  • Align two images

9
Modern Computer Vision Applications
  • Face / Object detection
  • Medical image registration
  • Face recognition
  • Object tracking

10
Vision Subsystems
11
Image Formation
12
Images and Representations
  • Initially pixel images
  • Image as NxM matrix of pixel values
  • Alternate image codings
  • Grey-scale intensity values
  • Color: encoding intensities of RGB values
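
As a small illustration of the two codings, a color image can be held as an N x M matrix of (R, G, B) tuples and reduced to a grey-scale intensity matrix. The slides do not say how to convert; the weights below are the ITU-R BT.601 luma weights, chosen here as an assumption, and `to_grey` is a hypothetical helper name.

```python
# Assumed luma weights (ITU-R BT.601); the slides do not specify a conversion.
WEIGHTS = (0.299, 0.587, 0.114)


def to_grey(color_image):
    """Map an N x M matrix of (R, G, B) tuples to an N x M matrix of intensities."""
    return [
        [round(sum(w * ch for w, ch in zip(WEIGHTS, pixel))) for pixel in row]
        for row in color_image
    ]


# A 2 x 2 color image: red, green, blue, and white pixels.
color = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]
grey = to_grey(color)
```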

13
Images
14
Grey-scale Images
15
Color Images
16
Image Features
  • Grey-scale and color intensities
  • Directly access image signal values
  • Large number of measures
  • Possibly noisy
  • Only care about intensities as cues to world
  • Image Features
  • Mid-level representation
  • Extract from raw intensities
  • Capture elements of interest for image
    understanding

17
Edge Detection
18
Edge Detection
  • Find sharp demarcations in intensity
  • 1) Apply spatially oriented filters
  • E.g. vertical, horizontal, diagonal
  • 2) Label above-threshold pixels with edge
    orientation
  • 3) Combine edge segments with same orientation
    into a line
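
Steps 1 and 2 above can be sketched as follows. The slide does not name a specific filter; 3x3 Sobel kernels are assumed here, only two orientations are used, and step 3 (linking segments into lines) is left out. The function names are hypothetical.

```python
# Assumed filters: 3x3 Sobel kernels. One responds to vertical edges,
# the other to horizontal edges.
KERNELS = {
    "vertical": [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]],
    "horizontal": [[-1, -2, -1], [0, 0, 0], [1, 2, 1]],
}


def filter_response(image, kernel, r, c):
    """Correlate a 3x3 kernel with the neighborhood centered on (r, c)."""
    return sum(
        kernel[dr + 1][dc + 1] * image[r + dr][c + dc]
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
    )


def label_edges(image, threshold):
    """Return a grid with an orientation label (or None) for each pixel."""
    rows, cols = len(image), len(image[0])
    labels = [[None] * cols for _ in range(rows)]
    for r in range(1, rows - 1):          # border pixels are left unlabeled
        for c in range(1, cols - 1):
            responses = {
                name: abs(filter_response(image, kernel, r, c))
                for name, kernel in KERNELS.items()
            }
            best = max(responses, key=responses.get)
            if responses[best] > threshold:
                labels[r][c] = best
    return labels


# Dark left half, bright right half: a vertical edge between columns 1 and 2.
image = [[0, 0, 9, 9] for _ in range(4)]
labels = label_edges(image, threshold=10)
```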

19
Top-down Constraints
  • Goal: Extract objects from images
  • Approach: apply knowledge about how the world
    works to identify coherent objects, reconstruct
    3D

20
Motion: Optical Flow
  • Find correspondences in sequential images
  • Units which move together represent objects

21
Stereo
22
Stereo Depth Resolution
23
Texture and Shading
24
Edge-Based 2D-to-3D Reconstruction
  • Assume world of solid polyhedra with 3-edge
    vertices
  • Apply Waltz line labeling via constraint
    satisfaction
25
Basic Object Recognition
  • Simple idea
  • extract 3-D shapes from image
  • match against "shape library"
  • Problems
  • extracting curved surfaces from image
  • representing shape of extracted object
  • representing shape and variability of library
    object classes
  • improper segmentation, occlusion
  • unknown illumination, shadows, markings, noise,
    complexity, etc.
  • Approaches
  • index into library by measuring invariant
    properties of objects
  • alignment of image feature with projected library
    object feature
  • match image against multiple stored views
    (aspects) of library object
  • machine learning methods based on image
    statistics

26
Hand-written Digit Recognition
27
Summary
  • Vision is hard
  • Noise, ambiguity, complexity
  • Prior knowledge is essential to constrain problem
  • Cohesion of objects, optics, object features
  • Combine multiple cues
  • Motion, stereo, shading, texture, ...
  • Image/object matching
  • Library features, lines, edges, etc
  • Apply domain knowledge: Optics
  • Apply machine learning: NN, NN, CSP, etc.

28
Computer Vision Case Study
  • Rapid Object Detection using a Boosted Cascade
    of Simple Features, Viola/Jones 01
  • Challenge
  • Object detection
  • Find all faces in an arbitrary image
  • Real-time execution
  • 15 frames per second
  • Need simple features, classifiers

29
Rapid Object Detection Overview
  • Fast detection with simple local features
  • Simple fast feature extraction
  • Small number of computations per pixel
  • Rectangular features
  • Feature selection with AdaBoost
  • Sequential feature refinement
  • Cascade of classifiers
  • Increasingly complex classifiers
  • Repeatedly rule out non-object areas

30
Picking Features
  • What cues do we use for object detection?
  • Not direct pixel intensities
  • Features
  • Can encode task specific domain knowledge (bias)
  • Difficult to learn directly from data
  • Reduce training set size
  • Feature system can speed processing

31
Rectangle Features
  • Treat rectangles as units
  • Derive statistics
  • Two-rectangle features
  • Two similar rectangular regions
  • Vertically or horizontally adjacent
  • Sum pixels in each region
  • Compute difference between regions

32
Rectangle Features II
  • Three-rectangle features
  • 3 similar rectangles horizontally/vertically
  • Sum outside rectangles
  • Subtract from center region
  • Four-rectangle features
  • Compute difference between diagonal pairs
  • HUGE feature set: 180,000
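
The two-, three-, and four-rectangle features can be sketched with naive pixel sums. This is a minimal illustration: the sign conventions (which region is subtracted from which), the horizontal layout, and the function names are assumptions, and the sums are computed directly rather than efficiently.

```python
def region_sum(img, x, y, w, h):
    """Naive sum of the w-by-h rectangle whose top-left pixel is (x, y)."""
    return sum(img[r][c] for r in range(y, y + h) for c in range(x, x + w))


def two_rect(img, x, y, w, h):
    """Two horizontally adjacent w-by-h rectangles: right sum minus left sum."""
    return region_sum(img, x + w, y, w, h) - region_sum(img, x, y, w, h)


def three_rect(img, x, y, w, h):
    """Three horizontally adjacent rectangles: center sum minus both outer sums."""
    outer = region_sum(img, x, y, w, h) + region_sum(img, x + 2 * w, y, w, h)
    return region_sum(img, x + w, y, w, h) - outer


def four_rect(img, x, y, w, h):
    """2x2 grid of rectangles: one diagonal pair minus the other diagonal pair."""
    main = region_sum(img, x, y, w, h) + region_sum(img, x + w, y + h, w, h)
    anti = region_sum(img, x + w, y, w, h) + region_sum(img, x, y + h, w, h)
    return main - anti


# A bright vertical bar in the middle of a dark strip.
bar = [[0, 0, 9, 9, 0, 0], [0, 0, 9, 9, 0, 0]]
edge_response = two_rect(bar, 0, 0, 2, 2)
bar_response = three_rect(bar, 0, 0, 2, 2)

# A 2x2 checkerboard.
checker = [[1, 0], [0, 1]]
checker_response = four_rect(checker, 0, 0, 1, 1)
```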

33
Rectangle Features
34
Computing Features Efficiently
  • Fast detection requires fast feature calculation
  • Rapidly compute intermediate representation
  • Integral image
  • Value for point (x,y) is sum of pixels above and
    to the left
  • $ii(x,y) = \sum_{x' \le x,\; y' \le y} i(x',y')$
  • Computed by recurrence
  • $s(x,y) = s(x,y-1) + i(x,y)$, where $s(x,y)$ is the
    cumulative row sum
  • $ii(x,y) = ii(x-1,y) + s(x,y)$
  • Compute rectangle sum with 4 array references
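
The integral image and the four-reference rectangle sum can be sketched as below. The image is assumed to be a list of rows indexed as `img[y][x]`, the sum is taken as inclusive of the point itself (consistent with the recurrence), and the boundary values s(x,-1) and ii(-1,y) are taken to be 0. Function names are hypothetical.

```python
def integral_image(img):
    """ii[y][x] = sum of img[y'][x'] over all x' <= x and y' <= y."""
    height, width = len(img), len(img[0])
    ii = [[0] * width for _ in range(height)]
    for x in range(width):
        s = 0  # running sum s(x, y) over y at fixed x; s(x, -1) = 0
        for y in range(height):
            s += img[y][x]                        # s(x,y) = s(x,y-1) + i(x,y)
            left = ii[y][x - 1] if x > 0 else 0   # ii(-1, y) = 0
            ii[y][x] = left + s                   # ii(x,y) = ii(x-1,y) + s(x,y)
    return ii


def rect_sum(ii, x0, y0, x1, y1):
    """Sum of pixels with x0 <= x <= x1 and y0 <= y <= y1, using 4 lookups."""
    def at(x, y):
        return ii[y][x] if x >= 0 and y >= 0 else 0

    return at(x1, y1) - at(x0 - 1, y1) - at(x1, y0 - 1) + at(x0 - 1, y0 - 1)


img = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
ii = integral_image(img)
lower_right = rect_sum(ii, 1, 1, 2, 2)   # 5 + 6 + 8 + 9
first_column = rect_sum(ii, 0, 0, 0, 2)  # 1 + 4 + 7
```

Once `ii` is built in a single pass, every rectangle sum costs the same four lookups regardless of the rectangle's size, which is what makes evaluating many rectangle features per sub-window cheap.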

35
Rectangle Feature Summary
  • Rectangle features
  • Relatively simple
  • Sensitive to bars, edges, simple structure
  • Coarse
  • Rich enough for effective learning
  • Efficiently computable

36
Learning an Image Classifier
  • Supervised training: +/- examples
  • Many learning approaches possible
  • AdaBoost
  • Selects features AND trains classifier
  • Improves performance of simple classifiers
  • Training error guaranteed to approach zero
    exponentially rapidly in the number of rounds
  • Basic idea: Simple classifier
  • Boosts performance by focusing on previous errors

37
Feature Selection and Training
  • Goal: Pick only useful features from 180,000
  • Idea: A small number of features is effective
  • Learner selects single feature that best
    separates +ve/-ve examples
  • Learner selects optimal threshold for each
    feature
  • Classifier: $h(x) = 1$ if $p\,f(x) < p\,\theta$, 0 otherwise
    (parity $p = \pm 1$, threshold $\theta$)
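
Threshold selection for a single feature can be sketched as a brute-force search. This is an illustration under assumptions: candidate thresholds are taken as midpoints between sorted feature values, the search is exhaustive rather than optimized, and `train_stump` is a hypothetical name.

```python
def train_stump(values, labels, weights):
    """Pick threshold theta and parity p minimizing the weighted error of
    h(x) = 1 if p * f(x) < p * theta else 0.

    values[i]  = f(x_i), the feature value of example i
    labels[i]  = 1 for a positive example, 0 for a negative one
    weights[i] = non-negative example weight
    Returns (error, theta, parity).
    """
    # Candidate thresholds: below the minimum, then midpoints between
    # consecutive distinct values. Together with both parities this
    # covers every way of splitting the examples by one threshold.
    vs = sorted(set(values))
    candidates = [vs[0] - 1] + [(a + b) / 2 for a, b in zip(vs, vs[1:])]
    best = None
    for theta in candidates:
        for parity in (1, -1):
            error = sum(
                w
                for v, y, w in zip(values, labels, weights)
                if (1 if parity * v < parity * theta else 0) != y
            )
            if best is None or error < best[0]:
                best = (error, theta, parity)
    return best


# Hypothetical feature values: faces (label 1) score high, non-faces low.
values = [1, 2, 8, 9]
labels = [0, 0, 1, 1]
weights = [0.25, 0.25, 0.25, 0.25]
error, theta, parity = train_stump(values, labels, weights)
```

Here the parity comes out as -1 because positives lie above the threshold, so the inequality has to be flipped.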

38
Boosting
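  • Initialize weights $w_{1,i} = \frac{1}{2m}$ for negative and
    $\frac{1}{2l}$ for positive examples, where $m$ = number of
    negatives, $l$ = number of positives
  • For $t = 1, \dots, T$
  • 1. Normalize the weights, so that $w_t$ is a
    probability distribution
  • 2. For each feature, $j$, train a classifier, $h_j$,
    which is restricted to a single feature. Error is
    evaluated with respect to $w_t$
  • 3. Choose the classifier, $h_t$, with lowest error
    $\epsilon_t$
  • 4. Update the weights: $w_{t+1,i} = w_{t,i}\,\beta_t^{1-e_i}$,
    where $e_i = 0$ if example $x_i$ is classified
    correctly and $e_i = 1$ otherwise, and
    $\beta_t = \epsilon_t/(1-\epsilon_t)$
  • The final classifier is $h(x) = 1$ if
    $\sum_t \alpha_t h_t(x) \ge \frac{1}{2}\sum_t \alpha_t$, 0 otherwise,
    where $\alpha_t = \log(1/\beta_t)$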
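
The boosting loop can be sketched as follows, using single-feature threshold classifiers as the weak learners. This is a simplified illustration, not the authors' implementation: the threshold search is brute force, the clamping of the error is an added assumption to keep the weights finite on perfectly separable toy data, and all function names are hypothetical.

```python
import math


def stump_predict(value, theta, parity):
    return 1 if parity * value < parity * theta else 0


def best_stump(values, labels, weights):
    """Lowest weighted-error (error, theta, parity) for one feature."""
    vs = sorted(set(values))
    thetas = [vs[0] - 1] + [(a + b) / 2 for a, b in zip(vs, vs[1:])]
    best = None
    for theta in thetas:
        for parity in (1, -1):
            err = sum(w for v, y, w in zip(values, labels, weights)
                      if stump_predict(v, theta, parity) != y)
            if best is None or err < best[0]:
                best = (err, theta, parity)
    return best


def adaboost(features, labels, rounds):
    """features[j][i] = value of feature j on example i; labels[i] in {0, 1}."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    weights = [1 / (2 * n_pos) if y == 1 else 1 / (2 * n_neg) for y in labels]
    strong = []  # list of (alpha, feature index, theta, parity)
    for _ in range(rounds):
        total = sum(weights)
        weights = [w / total for w in weights]                 # step 1
        candidates = [(best_stump(f, labels, weights), j)      # step 2
                      for j, f in enumerate(features)]
        (err, theta, parity), j = min(candidates)              # step 3
        # Assumption: clamp the error so beta stays finite and non-zero.
        err = min(max(err, 1e-10), 1 - 1e-10)
        beta = err / (1 - err)
        for i, y in enumerate(labels):                         # step 4
            if stump_predict(features[j][i], theta, parity) == y:
                weights[i] *= beta  # e_i = 0, so the factor is beta^(1 - 0)
        strong.append((math.log(1 / beta), j, theta, parity))
    return strong


def strong_predict(strong, x):
    """x[j] = value of feature j on a single example."""
    score = sum(a * stump_predict(x[j], th, p) for a, j, th, p in strong)
    return 1 if score >= 0.5 * sum(a for a, _, _, _ in strong) else 0


# Hypothetical data: feature 0 separates the classes, feature 1 is noisy.
features = [[1, 2, 8, 9], [5, 1, 6, 2]]
labels = [0, 0, 1, 1]
strong = adaboost(features, labels, rounds=2)
predictions = [strong_predict(strong, [f[i] for f in features])
               for i in range(len(labels))]
```

Because each round picks exactly one feature, the number of rounds T is also the number of features the final classifier evaluates.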

39
Basic Learning Results
  • Initial classification: Frontal faces
  • 200 features
  • Finds 95% of faces, with a false positive rate of
    1 in 14,000
  • Very fast
  • Adding features adds to computation time
  • Features interpretable
  • Darker region around eyes than nose/cheeks
  • Eyes are darker than bridge of nose

40
Primary Features
41
Attentional Cascade
  • Goal: Improved classification, reduced time
  • Insight: Small fast classifiers can reject
  • But have very few false negatives
  • Reject majority of uninteresting regions quickly
  • Focus computation on interesting regions
  • Approach: Degenerate decision tree
  • Aka cascade
  • Positive results passed to high detection
    classifiers
  • Negative results rejected immediately
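
The control flow of the cascade can be sketched as a loop with early exit. The stage tests and window fields below are made-up stand-ins for trained classifiers, and `cascade_classify` is a hypothetical name.

```python
def cascade_classify(stages, window):
    """Return (accepted, stages_evaluated).

    stages: list of callables, cheapest first. Each returns True (possibly
    an object) or False (not an object). A window is accepted only if every
    stage returns True.
    """
    for depth, stage in enumerate(stages, start=1):
        if not stage(window):
            return False, depth  # rejected immediately; later stages skipped
    return True, len(stages)


# Hypothetical stages of increasing cost, operating on precomputed statistics.
stages = [
    lambda w: w["brightness"] > 0.2,
    lambda w: w["contrast"] > 0.5,
    lambda w: w["brightness"] * w["contrast"] > 0.3,
]

dark = cascade_classify(stages, {"brightness": 0.1, "contrast": 0.9})
flat = cascade_classify(stages, {"brightness": 0.8, "contrast": 0.3})
face_like = cascade_classify(stages, {"brightness": 0.8, "contrast": 0.9})
```

Most windows exit after one or two stages, so the average cost per window stays close to the cost of the first stage.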

42
Cascade Schematic
[Figure: all sub-windows enter CL 1, then CL 2, CL 3, and more classifiers; at each classifier, T passes the sub-window on to the next stage and F rejects the sub-window.]
43
Cascade Construction
  • Each stage is a trained classifier
  • Tune threshold to minimize false negatives
  • Good first stage classifier
  • Two-feature strong classifier: eye/cheek,
    eye/nose
  • Tuned: Detect 100%, 40% false positives
  • Very computationally efficient
  • 60 microprocessor instructions

44
Cascading
  • Goal: Reject bad features quickly
  • Most features are bad
  • Reject early in processing, little effort
  • Good regions will trigger full cascade
  • Relatively rare
  • Classification is progressively more difficult
  • Rejected the most obvious cases already
  • Deeper classifiers more complex, more error-prone

45
Cascade Training
  • Tradeoffs: Accuracy vs. cost
  • More accurate classifiers: more features, more complex
  • More features, more complex: slower
  • Difficult optimization
  • Practical approach
  • Each stage reduces false positive rate
  • Bound reduction in false pos, increase in miss
  • Add features to each stage until meet target
  • Add stages until overall effectiveness targets met
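
How per-stage rates combine into overall targets can be worked through numerically. The sketch assumes each stage's rates are measured on the windows that reach it, so the overall rates are products of the per-stage rates; the ten-stage numbers are illustrative, and `cascade_rates` is a hypothetical name.

```python
def cascade_rates(detection_rates, false_positive_rates):
    """Overall detection and false positive rates of a cascade.

    A window is reported only if every stage accepts it, so both overall
    rates are products of the per-stage rates.
    """
    overall_d, overall_f = 1.0, 1.0
    for d, f in zip(detection_rates, false_positive_rates):
        overall_d *= d
        overall_f *= f
    return overall_d, overall_f


# Illustrative: 10 stages, each detecting 99% of faces and passing 30% of
# the non-faces that reach it.
D, F = cascade_rates([0.99] * 10, [0.30] * 10)
```

With these numbers the overall detection rate stays near 0.90 while the false positive rate drops to roughly 6 in a million, which is why each stage can tolerate a high false positive rate but only a very small miss rate.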

46
Results
  • Task: Detect frontal upright faces
  • Face/non-face training images
  • Face: 5000 hand-labeled instances
  • Non-face: 9500 random web-crawl, hand-checked
  • Classifier characteristics
  • 38 layer cascade
  • Increasing number of features: 1, 10, 25, ... (6061 total)
  • Classification: Average 10 features per window
  • Most rejected in first 2 layers
  • Process 384x288 image in 0.067 secs

47
Detection Tuning
  • Multiple detections
  • Many subwindows around face will alert
  • Create disjoint subsets
  • For overlapping boundaries, only report one
  • Return average of corners
  • Voting
  • 3 similarly trained detectors
  • Majority rules
  • Improves overall
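
Merging multiple detections and voting can be sketched as below. Boxes are assumed to be `(x0, y0, x1, y1)` tuples, overlap is taken to mean any intersection, and grouping uses a small union-find so that chains of overlapping boxes end up in one set; these choices and the function names are assumptions.

```python
def overlaps(a, b):
    """Boxes are (x0, y0, x1, y1); True if the two rectangles intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def merge_detections(boxes):
    """Group overlapping boxes into disjoint sets; return one averaged box each."""
    parent = list(range(len(boxes)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if overlaps(boxes[i], boxes[j]):
                parent[find(i)] = find(j)

    groups = {}
    for i, box in enumerate(boxes):
        groups.setdefault(find(i), []).append(box)
    # One detection per group: the average of each corner coordinate.
    return [
        tuple(sum(b[k] for b in group) / len(group) for k in range(4))
        for group in groups.values()
    ]


def majority_vote(votes):
    """votes: one boolean per detector; True if more than half say 'face'."""
    return sum(votes) * 2 > len(votes)


# Three overlapping alerts around one face, plus one separate detection.
boxes = [(10, 10, 30, 30), (12, 12, 32, 32), (14, 8, 34, 28),
         (100, 100, 120, 120)]
merged = sorted(merge_detections(boxes))
```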

48
Conclusions
  • Fast, robust facial detection
  • Simple, easily computable features
  • Simple trained classifiers
  • Classification cascade allows early rejection
  • Early classifiers also simple, fast
  • Good overall classification in real-time

49
Some Results
50
Vision in Modern AI
  • Goals
  • Robustness
  • Multidomain applicability
  • Automatic acquisition
  • Speed: Real time
  • Approach
  • Simple mechanisms, feature selection
  • Machine learning: Tune features, classification