Title: CS221 Artificial Intelligence: Principles
1CS221 Artificial Intelligence: Principles and Techniques
- Challenge Problem
- Object Recognition and Tracking
Stephen Gould January 2008
2Overview
- Challenge problem
  - Problem statement
  - Source code overview
  - Sliding-window detectors and Haar features
  - Milestone requirements
- OpenCV tutorial
  - Introduction and installation
  - Code samples
- Object recognition tips and tricks
3Challenge Problem
4Challenge Problem
- Important Dates
- Important People
  - Adam Coates, Ian Goodfellow, Timothy Hunter, Lawson Wong
milestone
final submission
team
5Source Code Overview
6Command-line Options
- train
  - the directory argument is the root directory containing (subdirectories of) all the training images
  - -c writes learned parameters to a file after training using CClassifier::saveState()
  - -h provides help
  - -v gives verbose output
- test
  - the video argument is the name of the video you want to test on (e.g. easy.avi)
  - -c configures the classifier with parameters from a file using CClassifier::loadState()
  - -g displays ground truth labels from an XML file
  - -h provides help
  - -o saves classifications to an XML file (same format as -g)
  - -v gives verbose output
  - -x disables display of the video (if you don't have X-windows)
7Training File Lists
    typedef struct _TTrainingFile {
        std::string filename;  // full path to image file
        std::string label;     // subdirectory name
    } TTrainingFile;

    typedef struct _TTrainingFileList {
        std::vector<TTrainingFile> files;  // list of files
        std::vector<std::string> classes;  // list of classes (subdirectories)
    } TTrainingFileList;
8CObject Class
    class CObject {
    public:
        CvRect rect;        // object's bounding box (x, y, width, height)
        std::string label;  // object's class

    public:
        // constructors
        CObject();
        CObject(const CObject&);
        CObject(const CvRect&, const std::string&);

        // destructor
        virtual ~CObject();

        // helper functions
        void writeAsXML(std::ostream&);
        void draw(IplImage *, CvScalar, CvFont *);
        CvRect intersect(const CObject&);
        int overlap(const CObject&);
    };
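The slide declares `intersect` and `overlap` without showing their bodies. The following is a minimal sketch of what such helpers might compute, assuming that overlap means the intersection area in pixels (the course code may define it differently). `Rect`, `intersectRects` and `overlapArea` are hypothetical names, with `Rect` standing in for `CvRect` so the sketch compiles without OpenCV.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for CvRect so the sketch compiles without OpenCV.
struct Rect {
    int x, y, width, height;
};

// Intersection of two axis-aligned rectangles. The result has zero area
// when the rectangles do not overlap.
Rect intersectRects(const Rect &a, const Rect &b)
{
    assert(a.width >= 0 && a.height >= 0 && b.width >= 0 && b.height >= 0);
    const int x1 = std::max(a.x, b.x);
    const int y1 = std::max(a.y, b.y);
    const int x2 = std::min(a.x + a.width, b.x + b.width);
    const int y2 = std::min(a.y + a.height, b.y + b.height);
    Rect r = { x1, y1, std::max(0, x2 - x1), std::max(0, y2 - y1) };
    return r;
}

// Overlap measured as the area (in pixels) of the intersection.
// Assumption: this definition is illustrative, not taken from the course code.
int overlapArea(const Rect &a, const Rect &b)
{
    const Rect r = intersectRects(a, b);
    return r.width * r.height;
}
```

A measure like this is useful for comparing a detection against a ground-truth box, for example by dividing the intersection area by the area of the union.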
9CClassifier Class
    class CClassifier {
    protected:
        CvRNG rng;
        CvMat *parameters;
        // TO DO: ADD YOUR MEMBER VARIABLES HERE

    public:
        // constructors and destructors
        CClassifier();
        virtual ~CClassifier();

        // load and save classifier configuration
        virtual bool loadState(const char *);
        virtual bool saveState(const char *);

        // run the classifier over a single frame
        virtual bool run(const IplImage *, CObjectList *);

        // train the classifier using given set of files
        ...
    };
11Sliding-window Object Detectors
- e.g., task: find all coffee cups
13Haar Features
- Compute difference of intensity over image regions
14Milestone Requirements
- Build a decision tree classifier for the mug class using Haar features
  - Load positive and negative training images
  - Convert to grayscale
  - Resize to 64-by-64
  - Extract (given list of) Haar features
  - Train decision tree
- Implement runtime code to run classifier over all scales (you can assume height = width for the milestone) and shifts (in increments of 8 pixels) within each video frame
- Remember: after the milestone you are free to use whatever features and classifiers you like
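The runtime loop over scales and shifts might be organized as in the sketch below, which only enumerates candidate windows. Only the 8-pixel shift and the square (height = width) windows come from the milestone description; the minimum size, the scale factor, and the names `Window` and `enumerateWindows` are assumptions made for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical window record: top-left corner and side length (square,
// since the milestone allows assuming height = width).
struct Window {
    int x, y, size;
};

// Enumerate square windows over a range of scales and shifts inside a frame.
// minSize and scaleFactor are parameters of this sketch, not course values.
std::vector<Window> enumerateWindows(int frameWidth, int frameHeight,
                                     int minSize, double scaleFactor,
                                     int step)
{
    assert(minSize > 0 && step > 0 && scaleFactor > 1.0);
    std::vector<Window> windows;
    for (double s = minSize; s <= frameWidth && s <= frameHeight;
         s *= scaleFactor) {
        const int size = (int)s;
        for (int y = 0; y + size <= frameHeight; y += step) {
            for (int x = 0; x + size <= frameWidth; x += step) {
                Window w = { x, y, size };
                windows.push_back(w);
            }
        }
    }
    return windows;
}
```

Each window would then be clipped from the frame, resized to 64-by-64, converted to features, and passed to the classifier.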
16What is OpenCV?
- The Open Computer Vision Library is a collection of algorithms and sample code for various computer vision problems
  - libcxcore: core data structures and linear algebra library
  - libcv: computer vision library
  - libhighgui: media and graphics I/O handling
  - libml: machine learning library (decision trees, boosting, neural networks)
- Originally developed by Intel; now supported by Willow Garage
- Wiki has lots of information and API documentation
  - http://opencvlibrary.sourceforge.net/
17Installing OpenCV (Linux)
- Install ffmpeg
  - svn checkout svn://svn.mplayerhq.hu/ffmpeg/trunk ffmpeg
  - ./configure --prefix=... --enable-shared
  - make && make install
- Install opencv
  - download (version 1.0.0) from http://sourceforge.net/projects/opencvlibrary/
  - tar -xvf opencv-1.0.0.tar.gz
  - ./configure --prefix=... CPPFLAGS="-I.../include" LDFLAGS="-L.../lib"
  - make && make install
18Example 1 Loading and Displaying Images
    #include "cv.h"
    #include "cxcore.h"
    #include "highgui.h"

    #define WINDOW_NAME "MyWindow"

    int main(int argc, char *argv[])
    {
        IplImage *image;
        cvNamedWindow(WINDOW_NAME, CV_WINDOW_AUTOSIZE);
        for (int i = 1; i < argc; i++) {
            image = cvLoadImage(argv[i], 0);  // load from file (0 = grayscale)
            if (image == NULL) continue;      // skip files that fail to load
            cvShowImage(WINDOW_NAME, image);  // display on screen
            cvWaitKey(0);                     // wait for key press
            cvReleaseImage(&image);           // free memory
        }
        cvDestroyWindow(WINDOW_NAME);
        return 0;
    }
19Example 2 Converting from Color to Grayscale
    IplImage *image;
    // acquire RGB color image somehow
    ...
    // allocate memory for grayscale image
    IplImage *gray = cvCreateImage(
        cvGetSize(image),  // same size as original image
        IPL_DEPTH_8U,      // data type (8-bit unsigned)
        1);                // grayscale has one channel
    // color convert the image (source, destination)
    cvCvtColor(image, gray, CV_BGR2GRAY);
    // do something with grayscale image
    ...
    // free memory used by images
    cvReleaseImage(&gray);
20Example 3 Resizing an Image
    IplImage *image;
    // acquire image somehow
    ...
    // allocate memory for resized image
    IplImage *resizedImage = cvCreateImage(
        cvSize(64, 64),     // new size (width, height)
        image->depth,       // data type (e.g. 8-bit unsigned)
        image->nChannels);  // number of planes (e.g. RGB)
    // resize the image (source, destination)
    cvResize(image, resizedImage);
    // do something with resized image
    ...
    // free memory used by images
    cvReleaseImage(&resizedImage);
21Example 4 Clipping a Small Region out of an Image
    IplImage *image;
    ... // acquire image somehow
    // clip out 64-by-64 image patch at (4,8)
    CvRect region = cvRect(4, 8, 64, 64);
    IplImage *clippedImage = cvCreateImage(
        cvSize(region.width, region.height),
        image->depth, image->nChannels);
    cvSetImageROI(image, region);
    cvCopyImage(image, clippedImage);
    cvResetImageROI(image);
    ... // do something with clipped region
    // and always free memory
    cvReleaseImage(&clippedImage);
    cvReleaseImage(&image);
22Example 5 Computing an Integral Image
    IplImage *image;
    CvRect r;
    ...
    // compute integral image on grayscale image
    IplImage *iImage = cvCreateImage(
        cvSize(image->width + 1, image->height + 1),
        IPL_DEPTH_32S, 1);
    cvIntegral(image, iImage);
    ...
    // compute sum of pixels in area r (could also use CV_IMAGE_ELEM)
    double value;
    value = cvGetReal2D(iImage, r.y, r.x);
    value += cvGetReal2D(iImage, r.y + r.height, r.x + r.width);
    value -= cvGetReal2D(iImage, r.y, r.x + r.width);
    value -= cvGetReal2D(iImage, r.y + r.height, r.x);
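The four lookups above follow from the definition of the integral image. Written out, with $p$ denoting the pixel values and $I$ the integral image (symbols introduced here for illustration, not taken from the slides):

```latex
\[
I(y, x) = \sum_{y' < y} \; \sum_{x' < x} p(y', x')
\]
\[
\sum_{(y', x') \in r} p(y', x')
  = I(r.y,\; r.x)
  + I(r.y + r.\mathit{height},\; r.x + r.\mathit{width})
  - I(r.y,\; r.x + r.\mathit{width})
  - I(r.y + r.\mathit{height},\; r.x)
\]
```

The extra row and column in the integral image (width + 1 by height + 1) make $I(0, x) = I(y, 0) = 0$, so rectangles touching the top or left border need no special case.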
23Example 6 Logistic Regression
    CvMat *logistic(const CvMat *X, const CvMat *theta)
    {
        assert(X->cols == theta->rows);
        CvMat *Y = cvCreateMat(X->rows, 1, CV_32FC1);
        for (int i = 0; i < X->rows; i++) {
            double sigma = 0.0;
            for (int j = 0; j < X->cols; j++) {
                sigma += cvmGet(X, i, j) * cvmGet(theta, j, 0);
            }
            cvmSet(Y, i, 0, 1.0 / (1.0 + exp(-1.0 * sigma)));
        }
        return Y;  // caller must free Y with cvReleaseMat
    }
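In equation form, the function above computes one logistic output per row of X. With $X_{ij}$ the feature matrix and $\theta_j$ the parameter vector, as in the code:

```latex
\[
Y_i = \frac{1}{1 + \exp\!\left(-\sum_{j} X_{ij}\,\theta_j\right)}
\]
```

Each $Y_i$ lies strictly between 0 and 1 and equals 0.5 when the weighted sum is zero.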
24Example 7 Boosting (Train)
    CvMat *data = cvCreateMat(..., ..., CV_32FC1);
    CvMat *labels = cvCreateMat(..., ..., CV_32SC1);
    ... // acquire training data somehow
    // define variable types
    CvMat *varType = cvCreateMat(data->width + 1, 1, CV_8UC1);
    for (int j = 0; j < data->width; j++) {
        CV_MAT_ELEM(*varType, unsigned char, j, 0) = CV_VAR_NUMERICAL;
    }
    CV_MAT_ELEM(*varType, unsigned char, data->width, 0) = CV_VAR_CATEGORICAL;
    // train
    CvBoostParams parameters(CvBoost::GENTLE, numRounds, 0.95, numSplits, false, NULL);
    parameters.split_criteria = CvBoost::DEFAULT;
    CvBoost *model = new CvBoost();
    model->train(data, CV_ROW_SAMPLE, labels, NULL, NULL, varType, NULL, parameters);
    // free memory
    ...
25Example 7 Boosting (Test)
    CvBoost *model;
    CvMat *x = cvCreateMat(1, ..., CV_32FC1);
    ... // acquire model and test sample somehow
    // allocate memory for weak learner output
    int length = cvSliceLength(CV_WHOLE_SEQ, model->get_weak_predictors());
    CvMat *weakResponses = cvCreateMat(length, 1, CV_32FC1);
    // test (y is prediction, score is log-probability)
    int y = (int)model->predict(x, NULL, weakResponses, CV_WHOLE_SEQ);
    double score = cvSum(weakResponses).val[0];
    // free memory
    cvReleaseMat(&weakResponses);
    ...
26Example 8 Optical Flow
    IplImage *bwImg1, *bwImg2;
    ... // acquire images somehow
    // allocate memory for optical flow vectors
    CvMat *dx = cvCreateMat(bwImg1->height, bwImg1->width, CV_32FC1);
    CvMat *dy = cvCreateMat(bwImg1->height, bwImg1->width, CV_32FC1);
    // compute dense Lucas-Kanade optical flow (previous, current)
    // (also see cvCalcOpticalFlowPyrLK for sparse optical flow)
    cvCalcOpticalFlowLK(bwImg1, bwImg2, cvSize(5, 5), dx, dy);
    double deltaX = cvAvg(dx).val[0];  // bulk x-direction motion
    double deltaY = cvAvg(dy).val[0];  // bulk y-direction motion
    // free memory
    cvReleaseMat(&dy);
    cvReleaseMat(&dx);
    cvReleaseImage(&bwImg2);
    cvReleaseImage(&bwImg1);
28OpenCV Tips
- Don't forget to free allocated memory
  - cvReleaseImage, cvReleaseMat
- Try to allocate and free memory outside of loops if possible
- Image size is width-by-height; matrix size is rows-by-columns
- Never set the region of interest (ROI) outside of the image/matrix
  - x >= 0, y >= 0, x + width <= image width, y + height <= image height
- Never operate on images of different types (image->depth)
  - Make sure you convert first (e.g. RGB to grayscale)
- Check the return values (for NULL)
  - e.g., loading images and allocating images
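The ROI rule can be enforced with a small guard before calling cvSetImageROI. The sketch below is plain C++ with a hypothetical `Rect` struct standing in for `CvRect`; `roiInsideImage` is an illustrative helper, not an OpenCV function.

```cpp
#include <cassert>

// Hypothetical stand-in for CvRect so the sketch compiles without OpenCV.
struct Rect {
    int x, y, width, height;
};

// True when the rectangle is non-empty and lies entirely inside an image
// of the given size.
bool roiInsideImage(const Rect &r, int imageWidth, int imageHeight)
{
    assert(imageWidth > 0 && imageHeight > 0);
    return r.x >= 0 && r.y >= 0 && r.width > 0 && r.height > 0
        && r.x + r.width <= imageWidth
        && r.y + r.height <= imageHeight;
}
```

Windows that fail the check can be skipped or clamped before the region is set.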
29Object Recognition Tips
- Read some papers for ideas on better features
  - Viola and Jones, 2001; Wolf, Serre and Poggio, 2004; Lowe, 2004; Dalal and Triggs, 2005; Torralba, Murphy, and Freeman, 2007.
- The milestone uses a sliding-window based detector, but there are other techniques that you can try
  - template matching, chamfer matching/shape matching, bag-of-features with locality constraints, eigenspace representation, scene context, color
- Try different classifiers and training methods
  - logistic classifiers, support vector machines, boosted classifiers
- Try to normalize your features for intensity variation
- Filter the output of your classifiers and use motion estimation to predict where an object in frame n will be in frame n+1
- Anything that works!
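One common way to normalize for intensity variation is to rescale each image patch to zero mean and unit variance before extracting features. This is a sketch of that one option, not a method prescribed by the slides; `normalizePatch` is a hypothetical helper operating on a flat vector of pixel values.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Normalize a patch to zero mean and (approximately) unit variance in place.
void normalizePatch(std::vector<double> &patch)
{
    assert(!patch.empty());
    double mean = 0.0;
    for (size_t i = 0; i < patch.size(); i++) {
        mean += patch[i];
    }
    mean /= patch.size();

    double variance = 0.0;
    for (size_t i = 0; i < patch.size(); i++) {
        variance += (patch[i] - mean) * (patch[i] - mean);
    }
    variance /= patch.size();

    const double eps = 1e-8;  // avoids division by zero on constant patches
    const double scale = std::sqrt(variance) + eps;
    for (size_t i = 0; i < patch.size(); i++) {
        patch[i] = (patch[i] - mean) / scale;
    }
}
```

After this step, two patches that differ only by a brightness offset and a contrast gain produce nearly identical values, so features computed from them agree.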
30Coding Tips
- Modular, general design
  - allows you to test lots of things
- Use OpenCV and other libraries, e.g., GSL, SVMLight, STAIR Vision Library (SVL)
  - make sure you cite external libraries
- Use source control (SVN or CVS)
- Your code must compile on CS machines
31Design Process Tips
- Automate testing early
- Consider how you will avoid overfitting
- Test features without training an entire classifier (e.g., entropy function in Matlab)
- Visualization code usually pays off
- Profile your code and optimize bottlenecks (e.g., use integral images)
- Start early!
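To test a feature without training a whole classifier, one option is to measure the information gain of a single threshold on that feature. The slide mentions an entropy function in Matlab; the sketch below is a C++ analogue under the assumption of binary labels, and `binaryEntropy` and `informationGain` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Entropy (in bits) of a binary label distribution with positive fraction p.
double binaryEntropy(double p)
{
    if (p <= 0.0 || p >= 1.0) return 0.0;
    return -(p * std::log(p) + (1.0 - p) * std::log(1.0 - p)) / std::log(2.0);
}

// Information gain from splitting samples on (feature value < threshold).
// labels[i] is 1 for positive examples and 0 for negative examples.
double informationGain(const std::vector<double> &feature,
                       const std::vector<int> &labels, double threshold)
{
    assert(feature.size() == labels.size() && !feature.empty());
    double n[2] = {0.0, 0.0};    // number of samples on each side
    double pos[2] = {0.0, 0.0};  // number of positives on each side
    for (size_t i = 0; i < feature.size(); i++) {
        const int side = feature[i] < threshold ? 0 : 1;
        n[side] += 1.0;
        pos[side] += labels[i];
    }
    const double total = n[0] + n[1];
    double gain = binaryEntropy((pos[0] + pos[1]) / total);
    for (int s = 0; s < 2; s++) {
        if (n[s] > 0.0) {
            gain -= (n[s] / total) * binaryEntropy(pos[s] / n[s]);
        }
    }
    return gain;
}
```

A gain near zero suggests the feature carries little information about the label at that threshold, while a gain near the entropy of the labels suggests it separates the classes well.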
32Example Results
- A simple decision tree classifier trained using Haar features is not all that good, so don't expect brilliant results
State-of-the-art Results (courtesy Ben Sapp)
Milestone Results
33Finally
- Good luck!
- (and have fun)