Feature Detection and Matching
For CS576 project #1, I implemented a Harris keypoint detector and a gradient-histogram descriptor on top of the provided skeleton code. Keypoints are detected with the Harris corner detector. Each keypoint is represented by histograms of the gradients in its 16 by 16 pixel neighborhood, and the histograms are normalized with respect to the major orientation.
Detection
Salient keypoints are detected with the Harris corner detector. Before computing the Harris value, I reduced noise with a 3 by 3 mean filter. The second-moment matrix M is computed over a 5 by 5 grid, weighted by a Gaussian window function with variance 3.0. The Harris value is computed as det(M) - 0.04 * trace(M)^2, and I selected as keypoints the pixels whose Harris value is higher than a threshold of 0.0009 or 0.001. To keep only distinctive keypoints, I chose pixels that are local maxima within a 5 by 5 grid.
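The detection steps described here can be sketched in plain Python as follows. This is a minimal illustration, not the project code: the function name `harris_keypoints`, the central-difference gradient, the border clamping, and the assumption that intensities lie in [0, 1] are all choices made for the example. Only the filter sizes, the Gaussian variance, the constant 0.04, and the threshold come from the description.

```python
import math

def harris_keypoints(img, k=0.04, threshold=0.001, sigma2=3.0):
    """img: 2-D list of floats (assumed in [0, 1]). Returns (row, col) keypoints."""
    h, w = len(img), len(img[0])

    def px(a, y, x):
        # Clamp coordinates so border pixels reuse their nearest neighbour
        # (border handling is an assumption of this sketch).
        return a[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    # 1. 3x3 mean filter to reduce noise.
    box = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    smooth = [[sum(px(img, y + dy, x + dx) for dy, dx in box) / 9.0
               for x in range(w)] for y in range(h)]

    # 2. Central-difference gradients (assumed; the operator is not specified).
    ix = [[(px(smooth, y, x + 1) - px(smooth, y, x - 1)) / 2.0
           for x in range(w)] for y in range(h)]
    iy = [[(px(smooth, y + 1, x) - px(smooth, y - 1, x)) / 2.0
           for x in range(w)] for y in range(h)]

    # 3. 5x5 Gaussian window with variance sigma2, normalized to sum to 1.
    offsets = [(dy, dx) for dy in range(-2, 3) for dx in range(-2, 3)]
    weights = [math.exp(-(dy * dy + dx * dx) / (2.0 * sigma2))
               for dy, dx in offsets]
    total = sum(weights)
    weights = [wt / total for wt in weights]

    # 4. Harris value det(M) - k * trace(M)^2 at every pixel,
    #    with M = [[a, b], [b, c]] accumulated over the window.
    resp = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            a = b = c = 0.0
            for (dy, dx), wt in zip(offsets, weights):
                gx = px(ix, y + dy, x + dx)
                gy = px(iy, y + dy, x + dx)
                a += wt * gx * gx
                b += wt * gx * gy
                c += wt * gy * gy
            resp[y][x] = (a * c - b * b) - k * (a + c) ** 2

    # 5. Keep pixels above the threshold that are maximal in their 5x5 grid.
    keypoints = []
    for y in range(h):
        for x in range(w):
            r = resp[y][x]
            if r > threshold and all(
                    r >= px(resp, y + dy, x + dx) for dy, dx in offsets):
                keypoints.append((y, x))
    return keypoints
```

The magnitude of the Harris value depends on the intensity range and on the gradient operator, so a threshold that works in this sketch is not directly comparable to the 0.0009 or 0.001 used in the project.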
Descriptor
For a discriminative representation of a keypoint, I adopted the idea of the SIFT descriptor. I computed histograms of the gradients in the 16 by 16 pixel neighborhood of the keypoint. Using gradients helps make the descriptor robust to illumination changes. I built four histograms, one each for the upper-left, upper-right, lower-left, and lower-right 8 by 8 sub-grids. For each histogram, I binned the gradient orientation of each pixel into 8 orientations and circularly shifted the histogram until the major orientation comes first. Through this normalization, the descriptor becomes robust to small rotations (up to 60 degrees).
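A rough sketch of this descriptor in plain Python is given below. Several details are assumptions made for the example rather than facts about the project code: the function name `describe`, weighting each vote by gradient magnitude, clamping at the patch border, and taking one major orientation from the sum of the four histograms and applying the same shift to all of them (a per-histogram shift is another possible reading of the description).

```python
import math

def describe(patch, bins=8):
    """patch: 16x16 list of floats around a keypoint. Returns a 32-D list."""
    n = len(patch)                      # expected to be 16
    half = n // 2
    hists = [[0.0] * bins for _ in range(4)]  # UL, UR, LL, LR sub-grids

    for y in range(n):
        for x in range(n):
            # Central differences, clamped at the patch border (assumption).
            gx = patch[y][min(x + 1, n - 1)] - patch[y][max(x - 1, 0)]
            gy = patch[min(y + 1, n - 1)][x] - patch[max(y - 1, 0)][x]
            mag = math.hypot(gx, gy)
            if mag == 0.0:
                continue                # flat pixels cast no vote
            angle = math.atan2(gy, gx) % (2.0 * math.pi)
            b = int(angle / (2.0 * math.pi) * bins) % bins
            cell = 2 * int(y >= half) + int(x >= half)
            hists[cell][b] += mag       # magnitude weighting is an assumption

    # Major orientation: the strongest bin of the summed histogram (assumption).
    total = [sum(hist[b] for hist in hists) for b in range(bins)]
    major = max(range(bins), key=lambda b: total[b])

    # Circularly shift every histogram so the major orientation comes first.
    return [hist[(major + i) % bins] for hist in hists for i in range(bins)]
```

For example, a patch with a purely horizontal intensity ramp and the same ramp rotated by 90 degrees produce the same descriptor up to rounding, because the shift moves the dominant bin to the front in both cases. With only 8 bins the shift works in 45-degree steps, so rotations that are not multiples of 45 degrees are only approximately compensated.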
Results
Harris values
[Figures: Harris values for the Yosemite image and the Graf image]
ROC curves
[Figures: ROC curves for the Yosemite image and the Graf image, for SSD Distance and Ratio Test Distance]
Distance | Descriptor | AUC (Yosemite image) | AUC (Graf image)
SSD Distance | My | 0.925773 | 0.683874
SSD Distance | Simple Window | 0.624433 | 0.486007
Ratio Test Distance | My | 0.871048 | 0.629409
Ratio Test Distance | Simple Window | 0.666154 | 0.614402
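The two match scores named in the ROC results, SSD distance and ratio test distance, can be sketched as below. This is only an illustration under assumptions: the function names `ssd` and `match` are hypothetical, descriptors are plain lists of floats, and the second list is assumed to be non-empty. For both scores, a lower value means a more confident match.

```python
def ssd(a, b):
    """Sum of squared differences between two descriptors of equal length."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def match(desc1, desc2):
    """For each descriptor in desc1, find its nearest neighbour in desc2.

    Returns a list of (best_index, ssd_score, ratio_score) tuples.
    """
    results = []
    for d in desc1:
        # Sort candidate matches by SSD distance, closest first.
        dists = sorted((ssd(d, e), j) for j, e in enumerate(desc2))
        best, best_j = dists[0]
        second = dists[1][0] if len(dists) > 1 else float("inf")
        # Ratio test: best distance divided by second-best distance.
        # If both are zero the match is ambiguous, so use the worst ratio.
        ratio = best / second if second > 0 else 1.0
        results.append((best_j, best, ratio))
    return results
```

The ratio score penalizes keypoints whose best and second-best candidates are almost equally close, which is typical of repeated structures in an image.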
Test on benchmarks
Harris corner detector + My descriptor
Benchmark | Average AUC (SSD / Ratio Test)
Bikes | 0.389670 / 0.473977
Graf | 0.581375 / 0.550690
Leuven | 0.341940 / 0.497064
Wall | 0.476932 / 0.515706
Discussion
My detector works well on ordinary scenes, but it is sensitive to resolution and illumination. Choosing a threshold for the Harris corner detector is hard: we want a small number of meaningful keypoints, yet we also need enough keypoints to find correspondences between two images. My threshold yields only 5 keypoints on the "Bikes" benchmark but more than 3000 keypoints on the "Wall" benchmark.
My descriptor works better on the "Graf" and "Wall" benchmarks, which involve rotations and viewpoint changes. On the other hand, it works poorly on "Bikes", where the resolution varies. In conclusion, it is not robust to illumination and scale changes.