The sysnopsis of Project 2 (CS576)
Paper Title : Efficient indexing for large scale visual search
Conference : ICCV 2009
Author : Xiao Zhang et al. (Tsinghua University)
Abtract : The main idea is Decompose image into two components, one for dimension reduction
and the other for residual information preservation.
Result : I cannot implement this system completely.
First, parameter estimation of IDM model is intractable. (Topic distribution and image-specific distribution)
Second, although I try to implement LSH algorithms, it doesn't work well.
Project 2 Source
1. Image Decomposition model
The IDM model Switch variable x to control the generation of visual words.
1 : topic distribution
2 : image-specific distribution
3 : background distribution
1) The representation of image
<¥èd, ¥÷large >
==> ¥Èd is topic mixtures proportion (LSH index)
==> ¥÷large means part of specific components which have the largest values (image meta information)
2) Indexing Low-dimensional representation
Using Localilty Sensitive Hashing
Memory requirement O(L*D)
3) Indexing image-specific words
< wordid, weight >
Fixed-array : O(30*D)
3. Parameter estimation
Using the Monte Carlo EM algorithm
4. Ranking (Similarity measure)
1) Extract ¥èq and ¥÷q
2) approximately computing cosine(¥èd, ¥èq)
3) Assign final scores
4) Rank all candidate images
1) Dataset (2.3 million images)
Oxford5K (5062 images, 18M descriptors, 55 queries)
Paris (6300 images, 19M descriptors)
PhotoForum2M (2.3M images, 4165M descriptors)
2) Feature extraction and quantization
SIFT features + AKM
- Construct visual vocabularies based on different types of images, i.e. Oxford5K and PhotoForum7K.
- Train IDMon both groundtruth dataset, i.e. Oxford5K, and a similar dataset, i.e. Paris.
- Topic mixture is indexed by LSH algorithms and store Image-specific information
- Similiarity measure
SIFT library : Rob Hess (School of EECS, Oregon State University)
E2LSH (LSH Algorithm and Implementation)
 Near-Optimal Hashing Algorithms for Approximate Nearest Neighbor in High Dimensions" (by Alexandr Andoni and Piotr Indyk). Communications of the ACM, vol. 51, no. 1, 2008, pp. 117-122.