Report Number: CS-TR-99-1620
Institution: Stanford University, Department of Computer Science
Title: Finding Color and Shape Patterns in Images
Author: Cohen, Scott
Date: May 1999
Abstract: This thesis is devoted to the Earth Mover's Distance (EMD), an edit distance between distributions, and its use within content-based image retrieval (CBIR). The major CBIR problem discussed is the pattern problem: given an image and a query pattern, determine whether the image contains a region that is visually similar to the pattern; if so, find at least one such image region. An important problem that arises in applying the EMD to CBIR is the EMD under transformation (EMD_G) problem: find a transformation of one distribution that minimizes its EMD to another, where the set of allowable transformations G is given. The problem of estimating the size/scale at which a pattern occurs in an image is phrased and efficiently solved as an EMD_G problem. For a large class of transformation sets, we also present a monotonically convergent iteration to find at least a locally optimal transformation. Our pattern problem solution is the SEDL (Scale Estimation for Directed Location) image retrieval system. Three important contributions of SEDL are (1) a general framework for finding both color and shape patterns, (2) the previously mentioned scale estimation algorithm using the EMD, and (3) a directed (as opposed to exhaustive) search strategy.
URL: http://i.stanford.edu/pub/cstr/reports/cs/tr/99/1620/CS-TR-99-1620.pdf
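
Illustrative note (not part of the report): the EMD_G problem named in the abstract can be made concrete with a deliberately small special case. The sketch below assumes 1D point sets of equal size, equal weight per point, ground distance |x - y|, and G restricted to translations. Under those assumptions the sorted matching is an optimal flow for every translation, so the best translation is a median of the matched differences. The function names `emd_1d` and `best_translation_1d` are hypothetical, and this is not the thesis's scale estimation algorithm or its general iteration.

```python
from statistics import median


def emd_1d(xs, ys):
    """EMD between two 1D point sets of equal size, each point weighing 1/n.

    Assumption: ground distance is |x - y|. In 1D with equal weights,
    matching the sorted points in order is an optimal flow.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("point sets must be non-empty and of equal size")
    n = len(xs)
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / n


def best_translation_1d(xs, ys):
    """Return (t, emd) where t minimizes emd_1d([x + t for x in xs], ys).

    Translating xs does not change its sorted order, so the same sorted
    matching stays optimal for every t. The cost is then the mean of
    |a_i + t - b_i|, which is minimized by a median of (b_i - a_i).
    """
    diffs = [b - a for a, b in zip(sorted(xs), sorted(ys))]
    t = median(diffs)
    return t, emd_1d([x + t for x in xs], ys)


if __name__ == "__main__":
    pattern = [0, 1, 2]
    image = [10, 11, 13]
    t, d = best_translation_1d(pattern, image)
    print("EMD without translation:", emd_1d(pattern, image))
    print("best translation:", t, "EMD after translation:", d)
```

In this toy setting the minimum found is global. For general weighted distributions and richer transformation sets, the abstract claims only a monotonically convergent iteration that reaches at least a locally optimal transformation.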