Part-based models

Part-based models are a broad class of object detection algorithms for images in which parts of the image are analyzed separately to determine whether, and where, an object of interest is present. A popular family among these is the constellation model, a term that refers broadly to schemes that detect a small number of features and use their relative positions to decide whether the object of interest is present. These models build on the original idea of Fischler and Elschlager of using the relative positions of a few template matches, and they grow in complexity in the work of Perona and others. They are covered in the constellation models section. An example makes the idea concrete: to detect faces, a constellation model would use smaller part detectors, for instance mouth, nose and eye detectors, and judge whether an image contains a face from the relative positions at which those detectors fire.
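
The face example can be sketched in a few lines of Python. This is only an illustration of the idea of scoring relative part positions, not the model of any of the cited authors: the expected offsets, the tolerance SIGMA, the Gaussian-style penalty and the function name constellation_score are all assumptions made for this sketch, and it assumes an upright face.

```python
import math

# Hypothetical expected offsets of each part from the left eye, measured in
# units of the inter-eye distance. Illustrative values, not from any paper.
EXPECTED_OFFSETS = {
    "right_eye": (1.0, 0.0),
    "nose": (0.5, 0.6),
    "mouth": (0.5, 1.1),
}
SIGMA = 0.15  # assumed tolerance on each normalized offset


def constellation_score(detections):
    """Score how face-like a layout of part detections is.

    detections maps a part name to the (x, y) position where that part
    detector fired. Returns a value in [0, 1]; 0.0 if a part is missing.
    """
    required = set(EXPECTED_OFFSETS) | {"left_eye"}
    if not required <= set(detections):
        return 0.0
    lx, ly = detections["left_eye"]
    rx, ry = detections["right_eye"]
    scale = math.hypot(rx - lx, ry - ly)  # inter-eye distance sets the scale
    if scale == 0:
        return 0.0
    sq_err = 0.0
    for part, (ex, ey) in EXPECTED_OFFSETS.items():
        px, py = detections[part]
        dx = (px - lx) / scale - ex
        dy = (py - ly) / scale - ey
        sq_err += dx * dx + dy * dy
    # Larger deviation from the expected layout gives a lower score.
    return math.exp(-sq_err / (2 * SIGMA ** 2))


face = {"left_eye": (100, 100), "right_eye": (140, 100),
        "nose": (120, 124), "mouth": (120, 144)}
scrambled = dict(face, mouth=(120, 60))  # mouth detected above the eyes
print(constellation_score(face), constellation_score(scrambled))
```

The same four detectors fire in both layouts; only the relative positions separate the plausible face from the scrambled one.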

Non-constellation models

Many overlapping ideas fall under the title of part-based models even after excluding models of the constellation variety. The uniting thread is the use of small parts to build up an algorithm that can detect or recognize an item (a face, a car, etc.). Early efforts, such as those by Yuille, Hallinan and Cohen, sought to detect facial features and fit deformable templates to them. These templates were mathematically defined outlines intended to capture the position and shape of the feature. Their algorithm has trouble finding the global minimum fit for a given model, so templates occasionally became mismatched.

Later efforts, such as those by Poggio and Brunelli, focus on building specific detectors for each feature. They use successive detectors to estimate scale, position, etc., each narrowing the search field used by the next detector. In that sense it is a part-based model; however, they seek to recognize specific faces rather than to detect the presence of a face. They do so by using each detector to build a 35-element vector of characteristics of a given face. These characteristics can then be compared to recognize specific faces, and cut-offs can also be used to detect whether a face is present at all.
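
A minimal sketch of how such characteristic vectors might be compared is given below. The text above does not say which comparison is used, so the Euclidean distance, the nearest-neighbour rule, the cutoff value and the names euclidean and recognize are assumptions chosen for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognize(query, gallery, cutoff):
    """Return (name, distance) for the closest known face.

    gallery maps a person's name to a stored feature vector. If even the
    closest vector is farther than cutoff, the name is None, which can be
    read as "no known face here" (an assumed use of the cut-off).
    """
    if not gallery:
        return None, math.inf
    name, dist = min(
        ((n, euclidean(query, v)) for n, v in gallery.items()),
        key=lambda pair: pair[1],
    )
    return (name, dist) if dist <= cutoff else (None, dist)


# Toy 35-element vectors; real ones would come from the feature detectors.
gallery = {"alice": [0.0] * 35, "bob": [1.0] * 35}
print(recognize([0.1] * 35, gallery, cutoff=1.0))
print(recognize([10.0] * 35, gallery, cutoff=1.0))
```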

Cootes, Lanitis and Taylor build on this work by constructing a 100-element representation of the primary features of a face. The model is more detailed and more robust, although given its additional complexity (100 elements compared to 35) this might be expected. The model essentially computes deviations from a mean face in terms of shape, orientation and gray level. It is matched by minimizing an error function.
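
The idea of describing a face by its deviation from a mean and matching by error minimization can be sketched with a generic linear model. This is an assumed simplification, not the authors' formulation: it supposes the observation is approximated as the mean plus a weighted sum of orthonormal "modes", in which case the least-squares weights are simple dot products. The names fit_parameters, mean and modes are hypothetical.

```python
def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def fit_parameters(observed, mean, modes):
    """Fit observed ~ mean + sum(b_k * mode_k) for orthonormal modes.

    Returns (params, error), where error is the squared residual left
    after the best fit. Orthonormality of the modes is assumed; for
    general modes a full least-squares solve would be needed instead.
    """
    deviation = [o - m for o, m in zip(observed, mean)]
    params = [dot(mode, deviation) for mode in modes]
    reconstruction = list(mean)
    for b, mode in zip(params, modes):
        reconstruction = [r + b * c for r, c in zip(reconstruction, mode)]
    error = sum((o - r) ** 2 for o, r in zip(observed, reconstruction))
    return params, error


# Toy 3-element "face" with two modes of variation.
mean = [1.0, 2.0, 3.0]
modes = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
params, error = fit_parameters([1.5, 1.0, 3.2], mean, modes)
print(params, error)
```

In this toy case the third coordinate cannot be explained by either mode, so its deviation shows up as the residual error.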

Of the non-constellation models, perhaps the most successful is that of Leibe and Schiele. Their algorithm finds templates associated with positive examples and records both the template (an average of the feature over all positive examples in which it is present) and the position of the center of the item (a face, for instance) relative to the template. The algorithm then takes a test image and runs an interest point locator (ideally a scale-invariant one). The interest points are compared to each template and the probability of a match is computed. Each template then casts votes for the center of the detected object, weighted by the probability of the match and the probability that the template predicts that center. The votes are summed, and if there are enough of them and they are well enough clustered, the presence of the object in question (e.g. a face or a car) is predicted.
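
The voting step described above can be sketched as follows. This is a simplified illustration under stated assumptions rather than the published procedure: votes are accumulated in a coarse grid of cells to stand in for "well enough clustered", and the cell size, threshold, input format and the names vote_for_centers and detect are all hypothetical. The cited papers should be consulted for how clusters of votes are actually found.

```python
from collections import defaultdict


def vote_for_centers(matches, cell=10):
    """Accumulate weighted votes for the object center on a coarse grid.

    Each match is (point, offset, match_prob, offset_prob): the interest
    point position, the stored offset from template to object center, the
    probability of the template match, and the probability that the
    template predicts this offset.
    """
    votes = defaultdict(float)
    for (px, py), (ox, oy), match_prob, offset_prob in matches:
        cx, cy = px + ox, py + oy  # center predicted by this match
        key = (int(cx // cell), int(cy // cell))
        votes[key] += match_prob * offset_prob
    return votes


def detect(matches, threshold, cell=10):
    """Return (center, weight) of the strongest cell, or None if too weak."""
    votes = vote_for_centers(matches, cell)
    if not votes:
        return None
    key, weight = max(votes.items(), key=lambda kv: kv[1])
    if weight < threshold:
        return None
    center = ((key[0] + 0.5) * cell, (key[1] + 0.5) * cell)
    return center, weight


matches = [
    ((50, 50), (30, 20), 0.9, 0.8),     # votes for a center at (80, 70)
    ((120, 90), (-38, -18), 0.8, 0.9),  # votes for a center at (82, 72)
    ((200, 200), (5, 5), 0.3, 0.5),     # weak, isolated vote
]
print(detect(matches, threshold=1.0))
```

Two independent parts agreeing on roughly the same center push that cell over the threshold, while the isolated weak vote is ignored.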

The algorithm is effective because it imposes much less geometric rigidity than the constellation model does. Admittedly, the constellation model can be modified to allow for occlusions and other large abnormalities, but this model is naturally suited to them. That said, the more rigid structure of the constellation model is sometimes desirable.

See also

* Computer vision

References

* Fischler and Elschlager, http://ieeexplore.ieee.org/iel5/12/35069/01672195.pdf?tp=&isnumber=&arnumber=1672195
* Lanitis, Cootes and Taylor, http://coblitz.codeen.org:3125/citeseer.ist.psu.edu/cache/papers/cs/14392/http:zSzzSzwww.wiau.man.ac.ukzSzWIAU_PAPERS_DIRzSzlan_DSP95.pdf/lanitis95locating.pdf, http://ieeexplore.ieee.org/iel2/3245/9796/00466919.pdf?tp=&isnumber=&arnumber=466919
* Leibe & Schiele, http://www.vision.ee.ethz.ch/~bleibe/papers/leibe-interleaved-ijcv07final.pdf, http://www.vision.ee.ethz.ch/~bleibe/papers/leibe-ism-slcv04.pdf
* Perona, Fergus and Zisserman, http://ieeexplore.ieee.org/iel5/8603/27266/01211479.pdf?tp=&isnumber=&arnumber=1211479
* Poggio and Brunelli, http://ieeexplore.ieee.org/iel1/34/6467/00254061.pdf?tp=&isnumber=&arnumber=254061
* Yuille, Hallinan and Cohen, http://www.springerlink.com/content/tp404612x8171265/fulltext.pdf


Wikimedia Foundation. 2010.
