MSc Arts Computing: Visual perception: Marr's model

The Primal Sketch

The primal sketch is generated by detecting intensity changes, representing and analysing local geometric structures, and detecting illumination effects. At this stage, the independent spatial organizations of the intensities viewed in a scene reflect the structure of the visible surfaces. Marr proposed to capture these organizations with a set of "place tokens", or low-level features, which correspond to:

1. oriented edges,

2. bars,

3. ends and

4. blobs

Each of these was represented by a 5-tuple: (type, position, orientation, scale, contrast).
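The 5-tuple above can be sketched as a small data structure. This is only an illustration: the names `TokenType`, `PlaceToken` and `tokens_of_type`, the units, and the sample values are assumptions made for this example and do not come from Marr's text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class TokenType(Enum):
    """The four kinds of place token listed above."""
    EDGE = "edge"
    BAR = "bar"
    END = "end"
    BLOB = "blob"


@dataclass(frozen=True)
class PlaceToken:
    """One primal-sketch token: (type, position, orientation, scale, contrast)."""
    type: TokenType
    position: Tuple[float, float]  # (x, y) in image coordinates (assumed convention)
    orientation: float             # radians (assumed unit)
    scale: float                   # spatial extent of the feature (assumed unit: pixels)
    contrast: float                # signed intensity difference (assumed convention)

    def as_tuple(self):
        """Return the token as the plain 5-tuple described in the text."""
        return (self.type, self.position, self.orientation, self.scale, self.contrast)


def tokens_of_type(sketch: List[PlaceToken], token_type: TokenType) -> List[PlaceToken]:
    """Select all tokens of one kind from a primal sketch."""
    return [t for t in sketch if t.type is token_type]


# A made-up sketch with three tokens, purely for illustration.
sketch = [
    PlaceToken(TokenType.EDGE, (12.0, 40.0), 0.0, 2.0, 0.8),
    PlaceToken(TokenType.BLOB, (30.0, 18.0), 0.0, 4.0, -0.3),
    PlaceToken(TokenType.EDGE, (13.0, 41.0), 1.57, 2.0, 0.5),
]
edges = tokens_of_type(sketch, TokenType.EDGE)
```

In this reading a primal sketch is simply a list of such tokens, which later stages can filter or group by type, position or orientation.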

A set of examples: http://www.cs.ucla.edu/~cguo/primal_sketch.htm

The 2.5-D sketch

The 2.5-D sketch is intended to represent the orientation and depth of the visible surfaces, as well as their discontinuities. It is composed of local surface orientation primitives, distance from the viewer, and discontinuities in depth and surface orientation. Like the primal sketch, it is specified in a viewer-centered coordinate system.

It also takes into account visual information from motion, shading, shape and texture.
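One way to picture this representation is as a viewer-centered grid in which each cell stores a depth and a local surface orientation, with discontinuities marked wherever neighbouring depths jump. The sketch below assumes exactly that; `SurfacePatch`, `depth_discontinuities`, the unit-normal encoding and the threshold are hypothetical choices for illustration, not Marr's own formulation.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SurfacePatch:
    """One cell of a viewer-centered grid."""
    depth: float                         # distance from the viewer
    normal: Tuple[float, float, float]   # surface orientation as a unit normal (assumed encoding)


def depth_discontinuities(grid: List[List[SurfacePatch]], threshold: float):
    """List pairs of adjacent cells whose depths differ by more than threshold."""
    jumps = []
    rows, cols = len(grid), len(grid[0])
    for r in range(rows):
        for c in range(cols):
            # Compare each cell with its right and lower neighbour only,
            # so that every adjacent pair is checked exactly once.
            for dr, dc in ((0, 1), (1, 0)):
                r2, c2 = r + dr, c + dc
                if r2 < rows and c2 < cols:
                    if abs(grid[r][c].depth - grid[r2][c2].depth) > threshold:
                        jumps.append(((r, c), (r2, c2)))
    return jumps


# A near surface (two left columns) in front of a far one (right column).
facing = (0.0, 0.0, 1.0)  # normal pointing straight at the viewer
near, far = 1.0, 5.0
grid = [
    [SurfacePatch(near, facing), SurfacePatch(near, facing), SurfacePatch(far, facing)],
    [SurfacePatch(near, facing), SurfacePatch(near, facing), SurfacePatch(far, facing)],
]
jumps = depth_discontinuities(grid, threshold=1.0)
```

Here the only discontinuities found lie between the second and third columns, where the near surface ends and the far one begins. Discontinuities in surface orientation could be detected the same way by comparing neighbouring normals.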

What is in the ".5"? This is not about a fractal dimension, but rather a metaphor for the idea that, in reality, we do not see all of our surroundings. For example, consider someone with her back turned to you. You can only see half of her body, although you assume there is a front part to her body (with a face, etc.).

The point Marr is making here is that we are not actually aware of all our surroundings and so construct details to fill in the gaps.

From Benjamin Kimia, Brown University.

The 3-D Model

Principle:

Describe shapes and their organization using a modular and hierarchical organization of volumetric and surface primitives.

The recognition process uses a catalogue of 3-D models: a collection of stored 3-D model descriptions, together with various indices into the collection that allow a new description to be associated with the appropriate stored one.

All 3-D model descriptions can be organized in a hierarchy according to the specificity of the information they carry. The top level of such a hierarchy is a model that has no component decomposition and describes only the model's principal axis. At the next level in the hierarchy more details are added to the model, such as the number and distribution of subcomponent axes along the principal axis. At the lower levels each individual object's model receives a more precise description, and models can now be distinguished by the angles and lengths of their components.
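The hierarchy described above can be sketched as a tree in which every node carries a principal axis and, optionally, a list of component models. The class `Model3D`, its fields, and all the numbers below are assumptions made for this example; the human figure is used only as a familiar illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Model3D:
    """A 3-D model: a principal axis plus optional component models."""
    name: str
    axis_length: float          # length of the principal axis (arbitrary units)
    angle: float = 0.0          # angle to the parent's axis in degrees (assumed convention)
    components: List["Model3D"] = field(default_factory=list)

    def depth(self) -> int:
        """Number of levels of description at and below this model."""
        if not self.components:
            return 1
        return 1 + max(c.depth() for c in self.components)

    def describe(self, max_level: int, level: int = 0) -> List[str]:
        """Describe the model down to max_level; deeper levels add detail."""
        lines = ["  " * level + f"{self.name} (axis {self.axis_length})"]
        if level < max_level:
            for c in self.components:
                lines.extend(c.describe(max_level, level + 1))
        return lines


# Illustrative values only.
arm = Model3D("arm", 0.6, 20.0, [Model3D("upper arm", 0.3), Model3D("forearm", 0.3, 10.0)])
human = Model3D("human", 1.7, 0.0, [Model3D("head", 0.25), Model3D("torso", 0.6), arm])

coarse = human.describe(max_level=0)   # principal axis only
fine = human.describe(max_level=2)     # full decomposition
```

Calling `describe` with a small `max_level` gives the coarse, top-level description, while larger values expose the component axes, angles and lengths that distinguish one object from another.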

From David Marr's book: Vision, 1982.

Another famous example of related ideas in human cognition (primitive-based representation of objects) is the geons of Biederman et al.:

Recognition-by-components (RBC theory):

http://www.pigeon.psy.tufts.edu/avc/kirkpatrick/
