Computing & the Arts
M.C. Escher (1898 - 1972), Bond Of Union, 1956.
1. Computational theories of Visual Perception
David Marr's model (circa 1980)
"Vision can be
understood as an information processing task which converts a numerical image representation into a symbolic shape-oriented representation."
Marr [Marr:Vision:1982] proposed three different levels for the understanding of information processing systems (having vision systems as the target example):
1. computational theory;
2. representation and algorithm; and
3. hardware implementation.
One of Marr's most important contributions was made at the level of representation and algorithm, where he proposed a representational framework for vision. He concentrated on the vision task of deriving shape information from images.
After D. Marr, in "Vision," 1982.
From S. Lehar.
The intensities perceived by any visual system are a function of four main factors:
1. the geometry (meaning shape and relative placement);
2. the reflectance and absorption properties of the visible surfaces (physical properties);
3. the illumination (light sources); and
4. the camera (viewpoint, optics).
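To make the four factors concrete, here is a minimal sketch of how they could combine for a single surface patch. It assumes a Lambertian (perfectly matte) surface and a single distant light, neither of which is stated in these notes; the function name `lambertian_intensity` and all parameter names are illustrative only.

```python
import math

def normalize(v):
    """Return the vector v scaled to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def dot(a, b):
    """Dot product of two vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))

def lambertian_intensity(normal, albedo, light_dir, light_power, view_dir):
    """Intensity of one surface patch under a toy Lambertian model (an assumption).

    normal      -- surface normal of the patch (factor 1: geometry)
    albedo      -- fraction of light reflected, 0..1 (factor 2: reflectance)
    light_dir   -- direction from the patch toward the light (factor 3: illumination)
    light_power -- brightness of the light source (factor 3: illumination)
    view_dir    -- direction from the patch toward the camera (factor 4: camera)
    """
    n, l, v = normalize(normal), normalize(light_dir), normalize(view_dir)
    if dot(n, v) <= 0.0:
        return 0.0  # the patch faces away from the camera, so it is not seen
    # Matte surface: brightness depends on the angle between normal and light.
    return albedo * light_power * max(0.0, dot(n, l))
```

Changing any one argument changes the returned intensity, which is why recovering shape from intensities alone is an under-constrained problem.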
The Primal Sketch
The detection of intensity changes, the representation and analysis of local geometric structures, and the detection of illumination effects take place during the generation of the primal sketch, where the independent spatial organizations of the viewed intensities in a scene reflect the structure of the visible surfaces. Marr proposed to capture these organizations by using a set of "place tokens," or low-level features, which correspond to:
1. oriented edges,
2. bars,
3. ends and
4. blobs.
Each of these was represented by a 5-tuple: type, position, orientation, scale, contrast.
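The 5-tuple maps naturally onto a small record type. The sketch below is only an illustration: the class name `PlaceToken`, the field names, and the crude threshold-based detector `edge_tokens_in_row` are assumptions, not Marr's actual operators (which work at several scales).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PlaceToken:
    kind: str           # "edge", "bar", "end" or "blob"
    position: tuple     # (x, y) in image coordinates
    orientation: float  # degrees; 90.0 means a vertical feature (assumed convention)
    scale: float        # size of the operator that found the feature
    contrast: float     # signed intensity difference across the feature

def edge_tokens_in_row(row, y, threshold=10.0):
    """Emit an 'edge' token wherever adjacent pixels in a row differ by more than threshold."""
    tokens = []
    for x in range(len(row) - 1):
        step = row[x + 1] - row[x]
        if abs(step) > threshold:
            # The change runs along x, so the edge itself is vertical.
            tokens.append(PlaceToken("edge", (x + 0.5, y), 90.0, 1.0, float(step)))
    return tokens

tokens = edge_tokens_in_row([10, 10, 10, 80, 80, 80], y=0)
```

Here `tokens` holds a single edge token located between the third and fourth pixels, with a positive contrast because the intensity rises from left to right.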
A set of examples: http://www.cs.ucla.edu/~cguo/primal_sketch.htm
The 2.5-D Sketch
The 2.5-D sketch is intended to represent the orientation and depth of the visible surfaces, as well as their discontinuities. It is composed of local surface orientation primitives, distance from the viewer, and discontinuities in depth and surface orientation. As in the previous representation, it is specified in a viewer-centered coordinate system.
It also takes into account visual information from motion, shading, shape and texture.
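As a rough illustration of what such a representation might store, the sketch below derives a per-pixel record from a depth map. It assumes the depth map is already available (in practice it would have to be inferred from cues such as motion or shading), marks only depth discontinuities and not orientation discontinuities, and uses a hypothetical function name `sketch_2_5d` and an arbitrary `jump` threshold.

```python
def sketch_2_5d(depth, jump=1.0):
    """Build a toy viewer-centered 2.5-D sketch from a depth map.

    depth -- list of rows; depth[y][x] is the distance from the viewer
    jump  -- depth difference treated as a discontinuity (assumed value)

    Returns rows of dicts with keys 'depth', 'slope' and 'discontinuity'.
    'slope' is (dz/dx, dz/dy), estimated by forward differences.
    """
    h, w = len(depth), len(depth[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            # Forward differences, clamped at the right and bottom borders.
            dzdx = depth[y][min(x + 1, w - 1)] - depth[y][x]
            dzdy = depth[min(y + 1, h - 1)][x] - depth[y][x]
            is_jump = abs(dzdx) >= jump or abs(dzdy) >= jump
            row.append({
                "depth": depth[y][x],
                # Surface orientation is left undefined across a depth jump.
                "slope": None if is_jump else (dzdx, dzdy),
                "discontinuity": is_jump,
            })
        out.append(row)
    return out

# A near surface (depth 2) in front of a far one (depth 5).
sketch = sketch_2_5d([[5.0, 5.0, 2.0, 2.0],
                      [5.0, 5.0, 2.0, 2.0]])
```

Every value is expressed relative to the viewer, so the same scene seen from another position would yield a different sketch.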
What is in the ".5"? This is not about a fractal dimension, but rather a metaphor for the claim that, in reality, we do not see all of our surroundings. For example, consider someone with her back turned to you. You can only see half of her body, although you assume there is some front part to her body (with a face, etc.).
The point Marr is making here is that we are not actually aware of all our surroundings and so construct details to fill in the gaps.
From Benjamin Kimia, Brown University.
The 3-D Model
Principle:
- Describe shapes and their organization using a modular and hierarchical organization of volumetric and surface primitives.
The recognition process uses a catalogue of 3-D models, which is a collection of stored 3-D model descriptions and various indices into the collection that allow the association of a new description with the appropriate one in the collection.
All 3-D model descriptions can be organized in a hierarchy according to the specificity of the information they carry. The top level of such a hierarchy is a model which does not have a component decomposition and describes the model's principal axis. At the next level in the hierarchy more details are added to the model, like the number and distribution of subcomponent axes along the principal axis. At the lower levels each individual object's model receives more precise descriptions, and they can now be distinguished by the angles and lengths of their components.
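The coarse-to-fine organization described above can be sketched as a tree of axes plus a two-stage lookup. This is a simplified illustration under stated assumptions: the names `Model3D`, `mismatch` and `recognize`, the use of part count as the only index, and the particular mismatch score are all hypothetical, and the numbers in the example catalogue are made up.

```python
from dataclasses import dataclass, field

@dataclass
class Model3D:
    name: str
    length: float                              # length of the principal axis
    parts: list = field(default_factory=list)  # list of (angle_deg, Model3D) sub-axes

def mismatch(a, b):
    """Sum of differences in angle and relative length over paired parts."""
    total = 0.0
    for (angle_a, part_a), (angle_b, part_b) in zip(a.parts, b.parts):
        total += abs(angle_a - angle_b) / 180.0
        # Compare lengths relative to the principal axis, so overall size does not matter.
        total += abs(part_a.length / a.length - part_b.length / b.length)
    return total

def recognize(catalogue, observed):
    """Coarse-to-fine lookup: filter by number of parts, then rank by angles and lengths."""
    candidates = [m for m in catalogue if len(m.parts) == len(observed.parts)]
    if not candidates:
        return None
    return min(candidates, key=lambda m: mismatch(m, observed))

# Illustrative catalogue with made-up measurements.
catalogue = [
    Model3D("cylinder", 10.0),
    Model3D("human", 10.0, [(170.0, Model3D("leg", 9.0)), (170.0, Model3D("leg", 9.0))]),
    Model3D("ostrich", 10.0, [(90.0, Model3D("leg", 12.0)), (90.0, Model3D("leg", 12.0))]),
]
observed = Model3D("unknown", 5.0, [(160.0, Model3D("?", 4.0)), (175.0, Model3D("?", 4.6))])
best = recognize(catalogue, observed)
```

The part count plays the role of a coarse index into the collection, while the angles and lengths are consulted only to separate candidates that agree at the coarser level.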
From David Marr's book: Vision, 1982.
Another famous example of related ideas in human cognition (primitive-based representation of objects) is given by the geons of Biederman et al.:
- Recognition-by-components (RBC theory).
Glossary
Object-centered representation:
- The description of the object (shape) is relative to the object; in particular, a coordinate frame is attached to a center (e.g., of mass) of the object.
- Examples: Constructive Solid Geometry (CSG), boundary-based representation (B-rep, such as NURBS), generalized cylinders, medial axis.
Observer-based (or view-based) representation:
- The description of the object is dependent on the camera parameters (essentially its field of view) and attached to the image space for the selected viewpoint.
- Example: Curvature description for a given outline (projection).
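The difference between the two kinds of representation can be seen with a few points in space. The sketch below is an illustration only: it assumes a simple pinhole camera looking along +z with all points in front of it, and the function names `object_centered` and `viewer_centered` are hypothetical.

```python
def centroid(points):
    """Mean position of a list of 3-D points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def object_centered(points):
    """Express each point relative to the object's own centroid."""
    c = centroid(points)
    return [tuple(p[i] - c[i] for i in range(3)) for p in points]

def viewer_centered(points, camera_pos, focal_length=1.0):
    """Pinhole projection for a camera at camera_pos looking along +z (assumes z > 0)."""
    image = []
    for p in points:
        x, y, z = (p[i] - camera_pos[i] for i in range(3))
        image.append((focal_length * x / z, focal_length * y / z))
    return image

square = [(0, 0, 4), (2, 0, 4), (2, 2, 4), (0, 2, 4)]
moved = [(x + 10, y, z) for x, y, z in square]  # same object, shifted sideways

same_shape = object_centered(square) == object_centered(moved)
view_a = viewer_centered(square, camera_pos=(0, 0, 0))
view_b = viewer_centered(moved, camera_pos=(0, 0, 0))
```

Moving the object leaves the object-centered description unchanged (`same_shape` is true), whereas the two viewer-centered descriptions `view_a` and `view_b` differ.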
References
Edelman:Marr:2001
Guo:Primal:2003
Guo, C., Zhu, S.-C., & Wu, Y., "A Mathematical Theory of Primal Sketch and Sketchability," Proc. of the Int. Conf. on Computer Vision (ICCV), pp. 1228-1235, Nice, France, 2003.
Website: www.cs.ucla.edu/~cguo/primal_sketch.htm
Marr:Vision:1982
Marr, D., Vision: A computational investigation into the human representation and processing of visual information, W.H. Freeman publ., 1982.
Watt:Visual:1988
Watt, R.J., Visual Processing: Computational, Psychophysical and Cognitive Research, Lawrence Erlbaum publ., 1988.
Last update: Oct. 10, 2006.