Computer Vision

Overview and aims

By the end of the lecture you will be able to:

  • Understand the basic functions of computer vision and how
    it can be used to create interaction methods
  • Use the OpenCV library in Processing
  • Use basic techniques such as blob detection and background subtraction
  • Create computer vision based interaction methods

Computer Vision

Computer vision is about computers extracting information from images and in some way “understanding” them.

Extracting meaningful objects

While in the last lecture we looked at image processing as a way of creating visual effects on an image, computer vision is about extracting meaningful information: detecting objects and actions. It has been used for many things, such as allowing robots to see their environment so they can navigate around it or interact socially with humans, automatically detecting faults in products on a production line, or detecting traffic offenses with speed cameras.

Camera as input device

Recently it has been used, particularly by artists, as a new form of interaction: using a camera as an input device allows people to interact with computers through more natural and unencumbered movements.

Examples

Contour from vanderlin on Vimeo.

Contour, which I’ve shown before, uses computer vision as its main input.

The EyeToy uses computer vision as an interaction method for games; it has been around for many years now.

Messa di Voce • 23/2/2009 from j saavedra on Vimeo.

Messa di Voce uses computer vision to allow interaction between performers and virtual elements.

Augmented reality uses computer vision based methods to embed virtual objects on top of the real world.

Drawn is an artwork that takes real drawings and turns them into virtual objects that can be manipulated.

Overview and aims

By the end of the lecture you will be able to:

  • Understand the basic functions of computer vision and how
    it can be used to create interaction methods
  • Use the OpenCV library in Processing
  • Use basic techniques such as blob detection and background subtraction
  • Create computer vision based interaction methods

OpenCV

OpenCV is the most popular Open Source library for Computer Vision.
It is written in C++ and used extensively in research applications.
There is a Processing interface to it that gives you simplified access to
a limited set of its functionality. JMyron provides similar functionality.

To access the full power of OpenCV you need to use C++. OpenFrameworks
provides a Processing-like environment in which to do that and is a popular
choice for artists working with computer vision.

Accessing a camera with OpenCV

import hypermedia.video.*;
 
OpenCV opencv;
 
void setup(){
  size(800,640);
 
  opencv = new OpenCV(this);   // create the OpenCV object that handles everything
  opencv.capture(800,640);     // open the camera at 800x640
}

OpenCV is commonly used with a live camera feed in order to do interaction.

You first need to create an OpenCV object which handles everything, then you need to open the camera.
You do this with the capture command, which takes the width and height of the capture area you want.
In this case I’m making the camera the same size as the screen, but often you will want quite a small
camera area so it isn’t too slow.

Loading a movie with OpenCV

import hypermedia.video.*;
 
OpenCV opencv;
 
void setup(){
  size(800,640);
 
  opencv = new OpenCV(this);
  opencv.movie( "testmovie.mov", camWidth, camHeight );
}

You can do the same with a movie file, which is useful for debugging and testing.

Using the OpenCV image

void draw(){
  opencv.read();                 // grab a new frame from the camera or movie

  // do some processing

  image(opencv.image(), 0, 0);   // display the current OpenCV image
}

opencv.read() gets a new image from the camera; you call this before doing any computer vision
processing. opencv.image() returns a PImage, which you can use and display as you normally would in Processing.

The main Computer Vision problem is to extract useful information from the image.

Blob detection

Blob detection is a method for finding shapes in an image.

// blobs( minimum area, maximum area, maximum number of blobs, find holes? )
Blob[] blobs = opencv.blobs( 4, int(height*width/2),
                            10, false);

The method opencv.blobs finds blobs in an image. It returns an array of blobs sorted in order of size, with the largest first.

The Blob class contains a number of data fields containing information about a blob that has been found.

point(blobs[i].centroid.x, blobs[i].centroid.y);

The middle (centroid) of the blob

Rectangle r = blobs[i].rectangle;
rect(r.x, r.y, r.width, r.height);

A rectangle that encloses the blob (a bounding box)

beginShape();
for( int j=0; j<blobs[i].points.length; j++ ) {
  vertex( blobs[i].points[j].x, blobs[i].points[j].y );
}
endShape(CLOSE);

A list of points that define the outline of the blob

All of these fields can be used to visualize the blobs that have been found.
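
For example, a loop along these lines (a sketch, assuming the opencv object and the blobs array from the call above) draws the centroid, bounding box and outline of every blob that was found:

noFill();
stroke(255, 0, 0);
for (int i = 0; i < blobs.length; i++) {
  // centroid
  point(blobs[i].centroid.x, blobs[i].centroid.y);
  // bounding box
  rect(blobs[i].rectangle.x, blobs[i].rectangle.y,
       blobs[i].rectangle.width, blobs[i].rectangle.height);
  // outline
  beginShape();
  for (int j = 0; j < blobs[i].points.length; j++) {
    vertex(blobs[i].points[j].x, blobs[i].points[j].y);
  }
  endShape(CLOSE);
}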

Control your input

Blob detection just detects bright patches in the image, which isn't very useful on a normal, unprocessed image. You need to control the input you send to blob detection so that the bright patches it finds actually correspond to something meaningful. There are two main ways to do this.

Control your environment

When working with computer vision it is always a good idea to control the environment in which you are grabbing the image as much as possible. Control the lighting conditions so that they don't change much, and the background so that it is quite plain. More on this later.

Preprocessing your image

You should also pre-process your image, applying filters and other operations to make sure that the things you are interested in turn up as bright patches in your final image.

  opencv.convert(OpenCV.GRAY);   // work on brightness only
  opencv.threshold(40);          // keep only pixels brighter than 40 (out of 255)

Converting to gray scale makes sure we don't have to worry about colour values, and running a threshold filter makes the blobs better defined, with stronger contours.

N.B. we are using OpenCV's filters rather than the built-in Processing ones, so they behave a bit differently. You don't need to call them on a particular image, as they automatically act on OpenCV's current image (the one that has been read from the camera or movie). Also, threshold takes a colour value from 0 to 255, not a value from 0 to 1.

Still, all we are doing is picking out bright areas. We need a way of converting the image so that interesting objects appear as bright patches in the final image.

Background subtraction

We capture an image of the background with nothing moving in front of it. Every frame we subtract this from the current image and the result is just an image of the objects moving in the foreground, everything else will be black. This means that the blobs we find will be foreground objects.

  opencv.read();       // grab an initial frame
  opencv.remember();   // store it as the background image

In setup, just after loading the movie or setting up the camera, we can capture a background image.

opencv.read() will capture the initial image from the movie or camera. opencv.remember() will store this image for later use in background subtraction.

  opencv.absDiff();

opencv.absDiff() (absolute difference) will subtract the image that was stored by opencv.remember() from the last image captured from the camera. The result is background subtraction.

We can then perform a threshold filter as above, after which blob detection should, fairly reliably, detect objects in the foreground.
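
Putting the pieces together, a draw() loop for background subtraction might look something like this (a sketch, assuming the camera set up and remember() call from the setup examples above, and a sketch the same size as the capture):

void draw(){
  opencv.read();                 // grab the current frame
  opencv.absDiff();              // subtract the remembered background
  opencv.convert(OpenCV.GRAY);   // ignore colour
  opencv.threshold(40);          // keep only strong differences

  image(opencv.image(), 0, 0);   // show the difference image

  // anything left should correspond to a foreground object
  Blob[] blobs = opencv.blobs(4, width*height/2, 10, false);
  noFill();
  stroke(0, 255, 0);
  for (int i = 0; i < blobs.length; i++) {
    rect(blobs[i].rectangle.x, blobs[i].rectangle.y,
         blobs[i].rectangle.width, blobs[i].rectangle.height);
  }
}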

Problems

Background subtraction + blob detection works pretty well as a method for interaction, but you shouldn't treat it as a completely reliable input as you might a mouse and keyboard; there are a number of problems that you have to deal with.

Blobs not Objects

Blob detection doesn’t exactly capture the shape of objects, it will give you blobs that correspond to an object but may not be the whole object. There may be several blobs per object, as shown above. Adjusting the threshold parameter might help, but it will never be completely accurate.

Lighting

There is a big problem with background subtraction: it detects pixels which are different from the original background. Pixels can be different because there is a new object there, but they can also be different for many irrelevant reasons, particularly because lighting conditions change slightly.

The way to get around this is to control your environment and lighting as much as possible. Try to capture your video against a plain background with no movement, and keep the lighting as constant as possible. Capturing indoors under artificial lighting is much easier, and make sure there are no direct light sources in your image.

These problems can be compounded by the fact that webcams often do a lot of pre-processing to make pictures look good, adjusting brightness and contrast settings automatically. This is great if you want to take pictures, but changing the contrast of an image changes the values of all its pixels without anything moving at all, and so gives you lots of false results. If you are serious about doing computer vision you should turn off all auto processing on your camera, if you can, or use a camera that doesn’t do any auto correction. The Sony EyeToy cameras are designed for computer vision applications and are very good.

Motion Detection

Another approach, which is often more reliable, is to detect movement rather than shapes. It is often enough to detect whether anything is moving in an area, rather than whether there is an object there. You do this by calling opencv.remember() at the end of every frame rather than just once. This means that you are finding the difference from the previous frame, rather than from the background.

      opencv.read();                 // grab the current frame
      opencv.absDiff();              // difference from the previous frame
      opencv.convert(OpenCV.GRAY);
      opencv.threshold(40);
      Blob[] blobs = opencv.blobs( 4, int(height*width/2),
                            10, false);
      opencv.remember();             // store this frame for the next comparison

The blob detection call no longer detects proper shapes; it detects areas of movement. All you need to do to detect movement is to check whether there are any blobs present: if there are, then you have movement. You might need to adjust the threshold and the minimum blob size so you don’t get too many spurious blobs.
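
A minimal movement test then only needs to look at the number of blobs found:

      // after the blob detection call above
      boolean movement = blobs.length > 0;   // any blob at all means something moved
      if (movement) {
        // respond to the motion, e.g. trigger a sound or start an animation
      }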

Regions of Interest

You often want to detect movement only in a particular area of the image, rather than the whole image (to create specific areas where you detect interaction). OpenCV's Region of Interest functionality allows you to do this.

      opencv.ROI(x, y, width, height);

opencv.ROI restricts all of OpenCV's processing to a particular region, defined as a rectangle. You pass it the position of the top left-hand corner and the width and height of the rectangle.
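
For example, to look for movement only in a box in the top left corner of the frame, you could restrict the region just before blob detection and then set it back to the full frame (a sketch; the coordinates are arbitrary, and the rest of the frame loop is as in the motion detection example above):

      opencv.ROI(0, 0, 200, 150);            // only process this rectangle
      Blob[] blobs = opencv.blobs(4, 200*150/2, 10, false);
      if (blobs.length > 0) {
        // movement detected inside the region
      }
      opencv.ROI(0, 0, width, height);       // back to processing the whole frame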

Overview and aims

By the end of the lecture you will be able to:

  • Understand the basic functions of computer vision and how
    it can be used to create interaction methods
  • Use the OpenCV library in Processing
  • Use basic techniques such as blob detection and background subtraction
  • Create computer vision based interaction methods

Myron

Installing OpenCV on Windows can be problematic. An alternative is to use a different library called Myron:

http://webcamxtra.sourceforge.net/reference.shtml

It has similar functionality to OpenCV, but with a slightly different interface. To work with Myron you first need to set up a JMyron object (JMyron is Java Myron):

import JMyron.*;
 
JMyron m;//a camera object

In setup you need to create the object and start the camera capture:

  m = new JMyron();//make a new instance of the object
  m.start(320,240);//start a capture at 320x240

Myron doesn’t support loading video, but you can load video using the Processing video library and then send it to Myron using the Myron hijack command (see the Myron reference)

Blob detection is done automatically (though in Myron blobs are called “globs”). You can specify the colour you want to detect using the trackColor command. You don’t need to call a blob detection command; you can just call methods like globCenters and globBoxes to get various features of the globs.
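
For example, a draw() loop along these lines (a sketch, assuming the JMyron object m from above; the tracking colour and tolerance values are arbitrary, and trackColor would normally be set once in setup) marks each glob that is found:

void draw(){
  m.update();                        // grab and process a new frame
  m.trackColor(255, 0, 0, 200);      // look for reddish pixels, with a generous tolerance

  background(0);

  int[][] boxes = m.globBoxes();     // bounding box of each glob: {x, y, width, height}
  noFill();
  stroke(0, 255, 0);
  for (int i = 0; i < boxes.length; i++) {
    rect(boxes[i][0], boxes[i][1], boxes[i][2], boxes[i][3]);
  }

  int[][] centers = m.globCenters(); // centre point of each glob: {x, y}
  stroke(255, 0, 0);
  for (int i = 0; i < centers.length; i++) {
    point(centers[i][0], centers[i][1]);
  }
}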

Background subtraction and motion detection are also built in. Glob detection is automatically done on the difference between the current image and the “retinal image”, which is a stored image. By default the “retinal image” is empty, so there is no background subtraction. To put something in the retinal image you call the adapt command, which stores the current image. To do background subtraction you need to do this in setup.
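
In other words, something like this at the end of setup (a sketch, assuming the JMyron object m created above):

  // at the end of setup(), after m.start():
  m.update();   // grab an initial frame from the camera
  m.adapt();    // store it as the retinal (background) image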

For motion detection you need to adapt every frame. Myron lets you do this with the adaptivity command: setting adaptivity to 1 automatically stores every frame to the retinal image, setting it to 0 means no automatic adaptation, and setting it to a higher value makes the retinal image an average of several previous frames. So to do motion detection all you need to do is set adaptivity to 1. The average command calculates the average pixel value within a rectangle, which is a good way of telling whether there has been motion in a particular region.
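
A minimal motion detection setup might therefore look like this (a sketch; the check in draw simply looks for any glob at all):

  // in setup(), after starting the camera:
  m.adaptivity(1);   // store every frame as the retinal image,
                     // so globs are found on frame-to-frame differences

  // in draw(), after m.update():
  int[][] boxes = m.globBoxes();
  if (boxes.length > 0) {
    // something moved this frame
  }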

Unfortunately, Myron doesn’t allow you to get a the current image as a PImage, only as an array of pixels, so you need to write your own code to display it to the screen. Here is an example:

void drawCamera(){
  int[] img = m.image();             // get the current camera image as an array of pixels
  loadPixels();
  for(int i=0;i<width*height;i++){   // loop through all the pixels
                                     // (assumes the sketch is the same size as the capture)
    pixels[i] = img[i];              // copy each camera pixel to the screen
  }
  updatePixels();
}

Look at the Myron reference and examples to find out more about how to use Myron.
