E27/CS27 Computer Vision Lab #1

Due 16 February 2001

Overall

This lab will be your first introduction to object recognition. You will be trying to recognize ten objects based on their 2-D characteristics, and possibly color. The ten objects are shown below.

Black Disk

Clip

Cone

Disk Box

Envelope

Orange Disk

Pen

Pliers

Stop-sign

Velvet

Your final system should be able to read in an image and output a list of the objects in the image, along with their bounding boxes. Your program should also output this information graphically by writing out the input image with each object highlighted by its bounding box. You can assume for this lab that all images will be of the objects on a white background.

Data

There are nine test images, each containing multiple non-overlapping objects on a white background. Each of the images was taken with an Olympus D600-L digital camera. Each image was then cropped and transformed using a gamma 1.5 transformation in xv, plus a slight modification to flatten the upper intensity range (upper middle active point moved to (159,210)). Each image was treated identically, so any differences between images are due to illumination and the camera's color balancing.

Tasks

  1. The first task is to write a program that converts an input image into a binary image, with the background white and the foreground non-white. You can use manually specified thresholds, or have your program find them automatically. You may want to discard noise in this step through the use of a Gaussian or median filter.

    To read and write the PPM images you can use the following library. The ppmmain.c file is an example of how to use the routines and modify an image.
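As a starting point, thresholding can be as simple as calling every pixel background when all three channels are near white. The sketch below is a minimal illustration, not the posted library's API: it assumes the image arrives as interleaved RGB bytes (as a PPM reader would typically provide), and the threshold value 200 is an arbitrary example you would tune by hand or find automatically.

```c
/* Hypothetical thresholding sketch. A pixel is background only if all
   three channels are at or above the threshold T; everything else is
   foreground. T = 200 below is an assumed, manually chosen value. */

/* Classify one RGB pixel: returns 1 for background, 0 for foreground. */
int is_background(unsigned char r, unsigned char g, unsigned char b,
                  unsigned char T)
{
    return r >= T && g >= T && b >= T;
}

/* Threshold a whole image of npixels interleaved RGB bytes, writing
   255 (white) for background and 0 for foreground into out[]. */
void threshold_image(const unsigned char *rgb, unsigned char *out,
                     int npixels, unsigned char T)
{
    for (int i = 0; i < npixels; i++)
        out[i] = is_background(rgb[3*i], rgb[3*i+1], rgb[3*i+2], T)
                 ? 255 : 0;
}
```

A median or Gaussian filter applied before (or after) this step will remove isolated noise pixels that would otherwise become spurious tiny regions in the segmentation step.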

  2. The second task is to locate the bounding box of each object in the image. Here is the implementation of a 2-pass segmentation algorithm that works on "binary" images. You may use this one or write your own. You will also need to get this include file. Here is also a simple example of how to use the segmentation algorithm.

    These are the kinds of results you should be getting.
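If you choose to write your own segmentation routine, the classic two-pass connected-components algorithm is one way to do it. The sketch below is a generic illustration of that technique (it is not the posted implementation): pass 1 assigns provisional labels and records label equivalences with union-find; pass 2 resolves every label to its root. It assumes 4-connectivity and a binary image where 0 is foreground and 255 is background.

```c
/* Two-pass connected-components sketch with union-find.
   MAXLAB is an assumed fixed bound for this illustration. */
#define MAXLAB 10000
static int parent[MAXLAB];

static int find_root(int x)
{
    while (parent[x] != x)
        x = parent[x] = parent[parent[x]];  /* path halving */
    return x;
}

static void unite(int a, int b) { parent[find_root(a)] = find_root(b); }

/* Label foreground (0) pixels of bin[]; background gets label 0.
   Returns the number of distinct components found. */
int label_components(const unsigned char *bin, int *labels,
                     int rows, int cols)
{
    int next = 1;
    /* Pass 1: provisional labels, recording equivalences. */
    for (int r = 0; r < rows; r++)
        for (int c = 0; c < cols; c++) {
            int i = r * cols + c;
            if (bin[i] != 0) { labels[i] = 0; continue; }  /* background */
            int up   = (r > 0 && bin[i - cols] == 0) ? labels[i - cols] : 0;
            int left = (c > 0 && bin[i - 1]    == 0) ? labels[i - 1]    : 0;
            if (!up && !left) {
                labels[i] = next; parent[next] = next; next++;
            } else if (up && left) {
                labels[i] = up;
                if (up != left) unite(up, left);
            } else {
                labels[i] = up ? up : left;
            }
        }
    /* Pass 2: replace every provisional label with its root. */
    for (int i = 0; i < rows * cols; i++)
        if (labels[i]) labels[i] = find_root(labels[i]);
    /* Count distinct roots. */
    int count = 0;
    for (int l = 1; l < next; l++)
        if (find_root(l) == l) count++;
    return count;
}
```

Once every pixel carries a component label, the bounding box of each object is just the min and max row and column at which its label occurs.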

  3. The third task is to write a routine that calculates a feature set for each object (size, % of bounding box filled, orientation, moments, color), given the bounding box of the object, a binary image containing it, and possibly the original image.

  4. The fourth task is to develop a database of features for each object. These will constitute your object models. Use image 7 or 8 (or both) to build your object database.

  5. Finally, combine these together into a program (or set of programs) that takes in an image, locates potential objects, compares them to the objects in the database, and labels each one with the closest match. One program does not have to do all of this, and it may be easier to divide it into steps and use intermediate data files and images.

    Extensions:

What to hand in

Your lab reports should be set up as web pages (easiest method of displaying images). Do not post code on your web page. Instead, email it to Professor Maxwell when you hand in your lab.

For this lab you should also plan to demonstrate your system on some new test images during the lab period on 16 February. These images will be taken under conditions similar to those of the images posted above.

Your posted lab report should contain the following:

  1. An abstract of no more than 100 words explaining what you did.
  2. A summary of your thresholding algorithm, including any manually selected thresholds.
  3. A summary of your segmentation algorithm.
  4. A summary of your feature extraction step.
  5. A summary of how you built the object models.
  6. A summary of how you matched new objects to the object models in your database.
  7. Pictures showing example results for each step.
  8. A confusion matrix showing your recognition results. This should have the actual objects along the vertical axis and the resulting labels given those objects along the horizontal axis. A perfect recognition system would only have entries along the diagonal.
  9. A chart showing, for each object, the percent correctly identified.
  10. What extensions did you do? Show methods and results.