Due 16 February 2001
Overall
This lab will be your first introduction to object recognition. You will be
trying to recognize ten objects based on their 2-D characteristics, and possibly
color. The ten objects are shown below.
- Black Disk
- Clip
- Cone
- Disk Box
- Envelope
- Orange Disk
- Pen
- Pliers
- Stop-sign
- Velvet
Your final system should be able to read in an image and output a list of the
objects in the image, along with their bounding boxes. You should probably have
your program output this information graphically as well, by writing out the
input image with each object highlighted by its bounding box. You can assume
for this lab that all images will be of the objects on a white background.
Data
There are nine test images, each containing multiple non-overlapping objects
on a white background. Each of the images was taken with an Olympus D600-L
digital camera. Each image was then cropped and transformed using a gamma 1.5
transformation in xv plus a slight modification to flatten the upper intensity
range (upper middle active point moved to (159,210)). Each image was treated
identically, so differences are due to illumination and the color balancing in
the camera.
Tasks
- The first task is to write a program that converts an input image into a
binary image, with the background white and the foreground non-white. You can
use manually specified thresholds, or have your program find them
automatically. You may want to reduce noise in this step by applying a
Gaussian or median filter.
To read and write the PPM images you can use the following library. The
ppmmain.c file is an example of how to use the routines and modify an image.
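As a rough sketch of what this step might look like, the routine below marks a pixel as foreground whenever any of its color channels falls below a hand-picked cutoff. The interleaved RGB buffer layout, the function name, and the cutoff value are all assumptions made for illustration, not part of the provided PPM library.

/* Sketch of a manual thresholding step. Assumes an interleaved RGB
 * buffer (3 bytes per pixel); the provided PPM library may store
 * pixels differently. */
#define THRESH 180  /* assumed cutoff for "white enough" background */

/* Write 255 (background) or 0 (foreground) into binary[] for each pixel. */
void threshold_image(const unsigned char *rgb, unsigned char *binary,
                     int rows, int cols)
{
    int i;
    for (i = 0; i < rows * cols; i++) {
        unsigned char r = rgb[3 * i + 0];
        unsigned char g = rgb[3 * i + 1];
        unsigned char b = rgb[3 * i + 2];

        /* The background is white, so a pixel counts as foreground if
         * any channel is noticeably darker than the cutoff. */
        if (r < THRESH || g < THRESH || b < THRESH)
            binary[i] = 0;    /* foreground */
        else
            binary[i] = 255;  /* background */
    }
}

Running a 3x3 median filter over the binary image afterwards is one simple way to remove isolated noise pixels.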
- The second task is to locate the bounding box of each object in the image.
An implementation of a two-pass
segmentation algorithm that works on binary images is provided; you may use it
or write your own. You will also need the accompanying include
file. A simple example of how to use the segmentation algorithm is also
provided.
These are the kinds of results you should be getting.
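If you decide to write your own labeling code instead of using the provided one, a two-pass connected-components pass along the following lines is one common approach. This is only a sketch (4-connectivity, with union-find to resolve label equivalences) against the same assumed binary buffer as above; it is not the provided implementation, and the function names are made up for illustration.

#include <stdlib.h>

/* Union-find helpers for recording label equivalences. */
static int find_root(int *parent, int x)
{
    while (parent[x] != x) {
        parent[x] = parent[parent[x]];  /* path halving */
        x = parent[x];
    }
    return x;
}

static void merge(int *parent, int a, int b)
{
    a = find_root(parent, a);
    b = find_root(parent, b);
    if (a != b)
        parent[b] = a;
}

/* Two-pass labeling of a binary image (0 = foreground, 255 = background).
 * Writes a label (> 0) for each foreground pixel into labels[] and returns
 * the number of regions. Caller supplies labels[rows * cols]. */
int label_regions(const unsigned char *binary, int *labels, int rows, int cols)
{
    int *parent = (int *)malloc((rows * cols + 1) * sizeof(int));
    int next = 1, nregions = 0, r, c, i;

    if (parent == NULL)
        return -1;

    /* Pass 1: give each foreground pixel a provisional label, taking it
     * from the neighbor above or to the left when possible, and record
     * an equivalence when both neighbors carry different labels. */
    for (r = 0; r < rows; r++) {
        for (c = 0; c < cols; c++) {
            int idx = r * cols + c;
            int up, left;
            if (binary[idx] != 0) { labels[idx] = 0; continue; }
            up   = (r > 0) ? labels[idx - cols] : 0;
            left = (c > 0) ? labels[idx - 1]    : 0;
            if (up == 0 && left == 0) {
                labels[idx] = next;
                parent[next] = next;
                next++;
            } else if (up != 0 && left != 0) {
                labels[idx] = up;
                merge(parent, up, left);
            } else {
                labels[idx] = (up != 0) ? up : left;
            }
        }
    }

    /* Pass 2: replace each provisional label by its equivalence-class root. */
    for (i = 0; i < rows * cols; i++)
        if (labels[i] != 0)
            labels[i] = find_root(parent, labels[i]);

    /* Count the distinct roots that were actually created. */
    for (i = 1; i < next; i++)
        if (parent[i] == i)
            nregions++;

    free(parent);
    return nregions;
}

The final labels are root indices rather than consecutive integers; finding each region's bounding box is then a single pass over the label image that tracks the minimum and maximum row and column seen for each label.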
- The third task is to write a routine that calculates a feature set for
each object (size, % of bounding box filled, orientation, moments, color),
given the bounding box of the object, a binary image containing it, and
possibly the original image.
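Several of the listed features fall straight out of the label image. The sketch below computes area, bounding-box fill ratio, centroid, and an orientation angle from the central second moments; the struct name and field choices are illustrative, and color features would need the original image as well.

#include <math.h>

/* Illustrative feature record; the field names are assumptions. */
typedef struct {
    int    area;        /* number of pixels in the region */
    double fill;        /* area / bounding-box area */
    double cx, cy;      /* centroid (column, row) */
    double theta;       /* orientation of the major axis, in radians */
} Features;

/* Compute features for the region with the given label, restricted to
 * its bounding box (rmin..rmax, cmin..cmax inclusive). */
Features region_features(const int *labels, int cols, int label,
                         int rmin, int rmax, int cmin, int cmax)
{
    Features f = {0, 0.0, 0.0, 0.0, 0.0};
    double mu20 = 0.0, mu02 = 0.0, mu11 = 0.0;
    int r, c;

    /* First pass over the box: area and centroid. */
    for (r = rmin; r <= rmax; r++)
        for (c = cmin; c <= cmax; c++)
            if (labels[r * cols + c] == label) {
                f.area++;
                f.cx += c;
                f.cy += r;
            }
    if (f.area == 0)
        return f;
    f.cx /= f.area;
    f.cy /= f.area;
    f.fill = (double)f.area / ((rmax - rmin + 1) * (cmax - cmin + 1));

    /* Second pass: central second moments, then orientation
     * theta = 0.5 * atan2(2*mu11, mu20 - mu02). */
    for (r = rmin; r <= rmax; r++)
        for (c = cmin; c <= cmax; c++)
            if (labels[r * cols + c] == label) {
                double dx = c - f.cx, dy = r - f.cy;
                mu20 += dx * dx;
                mu02 += dy * dy;
                mu11 += dx * dy;
            }
    f.theta = 0.5 * atan2(2.0 * mu11, mu20 - mu02);
    return f;
}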
- The fourth task is to develop a database of features for each object.
These will constitute your object models. Use image 7 or 8 (or both) to build
your object database.
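One simple way to build the database is a plain text file with one line per training example: the object's name followed by its feature values. The routine below sketches that idea; the file format, feature count, and function name are arbitrary choices, not requirements of the lab.

#include <stdio.h>

/* Append one training example to a plain-text model database.
 * Each line holds the object name followed by its feature values;
 * the file layout and the number of features are illustrative. */
int save_model(const char *dbfile, const char *name,
               const double *features, int nfeatures)
{
    FILE *fp = fopen(dbfile, "a");
    int i;
    if (fp == NULL)
        return -1;
    fprintf(fp, "%s", name);
    for (i = 0; i < nfeatures; i++)
        fprintf(fp, " %.4f", features[i]);
    fprintf(fp, "\n");
    fclose(fp);
    return 0;
}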
- Finally, combine these pieces into a program (or set of programs) that
takes in an image, locates potential objects, compares them to the objects in
the database, and labels each one with the closest match. One program does not
have to do all of this, and it may be easier to divide it into steps and use
intermediate data files and images.
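For the matching step, a nearest-neighbor comparison in feature space is one straightforward choice. The sketch below assumes the features have already been scaled to comparable ranges and that the models have been loaded into the illustrative Model array shown; both the struct and the distance measure are assumptions, not requirements.

#include <float.h>

/* Illustrative model entry: a name plus its (pre-scaled) feature vector. */
typedef struct {
    char   name[32];
    double features[8];
} Model;

/* Return the index of the model whose feature vector is closest to the
 * unknown object's, using squared Euclidean distance. */
int closest_model(const Model *models, int nmodels,
                  const double *unknown, int nfeatures)
{
    int i, j, best = -1;
    double bestdist = DBL_MAX;

    for (i = 0; i < nmodels; i++) {
        double dist = 0.0;
        for (j = 0; j < nfeatures; j++) {
            double d = unknown[j] - models[i].features[j];
            dist += d * d;
        }
        if (dist < bestdist) {
            bestdist = dist;
            best = i;
        }
    }
    return best;  /* -1 if nmodels == 0 */
}

Scaling each feature by an estimate of its spread before computing distances keeps large-valued features such as area from dominating small ones such as fill ratio.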
Extensions:
- Automatically identify thresholds for step one (one standard approach is sketched after this list).
- Write your own segmentation algorithm for step two (possibly a region
growing algorithm).
- Use a feature set that does not use color, but can still correctly
differentiate everything except the orange and black disks.
- Develop or train a decision tree for object identification that can be
easily changed.
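For the first extension, Otsu's method is one standard way to choose a global threshold automatically: it picks the gray level that maximizes the between-class variance of the image histogram. The sketch below assumes a 256-bin grayscale histogram has already been computed from the input image; the function name is made up for illustration.

/* Otsu's method: return the gray level that maximizes the between-class
 * variance, given a 256-bin grayscale histogram. */
int otsu_threshold(const long hist[256])
{
    long total = 0, wB = 0, wF;
    double sum = 0.0, sumB = 0.0, best = 0.0;
    int t, thresh = 0;

    for (t = 0; t < 256; t++) {
        total += hist[t];
        sum   += (double)t * hist[t];
    }

    for (t = 0; t < 256; t++) {
        wB += hist[t];             /* background weight */
        if (wB == 0) continue;
        wF = total - wB;           /* foreground weight */
        if (wF == 0) break;
        sumB += (double)t * hist[t];

        {
            double mB = sumB / wB;            /* background mean */
            double mF = (sum - sumB) / wF;    /* foreground mean */
            double between = (double)wB * wF * (mB - mF) * (mB - mF);
            if (between > best) {
                best = between;
                thresh = t;
            }
        }
    }
    return thresh;
}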
What to hand in
Your lab report should be set up as a web page (the easiest method of displaying
images). Do not post code on your web page; instead, email it to Professor
Maxwell when you hand in your lab.
For this lab you should also plan to demonstrate your system on some new test
images during the lab period on 16 February. These images will be taken under
conditions similar to those of the images posted above.
Your posted lab report should contain the following:
- An abstract of no more than 100 words explaining what you did.
- A summary of your thresholding algorithm, including any manually selected
thresholds.
- A summary of your segmentation algorithm.
- A summary of your feature extraction step.
- A summary of how you built the object models.
- A summary of how you matched new objects to the object models in your
database.
- Pictures showing example results for each step.
- A confusion matrix showing your recognition results. This should have the
actual objects along the vertical axis and the resulting labels given those
objects along the horizontal axis. A perfect recognition system would have
entries only along the diagonal.
- A chart showing, for each object, the percent correctly identified.
- A description of any extensions you did, including your methods and results.