
Summary and Discussion

 

We have presented a system, OMLET, which uses labeled training examples to learn the fuzzy membership functions embedded in a function-based object recognition system. The fuzzy membership functions provide evaluation measures that determine how well a shape fits the functional description of an object category. OMLET is an example of using machine learning techniques to aid in the development of a computer vision system. We have shown that it is possible to learn, accurately and automatically, system parameters which would otherwise have to be provided by a human expert. OMLET may also be used to aid in the construction of other object categories for the GRUFF object recognition system. Rather than concentrating on "hand-tweaking" the range parameters to improve system performance, the expert can concentrate on providing a good set of example objects to "show" to OMLET. This is intuitively appealing in that descriptions of the objects we would like GRUFF to recognize are derived from examples of the object category. Additionally, we have demonstrated that the performance of the learning algorithm is affected by the number and quality of the training examples.

It should be possible to apply the learning approach described in this paper to other systems in which measurements (or other values) are combined in a tree structure. Our approach covers all cases except that of two leaves leading directly to a probabilistic OR (POR) node; however, a generalization of our method for treating POR nodes may be developed to handle this situation. The tree structure in our computer vision system is composed entirely of probabilistic AND and probabilistic OR nodes, which are used to combine measurements. It is possible that a similar approach is applicable to tree structures in which other types of nodes (T-norms or T-conorms) are used.
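To make the kind of tree structure discussed above concrete, the following is a minimal sketch of how measurements might be scored at the leaves and combined upward. It is not the OMLET or GRUFF implementation. It assumes the conventional definitions of probabilistic AND (the product of the child scores) and probabilistic OR (one minus the product of the complements, which for two children is a + b - ab), and it assumes a trapezoidal membership function at each leaf. The names Leaf, Node, and example_tree, and all numeric values, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Leaf:
    """A measurement scored by a trapezoidal membership function.

    The trapezoidal shape is an assumption of this sketch; a, b, c, d
    stand in for the learned range parameters (with a <= b <= c <= d).
    """
    value: float
    a: float  # membership is 0 below a
    b: float  # membership rises from 0 to 1 between a and b
    c: float  # membership is 1 between b and c
    d: float  # membership falls from 1 to 0 between c and d

    def evaluate(self) -> float:
        x = self.value
        if x < self.a or x > self.d:
            return 0.0
        if x < self.b:
            return (x - self.a) / (self.b - self.a)
        if x <= self.c:
            return 1.0
        return (self.d - x) / (self.d - self.c)


@dataclass
class Node:
    """An internal node combining child scores with PAND or POR."""
    kind: str  # "PAND" or "POR"
    children: List[Union["Node", Leaf]]

    def evaluate(self) -> float:
        scores = [child.evaluate() for child in self.children]
        if self.kind == "PAND":
            # Probabilistic AND: product of the child scores.
            result = 1.0
            for s in scores:
                result *= s
            return result
        if self.kind == "POR":
            # Probabilistic OR: 1 - prod(1 - s); equals a + b - ab for two children.
            miss = 1.0
            for s in scores:
                miss *= 1.0 - s
            return 1.0 - miss
        raise ValueError(f"unknown node kind: {self.kind}")


def example_tree() -> Node:
    """Build a small hypothetical tree; the values are illustrative only."""
    trapezoid = dict(a=0.0, b=2.0, c=4.0, d=6.0)
    return Node("PAND", [
        Leaf(3.0, **trapezoid),            # on the plateau: 1.0
        Node("POR", [
            Leaf(1.0, **trapezoid),        # on the rising edge: 0.5
            Node("PAND", [
                Leaf(5.0, **trapezoid),    # on the falling edge: 0.5
                Leaf(3.0, **trapezoid),    # on the plateau: 1.0
            ]),
        ]),
    ])


if __name__ == "__main__":
    print(example_tree().evaluate())  # 1.0 * (1 - 0.5 * 0.5) = 0.75
```

Replacing the product and probabilistic sum in Node.evaluate with another T-norm and T-conorm pair (for example, minimum and maximum) would yield the other kinds of tree mentioned above; whether the learning approach carries over to them is left open here, as in the text.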

The OMLET system should make it easier to adapt the GRUFF system to new object domains. Early versions of GRUFF performed object recognition starting from complete 3-D shape descriptions [Stark and Bowyer 1991; Stark and Bowyer 1994; Sutton, Stark, and Bowyer 1993] rather than from real sensory data. The task of reliably extracting accurate object shape descriptions from normal intensity images is beyond the current state of the art in computer vision. Although work in areas such as binocular stereo is steadily progressing, accurate models of object shape are more readily extracted from range imagery. Whereas in normal imagery a pixel value represents the intensity of reflected light, in range imagery a pixel value represents the distance to a point in the scene. A version of GRUFF has been developed which attempts to recognize object functionality from the shape model extracted from a single range image [Stark, Hoover, Goldgof, and Bowyer 1993b]. A major difficulty here is, of course, that a single range image does not yield a complete model of the 3-D shape of an object: the "back half" of the object shape is unseen [Hoover, Goldgof, and Bowyer 1995]. The accumulation of a complete 3-D shape model through a sequence of range images is a topic of current research. If this problem were solved, then it is conceivable that an OMLET training example might consist of a sequence of range images along with some operator annotations identifying which portions of the images correspond to the functionally important parts of the object (seating surface, back support surface, etc.).

This research was supported by Air Force Office of Scientific Research grant F49620-92-J-0223 and National Science Foundation grant IRI-91-20895.






Larry &
Wed Oct 18 17:48:34 EDT 1995