
Segmentation of Videos Using Color, Motion and Spatial Information

Video segmentation differs from segmentation of a single image. While several correct solutions may exist for segmenting a single image, video segmentation requires the segmentations of successive frames to be consistent with one another. Previous approaches to video segmentation concentrate on motion, or combine motion and color information in a batch fashion. We propose a maximum a posteriori probability (MAP) framework that uses multiple cues, such as spatial location, color and motion, for segmentation. We assign weights to the color and motion terms and adjust them at every pixel, based on a confidence measure of each feature. We also discuss the appropriate modeling of the probability density functions (pdfs) of each feature of a region. The correct modeling of the spatial pdf imposes temporal consistency among segments in consecutive frames. This approach unifies the strengths of color segmentation and motion segmentation in one framework, and shows good results on videos that are not well suited to either approach alone.
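
To make the idea of a weighted MAP labeling concrete, here is a minimal sketch, not the implementation from the paper. It assumes each segment models spatial location, color and motion with an isotropic Gaussian, which is an illustrative choice only. The names log_gaussian and map_label, the dictionary keys, and the sample values are all hypothetical, and the weights are passed in as plain numbers because the confidence measure that sets them is not described on this page.

```python
import math


def log_gaussian(value, mean, var):
    """Log density of an isotropic Gaussian (an assumed, illustrative pdf)."""
    dim = len(value)
    sq_dist = sum((v - m) ** 2 for v, m in zip(value, mean))
    return -0.5 * (sq_dist / var + dim * math.log(2 * math.pi * var))


def map_label(pixel, segments, w_color, w_motion):
    """Return the index of the segment with the highest weighted log-posterior.

    The spatial term is always used; the color and motion terms are scaled
    by per-pixel weights supplied by the caller (hypothetical interface).
    """
    best_k, best_score = None, -math.inf
    for k, seg in enumerate(segments):
        score = (
            log_gaussian(pixel["xy"], seg["xy_mean"], seg["xy_var"])
            + w_color * log_gaussian(pixel["color"], seg["color_mean"], seg["color_var"])
            + w_motion * log_gaussian(pixel["motion"], seg["motion_mean"], seg["motion_var"])
        )
        if score > best_score:
            best_k, best_score = k, score
    return best_k


# Two stationary segments that differ only in color and position (made-up values).
segments = [
    {"xy_mean": (10, 10), "xy_var": 100.0,
     "color_mean": (200, 30, 30), "color_var": 400.0,
     "motion_mean": (0, 0), "motion_var": 1.0},
    {"xy_mean": (30, 10), "xy_var": 100.0,
     "color_mean": (30, 200, 30), "color_var": 400.0,
     "motion_mean": (0, 0), "motion_var": 1.0},
]

# A reddish pixel halfway between the two segments, with no motion.
pixel = {"xy": (20, 10), "color": (190, 40, 35), "motion": (0, 0)}

# Motion carries no discriminating information here, so weight color fully.
label = map_label(pixel, segments, w_color=1.0, w_motion=0.0)
```

In this toy case the spatial and motion terms are identical for both segments, so the color term alone decides the label. This mirrors, in a simplified way, the behavior described for sequences where motion is uninformative.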



Associated Publication:

Object Based Segmentation of Video Using Color, Motion and Spatial Information,
in IEEE Int'l Conference on Computer Vision and Pattern Recognition (Dec. 2001).

Sequence 1: [orig*] [results]
Standard Flower Garden Sequence. We start with an incorrect initial segmentation, but the segmentation improves steadily and shows very good results towards the end.

Sequence 2: [orig*] [results]
Tennis Sequence. The motion of the player is non-uniform, so standard motion segmentation methods generally lose the player when he stops. We consistently track the player as a segment because, when motion carries no discriminatory information, our algorithm switches to color segmentation.

Sequence 3: [orig*] [results]
Short sequence from the movie "Legally Blonde". The first frames show the initial segmentation being refined over time. We then show the segmentation for about 30 frames. This is a fairly complicated sequence.

Sequence 4: [orig*] [results]
Mother and Daughter Sequence. Only three segments are used here.

* The original sequences available here are MPEG compressed and are provided only for visualization of results. The actual algorithm was run on uncompressed versions of these sequences.

We are currently working on dynamically changing the number of segments to account for scene changes. More sequences and results of this work will be available soon.