Scene Understanding and Classification

Related Publication: Jingen Liu and Mubarak Shah, Scene Modeling using Co-Clustering, IEEE International Conference on Computer Vision (ICCV), 2007.

  1. Introduction
  2. Framework
  3. Objective function of co-clustering
  4. Results


Introduction

In this paper, we propose a novel approach for scene modeling that automatically discovers intermediate semantic concepts. We use the Maximization of Mutual Information (MMI) co-clustering approach to discover clusters of semantic concepts, which we call intermediate concepts. Each intermediate concept corresponds to a cluster of visterms in the Bag of Visterms (BOV) paradigm for scene classification. MMI co-clustering yields fewer but meaningful clusters. Unlike k-means, which is used in BOV to cluster image patches based on their appearance, MMI co-clustering groups visterms that are highly correlated with some concept. Unlike probabilistic Latent Semantic Analysis (pLSA), which can be considered one-sided soft clustering, MMI co-clustering clusters visterms and images simultaneously, so each clustering can improve the other. In addition, MMI co-clustering is an unsupervised method. We have extensively tested the proposed approach on two challenging datasets, the fifteen scene categories dataset and the LSCOM dataset, and obtained promising results.


Framework

There are three steps in the training phase. First, a codebook of visual words is generated from a subset of the training images, using the k-means algorithm to group patches based on their visual similarity. Second, the visual words are further grouped into semantic concepts based on their intrinsic semantic relationships. Finally, each patch is assigned to a concept. We use a spatial correlogram to capture the distribution of the concepts.
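The first and last training steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the patch descriptors are random vectors, the function names (`assign`, `build_codebook`, `concept_histogram`) are hypothetical, and the visterm-to-concept mapping is a hard-coded placeholder standing in for the grouping that the second step would produce. The spatial correlogram is not modeled here; the sketch only builds a plain histogram of concepts per image.

```python
import numpy as np

def assign(patches, centers):
    """Index of the nearest center (visual word) for every patch descriptor."""
    d2 = ((patches[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def build_codebook(patches, k, iters=20, seed=0):
    """Step 1: plain Lloyd's k-means over patch descriptors."""
    rng = np.random.default_rng(seed)
    centers = patches[rng.choice(len(patches), size=k, replace=False)].copy()
    for _ in range(iters):
        labels = assign(patches, centers)
        for j in range(k):
            members = patches[labels == j]
            if len(members):  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return centers

def concept_histogram(patches, centers, visterm_to_concept, n_concepts):
    """Step 3: map each patch to a visterm, then to a concept, and count."""
    visterms = assign(patches, centers)
    concepts = visterm_to_concept[visterms]
    hist = np.bincount(concepts, minlength=n_concepts).astype(float)
    return hist / hist.sum()

# Synthetic stand-ins for real patch descriptors (assumption).
rng = np.random.default_rng(0)
train_patches = rng.normal(size=(200, 8))
codebook = build_codebook(train_patches, k=10)

# Placeholder for step 2: a made-up mapping from 10 visterms to 3 concepts.
visterm_to_concept = np.arange(10) % 3

image_patches = rng.normal(size=(40, 8))
boc = concept_histogram(image_patches, codebook, visterm_to_concept, n_concepts=3)
```

The resulting vector `boc` has one entry per concept rather than one per visual word, which is where the dimensionality reduction of the concept-based representation comes from.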

Objective Function of co-clustering

The objective function of the co-clustering is

If we define a new joint distribution q(x,y), then
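The two statements above can be written out as follows, assuming the standard information-theoretic co-clustering formulation of Dhillon, Mallela and Modha (2003), on which MMI co-clustering is understood to build. The symbols are assumptions of this sketch: X ranges over visterms, Y over images, and \hat{X}, \hat{Y} are their cluster variables.

```latex
% Objective: choose the cluster maps C_X, C_Y that lose the least mutual information
\min_{C_X,\, C_Y} \; I(X;Y) - I(\hat{X};\hat{Y})

% Approximating joint distribution induced by the co-clustering
q(x,y) = p(\hat{x},\hat{y})\, p(x \mid \hat{x})\, p(y \mid \hat{y}),
\qquad x \in \hat{x},\; y \in \hat{y}

% The loss in mutual information equals a KL divergence
I(X;Y) - I(\hat{X};\hat{Y}) = D_{\mathrm{KL}}\bigl(p(x,y) \,\|\, q(x,y)\bigr)
```

Under this formulation, maximizing the mutual information retained by the clusters is the same as making q(x,y) as close as possible to the original joint distribution p(x,y).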

Then the co-clustering process can be demonstrated by the graphical model as
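Independently of the graphical model, the objective can be checked numerically on a toy example. The sketch below assumes the usual reading of the objective as the mutual information lost when rows (visterms) and columns (images) of a joint distribution are merged into clusters. The 4x4 joint distribution and both cluster assignments are invented for illustration, and the function names are hypothetical.

```python
import numpy as np

def mutual_information(p):
    """I(X;Y) in nats for a joint distribution given as a 2-D array."""
    px = p.sum(axis=1, keepdims=True)   # marginal over rows, shape (n, 1)
    py = p.sum(axis=0, keepdims=True)   # marginal over columns, shape (1, m)
    mask = p > 0                        # skip zero cells (0 * log 0 = 0)
    return float((p[mask] * np.log(p[mask] / (px @ py)[mask])).sum())

def clustered_joint(p, row_clusters, col_clusters, n_row, n_col):
    """Sum p(x, y) inside every (row cluster, column cluster) block."""
    R = np.eye(n_row)[np.asarray(row_clusters)]   # one-hot row membership
    C = np.eye(n_col)[np.asarray(col_clusters)]   # one-hot column membership
    return R.T @ p @ C

def information_loss(p, row_clusters, col_clusters, n_row, n_col):
    """Mutual information lost by replacing p with its clustered version."""
    p_hat = clustered_joint(p, row_clusters, col_clusters, n_row, n_col)
    return mutual_information(p) - mutual_information(p_hat)

# Toy joint distribution with two obvious blocks (made-up numbers).
p = np.array([
    [0.10, 0.10, 0.00, 0.00],
    [0.10, 0.10, 0.00, 0.00],
    [0.00, 0.00, 0.15, 0.15],
    [0.00, 0.00, 0.15, 0.15],
])

# A co-clustering that follows the blocks, and one that cuts across them.
loss_good = information_loss(p, [0, 0, 1, 1], [0, 0, 1, 1], 2, 2)
loss_bad = information_loss(p, [0, 1, 0, 1], [0, 1, 0, 1], 2, 2)
```

The assignment that follows the block structure loses essentially no mutual information, while the one that cuts across the blocks loses all of it, which is the behavior the objective is meant to reward and penalize.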



Results

We have applied our method to the publicly available 15 scene dataset.

1. The 15 scene category examples

2. Comparison between BOV (k-means) and BOC (MMI co-clustering)

3. Performance comparison between original BOV (without dimension reduction) and BOC

a) Comparison among different sampling spaces

b) Comparison between weak classifier and strong classifier

4. Examples of BOC model


5. Confusion table for spatial correlogram + BOC model

6. Experimental results on LSCOM dataset (28 categories)


  1. Paper
  2. PowerPoint Presentation
  3. Poster