Scene Understanding by Statistical Modeling of Motion Patterns

 

Related Publication: Imran Saleemi, Lance Hartung, and Mubarak Shah, Scene Understanding by Statistical Modeling of Motion Patterns, IEEE Conference on Computer Vision and Pattern Recognition 2010, San Francisco, CA.

 

Abstract:

We present a novel method for the discovery and statistical representation of motion patterns in a scene observed by a static camera. Related methods involving learning of patterns of activity rely on trajectories obtained from object detection and tracking systems, which are unreliable in complex scenes of crowded motion. We propose a mixture model representation of salient patterns of optical flow, and present an algorithm for learning these patterns from dense optical flow in a hierarchical, unsupervised fashion. Using low level cues of noisy optical flow, K-means is employed to initialize a Gaussian mixture model for temporally segmented clips of video. The components of this mixture are then filtered and instances of motion patterns are computed using a simple motion model, by linking components across space and time. Motion patterns are then initialized and membership of instances in different motion patterns is established by using KL divergence between mixture distributions of pattern instances. Finally, a pixel level representation of motion patterns is proposed by deriving conditional expectation of optical flow. Results of extensive experiments are presented for multiple surveillance sequences containing numerous patterns involving both pedestrian and vehicular traffic.

 

I. Problem

¨  Video sequence:

¤  Static camera

¤  Structured scene

¤  High density crowds

¤  Multiple flows

¨  Goal:

¤  Learn patterns of motion

¤  Statistical distribution

¨  Applications:

¤  Anomaly detection, prior motion model, persistent tracking

Figure: Examples of scenes to be analyzed and desirable patterns

II. Gaussian Mixture Formulation

 

¨    Compute optical flow

¨    Define

 

¨    A single Gaussian approximates a motion blob
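
For reference, a Gaussian mixture over flow observations has the standard form below. The symbols X, K, w_k, \mu_k and \Sigma_k are notation assumed here for illustration (X being one observation that combines pixel location and optical flow); the paper's own notation may differ.

```latex
p(X) = \sum_{k=1}^{K} w_k \, \mathcal{N}(X \mid \mu_k, \Sigma_k),
\qquad \sum_{k=1}^{K} w_k = 1, \quad w_k \ge 0
```

Each term of the sum is one Gaussian, i.e. one motion blob in the sense of the bullet above.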

 

III. Process

  1. Gaussian Component Estimation

¨    Temporal quantization

¨    K-means clustering in 4D space (pixel location and optical flow vector)

¨    No further optimization after K-means initialization

¨    Insensitive to choice of K

¨    Numerous, low variance clusters
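
The estimation step can be sketched as follows: flow vectors pooled from one clip are clustered with K-means, and each cluster is summarized as a Gaussian component. This is a minimal sketch, not the authors' implementation. The function names, the (x, y, magnitude, orientation) feature layout, the diagonal variances, and the synthetic data are all assumptions, and treating orientation as a plain Euclidean coordinate ignores its wrap-around.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's K-means on a list of equal-length tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of its members; leave it if the cluster is empty.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def clusters_to_components(points, labels, k):
    """Summarize each non-empty cluster as (weight, mean, per-dimension variance)."""
    components = []
    for c in range(k):
        members = [p for p, l in zip(points, labels) if l == c]
        if not members:
            continue
        n = len(members)
        cols = list(zip(*members))
        mean = tuple(sum(col) / n for col in cols)
        var = tuple(sum((v - m) ** 2 for v in col) / n for col, m in zip(cols, mean))
        components.append((n / len(points), mean, var))
    return components

# Synthetic flow for one clip (assumed data): two moving blobs,
# each sample is (x, y, magnitude, orientation).
rng = random.Random(1)
blob_a = [(rng.gauss(20, 2), rng.gauss(30, 2), rng.gauss(3, 0.2), rng.gauss(0.0, 0.1)) for _ in range(50)]
blob_b = [(rng.gauss(80, 2), rng.gauss(70, 2), rng.gauss(5, 0.2), rng.gauss(1.5, 0.1)) for _ in range(50)]
clip_flow = blob_a + blob_b

labels = kmeans(clip_flow, k=8)
components = clusters_to_components(clip_flow, labels, k=8)
```

K is deliberately larger than the number of blobs here, in line with the bullets above: the result is many small, low-variance clusters rather than one cluster per moving object.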

  2. Component Filtering

¨    Optical flow is noisy 

¨    Filter out components with high directional variance
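
One way to measure directional variance is the circular variance of the flow orientations inside a cluster. The sketch below is illustrative only: the use of circular variance, the function names, and the 0.2 threshold are assumptions, not values taken from the paper.

```python
import math

def circular_variance(angles):
    """Return 1 - R, where R is the mean resultant length of the angles (radians)."""
    n = len(angles)
    c = sum(math.cos(a) for a in angles) / n
    s = sum(math.sin(a) for a in angles) / n
    return 1.0 - math.hypot(c, s)

def filter_components(cluster_angles, max_variance=0.2):
    """Return the indices of clusters whose flow directions are coherent."""
    return [i for i, angles in enumerate(cluster_angles)
            if circular_variance(angles) <= max_variance]

coherent = [0.10, 0.12, 0.08, 0.11]                 # nearly parallel flow
noisy = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]  # directions cancel out
kept = filter_components([coherent, noisy])
```

Circular variance is close to 0 when all vectors point the same way and close to 1 when directions are spread around the circle, so only the first cluster survives.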

 

 

  3. Pattern Instance Estimation

¨    Sequences of components form spatiotemporal worms (instances)

¨    Pattern instances are temporally bounded

¨    A pattern itself is periodic

 

 

 

  4. Inter-component Transition

¨    Pattern instance occurs over several clips

¨    Two components i and j are linked into an instance if:

¤   i and j are temporally proximal, and

¤   j is ‘reachable’ from i
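
The two conditions above can be sketched with a constant-velocity motion model: component i is projected forward along its mean flow, and j is scored by how close it lies to the predicted position. Everything specific here is hypothetical: the dictionary keys, `frames_per_clip`, `max_clip_gap`, `sigma`, and the Gaussian-shaped score are assumptions for illustration, not the paper's transition model.

```python
import math

def transition_score(comp_i, comp_j, frames_per_clip=30, max_clip_gap=1, sigma=25.0):
    """Score in [0, 1] for component j continuing component i.

    Each component is a dict with keys: clip (index), x, y (mean position),
    rho (mean flow magnitude, pixels/frame), theta (mean orientation, radians).
    """
    gap = comp_j["clip"] - comp_i["clip"]
    # Temporal proximity: j must lie in one of the next few clips (assumed rule).
    if gap < 1 or gap > max_clip_gap:
        return 0.0
    # Reachability: move i along its mean flow for the elapsed frames.
    dt = gap * frames_per_clip
    pred_x = comp_i["x"] + comp_i["rho"] * math.cos(comp_i["theta"]) * dt
    pred_y = comp_i["y"] + comp_i["rho"] * math.sin(comp_i["theta"]) * dt
    d2 = (comp_j["x"] - pred_x) ** 2 + (comp_j["y"] - pred_y) ** 2
    return math.exp(-d2 / (2.0 * sigma ** 2))

i = {"clip": 0, "x": 10.0, "y": 10.0, "rho": 1.0, "theta": 0.0}
on_path = {"clip": 1, "x": 40.0, "y": 10.0, "rho": 1.0, "theta": 0.0}
off_path = {"clip": 1, "x": 10.0, "y": 200.0, "rho": 1.0, "theta": 0.0}
same_clip = {"clip": 0, "x": 40.0, "y": 10.0, "rho": 1.0, "theta": 0.0}

score_on = transition_score(i, on_path)
score_off = transition_score(i, off_path)
score_same = transition_score(i, same_clip)
```

A component that sits where i is predicted to arrive scores near 1, one far from the predicted position scores near 0, and a component outside the allowed clip gap scores exactly 0.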

  5. Instance Learning

¨    Define a planar graph G = (V, E)

¤   V = { components from all video clips }

¤   E = { edges between temporally proximal components, weighted by transition probability }

¨    Weakly connected component analysis on G

¨    Connected components are pattern instances
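
Weakly connected components ignore edge direction, so they can be found with a union-find structure. The sketch below assumes edges are given as (i, j, weight) triples and that weak edges are dropped below a hypothetical `min_weight` threshold; neither detail comes from the paper.

```python
def weakly_connected_components(num_nodes, edges, min_weight=0.5):
    """Group nodes joined by edges with weight >= min_weight, ignoring direction."""
    parent = list(range(num_nodes))

    def find(a):
        # Follow parent links to the root, shortening the path on the way.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for i, j, w in edges:
        if w >= min_weight:
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[rj] = ri

    groups = {}
    for node in range(num_nodes):
        groups.setdefault(find(node), []).append(node)
    return sorted(groups.values())

# Five components; the 2 -> 3 transition is too unlikely to link the two chains.
edges = [(0, 1, 0.9), (1, 2, 0.8), (3, 4, 0.7), (2, 3, 0.1)]
instances = weakly_connected_components(5, edges)
```

Each returned group is one candidate pattern instance: a chain of components linked across consecutive clips.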

 

Figure: Left: One instance each from 4 patterns. Right: More instances for each of the 4 patterns.

 

  6. Motion Patterns

¨    Multiple instances per pattern

¨    Each instance is a Gaussian mixture

¨    KL divergence defines similarity between instances

¨    KL divergence is approximated with Monte Carlo sampling

¨    Connected component analysis on the resulting graph groups instances into patterns
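
KL divergence between two Gaussian mixtures has no closed form, which is why sampling is used. A minimal sketch of the Monte Carlo estimate follows, assuming diagonal covariances and mixtures stored as lists of (weight, mean, variance) tuples. Both are simplifications chosen for brevity, and the 2D toy instances are made-up data.

```python
import math
import random

def log_gauss_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def log_mixture(x, mixture):
    """Log density of a mixture given as a list of (weight, mean, var)."""
    logs = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in mixture]
    top = max(logs)
    return top + math.log(sum(math.exp(l - top) for l in logs))  # log-sum-exp

def sample_mixture(mixture, rng):
    """Pick a component by weight, then sample each dimension independently."""
    _, mean, var = rng.choices(mixture, weights=[w for w, _, _ in mixture])[0]
    return [rng.gauss(m, math.sqrt(v)) for m, v in zip(mean, var)]

def kl_monte_carlo(p, q, n=2000, seed=0):
    """Estimate KL(p || q) as the mean of log p(x) - log q(x) over samples x ~ p."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = sample_mixture(p, rng)
        total += log_mixture(x, p) - log_mixture(x, q)
    return total / n

inst_a = [(0.5, (0.0, 0.0), (1.0, 1.0)), (0.5, (5.0, 0.0), (1.0, 1.0))]
inst_b = [(1.0, (20.0, 20.0), (1.0, 1.0))]

d_same = kl_monte_carlo(inst_a, inst_a)
d_diff = kl_monte_carlo(inst_a, inst_b)
```

KL divergence is asymmetric, so a similarity measure between instances would typically combine both directions, for example by summing KL(p || q) and KL(q || p); whether the paper does this is not stated here.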

 

 

  7. Conditional Expectation of Flow

¨    Compute the conditional expectation of flow orientation and magnitude given a pixel location
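
For a Gaussian mixture, the conditional expectation has a standard closed form. In the sketch below, a denotes the pixel location and b the flow part of an observation; the block notation (\mu_k^{a}, \Sigma_k^{ba}, and so on) and the symbol \gamma_k are assumptions introduced for illustration, and the paper's derivation may be written differently.

```latex
\mathbb{E}[\, b \mid a \,]
  = \sum_{k=1}^{K} \gamma_k(a)
    \left[ \mu_k^{b} + \Sigma_k^{ba} \left( \Sigma_k^{aa} \right)^{-1} \left( a - \mu_k^{a} \right) \right],
\qquad
\gamma_k(a)
  = \frac{w_k \, \mathcal{N}(a \mid \mu_k^{a}, \Sigma_k^{aa})}
         {\sum_{j=1}^{K} w_j \, \mathcal{N}(a \mid \mu_j^{a}, \Sigma_j^{aa})}
```

Here \gamma_k(a) is the probability that component k explains pixel a, and the bracketed term is the expected flow under that component alone. Orientation is an angle, so averaging it directly is only an approximation near the wrap-around point.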

 

IV. Experiments