ACTIVE: Activity Concept Transitions in Video Event Classification (Open Access)
University of Southern California, Los Angeles, United States
The goal of high-level event classification from videos is to assign a single high-level event label to each query video. Traditional approaches represent each video as a set of low-level features and encode it into a fixed-length feature vector (e.g. Bag-of-Words), which leaves a large gap between low-level visual features and high-level events. Our paper addresses this problem by exploiting activity concept transitions in video events (ACTIVE). A video is treated as a sequence of short clips, each of which is an observation corresponding to a latent activity concept variable in a Hidden Markov Model (HMM). We propose to apply Fisher Kernel techniques so that the concept transitions over time can be encoded efficiently into a compact, fixed-length feature vector. Our approach can utilize concept annotations from independent datasets, and it works well even with a very small number of training samples. Experiments on the challenging NIST TRECVID Multimedia Event Detection (MED) dataset show that our approach performs favorably compared with the state of the art.
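To make the encoding step concrete, the following is a minimal sketch of how a Fisher score over HMM transition parameters could turn a variable-length clip sequence into a fixed-length vector. It is not the paper's implementation: the abstract does not give the exact formulation, so the gradient form used here (expected transition counts divided by the transition probability, minus the expected visits to the source state) is one common textbook form, and normalization by the Fisher information matrix is omitted. The function name `transition_fisher_vector` and the inputs `obs_lik`, `trans`, and `init` are hypothetical, and all transition probabilities are assumed to be strictly positive.

```python
import numpy as np

def transition_fisher_vector(obs_lik, trans, init):
    """Fisher score of one clip sequence w.r.t. HMM transition probabilities.

    obs_lik: (T, K) array, obs_lik[t, k] = p(clip t | concept k)  (assumed given)
    trans:   (K, K) row-stochastic transition matrix, all entries > 0
    init:    (K,) initial concept distribution
    Returns a vector of length K*K regardless of the sequence length T.
    """
    T, K = obs_lik.shape
    alpha = np.zeros((T, K))
    beta = np.ones((T, K))
    scale = np.zeros(T)

    # Scaled forward pass.
    alpha[0] = init * obs_lik[0]
    scale[0] = alpha[0].sum()
    alpha[0] /= scale[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ trans) * obs_lik[t]
        scale[t] = alpha[t].sum()
        alpha[t] /= scale[t]

    # Scaled backward pass.
    for t in range(T - 2, -1, -1):
        beta[t] = (trans @ (obs_lik[t + 1] * beta[t + 1])) / scale[t + 1]

    # Expected number of i -> j transitions under the posterior.
    xi = np.zeros((K, K))
    for t in range(T - 1):
        xi += np.outer(alpha[t], obs_lik[t + 1] * beta[t + 1]) * trans / scale[t + 1]

    # Score: expected counts / probability, minus expected visits to source state.
    visits = xi.sum(axis=1, keepdims=True)
    return (xi / trans - visits).ravel()

# Hypothetical usage: two videos of different lengths map to equal-length vectors.
rng = np.random.default_rng(0)
K = 4
A = rng.random((K, K)) + 0.1
A /= A.sum(axis=1, keepdims=True)
pi = np.full(K, 1.0 / K)
short_video = transition_fisher_vector(rng.random((6, K)) + 0.01, A, pi)
long_video = transition_fisher_vector(rng.random((40, K)) + 0.01, A, pi)
print(short_video.shape, long_video.shape)
```

The resulting vectors could then be fed to any standard classifier; in this sketch the per-clip concept likelihoods are random placeholders standing in for the outputs of concept detectors.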
- Statistics and Probability