NEU_MITLL @ TRECVid 2015: Multimedia Event Detection by Pre-trained CNN Models
MIT Lincoln Laboratory, Lexington, United States
We introduce a framework for multimedia event detection (MED), developed for TRECVID 2015, that uses convolutional neural networks (CNNs) to detect complex events via deterministic models trained on video frame data. We used several well-known CNN models designed to detect objects, scenes, and a combination of both (i.e., Hybrid-CNN). We also experimented with fusing features from different networks in different ways. The best score was achieved by fusing object and scene detections at the feature level (i.e., early fusion), resulting in a mean average precision (MAP) of 16.02. The results show that our framework can detect various complex events in videos even when there are only a few instances of each within a large video search pool.
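The distinction between feature-level (early) fusion and score-level (late) fusion, and the average precision metric behind MAP, can be illustrated with a minimal sketch. The abstract does not describe how frame features were pooled, normalized, or classified, so everything below is an assumption made for illustration: the function names, the L2 normalization before concatenation, and the equal-weight score averaging are hypothetical and are not taken from the report.

```python
import numpy as np


def l2_normalize(feats, eps=1e-12):
    """Scale each row (one video) to unit L2 norm. Assumed preprocessing step."""
    norms = np.linalg.norm(feats, axis=1, keepdims=True)
    return feats / np.maximum(norms, eps)


def early_fusion(object_feats, scene_feats):
    """Feature-level fusion: concatenate per-video features from two networks.

    A single classifier would then be trained on the fused vectors.
    """
    return np.hstack([l2_normalize(object_feats), l2_normalize(scene_feats)])


def late_fusion(object_scores, scene_scores, weight=0.5):
    """Score-level fusion: combine the outputs of two separately trained classifiers."""
    object_scores = np.asarray(object_scores, dtype=float)
    scene_scores = np.asarray(scene_scores, dtype=float)
    return weight * object_scores + (1.0 - weight) * scene_scores


def average_precision(labels, scores):
    """Average precision for one event: mean of precision at each relevant rank."""
    labels = np.asarray(labels)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="stable")   # rank videos by descending score
    ranked = labels[order]
    hits = np.cumsum(ranked)                     # relevant videos seen so far
    ranks = np.arange(1, len(ranked) + 1)
    relevant = ranked == 1
    if not relevant.any():
        return 0.0
    return float((hits[relevant] / ranks[relevant]).mean())


def mean_average_precision(per_event_labels, per_event_scores):
    """MAP: average of the per-event average precision values."""
    aps = [average_precision(l, s) for l, s in zip(per_event_labels, per_event_scores)]
    return float(np.mean(aps))
```

With hypothetical 4-dimensional object features and 2-dimensional scene features for three videos, `early_fusion` returns a 3 x 6 matrix, whereas `late_fusion` operates only on the score vectors produced after classification. MAP is then the mean of `average_precision` over all events.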
- Miscellaneous Detection and Detectors