Abstract

Theories and models of saliency that predict where people look focus on regular-density scenes. A crowded scene is characterized by the co-occurrence of a relatively large number of regions/objects that would have stood out in a regular scene, and what drives attention in a crowd can differ significantly from the conclusions drawn in the regular setting. This work presents a first focused study on saliency in crowd. To facilitate the study of saliency in crowd, a new dataset of 500 images is constructed with eye-tracking data from 16 viewers and annotation data on faces. Statistical analyses point to key observations on features and mechanisms of saliency in scenes with different crowd levels and provide insights as to whether conventional saliency models hold in crowded scenes. Finally, a new model for saliency prediction that takes into account the crowding information is proposed, and multiple kernel learning (MKL) is used as a core computational module to integrate various features at both low and high levels. Extensive experiments demonstrate the superior performance of the proposed model compared with the state of the art in saliency computation.

Resources

Paper: Ming Jiang, Juan Xu, and Qi Zhao, "Saliency in Crowd," in ECCV 2014 [pdf] [bib] [poster]

Image Stimuli: 500 Images (178 MB)
Eye-tracking Data: Matlab MAT (1.2 MB)
Labelled Faces and Attributes: Matlab MAT (0.6 MB)

Code: GitHub repository for "Saliency in Crowd," ECCV 2014. It contains crowd feature computation, crowd statistics calculation, saliency model training with MKL, saliency prediction, and evaluation measures.
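The page does not say which evaluation measures the code implements. As an illustration only, the sketch below computes Normalized Scanpath Saliency (NSS), a measure commonly used to score a predicted saliency map against recorded fixations. It is written in Python with NumPy and is not taken from the repository. The function name `nss` and the (row, col) fixation format are assumptions made for this example.

```python
import numpy as np


def nss(saliency_map, fixations):
    """Normalized Scanpath Saliency (illustrative sketch, not the authors' code).

    saliency_map: 2-D array of predicted saliency values.
    fixations: iterable of (row, col) integer pixel coordinates (assumed format).
    """
    s = np.asarray(saliency_map, dtype=float)
    pts = np.asarray(list(fixations), dtype=int).reshape(-1, 2)
    if pts.shape[0] == 0:
        raise ValueError("at least one fixation is required")

    std = s.std()
    if std == 0:
        # A constant map carries no information about fixation locations.
        return 0.0

    # Standardize the map to zero mean and unit standard deviation.
    z = (s - s.mean()) / std

    # Average the standardized saliency at the fixated pixels.
    return float(z[pts[:, 0], pts[:, 1]].mean())


if __name__ == "__main__":
    prediction = np.array([[0.0, 0.0],
                           [0.0, 1.0]])
    print(nss(prediction, [(1, 1)]))  # fixation on the salient pixel: positive score
    print(nss(prediction, [(0, 0)]))  # fixation on a non-salient pixel: negative score
```

A positive score means the fixated pixels have above-average predicted saliency. A score near zero means the prediction does no better than chance at those locations.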

Video Spotlight

Download Video (1'00'', 47 MB)

Results