Webpages are becoming an increasingly important source of visual input. Although few studies have addressed saliency in webpages, in this work we make a focused study of how humans deploy their attention when viewing webpages and, for the first time, propose a computational model designed to predict webpage saliency. We build a dataset of 149 webpages with eye-tracking data from 11 subjects who free-viewed them. Inspired by the viewing patterns observed on webpages, we integrate multi-scale feature maps that contain object blob and text representations with explicit face maps and positional bias. We use multiple kernel learning (MKL) to achieve a robust integration of the various feature maps. Experimental results show that the proposed model outperforms its counterparts in predicting webpage saliency.
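To make the idea of integrating feature maps with a positional bias concrete, here is a minimal sketch in Python with NumPy. It is not the MKL-based integration used in the model: it swaps in a plain weighted sum with hand-chosen weights purely for illustration. The function names (`normalize`, `positional_bias`, `combine_maps`), the Gaussian form of the bias, and its default centre and width are all assumptions made for this example.

```python
import numpy as np


def normalize(m):
    """Rescale a map to [0, 1]; a constant map becomes all zeros."""
    m = np.asarray(m, dtype=float)
    span = m.max() - m.min()
    if span == 0:
        return np.zeros_like(m)
    return (m - m.min()) / span


def positional_bias(shape, center=(0.5, 0.5), sigma_frac=0.25):
    """Hypothetical positional prior: an isotropic Gaussian.

    `center` is (row, col) as fractions of the image size and
    `sigma_frac` is the width as a fraction of the longer side.
    Both defaults are illustrative assumptions, not fitted values.
    """
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = center[0] * (h - 1), center[1] * (w - 1)
    sigma = sigma_frac * max(h, w)
    return np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * sigma ** 2))


def combine_maps(feature_maps, weights):
    """Weighted sum of normalized maps (a simple stand-in, not MKL)."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    stacked = np.stack([normalize(m) for m in feature_maps])
    # Contract the weight vector against the first axis of the stack.
    combined = np.tensordot(weights, stacked, axes=1)
    return normalize(combined)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    shape = (60, 80)
    # Random arrays stand in for blob, text and face feature maps.
    maps = [rng.random(shape) for _ in range(3)]
    maps.append(positional_bias(shape))
    saliency = combine_maps(maps, weights=[1.0, 1.0, 1.0, 2.0])
    print(saliency.shape)
```

In a learned setting, the fixed weights above would be replaced by parameters estimated from the eye-tracking data.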
FiWI (Fixations in Webpage Images dataset): Image Stimuli, Eye Tracking Data and Code (267M)
Distributions of the first three fixations on webpages
Fixation heat maps for three categories, visualized second by second