


No. 9. Vol. 25. 2019

DOI: 10.17587/it.25.538-544

I. S. Grechikhin, Postgraduate Student, Senior Lecturer, A. V. Savchenko, Doctor of Sciences, Professor, National Research University Higher School of Economics, Nizhny Novgorod

Analysis of User Preferences using Photos and Videos from Mobile Device Based on Object Detection and Neural Networks

This paper addresses the problem of predicting user preferences from the gallery of the user's mobile device. We consider the following categories of interests: interior items, food, transport, and sports equipment. A novel two-stage method is proposed. At the first stage, facial regions are detected in all photos and videos, and feature vectors are extracted from them using deep convolutional neural networks. These feature vectors are grouped using known agglomerative clustering techniques. Finally, we select public photos and videos that do not contain faces from the large clusters. At the second stage, these public images are processed on a remote server using high-precision Faster R-CNN object detectors. Objects in the remaining (personal) images are detected on the mobile device in offline mode using SSDLite and MobileNet. In the experimental study, several neural network-based detectors were trained on a combined training sample from the MS COCO, ImageNet, and Open Images datasets. Their comparative analysis demonstrated that the Faster R-CNN-based models achieve 30% higher recall than the SSDLite detectors. However, the latter models process each image 3 to 9 times faster. Finally, we present experimental results of face clustering on the GFW (Grouping Faces in the Wild) dataset using either existing feature descriptors (VGGFace, VGGFace2) or a pre-trained MobileNet. The latter model with average-link hierarchical clustering achieved the highest B-cubed F-measure.
Keywords: image processing, object detection, mobile systems, visual preferences prediction, face clustering, convolutional neural networks (CNN), Faster R-CNN, SSD
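
The first stage described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that face descriptors have already been extracted by some CNN, that they are L2-normalized and compared by Euclidean distance, and that an image counts as personal when it contains at least one face from a large cluster. The function names `cluster_faces` and `split_gallery` and the parameters `distance_threshold` and `min_cluster_size` are hypothetical and do not come from the paper.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def cluster_faces(features, distance_threshold):
    """Group face descriptors with average-link agglomerative clustering.

    Assumption: descriptors are L2-normalized and compared with Euclidean
    distance; the abstract does not state the metric or the threshold.
    """
    features = np.asarray(features, dtype=float)
    if len(features) < 2:
        # linkage() needs at least two observations.
        return np.ones(len(features), dtype=int)
    features = features / np.linalg.norm(features, axis=1, keepdims=True)
    tree = linkage(features, method="average", metric="euclidean")
    # Cut the dendrogram where the average-link distance exceeds the threshold.
    return fcluster(tree, t=distance_threshold, criterion="distance")


def split_gallery(face_image_ids, labels, all_image_ids, min_cluster_size):
    """Split the gallery into (public, personal) sets of image ids.

    face_image_ids[i] is the image in which face i was detected.
    Assumption: a cluster is "large" if it holds at least min_cluster_size
    faces, and an image is personal if it shows a face from a large cluster.
    """
    cluster_ids, counts = np.unique(labels, return_counts=True)
    large = set(cluster_ids[counts >= min_cluster_size].tolist())
    personal = {
        image_id
        for image_id, label in zip(face_image_ids, labels)
        if int(label) in large
    }
    public = set(all_image_ids) - personal
    return public, personal
```

Under these assumptions, the `public` set would be sent to the server-side detector, while the `personal` set would be processed on the device. Both the distance threshold and the minimum cluster size would have to be tuned on validation data.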


Acknowledgments. The article was prepared within the framework of the Academic Fund Program at the National Research University Higher School of Economics (HSE University) in 2019 (grant No. 19-04-004) and was supported by the Russian Academic Excellence Project "5-100".
