Research interests: multimedia/video/image processing and analysis - multimedia content-based retrieval - machine learning for multimedia - multimedia benchmarking and evaluation - multimedia applications (e.g., video surveillance, lifelog, social media, medicine).

A Social Image Retrieval Result Diversification Dataset with User Tagging Credibility Estimation

This dataset is designed to support information retrieval research that fosters new technologies for improving both the relevance and the diversification of search results, with an explicit focus on the social media context. The dataset consists of Creative Commons data for 300 landmark locations, represented via 45,375 Flickr photos, 16M photo links for around 3,000 users, metadata, Wikipedia pages, and content descriptors for the text and visual modalities. The data is annotated for the relevance and the diversity of the photos. The dataset also includes information about user annotation credibility, which is determined as an automatic estimation of the quality (correctness) of a particular user's tags.

The dataset was validated during the 2014 Retrieving Diverse Social Images Task at the MediaEval Benchmarking Initiative for Multimedia Evaluation.

Using the dataset:

If you plan to make use of the Div150Cred dataset, or refer to its results, please acknowledge the work of the authors by citing the following papers:

  1. B. Ionescu, A. Popescu, M. Lupu, A.L. Gînscă, B. Boteanu, H. Müller, “Div150Cred: A Social Image Retrieval Result Diversification with User Tagging Credibility Dataset”, ACM Multimedia Systems - MMSys2015, 18-20 March, Portland, Oregon, USA, 2015 (6 pages).
  2. B. Ionescu, A. Popescu, M. Lupu, A.L. Gînscă, H. Müller, “Retrieving Diverse Social Images at MediaEval 2014: Challenge, Dataset and Evaluation”, MediaEval Benchmarking Initiative for Multimedia Evaluation, vol. 1263, ISSN: 1613-0073, October 16-17, Barcelona, Spain, 2014.

Download the dataset:

You can download the data from the ACM MMSys conference repository available here (look for the Div150Cred dataset):


Bogdan Ionescu, LAPI, University Politehnica of Bucharest, Romania (bionescu at; Adrian Popescu, CEA LIST, France (adrian.popescu at; Mihai Lupu, Vienna University of Technology, Austria (lupu at; Henning Müller, University of Applied Sciences Western Switzerland (HES-SO) in Sierre, Switzerland (henning.mueller at

This dataset was supported by the following projects: MUCKE, CUbRIK and PROMISE.

Many thanks to Alexandru Lucian Gînscă, Adrian Iftene, Bogdan Boteanu, Ioan Chera, Ionuț Duță, Andrei Filip, Corina Macovei, Cătălin Mitrea, Ionuț Mironică, Irina Emilia Nicolae, Ivan Eggel, Andrei Purică, Mihai Pușcaș, Oana Pleș, Gabriel Petrescu, Anca Livia Radu, and Vlad Ruxandu for their valuable help.