Research group

The project research group consists of the master's candidate Belén Saldías and her two advisors, Karim Pichara and Pavlos Protopapas.

The current investigation develops experimental and computational approaches for combining Bayesian inference in probabilistic graphical models with crowdsourcing, with the aim of obtaining the most accurate classification possible.

(From left to right: Christian Pieringer, Pavlos Protopapas, Belén Saldías and Karim Pichara.)


This website has been developed to collect data for Belén's master's thesis research. The collected votes will be used to test our proposed Bayesian approach to crowdsourcing under different questioning strategies.


In machine learning and statistics, classification is the problem of identifying to which of a set of categories (sub-populations) a new observation belongs.

An example would be assigning a given email to the "spam" or "non-spam" class, or assigning a diagnosis to a given patient based on observed characteristics of the patient (gender, blood pressure, presence or absence of certain symptoms, etc.).

Types of questions

Let's say there are three classes: cat, dog and raccoon. The main goal is to classify all the images that each user is shown. To classify them, each user is randomly presented with two types of questions:

  • Yes or No questions: for each object, each user will answer one, two or three of these questions:

Is this a cat? yes or no
Is this a dog? yes or no
Is this a raccoon? yes or no

  • A or B or C questions: for some objects (at most 10% of the questions), each user will answer this question:

What is the class of this object? cat or dog or raccoon
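
The question scheme above can be sketched as follows. This is only an illustration under our own assumptions, not the code that runs this website: the function name `build_questions` and the way objects are picked for the A or B or C question are hypothetical. It asks one to three yes/no questions per object and then adds A or B or C questions so that they make up at most 10% of the total.

```python
import random

CLASSES = ["cat", "dog", "raccoon"]


def build_questions(object_ids, rng=None):
    """Return a shuffled list of (object_id, question_text) pairs (hypothetical helper)."""
    if rng is None:
        rng = random.Random()
    object_ids = list(object_ids)
    questions = []
    for obj in object_ids:
        # One, two or three yes/no questions per object, each about a different class.
        how_many = rng.randint(1, len(CLASSES))
        for cls in rng.sample(CLASSES, how_many):
            questions.append((obj, f"Is this a {cls}? yes or no"))
    # Keep A-or-B-or-C questions at no more than 10% of the final total:
    # n_abc / (n_yes_no + n_abc) <= 0.10  is equivalent to  n_abc <= n_yes_no / 9.
    n_abc = min(len(object_ids), len(questions) // 9)
    for obj in rng.sample(object_ids, n_abc):
        text = "What is the class of this object? " + " or ".join(CLASSES)
        questions.append((obj, text))
    rng.shuffle(questions)  # present the questions in random order
    return questions


if __name__ == "__main__":
    for obj, text in build_questions(range(10), random.Random(0)):
        print(obj, text)
```

Passing a seeded `random.Random` makes the series reproducible, which is convenient when comparing questioning strategies.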

Research objectives

Classification tasks may present the following challenges: there are very few training examples, generating training instances is very costly, and training sets are far from being representative of the data. The method under study proposes a potentially cheaper way of asking questions while maintaining accuracy.

In this project, we are developing a probabilistic model that generates low-cost questions that produce partial labels, increasing the size of the training set with fewer resources while keeping the accuracy of the final classifier. This project will only be feasible with your help: we need your talent as a labeler!

Goal: Find the correct class of a given object. We select a subset of the classified objects to be labeled again by a crowd of experts. Since each expert errs differently, and each expert's expertise is assumed to be independent of the rest of the crowd, each expert has its own credibility and its own error (confusion) matrix.
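
To illustrate how a per-expert confusion matrix can feed a Bayesian estimate of an object's class, here is a minimal sketch. It is not the model developed in the thesis. It assumes that each expert's matrix entry `confusion[z][k]` is the probability that the expert says class `k` when the true class is `z`, that a "yes" to "Is this a k?" has that same probability, and that votes are independent given the true class. The function name `class_posterior` and the example matrices are hypothetical.

```python
import math

CLASSES = ["cat", "dog", "raccoon"]


def class_posterior(prior, votes):
    """Posterior over the true class given yes/no votes.

    votes: list of (confusion, asked_class, said_yes) tuples, where
    confusion[z][k] is assumed to be P(expert says class k | true class z).
    All probabilities must lie strictly between 0 and 1.
    """
    log_post = [math.log(p) for p in prior]
    for confusion, asked_class, said_yes in votes:
        k = CLASSES.index(asked_class)
        for z in range(len(CLASSES)):
            p_yes = confusion[z][k]  # P(answer "yes" | true class z), by assumption
            log_post[z] += math.log(p_yes if said_yes else 1.0 - p_yes)
    # Normalize in a numerically stable way.
    top = max(log_post)
    weights = [math.exp(v - top) for v in log_post]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Made-up matrices for illustration only.
    reliable = [[0.90, 0.05, 0.05], [0.05, 0.90, 0.05], [0.05, 0.05, 0.90]]
    noisy = [[0.50, 0.25, 0.25], [0.25, 0.50, 0.25], [0.25, 0.25, 0.50]]
    votes = [(reliable, "cat", True), (noisy, "dog", True)]
    print(dict(zip(CLASSES, class_posterior([1 / 3, 1 / 3, 1 / 3], votes))))
```

In this toy example, the "yes" from the reliable expert outweighs the conflicting "yes" from the noisy one, so most of the posterior mass goes to "cat".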

Side Goal: Estimate the credibility of each user. The current astronomical data set, in which the classes of stars appear in different proportions, will be used to measure performance with real votes from human labelers. The main results so far (obtained with automatic classifiers) show that it is possible to ask for less information while keeping the accuracy of the usual way of labeling data (asking directly for the class of a given object).


Each user is presented with a random series of questions. For each answered question, the labeler is rewarded with a number of points that depends on how accurate the user has been up to the last question. The accuracy score of each labeler is updated only on certain objects (those for which the research group knows the ground truth) and is the proportion of right answers.
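
The accuracy score described above can be sketched as a running proportion that changes only when the ground truth is known. The class name `LabelerScore` is hypothetical, and the point rewards are left out because their exact rule is not stated here.

```python
class LabelerScore:
    """Running accuracy of one labeler (hypothetical helper for illustration)."""

    def __init__(self):
        self.checked = 0  # answers on objects with known ground truth
        self.correct = 0  # how many of those answers were right

    def record(self, answer, truth=None):
        """Register an answer; the score changes only if the truth is known."""
        if truth is None:
            return
        self.checked += 1
        if answer == truth:
            self.correct += 1

    @property
    def accuracy(self):
        """Proportion of right answers, or None before any checked answer."""
        if self.checked == 0:
            return None
        return self.correct / self.checked


if __name__ == "__main__":
    score = LabelerScore()
    score.record("cat", truth="cat")  # right answer on a checked object
    score.record("dog", truth="cat")  # wrong answer on a checked object
    score.record("dog")               # ground truth unknown: ignored
    print(score.accuracy)
```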

There are eight different badges (rookie, apprentice, ace, champion, master, grand master, guru, and luminary), which depend on the labeler's accuracy over time.

The champion

The champion of each contest will be the user who has the most points after finishing all of their questions. In case of a tie, the winner will be the most accurate of the tied users at the end of the project.
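
The rule above amounts to ranking by points first and by accuracy second, among users who have finished all their questions. A minimal sketch, with a hypothetical function name and record layout:

```python
def pick_champion(labelers):
    """Return the winning record, or None if nobody has finished.

    labelers: list of dicts with hypothetical keys 'name', 'points',
    'accuracy' (0 to 1) and 'progress' (0 to 1).
    """
    finished = [user for user in labelers if user["progress"] >= 1.0]
    if not finished:
        return None
    # Most points wins; among tied users, the more accurate one wins.
    return max(finished, key=lambda user: (user["points"], user["accuracy"]))


if __name__ == "__main__":
    users = [
        {"name": "a", "points": 120, "accuracy": 0.80, "progress": 1.0},
        {"name": "b", "points": 120, "accuracy": 0.90, "progress": 1.0},
        {"name": "c", "points": 300, "accuracy": 0.95, "progress": 0.6},
    ]
    print(pick_champion(users)["name"])
```

Here "c" has the most points but has not finished, and "a" and "b" are tied on points, so the more accurate "b" is selected.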


Catalina DB: March 17, 2017 (inclusive). Only users with 100% progress compete for the prize.