Paper Notes: Discovering Human Places of Interest from Multimodal Mobile Phone Data

Notes from reading the below paper.


A framework to discover places of interest (POIs).

Two levels of clustering are used.

First, user location points are grouped using a time-based clustering
technique which discovers stay points while dealing with missing
location data. The second level performs clustering on the stay
points to obtain stay regions. A grid-based clustering algorithm
has been used for this purpose.


1. Location Point – GPS coordinates along with a timestamp, e.g. [46.6N, 6.5E], [16:34:57]

2. Stay Point – a cluster of location points. Represented using the coordinates of the centroid of the cluster and the times when the user arrived at and left the stay, e.g. [46.6N, 6.5E], [16:34:57], [17:50:12]

3. Stay Region – a cluster of stay points (from several days) with the same semantic meaning. Represented using the coordinates of the centroid of the cluster and the minimum and maximum coordinates of the stay points in the cluster, e.g. [46.6N, 6.5E], [46.595N, 46.599N], [6.498E, 6.502E]. Hence a stay region can be represented by a rectangle centered at the centroid of the cluster, whose size depends on the min and max coordinates.

(In this paper, stay region and place of interest are synonymous.)

Fingerprint-based methods for location point detection:

These use data from GSM and Wi-Fi sensors, as opposed to geometry-based methods that use GPS sensors, but they cannot recover an exact location.
1. BeaconPrint

2. PlaceSense

Time-based method for stay points:

Dmax – maximum distance between location points in a stay – 200 to 300 metres

Tmin – minimum time at a location – 20 to 40 minutes

Tmax – maximum time between consecutive location points
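The time-based clustering can be sketched roughly as below. This is a minimal illustration under my own assumptions: a greedy loop that grows a cluster while points stay within Dmax of the cluster's first point and emits a stay point once the dwell time reaches Tmin; the Tmax handling of missing data is omitted.

```python
from math import radians, sin, cos, asin, sqrt

def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) pairs."""
    lat1, lon1, lat2, lon2 = map(radians, (*p, *q))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371000 * asin(sqrt(a))

def stay_points(points, d_max=250.0, t_min=20 * 60):
    """points: time-ordered list of ((lat, lon), unix_time).

    Grow a cluster while each new point is within d_max metres of the
    cluster's first point; emit a stay point (centroid, arrival,
    departure) when the dwell time reaches t_min seconds.
    """
    stays, i, n = [], 0, len(points)
    while i < n:
        j = i + 1
        while j < n and haversine_m(points[i][0], points[j][0]) <= d_max:
            j += 1
        if points[j - 1][1] - points[i][1] >= t_min:
            cluster = points[i:j]
            lat = sum(p[0][0] for p in cluster) / len(cluster)
            lon = sum(p[0][1] for p in cluster) / len(cluster)
            stays.append(((lat, lon), cluster[0][1], cluster[-1][1]))
        i = j
    return stays
```

With two nearby points 30 minutes apart followed by a distant one, this yields a single stay point covering the first two.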

Clustering algorithms for stay regions:

To estimate stay regions from stay points, a couple of clustering algorithms have been used in the literature.
1. k-means or variants

2. density based method (DBSCAN) – disadvantage: merges stay points with different semantic meanings into the same cluster.

3. grid based method (this performs the best)
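A grid-based variant can be sketched as follows. This is my own simplified version (fixed-size cells plus an 8-connected flood fill over non-empty cells), not the paper's exact algorithm; the `cell_deg` size is an assumption.

```python
from collections import defaultdict
from math import floor

def grid_stay_regions(points, cell_deg=0.0005):
    """points: (lat, lon) stay-point centroids from several days.

    Bucket points into fixed-size grid cells, then flood-fill adjacent
    non-empty cells into one region.  Each region is returned as
    (centroid, (lat_min, lat_max), (lon_min, lon_max)) -- the
    rectangle representation described above.
    """
    cells = defaultdict(list)
    for lat, lon in points:
        cells[(floor(lat / cell_deg), floor(lon / cell_deg))].append((lat, lon))
    regions, seen = [], set()
    for start in cells:
        if start in seen:
            continue
        seen.add(start)
        stack, members = [start], []
        while stack:
            ci, cj = stack.pop()
            members += cells[(ci, cj)]
            # visit the 8 neighbouring cells that also contain points
            for ni in (ci - 1, ci, ci + 1):
                for nj in (cj - 1, cj, cj + 1):
                    if (ni, nj) in cells and (ni, nj) not in seen:
                        seen.add((ni, nj))
                        stack.append((ni, nj))
        lats = sorted(p[0] for p in members)
        lons = sorted(p[1] for p in members)
        centroid = (sum(lats) / len(lats), sum(lons) / len(lons))
        regions.append((centroid, (lats[0], lats[-1]), (lons[0], lons[-1])))
    return regions
```

Two stay points a few metres apart collapse into one region, while a distant one forms its own, which is how the grid avoids DBSCAN's over-merging of semantically different places.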

Paper Notes: Ranking annotators for crowdsourced labeling tasks

Notes from reading the below paper by Raykar et al.

1. To get quality labels, requesters typically get the same instance labeled by multiple annotators and use majority voting or other methods to consolidate the annotations.

2. This paper ranks annotators by how spammer-like they are, using a scalar score: spammers score close to 0, good annotators close to 1.

3. This score is different from the accuracy computed from the confusion rate matrix.

Annotator Model:

The paper uses binary labeling as the running example. If the ground truth is 1, the sensitivity (alpha) is the probability that the annotator labels the instance 1; if the ground truth is 0, the specificity (beta) is the probability that the annotator labels it 0.

Spammer: assigns labels irrespective of the true label, i.e. the probability of observing label 1 is the same whether the true label is 0 or 1. A perfect spammer has alpha + beta – 1 = 0, which corresponds to the diagonal line of the ROC curve.

Malicious annotator: if alpha + beta – 1 < 0, the annotator is malicious. Hence the spammer score is defined as S = (alpha + beta – 1)^2. A perfect spammer has S = 0; a perfect annotator has S = 1.

Accuracy: The above spammer score is different from accuracy. An annotator with high accuracy is a good annotator, but a low-accuracy annotator is not necessarily a spammer: a malicious annotator can attain high accuracy if we flip their labels.
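A tiny sketch contrasting the two quantities (the prevalence parameter `p1` and the function names are my own, for illustration):

```python
def spammer_score(alpha, beta):
    # S = (alpha + beta - 1)^2: 0 for a perfect spammer,
    # 1 for a perfect (or perfectly malicious) annotator.
    return (alpha + beta - 1) ** 2

def accuracy(alpha, beta, p1=0.5):
    # Fraction of labels matching the truth, assuming
    # class prevalence p1 = P(true label = 1).
    return p1 * alpha + (1 - p1) * beta
```

For example, a spammer with alpha = 0.9, beta = 0.1 has accuracy 0.5 but S = 0, while a malicious annotator with alpha = beta = 0.1 has accuracy 0.1 but S = 0.64, reflecting that flipping its labels makes it informative.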

A spammer provides no information when updating the log odds of the true label, and hence does not contribute to the estimation of the actual true label.
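To see why, consider the contribution of a single annotator's label to the log odds that the true label is 1 (a sketch; the function name is mine):

```python
from math import log

def log_odds_contribution(y, alpha, beta):
    """Amount added to the prior log odds of P(true = 1) after
    observing one annotator's label y.  For a spammer
    (alpha + beta = 1) both branches are log(1) = 0, so the
    label carries no information."""
    if y == 1:
        return log(alpha / (1 - beta))   # likelihood ratio for y = 1
    return log((1 - alpha) / beta)       # likelihood ratio for y = 0
```

A good annotator (alpha = beta = 0.9) pushes the log odds up on a 1-label and down on a 0-label; a spammer (e.g. alpha = 0.7, beta = 0.3) moves them not at all.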

Spammer score for categorical labels:

For multinomial categories, alpha(k, c) is the probability that the annotator assigns an instance to category k when the true label is c. A spammer's assignment does not depend on the true label: alpha(k, c) = alpha(k, c′) for all pairs of true labels c and c′.
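The spammer condition for categorical labels says the rows of the confusion matrix are identical. A simple way to illustrate this (my own quantity, not the paper's exact multi-class spammer score):

```python
def row_divergence(alpha):
    """alpha[c][k] = P(annotator says k | true label is c).
    A spammer's rows are identical (the assignment ignores the true
    label), so the mean pairwise squared row difference is exactly 0.
    Illustration of the spammer condition only, not the paper's score.
    """
    K = len(alpha)
    total, pairs = 0.0, 0
    for c in range(K):
        for cp in range(c + 1, K):
            total += sum((alpha[c][k] - alpha[cp][k]) ** 2
                         for k in range(len(alpha[c])))
            pairs += 1
    return total / pairs
```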

Spammer score for ordinal labels:

Note: a restaurant rating from 1 to 5 is ordinal. A spammer score can be computed for ordinal labels much as in the binary case, except that a score is assigned at each ordinal level.


The experiments use the Expectation Maximization (EM) algorithm: the annotator model parameters are estimated with EM and then used to calculate the spammer score. The authors report 95% confidence intervals for each annotator's spammer score.
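A compact sketch of this estimate-then-score pipeline for binary labels. This is my own simplified EM in the Dawid–Skene/Raykar style (majority-vote initialisation, no confidence-interval computation), not the paper's exact implementation:

```python
def em_annotators(labels, iters=50):
    """labels[i][j]: binary label from annotator j on instance i.

    EM: initialise the posterior mu_i = P(true label_i = 1) with a
    soft majority vote, then alternate estimating each annotator's
    (alpha_j, beta_j) and re-estimating mu.
    """
    n, m = len(labels), len(labels[0])
    mu = [sum(row) / m for row in labels]          # soft majority vote
    alpha, beta = [0.5] * m, [0.5] * m
    for _ in range(iters):
        # M-step: sensitivity and specificity per annotator,
        # weighted by the current posteriors (clamped away from 0/1).
        for j in range(m):
            a = sum(mu[i] * labels[i][j] for i in range(n)) / (sum(mu) or 1e-9)
            b = sum((1 - mu[i]) * (1 - labels[i][j]) for i in range(n)) / \
                (sum(1 - x for x in mu) or 1e-9)
            alpha[j] = min(max(a, 1e-6), 1 - 1e-6)
            beta[j] = min(max(b, 1e-6), 1 - 1e-6)
        # E-step: posterior of the true label given all annotators.
        p = sum(mu) / n                             # class prevalence
        for i in range(n):
            a, b = p, 1 - p
            for j in range(m):
                y = labels[i][j]
                a *= alpha[j] if y else (1 - alpha[j])
                b *= (1 - beta[j]) if y else beta[j]
            mu[i] = a / (a + b)
    return alpha, beta, mu
```

The spammer score for annotator j is then `(alpha[j] + beta[j] - 1) ** 2`. On synthetic data where two annotators track the truth and a third is noisy, the noisy one gets the lower score.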

Paper Notes: Learning from Crowds

Notes from the above named paper by Raykar et al.


We do not have the ground truth, i.e. the actual annotation for an
instance. We instead have multiple subjective annotations for the
same instance from several annotators.

Contributions of this paper:

1. How to adapt conventional supervised learning algorithms when we have
multiple annotators providing subjective labels but no objective gold standard.

2. How to evaluate systems when we do not have an absolute gold standard.

3. How to estimate how reliable or trustworthy each annotator is.

Problem with Majority voting:

Imagine there is one expert and a number of novices. Majority voting will
favor the novices, since they outnumber the expert. A solution is to weight
the annotators, but how do we measure the expert's performance when there is
no gold standard?

Maximum Likelihood Estimator – jointly learns the classifier, the annotator accuracies, and the true labels.

1. Performance of each annotator is measured in terms of sensitivity and specificity with respect to the unknown gold standard.

2. An algorithm to automatically discover the best experts and assign higher weights to them.

3. To incorporate prior knowledge about an expert, impose a beta prior on
the sensitivity and specificity and derive the maximum a-posteriori estimate.

4. Final estimation by an Expectation maximization algorithm.

Note 1: Can handle missing labels (each expert is not required to label each instance).

Note 2: Proposed for logistic regression, but it can handle categorical, ordinal and regression problems.

Two coin model for annotators:

The assumption is that sensitivity and specificity do not depend on the instance x.
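The two-coin generative model can be sketched as follows (function name mine): the annotator flips one biased coin when the true label is 1 and a different one when it is 0, and neither coin looks at the instance x.

```python
import random

def two_coin_label(true_y, alpha, beta, rng):
    """Draw one annotator label under the two-coin model: if the
    true label is 1, report 1 with probability alpha (sensitivity);
    if it is 0, report 0 with probability beta (specificity).
    Neither coin depends on the instance x."""
    if true_y == 1:
        return 1 if rng.random() < alpha else 0
    return 0 if rng.random() < beta else 1
```

Simulating an annotator with alpha = 0.8, beta = 0.7 on positive instances yields roughly 80% ones, matching the sensitivity by construction.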
