README for annotation data from the Usage Similarity annotation task:

The annotation data from the task is contained in the file
usim2ratings.csv. This CSV file contains the following columns of
data:

lexsub_id1,lexsub_id2,judgment,user_id,lemma

Each row of data in the file represents a single annotation.
The first two columns, lexsub_id1 and lexsub_id2, identify the two sentences
that make up the pair being annotated. These ids correspond to the ids listed
in the file lexsub_wcdata.xml located in the Data directory.
The third column lists the similarity judgment, on a 1-5 scale ranging
from least to most similar. The user_id column contains the unique id of the
annotator providing the judgment. The final column lists the lemma featured
in the pair.

In addition to the ratings supplied by the three annotators, the file also
includes the average rating for each pair across all annotators. These rows
are indicated by the user id "avg".
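As a quick illustration of the layout described above, the snippet below parses a few rows in this format and separates individual annotator judgments from the precomputed "avg" rows. The sample rows are invented for illustration only; the column names follow the header listed above, but check usim2ratings.csv itself before relying on this exact layout.

```python
import csv
import io

# Hypothetical sample in the format described above; in practice you would
# open("usim2ratings.csv") instead of this inline string.
sample = """lexsub_id1,lexsub_id2,judgment,user_id,lemma
101,102,4,ann1,bar.n
101,102,5,ann2,ann2
101,102,3,ann3,bar.n
101,102,4.0,avg,bar.n
"""

rows = list(csv.DictReader(io.StringIO(sample)))

# Individual annotator judgments (user_id is an annotator id).
individual = [r for r in rows if r["user_id"] != "avg"]

# Precomputed per-pair averages (user_id == "avg"), keyed by sentence-pair id.
averages = {(r["lexsub_id1"], r["lexsub_id2"]): float(r["judgment"])
            for r in rows if r["user_id"] == "avg"}
```

With the sample above, `individual` holds the three annotator rows and `averages` maps the pair ("101", "102") to 4.0.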

In addition to similarity ratings of 1-5, annotators also had the option of
responding "I don't know" in cases where they did not feel able to make a clear
comparison between the two sentences. All 28 pairs that received a rating of
"don't know" from one or more of the eight annotators were excluded from our
analysis and are not included in usim2ratings.csv.

Guidelines for the Usim2 annotation task can be found at:
http://www.dianamccarthy.co.uk/downloads/WordMeaningAnno2012/

The analysis of the data from this task is described as round two in
the following paper:

Katrin Erk, Diana McCarthy and Nicholas Gaylord (2013). Measuring Word Meaning
in Context. Computational Linguistics, 39(3), pp. 511-554.


