Adverbial Presupposition Triggering Dataset: Manual Validation

To validate the accuracy of the automatic extraction process, we hired three linguists (all senior PhD students in linguistics at McGill University) to annotate the PTB portion of the dataset. The aim was to obtain human validation of the extraction process, and thus some insight into its robustness, at a reasonable cost: Gigaword is too large to have annotated by experts.

The manual validation process consisted of two phases. The first was a pilot study on 100 randomly sampled sentences, intended to confirm that the hired linguists' work was satisfactory. The results of the pilot study were positive, so we proceeded to the second phase: the annotation of the full PTB corpus.

In both phases, the linguists were asked to answer the following question:

Is there a presupposition triggered by one of the adverbs "also", "again", "still", "too" and "yet"?

The annotation results gave a Fleiss' kappa of 0.9. According to Landis and Koch (1977), this score falls in the highest bracket, "almost perfect agreement" (scores between 0.81 and 1).
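
For readers unfamiliar with the measure, the following is a minimal sketch of how Fleiss' kappa can be computed from a table of per-item category counts. It is not the script used for this dataset; the function name `fleiss_kappa` and the toy counts are hypothetical and serve only to illustrate the formula for three annotators answering a yes/no question.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a table where counts[i][j] is the number of
    raters who assigned item i to category j.

    Every item must be rated by the same number of raters (at least 2).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_items == 0 or n_raters < 2:
        raise ValueError("need at least one item and two raters")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item must have the same number of ratings")

    n_categories = len(counts[0])

    # Proportion of all ratings that fall into each category.
    p_j = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]

    # Per-item agreement: fraction of rater pairs that agree on the item.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]

    p_bar = sum(p_i) / n_items          # mean observed agreement
    p_e = sum(p * p for p in p_j)       # agreement expected by chance
    if p_e == 1.0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return (p_bar - p_e) / (1.0 - p_e)


# Hypothetical toy data: columns are (yes, no) counts from three annotators.
toy_counts = [[3, 0], [2, 1], [0, 3], [1, 2]]
kappa = fleiss_kappa(toy_counts)
```

On the toy table, two of the four items are unanimous and two are split 2 to 1, so the resulting kappa is much lower than the 0.9 reported above.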

Furthermore, in 88.02% of the cases, the three linguists agreed that the positive samples (from the automatic extraction process) do indeed contain a presupposition, while in 9.33% of the cases they agreed that there was no presupposition. We discarded the remaining 2.65% of the cases, in which there was no agreement, from the PTB portion of the dataset.
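
The filtering step can be sketched as follows. This is an illustration only: it assumes that "agreement" means all three annotators gave the same answer, and the function name `split_by_agreement`, the boolean encoding of the answers, and the toy annotations are all hypothetical.

```python
from typing import Dict, List, Sequence


def split_by_agreement(
    annotations: Sequence[Sequence[bool]],
) -> Dict[str, List[int]]:
    """Group item indices by annotator agreement.

    annotations[i] holds one boolean per annotator for item i:
    True means "a presupposition is triggered", False means it is not.
    """
    groups: Dict[str, List[int]] = {
        "all_yes": [],       # unanimous: presupposition present
        "all_no": [],        # unanimous: no presupposition
        "disagreement": [],  # annotators split; discarded from the data
    }
    for index, labels in enumerate(annotations):
        if all(labels):
            groups["all_yes"].append(index)
        elif not any(labels):
            groups["all_no"].append(index)
        else:
            groups["disagreement"].append(index)
    return groups


# Hypothetical toy annotations from three annotators for five items.
toy_annotations = [
    (True, True, True),
    (True, True, True),
    (False, False, False),
    (True, False, True),
    (True, True, True),
]
groups = split_by_agreement(toy_annotations)

# Items kept after discarding disagreements.
kept = sorted(groups["all_yes"] + groups["all_no"])

# Share of items in each group, as percentages of all annotated items.
shares = {
    name: 100.0 * len(indices) / len(toy_annotations)
    for name, indices in groups.items()
}
```

The three shares always sum to 100%, as do the percentages reported above (88.02 + 9.33 + 2.65).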