The QUD-Anno Challenge

Annotating Text with Questions under Discussion

THE SUBMISSION DEADLINE HAS BEEN EXTENDED TO DECEMBER 16TH!
PARTICIPANTS ARE ENCOURAGED TO UPLOAD A ZIP ARCHIVE CONTAINING
THE EXTENDED ABSTRACT AS A PDF AND THE XML ANNOTATIONS IN SEPARATE .xml FILES.

The Challenge:
Annotating three texts distinguished by genre with text-structuring Questions under Discussion (QUDs)

Purpose:
Comparison of approaches, annotation guidelines, and resulting QUD hierarchies, together with reports on how selected linguistic features depend on the annotated QUD structures

Texts and Resources:
The texts and the tool with respective instructions can be found here.

Background and Motivation:
QUDs are central to many discourse analyses that explain linguistic regularities by assuming that the sentences and text segments with which the regularities are associated answer an explicit or implicit question. Early on, QUDs were used to explain possible sequences of dialogue moves (Carlson, 1983; Ginzburg, 1995), to clarify information-structural concepts such as the topic/focus distinction (Roberts, 2012 [1996]; van Kuppevelt, 1995), to account for temporal progression and foreground–background relations in narration (Klein & von Stutterheim, 1987; von Stutterheim & Klein, 1989), to state information-structural constraints on implicature (van Kuppevelt, 1996), to represent discourse goals and define contextual relevance (Roberts, 2012 [1996]), and to analyse the structure and coherence of discourse, in both text and dialogue (Klein & von Stutterheim, 1987; van Kuppevelt, 1995). Since then, QUDs have become firmly established as an analytic tool and have been applied fruitfully to a wide range of linguistic phenomena.
Most theories assume that sentences are subordinated to a focus-congruent question, which is in turn subordinated to higher discourse-structuring questions (see, for example, Klein & von Stutterheim, 1987; van Kuppevelt, 1995; Roberts, 2012 [1996]; see also Benz & Jasinskaja, 2017). QUD theories of phenomena such as non-at-issue content, presupposition projection, and focus assume that these phenomena can also depend on questions higher up in the hierarchy. Hence, a proper test of these theories requires explicit knowledge of the relevant discourse-structuring questions.
Although there is an obvious need for QUD-annotated corpora, there has been little work in this direction. Exceptions include De Kuthy et al. (2018), Riester et al. (2018), Riester (2019), and Westera et al. (2020).
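To make the assumed hierarchy concrete, the following is a minimal Python sketch of a QUD tree in which text segments hang below questions and questions hang below higher questions. It is an illustration only and not part of the challenge tool or its XML format: the names `Node`, `add`, and `superordinate_questions` are hypothetical, and the example sentences are invented.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    """One node of a QUD tree: either a question or a text segment (hypothetical model)."""
    text: str
    is_question: bool
    parent: Optional[Node] = field(default=None, repr=False)
    children: List[Node] = field(default_factory=list, repr=False)

    def add(self, text: str, is_question: bool) -> Node:
        """Attach a new child node below this node and return it."""
        child = Node(text, is_question, parent=self)
        self.children.append(child)
        return child


def superordinate_questions(node: Node) -> List[str]:
    """Return all questions dominating `node`, starting with the closest one."""
    questions = []
    current = node.parent
    while current is not None:
        if current.is_question:
            questions.append(current.text)
        current = current.parent
    return questions


# Invented toy example, not taken from the challenge texts.
root = Node("What happened on the trip?", is_question=True)
sub = root.add("Where did they stop?", is_question=True)
segment = sub.add("They stopped in Potsdam.", is_question=False)

print(superordinate_questions(segment))
```

With an explicit structure of this kind, a claim that refers to "questions higher up in the hierarchy" can be checked against the full list returned for a segment rather than against its immediate question alone.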

The Issue:
We think that this research gap does not exist by chance. For morphological and syntactic features, there are typically established criteria that decide objectively how a text item should be annotated. How close the annotations come to the objectively correct ones then depends only on the clarity of the annotation guidelines, the tag system, and the qualification of the annotators. For QUDs, it still needs to be proven or refuted that there is an objective text-structuring QUD hierarchy that annotators just have to uncover. One problem is posed by the many information-structural features that QUDs are supposed to explain, among them the given/new, focus/background, and at-issue/not-at-issue distinctions; it is an open question whether they can all be predicted by a uniform question hierarchy. Another problem is the representation of discourse goals, which QUDs are also assumed to capture. Annotating discourse goals in the form of QUDs makes it necessary to interpret the text and the authors’ motivations, a task that can easily lead to widely different results. However, testing specific claims about the role of QUDs requires an explicit representation of these goal-representing QUDs. For example, testing whether the non-at-issue content of a sentence can be defined as content that does not provide relevant material for answering any of its superordinated questions requires explicit knowledge of these questions.

The Task:
This challenge initiates a community effort to annotate in depth three texts from three different genres: an interview, a magazine car report, and a narrative text (short story). The annotations are exploratory expert annotations: each annotation team can start with its own guidelines and objectives. The objectives can include the annotation of a small subset of discourse features that the team considers particularly interesting. For the reasons given above, we do not expect uniform results. The result of this joint effort is an annotated corpus that allows different approaches to be compared and provides a working corpus for future research. Such research could consist, for example, of independent annotations of additional linguistic features, followed by a study of how strongly these features correlate with the different QUD hierarchies proposed in our corpus.

As QUDs may be used for many different purposes, annotation teams may want to concentrate on specific aspects. For inspiration only, we mention some possibilities here:
• Concentrate on the given/new distinction: all material in QUDs must be given; the main new information must be an answer to the QUD.
• QUDs as devices for representing discourse topics: What a text segment is about has to be asked for in a superordinated QUD.
• Interaction with discourse relations: Do you follow an approach that replaces discourse relations (Elaboration, Narration, Background) by QUDs, or are the QUDs better considered an additional discourse structuring device?
• Do QUD analyses correlate in a meaningful way with morphosyntactic markers of information structure?
• Mapping between QUDs and discourse purposes
• Is there a meaningful distinction between main content and side content?
• Does an annotation approach chosen for one text genre work equally well for the other text genres?
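The given/new criterion in the first item above can be approximated very roughly in code. The Python sketch below flags content words of a candidate QUD that have not occurred in the preceding context. It is a surface-level illustration under strong assumptions, not a proposed guideline: the names `FUNCTION_WORDS`, `content_words`, and `new_material_in_qud` are hypothetical, the stop list is a tiny invented sample, and exact string matching ignores inflection, synonymy, and inferable material, all of which a real notion of givenness would have to cover.

```python
import re
from typing import List, Set

# Tiny illustrative stop list (an assumption): question words and function
# words that are not counted as contentful material of a QUD.
FUNCTION_WORDS = {
    "what", "who", "where", "when", "why", "how", "which",
    "do", "does", "did", "is", "are", "was", "were",
    "the", "a", "an", "of", "to", "in", "on", "about", "with",
}


def content_words(text: str) -> Set[str]:
    """Lower-case the text and return its words minus the function words."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in FUNCTION_WORDS}


def new_material_in_qud(qud: str, preceding_context: List[str]) -> Set[str]:
    """Return content words of `qud` that do not occur in the preceding context."""
    given: Set[str] = set()
    for sentence in preceding_context:
        given |= content_words(sentence)
    return content_words(qud) - given


# Invented toy example, not taken from the challenge texts.
context = ["Anna bought a car."]
print(new_material_in_qud("What about the car?", context))      # nothing flagged
print(new_material_in_qud("What colour is the car?", context))  # flags "colour"
```

A QUD for which the function returns an empty set passes this crude check; any flagged word marks material that an annotator following the criterion would have to justify as given or remove from the question.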

References

Benz, A., & Jasinskaja, K. (2017). Questions under discussion: From sentence to discourse. Discourse Processes, 54(3): 177-186. DOI: 10.1080/0163853X.2017.1316038.

Carlson, L. (1983). Dialogue Games. Dordrecht: Reidel.

De Kuthy, K., Reiter, N. & Riester, A. (2018). QUD-based annotation of discourse structure and information structure: Tool and evaluation. In N. Calzolari et al. (eds.), Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC), Miyazaki, Japan.

Ginzburg, J. (1995). Resolving questions, I. Linguistics and Philosophy 18(5): 459–527. DOI: 10.1007/BF00985365.

Klein, W. & von Stutterheim, C. (1987). Quaestio und referentielle Bewegung in Erzählungen. Linguistische Berichte 109: 163–183.

Riester, A. (2019). Constructing QUD trees. In M. Zimmermann, K. von Heusinger & E. Onea (eds.), Questions in Discourse. Leiden: Brill. Vol. 2: Pragmatics: 164-193. DOI: 10.1163/9789004378322_007.

Riester, A., Brunetti, L. & De Kuthy, K. (2018). Annotation guidelines for questions under discussion and information structure. In E. Adamou, K. Haude & M. Vanhove (eds.), Information Structure in Lesser-Described Languages: Studies in Prosody and Syntax. Amsterdam: Benjamins: 403-444. DOI: 10.1075/slcs.199.14rie.

Roberts, C. (2012). Information structure in discourse: Towards an integrated formal theory of pragmatics. Semantics and Pragmatics 5: 1-69. DOI: 10.3765/sp.5.6 (1996 version: OSU Working Papers in Linguistics 49. The Ohio State University).

von Stutterheim, C. & Klein, W. (1989). Referential movement in descriptive and narrative discourse. In R. Dietrich & C. Graumann (eds.), Language Processing in Social Context. Amsterdam: North Holland: 39–76.

Van Kuppevelt, J. (1995). Discourse structure, topicality and questioning. Journal of Linguistics 31(1): 109-147. DOI: 10.1017/S002222670000058X.

Van Kuppevelt, J. (1996). Inferring from Topics: Scalar Implicatures as Topic-Dependent Inferences. Linguistics and Philosophy 19(4): 393–443.

Westera, M., Mayol, L. & Rohde, H. (2020). TED-Q: TED talks and the questions they evoke. In N. Calzolari et al. (eds.), Proceedings of the 12th International Conference on Language Resources and Evaluation (LREC), Marseille: 1118–1127.