The Challenge:
Annotating three texts distinguished by genre with text-structuring Questions
under Discussion (QUDs)
Purpose:
Comparison of approaches, annotation guidelines, resulting QUD hierarchies, and reports
of dependencies between selected linguistic features and annotated QUD structures
Texts and Resources:
The texts and the tool with respective instructions can be found here.
Background and Motivation:
QUDs are central to many discourse analyses that explain linguistic
regularities as a consequence of the assumption that the sentences and text segments with which the regularities
are associated are answers to an explicit or implicit question. Early on, QUDs were used for explaining possible
sequences of dialogue moves (Carlson, 1983; Ginzburg, 1995), clarifying information-structural concepts (e.g. the
topic/focus distinction, Roberts, 2012 [1996]; van Kuppevelt, 1995), describing temporal progression
and foreground–background relations in narration (Klein & von Stutterheim, 1987; von Stutterheim & Klein, 1989), explaining information-structural
constraints on implicature (van Kuppevelt, 1996), representing discourse goals and defining contextual relevance
(Roberts, 2012 [1996]), and analysing the structure and coherence of discourse, of both text and dialogue (Klein &
von Stutterheim, 1987; van Kuppevelt, 1995). Since then, QUDs have been firmly established as an analytic tool,
leading to fruitful applications for a wide range of linguistic phenomena.
Most theories assume that sentences are subordinated to a focus-congruent question, which is in turn subordinated to
higher discourse-structuring questions (see, for example, Klein & von Stutterheim 1987b, van Kuppevelt 1995,
Roberts 2012 [1996]; see also Benz & Jasinskaja 2017). QUD theories for phenomena such as non-at-issue content,
presupposition projection, and focus assume that the phenomena can also depend on questions higher up in the
hierarchy. Hence, a proper test of these theories requires explicit knowledge of the relevant
discourse-structuring questions.
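As a minimal sketch of the kind of hierarchy described above, a QUD structure can be represented as a tree in which every question points to its superordinate question. The class name QudNode, the function superordinate_questions, and the example questions are hypothetical illustrations; they are not taken from the challenge tool or from any particular annotation scheme.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class QudNode:
    """One question in a QUD hierarchy (hypothetical representation)."""
    question: str
    # repr=False avoids infinite recursion when printing parent/child cycles.
    parent: Optional["QudNode"] = field(default=None, repr=False)
    answers: List[str] = field(default_factory=list)
    children: List["QudNode"] = field(default_factory=list)

    def add_subquestion(self, question: str) -> "QudNode":
        """Attach a subordinate question and return the new node."""
        child = QudNode(question, parent=self)
        self.children.append(child)
        return child


def superordinate_questions(node: QudNode) -> List[str]:
    """Return the question of `node` followed by all questions above it."""
    chain = []
    current: Optional[QudNode] = node
    while current is not None:
        chain.append(current.question)
        current = current.parent
    return chain


# Invented toy example: one discourse-structuring question with one
# focus-congruent subquestion and a sentence that answers it.
root = QudNode("What happened on the trip?")
sub = root.add_subquestion("Who drove?")
sub.answers.append("MARY drove.")

chain = superordinate_questions(sub)
print(chain)
```

With such a representation, a claim that refers to questions higher up in the hierarchy can be checked against the whole chain returned for a sentence, and not only against its immediate question.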
Although there is an obvious need for QUD-annotated corpora, there has been little work in this direction. Exceptions include De Kuthy et al. (2018), Riester et al. (2018), Riester (2019) and Westera et al. (2020).
The Issue:
We think that this research gap does not exist by chance. For morphological and
syntactic features, there typically exist established criteria that objectively decide how a text item should be
annotated. It then only depends on the clarity of annotation guidelines, the tag system, and the qualification of
the annotators how close the annotations come to the objectively correct ones. For QUDs, it still has to be established
whether there is an objective text-structuring QUD hierarchy that annotators just have to uncover. One
problem is posed by the many information-structural features that QUDs are supposed to explain, among them the
given/new, focus/background, and at-issue/not-at-issue distinctions, for which it is an open question whether they
can all be predicted by a uniform question hierarchy. Another problem concerns the discourse goals
that QUDs are also assumed to represent. Annotating discourse goals in the form of QUDs makes it necessary to
interpret the text and the authors’ motivations. This is a task that can easily lead to widely different results.
However, testing specific claims about the role of QUDs requires an explicit representation of these
goal-representing QUDs. For example, testing whether the non-at-issue content of a sentence is definable as
content that does not provide relevant material for answering any of its superordinated questions requires
explicit knowledge of these questions.
The Task:
This challenge instigates a community effort to deeply annotate three texts that
belong to three different genres: an interview, a magazine car report, and a narrative text (short story). The
annotations are explorative expert annotations. This means that each annotation team can start with their own
guidelines and objectives. The objectives can include the annotation of a small subset of discourse features that
the team considers particularly interesting. For the aforementioned reasons, we do not expect uniform results.
The result of this joint effort is an annotated corpus that allows for a comparison of different approaches and
provides a working corpus for future research. Such research could consist, for example, of independent annotations of
additional linguistic features, followed by a study of how strongly these features correlate with the different
proposed QUD hierarchies in our corpus.
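One way such a follow-up study might be set up is sketched below, under strong simplifying assumptions: each text segment carries one binary feature annotation, and each team's QUD hierarchy is reduced to one binary property per segment (here, hypothetically, whether the segment opens a new QUD). The phi coefficient is only one of several possible association measures, and all names and values are invented for illustration.

```python
from math import sqrt
from typing import Sequence


def phi_coefficient(xs: Sequence[int], ys: Sequence[int]) -> float:
    """Phi coefficient between two equally long binary (0/1) sequences."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    # Cells of the 2x2 contingency table.
    n11 = sum(1 for x, y in zip(xs, ys) if x and y)
    n10 = sum(1 for x, y in zip(xs, ys) if x and not y)
    n01 = sum(1 for x, y in zip(xs, ys) if not x and y)
    n00 = sum(1 for x, y in zip(xs, ys) if not x and not y)
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        # One of the variables is constant; association is undefined,
        # reported here as 0.0 by convention of this sketch.
        return 0.0
    return (n11 * n00 - n10 * n01) / denom


# Invented data: one feature annotation for six segments ...
feature = [1, 0, 1, 0, 1, 0]
# ... and, per hypothetical team, whether each segment opens a new QUD.
hierarchies = {
    "team_a": [1, 0, 1, 0, 1, 0],
    "team_b": [0, 1, 0, 1, 0, 1],
    "team_c": [1, 1, 0, 0, 1, 0],
}

scores = {team: phi_coefficient(feature, opens_qud)
          for team, opens_qud in hierarchies.items()}
for team, score in scores.items():
    print(f"{team}: {score:.2f}")
```

Comparing the scores across teams gives a first indication of which proposed hierarchy aligns most closely with the independently annotated feature; a real study would need more segments and a suitable significance test.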
As QUDs may be used for many different purposes, annotation teams may want to concentrate on specific aspects. For
inspiration (only), we mention here:
• Concentrate on the given/new distinction: all material in QUDs must be given; the main new information must be
an answer to the QUD.
• QUDs as devices for representing discourse topics: What a text segment is about has to be asked for in a
superordinated QUD.
• Interaction with discourse relations: Do you follow an approach that replaces discourse relations
(Elaboration, Narration, Background) with QUDs, or are the QUDs better considered an additional
discourse-structuring device?
• Do QUD analyses correlate in a meaningful way with morphosyntactic markers of information structure?
• Mapping between QUDs and discourse purposes
• Is there a meaningful distinction between main content and side content?
• Does an annotation approach chosen for one text genre work equally well for the other text genres?