Computational Modelling of Discourse and Semantics (Former Research Group)

The Independent Research Group "Computational Modelling of Discourse and Semantics", headed by Caroline Sporleder, was part of the Cluster of Excellence MMCI at Saarland University from September 2008 until August 2012.

Vision and Research Strategy

In many situations, the most convenient way for humans to interact with intelligent systems is via language. To arrive at natural models of interaction, computer systems need to be equipped with the means to efficiently process language and extract meaning from it. However, computing the meaning of utterances is a complex task, which involves various linguistic levels, e.g. the lexical meaning of individual words, the semantic argument structure of clauses and sentences (who did what to whom?), the discourse context (how does a sentence relate to its neighbouring sentences?), and finally the situational context (who is speaking/listening? what does the speaker want to achieve through making the utterance?). Context plays a crucial role here: sentential context influences the meaning of words, discourse context influences the meaning of sentences, and situational context influences the meaning of a discourse. In our research group, we aim to develop intelligent models of language meaning which take context information into account. We envisage that such context-aware models will perform better than approaches which deal with different linguistic phenomena in isolation. More specifically, we work on combining lexical semantics and discourse processing, for instance as applied to word sense disambiguation and semantic parsing.

A second guiding principle of our work is a focus on unsupervised or semi-supervised models, i.e., models which require no manually labelled training data, or only a small amount of it. Manual data annotation is extremely time-consuming and thus costly, particularly for semantic and discourse phenomena. Consequently, there is a severe shortage of annotated data for tasks dealing with (deep) language meaning. Moreover, where annotated data exist, they cover only some domains and a small minority of languages. Models trained on these data are typically highly specialised, and their performance drops if they are applied to other data. In our group, we focus on developing technology that uses and combines information from various (contextual) sources in order to reach acceptable performance levels even without large amounts of manually labelled data. We also experiment with semi-automatic annotation schemes, e.g., using automatic pre-annotation or active learning.

Composition of Group

The research group Computational Modelling of Discourse and Semantics was established in September 2008. The group's research focus was on the development of statistical and machine learning models for computing the meaning of language. In addition to the group leader, Caroline Sporleder, the group had two other members: Linlin Li joined in December 2008 as a PhD student, and Alexis Palmer started in May 2009 as a postdoc. In addition, Cécile Grivaz was with the group from October 2009 to March 2010 as a visiting PhD student from the University of Geneva, Switzerland.

Research Topics and Achievements

Projects and Collaborations

Prof. Dr. Caroline Sporleder

Since 2012, Caroline Sporleder has been a full professor at the University of Trier.

Phone: +49 551 39 21490