In general, humans understand language effortlessly and are much more accurate and robust at processing language than current computer programs. Learning more about how the human brain processes language can help us in two respects. On the one hand, a better understanding of how language comprehension works in humans can enable us to make natural language processing more robust and accurate. On the other hand, a deeper understanding of human cognition will allow us to build more accurate models of language comprehension, which can assess how difficult a given sentence will be for a human to comprehend. This in turn would enable us to optimize automatically generated text or speech with respect to how easy it is for humans to understand.
This is relevant, for example, in human-computer interaction scenarios such as dialogue systems, tutoring systems, summarization systems and readability assessment. Consider a scenario where a user can concentrate fully on interacting with a dialogue system; the dialogue system should then convey information efficiently and effectively. In a different scenario, where the user is driving a car while interacting with a dialogue system, the cognitive load on the user must be kept low so that the user still drives safely and is not distracted too much by the dialogue system.
Our research group addresses these issues by conducting psycholinguistic experiments, constructing cognitive models of human language comprehension and applying findings from the experiments and modelling results to improve dialogue systems.
Our group was founded in October 2010 in the Cluster of Excellence. It currently consists of Dr. Vera Demberg; two postdocs, Judith Köhne and Asad Sayeed; a PhD student, Fatemeh Torabi Asr; and a student assistant, Silas Weinbach.
Computational models of human language processing
Human language comprehension difficulty (or ease) has been found to depend on the predictability of upcoming words, syntactic structures and semantic concepts, as well as on structural complexity and effects related to retrieving previously processed information from memory. We have developed a psycholinguistically motivated parser which can estimate syntactic processing difficulty on a word-by-word basis, and have started to integrate this model with a model that uses distributional semantics to assess the predictability of words.
Our group is now working on extending this model to a super-sentential level and is investigating how discourse cues can be used to make text easier to understand and remember.
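To make the idea of word-by-word predictability more concrete, the following is a minimal sketch and not the parser or the semantic model described above. It assumes a common way of quantifying predictability, surprisal, defined as -log2 P(word | context), and estimates it with a toy bigram model using add-one smoothing. The function names `train_bigram` and `surprisal`, the `<s>` start symbol and the three-sentence corpus are illustrative assumptions.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Collect context and bigram counts from tokenized sentences.

    Assumption: a toy bigram model stands in for a real language model.
    """
    context_counts, bigram_counts = Counter(), Counter()
    vocab = set()
    for tokens in sentences:
        padded = ["<s>"] + tokens  # "<s>" marks the sentence start
        vocab.update(padded)
        for prev, word in zip(padded, padded[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, word)] += 1
    return context_counts, bigram_counts, vocab


def surprisal(sentence, context_counts, bigram_counts, vocab):
    """Return (word, surprisal in bits) pairs for one tokenized sentence.

    Probabilities use add-one smoothing, so unseen bigrams get a small
    non-zero probability instead of infinite surprisal.
    """
    padded = ["<s>"] + sentence
    vocab_size = len(vocab)
    result = []
    for prev, word in zip(padded, padded[1:]):
        p = (bigram_counts[(prev, word)] + 1) / (context_counts[prev] + vocab_size)
        result.append((word, -math.log2(p)))
    return result


# Hypothetical toy corpus, for illustration only.
corpus = [
    ["the", "dog", "barks"],
    ["the", "dog", "sleeps"],
    ["the", "cat", "sleeps"],
]
model = train_bigram(corpus)

for word, bits in surprisal(["the", "cat", "sleeps"], *model):
    print(f"{word}\t{bits:.2f} bits")
```

In this toy setting, "cat" after "the" receives a higher surprisal than "dog" would, because it is the less frequent continuation in the corpus. A model of the kind described above would additionally take syntactic structure and semantic context into account, which a bigram model cannot do.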
Evaluation of sentence processing theories on broad-coverage data
Theories of sentence comprehension are usually inspired by observations from very specific syntactic structures, thematic role clashes etc., which are known to be difficult. It is possible that effects observed in carefully controlled lab experiments are rare or absent in the naturalistic data to which humans are exposed on a day-to-day basis.
Theories of sentence processing should, however, not only model experimental results accurately, but also prove their validity on naturally occurring broad-coverage texts. We have pioneered the evaluation of sentence processing theories on such data by running alternative sentence processing theories on a corpus of eye-tracked newspaper texts, the Dundee Corpus.
Information presentation in spoken dialogue systems
In spoken dialogue systems, information must be presented sequentially, making it difficult to quickly browse through a large number of options. Recent studies have shown that user satisfaction is negatively correlated with dialogue duration, suggesting that systems should be designed to maximize the efficiency of the interactions. Analysis of the logs of 2000 dialogues between users and nine different dialogue systems reveals that a large percentage of the time is spent on the information presentation phase, and thus there is potentially a large pay-off to be gained from optimizing information presentation in spoken dialogue systems.
We have proposed a method that makes it more efficient to cope with large numbers of diverse options by selecting options and then structuring them based on a model of the user’s preferences. This enables the dialogue system to automatically determine trade-offs between alternative options that are relevant to the user and to present these trade-offs explicitly. Multiple attractive options are thereby structured such that the user can gradually refine the request to find the optimal trade-off.
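The selection and structuring step described above can be illustrated with a small sketch. This is not the proposed method itself but a simplified stand-in under stated assumptions: the user model is a set of attribute weights, every option has a score between 0 and 1 per attribute (higher is better for the user), options that are dominated on the attributes the user cares about are dropped, and each remaining option is described by its strongest and weakest relevant attribute. All names (`Option`, `select_options`, `describe`, the `min_weight` threshold) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Option:
    name: str
    scores: dict  # attribute -> value in [0, 1], higher is better for the user


def utility(option, weights):
    """Weighted sum of attribute scores under the (assumed) user model."""
    return sum(weights[attr] * option.scores[attr] for attr in weights)


def dominates(a, b, attrs):
    """True if a is at least as good as b on all attrs and better on one."""
    return (all(a.scores[x] >= b.scores[x] for x in attrs)
            and any(a.scores[x] > b.scores[x] for x in attrs))


def select_options(options, weights, min_weight=0.2):
    """Keep non-dominated options, judged only on attributes the user
    weights at least min_weight, and rank them by utility."""
    relevant = [attr for attr, w in weights.items() if w >= min_weight]
    kept = [o for o in options
            if not any(dominates(p, o, relevant) for p in options if p is not o)]
    kept.sort(key=lambda o: utility(o, weights), reverse=True)
    return kept, relevant


def describe(option, relevant):
    """State the trade-off of an option explicitly."""
    best = max(relevant, key=lambda attr: option.scores[attr])
    worst = min(relevant, key=lambda attr: option.scores[attr])
    return f"{option.name}: strong on {best}, weaker on {worst}"


# Hypothetical flight-booking data, for illustration only.
weights = {"price": 0.5, "duration": 0.4, "airline": 0.1}
flights = [
    Option("Flight A", {"price": 0.9, "duration": 0.3, "airline": 0.5}),
    Option("Flight B", {"price": 0.4, "duration": 0.9, "airline": 0.5}),
    Option("Flight C", {"price": 0.3, "duration": 0.2, "airline": 0.9}),
]

selected, relevant = select_options(flights, weights)
for flight in selected:
    print(describe(flight, relevant))
```

Here Flight C is dropped because it is worse than Flight A on both price and duration, the two attributes this hypothetical user cares about. The remaining two options embody a genuine trade-off between price and duration, which the system can then state explicitly so that the user can refine the request.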