Machine Learning for Natural Language Processing (Former Research Group)

Vision and Research Strategy

In recent years, much of natural language processing (NLP) research has focused on supervised methods that require large amounts of labeled data. Constructing such datasets is often very expensive and time-consuming, and the resulting statistical models may still exhibit coverage problems when applied to practical NLP tasks. In our group, we focus on developing methods that exploit unlabeled data and abundant surrogate supervision (e.g., noisy user annotation available on the web, annotation provided for other languages, temporal relations between documents) to construct accurate models for NLP tasks.

Similarly, many NLP systems are the result of years of feature engineering and parameter tuning. Unfortunately, such systems are often brittle when applied to a different domain or language. We use latent variable models, which automatically construct composite features from elementary ones, to build state-of-the-art systems with minimal feature engineering, leading to accurate models that are easily retrained on different languages and datasets.

Composition of Group

The group Machine Learning for Natural Language Processing was organized in October 2009. A postdoc, Dr. Minwoo Jeong, and a PhD student, Mikhail Kozhevnikov, joined the group in November and December 2009, respectively.

The group develops novel machine learning methods for natural language processing tasks, with a primary focus on (1) alleviating the need for large quantities of explicitly annotated data and (2) integrating available linguistic knowledge about the tasks.

Research Topics and Achievements

Projects and Collaborations

Dr. Ivan Titov

Ivan Titov headed the Independent Research Group Machine Learning for Natural Language Processing from October 2009 until August 2013.

Phone: +31 20 525 8334

Publications