Multiword Expressions: from Theory to Applications (MWE 2010)
Workshop at COLING 2010 (Beijing, China), August 28, 2010
Endorsed by the Special Interest Group on the Lexicon of the Association for Computational Linguistics (SIGLEX)
Last updated: August 03, 2010
Being Theoretical is Being Practical: Multiword Units and Terminological Structure Revitalised
by Kyo Kageura - University of Tokyo, Japan
Multiword units (MWUs) are critical in processing and understanding texts and have been extensively studied in relation to their occurrences in texts. MWUs also play an essential role in organising vocabulary, which is most prominently visible in domain-specific terminologies. There has been, however, a limited and mostly theoretical concern with the latter aspect of MWUs; researchers interested in NLP-related applications of terminologies have not paid sufficient attention to this aspect.
In this talk I will start by giving the basic framework within which the study of MWUs from the point of view of vocabulary can be carried out, in the process clarifying the relationships between studies of MWUs in texts and those in relation to vocabulary. I will then introduce some of the theoretical studies in terminological structure which I have carried out in recent years. Referring to some of the problems that practically-oriented research in terminology processing is currently facing, I will argue why, how and in what possible ways the understanding of the roles MWUs take in terminological structure constitutes a sine qua non for making a breakthrough in current text-oriented studies of terminological MWUs.
Kyo Kageura, PhD, is a Professor at the Library and Information Science Course, Graduate School of Education, University of Tokyo. He works in the field of terminology and is interested in applying NLP methods to construct practically useful reference resources. His publications include Quantitative Informatics (Maruzen, 2000, in Japanese) and The Dynamics of Terminology (John Benjamins, 2002). He is currently the editor of the journal Terminology and of the book series Terminology and Lexicography: Research and Practice, both published by John Benjamins, with Professor Marie-Claude L'Homme of the University of Montreal. He is also a member of the development and management team of the online hosting site Minna no Hon'yaku (Translation of/by/for all: http://trans-aid.jp/).
Multi-Word Expressions as Discourse Relation Markers (DRMs)
by Aravind K. Joshi - University of Pennsylvania, USA
Usually, by Multi-Word Expressions (MWEs) we mean expressions whose structure and meaning cannot be derived from their component words as they occur independently. In this talk I will discuss a different kind of multi-word expression, one that behaves as a discourse relation marker (DRM) yet does not seem to belong to any well-defined syntactic class. The apparent open-endedness of these expressions is a challenge for their automatic identification*.
* This work is carried out in the context of the Penn Discourse Treebank (PDTB), jointly with Rashmi Prasad and Bonnie Webber.
Aravind Joshi is the Henry Salvatori Professor of Computer and Cognitive Science at the University of Pennsylvania. He has worked on formal grammars, complexity of syntactic processing, and aspects of discourse coherence. He has been the President of ACL, a member of ICCL, and a member of the National Academy of Engineering, USA.