Joint Workshop on Multiword Expressions and WordNet (MWE-WN 2019)
Workshop at ACL 2019 (Florence, Italy), August 2nd, 2019
Organized, sponsored and endorsed by the Global Wordnet Association (GWA) and the Special Interest Group on the Lexicon (SIGLEX) of the Association for Computational Linguistics (ACL).
This joint event is the 15th edition of the Workshop on Multiword Expressions (MWE), with a special focus on WordNet.
Last updated: June 19, 2019
When the whole is greater than the sum of its parts: Multiword expressions and idiomaticity
by Aline Villavicencio, University of Essex (UK) and Federal University of Rio Grande do Sul (Brazil)
Aline Villavicencio is affiliated with the School of Computer Science and Electronic Engineering, University of Essex (UK), and with the Institute of Informatics, Federal University of Rio Grande do Sul (Brazil). She received her PhD from the University of Cambridge (UK) in 2001 and held postdoctoral positions at the University of Cambridge and the University of Essex (UK). She was a Visiting Scholar at the Massachusetts Institute of Technology (USA, 2011-2012 and 2014-2015) and at the École Normale Supérieure (France, 2014), and an Erasmus-Mundus Visiting Scholar at Saarland University (Germany, 2012/2013) and at the University of Bath (UK, 2006-2009). She held a Research Fellowship from the Brazilian National Council for Scientific and Technological Development (Brazil, 2009-2017). She is a member of the editorial boards of TACL and JNLE, PC Co-Chair of CoNLL-2019, Area Chair for ACL-2019, NAACL 2018, COLING 2018 and ACL-2017, and General Co-Chair of the 2018 International Conference on Computational Processing of Portuguese. She is also a member of the SIGLEX board and of the program committees of various *ACL and AI conferences, and has co-chaired several *ACL workshops on Cognitive Aspects of Computational Language Acquisition and on Multiword Expressions. Her research interests include lexical semantics, multilinguality, and cognitively motivated NLP, and she has co-edited special issues and books dedicated to these topics.
Abstract
Multiword expressions (MWEs) feature prominently in the mental lexicon of native speakers (Jackendoff, 1997) in all languages and domains, from informal to technical contexts (Biber et al., 1999), with about four MWEs produced per minute of discourse (Glucksberg, 1989). MWEs come in all shapes and forms, including idioms like rock the boat (to cause problems or disturb a situation) and compound nouns like monkey business (dishonest behaviour). Their accurate detection and understanding may require more than knowledge of individual words and how they can be combined (Fillmore, 1979), as they may display various degrees of idiosyncrasy, including lexical, syntactic, semantic and statistical idiosyncrasy (Sag et al., 2002; Baldwin and Kim, 2010), which provide new challenges and opportunities for language processing (Constant et al., 2017). For instance, while the meaning of some combinations can be inferred from their parts, as in olive oil (oil made from olives), this is not always the case, as in dark horse (an unknown candidate who unexpectedly succeeds). When processing a sentence, the challenges include identifying which words form an expression (Ramisch, 2015) and whether the expression is idiomatic (Cordeiro et al., 2019). In this talk I will give an overview of advances in the identification and treatment of multiword expressions, concentrating in particular on techniques for identifying their degree of idiomaticity.
Acknowledgments: This talk includes joint work with Carlos Ramisch, Marco Idiart, Silvio Cordeiro, Rodrigo Wilkens, Felipe Paula and Leonardo Zilio.
