Joint Workshop on Multiword Expressions and Electronic Lexicons (MWE-LEX 2020)

Workshop at COLING 2020 (~~Barcelona, Spain~~ online), ~~September 14~~ December 13, 2020.

Organized and sponsored by:
Special Interest Group on the Lexicon (SIGLEX) of the Association for Computational Linguistics (ACL)
ELEXIS - European Lexicographic Infrastructure.

This joint event is the 16th edition of the Workshop on Multiword Expressions (MWE).
Follow @mwe_workshop

Last updated: Dec 14, 2020

[Dec 14, 2020] Thanks to everyone who contributed to making MWE-LEX 2020 a success! See you next year in Bangkok (fingers crossed)!
[Dec 02, 2020] MWE-LEX proceedings are now available on the ACL Anthology! Many thanks to our publication chairs Petya Osenova and Agata Savary, and to all authors and reviewers.
[Dec 02, 2020] Videos for all MWE-LEX talks are now available on Underline for registered COLING 2020 participants: https://underline.io/events/54/
[Nov 15, 2020] The program of the workshop has arrived.
[Nov 15, 2020] We're happy to announce that Roberto Navigli will be our invited speaker. See the abstract of his talk.
[Oct 06, 2020] We invite authors of accepted papers at "Findings of EMNLP" to present their work at our workshop (details below).
[Aug 31, 2020] The MWE-LEX workshop paper submission deadline has been extended to September 9, 23:59 (anywhere in the world).
[Aug 24, 2020] COLING 2020 is going entirely virtual. So is the MWE-LEX workshop! More details will follow.
[Aug 07, 2020] We're on Twitter :-)
[Apr 06, 2020] COLING 2020 has been postponed; the workshop will take place on December 13, 2020 in Barcelona.
[Feb 12, 2020] The START space for submitting a paper is now open.

Description

The joint MWE-LEX workshop addresses two domains – multiword expressions and (electronic) lexicons – with partly overlapping communities and research interests, but divergent practices and terminologies.

Multiword expressions (MWEs) are word combinations, such as by and large, hot dog, pay a visit or pull one's leg, which exhibit lexical, syntactic, semantic, pragmatic or statistical idiosyncrasies. MWEs encompass closely related linguistic objects: idioms, compounds, light-verb constructions, rhetorical figures, institutionalised phrases and collocations. Because of their unpredictable behavior, notably their non-compositional semantics, MWEs pose problems in linguistic modelling (e.g. treebank annotation, grammar engineering), NLP pipelines (notably when orchestrated with parsing), and end-user applications (e.g. information extraction). Modelling and processing of MWEs has been the topic of the MWE workshop, organised over the past years by the MWE section of SIGLEX.

Because MWE-hood is a largely lexical phenomenon, appropriately built electronic MWE lexicons turn out to be quite important for NLP. Their conception opens up, among others, the issues of lemmatization and of standardised representation of morphological, syntactic and semantic properties of MWEs. Large standardised multilingual, possibly interconnected, NLP-oriented MWE lexicons prove indispensable for NLP tasks such as MWE identification, due to its critical sensitivity to unseen data. But the development of such lexicons is challenging and calls for tools which would leverage, on the one hand, MWEs encoded in pre-existing NLP-unaware lexicons and, on the other hand, automatic MWE discovery in large non-annotated corpora.

In order to pave the way towards a better understanding of these issues, and to foster convergence and scientific innovation, the MWE and ELEXIS (European Union's Horizon 2020 research grant 731015) communities put forward a joint event. We call for papers on research related (but not limited) to:

Joint topics on MWEs and e-lexicons

Extracting and enriching MWE lists from traditional human-readable lexicons for NLP use
Formats for NLP-applicable MWE lexicons
Interlinking MWE lexicons with other language resources
Using MWE lexicons in NLP tasks (identification, parsing, translation, ...)
MWE discovery in the service of lexicography
Multiword terms in specialized lexicons
Representing semantic properties of MWEs in lexicons
Paving the way towards encoding lexical idiosyncrasies in constructions

MWE-specific topics

Computationally-applicable theoretical work on MWEs and constructions in psycholinguistics, corpus linguistics and formal grammars
MWE and construction annotation in corpora and treebanks
Processing of MWEs and constructions in syntactic and semantic frameworks (e.g. CCG, CxG, HPSG, LFG, TAG, UD, etc.), and in end-user applications (e.g. information extraction, machine translation and summarization)
Original discovery and identification methods for MWEs and constructions
MWEs and constructions in language acquisition and in non-standard language (e.g. tweets, forums, spontaneous speech)
Evaluation of annotation and processing techniques for MWEs and constructions
Retrospective comparative analyses from the PARSEME shared tasks on automatic identification of MWEs

Our intention is also to perpetuate previous converging effects with the Construction Grammar and WordNet communities (see the LAW-MWE-CxG 2018 and MWE-WN 2019 workshops). Therefore, we extend the traditional MWE scope to grammatical constructions and we include WordNets in the scope of e-lexicons.

Findings of EMNLP 2020

The MWE-LEX 2020 workshop invites authors of accepted papers at "Findings of EMNLP" to present their work at our workshop. To submit your paper for a presentation slot, please send an email to mwelex2020@gmail.com by Friday, October 9, 2020 with:

Your paper
One or two sentences explaining why it would be a good fit for the scope of MWE-LEX 2020

Special Track: PARSEME Shared Task on Semi-Supervised verbal MWE Identification

MWE-LEX 2020 will host edition 1.2 of the PARSEME shared task on semi-supervised identification of verbal MWEs. This is a follow-up to editions 1.0 (2017) and 1.1 (2018). Edition 1.2 will feature (a) improved and extended corpora annotated with MWEs, (b) complementary unannotated corpora for unsupervised MWE discovery, and (c) a new evaluation methodology focusing on unseen MWEs. Following the synergy with ELEXIS, our aim is to foster the development of unsupervised methods for MWE lexicon induction, which in turn can be used for identification. Authors may submit system description papers to a special track. Details are available on the shared task 1.2 page.

Submission modalities

Regular research track:

Long papers (9 content pages + references): They should report on solid and finished research including new experimental results, resources and/or techniques.
Short papers (4 content pages + references): They should report on small experiments, focused contributions, ongoing research, negative results and/or philosophical discussion.

In regular research papers, the reported research should be substantially original. Papers available as preprints can also be submitted provided that they fulfil the conditions defined by the ACL Policies for Submission, Review and Citation.

Shared task track:

System description papers (4 content pages + references): These papers should briefly describe the approach implemented to solve the problem. They may include references and links to more detailed descriptions in other documents.

Shared task system description papers will go through a separate reviewing process. Submissions will be reviewed by the shared task organizers and participants. Participants of the shared task are not required to submit system description papers, and the acceptance of a submitted paper depends on the quality of the paper rather than on the results obtained in the shared task.

Instructions for authors:

For all 3 types of papers, the submission is double-blind as per the COLING 2020 guidelines. For all types of submission, the COLING 2020 templates must be used. There is no limit on the number of reference pages.

The decisions as to oral or poster presentations of the selected papers will be taken by the PC chairs, depending on the available infrastructure for virtual participation. No distinction between papers presented orally and as posters is made in the workshop proceedings.

All papers should be submitted via the following START space. Please choose the appropriate track (research/shared task) and submission modality (long/short).

Important dates

All deadlines are at 23:59 UTC-12 (anywhere in the world).

~~May 20~~ ~~Sep 2~~ Sep 9, 2020: extended paper submission deadline

~~Jun 24~~ Oct 16, 2020: notification of acceptance

~~Jul 11~~ Nov 1, 2020: camera-ready papers due

~~Sep 14~~ Dec 13, 2020: MWE-LEX workshop date

See also the important dates for the shared task systems.

Program Committee Chairs

Research track, MWE-specific topics:
- Stella Markantonatou, Institute for Language and Speech Processing, R.C. "Athena" (Greece)
- Jelena Mitrović, University of Passau (Germany)
Research track, MWE-LEX topics:
- John McCrae, National University of Ireland Galway (Ireland)
- Carole Tiberius, Dutch Language Institute in Leiden (Netherlands)
Shared task track:
- Carlos Ramisch, Aix Marseille University (France)
- Ashwini Vaidya, Indian Institute of Technology in Delhi (India)

Publication Chairs

Petya Osenova, University of Sofia and Bulgarian Academy of Sciences (Bulgaria)
Agata Savary, University of Tours (France)

Contact

For any inquiries regarding the workshop please send an email to mwelex2020@gmail.com

Anti-harassment policy

The workshop supports the ACL anti-harassment policy.
