Workshop on Technologies for MT of Low Resource Languages (LoResMT 2018)
Boston, Massachusetts, March 21, 2018
@ AMTA 2018 (http://www.conference.
Statistical and neural machine translation (SMT/NMT) methods have been used successfully to build MT systems for many widely spoken languages over the last two decades, with significant improvements in the quality of automatic translation. However, these methods still rely on a few natural language processing (NLP) tools to pre-process human-generated texts into the forms required as input, and/or to post-process the output into proper textual forms in the target languages.
In many MT systems, the performance of these tools has a great impact on the quality of the resulting translation. However, there has been little discussion of these NLP tools, their methods, their roles in MT systems built with different methods, their coverage of the world's many languages, etc. In this workshop, we would like to bring together researchers who work on these topics and help review which tasks we will most need these tools to perform for MT in the coming years.
These NLP tools include, but are not limited to, several kinds of word tokenizers/de-tokenizers, word segmenters, morphology analysers, etc. In this workshop, we solicit papers dedicated to these supplementary tools as used in any language, and especially in low resource languages. We would like to obtain an overview of these NLP tools from our community. Evaluations of these tools in research papers should include how they have improved the quality of MT output.
We solicit original research papers, review papers, and position papers on these tools. Multilingual and/or cross-lingual NLP tools for MT of low resource languages are especially welcome. Topics of the workshop include, but are not limited to:
– Research and review papers on pre-processing and/or post-processing NLP tools for MT
– Position papers on the development of pre-processing and/or post-processing tools for MT
– Word tokenizers/de-tokenizers for specific languages
– Word/morpheme segmenters for specific languages
– Alignment/Re-ordering tools for specific language-pairs
– Use of morphology analysers and/or morpheme segmenters for MT
– Multilingual and/or Cross-lingual NLP tools for MT
– Reusability of existing NLP tools for low resource languages
– Corpora curation technologies for low resource languages
– Review of available parallel corpora for low resource languages
– Research and review papers on MT methods for low resource languages
– Fast building of MT systems for low resource languages
– Reusability of existing MT systems for low resource languages
December 22, 2017: First call for papers
January 8, 2018: Second call for papers
February 4, 2018: Submission deadline of workshop papers
February 11, 2018: Notification of acceptance
February 16, 2018: Camera-ready papers due
March 21, 2018: LoResMT workshop