[New publication] Machine Translation, 32 (4)
Machine Translation, 32 (4)
Reassessing the proper place of man and machine in translation: a pre-translation scenario
Abstract: Traditionally, human–machine interaction to reach an improved machine translation (MT) output takes place ex-post and consists of correcting this output. In this work, we investigate other modes of intervention in the MT process. We propose a Pre-Edition protocol that involves: (a) the detection of MT translation difficulties; (b) the resolution of those difficulties by a human translator, who provides their translations (pre-translation); and (c) the integration of the obtained information prior to the automatic translation. This approach can meet individual interaction preferences of certain translators and can be particularly useful for production environments, where more control over output quality is needed. Early resolution of translation difficulties can prevent downstream errors, thus improving the final translation quality “for free”. We show that translation difficulty can be reliably predicted for English for various source units. We demonstrate that the pre-translation information can be successfully exploited by an MT system and that the indirect effects are genuine, accounting for around 16% of the total improvement. We also provide a study of the human effort involved in the resolution process.
A user-study on online adaptation of neural machine translation to human post-edits
Abstract: The advantages of neural machine translation (NMT) have been extensively validated for offline translation of several language pairs for different domains of spoken and written language. However, research on interactive learning of NMT by adaptation to human post-edits has so far been confined to simulation experiments. We present the first user study on online adaptation of NMT to user post-edits in the domain of patent translation. Our study involves 29 human subjects (translation students) whose post-editing effort and translation quality were measured on about 4500 interactions of a human post-editor and an NMT system integrating an online adaptive learning algorithm. Our experimental results show a significant reduction in human post-editing effort due to online adaptation in NMT according to several evaluation metrics, including hTER, hBLEU, and KSMR. Furthermore, we found significant improvements in BLEU/TER between NMT outputs and professional translations in granted patents, providing further evidence for the advantages of online adaptive NMT in an interactive setup.
Automatic quality estimation for speech translation using joint ASR and MT features
Abstract: This paper addresses the automatic quality estimation of spoken language translation (SLT). This relatively new task is defined and formalized as a sequence-labeling problem where each word in the SLT hypothesis is tagged as good or bad according to a large feature set. We propose several word confidence estimators (WCE) based on our automatic evaluation of transcription (ASR) quality, translation (MT) quality, or both (combined ASR + MT). This research work is possible because we built a specific corpus, which contains 6.7k utterances comprising the quintuplet: ASR output, verbatim transcript, text translation, speech translation, and post-edition of the translation. The conclusion of our multiple experiments using joint ASR and MT features for WCE is that MT features remain the most influential while ASR features can bring interesting complementary information. In addition, the last part of the paper proposes to disentangle ASR errors and MT errors where each word in the SLT hypothesis is tagged as good, 𝑎𝑠𝑟_𝑒𝑟𝑟𝑜𝑟 or 𝑚𝑡_𝑒𝑟𝑟𝑜𝑟. Robust quality estimators for SLT can be used for re-scoring speech translation graphs or for providing feedback to the user in interactive speech translation or computer-assisted speech-to-text scenarios.
An end-to-end model for cross-lingual transformation of paralinguistic information
Abstract: Speech translation is a technology that helps people communicate across different languages. The most commonly used speech translation model is composed of automatic speech recognition, machine translation and text-to-speech synthesis components, which share information only at the text level. However, spoken communication is different from written communication in that it uses rich acoustic cues such as prosody in order to transmit more information through non-verbal channels. This paper is concerned with speech-to-speech translation that is sensitive to this paralinguistic information. Our long-term goal is to make a system that allows users to speak a foreign language with the same expressiveness as if they were speaking in their own language. Our method works by reconstructing input acoustic features in the target language. From the many different possible paralinguistic features to handle, in this paper we choose duration and power as a first step, proposing a method that can translate these features from input speech to the output speech in continuous space. This is done in a simple and language-independent fashion by training an end-to-end model that maps source-language duration and power information into the target language. Two approaches are investigated: linear regression and neural network models. We evaluate the proposed methods and show that paralinguistic information in the input speech of the source language can be reflected in the output speech of the target language.