The effectiveness of generative pre-training of sentence encoders was first demonstrated for English natural language processing. Multiple approaches have since been proposed to extend it to multilingual/cross-lingual pretraining and have shown success in transfer learning, such as mBERT, XLM, XLM-R, and Unicoder. Large amounts of unlabeled data from multiple languages are used to train these models, with the goal that low-resource languages can benefit from high-resource languages through shared vocabularies and underlying linguistic similarities. mBERT trains a BERT model using Wikipedia corpora in 104 languages. XLM introduced a translation language model (TLM) objective in addition to the masked language model (MLM), in which bilingual sentence pairs are concatenated as inputs. To further improve performance, Unicoder presents three new cross-lingual pre-training tasks: cross-lingual word recovery, cross-lingual paraphrase classification, and cross-lingual masked language modeling. XLM-R trains exclusively with the MLM objective on a huge multilingual dataset at an enormous scale.

One key motivation for exploring this idea was to improve production efficiency. Our speech service needs to be deployed globally to tens of clusters, and monolingual and general multilingual models sit at two extremes when it comes to hosting costs.
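To make the MLM/TLM distinction concrete, here is a minimal, illustrative sketch (not from the original post) of how training examples could be built: MLM masks tokens in a single monolingual sentence, while TLM concatenates a translation pair so that a masked token in one language can be recovered from its counterpart in the other. The whitespace tokenization and helper names (`mask_tokens`, `make_tlm_example`) are simplifying assumptions for illustration only.

```python
import random

MASK, IGNORE = "[MASK]", "[PAD]"

def mask_tokens(tokens, mask_prob=0.15, rng=random):
    """Randomly replace tokens with [MASK]; labels keep the original token
    at masked positions and a sentinel elsewhere (standard MLM setup)."""
    inputs, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            inputs.append(MASK)
            labels.append(tok)      # the model must recover this token
        else:
            inputs.append(tok)
            labels.append(IGNORE)   # position ignored by the loss
    return inputs, labels

def make_mlm_example(sentence):
    """Monolingual MLM: mask tokens within a single sentence."""
    return mask_tokens(sentence.split())

def make_tlm_example(src_sentence, tgt_sentence):
    """TLM (as in XLM): concatenate a translation pair, so masked tokens
    in one language can be predicted from context in the other."""
    pair = src_sentence.split() + ["[SEP]"] + tgt_sentence.split()
    return mask_tokens(pair)

if __name__ == "__main__":
    print(make_mlm_example("the cat sat on the mat"))
    print(make_tlm_example("the cat sat on the mat",
                           "le chat est assis sur le tapis"))
```

In practice, real pretraining pipelines use subword tokenizers and cross-entropy over the masked positions; the sketch only shows how the two objectives differ in how their inputs are assembled.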