IIIT Hyderabad Publications
Injecting Word Embeddings with Another Language's Resource: An Application of Bilingual Embeddings

Authors: Prakhar Pandey, Vikram Pudi, Manish Shrivastava
Conference: International Joint Conference on Natural Language Processing (IJCNLP 2017)
Location: Taipei, Taiwan
Date: 2017-11-27
Report no: IIIT/TR/2017/91

Abstract

Word embeddings learned from a text corpus can be improved by injecting knowledge from external resources, while at the same time specializing them for similarity or relatedness. These knowledge resources (such as WordNet and the Paraphrase Database) may not exist for all languages. In this work we introduce a method to inject the word embeddings of one language with a knowledge resource of another language by leveraging bilingual embeddings. First we improve word embeddings of German, Italian, French and Spanish using resources of English and test them on a variety of word similarity tasks. Then we demonstrate the utility of our method by creating improved embeddings for the Urdu and Telugu languages using Hindi WordNet, beating the previously established baseline for Urdu.

Full paper: pdf

Centre for Language Technologies Research Centre
Copyright © 2009 - IIIT Hyderabad. All Rights Reserved.