Journal article

Multilingual distributed representations without word alignment

Abstract:: Distributed representations of meaning are a natural way to encode covariance relationships between words and phrases in NLP. By overcoming data sparsity problems, as well as providing information about semantic relatedness which is not available in discrete representations, distributed representations have proven useful in many NLP tasks. Recent work has shown how compositional semantic representations can successfully be applied to a number of monolingual applications such as sentiment analysis. At the same time, there has been some initial success in work on learning shared word-level representations across languages. We combine these two approaches by proposing a method for learning distributed representations in a multilingual setup. Our model learns to assign similar embeddings to aligned sentences and dissimilar ones to sentences that are not aligned, without requiring word alignments. We show that our representations are semantically informative and apply them to a cross-lingual document classification task where we outperform the previous state of the art. Further, by employing parallel corpora of multiple language pairs we find that our model learns representations that capture semantic relationships across languages for which no parallel data was used.
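
The training signal described in the abstract (similar embeddings for aligned sentences, dissimilar ones for non-aligned sentences, no word alignments) can be illustrated with a small sketch. The abstract does not specify the composition function or the loss, so the additive composition, the squared Euclidean distance, the hinge form with a margin, and all names below (`compose`, `sq_distance`, `hinge_loss`, the toy vocabularies) are assumptions made for illustration only, not the authors' implementation.

```python
def compose(sentence, embeddings):
    """Sum the word vectors of a sentence (assumed additive composition)."""
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    for word in sentence:
        for i, value in enumerate(embeddings[word]):
            total[i] += value
    return total


def sq_distance(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def hinge_loss(src, tgt, noise, src_emb, tgt_emb, margin=1.0):
    """Penalise the aligned pair (src, tgt) unless it is closer than the
    non-aligned pair (src, noise) by at least `margin`.
    Only sentence-level pairing is used; no word alignment is needed."""
    a = compose(src, src_emb)
    b = compose(tgt, tgt_emb)
    n = compose(noise, tgt_emb)
    return max(0.0, margin + sq_distance(a, b) - sq_distance(a, n))


# Hypothetical toy embeddings for two languages in a shared 2-d space.
en_emb = {"cat": [1.0, 0.0], "sleeps": [0.0, 1.0]}
de_emb = {
    "katze": [1.0, 0.0],
    "schlaeft": [0.0, 1.0],
    "hund": [-1.0, 0.0],
    "bellt": [0.0, -1.0],
}

aligned = hinge_loss(["cat", "sleeps"], ["katze", "schlaeft"],
                     ["hund", "bellt"], en_emb, de_emb)
swapped = hinge_loss(["cat", "sleeps"], ["hund", "bellt"],
                     ["katze", "schlaeft"], en_emb, de_emb)
print(aligned)  # 0.0: the aligned pair is already closer by more than the margin
print(swapped)  # 9.0: treating the unrelated sentence as aligned is penalised
```

In a real setup the embeddings would be learned by minimising such a loss over a parallel corpus, with the non-aligned sentences sampled from the same corpus; the sketch only evaluates the loss for fixed vectors.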

Publication status:: Published

Peer review status:: Peer reviewed

Cite

Cite this record

APA Style

Hermann, K., & Blunsom, P. (2014). Multilingual distributed representations without word alignment. International Conference on Learning Representations 2014.

MLA Style

Hermann, K., and P. Blunsom. “Multilingual Distributed Representations without Word Alignment.” International Conference on Learning Representations 2014, International Conference on Learning Representations, 2014.

Chicago Style

Hermann, K, and P Blunsom. 2014. “Multilingual Distributed Representations without Word Alignment.” International Conference on Learning Representations 2014.

Access Document

Files:: Blunsom and Hermann, Multilingual distributed representations ...

(Preview, Accepted manuscript, pdf, 263.2KB, Terms of use)

Authors

Hermann, K

Institution:: University of Oxford
Oxford college:: St Hugh's College
Role:: Author

Blunsom, P

Institution:: University of Oxford
Division:: MPLS
Department:: Computer Science
Role:: Author

Engineering and Physical Sciences Research Council

Grant:: EP/K036580/1

Xerox Foundation

Publisher:: International Conference on Learning Representations
Journal:: International Conference on Learning Representations 2014
Publication date:: 2014-04-01
Acceptance date:: 2014-02-24

Keywords:: cs.CL
Pubs id:: pubs:444098
UUID:: uuid:be8325b2-e0ad-4192-b84e-a4609b13f68e
Local pid:: pubs:444098
Source identifiers:: 444098
Deposit date:: 2016-10-16

Terms of use

Copyright date:: 2014

Licence:: Terms and Conditions of Use for Oxford University Research Archive
