Catalog Reference : ELRA-W0124
English-Vietnamese Parallel Corpus
This is a corpus of 500,000 English-Vietnamese sentence pairs, built for developing statistical machine translation (SMT) systems. The parallel corpus contains English documents translated into Vietnamese by professional translators. The source texts include books, dictionaries, newspapers, and online news collected between 2000 and 2007.
All Vietnamese sentences have been word-segmented and morphologically analyzed. The texts are provided in TEI format.
Period of coverage :
Version history :
Creation date :
Distribution medium :
Number of languages
Number of tokens :
500,000 sentence pairs
Member Prices
Academic - Commercial 6000.00 EUR
Academic - Research 600.00 EUR
Commercial - Commercial 6000.00 EUR
Commercial - Research 1200.00 EUR
Non Member Prices
Academic - Commercial 8000.00 EUR
Academic - Research 1000.00 EUR
Commercial - Commercial 8000.00 EUR
Commercial - Research 2000.00 EUR
Copyright © 2008