  • Catalog Reference : ELRA-W0124
    English-Vietnamese Parallel Corpus
    This corpus contains 500,000 English-Vietnamese sentence pairs and was built for developing statistical machine translation (SMT) systems. It consists of English documents translated into Vietnamese by professional translators. The source texts, collected between 2000 and 2007, include books, dictionaries, newspapers, and online news.
    All Vietnamese sentences have been word-segmented and morphologically analyzed. The texts are provided in TEI format.

    ISLRN : 838-483-738-912-8
    Period of coverage :
    Version : 1.0
    Version history :
    Creation date : 2000-2007
    Technical Information
    Distribution medium : Downloadable
    File format : Plain text
    Contents : written corpus
    Member Prices
    Academic - Commercial 6000.00 EUR
    Academic - Research 600.00 EUR
    Commercial - Commercial 6000.00 EUR
    Commercial - Research 1200.00 EUR
    Non Member Prices
    Academic - Commercial 8000.00 EUR
    Academic - Research 1000.00 EUR
    Commercial - Commercial 8000.00 EUR
    Commercial - Research 2000.00 EUR

    Copyright © 2008 ELRA