Python scripts for the construction of the LEXB parallel corpus of South Tyrolean legislation (IT-DE).

antcont/LEXB

LEXB corpus

The LEXB corpus is a bilingual (Italian-German) collection of South Tyrolean legislation.

The corpus is built in three versions:

  • LEXB_full: a full version of the corpus, annotated with contextual, structural and linguistic information.
  • LEXB_tm: a raw version of the corpus to be used as a translation memory.
  • LEXB_mt: a fully cleaned and filtered version of the corpus to be used for MT training and/or MT adaptation.
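
As a rough illustration of how a cleaned and filtered version such as LEXB_mt might differ from a raw one such as LEXB_tm, the sketch below applies a few common parallel-corpus filters to aligned Italian-German segment pairs. This is not the repository's actual pipeline: the function name `filter_for_mt`, the `max_ratio` threshold and the specific filtering rules (dropping empty, identical, duplicate and length-mismatched pairs) are all assumptions chosen for the example.

```python
from typing import Iterable, List, Tuple

Pair = Tuple[str, str]  # (Italian segment, German segment)


def filter_for_mt(pairs: Iterable[Pair], max_ratio: float = 3.0) -> List[Pair]:
    """Hypothetical cleaning pass over aligned segment pairs.

    Assumed rules (not taken from the LEXB scripts): drop pairs with an
    empty side, pairs whose two sides are identical, pairs whose
    character-length ratio exceeds max_ratio, and exact duplicates.
    """
    seen = set()
    kept: List[Pair] = []
    for it, de in pairs:
        # Collapse runs of whitespace so near-identical pairs compare equal.
        it, de = " ".join(it.split()), " ".join(de.split())
        if not it or not de:
            continue  # one side is empty
        if it == de:
            continue  # untranslated or language-neutral segment
        ratio = max(len(it), len(de)) / min(len(it), len(de))
        if ratio > max_ratio:
            continue  # likely misaligned
        if (it, de) in seen:
            continue  # exact duplicate
        seen.add((it, de))
        kept.append((it, de))
    return kept
```

A translation memory would typically keep segments such as repeated headings, which is why a filter of this kind would apply only to the MT-oriented version.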
