The LEXB corpus is a bilingual (Italian-German) collection of South Tyrolean legislation.
The corpus is built in three versions:
LEXB_full
: a full version of the corpus, annotated with contextual, structural and linguistic information.

LEXB_tm
: a raw version of the corpus to be used as a translation memory.

LEXB_mt
: a fully cleaned and filtered version of the corpus to be used for MT training and/or MT adaptation.