This project is part of the Multilingual Natural Language Processing exam (academic year 2023/2024) of the M.Sc. in Artificial Intelligence and Robotics.
The task is to design and implement a transformer-based model that performs Natural Language Inference (NLI) on a subset of the FEVER dataset and on an adversarial test set.
For a more detailed description of the proposed solution and the data augmentation pipeline, please refer to the MLNLP Adversarial Task Report.
The model is based on a fine-tuned DistilBERT model (the encoding head) followed by an MLP classifier. The task also requires augmenting the training data in order to perform better on the adversarial test set.
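As a rough illustration of the architecture described above, the following is a minimal sketch of a DistilBERT encoder feeding an MLP head, written with PyTorch and Hugging Face Transformers. The class name `NLIClassifier`, the hidden size, the dropout rate, the use of the first-token representation, and the choice of three output labels are all assumptions made for this example, not details taken from the project. A tiny randomly initialised encoder is used so the snippet runs without downloading weights.

```python
import torch
from torch import nn
from transformers import DistilBertConfig, DistilBertModel


class NLIClassifier(nn.Module):
    """DistilBERT encoder followed by a small MLP head (hypothetical sizes)."""

    def __init__(self, encoder, hidden_size=256, num_labels=3, dropout=0.1):
        super().__init__()
        self.encoder = encoder
        # Two-layer MLP mapping the pooled representation to class logits.
        self.classifier = nn.Sequential(
            nn.Linear(encoder.config.dim, hidden_size),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(hidden_size, num_labels),
        )

    def forward(self, input_ids, attention_mask):
        hidden = self.encoder(
            input_ids=input_ids, attention_mask=attention_mask
        ).last_hidden_state
        cls = hidden[:, 0]  # representation of the first ([CLS]) token
        return self.classifier(cls)


# Tiny random encoder so the sketch runs offline. For actual fine-tuning one
# would load pretrained weights instead, e.g. (checkpoint name is an assumption)
# DistilBertModel.from_pretrained("distilbert-base-uncased").
config = DistilBertConfig(
    vocab_size=100, dim=32, hidden_dim=64, n_layers=1, n_heads=2,
    max_position_embeddings=16,
)
model = NLIClassifier(DistilBertModel(config), hidden_size=16)
model.eval()

input_ids = torch.randint(0, 100, (2, 8))      # batch of 2 sequences, 8 tokens each
attention_mask = torch.ones_like(input_ids)    # no padding in this toy batch
with torch.no_grad():
    logits = model(input_ids, attention_mask)
print(logits.shape)  # torch.Size([2, 3])
```

In practice the premise and hypothesis would be tokenized as a sentence pair and the whole model trained with a cross-entropy loss over the logits.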
The data augmentation pipeline consists of two steps:
- Editing of premises and hypotheses through synonym substitution of adjectives, nouns, verbs, and adverbs;
- Generation of neutral hypotheses with a pretrained GPT-2 model.
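
The synonym substitution step can be sketched roughly as follows. This is a simplified illustration, not the project's implementation: the `SYNONYMS` table is a hypothetical stand-in for a real lexical resource, the input is assumed to be already tokenized and POS-tagged, and the function name `substitute_synonyms` and the probability parameter `p` are invented for the example.

```python
import random

# Hypothetical toy synonym table keyed by (lowercased token, coarse POS tag).
# A real pipeline would query a lexical resource instead.
SYNONYMS = {
    ("big", "ADJ"): ["large", "huge"],
    ("film", "NOUN"): ["movie"],
    ("directed", "VERB"): ["helmed"],
    ("quickly", "ADV"): ["rapidly"],
}

# Only the four word categories named in the pipeline are eligible.
EDITABLE_POS = {"ADJ", "NOUN", "VERB", "ADV"}


def substitute_synonyms(tagged_tokens, rng, p=1.0):
    """Replace each eligible token with a random synonym with probability p.

    tagged_tokens: list of (token, pos) pairs, assumed to come from a POS tagger.
    """
    out = []
    for token, pos in tagged_tokens:
        candidates = SYNONYMS.get((token.lower(), pos), [])
        if pos in EDITABLE_POS and candidates and rng.random() < p:
            out.append(rng.choice(candidates))
        else:
            out.append(token)  # keep tokens with no synonym or ineligible POS
    return out


premise = [("The", "DET"), ("film", "NOUN"), ("was", "AUX"),
           ("directed", "VERB"), ("quickly", "ADV")]
augmented = substitute_synonyms(premise, random.Random(0))
print(" ".join(augmented))  # The movie was helmed rapidly
```

Lowering `p` below 1.0 leaves some eligible words unchanged, which yields several distinct variants of the same premise or hypothesis from different random seeds.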