Welcome to the official repository for CLAP, an innovative architecture for AMR (Abstract Meaning Representation) parsing, presented at LREC-COLING 2024.
-
AMR Parsing and Generation: CLAP introduces a flexible and efficient AMR parsing architecture. It supports seamless transitions between different language models and facilitates multilingual adaptability.
-
Crosslingual AMR Alignment: Integration of the Crosslingual AMR Aligner enables extraction of span-to-node alignments from sentences to graphs, leveraging the model's cross-attention capabilities.
-
Perplexity Extraction: Incorporating the AMRs Assemble, CLAP can compute perplexity scores and supports training in assembly tasks.
If you use CLAP in your research, please cite our paper:
@inproceedings{martinez-lorenzo-navigli-2024-efficient-amr,
title = "Efficient {AMR} Parsing with {CLAP}: Compact Linearization with an Adaptable Parser",
author = "Martinez Lorenzo, Abelardo Carlos and Navigli, Roberto",
editor = "Calzolari, Nicoletta and Kan, Min-Yen and Hoste, Veronique and Lenci, Alessandro and Sakti, Sakriani and Xue, Nianwen",
booktitle = "Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)",
month = may,
year = "2024",
address = "Torino, Italia",
publisher = "ELRA and ICCL",
url = "https://aclanthology.org/2024.lrec-main.495",
pages = "5578--5584",
}
conf/
: Configuration files for data paths, model specifications, and training parameters.data/
: Datasets for benchmarking AMR evaluation.experiments/
: Stores checkpoints post-training.models/
: Trained Hugging Face models.src/
: Source code for the project.constant.py
: Manages tokens added to the model; customizable for new tokens.linearization.py
: Implements graph linearization in Depth-First Search and compact formats.pl_data_modules.py
: Data module classes for training.pl_modules.py
: Contains new modular components for the architecture.predict.py
: Script for making predictions using trained models.predict_alignment.py
: Script for extracting alignments.predict_perplexity.py
: Script for computing perplexity.train.py
: Entry point for training models.utils.py
: Utility functions for various operations.
# Create a Python 3.9 environment
conda create -n clap-env python=3.9
conda activate clap-env
# Install dependencies
pip install -r requirements.txt
Configure paths and hyperparameters in conf/ directory files:
- conf/data.yaml: Specify dataset paths for training and evaluation.
- conf/model.yaml: Define the model architecture, e.g., google/flan-t5-small.
- conf/train.yaml: Adjust training-specific hyperparameters.
python src/train.py
Set up the necessary paths in conf/data.yaml and conf/model.yaml. Then run:
python src/predict.py
Configure as per the prediction step and execute:
python src/predict_alignments.py
Configure as per the prediction step and execute:
python src/predict_perplexity.py
This project is released under the CC-BY-NC-SA 4.0 license (see LICENSE
). If you use AMRs-Assemble!, please reference the paper and put a link to this repo.
We welcome contributions to the Cross-lingual AMR Aligner project. If you have any ideas, bug fixes, or improvements, feel free to open an issue or submit a pull request.
For any questions or inquiries, please contact Abelardo Carlos Martínez Lorenzo at martineslorenzo@diag.uniroma.it