GitHub - Zhaoyi-Li21/compmctg_protocols: [ACL 2024] "Benchmarking and Improving Compositional Generalization of Multi-aspect Controllable Text Generation"

Implementation of 3-dimensional Testing Protocols (HoldOut, ACD and FewShot) for CompMCTG

This repo contains the implementation of the 3-dimensional Testing Protocols (HoldOut, ACD and FewShot) for CompMCTG. The main repo can be found here: https://github.com/tqzhong/CG4MCTG.

Datasets

1. Amazon Reviews

2 attributes: "sent", "topic"

"sent" $\in$ {"pos", "neg"}
"topic" $\in$ {"books", "clothing", "music", "electronics", "movies", "sports"}

2. FYelp(v.3)

4 attributes: "sentiment", "gender", "cuisine", "tense"

"sentiment" $\in$ {"Pos", "Neg"}
"gender" $\in$ {"Male", "Female"}
"cuisine" $\in$ {"Asian", "American", "Mexican", "Bar", "Dessert"}
"tense" $\in$ {"present", "past"}

Below is an example:

{
    "gender": "Male",
    "sentiment": "Pos",
    "cuisine": "Bar",
    "tense": "Present",
    "review": "love going here for happy hour or dinner ! great patio with fans to beat the stl heat ! also ... very accomodating at this location . i like the veal milanese but with mixed greens instead of pasta ! they 'll modify the menu to suit your taste !\n"
}
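
As a small illustration, a record like the one above can be parsed and reduced to its attribute combination, which is the unit the testing protocols operate on. This is only a sketch: the names `ATTRIBUTES` and `attribute_combination` are hypothetical and not part of this repo, and the review text is shortened.

```python
import json

# Hypothetical fixed ordering of the four FYelp attributes (an assumption).
ATTRIBUTES = ("sentiment", "gender", "cuisine", "tense")


def attribute_combination(record):
    """Return the attribute values of a record as a tuple, in ATTRIBUTES order."""
    return tuple(record[name] for name in ATTRIBUTES)


# A shortened copy of the example record, stored as a JSON string.
raw = (
    '{"gender": "Male", "sentiment": "Pos", "cuisine": "Bar", '
    '"tense": "Present", "review": "love going here for happy hour or dinner !"}'
)

record = json.loads(raw)
combo = attribute_combination(record)
print(combo)
```

Grouping records by this tuple gives one bucket per attribute combination, which can then be assigned to a training or testing set.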

3. Yelp

3 attributes: "sentiment", "pronoun", "tense"

"sentiment" $\in$ {"pos", "neg"}
"pronoun" $\in$ {"plural", "singular"}
"tense" $\in$ {"present", "past"}

4. Mixture(IMDB, OpeNER and Sentube)

2 attributes: "sentiment", "topic_cged"

"sentiment" $\in$ {"pos", "neg"}
"topic_cged" $\in$ {"imdb", "opener", "tablets", "auto"}

Usage: Constructing Training / Testing Sets (with HoldOut, ACD and FewShot Protocols)

For usage, see the example calls in test_load_dataset.py and the implementation in load_dataset.py.

Construction of the classifier data: training/dev/testing (70% : 15% : 15%)
Construction of the generator training data: HoldOut/MCD(max-avg-min)/FewShot(max-avg-min)
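
The two constructions above can be sketched roughly as follows. This is not the code in load_dataset.py or split_cls_gen.py: the names `ATTRIBUTE_SPACES`, `split_classifier_data` and `holdout_splits` are hypothetical, the attribute values are copied from the Datasets section, and the sketch assumes that a HoldOut split withholds exactly one attribute combination from generator training and tests on it. The ACD and FewShot splits are not covered here.

```python
import random
from itertools import product

# Attribute spaces as listed in the Datasets section (dataset keys are hypothetical).
ATTRIBUTE_SPACES = {
    "amazon": {
        "sent": ["pos", "neg"],
        "topic": ["books", "clothing", "music", "electronics", "movies", "sports"],
    },
    "fyelp": {
        "sentiment": ["Pos", "Neg"],
        "gender": ["Male", "Female"],
        "cuisine": ["Asian", "American", "Mexican", "Bar", "Dessert"],
        "tense": ["present", "past"],
    },
    "yelp": {
        "sentiment": ["pos", "neg"],
        "pronoun": ["plural", "singular"],
        "tense": ["present", "past"],
    },
    "mixture": {
        "sentiment": ["pos", "neg"],
        "topic_cged": ["imdb", "opener", "tablets", "auto"],
    },
}


def split_classifier_data(examples, seed=0):
    """Shuffle examples and split them 70% / 15% / 15% into train / dev / test."""
    items = list(examples)
    random.Random(seed).shuffle(items)
    n_train = len(items) * 70 // 100
    n_dev = len(items) * 15 // 100
    train = items[:n_train]
    dev = items[n_train:n_train + n_dev]
    test = items[n_train + n_dev:]  # the remainder goes to the test set
    return train, dev, test


def holdout_splits(attribute_values):
    """Yield (seen, unseen) combination lists, withholding one combination at a time."""
    names = list(attribute_values)
    combos = list(product(*(attribute_values[name] for name in names)))
    for held_out in combos:
        seen = [combo for combo in combos if combo != held_out]
        yield seen, [held_out]


train, dev, test = split_classifier_data(range(100))
print(len(train), len(dev), len(test))

for dataset, space in ATTRIBUTE_SPACES.items():
    splits = list(holdout_splits(space))
    print(dataset, "HoldOut splits:", len(splits))
```

Under this assumption, the number of HoldOut splits for a dataset equals its number of attribute combinations: 12 for Amazon Reviews, 40 for FYelp, 8 for Yelp and 8 for Mixture.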

Acknowledgement:

The implementation builds on Google Research's implementation of TMCD (https://github.com/google-research/language/tree/master/language/compgen/nqg, "Compositional Generalization and Natural Language Variation: Can a Semantic Parsing Approach Handle Both?", ACL 2021).

