
Code implementation for XLM-E #19

Open
aloka-fernando opened this issue Apr 27, 2023 · 2 comments

Comments

@aloka-fernando

I need to conduct pre-training using the pre-training objectives of XLM-E. Could you please share the pre-processing command and the pre-training command? Thank you.

@CZWin32768
Owner

CZWin32768 commented May 8, 2023

Thank you for your comments and interest in our work. As this work was done during my internship at Microsoft, I am not able to share the code without authorization due to company policy. Nonetheless, I am happy to answer specific questions about pre-processing or pre-training.

@aloka-fernando
Author

Thanks a lot for your response. Do you think I could recreate your work starting from ELECTRA (Clark et al., 2020), or is there anyone I could contact about obtaining the codebase?

I would also like to confirm the differences between XLM-E and ELECTRA. Other than XLM-E being multilingual and using parallel data with the Translation Replaced Token Detection (TRTD) objective, what are the major differences?

In my research I am using parallel data to improve the cross-lingual representations of multilingual pre-trained models, focusing on low-resource languages. In what ways do you think the work in the XLM-E paper could be extended while still using parallel data?
Thank you.
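
For context on the TRTD question above: below is a minimal, illustrative sketch of an ELECTRA-style replaced token detection (RTD) loss, extended to a concatenated translation pair as in TRTD. The `generator` and `discriminator` callables, the masking scheme, and the token IDs are assumptions for illustration only, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def rtd_loss(generator, discriminator, input_ids, mask_prob=0.15, mask_token_id=4):
    """ELECTRA-style replaced token detection loss (illustrative sketch)."""
    # 1) Randomly select positions to corrupt and replace them with [MASK].
    mask = torch.rand(input_ids.shape, device=input_ids.device) < mask_prob
    masked = input_ids.masked_fill(mask, mask_token_id)

    # 2) A small generator (masked LM) proposes replacement tokens; sample them.
    with torch.no_grad():
        gen_logits = generator(masked)                        # [batch, seq, vocab]
        sampled = torch.distributions.Categorical(logits=gen_logits).sample()
    corrupted = torch.where(mask, sampled, input_ids)

    # 3) The discriminator predicts, per token, whether it was replaced.
    is_replaced = (corrupted != input_ids).float()
    disc_logits = discriminator(corrupted).squeeze(-1)        # [batch, seq]
    return F.binary_cross_entropy_with_logits(disc_logits, is_replaced)

# For TRTD, the same loss would be computed over a translation pair concatenated
# into a single sequence, e.g. pair_ids = torch.cat([src_ids, tgt_ids], dim=1),
# so the discriminator can use the other language as context when detecting
# replaced tokens.
```
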
