Implementing WSD using HMM-Viterbi and Word Embedding Overlap

Platform : Python Jupyter, Google Colab or VS Code.
VS Code (version 1.59.0) was used to run the code.

Common libraries used in the implementation of WSD

nltk (version 3.2.5)
numpy (version 1.19.5)
pandas (version 1.1.5)
pickle (version 4.0)
regex (version 2019.12.20)
seaborn (version 0.11.1)
sklearn (version 0.24.2)
matplotlib (version 3.2.2)
wordnet (version 3.0)

Additional library used in bigram based implementation

ipywidgets (version 7.6.3)

Dataset required

Run the following command in terminal to download GoogleNews dataset in the models folder we have made:
wget -c "https://s3.amazonaws.com/dl4j-distribution/GoogleNews-vectors-negative300.bin.gz"

Running the code

Running HMM_WSD_POS_Bigram.ipynb and HMM_WSD_POS_Trigram.ipynb

"word_synset_tuple" use this function to transform your data.

The section "Train function" contains the function to train the model given the training data.

The "Test function" section contains the function which tests the model using the generated transmission and emission probabilities.

Run all the cells to generate overall metrics of the model.

Both the files HMM_WSD_POS_Bigram.ipynb and HMM_WSD_POS_Trigram.ipynb use POS tags to generate the output with bigram and trigram assumption respectively.

Running Overlap_WSD.ipynb

Word Embedding Overlap method supports Lesk and Extended Lesk algorithm

word2vec requires the following file: GoogleNews-vectors-negative300.bin.gz

Run all the cells.

Error Analysis has been added in the code and the slides

Caution:

Run all cells for functions to be defined, Note and Cautions are mentioned in the Notebook as well

References

https://www.fi.muni.cz/tsd2002/papers/92_Antonio_Molina.pdf

http://csci572.com/papers/SpeechLanguageProc.pdf

https://nlp.stanford.edu/~wcmac/papers/20050421-smoothing-tutorial.pdf

https://web.stanford.edu/~jurafsky/slp3/3.pdf

https://aclanthology.org/E17-1010.pdf

http://nlpprogress.com/english/word_sense_disambiguation.html

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
HMM_WSD		HMM_WSD
Overlap_WSD		Overlap_WSD
PPT		PPT
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Implementing WSD using HMM-Viterbi and Word Embedding Overlap

Common libraries used in the implementation of WSD

Additional library used in bigram based implementation

Dataset required

Running the code

Running HMM_WSD_POS_Bigram.ipynb and HMM_WSD_POS_Trigram.ipynb

Running Overlap_WSD.ipynb

Word Embedding Overlap method supports Lesk and Extended Lesk algorithm

Caution:

References

About

Releases

Packages

Languages

piyushsinghpasi/Word-Sense-Disambiguation

Folders and files

Latest commit

History

Repository files navigation

Implementing WSD using HMM-Viterbi and Word Embedding Overlap

Common libraries used in the implementation of WSD

Additional library used in bigram based implementation

Dataset required

Running the code

Running HMM_WSD_POS_Bigram.ipynb and HMM_WSD_POS_Trigram.ipynb

Running Overlap_WSD.ipynb

Word Embedding Overlap method supports Lesk and Extended Lesk algorithm

Caution:

References

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages