Learn Chinese word representations using subword and subcharacter information
Updated May 20, 2020 - Python
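As a rough illustration of what "subword and subcharacter information" can mean for Chinese, the sketch below extracts character n-grams from a word and splits each character into components. It is not taken from the repository above: the function names and the tiny `COMPONENTS` table are hypothetical, and a real system would load component or stroke data from a proper dictionary.

```python
def char_ngrams(word, n_min=1, n_max=2):
    """Return all character n-grams of `word` with n_min <= n <= n_max."""
    return [
        word[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(word) - n + 1)
    ]


# Hypothetical, hand-written component table for illustration only.
# A real resource would cover the whole character inventory.
COMPONENTS = {
    "好": ["女", "子"],
    "明": ["日", "月"],
}


def subchar_units(word, table=COMPONENTS):
    """Split each character into components; unknown characters stay whole."""
    units = []
    for ch in word:
        units.extend(table.get(ch, [ch]))
    return units


def features(word):
    """Combine subword (n-gram) and subcharacter (component) units."""
    return char_ngrams(word) + subchar_units(word)
```

A model in this family would typically learn a vector for each such unit and compose them into the word vector; how the units are combined is a design choice that this sketch leaves open.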
Source code for the paper "Incorporating prior knowledge into word embedding for Chinese word similarity measurement", accepted by ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP).
Use PTT and Chinese Wiki corpora to build count-based and prediction-based word embeddings.
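To make the count-based side concrete, here is a minimal sketch that builds sparse PPMI (positive pointwise mutual information) vectors from co-occurrence counts, using only the standard library. It assumes the corpus has already been segmented into lists of tokens. The function names, the window size, and the choice of PPMI weighting are assumptions for illustration, not the repository's actual pipeline. Prediction-based embeddings (for example word2vec-style models) would normally be trained with a dedicated library instead.

```python
import math
from collections import Counter


def cooccurrence(sentences, window=2):
    """Count symmetric (word, context) pairs within `window` tokens."""
    pairs = Counter()
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    pairs[(word, tokens[j])] += 1
    return pairs


def ppmi_vectors(pairs):
    """Turn pair counts into sparse PPMI vectors: {word: {context: weight}}."""
    total = sum(pairs.values())
    word_totals = Counter()
    ctx_totals = Counter()
    for (word, ctx), n in pairs.items():
        word_totals[word] += n
        ctx_totals[ctx] += n
    vectors = {}
    for (word, ctx), n in pairs.items():
        pmi = math.log2(n * total / (word_totals[word] * ctx_totals[ctx]))
        if pmi > 0:  # keep only positive associations
            vectors.setdefault(word, {})[ctx] = pmi
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(x * v.get(k, 0.0) for k, x in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0
```

With a segmented corpus in `sentences`, `ppmi_vectors(cooccurrence(sentences))` gives one sparse vector per word, and `cosine` can then score word similarity. On large corpora these vectors are usually reduced to a dense, lower-dimensional form (for example with truncated SVD), which this sketch omits.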