English Part-of-speech (POS) tagger
Perform part-of-speech tagging of English sentences using wink-pos-tagger. It is based on the transformation-based learning (TBL) approach pioneered by Eric Brill.
Optimized for performance, it pos-tags and lemmatizes over 525,000 tokens per second with an accuracy of 93.2% on the standard WSJ22-24 test set. This was benchmarked on a 2.2 GHz Intel Core i7 machine with 16 GB RAM using its tagRawTokens() API.
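To give a feel for the transformation-based idea mentioned above, here is a toy sketch: every word first receives its most frequent tag from a lexicon, and contextual rules then correct the initial guesses. The lexicon, the single rule, and the `toyTag()` function are hypothetical and exist only for illustration; they are not the rules or code used by wink-pos-tagger.

```javascript
// Toy illustration of transformation-based tagging.
// NOT wink-pos-tagger's implementation; all names and data here are hypothetical.

// Hypothetical mini-lexicon: each word mapped to its single most frequent tag.
const toyLexicon = new Map( [
  [ 'he', 'PRP' ], [ 'is', 'VBZ' ], [ 'trying', 'VBG' ], [ 'to', 'TO' ],
  [ 'fish', 'NN' ], [ 'for', 'IN' ], [ 'in', 'IN' ], [ 'the', 'DT' ], [ 'lake', 'NN' ]
] );

// Hypothetical contextual rule: change `from` to `to` when the previous tag is `prev`.
const toyRules = [ { from: 'NN', to: 'VB', prev: 'TO' } ];

function toyTag( words ) {
  // Step 1: initial guess from the lexicon; unknown words default to NN.
  const tags = words.map( ( w ) => toyLexicon.get( w.toLowerCase() ) || 'NN' );
  // Step 2: apply each rule in order, scanning the sentence left to right.
  toyRules.forEach( ( rule ) => {
    for ( let i = 1; i < tags.length; i += 1 ) {
      if ( tags[ i ] === rule.from && tags[ i - 1 ] === rule.prev ) tags[ i ] = rule.to;
    }
  } );
  return words.map( ( w, i ) => ( { value: w, pos: tags[ i ] } ) );
}

const toyResult = toyTag( [ 'He', 'is', 'trying', 'to', 'fish', 'for', 'fish', 'in', 'the', 'lake' ] );
console.log( toyResult.map( ( t ) => t.value + '/' + t.pos ).join( ' ' ) );
```

In this sketch the first "fish" starts out as NN and is rewritten to VB because it follows a TO tag, while the second "fish" follows IN and stays NN.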
Use npm to install:
npm install wink-pos-tagger --save
The code below illustrates the steps required to POS-tag a sentence:
// Load wink-pos-tagger.
var posTagger = require( 'wink-pos-tagger' );
// Create an instance of the pos tagger.
var tagger = posTagger();
// Tag the sentence using the tagSentence() API.
tagger.tagSentence( 'He is trying to fish for fish in the lake.' );
// -> [ { value: 'He', tag: 'word', normal: 'he', pos: 'PRP' },
// { value: 'is', tag: 'word', normal: 'is', pos: 'VBZ', lemma: 'be' },
// { value: 'trying', tag: 'word', normal: 'trying', pos: 'VBG', lemma: 'try' },
// { value: 'to', tag: 'word', normal: 'to', pos: 'TO' },
// { value: 'fish', tag: 'word', normal: 'fish', pos: 'VB', lemma: 'fish' },
// { value: 'for', tag: 'word', normal: 'for', pos: 'IN' },
// { value: 'fish', tag: 'word', normal: 'fish', pos: 'NN', lemma: 'fish' },
// { value: 'in', tag: 'word', normal: 'in', pos: 'IN' },
// { value: 'the', tag: 'word', normal: 'the', pos: 'DT' },
// { value: 'lake', tag: 'word', normal: 'lake', pos: 'NN', lemma: 'lake' },
// { value: '.', tag: 'punctuation', normal: '.', pos: '.' } ]
Notice the way instances of the word "fish" have been tagged as verb and noun.
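Because each token is a plain object, the tagged output can be post-processed with ordinary array methods. The following is a minimal sketch, assuming tokens shaped like the output above; `lemmasByPosPrefix()` is a hypothetical helper written for this example and is not part of the wink-pos-tagger API. The token array is copied from the output above so that the snippet runs without the package installed.

```javascript
// Tokens copied from the tagSentence() output shown above.
const taggedTokens = [
  { value: 'He', tag: 'word', normal: 'he', pos: 'PRP' },
  { value: 'is', tag: 'word', normal: 'is', pos: 'VBZ', lemma: 'be' },
  { value: 'trying', tag: 'word', normal: 'trying', pos: 'VBG', lemma: 'try' },
  { value: 'to', tag: 'word', normal: 'to', pos: 'TO' },
  { value: 'fish', tag: 'word', normal: 'fish', pos: 'VB', lemma: 'fish' },
  { value: 'for', tag: 'word', normal: 'for', pos: 'IN' },
  { value: 'fish', tag: 'word', normal: 'fish', pos: 'NN', lemma: 'fish' },
  { value: 'in', tag: 'word', normal: 'in', pos: 'IN' },
  { value: 'the', tag: 'word', normal: 'the', pos: 'DT' },
  { value: 'lake', tag: 'word', normal: 'lake', pos: 'NN', lemma: 'lake' },
  { value: '.', tag: 'punctuation', normal: '.', pos: '.' }
];

// Hypothetical helper: collect lemmas of tokens whose pos starts with `prefix`,
// falling back to the normalized form when a token has no lemma.
function lemmasByPosPrefix( tokens, prefix ) {
  return tokens
    .filter( ( t ) => t.pos.startsWith( prefix ) )
    .map( ( t ) => t.lemma || t.normal );
}

const verbLemmas = lemmasByPosPrefix( taggedTokens, 'VB' );
const nounLemmas = lemmasByPosPrefix( taggedTokens, 'NN' );
console.log( verbLemmas ); // -> [ 'be', 'try', 'fish' ]
console.log( nounLemmas ); // -> [ 'fish', 'lake' ]
```

Matching on a prefix groups related tags together, so 'VB' covers VB, VBZ and VBG here, and 'NN' would also cover plural or proper noun tags if they were present.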
Check out the pos tagger API documentation to learn more.
If you spot a bug that has not yet been reported, raise a new issue or consider fixing it and sending a pull request.
Wink is a family of open source packages for Statistical Analysis, Natural Language Processing and Machine Learning in Node.js. The code is thoroughly documented for easy human comprehension and has test coverage of ~100%, making it reliable enough to build production-grade solutions.
wink-pos-tagger is copyright 2017-19 GRAYPE Systems Private Limited.
It is licensed under the terms of the MIT License.