Skip to content

To perform context sensitive spelling correction. The program is to detect spelling errors that result in valid words and suggest their suitable alternatives

Notifications You must be signed in to change notification settings

yatishb/disambiguafier

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

14 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

disambiguafier

To perform context sensitive spelling correction. The program is to detect spelling errors that result in valid words and suggest their suitable alternatives

Specifically, I will implement a program to determine whether a word w in a given confusion set { w1, w2 } should be disambiguated as w1 or w2. The task is formulated as a supervised learning task from labeled training sentences. I will try to implement a Bayesian classifier for context-sensitive spelling correction, making use of the naïve Bayes assumption. The features used include:

(a) Surrounding words: Each word that appears in the sentence containing the confusable word w is a feature. All surrounding words are converted to lowercase, and stop words and punctuation symbols are removed. (b) Collocations: A collocation Ci,j is an ordered sequence of words in the local, narrow context of the confusable word w. Offsets i and j denote the starting and ending positions (relative to w) of the sequence, where a negative (positive) offset refers to a word to its left (right). Each collocation string spanning positions i to j is a feature. You get to decide on the list of collocations (i.e., what pairs of offsets) to use.

About

To perform context sensitive spelling correction. The program is to detect spelling errors that result in valid words and suggest their suitable alternatives

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages