disambiguafier

To perform context sensitive spelling correction. The program is to detect spelling errors that result in valid words and suggest their suitable alternatives

Specifically, I will implement a program to determine whether a word w in a given confusion set { w1, w2 } should be disambiguated as w1 or w2. The task is formulated as a supervised learning task from labeled training sentences. I will try to implement a Bayesian classifier for context-sensitive spelling correction, making use of the naïve Bayes assumption. The features used include:

(a) Surrounding words: Each word that appears in the sentence containing the confusable word w is a feature. All surrounding words are converted to lowercase, and stop words and punctuation symbols are removed. (b) Collocations: A collocation Ci,j is an ordered sequence of words in the local, narrow context of the confusable word w. Offsets i and j denote the starting and ending positions (relative to w) of the sequence, where a negative (positive) offset refers to a word to its left (right). Each collocation string spanning positions i to j is a feature. You get to decide on the list of collocations (i.e., what pairs of offsets) to use.

Name		Name	Last commit message	Last commit date
Latest commit History 14 Commits
.settings		.settings
bin		bin
src		src
test files		test files
.classpath		.classpath
.project		.project
Bathini Yatish A0091545A.docx		Bathini Yatish A0091545A.docx
README.md		README.md
Task.pdf		Task.pdf

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

disambiguafier

About

Releases

Packages

Languages

yatishb/disambiguafier

Folders and files

Latest commit

History

Repository files navigation

disambiguafier

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages