Society of African Journal Editors

Journal of Computer Science and Its Application

Naijaspell: spellchecking for Nigerian pidgin

Authors: F Tanshi, T Adegbola

Current state-of-the-art spellchecking techniques rely on an efficiently stored list of correctly spelt words in a language, against which wrongly spelt words are checked. However, Nigerian Pidgin does not have the compiled list of proofed spellings that these techniques require. As a result, people generally write Nigerian Pidgin using different spelling styles, so the same word may be spelt inconsistently from one occurrence to the next. To solve this problem, which also holds for many other resource-scarce languages, this paper presents a machine learning approach to spellchecking that does not require an existing word list. In this approach, the correct spelling of a word is learnt from the relative frequencies of the various renditions of that word in a document. That is, the technique flags spelling errors using only the words within the document being edited.

Keywords: Edit distance, Orthography standardisation, Spellchecking, Unigram Probabilities
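The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`edit_distance`, `flag_spellings`) and the thresholds (`max_distance`, `min_ratio`, `min_length`) are assumptions chosen for illustration. The sketch counts how often each word form occurs in a document, treats forms within a small edit distance of each other as renditions of the same word, and flags a rare rendition when a clearly more frequent one exists.

```python
import re
from collections import Counter


def edit_distance(a, b):
    """Levenshtein distance between two strings (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def flag_spellings(text, max_distance=1, min_ratio=2.0, min_length=4):
    """Map each suspect word to the more frequent rendition found in `text`.

    All thresholds are illustrative assumptions, not values from the paper.
    Within one document, comparing raw counts is equivalent to comparing
    unigram probabilities, because every count shares the same total.
    """
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    suggestions = {}
    for word, count in counts.items():
        if len(word) < min_length:
            continue  # very short words have too many close neighbours
        best = None
        for other, other_count in counts.items():
            if other == word or other_count < min_ratio * count:
                continue  # only consider clearly more frequent renditions
            if edit_distance(word, other) > max_distance:
                continue  # too different to be a rendition of the same word
            if best is None or other_count > counts[best]:
                best = other
        if best is not None:
            suggestions[word] = best
    return suggestions


sample = "wetin dey happen. wetin you dey do. wetin be dat. weting dey"
print(flag_spellings(sample))
```

In this sample, `weting` occurs once and `wetin` three times at an edit distance of 1, so `weting` is flagged with `wetin` as the suggested spelling. A sketch like this will also flag genuinely different words that happen to be close in spelling, which is why the thresholds would need tuning on real text.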