Journal & Conference Proceeding Publications



ID Code : CSC 0044
Title : Enhanced Rules Application Order Approach to Stem Reduplication Words in Malay Texts
Author/s : Mohamad Nizam Kassim; Mohd Azaini Maarof [UTM]; Anazida Zainal [UTM]

Abstract : A word stemming algorithm is a natural language morphological process that reduces derived words to their respective root words. Because of the importance of word stemming, many Malay word stemming algorithms have been developed in past years. However, previous researchers focused only on improving affixation word stemming through various stemming approaches. No reduplication word stemming algorithm has been developed for the Malay language thus far. In Malay, affixation and reduplication produce derived words that follow their own morphological rules. Therefore, using affixation word stemming to stem reduplication words is considered inappropriate. This paper presents a proposed reduplication word stemming algorithm that stems full, rhythmic and partial reduplication words to their respective root words. The proposed algorithm uses Rules Application Order with a Stemming Errors Reducer to stem these reduplication words. Malay online newspaper articles were used to evaluate the proposed algorithm. The experimental results showed that the proposed algorithm is able to stem full, rhythmic, affixed and partial reduplication words with better stemming accuracy. Hence, future improvements to Malay word stemming algorithms should cover both affixation and reduplication word stemming.
Publication :

Advances in Intelligent Systems and Computing

Year Published : 2014 | Volume 287 | Pages 657-665
PDF / Official URL : http://link.springer.com/chapter/10.1007/978-3-319-07692-8_62