SMC/SoC/2008

Revision as of 15:25, 3 January 2008 by Missing actor (talk) (Ideas for Google Summer of Code 2008)

Ideas for Google Summer of Code 2008

Tokenizer/Lemmatiser for Malayalam for GATE

Write a lemmatiser for Malayalam, and investigate whether it can be packaged as a GATE plugin for Malayalam; that would help NLP researchers a lot. To get started, search for GATE, download and install it, and look in the plugins directory, where a Hindi tokenizer and lemmatiser are available.
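To make the tokenizer half of this idea concrete, here is a minimal sketch of the tokenization step in Python. It is not GATE's API (GATE plugins are written in Java) and it is not taken from the Hindi plugin. The names `TOKEN_RE` and `tokenize` are hypothetical, and the only assumption about the language is that Malayalam text uses the Unicode block U+0D00 to U+0D7F, with ZWNJ/ZWJ possibly appearing inside words.

```python
import re

# Assumption: Malayalam words are runs of characters from the Unicode
# Malayalam block (U+0D00..U+0D7F). ZWNJ (U+200C) and ZWJ (U+200D) can
# occur inside words, so they are kept as part of the run.
TOKEN_RE = re.compile(
    r"[\u0D00-\u0D7F\u200C\u200D]+"  # a run of Malayalam characters
    r"|[A-Za-z0-9]+"                 # a run of ASCII letters or digits
    r"|[^\s]"                        # any other single non-space character
)

def tokenize(text):
    """Split text into Malayalam words, ASCII words, and punctuation marks."""
    return TOKEN_RE.findall(text)

if __name__ == "__main__":
    for token in tokenize("മലയാളം ഭാഷ."):
        print(token)
```

A lemmatiser would then map each word token to its root form; that step needs Malayalam-specific morphological rules and is not attempted here.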

Functional Optical Character Recognition system

Add Malayalam support to Tesseract OCR. The stages and objectives still need to be defined clearly.