SMC/SoC/2008: Difference between revisions
No edit summary |
| Line 5: | Line 5: | ||
==Ideas for Google Summer of Code 2008== | ==Ideas for Google Summer of Code 2008== | ||
===Tokenizer/Lemmatiser for Malayalam for GATE=== | ===Tokenizer/Lemmatiser for Malayalam for GATE=== | ||
Write a lemmatiser for Malayalam. Investigate whether it can be built as a GATE plugin for Malayalam, which would help NLP researchers a lot. | Write a lemmatiser for Malayalam. Investigate whether it can be built as a GATE plugin for Malayalam, which would help NLP researchers a lot. To get started, search for GATE, then download and install it; a Hindi tokenizer and lemmatiser is available in its plugins directory. | ||
===Functional Optical Character Recognition system=== | ===Functional Optical Character Recognition system=== | ||
Add Malayalam support for Tesseract OCR. | Add Malayalam support for Tesseract OCR. | ||