SMC/SoC/2008
Revision as of 16:45, 5 March 2008
Participation of SMC in GSoC 2008 is not yet confirmed. Use this page to collect project ideas.
Ideas for Google Summer of Code 2008
Tokenizer/Lemmatiser for Malayalam for GATE
Write a lemmatiser for Malayalam, and see whether it can be packaged as a GATE plugin for Malayalam, which would help NLP researchers a lot. To get started, search for GATE, then download and install it; a Hindi tokenizer and lemmatiser is available in its plugins directory.
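As a starting point for the tokenizer half of this idea, here is a minimal sketch in Python. It is not GATE code and does not use the GATE API; it only illustrates splitting mixed text into Malayalam words, other words, numbers, and punctuation. It relies on the Malayalam Unicode block (U+0D00 to U+0D7F) and treats the zero-width joiner and non-joiner as part of a word. The names `tokenize` and `TOKEN_RE` and the token kinds are made up for this illustration. A real lemmatiser would additionally need Malayalam morphological rules, which are not attempted here.

```python
import re

# Assumption: a "Malayalam word" is a run of code points from the Malayalam
# Unicode block (U+0D00-U+0D7F), plus ZWNJ/ZWJ (U+200C/U+200D), which can
# occur inside Malayalam words.
MALAYALAM_WORD = r"[\u0D00-\u0D7F\u200C\u200D]+"

# Alternatives are tried left to right, so Malayalam runs win over the
# generic classes. The final \S catches any leftover non-space character.
TOKEN_RE = re.compile(
    rf"(?P<ml>{MALAYALAM_WORD})"
    r"|(?P<num>\d+)"
    r"|(?P<other>[^\W\d_]+)"
    r"|(?P<punct>\S)"
)


def tokenize(text):
    """Return a list of (kind, token) pairs; whitespace is dropped."""
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(text)]


if __name__ == "__main__":
    for kind, token in tokenize("മലയാളം GATE 2008."):
        print(kind, token)
```

Tagging each token with its kind keeps the sketch close to what an annotation framework expects: a later stage (for example a lemmatiser) can process only the `ml` tokens and pass the rest through unchanged.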
Functional Optical Character Recognition system
Add Malayalam support to Tesseract OCR.
- Study the Tesseract OCR system
- Recognition of all characters
- Layout recognition using OCRopus (optional?)
http://code.google.com/p/tesseract-ocr/ http://code.google.com/p/ocropus/
Write a GNOME Speech Driver for Dhvani and Integrate it with Orca
- Orca, the screen reader for visually impaired users, uses GNOME Speech to talk to speech engines. Festival, eSpeak, FreeTTS and others already have GNOME Speech drivers; we need to write one for Dhvani.
- Develop plugins for KTTS/Gedit/Firefox
Rewrite the Dhvani sound system with SDL
- Replace Dhvani's ALSA sound output with SDL to make it a cross-platform application
- Packaging for different platforms
- Bug fixes for language modules and code cleanup
- Adding pitch/volume/pause support for the generated speech
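To make the volume and pause items above more concrete, here is a rough sketch in Python of what they could mean at the sample level. It assumes the generated speech is available as signed 16-bit PCM samples; that format, the sample rate, and the function names `scale_volume` and `insert_pause` are assumptions for illustration, not a description of Dhvani's actual code. Pitch control is a harder signal-processing problem and is not covered.

```python
from array import array

INT16_MIN, INT16_MAX = -32768, 32767


def scale_volume(samples, gain):
    """Return a copy of signed 16-bit samples multiplied by `gain`.

    Values are clamped to the int16 range so loud passages clip instead of
    wrapping around.
    """
    return array(
        "h",
        (max(INT16_MIN, min(INT16_MAX, round(s * gain))) for s in samples),
    )


def insert_pause(samples, position, seconds, sample_rate):
    """Return a copy with `seconds` of silence inserted before `position`."""
    silence = array("h", [0] * int(seconds * sample_rate))
    return samples[:position] + silence + samples[position:]


if __name__ == "__main__":
    speech = array("h", [1000, -1000, 30000])
    print(list(scale_volume(speech, 2.0)))
    print(list(insert_pause(speech, 1, 0.5, 4)))
```

Both functions return new arrays rather than modifying their input, so the same generated speech can be reused with different settings.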
Localization of Free Content Management Systems to Malayalam: Drupal/Joomla
Complete (100%) localization of the Drupal and Joomla content management systems into Malayalam
How to Apply
See http://code.google.com/soc/2008/faqs.html