SMC/SoC/2008
SMC in Google Summer of Code 2007
Santhosh Thottingal will be the lead admin this year and will handle the administrative work with Google for SMC to be a mentoring organisation.
Ideas for Google Summer of Code 2008
Tokenizer/Lemmatiser for Malayalam for GATE
Write a lemmatiser for Malayalam, and see whether it can be packaged as a GATE plugin for Malayalam, which would help NLP researchers a lot. To get started, search for GATE, then download and install it; a Hindi tokenizer and lemmatiser is available in its plugins directory.
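As a starting point for the tokenizer half of this idea, here is a minimal sketch in Python (not a GATE plugin, and not taken from the Hindi plugin). It assumes that a token is either a run of characters from the Malayalam Unicode block (U+0D00 to U+0D7F, plus the zero-width joiners that occur inside words), a run of Latin letters, a run of digits, or a single punctuation character. The names `TOKEN_RE` and `tokenize` are made up for illustration. Lemmatisation itself needs linguistic resources and is not attempted here.

```python
import re

# Assumed token classes (illustrative only):
#  1. Malayalam word: characters from U+0D00..U+0D7F, with ZWNJ/ZWJ
#     (U+200C/U+200D), which can appear inside Malayalam words.
#  2. Run of Latin letters.
#  3. Run of digits.
#  4. Any other single non-space character (punctuation, symbols).
TOKEN_RE = re.compile(
    r"[\u0D00-\u0D7F\u200C\u200D]+"
    r"|[A-Za-z]+"
    r"|\d+"
    r"|\S"
)


def tokenize(text):
    """Split text into word, number and punctuation tokens."""
    return TOKEN_RE.findall(text)


if __name__ == "__main__":
    print(tokenize("മലയാളം ഭാഷ, 2008."))
```

A real GATE plugin would be written against GATE's own plugin interfaces; the regular expression above only illustrates the character-level rules such a tokenizer might start from.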
Functional Optical Character Recognition system
Add Malayalam support to the Tesseract OCR engine.
- Study the Tesseract OCR system
- Recognition of all characters
- Layout recognition using OCRopus (optional?)
http://code.google.com/p/tesseract-ocr/
http://code.google.com/p/ocropus/
Write a GNOME Speech Driver for Dhvani and Integrate it with Orca
- Orca, the screen reader for visually impaired users, uses gnome-speech to talk to speech engines. Festival, eSpeak, FreeTTS and others currently have gnome-speech drivers. We need to write a driver for Dhvani.
- Develop plugins for KTTS/Gedit/Firefox
Write a Dhvani Interface for Speech Dispatcher
The goal of the Speech Dispatcher project is to provide a high-level, device-independent layer for speech synthesis through a simple, stable and well-documented interface. Since Speech Dispatcher is being discussed as a unified TTS layer for both GNOME and KDE, we can try to write an interface for it.
Rewrite the Dhvani sound system with SDL and Additional APIs
- Rewrite the ALSA sound system of Dhvani with SDL to make it a cross-platform application
- Packaging for different platforms
- Bug fixes for language modules and code clean-up
- Adding pitch/volume/pause support for the generated speech
- API to stop the speech in between a synthesis
- Provide Dhvani as a library
- API to check whether the synthesizer is producing speech (isSpeaking)
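To make the requested control calls concrete, here is a rough Python sketch of how stop, isSpeaking and the pitch/volume/pause settings could fit together. It is not Dhvani's actual API: the class `SpeechSession` and all of its methods and attributes are hypothetical, and the "synthesis" is simulated by appending words to a list from a background thread.

```python
import threading


class SpeechSession:
    """Hypothetical control surface; not Dhvani's real interface."""

    def __init__(self, pitch=1.0, volume=1.0, pause_s=0.0):
        # Stored only; a real backend would apply these to the audio.
        self.pitch = pitch
        self.volume = volume
        self.pause_s = pause_s      # pause between chunks, in seconds
        self.spoken = []            # stand-in for audio already played
        self._stop = threading.Event()
        self._thread = None

    def speak(self, text):
        """Start speaking in the background; returns immediately."""
        self.stop()                 # cancel any utterance in progress
        self._stop.clear()
        self.spoken = []
        self._thread = threading.Thread(
            target=self._run, args=(text.split(),), daemon=True
        )
        self._thread.start()

    def _run(self, words):
        for word in words:
            if self._stop.is_set():
                return
            self.spoken.append(word)        # "synthesise" one chunk
            self._stop.wait(self.pause_s)   # interruptible pause

    def is_speaking(self):
        """True while the background synthesis thread is running."""
        return self._thread is not None and self._thread.is_alive()

    def stop(self):
        """Interrupt the current utterance and wait for it to end."""
        self._stop.set()
        if self._thread is not None:
            self._thread.join()
```

The point of the sketch is the shape of the interface: speaking runs asynchronously, so the caller (for example a screen reader) can poll `is_speaking()` or call `stop()` in the middle of an utterance. A library version of Dhvani would expose equivalent calls from C.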
Localization of Free Content Management Systems to Malayalam: Drupal & Joomla
100% localization of the Drupal and Joomla content management systems to Malayalam
Speech recognition system for Malayalam
Develop a speech recognition system for Malayalam using the concepts of the memory-prediction framework
Creating a new family of Equal Height Fonts (EHF) for the Malayalam language
The aim is to design and create a new family of Equal Height Fonts for the traditional Malayalam script. Following Roman typography, serif and sans-serif font variations are already available in Malayalam. Equal-width fonts such as Courier, available in Roman typography, are impossible for Malayalam characters and are also unnecessary. The proposed Equal Height Fonts are a new concept in the history of font making, intended to surmount the typographical challenge of vertically stacked conjuncts.
How to Apply
Selection procedure
http://code.google.com/soc/2008/faqs.html