SMC/SoC/2008: Difference between revisions
Swathantra Malayalam Corpus |
| Line 11: | Line 11: | ||
Details: | Details: | ||
* The project needs an annotated image and speech corpus to support FOSS-driven research and development on speech and images. | |||
* The corpus should be able to serve as standard training and test data for R&D activities. | |||
* The first phase needs to produce a specification document and a clearly written manual for building the corpus, along with the tools needed to build and use the corpus. | |||
* Anybody who would like to contribute to the project must be able to do so, and the specifications should be thorough, covering classification of data, annotation of data, storage structure, and all related details. | |||
* By the end of the summer, we must have a complete specification document and the programs to build and access the corpus. (Building the full corpus must be a collaborative effort and is not part of this phase.) | |||
Please add any further details that are relevant to a corpus project.
==How to Apply == | ==How to Apply == | ||