Research: Swecha Gonthuka Telugu ASR
An overview of the research behind the Swecha Gonthuka Telugu speech recognition system. A full paper covering methodology, findings, and evaluation design is in preparation.
Overview
Telugu is spoken by over 80 million people but remains underrepresented in open ASR research. Existing multilingual models often treat Telugu as a low-resource tail language, resulting in weak recognition across native speakers and dialectal variation.
Swecha Gonthuka starts from a community-collected, Telugu-first dataset rather than adapting a generic multilingual model. The aim is a recogniser that performs well across the range of speakers in the training distribution, not only on standardised studio speech.
Research Paper
Releasing soon: a formal write-up is in preparation. It will cover:
- Methodology
- Model Training
- Evaluation Design
- Findings
- Limitations
- Future Directions
Use the Model
Model weights and usage documentation are available on Hugging Face.