Speech and Vision Lab (SVL)

The purpose of speech is communication, and research in this area deals primarily with processing and representing the speech signal in order to develop voice-based interfaces for human-computer interaction. Such natural interfaces enable hands-free access to information for literate, illiterate, and visually impaired people.

The thrust of our activity in the Speech and Vision Lab is the development of speech- and vision-based interfaces for human-computer interaction. The goal is to develop speech-to-text (speech input) and text-to-speech (speech output) systems to achieve speech translation from one Indian language to another, and speech/audio information retrieval in a secure access mode using biometrics such as speech and video. The objective of the lab is to conduct goal-oriented basic research; we therefore address fundamental issues involved in building robust speech-to-text systems, natural-sounding text-to-speech systems, spoken/audio information retrieval, and biometrics using speech and video.

The broad areas of SVL are:

1. Speech and man-machine communication related research areas
2. Speech signal processing
3. Speech-to-text conversion
4. Text-to-speech conversion
5. Speaker verification and recognition
6. Speech enhancement
7. Spoken language identification
8. Event based analysis of speech
9. Speech enhancement in multi-speaker environment
10. Development of phonetic engine for Indian languages
11. Speaker segmentation and tracking
12. Automatic prosody and duration modeling and manipulation
13. Studies on throat microphone speech
14. Low bit rate coding of speech
15. Spoken audio information retrieval and summarization
16. Processing of biosignals: EEG, EMG, and fMRI
17. Voice conversion or morphing
18. One dimensional processing of images
19. Segmentation of textured images
20. Algorithms for matching noisy and distorted images
21. Algorithms for matching stereo images
22. Algorithms for Geographical Information Systems
23. Content-based storage and retrieval of information
24. Face detection and recognition studies
25. Person authentication by audio/video based techniques
26. Applications of artificial neural networks

Major Funded Projects

Indo-Swiss Project on "Cross-cultural personality perception" (2009-12)
"Secure access to launch computer using multimodal biometrics" (2009-11) (DRDO, Govt of India)
Development of text-to-speech system in Telugu language (2009-11) (DIT, Govt of India)
Collection of 1500 hours of speech data in 3 Indian languages (CIIL, Mysore)
Study of source features for speech synthesis and speaker recognition (2007-11) (UKIERI Standard Awards (Indo-UK Project))
Development of phonetic engine for Indian languages (2007-10) (DIT, Govt of India)
Feasibility of ultra low-bit rate coding for transmission and storage of speech (2007-10) (DRDO, Bangalore)
Efficient representation of speech from throat microphone for transmission and storage of speech information (2007-10) (DRDO, Bangalore)
Indo-Swiss Project on "Keyword spotting in continuous speech" (2005-08)
Development of high quality general purpose Telugu TTS (2005-06) (Bhrigus Software (I) Pvt Ltd, Hyderabad)
Development of high quality Hindi TTS (2005-06) (Nokia, China)
Reading Aid for Visually Impaired (RAVI) (2004-06) (Ministry of Social Justice & Empowerment, Govt of India)
Speech database collection and building large continuous speech recognizer in Indian languages (2004) (HP Labs)
Development of Prosodically Guided Phonetic Engine for Searching Speech database in Indian Languages (2011-2013) (Department of Information Technology, Govt of India)
Virtual Lab for Artificial Neural Networks (2010-2011) (Ministry of Human Resource Development, Govt of India)
Virtual Lab for Speech Signal Processing (2010-2011) (Ministry of Human Resource Development, Govt of India)
Speech-based Access for Agricultural Commodity Prices in Six Languages (2010-2012) (Department of Information Technology, Govt of India)

Achievements

Tech Transfers

Technology transfer to Nokia China (2006), HP Labs India (2004)
Technology transfer to the Ministry of Social Justice and the Ministry of Information Technology, Govt. of India
Development of Reading Aid for Visually Impaired (RAVI) - A Screen Reader in Indian languages (2006)

Faculty

Kishore S Prahallad (Head)
S Rajendran
V G Suryakanth
Anil Kumar Vuppala
B Yegnanarayana

Students

Anusha Konduri
Vasudha Myla
Ravi Shankar Prasad
Gomathi Ramya
Naresh Kumar Elluru
Narayana Murthy BH V S
Chetana Prakash
Sanjeev Gupta
Gautam Varma Mantena
Rajesh Dachiraju
Anandaswarup Vadapalli
Srinivasa Rao Ch
Aneeja G
Nivedita Chennupati
Sathya Adithya Thati
Basil George
Apoorv Reddy
Abhijeet Saxena
