Hamro Awaaz: An Automated Speech Recognizer for Nepali Language 

Resource details

Contributed by

Rajan Karmacharya


Pooja Kumari Jha, Sheetal Giri, Rajan Karmacharya (Supervisor)


St. Xavier's College


08 July 16

Document type

Thesis or project


Computer science




Recurrent Neural Network, Connectionist Temporal Classification (CTC), N-gram Language Model, Automatic Speech Recognition


Speech recognition is the process of enabling a computer to identify and respond to the
sounds produced in human speech. Hamro Awaaz, a Nepali Automated Speech
Recognizer (ASR), performs speaker-independent, computer-driven transcription of
spoken Nepali into readable Devanagari text in real time.
The project is built around an Android application through which users send their
voice recordings to a server, where each recording is converted to the corresponding
text and returned to the user. The foundation of any speech recognition system is its
acoustic model and its language model. These models in turn depend on the amount and
quality of the data collected and on the training algorithms. A deep recurrent neural
network with Connectionist Temporal Classification (CTC) in the output layer is used
to train the acoustic model. The language model is built from the n-gram distribution
of the collected data. Finally, a search algorithm is used to find the transcription
that best matches the user's speech. With these techniques, we aim to produce a speech
recognition model that is useful to anyone seeking better speech recognition for
Nepali and other languages. The resulting application can serve as a platform for the
development of other Nepali voice-based applications.
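
The pipeline described above (a CTC output layer, an n-gram language model, and a search for the best transcription) can be illustrated with a small sketch. This is not the project's code: the label set, the romanized example sentences, the choice of a bigram model with add-one smoothing, and the function and class names (ctc_greedy_decode, BigramModel, pick_best) are all assumptions made for illustration. A real system would work on Devanagari output from a trained network and would likely use a beam search rather than the simple rescoring shown here.

```python
import math
from collections import Counter

BLANK = 0  # assumed index of the CTC blank label in the output layer


def ctc_greedy_decode(frame_scores, labels):
    """Best-path CTC decoding: take the top label per frame,
    merge consecutive repeats, then drop blanks."""
    best_path = [max(range(len(frame)), key=frame.__getitem__)
                 for frame in frame_scores]
    decoded = []
    previous = None
    for index in best_path:
        if index != previous and index != BLANK:
            decoded.append(labels[index])
        previous = index
    return "".join(decoded)


class BigramModel:
    """Bigram language model with add-one (Laplace) smoothing."""

    def __init__(self, sentences):
        self.context_counts = Counter()
        self.bigram_counts = Counter()
        vocabulary = set()
        for sentence in sentences:
            tokens = ["<s>"] + sentence.split() + ["</s>"]
            vocabulary.update(tokens)
            self.context_counts.update(tokens[:-1])
            self.bigram_counts.update(zip(tokens, tokens[1:]))
        self.vocab_size = len(vocabulary)

    def log_prob(self, sentence):
        """Natural-log probability of a whitespace-tokenized sentence."""
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        total = 0.0
        for prev, word in zip(tokens, tokens[1:]):
            numerator = self.bigram_counts[(prev, word)] + 1
            denominator = self.context_counts[prev] + self.vocab_size
            total += math.log(numerator / denominator)
        return total


def pick_best(candidates, language_model):
    """Toy 'search': return the candidate the language model scores highest."""
    return max(candidates, key=language_model.log_prob)


# Hypothetical per-frame scores over the labels [blank, "a", "b"].
labels = ["<blank>", "a", "b"]
frames = [
    [0.1, 0.8, 0.1],  # a
    [0.2, 0.7, 0.1],  # a (repeat, merged)
    [0.9, 0.05, 0.05],  # blank separates the two a's
    [0.1, 0.8, 0.1],  # a
    [0.1, 0.1, 0.8],  # b
    [0.1, 0.2, 0.7],  # b (repeat, merged)
    [0.8, 0.1, 0.1],  # blank
]
decoded = ctc_greedy_decode(frames, labels)

# Hypothetical romanized training text for the language model.
lm = BigramModel(["mero naam", "mero ghar"])
best = pick_best(["naam mero", "mero naam"], lm)
```

The blank label is what lets the decoder keep a genuinely doubled character: repeats are merged only when no blank sits between them. The rescoring step shows why a language model helps, since the word order seen in the training text receives the higher score.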