Showing posts with label hidden markov model. Show all posts

Speech Recognition using HMM and VQ - Code Updates

Hey guys,

I've made some minor changes to the Speech Recognition code that I pushed to github/googlecode a couple of years ago. I received a lot of attention and queries from readers and students all over the world about a couple of minor bugs in the project, so I thought it was time ( better late than never :P ) to fix them in the code.

Significant changes are on:

- Exception handling
- File handling
- File (recorded WAVE file, VQ codebook, HMM models) save path corrected
- Delta calculation fixed for short recordings, where the number of frames is less than the regression window
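
To illustrate the delta fix, here is a minimal sketch of a delta (regression) calculation that stays within bounds on short recordings. It is not the actual code from the repository: the class name `DeltaFeatures`, the method `computeDelta` and the choice to clamp out-of-range frame indices to the first or last frame are all assumptions made for this example. It uses the usual regression formula d_t = sum(n * (c[t+n] - c[t-n])) / (2 * sum(n^2)).

```java
// Hypothetical helper, not taken from the project source.
final class DeltaFeatures {

    /**
     * Computes delta coefficients for a sequence of feature frames.
     * Frame indices that fall outside the recording are clamped to the
     * first or last frame, so a clip with fewer frames than the
     * regression window cannot cause an out-of-bounds access.
     */
    static double[][] computeDelta(double[][] frames, int window) {
        int numFrames = frames.length;
        if (numFrames == 0) {
            return new double[0][];
        }
        int dim = frames[0].length;

        // Normalisation term: 2 * sum of n^2 for n = 1..window
        double norm = 0.0;
        for (int n = 1; n <= window; n++) {
            norm += 2.0 * n * n;
        }

        double[][] delta = new double[numFrames][dim];
        if (norm == 0.0) {
            return delta; // window of 0: deltas stay zero
        }

        for (int t = 0; t < numFrames; t++) {
            for (int n = 1; n <= window; n++) {
                int next = Math.min(t + n, numFrames - 1); // clamp at the end
                int prev = Math.max(t - n, 0);             // clamp at the start
                for (int k = 0; k < dim; k++) {
                    delta[t][k] += n * (frames[next][k] - frames[prev][k]);
                }
            }
            for (int k = 0; k < dim; k++) {
                delta[t][k] /= norm;
            }
        }
        return delta;
    }
}
```

With this clamping, a one-frame recording simply gets all-zero deltas instead of throwing an exception. Other edge policies (for example shrinking the window) are equally possible.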

I've also uploaded training audio samples for a few words, along with the trained Vector Quantization codebook and the trained HMM model files for those words.

If you wish to add/train more words, please use the GUI - "HMM_VQ_Speech_Recognition" class. Just a note, you need to re-train both VQ and HMM for each word that you add.
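
The retraining note is easier to see with a small sketch of the quantization step. This assumes a typical VQ/HMM pipeline, where each feature vector is replaced by the index of its nearest codeword and the HMMs are trained on those indices. If the codebook is rebuilt after a new word is added, the indices change meaning, so HMMs trained on the old codebook no longer fit. The class and method names below are hypothetical and are not the project's API.

```java
// Hypothetical helper, not taken from the project source.
final class VqQuantizer {

    /** Returns the index of the codeword closest to the feature vector (squared Euclidean distance). */
    static int nearestCodeword(double[][] codebook, double[] feature) {
        int best = -1;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int i = 0; i < codebook.length; i++) {
            double dist = 0.0;
            for (int k = 0; k < feature.length; k++) {
                double diff = feature[k] - codebook[i][k];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = i;
            }
        }
        return best;
    }

    /** Converts a sequence of feature frames into a sequence of codebook indices. */
    static int[] quantize(double[][] codebook, double[][] frames) {
        int[] symbols = new int[frames.length];
        for (int t = 0; t < frames.length; t++) {
            symbols[t] = nearestCodeword(codebook, frames[t]);
        }
        return symbols;
    }
}
```

The resulting `int[]` is the discrete observation sequence that a discrete HMM would be trained on or scored against.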

Code :

https://github.com/gtiwari333/speech-recognition-java-hidden-markov-model-vq-mfcc

Report File (contains both Speech and Speaker recognition)

http://ganeshtiwaridotcomdotnp.blogspot.com/2011/06/final-report-text-prompted-remote.html

I hope everyone will benefit from this. Keep coding, keep learning. Please let me know if you see any issues.

Thank you!

Speech Recognition Java Code - HMM VQ MFCC ( Hidden Markov Model, Vector Quantization and Mel-Frequency Cepstral Coefficients)

Hi everyone,
I have shared the speech recognition code on Google Code:
http://code.google.com/p/speech-recognition-java-hidden-markov-model-vq-mfcc/

You can find the complete source code for speech recognition using HMM, VQ and MFCC ( Hidden Markov Model, Vector Quantization and Mel-Frequency Cepstral Coefficients). Feel free to use and modify this code.

The project report that accompanies this code is here.
http://ganeshtiwaridotcomdotnp.blogspot.com/2011/06/final-report-text-prompted-remote.html

Introduction to the project :
http://ganeshtiwaridotcomdotnp.blogspot.com/2010/12/text-prompted-remote-speaker.html

Mouse Gesture Recognition Using Hidden Markov Model - Java Source Code

Hi everyone, I have uploaded the code for my project - "Mouse Gesture Recognition with Hidden Markov Model - Java".

You can find it on Google Code: https://code.google.com/p/mouse-gesture-recognition-java-hidden-markov-model/

This SVN repository on Google Code contains the Eclipse source code (the VQ and HMM code is from the OCVolume project), the trained HMM models and codebook, and captured data for a few gestures.

Similar code for the Speech Recognition System using HMM/VQ + MFCC will be uploaded SOON.

DEMO VIDEO: http://www.youtube.com/watch?v=0CNJ2fCj4xQ


Mouse Gesture Recognition with Hidden Markov Model.

Understanding gestures can be posed as a pattern recognition problem. These patterns (gestures) are variable but distinct and have an associated meaning. Since a gesture consists of continuous motion over sequential time, an HMM is an effective recognition tool.

My work is a demonstration of a mouse motion gesture recognition system using Dynamic HMM. It is developed in Java.
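
As a rough illustration of how an HMM can recognize a gesture, here is a minimal sketch of the scaled forward algorithm for a discrete-observation HMM. It assumes that each gesture has its own trained model and that the mouse path has already been turned into a sequence of symbol indices (for example by VQ). The class `DiscreteHmm` and its methods are hypothetical and are not the classes used in the repository.

```java
// Hypothetical sketch, not taken from the project source.
final class DiscreteHmm {
    private final double[] pi;   // initial state probabilities
    private final double[][] a;  // a[i][j]: transition probability from state i to state j
    private final double[][] b;  // b[j][k]: probability of emitting symbol k in state j

    DiscreteHmm(double[] pi, double[][] a, double[][] b) {
        this.pi = pi;
        this.a = a;
        this.b = b;
    }

    /** Log-likelihood of the observation sequence, computed with the scaled forward algorithm. */
    double logLikelihood(int[] obs) {
        int n = pi.length;
        double[] alpha = new double[n];
        double logProb = 0.0;
        for (int t = 0; t < obs.length; t++) {
            double[] next = new double[n];
            double scale = 0.0;
            for (int j = 0; j < n; j++) {
                double sum;
                if (t == 0) {
                    sum = pi[j];
                } else {
                    sum = 0.0;
                    for (int i = 0; i < n; i++) {
                        sum += alpha[i] * a[i][j];
                    }
                }
                next[j] = sum * b[j][obs[t]];
                scale += next[j];
            }
            if (scale == 0.0) {
                return Double.NEGATIVE_INFINITY; // sequence is impossible under this model
            }
            for (int j = 0; j < n; j++) {
                next[j] /= scale; // rescale to avoid underflow on long sequences
            }
            logProb += Math.log(scale);
            alpha = next;
        }
        return logProb;
    }

    /** Returns the index of the model that gives the observation sequence the highest log-likelihood. */
    static int classify(DiscreteHmm[] models, int[] obs) {
        int best = -1;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int m = 0; m < models.length; m++) {
            double score = models[m].logLikelihood(obs);
            if (best < 0 || score > bestScore) {
                bestScore = score;
                best = m;
            }
        }
        return best;
    }
}
```

Recognition then amounts to scoring the observed sequence against every gesture model and picking the best one, which is what `classify` does.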

Code available @ : https://code.google.com/p/mouse-gesture-recognition-java-hidden-markov-model/

Main References were : 
DEMO VIDEO 


Final Report : Text Prompted Remote Speaker Authentication : Joint Speech and Speaker Recognition/Verification System

Here is the complete report of our project: Text Prompted Remote Speaker Authentication : Joint Speech and Speaker Recognition/Verification System


Next time: a runnable project, or maybe snapshots... just wait.

The presentation slides for this report and project work are here.

Final Presentation Slide :Text Prompted Remote Speaker Authentication : Joint Speech and Speaker Recognition/Verification System

Here is the download link for the presentation slides of our final year project. The title of the project is Text Prompted Remote Speaker Authentication. It is a Joint Speech and Speaker Recognition/Verification System that uses a Gaussian Mixture Model and Hidden Markov Model/Vector Quantization for classification, and MFCC as the feature vector.

Refer to http://ganeshtiwaridotcomdotnp.blogspot.com/2010/12/text-prompted-remote-speaker.html for details of our project.

This project was done at Tribhuvan University, Institute of Engineering, Department of Electronics and Computer Engineering, Kathmandu, Nepal, from November 2010 to January 2011.
The project members were:
Ganesh Tiwari
Madhav Pandey
Manoj Shrestha

Speaker Verification for Remote Authentication

Text Prompted Remote Speaker Authentication : Joint Speech and Speaker Recognition/Verification System :: Major Project ::: Introduction

Biometrics is, in the simplest definition, something you are. It is a physical characteristic unique to each individual, such as a fingerprint, retina, iris or voice. Biometrics has a very useful application in security: it can be used to authenticate a person’s identity and control access to a restricted area, on the premise that these physical characteristics can uniquely identify an individual.

A speech signal conveys two important types of information: primarily the speech content and, on a secondary level, the speaker's identity. Speech recognizers aim to extract the lexical information from the speech signal independently of the speaker, by reducing the inter-speaker variability. Speaker recognition, on the other hand, is concerned with extracting the identity of the person speaking the utterance. So both speech recognition and speaker recognition are possible from the same voice input.

Desired Output of the Combined System
Text Prompted Remote Speaker Authentication is a voice biometric system that authenticates a user, on the basis of the user’s input voice, before permitting the user to log into a system. It is a web application. Voice signal acquisition and feature extraction are done on the client. Training and authentication, based on the voice features obtained from the client side, are done on the server. The authentication task is based on the text-prompted version of speaker recognition, which incorporates both speaker recognition and speech recognition. This joint implementation includes text-independent speaker recognition and speaker-independent speech recognition. Speaker recognition verifies whether the speaker is the claimed one, while speech recognition verifies whether the spoken word matches the prompted word.
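
The combined decision described above can be sketched in a few lines. This is only an illustration of the idea, not the project's server code: the class `TextPromptedAuthenticator`, the numeric speaker score and the threshold are all assumptions made for this example.

```java
// Hypothetical sketch, not taken from the project source.
final class TextPromptedAuthenticator {

    /**
     * Accepts the login only when both checks pass:
     * the recognized word equals the prompted word (speech recognition), and
     * the speaker score for the claimed user reaches the threshold (speaker verification).
     */
    static boolean authenticate(String promptedWord, String recognizedWord,
                                double speakerScore, double threshold) {
        boolean wordMatches = promptedWord.equalsIgnoreCase(recognizedWord);
        boolean speakerMatches = speakerScore >= threshold;
        return wordMatches && speakerMatches;
    }
}
```

Requiring both checks means that a correct word spoken by the wrong person is rejected, and so is the right person saying something other than the prompted word.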