Speech Recognition And Speech Synthesis System For Linux

Published on Feb 28, 2025

Abstract

The title of the project is "Voice and Voice only: Speech Recognition and Speech Synthesis System for Linux". The project creates an interface that lets a user converse with the computer.

The project aims to make the computer describe, in spoken words, the facilities available in the software, and to recognize the user's voice, using the concepts of Speech Recognition and Speech Synthesis.

Speech Recognition is the process by which a computer identifies spoken words. In practice, it means talking to your computer and having it correctly recognize what you are saying. Speech Synthesis is the process by which a computer communicates verbally with a user.

Although any task that involves interacting with a computer can potentially use Speech Recognition and Speech Synthesis, the following applications are currently the most common.

Dictation

Dictation is the most common use for Speech Recognition Systems today. This includes medical transcription, legal and business dictation, and general word processing. In some cases, special vocabularies are used to increase the accuracy of the system.

Command and Control

Speech Recognition Systems that are designed to perform functions and actions on the system are defined as Command and Control systems. For example, the utterances "Mozilla" and "emacs" launch the corresponding programs.
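The Command and Control idea can be sketched as a lookup from a recognized word to a program to start. This is a minimal illustration, not the project's actual implementation: the `COMMANDS` table and the function names `resolve_command` and `execute` are hypothetical, and the sketch assumes the recognizer has already produced the spoken word as text.

```python
import shlex
import subprocess

# Hypothetical table: recognized word -> command line to run.
# The entries mirror the examples in the text; the executable names are
# assumptions and depend on what is installed on the system.
COMMANDS = {
    "mozilla": "mozilla",
    "emacs": "emacs",
}


def resolve_command(utterance):
    """Return the argument list for a recognized word, or None if unknown."""
    command_line = COMMANDS.get(utterance.strip().lower())
    return shlex.split(command_line) if command_line else None


def execute(utterance):
    """Start the program mapped to the utterance; return True if it started."""
    argv = resolve_command(utterance)
    if argv is None:
        return False
    try:
        # Popen returns immediately, so the speech system stays responsive.
        subprocess.Popen(argv)
    except OSError:
        # The program is not installed or could not be started.
        return False
    return True
```

Keeping the lookup (`resolve_command`) separate from the launch (`execute`) means the vocabulary can be checked without starting any program.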

Medical/Disabilities

Many people have difficulty typing due to physical limitations such as repetitive strain injuries (RSI), muscular dystrophy, and many others. Speech systems can help with other limitations as well. For example, people with difficulty hearing could use a system connected to their telephone to convert the caller's speech to text. Similarly, people who are blind can listen to a text file or an e-mail from a friend if the computer reads it aloud to them.

Voice and Voice only is a simple speech system that can talk to users, telling them about the various options before them, and can also recognize isolated words in a speaker-dependent fashion. As part of the project, we wish to develop a speech synthesis system that tells the user about the available options when they click a menu. For example, when they open the 'File' menu, the computer tells the user that the available options are 'Open', 'Read' and 'Quit'.

The Speech Recognition section will understand some simple commands such as "Mozilla" and "emacs". When the speaker talks to the computer through the system, it should respond by executing the appropriate command. For example, "emacs" should start a new editor and "Mozilla" should invoke the Mozilla browser.
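The menu announcement described above can be sketched as a function that builds the sentence to be spoken. This is only an illustration under stated assumptions: the `MENUS` table, the function name `menu_prompt`, and the exact wording of the sentence are hypothetical, and the resulting string would still have to be passed to whatever speech synthesizer the system uses.

```python
# Hypothetical menu table; the 'File' entry is the example from the text.
MENUS = {
    "File": ["Open", "Read", "Quit"],
}


def menu_prompt(menu_name, menus=MENUS):
    """Build the sentence the synthesizer would speak for a menu."""
    options = menus.get(menu_name, [])
    if not options:
        return f"The {menu_name} menu has no options."
    if len(options) == 1:
        listing = options[0]
    else:
        # Join as "A, B and C" so the spoken list sounds natural.
        listing = ", ".join(options[:-1]) + " and " + options[-1]
    return f"The {menu_name} menu has the following options: {listing}."


if __name__ == "__main__":
    print(menu_prompt("File"))
```

Building the text separately from speaking it keeps the menu logic independent of the synthesizer.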
