The challenge for today is to build computer systems that are very easy to use. This implies that a human and a machine should be able to act as one. It implies that the machine needs to be "trained" to understand human speech. And it implies that the machine should be able to recognize anyone's speech under any condition. This is the challenge that the speech recognition research laboratory in the Modeling & Simulation Development Branch at NAWCTSD has undertaken. This paper details the R&D effort to develop software that will enable a computer system to understand the spoken word in a noisy environment. This is an important problem that must be solved if the future training devices, as envisioned by the US Navy, are to be realized in the 21st century.
Statistical models used in modern speaker-independent speech recognition represent the inherent variability in human speech. However, these models are highly susceptible to background noise. This research exploits the way those statistical models respond to speech patterns in order to mitigate the adverse effects of noise. Time-domain audio from the microphone is transformed into the frequency domain with the Fast Fourier Transform (FFT) algorithm. A mel-spectrum window and a cosine transform are then applied to create spectral feature vectors. Models of the spoken input are created by training statistical models on these feature vectors. The structure of the frequency domain is well understood in terms of the quantized method used to process and store signals as data, and it is now computationally feasible to manipulate individual frequency components to remove unwanted noise spectra. This paper describes the digital signal processing technique that has been developed to remove frequencies that do not contribute to the information content of the audio signal, and thereby to derive a nearly noise-free model and an associated filter. The model and filter are integrated into the speech recognizer holistically to solve the background noise problem.
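The front end described above (FFT, mel-spectrum weighting, cosine transform) together with the removal of individual frequency components can be illustrated with a minimal sketch. It is not the filter developed in this work: the fixed pass band `keep_band` (300 to 3400 Hz), the triangular filter shape, the common 2595/700 mel formula, and all function names and default sizes are assumptions chosen only for illustration.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale, one row per filter."""
    n_bins = n_fft // 2 + 1
    top = hz_to_mel(sample_rate / 2)
    edges = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        row = []
        for b in range(n_bins):
            f = b * sample_rate / n_fft  # centre frequency of FFT bin b
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def feature_vector(frame, sample_rate, keep_band=(300.0, 3400.0),
                   n_filters=12, n_ceps=8):
    """Turn one time-domain frame into a cepstral feature vector.

    keep_band is an assumed, fixed pass band; bins outside it are zeroed
    to stand in for "frequencies that are not of interest".
    """
    n = len(frame)
    spectrum = fft(frame)[: n // 2 + 1]        # time -> frequency domain
    power = [abs(c) ** 2 for c in spectrum]
    for b in range(len(power)):                # drop unwanted components
        f = b * sample_rate / n
        if not keep_band[0] <= f <= keep_band[1]:
            power[b] = 0.0
    bank = mel_filterbank(n_filters, n, sample_rate)
    # Small floor keeps the logarithm finite for empty filters.
    log_e = [math.log(sum(w * p for w, p in zip(row, power)) + 1e-10)
             for row in bank]
    # DCT-II of the log mel energies gives the cepstral coefficients.
    return [sum(e * math.cos(math.pi * k * (m + 0.5) / n_filters)
                for m, e in enumerate(log_e))
            for k in range(n_ceps)]
```

In this sketch, a 256-sample frame at 8 kHz containing a 1000 Hz tone plus low-frequency hum at 62.5 Hz yields essentially the same feature vector as the tone alone, because the hum falls outside the assumed pass band. A fixed band is the simplest possible choice; a filter derived from the noise itself would replace the `keep_band` test.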