Limitations of Fast Fourier Transform for Real Time Speech Processing

By U. Vivekananda, M.Tech

The Fourier Transform is a transformation technique which, when applied to a time-domain signal, provides information on the different frequency components (the frequency content) present in the signal. For a time-domain signal x(t), with frequency f measured in hertz, the Fourier Transform X(f) is given by

$$X(f) = \int_{-\infty}^{\infty} x(t)\, e^{-j 2\pi f t}\, dt$$

The Inverse Transform, which recovers the time-domain signal x(t) from its frequency-domain representation X(f), is given by

$$x(t) = \int_{-\infty}^{\infty} X(f)\, e^{j 2\pi f t}\, df$$

Now let us look at the frequency response that the Fourier Transform provides for a few fundamental signals and assess its weakness as it pertains to a real-time, time-varying signal.

Let us consider the example of a sine wave given by sin(2π·10t). It has a single 10 Hz frequency component. This component appears as a peak at 10 Hz in the frequency domain when the Fourier Transform of the signal is obtained. The sine wave and its Fourier Transform are provided in Figures 1 and 2 respectively.

FIGURE 1: Sine wave having a 10 Hz frequency component

FIGURE 2: Fourier Transform of the 10 Hz sine wave, showing a peak at 10 Hz

Consider another example, a mixture of sine waves given by sin(2π·10t) + sin(2π·30t). In this example the Fourier Transform consists of two peaks, one at 10 Hz and the other at 30 Hz. The signal and its Fourier Transform are shown in Figures 3 and 4 respectively.

FIGURE 3: A composite signal consisting of a mixture of two sine-wave frequency components
FIGURE 4: Fourier Transform showing the two frequency components present in the signal of Figure 3

The two signals above are termed stationary signals. That is, their frequency content does not change with time: they contain the same frequency components for the entire duration of the signal. Hence the frequency content of such a signal can be accurately obtained by the use of the Fourier Transform.

Let us now look at another type of signal, one whose frequency content varies with time. The signal and its frequency plot are provided in Figures 5 and 6.
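
The signals discussed so far can also be examined numerically. The following is only a minimal sketch, not the method used to produce the figures. It assumes a 200 Hz sampling rate and a one-second duration (neither is stated in this article), assumes the time-varying signal carries 30 Hz in its first half and 10 Hz in its second half, and uses a naive discrete Fourier transform in place of an FFT routine so that it needs nothing beyond the standard library. The helper names dft_magnitude and dominant_frequencies are hypothetical.

```python
import cmath
import math

FS = 200          # sampling rate in Hz (assumed for this sketch)
DURATION = 1.0    # signal length in seconds (assumed for this sketch)
N = int(FS * DURATION)


def dft_magnitude(samples):
    """Magnitude of the DFT for bins 0..N/2, evaluated naively in O(N^2)."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        mags.append(abs(acc))
    return mags


def dominant_frequencies(samples, fs, count=2):
    """Frequencies (Hz) of the `count` strongest DFT bins, sorted ascending."""
    mags = dft_magnitude(samples)
    n = len(samples)
    strongest = sorted(range(len(mags)), key=lambda k: mags[k], reverse=True)
    return sorted(k * fs / n for k in strongest[:count])


def sine(freq_hz, i):
    """Sample number i of a unit-amplitude sine wave at freq_hz."""
    return math.sin(2 * math.pi * freq_hz * i / FS)


# Stationary signals: the same components are present for the whole duration.
single = [sine(10, i) for i in range(N)]
stationary = [sine(10, i) + sine(30, i) for i in range(N)]

# Time-varying signal: 30 Hz in the first half, 10 Hz in the second half.
time_varying = [sine(30, i) if i < N // 2 else sine(10, i) for i in range(N)]

print(dominant_frequencies(single, FS, count=1))   # [10.0]
print(dominant_frequencies(stationary, FS))        # [10.0, 30.0]
print(dominant_frequencies(time_varying, FS))      # [10.0, 30.0]
```

Under these assumptions the stationary mixture and the time-varying signal report the same two dominant frequencies, even though their waveforms are different.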
FIGURE 5: A time-varying signal having different frequency components at different times

FIGURE 6: Frequency plot showing the frequency components of the signal in Figure 5

The reader might notice that in the time-domain representation of the above signal, the 30 Hz component is present at the beginning of the signal and the 10 Hz component is present in the latter half. It is interesting to compare the signals depicted in Figures 3 and 5 and their respective Fourier Transforms. Both transforms show peaks at 10 Hz and 30 Hz, but one signal is time-varying (Figure 5) and the other is stationary (Figure 3). The Fourier Transform cannot distinguish between the two signals. In other words, the Fourier Transform is unable to provide information on when a particular frequency component is present in the signal. This is not a limitation for stationary signals, but it is unacceptable for a time-varying, real-time signal such as speech.

The simple solution that was proposed was to break the signal into small chunks (or windows) and apply the Fourier Transform to each chunk (the windowed signal). This is a relatively good idea, although one may object that even then we cannot determine the exact time at which a frequency component occurs within a window. We can only postulate that these frequency components exist somewhere in the signal during that period of time (the window).

Transforms with better time-frequency behaviour, such as wavelets, have already been developed and are in use. Wavelets provide both time and frequency information. More on the Short-Time Fourier Transform and wavelets at a later date. The windowing method of the Fourier Transform is termed the Short-Time Fourier Transform (STFT), also called the Short-Term Fourier Transform, and will be covered in detail in the next part.

References:
1) Speech Communications: Human and Machine, by Douglas O'Shaughnessy
2) Digital Processing of Speech Signals, by L. R. Rabiner and R. W. Schafer
3) The Wavelet Tutorial, by Robi Polikar