Although the STI approach might seem similar to audio
quality metrics such as Perceptual Evaluation of Speech
Quality (PESQ) and Perceptual Objective Listening
Quality Analysis (POLQA), both PESQ and POLQA
were developed to measure speech quality over telephone
transmission networks under ideal listening conditions with
very little background noise. STI, by contrast, is a metric
of speech intelligibility and is better suited to high-noise
environments, where intelligibility is the primary concern
and quality is secondary.
The STI uses the modulation transfer function (MTF),
which is the ratio of the modulation depth of the received
signal to the modulation depth of the transmitted signal
as a function of modulation rate. Distortions in the
transmission channel, such as noise, reverberation, echoes,
and digital codecs, reduce the modulation depth and
thereby degrade speech intelligibility.
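As a rough illustration of the MTF definition above, the following Python
sketch estimates the modulation depth of a transmitted and a received
intensity envelope at a single modulation rate and takes their ratio. The
function names, the 4 Hz modulation rate, the 1 kHz envelope sampling rate,
and the noise level are all illustrative assumptions. This is not a full STI
computation, which evaluates several frequency bands and modulation rates.

```python
import math


def modulation_depth(intensity, rate_hz, sample_rate_hz):
    """Estimate the modulation depth of an intensity envelope at one rate.

    Uses the Fourier component at rate_hz, normalized by the mean intensity.
    Assumes the envelope spans a whole number of modulation periods.
    """
    re = im = total = 0.0
    for n, value in enumerate(intensity):
        phase = 2.0 * math.pi * rate_hz * n / sample_rate_hz
        re += value * math.cos(phase)
        im += value * math.sin(phase)
        total += value
    return 2.0 * math.hypot(re, im) / total


def mtf(transmitted, received, rate_hz, sample_rate_hz):
    """Ratio of received to transmitted modulation depth at one rate."""
    return (modulation_depth(received, rate_hz, sample_rate_hz)
            / modulation_depth(transmitted, rate_hz, sample_rate_hz))


# Hypothetical test signal: a fully modulated intensity envelope.
sample_rate_hz = 1000
rate_hz = 4.0
n_samples = 1000  # one second, i.e. four full modulation periods

transmitted = [
    1.0 + math.cos(2.0 * math.pi * rate_hz * n / sample_rate_hz)
    for n in range(n_samples)
]

# Assumed channel: steady noise whose intensity equals the mean signal
# intensity (0 dB signal-to-noise ratio). The noise fills in the troughs
# of the envelope and so reduces the modulation depth.
noise_intensity = 1.0
received = [value + noise_intensity for value in transmitted]

m = mtf(transmitted, received, rate_hz, sample_rate_hz)
print(f"MTF at {rate_hz} Hz: {m:.3f}")
```

In this toy case the steady noise halves the modulation depth, which matches
the usual expression for stationary noise, m = 1 / (1 + 10^(-SNR/10)), at an
SNR of 0 dB.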
For the ABC-MRT measurement, a specialized ASR algorithm
emulates the MRT methodology by recognizing keywords
transmitted through a communication system. The algorithm
operates on frequency bands called Articulation Index (AI) bands.
There are 17 AI bands for narrowband speech and 21 for
full-band speech.
Figure 2: Spectrograms of the six keywords in the MRT list.