Following the recommendation of an earlier independent report by i2 Media Research, Ofcom commissioned the University of Salford to investigate the potential for speech recognition technology to provide live captions in Voice over Internet Protocol (VoIP) telecommunications.
Phase 1 of this work consisted of a basic review of the scope for speech recognition technology to provide subtitles for speech communications and the development of prototype software to enable user testing of the concept. The results of Phase 1 showed that the software could indeed be useful in this context.
Phase 2 of the project built on the encouraging results of Phase 1. The software demonstrator system was used to test the concept with hearing-impaired users, to gain an indication of the accuracy of the speech recognition system and to discover how well the performance of the speech recognition software correlated with the acceptability of its use in telephony.
The key conclusions of Phase 2 are:
Windows Speech Recogniser 8.0 now appears to have reached parity of performance with leading-edge commercial speech recognition packages; market-leading speech recognition is now available to Windows users free of charge.
A great deal of interest was shown in the software: seven out of eight focus group participants who used it stated that they would use it for some telephone calls, and five out of eight stated that they would like to use it for all telephone calls.
There was significant interest in using the software to make business calls (e.g. to bank managers).
Although the word recognition rate for some people fell below the desired performance level, there were clear indications that longer usage would improve rates considerably. Error rates of less than 5% have been reliably demonstrated.
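For readers unfamiliar with how recognition error rates are typically calculated, the sketch below illustrates one common measure, the word error rate: the minimum number of word substitutions, deletions and insertions needed to turn the recognised text into the reference transcript, divided by the number of words in the reference. This summary does not state how the rates in this study were computed, so the definition, the function name word_error_rate and the example sentences are illustrative assumptions rather than the project's method.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: a common textbook definition of word error rate,
    not necessarily the measure used in the study.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in recogniser output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical caller speech and recogniser output (illustrative only).
    spoken = "please transfer fifty pounds to my savings account"
    recognised = "please transfer fifteen pounds to my savings account"
    print(f"Word error rate: {word_error_rate(spoken, recognised):.1%}")
```

On this definition, an error rate of less than 5% corresponds to fewer than one word in twenty being substituted, omitted or wrongly inserted.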
Ofcom is currently developing plans to commission a third phase of work, which would involve in-home user trials in a real-world environment. It is anticipated that such trials would allow longer periods of user familiarisation and hence further improve recognition accuracy.