Title
The quality of automatic and human live captions in English and beyond
Conference name
8th International Symposium on Live Subtitling and Accessibility
City
Modalities
Date
19/04/2023
Abstract
Closed captions play a vital role in making live broadcasts accessible to many viewers. Traditionally, stenographers and respeakers have been in charge of their production, but this scenario is changing due to the steady improvements that automatic speech recognition has experienced in recent years. This technology is being used to create intralingual live captions, and broadcasters have begun to explore its potential. Human and automatic captions now co-exist on television and, while some research has focused on the accuracy of human live captions, comprehensive assessments of the accuracy and quality of automatic captions are still needed. This presentation will tackle this issue by introducing the main findings of the largest study of the accuracy of automatic live captions conducted to date. Through four case studies comprising approximately 17,000 live captions analysed with the NER model between 2018 and 2022 in the UK, the US and Canada, the presentation will track recent developments in automatic captioning, compare its accuracy with that achieved by humans, and conclude with a brief discussion of what the future of live captioning looks like for both human and automatic captions.
Beyond this, and within the framework of the Spanish-government-funded Qualisub project, the presentation will end by addressing the initial findings of two related studies: the automation of the NER model and a quality assessment of two workflows used by the European Parliament to provide live interlingual captions (a fully automatic workflow and one combining simultaneous interpreting with automatic speech recognition). These findings help to shed light on the landscape of live speech-to-text communication in the near future.