Title
Subtitling on Spanish TV. A quality assessment
Conference name
Media for All 10 Conference
City
Country
Belgium
Modalities
Date
06/07/2023-07/07/2023
Abstract
Over the years, live subtitling has been produced with multiple techniques, and respeaking has become one of the most popular methods worldwide. This has been driven primarily by considerable advances in speech recognition software (SRS), which constitutes the basis of respeaking by enabling the real-time conversion of speech into text on the screen.
This evolution of new technologies, together with the immediacy that prevails in our everyday lives, has contributed to the full automation of live subtitling. Despite the complexities inherent to real-time subtitling production, artificial intelligence has enabled significant progress in automatic speech-to-text software, and this has revolutionized the industry. Although no fully automatic SRS can yet generate flawless subtitles, the quality of some systems is remarkable, and this raises great expectations. Automatic technologies have gradually gained ground in subtitling and, as a manifestation of this new reality, fully automatic subtitles are already being broadcast on Spanish TV.
Just as there are several techniques for generating real-time subtitles, numerous accuracy models may be used to evaluate their quality. The problem, however, is that they are not all founded on the same principles, and not all companies or countries operate according to the same standards. Consequently, non-comparable outcomes end up being compared and, although figures are provided, subtitling quality may take a backseat.
Traditionally, quality analyses have been conducted using the WER Model, which is based on the precept 'difference from the original equals error'. Therefore, the assessment is built upon the literalness of the final text in comparison to the original, neglecting error severity and users’ comprehension. In contrast, the NER Model was developed with a viewer-centered perspective, focusing on the extent to which errors impair the subtitles’ coherence or the original meaning. Owing to such disparities in perspective, the results obtained with these models are indeed different, and this must be considered when interpreting the data.
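To make this contrast concrete, the sketch below shows how the two accuracy scores are typically computed: a WER-based score deducts every deviation from the original speech, whereas the NER score deducts only edition and recognition errors weighted by their severity. This is a minimal illustration, not the exact procedure applied in the study; the function names, severity weights and sample figures are assumptions for the example.

    # Illustrative sketch only (Python). The formulas follow the standard WER
    # definition and the published NER formula, accuracy = (N - E - R) / N x 100;
    # the names, weights and example figures below are assumptions, not study data.

    def wer_accuracy(substitutions: int, deletions: int, insertions: int,
                     reference_words: int) -> float:
        """WER-based accuracy: any difference from the original transcript counts
        as a full error, regardless of its impact on comprehension."""
        wer = (substitutions + deletions + insertions) / reference_words
        return (1 - wer) * 100

    def ner_accuracy(subtitle_words: int, edition_errors: float,
                     recognition_errors: float) -> float:
        """NER accuracy: edition (E) and recognition (R) errors are weighted by
        severity (e.g. minor 0.25, standard 0.5, serious 1) before being deducted."""
        return (subtitle_words - edition_errors - recognition_errors) / subtitle_words * 100

    # Hypothetical 1,000-word sample: the same subtitles can score differently under
    # each model, because WER counts every raw difference while NER discounts
    # harmless editions and weights errors by their effect on meaning.
    print(f"{wer_accuracy(12, 8, 5, 1000):.2f}")   # 97.50
    print(f"{ner_accuracy(1000, 3.0, 2.5):.2f}")   # 99.45

The WER model has no notion of harmless editing: a correct rephrasing by the respeaker is scored exactly like a misrecognition, which is one reason the two models can diverge for the same sample.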
This presentation is framed within QuaLiSpain, the first large-scale study about live subtitling quality on Spanish TV, and QuaLiSub, a project funded by the Spanish Ministry for Science and Innovation that aims to analyze live subtitling quality in the US (English) and Spain (Spanish and some co-official languages, namely Galician, Catalan and Basque). More than 1,000 minutes of audiovisual material in Spanish and Galician with respoken or automatic subtitles have been analyzed so far. For the majority of samples, the NER Model was used, whereas about 300 minutes were analyzed with the WER Model.
The aim of this research is to offer an overview of the quality of current intralingual live subtitles on Spanish TV, specifically in Spanish and Galician. In addition, a comparison between the results obtained with the WER and NER models for some samples will be offered. This comparison will reveal potential discrepancies in the accuracy calculations of the two models for the samples under consideration, and it will underline the importance of selecting the most appropriate quality assessment tool to improve media access.