Title
Shortcomings in automatic closed captions and subtitles in academic video presentations
Conference name
Media for All 10 Conference
City
Country
Belgium
Date
06/07/2023-07/07/2023
Abstract
The application of machine translation to subtitling has been put to the test in recent years with the controversy around Netflix’s series Squid Game, while ongoing projects on technology and audiovisual translation (European Commission, 2022; Díaz-Cintas and Massidda, 2020) show that the topic is taken increasingly seriously by researchers, professional translators, and European institutions. For its most recent conference, the European Society for Translation Studies (EST, 2022) advised virtual presenters to record their papers and use automatic captioning to ensure accessibility. As to academic translation, post-editing is now a reality both for professional translators and for academics in general (Parra and Goulet, 2021). In view of these developments and of the increasing number of academic events being recorded or held online since the onset of the COVID-19 pandemic, the present work brings together automation processes in audiovisual translation (speech recognition software, automatic and machine-translated subtitles, and captions) and academic texts, more specifically video presentations. The research questions are whether the automatic generation of captions and subtitles is adequate to ensure accessibility in academic events such as the EST22 conference, and how much post-editing effort such content would require if the subtitles are to be translated. The research method comprises several phases. Firstly, subtitles were generated automatically with YouTube Studio for a corpus of video presentations of specialised content in English, in order to ascertain the general quality of the automatic transcription and the types of errors it produces. These subtitles were corrected and annotated according to the following parameters: a) post-editing time; b) type of error; and c) severity of the error. In this way, we were able to determine whether the quality of the subtitles in the source language (SL) was adequate.
Secondly, the corrected YouTube Studio subtitles were machine-translated into Spanish. The errors detected in the machine translation of the subtitles (English-Spanish) were analysed following Multidimensional Quality Metrics (MQM), a translation quality assessment framework that allows researchers to adapt its parameters to their own quality assessment purposes. Thirdly, we studied the reception by a potential audience, as evaluated by academics from the same field of expertise. For this, a mixed-method approach was used: a) SDL Trados Studio 2011, and specifically its Qualitivity plugin, was used to manage the data related to productivity and quality and to measure post-editing effort; and b) human evaluation of the subtitles was collected in the form of comments, following the recommendations of Läubli et al. (2020). The results of this study have multiple applications: as evidence of the shortcomings of automation processes in the accessibility of academic video presentations, and as an indication to post-editors (both professional and non-professional translators) of the types of errors that such processes may generate.