Wu flu, virus couronné, chiński wirus: A multilingual corpus study of COVID-19 neologisms
Natalia Zawadzka-Paluektau and Aleksandra Tomaszewska (University of Warsaw and University of Sevilla)
Corpus and Methods: The analysis is conducted on a corpus of 2,196 news articles (1,600,417 tokens) covering COVID-19, published during the first six months of the pandemic in market-leading newspapers in Poland, France, and the UK. The press was chosen as the study material because it has been shown to play a crucial role in disseminating new words (e.g., Loingsigh, 2018). Two methods of automatic and semi-automatic neologism identification are employed, both integrated into Sketch Engine (Kilgarriff et al., 2014). The first method is based on lexical and punctuational discriminants (Paryzek, 2008; Svanlund, 2018). Because discriminants are language-independent, they allow neologisms to be identified in multilingual corpora. The second method compares the focus corpora with reference corpora to retrieve keywords.
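The two identification procedures described above can be illustrated with a minimal Python sketch. It is not the authors' Sketch Engine setup: the discriminant list, the quotation-mark pattern, and the function names `candidates_by_discriminant` and `keyness` are assumptions made for illustration, and the keyword score shown is the simple-maths ratio of normalized frequencies, which may differ from the settings actually used in the study.

```python
import re
from collections import Counter

# Hypothetical discriminants; the abstract does not list the ones actually used.
LEXICAL_DISCRIMINANTS = ("so-called", "known as", "dubbed")
# Punctuational discriminant: a short string enclosed in quotation marks.
QUOTED = re.compile(r'["“‘«]\s*([^"”’»]{2,40}?)\s*["”’»]')


def candidates_by_discriminant(text):
    """Return strings flagged by quotation marks or by a preceding lexical cue."""
    found = {m.group(1).lower() for m in QUOTED.finditer(text)}
    for cue in LEXICAL_DISCRIMINANTS:
        # Capture the single word that follows the cue (optionally quoted).
        pattern = re.escape(cue) + r'\s+["“‘«]?([\w-]+)'
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            found.add(m.group(1).lower())
    return found


def keyness(focus_tokens, reference_tokens, smoothing=1.0):
    """Score each focus word as (fpm_focus + n) / (fpm_ref + n).

    fpm is the frequency per million tokens; n is a smoothing constant.
    """
    focus, ref = Counter(focus_tokens), Counter(reference_tokens)
    f_total = max(len(focus_tokens), 1)
    r_total = max(len(reference_tokens), 1)
    scores = {}
    for word, count in focus.items():
        fpm_focus = count * 1_000_000 / f_total
        fpm_ref = ref[word] * 1_000_000 / r_total
        scores[word] = (fpm_focus + smoothing) / (fpm_ref + smoothing)
    return scores
```

Words that are absent from the reference corpus receive the highest scores, which is why keyword lists are a natural source of neologism candidates. Both outputs remain candidate lists only and still require manual verification.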
The lists of potential neologisms obtained at the two stages of analysis are then subjected to manual verification. The neologisms thus identified are subsequently examined with regard to their thematic categories, embedding, and internationalization.
Results: The two corpus procedures identified 672 potential neologisms. Manual verification of the neologism candidates reduced their overall number to 300 (105 in the French, 104 in the UK, and 91 in the Polish subcorpus). The identified neologisms are predominantly related to health (e.g., superspreader and its Polish and French equivalents: superroznosiciel and super-épandeur/superpropagateur), as well as to personal and institutional measures for mitigating the effects of the pandemic (e.g., social distancing and its Polish and French equivalents: dystansowanie społeczne and distanciation sociale). Internationalisms (e.g., lockdown, contact tracing) are over 30% more numerous than country-specific neologisms (e.g., TGV médicalisé, tarcza antykryzysowa, mask diplomacy) in the corpus, which provides evidence of the significant internationalization of COVID-19 media discourses in the analyzed period. The study of embeddings has revealed the frequent use of signaling and distancing devices (such as so-called and its Polish and French equivalents).
Conclusions: The study demonstrates that the COVID-19 health crisis has had a considerable impact on the analyzed languages, manifested in a surge of new lexis. The global reach of the pandemic, among other factors, explains the significant number of internationalisms among the identified neologisms. At the same time, the presence of country-specific items attests to the linguistic need to account for the diverging social, legal, institutional, medical, and other circumstances during the pandemic. The use of embedding strategies indicates that the new words have not yet become fully integrated into the analyzed languages. Further research is therefore recommended to establish whether COVID-19 neologisms are ephemeral or whether they are becoming integral elements of the linguistic resources of English, Polish, and French.
Series: CILC2021: Lexicología y lexicografía basadas en corpus / Corpus-based lexicology and lexicography