Language: Spanish
Uploaded: 2021-04-15T00:00:00+02:00
Duration: 23m 36s
Venue: Conference
Views: 929

Methodology and mythology in corpus linguistic enquiry

Richard Chapman (Università degli Studi di Ferrara)

Description

This paper reflects on the methodological assumptions that underpin corpus linguistic enquiry and relates them to changes in the collection of data and in the nature of data itself.
Corpus linguistics boasts a consolidated history of theoretical discussion, and most authors urge caution about the limitations of any corpus assisted discourse studies (CADS) project. Even so, the paper suggests that we should be careful in assuming that the theoretical accomplishments and understandings of the past will continue to equip and sustain corpus-based language investigation adequately in the future, for two reasons. Firstly, the nature of the data available (and of the tools to process them) is being transformed. Secondly, questions of access and ownership are influencing sampling and risk leaving academic corpus research behind, as other entities enjoy unfettered use of highly significant stores of data that preserve and transcribe linguistic behaviour, often as a side-effect of other activities.
At the same time, corpus linguistics can be said to have moved, or extended its reach, with corpus assisted discourse studies aiming at an ever-more-contextualised view of the language to be found in a corpus, and attempting to reach pragmatic as well as semantic or grammatical readings of data. There may even come a point when we are closer to sampling and analysing society than ‘just’ language. Certainly there is a danger that the more ambitious corpus linguistic investigation is, the less firmly grounded it is in terms of the empirical data and metadata that must accompany any attempts to read language in use.
The paper identifies some of the most pressing theoretical issues at the present juncture of corpus assisted research, attempting to outline the limitations they present and occasionally suggesting innovative solutions or adjustments to current methodological practice. Issues mentioned include: the ethics of data collection (in the era of Alexa, perhaps we need to re-examine which ways of gathering evidence of linguistic behaviour are morally justified, and to reconsider copyright and its effects); reassessing what can theoretically be regarded as a corpus (e.g. the use of Google itself as a rough-and-ready mega-corpus, which is common, however frowned upon it might be, and the proliferation of small, do-it-yourself corpora), and how we as researchers should compare data drawn from highly contrasting corpora; the challenge of the era of big data (this may seem a boon to corpus linguistics, as memory has become cheaper and data invariably reaches us already in digital form, but it poses enormous questions for the researcher); the complexity of ‘real’ pragmatics (which should make us cautious in attributing clear-cut judgements or labels to instances of language thrown up by interrogation of our corpora); the eternal question in corpus linguistics of whether we can reach elegant and accurate generalisations about language or recognise morphological or semantic patterns while remaining true to our empirical methods; the relationship between highly sophisticated statistical analysis and hands-on (re-)reading of data; the need for detailed contextualisation to remain empirical at all; the nature of claims about ‘truth’ in present-day scientific research.
The paper concludes by attempting to point out potential methodological changes in corpus linguistic practice that might be considered necessary in light of transformations in language, culture and resources.

Owners

Congreso Cilc 2021

Series: CILC2021: Diseño, compilación y tipos de corpus / Corpus design, compilation and types