Language: Spanish
Date: Uploaded: 2021-04-13T00:00:00+02:00
Duration: 15m 17s
Venue: Conference
Views: 673

Quantifying discourse coherence with a complex network method

Jiang Niu and Yue Jiang (Xi'an Jiaotong University)


A number of methods for quantifying discourse coherence have been put forward, such as Belza Chain (Altmann, 2018; Roelcke, Popescu & Altmann, 2017), Latent Semantic Analysis (Foltz et al., 1998), the entity-based approach (Barzilay & Lapata, 2008) and an approach based on importance time series (Najafi & Darooneh, 2017). Helpful as they are, some of these methods consider only salient discourse entities rather than all discourse elements, and some focus only on the lexical level without paying due attention to semantic relatedness. Recently, Niu, Jiang & Zhou (2020) proposed a complex network approach to coherence research based on theme-rheme structure in Systemic Functional Linguistics and confirmed its applicability. Its strength is that it considers not only semantic relatedness but also all discourse elements, represented as themes and rhemes. Building on that study, the present study proposes network parameters as a new measure of discourse coherence and examines how reliable the previous methods and the network method are in quantifying coherence. To operationalize this goal, the above-mentioned methods were used to predict translation mode, that is, to distinguish machine translation from human professional translation. Translation mode was selected as the dependent variable because machine translation is widely acknowledged to be lower in coherence than human professional translation, since it gives less consideration to context. Methodologically, we collected 40 Chinese essays as source texts and their English translations as target texts, including both human professional translations and machine translations, and built a comparable translational corpus from them.
Drawing on the themes, rhemes and their progressions in the corpus, we then constructed semantic networks in which themes and rhemes are nodes and their semantic relatedness forms the edges. We subsequently extracted network parameters including in-degree, out-degree and network density. These parameters reveal how closely a discourse is connected semantically. To analyze the data, a stepwise logistic regression was conducted to investigate how reliably the previous methods and the proposed network method predict translation mode. Our preliminary finding is that network parameters are more reliable than the previous methods, owing to the strengths mentioned above. This by no means implies that other methods should be discarded, because coherence is a multi-dimensional construct. We hope that this study encourages network representations of corpora for specific research purposes such as coherence research, that it contributes to coherence quantification by providing a network method, and that it is instructive for document-level machine translation research.
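
To make the network parameters named in the abstract concrete, the following is a minimal sketch in Python. It is not the authors' implementation. The abstract does not say how semantic relatedness is decided or how density is defined, so the sketch assumes a directed graph without self-loops or duplicate edges, the usual directed density m / (n(n - 1)), and a hand-written edge list. The function name `network_parameters`, the node labels (`T1`, `R1`, ...) and the toy edges are all hypothetical.

```python
def network_parameters(edges):
    """Return (in_degree, out_degree, density) for a directed graph.

    `edges` is an iterable of (source, target) pairs. Self-loops and
    duplicate edges are dropped, which is an assumption of this sketch.
    """
    unique_edges = {(s, t) for s, t in edges if s != t}
    nodes = {node for edge in unique_edges for node in edge}

    in_degree = {node: 0 for node in nodes}
    out_degree = {node: 0 for node in nodes}
    for source, target in unique_edges:
        out_degree[source] += 1
        in_degree[target] += 1

    # Directed density: existing edges over all possible ordered pairs.
    n = len(nodes)
    density = len(unique_edges) / (n * (n - 1)) if n > 1 else 0.0
    return in_degree, out_degree, density


# Hypothetical toy discourse: T = theme, R = rheme of clauses 1 to 3.
# An edge means the two elements were judged semantically related.
toy_edges = [
    ("T1", "R1"),
    ("R1", "T2"),
    ("T2", "R2"),
    ("T1", "T3"),
    ("T3", "R3"),
]

in_deg, out_deg, density = network_parameters(toy_edges)
print("in-degree:", in_deg)
print("out-degree:", out_deg)
print("density:", round(density, 3))
```

Under these assumptions, a text whose themes and rhemes link to many other elements yields higher degrees and a denser network, which is the sense in which such parameters indicate how closely a discourse is connected.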


CILC 2021 Conference


Series: CILC2021: Discurso, análisis literario y corpus / Discourse, literary analysis and corpora