Language: Spanish
Date uploaded: 2021-04-14T00:00:00+02:00
Duration: 17m 16s
Location: Conference
Views: 971

Quantitative Study on Authorship Attribution with F-motifs

Xuan Yang, Yue Jiang (Xi'an Jiaotong University) and Letao Wang (Chang'an University)

Description

A linguistic motif is defined as the longest continuous sequence of equal or increasing values representing a quantitative property of a linguistic unit. It offers information about the sequential organization of a text with respect to any linguistic unit and any of its properties, without relying on a specific linguistic approach or grammar. Previous studies have mainly focused on the L-motif (i.e. the length motif, measured in units such as syllables), whereas few have tried to verify and apply the F-motif. The F-motif, defined as a continuous series of equal or increasing frequency values (e.g. of morphs, words or syntactic construction types), is as potent as the L-motif in text classification and authorship attribution. The current study investigates whether the F-motif can be employed for authorship attribution in terms of its length and frequency distribution. In this study, the rank-frequency distribution (modeled by the Zipf-Mandelbrot distribution), the length distribution and the proportion of hapax legomena of F-motifs are calculated and compared across texts of The Lord of the Rings and Harry Potter. Results show that the parameters of the Zipf-Mandelbrot model, the most frequent length class of F-motifs, and the ratio of F-motif hapax legomena to the number of F-motif tokens can effectively differentiate the texts of the two authors from each other. This study can be seen as an innovative attempt to verify whether the parameters of the ZM model, the proportion of hapax legomena and the length of F-motifs are effective in authorship attribution. By calculating F-motifs, we can quantitatively present the linear sequential features of word frequency in each text. By observing the frequency and length of F-motifs, we can obtain more information about the texts in an unambiguous and comprehensive manner, offering insights into the function of F-motifs as indicators for authorship attribution.
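To make the definition concrete, the following is a minimal Python sketch of how a text might be segmented into F-motifs and how two of the indicators named in the abstract could be computed. It is an illustration under stated assumptions, not the authors' procedure: it assumes the unit is the word, that a word's frequency is its count within the same text, and that tokenization has already been done. The function names `f_motifs` and `motif_stats` are hypothetical.

```python
from collections import Counter


def f_motifs(tokens):
    """Split a token sequence into F-motifs.

    Assumption: each token is replaced by its frequency in this same text,
    and a motif is a maximal run of equal or increasing frequency values.
    """
    freq = Counter(tokens)
    motifs, current = [], []
    for token in tokens:
        value = freq[token]
        # A drop in frequency closes the current motif and starts a new one.
        if current and value < current[-1]:
            motifs.append(tuple(current))
            current = []
        current.append(value)
    if current:
        motifs.append(tuple(current))
    return motifs


def motif_stats(motifs):
    """Summarize a list of F-motifs (hypothetical helper for illustration)."""
    if not motifs:
        raise ValueError("no motifs to summarize")
    type_counts = Counter(motifs)                    # frequency of each motif type
    length_counts = Counter(len(m) for m in motifs)  # motif length distribution
    hapax = sum(1 for c in type_counts.values() if c == 1)
    return {
        # Motif-type frequencies in descending order (rank 1 first).
        "rank_frequency": [c for _, c in type_counts.most_common()],
        # Most frequent motif length class.
        "modal_length": length_counts.most_common(1)[0][0],
        # Motif types occurring once, relative to the number of motif tokens.
        "hapax_proportion": hapax / len(motifs),
    }


tokens = "the cat saw the dog and the dog saw the cat".split()
motifs = f_motifs(tokens)
stats = motif_stats(motifs)
```

The `rank_frequency` list is the kind of input to which a Zipf-Mandelbrot model could then be fitted; the fitting step is left out here because the abstract does not specify how it was carried out.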

Owners

Congreso Cilc 2021

Series: CILC2021: Usos específicos de la lingüística de corpus / Special uses of corpus linguistics