A preliminary investigation into the use of fixed formulaic sequences as a marker of authorship

Larner, Samuel (ORCID: 0000-0002-8386-3789) (2014) A preliminary investigation into the use of fixed formulaic sequences as a marker of authorship. International Journal of Speech Language and the Law, 21 (1). ISSN 1748-8885

PDF (Author Accepted Manuscript) - Accepted Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.

Official URL: http://dx.doi.org/10.1558/ijsll.v21i1.1

Abstract

This research unites the theory of formulaic language (prefabricated sequences of words believed to be stored as holistic units) with the practice of forensic authorship attribution, with a view to developing a new marker of authorship. It stands to reason that, since formulaic sequences are holistically processed as single lexical items, they are likely to elude a writer’s attempts to disguise their style. Furthermore, evidence suggests that individuals have different stores of formulaic sequences. Research into differences in formulaic language usage may therefore assist in the development of new tools for authorship attribution. To test this assertion, a reference list containing 13,412 formulaic sequences was compiled from multiple online sources. This list was then used to identify formulaic sequences in a 20-author corpus containing 100 personal narratives. After exploring the types of formulaic sequences used by authors, statistical tests were used to determine whether the count of formulaic words was sufficient to establish variation between authors and to attribute a Questioned Text to its author.
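
As an illustration only, the counting step described in the abstract (matching a reference list of formulaic sequences against a text and tallying the words that fall inside matches) might be sketched as follows. The abstract does not specify the matching procedure, so the tokenisation, the greedy longest-match rule, the function names, and the three-item toy reference list are all assumptions made for this sketch, not the method or data used in the study.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens (an assumed, simple scheme)."""
    return re.findall(r"[a-z']+", text.lower())


def count_formulaic_words(text, reference_list):
    """Return (formulaic_word_count, total_word_count) for one text.

    Matching is greedy, longest-first, and non-overlapping. This is an
    illustrative assumption, not the procedure reported in the paper.
    """
    tokens = tokenize(text)
    # Store each reference sequence as a tuple of tokens for fast lookup.
    sequences = {tuple(tokenize(s)) for s in reference_list}
    sequences.discard(())
    max_len = max((len(s) for s in sequences), default=0)

    formulaic = 0
    i = 0
    while i < len(tokens):
        matched = 0
        # Try the longest candidate window first, then shrink it.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            if tuple(tokens[i:i + n]) in sequences:
                matched = n
                break
        if matched:
            formulaic += matched
            i += matched
        else:
            i += 1
    return formulaic, len(tokens)


# Hypothetical toy reference list, standing in for the 13,412-item list.
toy_reference = ["at the end of the day", "all of a sudden", "by and large"]
sample = "All of a sudden it rained, but at the end of the day we were fine."
formulaic, total = count_formulaic_words(sample, toy_reference)
print(formulaic, total)
```

Per-text counts of this kind, computed for each of an author's narratives, are the sort of quantity that could then be compared across authors with statistical tests.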

