A non-parametric significance test to compare corpora

Authors: Alexander Koplenig
Author affiliations: Leibniz Institute for the German Language (IDS), Mannheim, Germany
Published in: PLoS ONE 14(9)
Category: Research Article
doi: 10.1371/journal.pone.0222703


Classical null hypothesis significance tests are not appropriate in corpus linguistics because the randomness assumption underlying these testing procedures is not fulfilled. Nevertheless, there are numerous scenarios in which it would be beneficial to have some kind of test for judging the relevance of a result (e.g., a difference between two corpora), that is, for deciding whether the attribute of interest is pronounced enough to warrant the conclusion that it is substantial and not due to chance. In this paper, I outline such a test.

Physical sciences – Mathematics – Discrete mathematics – Combinatorics – Permutation – Statistics – Statistical data – Probability theory – Statistical distributions – Social sciences – Linguistics – Semantics – Sociolinguistics – Biology and life sciences – Neuroscience – Cognitive science – Cognitive psychology – Language – Psychology – Research and analysis methods – Mathematical and statistical techniques – Statistical methods – Test statistics – Statistical inference


