Mac's Data:
Frequency of Common Words
Common Words That Discriminate
Despite the context-sensitive character of many pronouns and verbs, they have been
used effectively in dozens of authorship studies, along with other high-frequency words.
Very common words that, unlike "that," are ineffective as stand-alone discriminators
may have value as members of a substantial group of words, each with some discriminatory power.
So, as an initial trial, from word lists, ordered by frequency, for Moore and for Livingston,
there were extracted each poet's top fifty words.
All Words in All Poems
Word Frequencies in All
Word Frequencies in Moore
Word Frequencies in Henry
Word Frequencies in Visit
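A frequency-ordered word list of the kind described above can be sketched in a few lines of Python. This is only an illustration, not the procedure Mac used: the function name `top_words`, the tokenizing pattern, and the sample text are all assumptions. Case is preserved here because the favored-word lists further down distinguish forms such as "For" and "for".

```python
import re
from collections import Counter

def top_words(text, n=50):
    """Return the n most frequent words in text, most common first.

    Assumption: a word is any run of letters or apostrophes, and
    capitalized and lowercase forms are counted separately.
    """
    words = re.findall(r"[A-Za-z']+", text)
    return [word for word, _ in Counter(words).most_common(n)]

# Hypothetical sample text, not drawn from either poet's corpus.
sample = "the cat and the dog and the bird"
print(top_words(sample, 2))
```

Running the same function over all of one poet's poems, joined into a single string, would give that poet's top fifty words with the default `n=50`.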
Mac pulled from the frequency listing twenty-six words that were in both Moore's and Henry's poetry and that appeared
twice in "The Night Before Christmas." He placed these in rank order. Mac then applied Spearman's rank-order correlation,
a simple statistical test, to determine whether the rank order of these twenty-six words in Visit
more closely matches their rank order in Henry or their rank order in Moore.
From this data, Mac found the correlation between Visit and Henry to be .7638. The correlation between Visit and
Moore was .6633. This means that the way the words are used in Visit is closer to the way they're used in
Henry's poetry than to the way they're used in Moore's.
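For readers who want to see how such a correlation is computed, here is a minimal sketch of Spearman's rank-order correlation in plain Python. It ranks each list of frequencies (tied values share their average rank) and then takes the ordinary correlation of the two sets of ranks. The function names and the frequencies in the example are hypothetical; they are not Mac's twenty-six words or his counts.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1 (largest) downward; tied values get their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank lists."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# Hypothetical per-word frequencies for the same five words in three texts.
visit = [12, 9, 7, 4, 2]
henry = [30, 35, 18, 10, 12]
moore = [15, 40, 8, 22, 30]

print("Visit vs Henry:", round(spearman(visit, henry), 4))
print("Visit vs Moore:", round(spearman(visit, moore), 4))
```

A value near 1 means the two texts rank the words in nearly the same order; a value near 0 means the orders are unrelated.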
Next, Mac identified words favored by Henry more than by Moore (Henry Favored Words), and by Moore more than by Henry (Moore Favored Words).
After dropping words that had been evaluated in other tests, so as to keep the tests independent, Mac was left with the following:
Henry Favored Words:
I his my her on as is was at
thy will
day When me Where While
Moore Favored Words:
to from your for they be With this our not which so would For it
heart Of are we
Henry and Moore Favored Words in Poems
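One way to turn the two lists above into a single number per poem is sketched below. The scoring rule is an assumption made for illustration, since the text here does not spell out how the percentages were calculated: it counts the Henry-favored words in a poem as a percentage of all favored words (Henry's plus Moore's) found in that poem. The function name `henry_share` is likewise hypothetical. Matching is case-sensitive because the lists include capitalized forms such as "When" and "For".

```python
import re

# The two word lists exactly as given above (case-sensitive).
HENRY_FAVORED = set(
    "I his my her on as is was at thy will day When me Where While".split()
)
MOORE_FAVORED = set(
    "to from your for they be With this our not which so would For it "
    "heart Of are we".split()
)

def henry_share(text):
    """Assumed score: Henry-favored tokens as a percentage of all favored tokens.

    Returns None when the text contains no favored words at all.
    """
    tokens = re.findall(r"[A-Za-z']+", text)
    henry = sum(token in HENRY_FAVORED for token in tokens)
    moore = sum(token in MOORE_FAVORED for token in tokens)
    total = henry + moore
    return 100.0 * henry / total if total else None

# Hypothetical sentence, not a line from any of the poems.
print(henry_share("I was on my way to our house"))
```

Scoring every poem by each poet this way would give two sets of percentages that can then be compared.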
Using a t-test, Mac found it unlikely that Henry's poems and Moore's poems fit within a single population. So he had a differentiator.
Looking at "The Night Before Christmas," Mac found that it fit neatly within Henry's percentages but was an outlier for Moore; that is, it
was at the extreme end of Moore's percentages.
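The two steps above, comparing the poets' percentages with a t-test and then seeing where one poem falls against each poet, can be sketched as follows. The text does not say which form of t-test Mac used, so Welch's unequal-variance version is assumed here, and only the t statistic is computed (turning it into a probability needs a t-distribution table or a statistics library). All of the percentages in the example are made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev

def welch_t(a, b):
    """Welch's t statistic for two samples (assumed variant; unequal variances)."""
    va, vb = stdev(a) ** 2, stdev(b) ** 2
    return (mean(a) - mean(b)) / sqrt(va / len(a) + vb / len(b))

def z_score(value, sample):
    """How many sample standard deviations value lies from the sample mean."""
    return (value - mean(sample)) / stdev(sample)

# Hypothetical per-poem percentages, not Mac's data.
henry_poems = [58.0, 62.5, 55.0, 66.0, 60.5]
moore_poems = [41.0, 38.5, 47.0, 35.0, 44.5]
visit = 61.0

print("t statistic:", round(welch_t(henry_poems, moore_poems), 2))
print("Visit against Henry:", round(z_score(visit, henry_poems), 2))
print("Visit against Moore:", round(z_score(visit, moore_poems), 2))
```

A large t statistic suggests the two sets of poems are unlikely to share a single population. A small score against one poet and a large one against the other is what "fits neatly" versus "outlier" would look like in these terms.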