Saturday, February 28, 2009

On Mendenhall and compelling evidence of Marlowe authorship by Daryl Pinksen

In 1887, physicist T.C. Mendenhall decided to study the word-length frequencies of writers (the frequency at which writers tended to use words of varying lengths) in a quest to find a mathematical way of proving or disproving authorship. To accomplish this, he counted the numbers of one-letter, two-letter, three-letter words, etc., in a text, calculated the percentage of words of each length (the frequency), and displayed the results on a graph. He discovered that each writer’s word-length frequency curve remained consistent and, more importantly, that each writer’s characteristic curve differed from those of other writers.1
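The counting procedure described above can be sketched in a few lines of Python. This is only an illustration, not Mendenhall's actual procedure: the function name `word_length_frequencies`, the rule that a word is a run of letters (apostrophes ignored), and the choice to lump all words longer than `max_len` into the last bucket are assumptions made for the example.

```python
import re
from collections import Counter

def word_length_frequencies(text, max_len=12):
    """Return {word length: occurrences per 1000 words} for lengths 1..max_len.

    Assumptions (not from Mendenhall): a word is a run of letters,
    apostrophes are ignored, and words longer than max_len are
    counted in the max_len bucket.
    """
    # Find runs of letters, allowing internal apostrophes ("o'er", "don't").
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)*", text)
    total = len(words)
    if total == 0:
        return {n: 0.0 for n in range(1, max_len + 1)}
    # Count each word by its number of letters, capped at max_len.
    counts = Counter(min(len(w.replace("'", "")), max_len) for w in words)
    # Express each count as a rate per thousand words.
    return {n: 1000 * counts.get(n, 0) / total for n in range(1, max_len + 1)}

freqs = word_length_frequencies("To be or not to be that is the question")
for length, per_thousand in freqs.items():
    print(length, round(per_thousand, 1))
```

Plotting word length against the per-thousand rate gives the kind of characteristic curve the article refers to. A ten-word line is far too short to mean anything, of course; the studies discussed here worked with very large samples.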

In 1901, a wealthy Bostonian named Augustus Hemingway, having heard of Mendenhall’s work, commissioned him to carry out a comparative study of Shakespeare and his contemporaries, including Christopher Marlowe. Hemingway believed Francis Bacon wrote the works of Shakespeare, and he was convinced that Mendenhall had found a way to prove it.

Using Hemingway’s money, Mendenhall hired two women to count the words of various lengths in each of the works. They had to manually count millions of words. Unfortunately for Hemingway, the overall word-length frequencies of Bacon and Shakespeare were very different (although, to be fair, Mendenhall compared Bacon’s prose with Shakespeare’s blank verse, two genres since shown to differ markedly in word-length frequency).

Mendenhall discovered that Shakespeare used significantly more four-letter words than three-letter words. Every other English writer Mendenhall studied, including Shakespeare’s playwright contemporaries, used more three-letter words than any other length. After a while, Mendenhall and his assistants could recognize unidentified blocks of Shakespeare from the four-letter-word spike alone. But when his assistants began to count the words in the Marlowe plays, Mendenhall realized that the Shakespeare curve was not unique after all. To his surprise, the Marlowe and Shakespeare curves were nearly identical. Here is Mendenhall’s reaction to the discovery:

“It was in the counting and plotting of the plays of Christopher Marlowe, however, that something akin to a sensation was produced among those actually engaged in the work. Here was a man to whom it has always been acknowledged, Shakespeare was deeply indebted; one of whom able critics have declared that he ‘might have written the plays of Shakespeare.’ … Even this did not lessen the interest with which it was discovered that in the characteristic curve of his plays Christopher Marlowe agrees with Shakespeare about as well as Shakespeare agrees with himself, as is shown in Fig.9” 2

(In Fig. 9, word length is shown on the horizontal axis, and frequency, in number of words out of a thousand, on the vertical axis. For example, Mendenhall found that Marlowe and Shakespeare both used 2-letter words about 175 times out of 1,000, or 17.5% of the time.)

Mendenhall’s study made no impact on Shakespearean scholarship, but it did energize a small number of proponents of the Marlowe theory and persuaded others that perhaps the Marlowe theory was worth further investigation.

The study sat idle for decades until Peter Farey decided to extend Mendenhall’s work.3 Farey chose a group of authors and, using electronic texts and word-counting software, performed a comparison of two large chunks of text by each writer, calculating how closely each writer agreed with him or herself. All of the writers agreed with themselves quite closely.4
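To make the idea of a writer "agreeing with himself" concrete, here is a minimal sketch that splits a text into two halves and measures how far apart their curves are. The distance used (the sum of absolute per-thousand differences) is an assumption chosen for simplicity; it is not the statistical measure Farey used, which is not described in this post. The names `curve`, `curve_distance` and `self_agreement` are likewise hypothetical.

```python
import re
from collections import Counter

def curve(text, max_len=12):
    """Word-length curve as a list: occurrences per 1000 words for lengths 1..max_len."""
    words = re.findall(r"[A-Za-z]+", text)
    counts = Counter(min(len(w), max_len) for w in words)
    total = len(words) or 1  # avoid dividing by zero on empty text
    return [1000 * counts.get(n, 0) / total for n in range(1, max_len + 1)]

def curve_distance(a, b):
    """Sum of absolute per-thousand differences (0 means identical curves).

    This is an illustrative measure only, not Farey's statistic.
    """
    return sum(abs(x - y) for x, y in zip(a, b))

def self_agreement(text):
    """Split a text into two halves by word count and compare their curves."""
    words = text.split()
    half = len(words) // 2
    first, second = " ".join(words[:half]), " ".join(words[half:])
    return curve_distance(curve(first), curve(second))

print(self_agreement("to be or not to be"))
```

The same `curve_distance` can be applied to the curves of two different writers, so that a between-writer figure can be set against each writer's own self-agreement figure.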

Next, Farey moved on to a comparison of Shakespeare and Marlowe. Mendenhall had not noticed that authors' word-length frequency usage could change over time and genre, but Farey had discovered that the word-length frequency curves for Marlowe's earlier works differed from his later ones, and that Shakespeare's comedies differed from his non-comedies. The differences were subtle but significant. Accordingly, Farey chose to compare Marlowe's later plays with Shakespeare's histories and tragedies (since none of the Marlowe works are comedies) to eliminate the effects of time and genre. In his Marlowe-Shakespeare comparison, Farey found the agreement was statistically closer than any other writer had been with himself.5

The match between the two curves is astonishing. Farey’s method is easily reproducible, and it is accompanied by rigorous statistical analysis. As such, it merits serious attention from mainstream scholarship. Like Mendenhall’s before it, however, Farey’s work has not penetrated the mainstream to any large extent. The standard scholarly explanation of the origins of Shakespeare’s style, that he began his career by imitating Marlowe, is invoked to explain why the word counts of the two writers are so similar. But even allowing for this, it would still require a remarkable coincidence for Shakespeare to match Marlowe this closely, especially since it would have to have happened unconsciously.

My contribution to the debate was to look at the word counts of individual Shakespeare plays. Did they all look more or less alike? Using a counting method shared with me by Peter Farey, I counted the words of twenty-one Shakespeare plays, randomly chosen across the whole canon. I plotted all twenty-one curves, along with the overall average, on a single graph.

What is immediately striking is the amount of variation between individual plays. What this variation tells us is that the average curve for Shakespeare could have assumed many different shapes. There was no underlying principle forcing it to average out in this manner.6
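The relationship between individual curves and their average can be illustrated as follows. The three "plays" below are invented placeholder numbers covering only word lengths 1 to 4, not counts from any real play, and the unweighted point-by-point mean is an assumption: the overall average in the graph may have been computed differently (for example, weighted by the length of each play).

```python
from statistics import mean

def average_curve(curves):
    """Point-by-point (unweighted) mean of several equal-length curves."""
    return [mean(values) for values in zip(*curves)]

def spread(curves):
    """Range (max minus min) at each word length across the curves."""
    return [max(values) - min(values) for values in zip(*curves)]

# Invented per-thousand rates for word lengths 1..4 (placeholders, not real data).
play_a = [40, 170, 220, 240]
play_b = [50, 180, 240, 220]
play_c = [30, 175, 230, 230]

plays = [play_a, play_b, play_c]
print(average_curve(plays))  # [40, 175, 230, 230]
print(spread(plays))         # [20, 10, 20, 20]
```

In this toy case the individual curves disagree about whether three-letter or four-letter words are more common, yet the average shows them level. The average is a property of the whole set, which is why the shape it settles into is not fixed in advance by any single play.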

The graph emphasizes how remarkable Mendenhall’s and Farey’s results actually are. The probability that two writers, both showing variability in individual plays, could arrive at the same average curve by chance is exceedingly small. Mendenhall’s and Farey’s studies provide compelling evidence for a Marlowe authorship of the Shakespeare plays.

Daryl Pinksen

© Daryl Pinksen, February 2009

Daryl Pinksen, a regular contributor to MSC and author of Marlowe's Ghost, is a Fellow of the School of Graduate Studies at Memorial University of Newfoundland.



Absolutely fascinating, great website!

thanks for this great article daryl

very interesting

thanks for the post. another thing stratfordians must answer.

I've enjoyed reading Mr. Pinksen's posts . . .

Great, great article (I first read this in Mr. Pinksen's Marlowe's Ghost).

a very interesting post.

The stylometric work - the Mendenhall results in particular - and the refinements of Peter Farey, which show an almost perfect fit between the late Marlowe works (1588 to 1593) and the Shakespeare non-comedies, are truly remarkable. Nevertheless, these results do not prove that Marlowe wrote the Shakespearean works solo. Clearly, Shakespeare was profoundly interested in, and fascinated by, Marlowe’s pioneering developments in the utilisation of blank verse. He would have emulated Marlowe’s style, and would have sought to expand and develop it. Although a good case can be made for Marlowe/Shakespeare co-authorship of many of the early Shakespearean plays (perhaps involving collaboration, and/or Shakespeare’s revision of unpublished Marlowe works), it is less certain that the later plays (i.e., post 1600) reflect the direct participation of Marlowe, notwithstanding that his influence remains in them. The biggest stumbling block for acceptance of the non-orthodox view is the absence of direct (non-circumstantial) evidence for Marlowe’s survival beyond 1593, notwithstanding that there are considerable anomalies in the coronial report. Of course, if such evidence were to surface, then Marlovian theory would be more widely understood and accepted as a reasonable interpretation of the origins of most of the Shakespearean works. I don’t believe the story of Mr Le Doux is sufficiently compelling to fill that gap.
John Hermann

Peter said...

John Hermann said... The biggest stumbling block for acceptance of the non-orthodox view is the absence of direct (non-circumstantial) evidence for Marlowe’s survival beyond 1593, notwithstanding that there are considerable anomalies in the coronial report.

Hello again, John,

That is not the only place, of course, where a whole lot of anomalies have to be explained. For example, in my "The Riddle of the Monument" at you will find a multitude of anomalies concerning the Stratford monument. I have been able to find only one way in which every one of them can be explained. This is that the poem on the monument is in fact a riddle concealing the message that "Christofer Marley" (as Marlowe signed his own name) "is returned" and is in some way "in" the monument with Shakespeare. If you can find a better single explanation for all of them, please let me know!

Peter Farey

Peter Farey said...

The last time I tried to post a website address, it appeared severely truncated. I therefore obtained a "tiny URL" for it, and posted that. Having done so, I discovered that by some miracle the whole of the original address had now appeared in my first post, making me look somewhat stupid.

Looking at my post this time I again see only an address which stops half-way through, at "Riddle_of_the". As I post the following, it still does, but I guarantee that as soon as I post the "tinyurl" for it - - the blinking thing will correct itself once again!

Peter Farey