The LA Times’ Jacket Copy blog — tagline “Books, Authors and All Things Bookish” — recently alerted me to one of the most enjoyable “quickie time-waster websites” I’ve found in quite some time: I Write Like.
Based on “a Bayesian classifier” — the programming backbone for a number of ordinary, run-of-the-mill spam filters — the website is disarmingly simple: a blank box for “cutting and pasting,” an “Analyze” button, and a brief assurance that the website does not retain any of the material submitted.
One’s reward for a moment or two of cutting and pasting? The name of the author whom I Write Like’s Bayesian classifier identifies as most stylistically similar to one’s own bit of text.
In the words of the site’s creator: “This is basically how ‘I Write Like’ works on my side: I feed it with ‘Frankenstein’ and tell it, ‘This is Mary Shelley. Recognize works similar to this as Mary Shelley.’ Of course, the algorithm is slightly different from the one used to detect spam, because it takes into account more stylistic features of the text, such as the number of words in sentences, the number of commas, semicolons, and whether the sentence is a direct speech or a quotation.”
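For the curious, here is a rough sketch of what a classifier along these lines might look like. It is not the site's actual code, which I have not seen; it is a plain naive Bayes classifier in Python that counts words alongside a few coarse stylistic markers (sentence length, commas, semicolons, quotation marks). The names `features` and `StyleClassifier`, the bucket sizes, and the toy training passages are all my own inventions for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def features(text):
    """Turn text into a bag of tokens: words plus coarse stylistic markers.

    The marker names and bucket sizes are illustrative assumptions.
    """
    feats = []
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    for s in sentences:
        words = re.findall(r"[a-z']+", s.lower())
        feats.extend(words)
        # Sentence length in buckets of five words, capped at bucket 6.
        feats.append(f"LEN_{min(len(words) // 5, 6)}")
        # Number of commas in the sentence, capped at 4.
        feats.append(f"COMMAS_{min(s.count(','), 4)}")
        if ";" in s:
            feats.append("HAS_SEMICOLON")
        if any(q in s for q in '"\u201c\u201d'):
            feats.append("HAS_QUOTE")
    return feats


class StyleClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.counts = defaultdict(Counter)  # author -> feature counts
        self.docs = Counter()               # author -> number of training texts
        self.vocab = set()                  # every feature seen in training

    def train(self, author, text):
        feats = features(text)
        self.counts[author].update(feats)
        self.docs[author] += 1
        self.vocab.update(feats)

    def classify(self, text):
        feats = features(text)
        total_docs = sum(self.docs.values())
        best, best_score = None, -math.inf
        for author, counts in self.counts.items():
            total = sum(counts.values())
            # Log prior plus the summed log likelihood of each feature.
            score = math.log(self.docs[author] / total_docs)
            for f in feats:
                score += math.log((counts[f] + 1) / (total + len(self.vocab)))
            if score > best_score:
                best, best_score = author, score
        return best


if __name__ == "__main__":
    clf = StyleClassifier()
    # Made-up training passages standing in for two hypothetical authors.
    clf.train("terse", "The dog ran. The cat sat. It was dark.")
    clf.train(
        "ornate",
        "When the evening came, and the lamps were lit, we gathered, "
        "as was our custom, in the hall; there, by the fire, we spoke "
        "of many things.",
    )
    print(clf.classify("The bird flew. The fish swam."))
```

With only a sentence or two of training text per "author," a toy like this latches onto whatever happens to differ between the samples, which may go some way toward explaining the odd verdicts a real version can produce.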
The website’s creator says that “the current version includes 50 writers. First versions included authors from the bestsellers list on Wikipedia, top downloaded books from The Gutenberg Project (a public library of out-of-copyright books), and the ones I could remember.”
I entered a number of my reviews and blog posts from here at InsideCatholic, and was almost universally informed that “I write like H.P. Lovecraft.” (The only exception was a single article that was, apparently, in the style of James Joyce. After a moment of horror, I rejected the result as an outlier.)
While amusing, the algorithm does not seem entirely reliable. Entering a number of Mark Twain’s short stories resulted in a 100% recognition rate; Mark Twain, strangely enough, writes very much “like Mark Twain.” Dickens’s “A Christmas Carol,” on the other hand, is written in the style of Stephen King.
Perhaps a more complete analysis is necessary.