The focus of the book is this: what are the actual features
that distinguish the bestsellers from the non-bestsellers?
At the moment in London, where I am right now,
The Girl on the Train is still a book
that everybody is reading.
And I wondered why that particular text and not another one,
and wondered if we could answer this question
using text mining methods.
I did the bulk of the computational work,
the code writing and the analysis,
and to that Jodie brought her really deep
interpretive skills and her really wide understanding
of contemporary literature.
He definitely brought the tech expertise
to complement my publishing background.
The machine can basically pick out
which are the bestsellers and which are not, 80% of the time.
And when we got this 80% validation of the hypothesis,
we were really, really stunned by that,
and pleased, of course,
but it then led to five years of further research
to deepen the question and try to explain this data.
It's definitely not a how-to book.
This is not a formula book.
You can't just plug in what we reveal
and write a bestseller.
We did find, for example, that the emotional curve,
the curves that we created that represent more or less
the emotional situation that's going on in the text,
was very, very important.
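
To give a rough sense of what an emotional curve could look like in code, here is a toy sketch. It is not the method used in the book, which the interview does not describe. The word lists, the function name emotional_curve, and the n_segments parameter are all assumptions made up for illustration: the text is cut into equal segments and each segment gets a net score of positive minus negative words.

```python
import re

# Hypothetical toy lexicon: these word lists are illustrative assumptions,
# not the vocabulary or model used by the authors.
POSITIVE = {"love", "happy", "joy", "hope", "safe"}
NEGATIVE = {"fear", "dead", "cry", "lost", "hate"}


def emotional_curve(text, n_segments=5):
    """Return one net-sentiment score per equal-sized segment of the text."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return []
    # Never ask for more segments than there are words.
    n_segments = min(n_segments, len(words))
    curve = []
    for i in range(n_segments):
        start = i * len(words) // n_segments
        end = (i + 1) * len(words) // n_segments
        segment = words[start:end]
        # +1 for each positive word, -1 for each negative word.
        score = sum((w in POSITIVE) - (w in NEGATIVE) for w in segment)
        # Normalise by segment length so segments are comparable.
        curve.append(score / len(segment))
    return curve


if __name__ == "__main__":
    sample = "She felt hope and love. Then came fear, and all was lost."
    print(emotional_curve(sample, n_segments=2))
```

Plotting the returned list against segment position would give a crude curve of highs and lows across the text; a real study would presumably need a far richer measure of emotion than a ten-word list.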
In a sense, the machine helps us to do an even closer kind
of close reading than we're accustomed to doing
as literary scholars.
