Mining Bestselling Novels

Explore the content of novels, e.g. what is a book about?

This app lets users explore both manuscripts and published novels by uploading texts in EPUB or PDF format. In less than a minute, it can help answer questions such as:

  • What’s it about?
  • Who are the main characters?
  • Is it a potential bestseller?
  • Is the plot an intense page-turner or slow-paced?
  • Which famous novels are similar in semantics and themes?

This is a work in progress, but if you’re curious about this project, please get in touch!

Some of the main features are shown in the images below.

Metadata, similar books and bestseller score


Readability and plot curves


Named entity recognition


Entity co-occurrence


Tools and methods
  • R
  • Shiny
  • H2O
  • D3.js
  • Named Entity Recognition
  • Topic Modelling
  • (Text) feature engineering
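
To give a rough idea of what the entity co-occurrence feature involves, here is a minimal sketch. It is not the app's implementation: the app itself is built in R, and this example uses Python only for illustration. It assumes the character names have already been extracted by a named entity recognition step, and it treats two characters as co-occurring when both names appear in the same sentence. The function name `cooccurrence_counts`, the naive sentence splitting and the sample text are all hypothetical.

```python
from collections import Counter
from itertools import combinations
import re


def cooccurrence_counts(text, entities):
    """Count how often each pair of entity names appears in the same sentence.

    Assumption: `entities` is a list of names already found by an NER step.
    Matching is a plain substring test, which is a simplification.
    """
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    counts = Counter()
    for sentence in sentences:
        # Sort the names so each pair is always stored in the same order.
        present = sorted(name for name in entities if name in sentence)
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts


# Hypothetical sample text, used only to exercise the function.
sample = (
    "Elizabeth met Darcy at the ball. "
    "Jane spoke with Bingley. "
    "Darcy later wrote to Elizabeth."
)
pairs = cooccurrence_counts(sample, ["Elizabeth", "Darcy", "Jane", "Bingley"])
```

The resulting pair counts could serve as edge weights in a character network, which is the kind of graph a D3.js visualisation can draw.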