Automatically understanding the plot of novels is important both for informing literary scholarship and for applications such as book summarization and recommendation. Humans select and recommend novels based on a variety of preferences, such as mood, the types of characters featured, or the relations between them. We present a deep recurrent autoencoder model that learns richly structured multi-view plot representations from raw book text, approximating such preferences. While various models have addressed the task of story understanding, their evaluation has remained largely intrinsic and qualitative. We propose a principled and scalable framework leveraging expert-provided semantic tags (e.g., mystery, pirates) to evaluate plot representations in an extrinsic fashion, assessing their ability to produce locally coherent groupings of novels (micro-clusters) in model space. We show that our learnt multi-view representations yield better micro-clusters than less structured representations, and that they are interpretable and thus useful for further literary analysis or for labelling the emerging micro-clusters.
Lea is a postdoc at the University of Edinburgh, and is currently a visiting scholar in the Language and Cognition lab at Stanford University. Previously, she obtained a PhD from the University of Edinburgh and interned at Amazon Machine Learning in Berlin. In her research, she develops machine learning methods and computational models to gain a deeper understanding of the structure and dynamics of meaning representations, both in language and in humans.