Reading Between the Lines: Using Math to Uncover Hidden Patterns in Books

10.08.23 08:06 AM By Ines Almeida

An ‘ousiogram’ (Dodds et al., 2021) displaying power and danger scores for a subset of 14,499 unique words appearing in Terry Pratchett’s 41-book Discworld series.

Books may seem like straightforward stories, but researchers are finding mathematical patterns hidden in the text. By tracking how words are used over the course of a book in minute detail, they can reveal new insights into plot, emotion, and structure that are not visible to the naked eye.


The researchers started by scoring a large number of words based on their emotional meaning. For example, positive words like "love" scored higher while negative words like "war" scored lower. They used a framework called "ousiometrics" which boils down emotions to two key dimensions: power and danger. Power relates to agency, confidence, and positivity. Danger relates to emotional uncertainty, negativity, and aggression.


They then took thousands of books and broke each one into short segments of 50 words. For each segment, they calculated the average power and danger scores of the words present. This turned each book into a rolling wave of numbers, with peaks marking more emotionally charged sections and valleys marking more neutral ones.
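The segmenting and averaging step described above can be sketched in a few lines of Python. This is only an illustration, not the researchers' code: the tiny `TOY_SCORES` lexicon and its values are invented, the function name `segment_scores` is hypothetical, and two choices are assumptions (segments do not overlap, and words missing from the lexicon are skipped).

```python
from statistics import mean

# Hypothetical toy lexicon: word -> (power, danger).
# The values are invented for illustration, not taken from the study.
TOY_SCORES = {
    "love": (0.6, -0.5),
    "war": (0.1, 0.8),
    "hero": (0.7, 0.1),
    "fear": (-0.4, 0.7),
}

def segment_scores(words, scores, window=50):
    """Average (power, danger) over consecutive, non-overlapping segments.

    Words without a score are ignored (an assumption). A segment with no
    scored words yields None.
    """
    series = []
    for start in range(0, len(words), window):
        chunk = words[start:start + window]
        scored = [scores[w] for w in chunk if w in scores]
        if not scored:
            series.append(None)
            continue
        power = mean(p for p, _ in scored)
        danger = mean(d for _, d in scored)
        series.append((power, danger))
    return series
```

Calling `segment_scores(text.lower().split(), TOY_SCORES)` on a book's text would give one (power, danger) pair per 50-word segment, which is the "wave of numbers" the article describes.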


Short books generally showed a steady wave pattern, while long books had more fluctuations in emotion over the course of the text. Surprisingly, when the researchers zoomed in on long books, they found that the fluctuating highs and lows had a consistent length of a few thousand words. This matches the typical length of chapters in published fiction.


To study the patterns further, the researchers used a technique called empirical mode decomposition that breaks down fluctuations in data into distinct components, much like musical notes make up chords. The text segments were also compared to "shuffled" versions of the books with random word order. The real books differed from the random versions after a certain decomposition level, indicating that the fluctuations were not random but reflected an underlying structure.
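The idea behind the shuffled comparison can be illustrated with a much simpler stand-in. The sketch below does not perform empirical mode decomposition. It smooths a series of per-word scores with a plain moving average and measures how much the smoothed curve varies, then repeats the measurement on randomly shuffled copies. The function names, the smoothing width, and the number of trials are all assumptions made for illustration.

```python
import random
from statistics import mean, pstdev

def moving_average(series, width):
    """Smooth a series by averaging each run of `width` consecutive values."""
    return [mean(series[i:i + width]) for i in range(len(series) - width + 1)]

def smoothed_spread(series, width):
    """Standard deviation of the smoothed series: a crude measure of how
    strongly the scores fluctuate at roughly this scale."""
    return pstdev(moving_average(series, width))

def shuffle_baseline(series, width, trials=200, seed=0):
    """Average smoothed spread over randomly shuffled copies of the series.

    Shuffling keeps the same values but destroys their order, so any
    structure that depends on order disappears.
    """
    rng = random.Random(seed)
    spreads = []
    for _ in range(trials):
        shuffled = list(series)
        rng.shuffle(shuffled)
        spreads.append(smoothed_spread(shuffled, width))
    return mean(spreads)
```

If `smoothed_spread` for the real text is clearly larger than `shuffle_baseline` at a given width, the fluctuations at that scale depend on word order rather than on the mix of words alone. That is the same logic as the comparison above, with a cruder measure in place of the decomposition.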


These findings suggest that longer books have a wave-like shape closer to that of a collection of short stories or chapters. The emotional ups and downs of the text cycle on a scale of thousands of words, perhaps reflecting how long the human brain can comfortably process a complex narrative before needing a reset. Shorter books lacked these larger fluctuations.


While we intuitively understand how passages evoke certain moods, the researchers were able to quantify the pacing of emotional highs and lows mathematically. Their work helps confirm the existence of nested patterns in writing: punctuation marks off phrases, paragraphs offer local structure, chapters provide mid-level segments, and arcs emerge over the full book.


So the next time you open a book, think about the hidden rhythms inside that subtly influence your experience. The feelings evoked in the story may follow mathematical waves as you steadily progress from cover to cover. This emerging field opens up new ways of appreciating the art and science of expert storytelling.


Sources:

A decomposition of book structure through ousiometric fluctuations in cumulative word-time
