TLDR: A study by Honna, Murayama, and Matsui investigates how text structure, defined by the order and timing of functional blocks, correlates with success across novels, Wikipedia, research papers, and movies. They found that structural principles vary by medium: novels and academic papers show a reverse Anna Karenina Principle (AKP) pattern in order (diverse success, homogeneous failure), while Wikipedia exhibits an AKP pattern (narrow success, diverse failure). Movies are genre-dependent in order, but successful films show an ordered pattern in transition position. The research concludes that success depends on medium-specific structural constraints, with information-oriented texts relying more on order and narrative-oriented texts more on position.
A recent study asks what makes written communication successful by examining structural principles across several forms of text. Researchers Shinichi Honna, Taichi Murayama, and Akira Matsui explored how the arrangement and timing of information within novels, online encyclopedias, research papers, and movies correlate with their reception and impact.
The study analyzes text structure by breaking documents down into sequences of “functional blocks.” These blocks are derived from surface cues such as word profiles, syntactic patterns, and emotional tone, allowing for a language-agnostic representation. The researchers then defined two key metrics: “transition order,” which looks at the sequence of these functional blocks, and “transition position,” which examines where the shifts between blocks occur along the text’s timeline. This made it possible to measure, at corpus scale, how structural similarity aligns with external indicators of success: reader bookmarks for novels, page views for Wikipedia, citations for academic papers, and box-office revenue for movies.
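To make the two metrics concrete, here is a minimal sketch in Python. It assumes a document has already been reduced to a list of functional-block labels, one per segment. The label names, the function names, and the normalization of positions by segment count are illustrative assumptions, not the authors' exact definitions.

```python
from typing import List, Tuple


def transition_order(blocks: List[str]) -> List[Tuple[str, str]]:
    """Return the (from, to) label pairs at each point where the block label changes.

    Assumption: consecutive segments with the same label count as one block.
    """
    return [(a, b) for a, b in zip(blocks, blocks[1:]) if a != b]


def transition_positions(blocks: List[str]) -> List[float]:
    """Return where each label change occurs, as a fraction of the text length.

    Assumption: position is the index of the first segment of the new block
    divided by the total number of segments.
    """
    n = len(blocks)
    if n < 2:
        return []
    return [i / n for i in range(1, n) if blocks[i] != blocks[i - 1]]


# Hypothetical document with eight segments and three block types.
blocks = ["A", "A", "B", "B", "B", "C", "A", "A"]
print(transition_order(blocks))      # [('A', 'B'), ('B', 'C'), ('C', 'A')]
print(transition_positions(blocks))  # [0.25, 0.625, 0.75]
```

Two documents can share the same transition order while differing in transition position, which is why the two dimensions can be compared against success separately.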
The Anna Karenina Principle and Text Structure
At the heart of the investigation were four competing hypotheses about success and failure in text structure: the Anna Karenina Principle (AKP), its reverse, an ordered type, and a noisy type. The AKP suggests that success requires meeting a few specific conditions, while failure can manifest in many ways. Conversely, the reverse AKP posits that successful forms are diverse, while failures converge on a limited set of common patterns. An ordered type implies that both high and low-evaluation texts converge but follow different structural paths. Lastly, a noisy type indicates no clear convergence for either successful or unsuccessful texts, meaning structure doesn’t align with evaluation.
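The four hypotheses can be read as a two-by-two grid: does the high-evaluation group converge, and does the low-evaluation group converge? The sketch below encodes that reading. The dispersion measure (mean pairwise distance), the threshold, and all function names are assumptions chosen for illustration, not the statistical procedure used in the study.

```python
from itertools import combinations
from statistics import mean
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def mean_pairwise_distance(items: Sequence[T], dist: Callable[[T, T], float]) -> float:
    """Average distance over all unordered pairs; 0.0 if there are fewer than two items."""
    pairs = list(combinations(items, 2))
    if not pairs:
        return 0.0
    return mean(dist(a, b) for a, b in pairs)


def classify_pattern(high_dispersion: float, low_dispersion: float, threshold: float) -> str:
    """Map the dispersion of each group to one of the four hypothesized types.

    Assumption: a group "converges" when its dispersion is below `threshold`.
    Note: the "ordered" type also presumes the two groups follow different
    structural paths; this sketch does not verify that.
    """
    high_converges = high_dispersion < threshold
    low_converges = low_dispersion < threshold
    if high_converges and not low_converges:
        return "AKP"
    if low_converges and not high_converges:
        return "reverse AKP"
    if high_converges and low_converges:
        return "ordered"
    return "noisy"


# Hypothetical one-dimensional structural features for two groups of texts.
high = [0.50, 0.52, 0.48]  # tightly clustered
low = [0.10, 0.50, 0.90]   # widely scattered


def absolute_difference(a: float, b: float) -> float:
    return abs(a - b)


label = classify_pattern(
    mean_pairwise_distance(high, absolute_difference),
    mean_pairwise_distance(low, absolute_difference),
    threshold=0.1,
)
print(label)  # AKP
```

In practice the choice of distance and threshold would drive the result, so a real analysis would need a principled test rather than a fixed cutoff.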
Diverse Findings Across Media
The study’s findings reveal that no single structural principle governs all forms of communication; instead, patterns vary significantly by medium and the dimension being measured (order versus position).
- Online Novels: These texts displayed a “reverse AKP” pattern in their transition order. This means that lower-evaluation novels tended to follow repetitive, conventional structural sequences, while highly successful novels branched out into more diverse and varied structural trajectories. In terms of transition position, novels showed a weaker ordered pattern, with successful narratives placing key structural shifts at deliberate moments.
- Wikipedia Articles: Online encyclopedias exhibited a combination of AKP and ordered patterns. High-evaluation Wikipedia articles converged on a narrow, coherent structural pathway in their order of functional blocks, while lower-rated articles were more scattered. In terms of position, successful articles concentrated structural transitions near the middle of the text, distinguishing them from lower-evaluation articles with more dispersed placements.
- Academic Papers (arXiv): Research papers, particularly those in mathematics and physics, aligned with the “reverse AKP” in their transition order. Low-citation papers often converged on formulaic orderings, while highly cited papers diverged structurally. However, in terms of transition position, academic papers remained largely “noisy.” This lack of positional differentiation is attributed to the rigid IMRAD (Introduction, Methods, Results, and Discussion) format, which imposes strong constraints on where rhetorical shifts occur.
- Movies: Overall, movies were “noisy” in their transition order, meaning both high and low-grossing films scattered without forming stable clusters. However, when analyzed by genre, interesting patterns emerged: mystery films showed an AKP pattern, science fiction films displayed reverse AKP, and others remained noisy. Crucially, in terms of transition position, successful films tended towards an “ordered” pattern, concentrating decisive structural transitions later in the narrative, aligning with cognitive-narrative accounts of popular cinema.
These results underscore that structural alignment with evaluation depends on both the specific dimension of structure (order or position) and the medium itself. Information-oriented texts like Wikipedia and research papers often hinge on the sequence of information, while narrative-oriented texts such as novels and movies rely more on the timing of structural shifts. External templates, like the IMRAD format in academic papers, can significantly suppress structural variation and eliminate evaluative differentiation.
The study concludes that success in communication is not governed by a universal principle but by medium-specific structural constraints. This perspective offers a temporal reinterpretation of classical structural models, suggesting that the impact of textual elements depends less on their mere presence and more on when they appear relative to the reader’s or viewer’s cognitive timeline.


