Marco Remón, Guillermo
Gonzalo Arroyo, Julio Antonio
Fresno Fernández, Víctor Diego
2025-12-03
2025-12-03
2025-01-01
Guillermo Marco, Julio Gonzalo, and Víctor Fresno. 2025. The Reader is the Metric: How Textual Features and Reader Profiles Explain Conflicting Evaluations of AI Creative Writing. In Findings of the Association for Computational Linguistics: ACL 2025, pages 25432–25449, Vienna, Austria. Association for Computational Linguistics.
979-8-89176-256-5
https://doi.org/10.18653/v1/2025.findings-acl.1304
https://hdl.handle.net/20.500.14468/30996
The registered version of this conference paper, first published in "Findings of the Association for Computational Linguistics: ACL 2025, pages 25432–25449, Vienna, Austria", is available online at the publisher's website: Association for Computational Linguistics, https://doi.org/10.18653/v1/2025.findings-acl.1304
Recent studies comparing AI-generated and human-authored literary texts have produced conflicting results: some suggest AI already surpasses human quality, while others argue it still falls short. We start from the hypothesis that such divergences can be largely explained by genuine differences in how readers interpret and value literature, rather than by an intrinsic quality of the texts evaluated. Using five public datasets (1,471 stories, 101 annotators including critics, students, and lay readers), we (i) extract 17 reference-less textual features (e.g., coherence, emotional variance, average sentence length); (ii) model individual reader preferences, deriving feature importance vectors that reflect their textual priorities; and (iii) analyze these vectors in a shared “preference space”.
Reader vectors cluster into two profiles: _surface-focused readers_ (mainly non-experts), who prioritize readability and textual richness; and _holistic readers_ (mainly experts), who value thematic development, rhetorical variety, and sentiment dynamics. Our results quantitatively explain how measurements of literary quality are a function of how text features align with each reader’s preferences. These findings advocate for reader-sensitive evaluation frameworks in the field of creative text generation.
en
info:eu-repo/semantics/openAccess
5701.04 Computational linguistics
1203.04 Artificial intelligence
The Reader is the Metric: How Textual Features and Reader Profiles Explain Conflicting Evaluations of AI Creative Writing
conference proceedings
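
The three steps named in the abstract (feature extraction, per-reader preference modelling, and analysis in a shared preference space) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' method: the three synthetic features stand in for the 17 features of the paper, the importance vector is approximated by normalized absolute Pearson correlations between each feature and a reader's ratings, and the grouping uses a plain two-centroid k-means. The helper names `pearson`, `importance_vector`, `two_means`, and `simulate_reader` are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def importance_vector(features, ratings):
    """One reader's feature priorities as non-negative weights summing to 1.

    Assumption: importance is approximated by |correlation| between each
    feature column and the reader's ratings, not by the paper's own model.
    """
    n_features = len(features[0])
    raw = [abs(pearson([row[j] for row in features], ratings))
           for j in range(n_features)]
    total = sum(raw)
    if total == 0:
        return [1.0 / n_features] * n_features
    return [r / total for r in raw]


def two_means(vectors, iterations=20):
    """Group vectors into two clusters with a deterministic k-means (k=2)."""
    # Initialise with the first vector and the vector farthest from it.
    far = max(vectors, key=lambda v: math.dist(v, vectors[0]))
    centroids = [list(vectors[0]), list(far)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [min((0, 1), key=lambda k: math.dist(v, centroids[k]))
                  for v in vectors]
        for k in (0, 1):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:  # keep the old centroid if a cluster empties
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Synthetic data: 40 stories with 3 made-up features in [0, 1].
rng = random.Random(0)
stories = [[rng.random() for _ in range(3)] for _ in range(40)]


def simulate_reader(feature_index):
    """A hypothetical reader whose ratings track a single feature plus noise."""
    return [row[feature_index] + rng.gauss(0, 0.05) for row in stories]


# Three readers who follow feature 0, then three who follow feature 2.
ratings_by_reader = ([simulate_reader(0) for _ in range(3)]
                     + [simulate_reader(2) for _ in range(3)])
reader_vectors = [importance_vector(stories, r) for r in ratings_by_reader]
labels = two_means(reader_vectors)
print(labels)
```

In this toy setting, readers driven by the same feature end up with similar importance vectors and therefore fall into the same group, which mirrors, in a very reduced form, the idea of reader profiles emerging in a shared preference space.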