Publication:
Robust Estimation of Population-Level Effects in Repeated-Measures NLP Experimental Designs

dc.contributor.author: Benito Santos, Alejandro
dc.contributor.author: Ghajari Espinosa, Adrián
dc.contributor.author: Fresno Fernández, Víctor Diego
dc.coverage.spatial: Vienna
dc.coverage.temporal: 2025-07-27
dc.date.accessioned: 2025-12-03T14:35:50Z
dc.date.available: 2025-12-03T14:35:50Z
dc.date.issued: 2025-01-01
dc.description: The registered version of this conference paper, first published in "Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 33076–33089, Vienna, Austria", is available online at the publisher's website: Association for Computational Linguistics, https://doi.org/10.18653/v1/2025.acl-long.1586
dc.description.abstract: NLP research frequently grapples with multiple sources of variability—spanning runs, datasets, annotators, and more—yet conventional analysis methods often neglect these hierarchical structures, threatening the reproducibility of findings. To address this gap, we contribute a case study illustrating how linear mixed-effects models (LMMs) can rigorously capture systematic language-dependent differences (i.e., population-level effects) in a population of monolingual and multilingual language models. In the context of a bilingual hate speech detection task, we demonstrate that LMMs can uncover significant population-level effects—even under low-resource (small-N) experimental designs—while mitigating confounds and random noise. By setting out a transparent blueprint for repeated-measures experimentation, we encourage the NLP community to embrace variability as a feature, rather than a nuisance, in order to advance more robust, reproducible, and ultimately trustworthy results.
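The statistical approach summarized in the abstract can be sketched in a few lines of Python. The snippet below is a minimal, hypothetical illustration (not the authors' released code): it fits a linear mixed-effects model with language as the fixed, population-level effect and a random intercept per language model, so that repeated evaluations of the same model are not treated as independent observations. The score column (f1), the grouping columns (language, model), and the input file name are assumptions made only for illustration.

    # Minimal sketch: LMM with a population-level language effect and
    # per-model random intercepts. Column names and the CSV file are
    # hypothetical, chosen only to illustrate the modeling approach.
    import pandas as pd
    import statsmodels.formula.api as smf

    # Long-format results: one row per (model, language, run) evaluation score.
    scores = pd.read_csv("hate_speech_scores.csv")  # hypothetical file

    lmm = smf.mixedlm("f1 ~ language", data=scores, groups=scores["model"])
    result = lmm.fit(reml=True)  # REML: less biased variance estimates in small-N designs
    print(result.summary())      # fixed-effect estimate for language, its SE and p-value

In the fitted summary, the fixed-effect row for language is the population-level effect of interest; its standard error accounts for the model-level variability absorbed by the random intercepts.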
dc.description.version: published version
dc.identifier.citation: Alejandro Benito-Santos, Adrian Ghajari, and Víctor Fresno. 2025. Robust Estimation of Population-Level Effects in Repeated-Measures NLP Experimental Designs. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 33076–33089, Vienna, Austria. Association for Computational Linguistics.
dc.identifier.doi: https://doi.org/10.18653/v1/2025.acl-long.1586
dc.identifier.issn: 0736-587X
dc.identifier.uri: https://hdl.handle.net/20.500.14468/30995
dc.language.iso: en
dc.publisher: Association for Computational Linguistics
dc.relation.center: E.T.S. de Ingeniería Informática
dc.relation.congress: Annual Meeting of the Association for Computational Linguistics
dc.relation.department: Lenguajes y Sistemas Informáticos
dc.rights: info:eu-repo/semantics/openAccess
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/deed.es
dc.subject: 1203.23 Programming languages
dc.subject: 1203.07 Causal models
dc.title: Robust Estimation of Population-Level Effects in Repeated-Measures NLP Experimental Designs
dc.type: conference proceedings
dspace.entity.type: Publication
relation.isAuthorOfPublication: c2a07fe0-c0d7-4a21-bdb8-e7d547e5b78b
relation.isAuthorOfPublication: db5da577-2d78-45c3-9733-47368503a59c
relation.isAuthorOfPublication: 80cd3492-0ff8-4c8e-a904-2858623c7fc1
relation.isAuthorOfPublication.latestForDiscovery: c2a07fe0-c0d7-4a21-bdb8-e7d547e5b78b
Files
Original bundle
Showing 1 - 1 of 1
Name:
(Benito-Santos et , 2025) 2025.acl-long.1586_VICTOR DIEGO FRESNO.pdf
Size:
724.53 KB
Format:
Adobe Portable Document Format