Publication:
Is anisotropy really the cause of BERT embeddings not being semantic?

dc.contributor.author: Fuster Baggetto, Alejandro
dc.contributor.author: Fresno Fernández, Víctor Diego
dc.coverage.spatial: Abu Dhabi
dc.coverage.temporal: 2022-12-11
dc.date.accessioned: 2025-12-03T15:19:35Z
dc.date.available: 2025-12-03T15:19:35Z
dc.date.issued: 2022-01-01
dc.description: The registered version of this conference paper, first published in "Findings of the Association for Computational Linguistics: EMNLP 2022, pages 4271–4281, Abu Dhabi, United Arab Emirates", is available online at the publisher's website: Association for Computational Linguistics, https://doi.org/10.18653/v1/2022.findings-emnlp.314
dc.description.abstract: In this paper we conduct a set of experiments aimed to improve our understanding of the lack of semantic isometry in BERT, i.e. the lack of correspondence between the embedding and meaning spaces of its contextualized word representations. Our empirical results show that, contrary to popular belief, the anisotropy is not the root cause of the poor performance of these contextual models’ embeddings in semantic tasks. What does affect both the anisotropy and semantic isometry is a set of known biases: frequency, subword, punctuation, and case. For each one of them, we measure its magnitude and the effect of its removal, showing that these biases contribute but do not completely explain the phenomenon of anisotropy and lack of semantic isometry of these contextual language models.
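For readers unfamiliar with the anisotropy measure discussed in the abstract, the sketch below illustrates one common way to estimate it: the average cosine similarity between contextual embeddings of randomly paired tokens (in the spirit of Ethayarajh, 2019). This is not the paper's code; the model name (bert-base-uncased), the toy sentence sample, and the number of sampled pairs are assumptions chosen only for illustration.

# Illustrative sketch, not the authors' implementation: estimate anisotropy of
# BERT's contextual embeddings as the mean cosine similarity between embeddings
# of randomly paired tokens drawn from a small, hypothetical sentence sample.
import random
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

sentences = [
    "The bank approved the loan yesterday.",
    "She sat on the river bank and read.",
    "Contextual embeddings vary with the sentence.",
    "Anisotropy concentrates vectors in a narrow cone.",
]

# Collect last-layer embeddings of all non-special tokens.
token_vecs = []
with torch.no_grad():
    for s in sentences:
        enc = tokenizer(s, return_tensors="pt")
        hidden = model(**enc).last_hidden_state[0]  # (seq_len, hidden_dim)
        for i, tok_id in enumerate(enc["input_ids"][0]):
            if tok_id.item() not in tokenizer.all_special_ids:
                token_vecs.append(hidden[i])

# Anisotropy estimate: mean cosine similarity over random token pairs.
random.seed(0)
pairs = [random.sample(range(len(token_vecs)), 2) for _ in range(500)]
cos = torch.nn.functional.cosine_similarity
sims = [cos(token_vecs[i], token_vecs[j], dim=0).item() for i, j in pairs]
print(f"Estimated anisotropy (mean random-pair cosine): {sum(sims) / len(sims):.3f}")

A value close to 1 would indicate that unrelated tokens occupy a narrow cone of the embedding space; the paper's point is that this geometry alone does not explain the poor semantic behaviour of the embeddings.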
dc.description.version: published version
dc.identifier.citation: Alejandro Fuster Baggetto and Victor Fresno. 2022. Is anisotropy really the cause of BERT embeddings not being semantic?. In Findings of the Association for Computational Linguistics: EMNLP 2022, pages 4271–4281, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
dc.identifier.doi: https://doi.org/10.18653/v1/2022.findings-emnlp.314
dc.identifier.isbn: 978-1-959429-43-2
dc.identifier.uri: https://hdl.handle.net/20.500.14468/30998
dc.language.iso: en
dc.publisher: Association for Computational Linguistics
dc.relation.center: E.T.S. de Ingeniería Informática
dc.relation.congress: Conference on Empirical Methods in Natural Language Processing
dc.relation.department: Lenguajes y Sistemas Informáticos
dc.rights: info:eu-repo/semantics/openAccess
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/deed.es
dc.subject: 1203.04 Artificial intelligence
dc.title: Is anisotropy really the cause of BERT embeddings not being semantic?
dc.type: conference proceedings
dspace.entity.type: Publication
relation.isAuthorOfPublication: 80cd3492-0ff8-4c8e-a904-2858623c7fc1
relation.isAuthorOfPublication.latestForDiscovery: 80cd3492-0ff8-4c8e-a904-2858623c7fc1
Files
Original bundle
Name: (Fuster et al, 2022) 2022.findings-emnlp.314_VICTOR DIEGO FRESNO.pdf
Size: 553.92 KB
Format: Adobe Portable Document Format