Person: Ros Muñoz, Salvador
ORCID
0000-0001-6330-4958
Last name
Ros Muñoz
First name
Salvador
Search results
Showing 1 - 5 of 5
Publication
TEI-friendly annotation scheme for medieval named entities: a case on a Spanish medieval corpus (Springer Nature, 2021-02-27)
Álvarez Mellado, Elena; Díez-Platas, María Luisa; Ruiz-Fabo, Pablo; Bermúdez, Helena; Ros Muñoz, Salvador; González-Blanco, Elena
Medieval documents are a rich source of historical data. Performing named-entity recognition (NER) on this genre of texts can provide us with valuable historical evidence. However, traditional NER categories and schemes are usually designed with modern documents in mind (i.e. journalistic text) and the general-domain NER annotation schemes fail to capture the nature of medieval entities. In this paper we explore the challenges of performing named-entity annotation on a corpus of Spanish medieval documents: we discuss the mismatches that arise when applying traditional NER categories to a corpus of Spanish medieval documents and we propose a novel humanist-friendly TEI-compliant annotation scheme and guidelines intended to capture the particular nature of medieval entities.

Publication
Medieval Spanish (12th–15th centuries) named entity recognition and attribute annotation system based on contextual information (WILEY, 2021)
Díez Platas, Mª Luisa; Ros Muñoz, Salvador; González-Blanco, Elena; Ruiz Fabo, Pablo; Álvarez Mellado, Elena
The recognition of named entities in Spanish medieval texts presents great complexity, involving specific challenges: first, the complex morphosyntactic characteristics of proper-noun use in medieval texts; second, the lack of strict orthographic standards; finally, diachronic and geographical variations in Spanish from the 12th to the 15th century. In this period, named entities usually appear as complex text structures. For example, it was frequent to add nicknames and information about the person's role in society and geographic origin. To tackle this complexity, a named entity recognition and classification system has been implemented.
The system uses contextual cues based on semantics to detect entities and assign a type. Given the occurrence of entities with attached attributes, entity contexts are also parsed to determine entity-type-specific dependencies for these attributes. Moreover, it uses a variant generator to handle the diachronic evolution of Spanish medieval terms from a phonetic and morphosyntactic viewpoint. The tool iteratively enriches its own lexica, dictionaries, and gazetteers. The system was evaluated on a corpus of over 3,000 manually annotated entities of different types and periods, obtaining F1 scores between 0.74 and 0.87. Attribute annotation was evaluated for person and role name attributes with an overall F1 of 0.75.

Publication
Digital humanities in Spain: historical perspective and current scenario (Ediciones Profesionales de la Información, 2020-12-19)
Toscano, Murizio; Rabadán, Aroa; Ros Muñoz, Salvador; González-Blanco, Elena
The objective of this study was to provide the global community of interested scholars with an updated understanding of Digital Humanities in Spain, in terms of researchers and research centres, disciplines involved and research topics of interest, trends in digital resources development, main funding bodies and the evolution of their investment since the early nineties. One of the characteristics that differentiates this study from previous approaches is the information used to carry out the research. It combines large datasets of publicly available data from trusted sources with a handpicked selection of records grouping information scattered over the Web. Most of the evidence detected by other studies has been numerically confirmed. At the same time, the new metrics and values established constitute a reference base for monitoring the future evolution of the discipline and thus favour comparisons.
Half of the researchers were found to be affiliated to only nine institutions, whereas the other half of them were scattered across 84 locations. Department affiliation showed a varied pattern of the different degrees of specialization in each institution. Although the major historic role played by Philology was confirmed, the rising interest of other areas of the Humanities and Social Science produces a wider picture, which helped to identify five large clusters of research topics, centred on major disciplines. The quantitative analysis of funding, a dimension almost unexplored in the Humanities, proved to be a valuable way to assess the discipline and its historical evolution. In fact, it revealed interesting trends that led to our proposal of a three-phase periodization in the consolidation of Digital Humanities in Spain. The paper concludes with a set of recommendations regarding how to successfully deal with issues that can harm the future development of this research area and the role that Spanish researchers can play in the international context.

Publication
Automated metric analysis of Spanish Poetry: two complementary approaches (IEEE Xplore, 2021-03-30)
Marco Remón, Guillermo; De la Rosa, Javier; Gonzalo Arroyo, Julio Antonio; Ros Muñoz, Salvador; González-Blanco, Elena
The automatic metric analysis (commonly referred to as scansion) of Spanish poetry is not a trivial problem since it combines the nuances of the language, the different poetic traditions related to melodic patterns, and the personal stylistic preferences and intentions of the author. In this paper, we explore two alternative algorithmic approaches tailored to different application scenarios. The first approach, Rantanplan, is a rule-based method that consists of four Natural Language Processing modules that work together to perform scansion and other related analyses: Part of Speech tagging, syllabification, stress assignment, and metrical adjustment.
The second approach, Jumper, explores the possibility of performing scansion without syllabification, with a twofold purpose: to minimize the errors propagated in different parts of the linguistic processing pipeline (including the syllabification step), and to improve the efficiency of the process. Both systems outperform the state of the art and provide either a more informative solution (suitable, for instance, for teaching purposes) or more efficient processing (when a correct scansion is all the linguistic knowledge required, as in scholarly philological studies). The combined use of both systems turns out to provide a practical tool to clean up manual annotation errors in corpora.

Publication
Exploring Spanish contemporary song lyrics through digital humanities methods: some thematic and structural properties (Oxford Academic, 2021-11-08)
Hernández Lorenzo, Laura; Díaz Paredes, Aitor; Pérez Pozo, Álvaro; Ros Muñoz, Salvador; González-Blanco, Elena
In this article, we present a quantitative study with Digital Humanities methods on an extensive corpus of Spanish contemporary song lyrics, a type of text related to poetry. On the one hand, poetry and songs have not only been connected since their origins but also share some characteristics, such as the division into lines or the use of rhymes. On the other hand, Digital Humanities quantitative approaches have already been applied to poetry, but we still lack a study in the same fashion for lyrics. Taking advantage of the advances in automatic scansion and syllabification, rhyme detection, or Topic Modeling technologies, the present study analyzes Spanish contemporary song lyrics’ main thematic and structural properties, comparing them with those used in poetic texts. Our results offered new insights into the characteristics of the analyzed texts and their connections to poetic ones.