Publication:
Patterns of differential expression by association in omic data using a new measure based on ensemble learning

Date
2023-11-23
Access rights
info:eu-repo/semantics/openAccess
Publisher
De Gruyter
Abstract
The ongoing development of high-throughput technologies allows the expression levels of hundreds or thousands of biological inputs to be monitored simultaneously, which has led to the proliferation of so-called omic data sources. A relevant issue when analyzing such data is the detection of differential expression across two experimental conditions, clinical statuses, or classes of a biological outcome. While many univariate data analysis approaches have been developed to address this issue, strategies for assessing interaction patterns of differential expression are scarce in the literature and have been limited to ad hoc solutions. This paper addresses the problem by using an ensemble learning algorithm, random forests, to propose a measure that assesses the differential expression explained by the interaction of the omic variables, so that subtle biological patterns may be uncovered. The new measure is obtained as a by-product of the out-of-bag error rate, which estimates the prediction error of a random forests classifier. Its performance is studied in synthetic scenarios, and it is also applied to real studies on SARS-CoV-2 and colon cancer data, where it uncovers associations that remain undetected by other methods. Our proposal aims to provide a novel approach that may help experts in biomedical and life sciences to unravel insightful interaction patterns and decipher the molecular mechanisms underlying biological and clinical outcomes.
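The abstract does not define the proposed measure, so the following is only a minimal sketch of the underlying idea, not the authors' method. It assumes a plain bagged ensemble of fully grown trees as a simplified stand-in for random forests (no per-split feature subsampling), a hypothetical synthetic XOR-like data set, and hypothetical helper names (`best_split`, `grow`, `predict`, `oob_error`). It compares the out-of-bag error obtained from each variable alone with the error obtained from the pair, a case in which the class difference is carried only by the interaction.

```python
import random

def gini(pos, n):
    """Gini impurity of a node holding `pos` positives among `n` samples."""
    p = pos / n
    return 2.0 * p * (1.0 - p)

def best_split(rows, features):
    """Return the (feature, threshold) with the lowest weighted Gini, or None."""
    n = len(rows)
    total_pos = sum(label for _, label in rows)
    best, best_score = None, float("inf")
    for f in features:
        ordered = sorted(rows, key=lambda r: r[0][f])
        left_pos = 0
        for i in range(1, n):
            left_pos += ordered[i - 1][1]
            lo, hi = ordered[i - 1][0][f], ordered[i][0][f]
            if lo == hi:  # no cut possible between identical values
                continue
            score = (i * gini(left_pos, i)
                     + (n - i) * gini(total_pos - left_pos, n - i)) / n
            if score < best_score:
                best, best_score = (f, lo), score  # rule: x[f] <= lo goes left
    return best

def grow(rows, features):
    """Grow a tree until nodes are pure; a leaf is the majority class (0 or 1)."""
    pos = sum(label for _, label in rows)
    majority = 1 if 2 * pos >= len(rows) else 0
    if pos == 0 or pos == len(rows):
        return majority
    split = best_split(rows, features)
    if split is None:
        return majority
    f, t = split
    left = [r for r in rows if r[0][f] <= t]
    right = [r for r in rows if r[0][f] > t]
    return (f, t, grow(left, features), grow(right, features))

def predict(tree, x):
    """Walk from the root to a leaf and return its class."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if x[f] <= t else right
    return tree

def oob_error(X, y, features, n_trees=30, seed=0):
    """Out-of-bag error: each sample is scored only by trees that never saw it."""
    rng = random.Random(seed)
    n = len(X)
    votes = [[0, 0] for _ in range(n)]
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        in_bag = set(idx)
        tree = grow([(X[i], y[i]) for i in idx], features)
        for i in range(n):
            if i not in in_bag:
                votes[i][predict(tree, X[i])] += 1
    scored = [(v, y[i]) for i, v in enumerate(votes) if v[0] + v[1] > 0]
    wrong = sum(1 for v, label in scored if (1 if v[1] > v[0] else 0) != label)
    return wrong / len(scored)

# Hypothetical synthetic scenario: the class depends only on the sign of x1 * x2,
# so neither variable is differentially expressed on its own.
data_rng = random.Random(1)
X = [(data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)) for _ in range(200)]
y = [1 if a * b > 0 else 0 for a, b in X]

err_x1 = oob_error(X, y, [0])        # variable 1 alone
err_x2 = oob_error(X, y, [1])        # variable 2 alone
err_joint = oob_error(X, y, [0, 1])  # the pair, interaction included
print(err_x1, err_x2, err_joint)
```

In this sketch the single-variable errors stay near chance while the joint error drops well below it. A gap of that kind is what an interaction-oriented measure based on the out-of-bag error could exploit. How the paper turns such errors into its measure is not stated in the abstract.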
Description
The registered version of this article, first published in "Statistical Applications in Genetics and Molecular Biology, vol. 22, 2023", is available online at the publisher's website: De Gruyter, https://doi.org/10.1515/sagmb-2023-0009
Keywords
omic data, differential expression, association patterns, ensemble learning, random forests, out of bag error rate
Citation
Jorge M Arevalillo, Raquel Martín-Arevalillo (2023). Patterns of differential expression by association in omic data using a new measure based on ensemble learning. Statistical Applications in Genetics and Molecular Biology. 22 (1): 20230009. https://doi.org/10.1515/sagmb-2023-0009
Center
Facultad de Ciencias
Department
Estadística, Investigación Operativa y Cálculo Numérico