ALBERT — All Library Books, journals and Electronic Records Telegrafenberg

Independent Component Analysis for Unraveling the Complexity of Cancer Omics Datasets (2019)

Sompairac, Nicolas ; Nazarov, Petr V. ; Czerwinska, Urszula ; [et al.]

Molecular Diversity Preservation International

In: International Journal of Molecular Sciences. 2019; 20(18): 4414. Published 2019 Sep 07. doi: 10.3390/ijms20184414.

Publication Date: 2019-09-07

Description: Independent component analysis (ICA) is a matrix factorization approach where the signals captured by each individual matrix factor are optimized to become as mutually independent as possible. Initially suggested for solving blind source separation problems in various fields, ICA was shown to be successful in analyzing functional magnetic resonance imaging (fMRI) and other types of biomedical data. In the last twenty years, ICA has become a part of the standard machine learning toolbox, together with other matrix factorization methods such as principal component analysis (PCA) and non-negative matrix factorization (NMF). Here, we review a number of recent works where ICA was shown to be a useful tool for unraveling the complexity of cancer biology from the analysis of different types of omics data, mainly collected for tumoral samples. Such works highlight the use of ICA in dimensionality reduction, deconvolution, data pre-processing, meta-analysis, and other tasks, applied to different data types (transcriptome, methylome, proteome, single-cell data). We particularly focus on the technical aspects of ICA application in omics studies, such as using different protocols, determining the optimal number of components, assessing and improving the reproducibility of ICA results, and comparing ICA with other popular matrix factorization techniques. We discuss the emerging ICA applications to the integrative analysis of multi-level omics datasets and introduce a conceptual view on ICA as a tool for defining functional subsystems of a complex biological system and their interactions under various conditions. Our review is accompanied by a Jupyter notebook which illustrates the discussed concepts and provides a practical tool for applying ICA to the analysis of cancer omics datasets.
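
As a minimal illustration of the matrix factorization view described in this abstract, the following Python sketch applies scikit-learn's FastICA to synthetic data. It is not the Jupyter notebook that accompanies the review: the helper name `ica_factorize`, the toy source signals and the mixing matrix are all assumptions made for illustration only.

```python
import numpy as np
from sklearn.decomposition import FastICA


def ica_factorize(X, n_components, seed=0):
    """Factorize X (samples x features) as X ~ S @ A.T + mean.

    S holds the estimated independent signals (samples x components),
    A is the estimated mixing matrix (features x components).
    Hypothetical helper, not taken from the reviewed paper.
    """
    ica = FastICA(n_components=n_components, random_state=seed, max_iter=1000)
    S = ica.fit_transform(X)   # estimated independent components
    A = ica.mixing_            # estimated mixing matrix
    return S, A, ica.mean_


# Synthetic example (assumed data): three non-Gaussian sources, linearly mixed.
rng = np.random.default_rng(0)
n = 2000
t = np.linspace(0, 8, n)
sources = np.column_stack([
    np.sin(2 * t),             # smooth periodic signal
    np.sign(np.sin(3 * t)),    # square wave
    rng.laplace(size=n),       # heavy-tailed noise
])
mixing = np.array([[1.0, 1.0, 1.0],
                   [0.5, 2.0, 1.0],
                   [1.5, 1.0, 2.0]])
X = sources @ mixing.T         # observed mixtures (samples x features)

S, A, mean = ica_factorize(X, n_components=3)

# With as many components as features, the factorization reconstructs X.
X_rec = S @ A.T + mean
print("max reconstruction error:", np.abs(X_rec - X).max())
```

The recovered components are defined only up to order, sign and scale, which is one reason the abstract stresses assessing the reproducibility of ICA results.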

Print ISSN: 1661-6596

Electronic ISSN: 1422-0067

Topics: Chemistry and Pharmacology

Published by Molecular Diversity Preservation International

Molecular Inverse Comorbidity between Alzheimer’s Disease and Lung Cancer: New Insights from Matrix Factorization (2019)

Greco, Alessandro ; Sanchez Valle, Jon ; Pancaldi, Vera ; [et al.]

Molecular Diversity Preservation International

In: International Journal of Molecular Sciences. 2019; 20(13): 3114. Published 2019 Jun 26. doi: 10.3390/ijms20133114.

Publication Date: 2019-06-26

Description: Matrix factorization (MF) is an established paradigm for large-scale biological data analysis with tremendous potential in computational biology. Here, we challenge MF in depicting the molecular bases of epidemiologically described disease–disease (DD) relationships. As a use case, we focus on the inverse comorbidity association between Alzheimer’s disease (AD) and lung cancer (LC), described as a lower-than-expected probability of developing LC in AD patients. To this day, the molecular mechanisms underlying DD relationships remain poorly explained, and their better characterization might offer unprecedented clinical opportunities. To this end, we extend our previously designed MF-based framework for the molecular characterization of DD relationships. Considering AD–LC inverse comorbidity as a case study, we highlight multiple molecular mechanisms, among which we confirm the involvement of processes related to the immune system and mitochondrial metabolism. We then distinguish mechanisms specific to LC from those shared with other cancers through a pan-cancer analysis. Additionally, new candidate molecular players, such as estrogen receptor (ER), cadherin 1 (CDH1) and histone deacetylase (HDAC), are pinpointed as factors that might underlie the inverse relationship, opening the way to new investigations. Finally, some lung cancer subtype-specific factors are detected, suggesting the existence of heterogeneity across patients in the context of inverse comorbidity.

Print ISSN: 1661-6596

Electronic ISSN: 1422-0067

Topics: Chemistry and Pharmacology

Published by Molecular Diversity Preservation International

Minimum Spanning vs. Principal Trees for Structured Approximations of Multi-Dimensional Datasets (2020)

Chervov, Alexander ; Bac, Jonathan ; Zinovyev, Andrei

Molecular Diversity Preservation International

In: Entropy. 2020; 22(11): 1274. Published 2020 Nov 11. doi: 10.3390/e22111274.

Publication Date: 2020-11-11

Description: Construction of graph-based approximations for multi-dimensional data point clouds is widely used in a variety of areas. Notable examples of applications of such approximators are cellular trajectory inference in single-cell data analysis, analysis of clinical trajectories from synchronic datasets, and skeletonization of images. Several methods have been proposed to construct such approximating graphs, with some based on computation of minimum spanning trees and some based on principal graphs generalizing principal curves. In this article we propose a methodology to compare and benchmark these two graph-based data approximation approaches, as well as to define their hyperparameters. The main idea is to avoid comparing graphs directly: first, a clustering of the data point cloud is induced from each graph approximation, and then well-established methods are used to compare and score the data cloud partitionings induced by the graphs. In particular, mutual information-based approaches prove to be useful in this context. The induced clustering is based on decomposing a graph into non-branching segments, and then clustering the data point cloud by the nearest segment. Such a method allows efficient comparison of graph-based data approximations of arbitrary topology and complexity. The method is implemented in Python using the standard scikit-learn library, which provides high speed and efficiency. As a demonstration of the methodology we analyse and compare graph-based data approximation methods using synthetic as well as real-life single-cell datasets.
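
The comparison idea summarized in this abstract (decompose each graph into non-branching segments, cluster points by nearest segment, then score the two partitions with a mutual information measure) can be sketched roughly as follows. This is not the authors' implementation: the function names, the toy Y-shaped graphs, the edge-projection distance and the choice of adjusted mutual information are assumptions, and the decomposition below assumes a tree with at least one leaf or branch node.

```python
from collections import defaultdict

import numpy as np
from sklearn.metrics import adjusted_mutual_info_score


def non_branching_segments(edges):
    """Split a tree into paths that run between nodes of degree != 2."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    ends = {n for n, nb in adj.items() if len(nb) != 2}  # leaves and branch nodes
    segments, seen = [], set()
    for start in sorted(ends):
        for nxt in adj[start]:
            if frozenset((start, nxt)) in seen:
                continue  # this segment was already walked from its other end
            seen.add(frozenset((start, nxt)))
            path, prev, cur = [start, nxt], start, nxt
            while cur not in ends:  # follow the chain of degree-2 nodes
                step = adj[cur][0] if adj[cur][0] != prev else adj[cur][1]
                seen.add(frozenset((cur, step)))
                path.append(step)
                prev, cur = cur, step
            segments.append(path)
    return segments


def label_by_nearest_segment(points, node_pos, segments):
    """Label each point with the index of the segment whose edges lie closest."""
    best = np.full(len(points), np.inf)
    labels = np.zeros(len(points), dtype=int)
    for seg_id, path in enumerate(segments):
        for u, v in zip(path[:-1], path[1:]):
            a, b = node_pos[u], node_pos[v]
            ab = b - a
            # Project points onto the edge and clamp to its end points.
            t = np.clip((points - a) @ ab / (ab @ ab), 0.0, 1.0)
            d = np.linalg.norm(points - (a + t[:, None] * ab), axis=1)
            closer = d < best
            best[closer] = d[closer]
            labels[closer] = seg_id
    return labels


# Toy data (assumed): noisy points along the three arms of a Y shape.
rng = np.random.default_rng(0)
tips = np.array([[-2.0, 0.0], [2.0, 2.0], [2.0, -2.0]])
points = np.vstack([
    rng.uniform(0.2, 1.0, size=(100, 1)) * tip + rng.normal(scale=0.05, size=(100, 2))
    for tip in tips
])

# Two graph approximations of the same cloud: a fine and a coarse Y-shaped tree.
fine_pos = np.array([[-2, 0], [-1, 0], [0, 0], [1, 1], [2, 2], [1, -1], [2, -2]], dtype=float)
fine_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (2, 5), (5, 6)]
coarse_pos = np.array([[-2, 0], [0, 0], [2, 2], [2, -2]], dtype=float)
coarse_edges = [(0, 1), (1, 2), (1, 3)]

fine_labels = label_by_nearest_segment(points, fine_pos, non_branching_segments(fine_edges))
coarse_labels = label_by_nearest_segment(points, coarse_pos, non_branching_segments(coarse_edges))

# Score the agreement of the two induced partitions.
ami = adjusted_mutual_info_score(fine_labels, coarse_labels)
print("adjusted mutual information:", ami)
```

Because the score is computed on the induced partitions rather than on the graphs themselves, the two graphs may have different numbers of nodes and different node labels.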

Electronic ISSN: 1099-4300

Topics: Chemistry and Pharmacology, Physics

Published by Molecular Diversity Preservation International

Robust and Scalable Learning of Complex Intrinsic Dataset Geometry via ElPiGraph (2020)

Albergante, Luca ; Mirkes, Evgeny ; Bac, Jonathan ; [et al.]

Molecular Diversity Preservation International

In: Entropy. 2020; 22(3): 296. Published 2020 Mar 04. doi: 10.3390/e22030296.

Publication Date: 2020-03-04

Description: Multidimensional data point clouds representing large datasets are frequently characterized by non-trivial low-dimensional geometry and topology which can be recovered by unsupervised machine learning approaches, in particular by principal graphs. Principal graphs approximate the multivariate data by a graph injected into the data space with some constraints imposed on the node mapping. Here we present ElPiGraph, a scalable and robust method for constructing principal graphs. ElPiGraph exploits and further develops the concept of elastic energy, the topological graph grammar approach, and a gradient descent-like optimization of the graph topology. The method is able to withstand high levels of noise and is capable of approximating data point clouds via principal graph ensembles. This strategy can be used to estimate the statistical significance of complex data features and to summarize them into a single consensus principal graph. ElPiGraph deals efficiently with large datasets in various fields such as biology, where it can be used, for example, with single-cell transcriptomic or epigenomic datasets to infer gene expression dynamics and recover differentiation landscapes.

Electronic ISSN: 1099-4300

Topics: Chemistry and Pharmacology, Physics

Published by Molecular Diversity Preservation International
