ALBERT — All Library Books, journals and Electronic Records Telegrafenberg

1

The Power of Word-Frequency Based Alignment-Free Functions: a Comprehensive Large-Scale Experimental Analysis (2021)

Cattaneo, Giuseppe ; Ferraro Petrillo, Umberto ; Giancarlo, Raffaele ; [et al.]

Oxford University Press

In: Bioinformatics. 2021; Published 2021 Oct 30. doi: 10.1093/bioinformatics/btab747. [early online release]

Publication Date: 2021-10-30

Description: Motivation: Alignment-free (AF) distance/similarity functions are a key tool for sequence analysis. Experimental studies on real datasets abound and, to some extent, there are also studies regarding their control of false positive rate (Type I error). However, assessment of their power, i.e., their ability to identify true similarity, has been limited to some members of the D2 family. The corresponding experimental studies have concentrated on short sequences, a scenario no longer adequate for current applications, where sequence lengths may vary considerably. Such a state of the art is methodologically problematic, since information regarding a key feature such as power is either missing or limited. Results: By concentrating on a representative set of word-frequency based AF functions, we perform the first coherent and uniform evaluation of the power, involving also Type I error for completeness. Two alternative models of important genomic features (CIS Regulatory Modules and Horizontal Gene Transfer), a wide range of sequence lengths from a few thousand to millions, and different values of k have been used. As a result, we provide a characterization of those AF functions that is novel and informative. Indeed, we identify weak and strong points of each function considered, which may be used as a guide to choose one for analysis tasks. Remarkably, of the fifteen functions that we have considered, only four stand out, with small differences between long and short sequence length scenarios. Finally, in order to encourage the use of our methodology for validation of future AF functions, the Big Data platform supporting it is public. Availability: The software is available at https://github.com/pipp8/power_statistics Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

2

CoCoPRED: coiled-coil protein structural feature prediction from amino acid sequence using deep neural networks (2021)

Feng, Shi-Hao ; Xia, Chun-Qiu ; Shen, Hong-Bin

Oxford University Press

In: Bioinformatics. 2021; Published 2021 Oct 30. doi: 10.1093/bioinformatics/btab744. [early online release]

Publication Date: 2021-10-30

Description: Motivation: Coiled-coil is composed of two or more helices that are wound around each other. It widely exists in proteins and has been discovered to play a variety of critical roles in biological processes. Generally, there are three types of structural features in coiled-coil: coiled-coil domain (CCD), oligomeric state, and register. However, most of the existing computational tools only focus on one of them. Results: Here, we describe a new deep learning model, CoCoPRED, which is based on convolutional layers, bidirectional long short-term memory, and attention mechanism. It has three networks, i.e., CCD network, oligomeric state network, and register network, corresponding to the three types of structural features in coiled-coil. This means CoCoPRED can perform comprehensive prediction for coiled-coil proteins. Through the 5-fold cross-validation experiment, we demonstrate that CoCoPRED can achieve better performance than the state-of-the-art models on both CCD prediction and oligomeric state prediction. Further analysis suggests the CCD prediction may be a performance indicator of the oligomeric state prediction in CoCoPRED. The attention heads in CoCoPRED indicate that registers a, b, and e are more crucial for the oligomeric state prediction. Availability: CoCoPRED is available at http://www.csbio.sjtu.edu.cn/bioinf/CoCoPRED. Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

3

AOP-helpFinder webserver: a tool for comprehensive analysis of the literature to support adverse outcome pathways development (2021)

Jornod, Florence ; Jaylet, Thomas ; Blaha, Ludek ; [et al.]

Oxford University Press

In: Bioinformatics. 2021; Published 2021 Oct 30. doi: 10.1093/bioinformatics/btab750. [early online release]

Publication Date: 2021-10-30

Description: Motivation: Adverse Outcome Pathways (AOPs) are a conceptual framework developed to support the use of alternative toxicology approaches in risk assessment. AOPs are structured linear organizations of existing knowledge illustrating causal pathways from the initial molecular perturbation triggered by various stressors, through key events (KEs) at different levels of biology, to the ultimate health or ecotoxicological adverse outcome. Results: Artificial intelligence can be used to systematically explore available toxicological data that can be parsed in the scientific literature. Recently, a tool called AOP-helpFinder was developed to identify associations between stressors and KEs, thus supporting the documentation of AOPs. To facilitate the use of this advanced bioinformatics tool by the scientific and regulatory communities, a webserver was created. The proposed AOP-helpFinder webserver uses a better-performing version of the tool, which reduces the need for manual curation of the obtained results. As an example, the server was successfully applied to explore relationships of a set of endocrine disruptors with metabolic-related events. The AOP-helpFinder webserver assists in a rapid evaluation of existing knowledge stored in the PubMed database, a global resource of scientific information, to build AOPs and Adverse Outcome Networks (AONs) supporting chemical risk assessment. Availability and implementation: AOP-helpFinder is available at http://aop-helpfinder.u-paris-sciences.fr/index.php

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

4

Importance-Penalized Joint Graphical Lasso (IPJGL): differential network inference via GGMs (2021)

Leng, Jiacheng ; Wu, Ling-Yun

Oxford University Press

In: Bioinformatics. 2021; Published 2021 Oct 30. doi: 10.1093/bioinformatics/btab751. [early online release]

Publication Date: 2021-10-30

Description: Motivation: Differential network inference is a fundamental and challenging problem to reveal gene interactions and regulation relationships under different conditions. Many algorithms have been developed for this problem; however, they do not consider the differences between the importance of genes, which may not fit the real-world situation. Different genes have different mutation probabilities, and the vital genes associated with basic life activities have less fault tolerance to mutation. Treating all genes equally may bias the results of differential network inference. Thus, it is necessary to consider the importance of genes in the models of differential network inference. Results: Based on the Gaussian graphical model with adaptive gene importance regularization, we develop a novel importance-penalized joint graphical Lasso method, IPJGL, for differential network inference. The presented method is validated by simulation experiments as well as real datasets. Furthermore, to precisely evaluate the results of differential network inference, we propose a new metric named APC2 for the differential levels of gene pairs. We apply IPJGL to analyze the TCGA colorectal and breast cancer datasets and find some candidate cancer genes with significant survival analysis results, including SOST for colorectal cancer and RBBP8 for breast cancer. We also conduct further analysis based on the interactions in the Reactome database and confirm the utility of our method. Availability: R source code of importance-penalized joint graphical lasso is freely available at https://github.com/Wu-Lab/IPJGL. Supplementary information: Supplementary materials are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

5

LocalNgsRelate: a software tool for inferring IBD sharing along the genome between pairs of individuals from low-depth NGS data (2021)

Severson, Alissa L ; Korneliussen, Thorfinn Sand ; Moltke, Ida

Oxford University Press

In: Bioinformatics. 2021; Published 2021 Oct 28. doi: 10.1093/bioinformatics/btab732. [early online release]

Publication Date: 2021-10-28

Description: Motivation: Inference of identity-by-descent (IBD) sharing along the genome between pairs of individuals has important uses. However, all existing inference methods are based on genotypes, which is not ideal for low-depth Next Generation Sequencing (NGS) data, from which genotypes can only be called with high uncertainty. Results: We present a new probabilistic software tool, LocalNgsRelate, for inferring IBD sharing along the genome between pairs of individuals from low-depth NGS data. Its inference is based on genotype likelihoods instead of genotypes, and thereby it takes the uncertainty of the genotype calling into account. Using real data from the 1000 Genomes project, we show that LocalNgsRelate provides more accurate IBD inference for low-depth NGS data than two state-of-the-art genotype-based methods, Albrechtsen et al. (2009) and hap-IBD. We also show that the method works well for NGS data down to a depth of 2X. Availability: LocalNgsRelate is freely available at https://github.com/idamoltke/LocalNgsRelate Supplementary data: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

6

DeepMP: a deep learning tool to detect DNA base modifications on Nanopore sequencing data (2021)

Bonet, Jose ; Chen, Mandi ; Dabad, Marc ; [et al.]

Oxford University Press

In: Bioinformatics. 2021; Published 2021 Oct 28. doi: 10.1093/bioinformatics/btab745. [early online release]

Publication Date: 2021-10-28

Description: Motivation: DNA methylation plays a key role in a variety of biological processes. Recently, Nanopore long-read sequencing has enabled direct detection of these modifications. As a consequence, a range of computational methods have been developed to exploit Nanopore data for methylation detection. However, current approaches rely on a human-defined threshold to detect the methylation status of a genomic position and are not optimized to detect sites methylated at low frequency. Furthermore, most methods employ either the Nanopore signals or the basecalling errors as the model input and do not take advantage of their combination. Results: Here we present DeepMP, a convolutional neural network (CNN)-based model that takes information from Nanopore signals and basecalling errors to detect whether a given motif in a read is methylated or not. In addition, DeepMP introduces a threshold-free position modification calling model sensitive to sites methylated at low frequency across cells. We comprehensively benchmarked DeepMP against state-of-the-art methods on E. coli, human and pUC19 datasets. DeepMP outperforms current approaches at read-based and position-based methylation detection across sites methylated at different frequencies in the three datasets. Availability: DeepMP is implemented and freely available under MIT license at https://github.com/pepebonet/DeepMP Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

7

SigTools: Exploratory Visualization For Genomic Signals (2021)

Masoumi, Shohre ; Libbrecht, Maxwell W ; Wiese, Kay C

Oxford University Press

In: Bioinformatics. 2021; Published 2021 Oct 28. doi: 10.1093/bioinformatics/btab742. [early online release]

Publication Date: 2021-10-28

Description: Motivation: With the advancement of sequencing technologies, genomic data sets are constantly being expanded by high volumes of different data types. One recently introduced data type in genomic science is genomic signals, which are usually short-read coverage measurements over the genome. To understand and evaluate the results of such studies, one needs to understand and analyze the characteristics of the input data. Results: SigTools is an R-based genomic signals visualization package developed with two objectives: (1) to facilitate genomic signals exploration in order to uncover insights for later model training, refinement, and development by including distribution and autocorrelation plots; and (2) to enable genomic signals interpretation by including correlation and aggregation plots. In addition, our corresponding web application, SigTools-Shiny, extends the accessibility scope of these modules to people who are more comfortable working with graphical user interfaces instead of command-line tools. Availability: SigTools source code, installation guide, and manual are freely available at http://github.com/shohre73.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press

8

CpG Transformer for imputation of single-cell methylomes (2021)

De Waele, Gaetan ; Clauwaert, Jim ; Menschaert, Gerben ; [et al.]

Oxford University Press

In: Bioinformatics. 2021; Published 2021 Oct 28. doi: 10.1093/bioinformatics/btab746. [early online release]

Publication Date: 2021-10-28

Description: Motivation: The adoption of current single-cell DNA methylation sequencing protocols is hindered by incomplete coverage, outlining the need for effective imputation techniques. The task of imputing single-cell (methylation) data requires models to build an understanding of underlying biological processes. Results: We adapt the transformer neural network architecture to operate on methylation matrices through combining axial attention with sliding window self-attention. The obtained CpG Transformer displays state-of-the-art performance on a wide range of scBS-seq and scRRBS-seq datasets. Furthermore, we demonstrate the interpretability of CpG Transformer and illustrate its rapid transfer learning properties, allowing practitioners to train models on new datasets with a limited computational and time budget. Availability and implementation: CpG Transformer is freely available at https://github.com/gdewael/cpg-transformer. Supplementary information: Supplementary data are available at Bioinformatics online.

Print ISSN: 1367-4803

Electronic ISSN: 1460-2059

Topics: Biology , Computer Science , Medicine

Published by Oxford University Press
