ALBERT — All Library Books, journals and Electronic Records Telegrafenberg

1

Unknown

Analytics and visualization tools to characterize single-cell stochasticity using bacterial single-cell movie cytometry data (2021)

Balomenos, Athanasios D. ; Stefanou, Victoria ; Manolakos, Elias S.

BioMed Central

In: BMC Bioinformatics. 2021; 22(1): 531. Published 2021 Oct 29. doi: 10.1186/s12859-021-04409-9.

add to mindlist on the mindlist

Details

Publication Date: 2021-10-29

Description: Background Time-lapse microscopy live-cell imaging is essential for studying the evolution of bacterial communities at single-cell resolution. It allows capturing detailed information about the morphology, gene expression, and spatial characteristics of individual cells at every time instance of the imaging experiment. The image analysis of bacterial "single-cell movies" (videos) generates big data in the form of multidimensional time series of measured bacterial attributes. If properly analyzed, these datasets can help us decipher the bacterial communities' growth dynamics and identify the sources and potential functional role of intra- and inter-subpopulation heterogeneity. Recent research has highlighted the importance of investigating the role of biological "noise" in gene regulation, cell growth, cell division, etc. Single-cell analytics of complex single-cell movie datasets, capturing the interaction of multiple micro-colonies with thousands of cells, can shed light on essential phenomena for human health, such as the competition of pathogens and benign microbiome cells, the emergence of dormant cells (“persisters”), the formation of biofilms under different stress conditions, etc. However, highly accurate and automated bacterial bioimage analysis and single-cell analytics methods remain elusive, even though they are required before we can routinely exploit the plethora of data that single-cell movies generate. Results We present visualization and single-cell analytics using R (ViSCAR), a set of methods and corresponding functions, to visually explore and correlate single-cell attributes generated from the image processing of complex bacterial single-cell movies. They can be used to model and visualize the spatiotemporal evolution of attributes at different levels of the microbial community organization (i.e., cell population, colony, generation, etc.), to discover possible epigenetic information transfer across cell generations, infer mathematical and statistical models describing various stochastic phenomena (e.g., cell growth, cell division), and even identify and auto-correct errors introduced unavoidably during the bioimage analysis of a dense movie with thousands of overcrowded cells in the microscope's field of view. Conclusions ViSCAR empowers researchers to capture and characterize the stochasticity, uncover the mechanisms leading to cellular phenotypes of interest, and decipher a large heterogeneous microbial communities' dynamic behavior. ViSCAR source code is available from GitLab at https://gitlab.com/ManolakosLab/viscar.

Electronic ISSN: 1471-2105

Topics: Biology , Computer Science

Published by BioMed Central

Permalink

	Location	Call Number	Expected	Availability

Others were also interested in ...

PAPER CURRENT

S·F·X

Fulltext

2

Unknown

Filling gaps of genome scaffolds via probabilistic searching optical maps against assembly graph (2021)

Huang, Bin ; Wei, Guozheng ; Wang, Bing ; [et al.]

BioMed Central

In: BMC Bioinformatics. 2021; 22(1): 533. Published 2021 Oct 30. doi: 10.1186/s12859-021-04448-2.

add to mindlist on the mindlist

Details

Publication Date: 2021-10-30

Description: Background Optical maps record locations of specific enzyme recognition sites within long genome fragments. This long-distance information enables aligning genome assembly contigs onto optical maps and ordering contigs into scaffolds. The generated scaffolds, however, often contain a large amount of gaps. To fill these gaps, a feasible way is to search genome assembly graph for the best-matching contig paths that connect boundary contigs of gaps. The combination of searching and evaluation procedures might be “searching followed by evaluation”, which is infeasible for long gaps, or “searching by evaluation”, which heavily relies on heuristics and thus usually yields unreliable contig paths. Results We here report an accurate and efficient approach to filling gaps of genome scaffolds with aids of optical maps. Using simulated data from 12 species and real data from 3 species, we demonstrate the successful application of our approach in gap filling with improved accuracy and completeness of genome scaffolds. Conclusion Our approach applies a sequential Bayesian updating technique to measure the similarity between optical maps and candidate contig paths. Using this similarity to guide path searching, our approach achieves higher accuracy than the existing “searching by evaluation” strategy that relies on heuristics. Furthermore, unlike the “searching followed by evaluation” strategy enumerating all possible paths, our approach prunes the unlikely sub-paths and extends the highly-probable ones only, thus significantly increasing searching efficiency.

Electronic ISSN: 1471-2105

Topics: Biology , Computer Science

Published by BioMed Central

Permalink

	Location	Call Number	Expected	Availability

Others were also interested in ...

PAPER CURRENT

S·F·X

Fulltext

3

Unknown

A drug repositioning algorithm based on a deep autoencoder and adaptive fusion (2021)

Chen, Peng ; Bao, Tianjiazhi ; Yu, Xiaosheng ; [et al.]

BioMed Central

In: BMC Bioinformatics. 2021; 22(1): 532. Published 2021 Oct 30. doi: 10.1186/s12859-021-04406-y.

add to mindlist on the mindlist

Details

Publication Date: 2021-10-30

Description: Background Drug repositioning has caught the attention of scholars at home and abroad due to its effective reduction of the development cost and time of new drugs. However, existing drug repositioning methods that are based on computational analysis are limited by sparse data and classic fusion methods; thus, we use autoencoders and adaptive fusion methods to calculate drug repositioning. Results In this study, a drug repositioning algorithm based on a deep autoencoder and adaptive fusion was proposed to mitigate the problems of decreased precision and low-efficiency multisource data fusion caused by data sparseness. Specifically, a drug is repositioned by fusing drug-disease associations, drug target proteins, drug chemical structures and drug side effects. First, drug feature data integrated by drug target proteins and chemical structures were processed with dimension reduction via a deep autoencoder to characterize feature representations more densely and abstractly. Then, disease similarity was computed using drug-disease association data, while drug similarity was calculated with drug feature and drug-side effect data. Predictions of drug-disease associations were also calculated using a top-k neighbor method that is commonly used in predictive drug repositioning studies. Finally, a predicted matrix for drug-disease associations was acquired after fusing a wide variety of data via adaptive fusion. Based on experimental results, the proposed algorithm achieves a higher precision and recall rate than the DRCFFS, SLAMS and BADR algorithms with the same dataset. Conclusion The proposed algorithm contributes to investigating the novel uses of drugs, as shown in a case study of Alzheimer's disease. Therefore, the proposed algorithm can provide an auxiliary effect for clinical trials of drug repositioning.

Electronic ISSN: 1471-2105

Topics: Biology , Computer Science

Published by BioMed Central

Permalink

	Location	Call Number	Expected	Availability

Others were also interested in ...

PAPER CURRENT

S·F·X

Fulltext

4

Unknown

LongStitch: high-quality genome assembly correction and scaffolding using long reads (2021)

Coombe, Lauren ; Li, Janet X. ; Lo, Theodora ; [et al.]

BioMed Central

In: BMC Bioinformatics. 2021; 22(1): 534. Published 2021 Oct 30. doi: 10.1186/s12859-021-04451-7.

add to mindlist on the mindlist

Details

Publication Date: 2021-10-30

Description: Background Generating high-quality de novo genome assemblies is foundational to the genomics study of model and non-model organisms. In recent years, long-read sequencing has greatly benefited genome assembly and scaffolding, a process by which assembled sequences are ordered and oriented through the use of long-range information. Long reads are better able to span repetitive genomic regions compared to short reads, and thus have tremendous utility for resolving problematic regions and helping generate more complete draft assemblies. Here, we present LongStitch, a scalable pipeline that corrects and scaffolds draft genome assemblies exclusively using long reads. Results LongStitch incorporates multiple tools developed by our group and runs in up to three stages, which includes initial assembly correction (Tigmint-long), followed by two incremental scaffolding stages (ntLink and ARKS-long). Tigmint-long and ARKS-long are misassembly correction and scaffolding utilities, respectively, previously developed for linked reads, that we adapted for long reads. Here, we describe the LongStitch pipeline and introduce our new long-read scaffolder, ntLink, which utilizes lightweight minimizer mappings to join contigs. LongStitch was tested on short and long-read assemblies of Caenorhabditis elegans, Oryza sativa, and three different human individuals using corresponding nanopore long-read data, and improves the contiguity of each assembly from 1.2-fold up to 304.6-fold (as measured by NGA50 length). Furthermore, LongStitch generates more contiguous and correct assemblies compared to state-of-the-art long-read scaffolder LRScaf in most tests, and consistently improves upon human assemblies in under five hours using less than 23 GB of RAM. Conclusions Due to its effectiveness and efficiency in improving draft assemblies using long reads, we expect LongStitch to benefit a wide variety of de novo genome assembly projects. The LongStitch pipeline is freely available at https://github.com/bcgsc/longstitch.

Electronic ISSN: 1471-2105

Topics: Biology , Computer Science

Published by BioMed Central

Permalink

	Location	Call Number	Expected	Availability

Others were also interested in ...

PAPER CURRENT

S·F·X

Fulltext

5

Unknown

Assessing the suitability of general practice electronic health records for clinical prediction model development: a data quality assessment (2021)

Thuraisingam, Sharmala ; Chondros, Patty ; Dowsey, Michelle M. ; [et al.]

BioMed Central

In: BMC Medical Informatics and Decision Making. 2021; 21(1): 297. Published 2021 Oct 30. doi: 10.1186/s12911-021-01669-6.

add to mindlist on the mindlist

Details

Publication Date: 2021-10-30

Description: Background The use of general practice electronic health records (EHRs) for research purposes is in its infancy in Australia. Given these data were collected for clinical purposes, questions remain around data quality and whether these data are suitable for use in prediction model development. In this study we assess the quality of data recorded in 201,462 patient EHRs from 483 Australian general practices to determine its usefulness in the development of a clinical prediction model for total knee replacement (TKR) surgery in patients with osteoarthritis (OA). Methods Variables to be used in model development were assessed for completeness and plausibility. Accuracy for the outcome and competing risk were assessed through record level linkage with two gold standard national registries, Australian Orthopaedic Association National Joint Replacement Registry (AOANJRR) and National Death Index (NDI). The validity of the EHR data was tested using participant characteristics from the 2014–15 Australian National Health Survey (NHS). Results There were substantial missing data for body mass index and weight gain between early adulthood and middle age. TKR and death were recorded with good accuracy, however, year of TKR, year of death and side of TKR were poorly recorded. Patient characteristics recorded in the EHR were comparable to participant characteristics from the NHS, except for OA medication and metastatic solid tumour. Conclusions In this study, data relating to the outcome, competing risk and two predictors were unfit for prediction model development. This study highlights the need for more accurate and complete recording of patient data within EHRs if these data are to be used to develop clinical prediction models. Data linkage with other gold standard data sets/registries may in the meantime help overcome some of the current data quality challenges in general practice EHRs when developing prediction models.

Electronic ISSN: 1472-6947

Topics: Computer Science , Medicine

Published by BioMed Central

Permalink

	Location	Call Number	Expected	Availability

Others were also interested in ...

PAPER CURRENT

S·F·X

Fulltext

6

Unknown

isoCNV: in silico optimization of copy number variant detection from targeted or exome sequencing data (2021)

Barcelona-Cabeza, Rosa ; Sanseverino, Walter ; Aiese Cigliano, Riccardo

BioMed Central

In: BMC Bioinformatics. 2021; 22(1): 530. Published 2021 Oct 29. doi: 10.1186/s12859-021-04452-6.

add to mindlist on the mindlist

Details

Publication Date: 2021-10-29

Description: Background Accurate copy number variant (CNV) detection is especially challenging for both targeted sequencing (TS) and whole‐exome sequencing (WES) data. To maximize the performance, the parameters of the CNV calling algorithms should be optimized for each specific dataset. This requires obtaining validated CNV information using either multiplex ligation-dependent probe amplification (MLPA) or array comparative genomic hybridization (aCGH). They are gold standard but time-consuming and costly approaches. Results We present isoCNV which optimizes the parameters of DECoN algorithm using only NGS data. The parameter optimization process is performed using an in silico CNV validated dataset obtained from the overlapping calls of three algorithms: CNVkit, panelcn.MOPS and DECoN. We evaluated the performance of our tool and showed that increases the sensitivity in both TS and WES real datasets. Conclusions isoCNV provides an easy-to-use pipeline to optimize DECoN that allows the detection of analysis-ready CNV from a set of DNA alignments obtained under the same conditions. It increases the sensitivity of DECoN without the need for orthogonal methods. isoCNV is available at https://gitlab.com/sequentiateampublic/isocnv.

Electronic ISSN: 1471-2105

Topics: Biology , Computer Science

Published by BioMed Central

Permalink

	Location	Call Number	Expected	Availability

Others were also interested in ...

PAPER CURRENT

S·F·X

Fulltext

7

Unknown

Centrality of drug targets in protein networks (2021)

Viacava Follis, Ariele

BioMed Central

In: BMC Bioinformatics. 2021; 22(1): 527. Published 2021 Oct 29. doi: 10.1186/s12859-021-04342-x.

add to mindlist on the mindlist

Details

Publication Date: 2021-10-29

Description: Background In the pharmaceutical industry, competing for few validated drug targets there is a drive to identify new ways of therapeutic intervention. Here, we attempted to define guidelines to evaluate a target’s ‘fitness’ based on its node characteristics within annotated protein functional networks to complement contingent therapeutic hypotheses. Results We observed that targets of approved, selective small molecule drugs exhibit high node centrality within protein networks relative to a broader set of investigational targets spanning various development stages. Targets of approved drugs also exhibit higher centrality than other proteins within their respective functional class. These findings expand on previous reports of drug targets’ network centrality by suggesting some centrality metrics such as low topological coefficient as inherent characteristics of a ‘good’ target, relative to other exploratory targets and regardless of its functional class. These centrality metrics could thus be indicators of an individual protein’s ‘fitness’ as potential drug target. Correlations between protein nodes’ network centrality and number of associated publications underscored the possibility of knowledge bias as an inherent limitation to such predictions. Conclusions Despite some entanglement with knowledge bias, like structure-oriented ‘druggability’ assessments of new protein targets, centrality metrics could assist early pharmaceutical discovery teams in evaluating potential targets with limited experimental proof of concept and help allocate resources for an effective drug discovery pipeline.

Electronic ISSN: 1471-2105

Topics: Biology , Computer Science

Published by BioMed Central

Permalink

	Location	Call Number	Expected	Availability

Others were also interested in ...

PAPER CURRENT

S·F·X

Fulltext

8

Unknown

FoldHSphere: deep hyperspherical embeddings for protein fold recognition (2021)

Villegas-Morcillo, Amelia ; Sanchez, Victoria ; Gomez, Angel M.

BioMed Central

In: BMC Bioinformatics. 2021; 22(1): 490. Published 2021 Oct 12. doi: 10.1186/s12859-021-04419-7.

add to mindlist on the mindlist

Details

Publication Date: 2021-10-12

Description: Background Current state-of-the-art deep learning approaches for protein fold recognition learn protein embeddings that improve prediction performance at the fold level. However, there still exists aperformance gap at the fold level and the (relatively easier) family level, suggesting that it might be possible to learn an embedding space that better represents the protein folds. Results In this paper, we propose the FoldHSphere method to learn a better fold embedding space through a two-stage training procedure. We first obtain prototype vectors for each fold class that are maximally separated in hyperspherical space. We then train a neural network by minimizing the angular large margin cosine loss to learn protein embeddings clustered around the corresponding hyperspherical fold prototypes. Our network architectures, ResCNN-GRU and ResCNN-BGRU, process the input protein sequences by applying several residual-convolutional blocks followed by a gated recurrent unit-based recurrent layer. Evaluation results on the LINDAHL dataset indicate that the use of our hyperspherical embeddings effectively bridges the performance gap at the family and fold levels. Furthermore, our FoldHSpherePro ensemble method yields an accuracy of 81.3% at the fold level, outperforming all the state-of-the-art methods. Conclusions Our methodology is efficient in learning discriminative and fold-representative embeddings for the protein domains. The proposed hyperspherical embeddings are effective at identifying the protein fold class by pairwise comparison, even when amino acid sequence similarities are low.

Electronic ISSN: 1471-2105

Topics: Biology , Computer Science

Published by BioMed Central

Permalink

	Location	Call Number	Expected	Availability

Others were also interested in ...

PAPER CURRENT

S·F·X

Fulltext

9

Unknown

adabmDCA: adaptive Boltzmann machine learning for biological sequences (2021)

Muntoni, Anna Paola ; Pagnani, Andrea ; Weigt, Martin ; [et al.]

BioMed Central

In: BMC Bioinformatics. 2021; 22(1): 528. Published 2021 Oct 29. doi: 10.1186/s12859-021-04441-9.

add to mindlist on the mindlist

Details

Publication Date: 2021-10-29

Description: Background Boltzmann machines are energy-based models that have been shown to provide an accurate statistical description of domains of evolutionary-related protein and RNA families. They are parametrized in terms of local biases accounting for residue conservation, and pairwise terms to model epistatic coevolution between residues. From the model parameters, it is possible to extract an accurate prediction of the three-dimensional contact map of the target domain. More recently, the accuracy of these models has been also assessed in terms of their ability in predicting mutational effects and generating in silico functional sequences. Results Our adaptive implementation of Boltzmann machine learning, , can be generally applied to both protein and RNA families and accomplishes several learning set-ups, depending on the complexity of the input data and on the user requirements. The code is fully available at https://github.com/anna-pa-m/adabmDCA. As an example, we have performed the learning of three Boltzmann machines modeling the Kunitz and Beta-lactamase2 protein domains and TPP-riboswitch RNA domain. Conclusions The models learned by are comparable to those obtained by state-of-the-art techniques for this task, in terms of the quality of the inferred contact map as well as of the synthetically generated sequences. In addition, the code implements both equilibrium and out-of-equilibrium learning, which allows for an accurate and lossless training when the equilibrium one is prohibitive in terms of computational time, and allows for pruning irrelevant parameters using an information-based criterion.

Electronic ISSN: 1471-2105

Topics: Biology , Computer Science

Published by BioMed Central

Permalink

	Location	Call Number	Expected	Availability

Others were also interested in ...

PAPER CURRENT

S·F·X

Fulltext

10

Unknown

Factors of quality of care and their association with smartphone based PHR adoption in South Korean hospitals (2021)

Choi, Byung Kwan ; Park, Young-Taek ; Park, Hyeoun-Ae ; [et al.]

BioMed Central

In: BMC Medical Informatics and Decision Making. 2021; 21(1): 296. Published 2021 Oct 29. doi: 10.1186/s12911-021-01666-9.

add to mindlist on the mindlist

Details

Publication Date: 2021-10-29

Description: Background Healthcare organizations have begun to adopt personal health records (PHR) systems to engage patients, but little is known about factors associated with the adoption of PHR systems at an organizational level. The objective of this study is to investigate factors associated with healthcare organizations’ adoption of PHR systems in South Korea. Methods The units of analysis were hospitals with more than 100 beds. Study data of 313 hospitals were collected from May 1 to June 30, 2020. The PHR adoption status for each hospital was collected from PHR vendors and online searches. Adoption was then confirmed by downloading the hospital’s PHR app and the PHR app was examined to ascertain its available functions. One major outcome variable was PHR adoption status at hospital level. Data were analysed by logistic regressions using SAS 9.4 version. Results Out of 313 hospitals, 103 (32.9%) hospitals adopted PHR systems. The nurse-patient ratio was significantly associated with PHR adoption (OR 0.758; 0.624 to 0.920, p = 0.005). The number of health information management staff was associated with PHR adoption (OR 1.622; 1.228 to 2.141, p = 0.001). The number of CTs was positively associated with PHR adoption (OR 5.346; 1.962 to 14.568, p = 0.001). Among the hospital characteristics, the number of beds was significantly related with PHR adoption in the model of standard of nursing care (OR 1.003; 1.001 to 1.005, p

Electronic ISSN: 1472-6947

Topics: Computer Science , Medicine

Published by BioMed Central

Permalink

	Location	Call Number	Expected	Availability

Others were also interested in ...

PAPER CURRENT

S·F·X

Fulltext