ALBERT

All Library Books, journals and Electronic Records Telegrafenberg

feed icon rss

Your email was sent successfully. Check your inbox.

An error occurred while sending the email. Please try again.

Proceed reservation?

Export
Filter
  • Oxford University Press  (5)
Collection
Publisher
Years
  • 1
    Publication Date: 2016-06-16
    Description: Motivation: Alignment-based taxonomic binning for metagenome characterization proceeds in two steps: reads mapping against a reference database (RDB) and taxonomic assignment according to the best hits. Beyond the sequencing technology and the completeness of the RDB, selecting the optimal configuration of the workflow, in particular the mapper parameters and the best hit selection threshold, to get the highest binning performance remains quite empirical. Results: We developed a statistical framework to perform such optimization at a minimal computational cost. Using an optimization experimental design and simulated datasets for three sequencing technologies, we built accurate prediction models for five performance indicators and then derived the parameter configuration providing the optimal performance. Whatever the mapper and the dataset, we observed that the optimal configuration yielded better performance than the default configuration and that the best hit selection threshold had a large impact on performance. Finally, on a reference dataset from the Human Microbiome Project, we confirmed that the optimized configuration increased the performance compared with the default configuration. Availability and implementation: Not applicable. Contact: magali.dancette@biomerieux.com Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 2
    Publication Date: 2014-04-25
    Description: Motivation: Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry has been broadly adopted by routine clinical microbiology laboratories for bacterial species identification. An isolated colony of the targeted microorganism is the single prerequisite. Currently, MS-based microbial identification directly from clinical specimens can not be routinely performed, as it raises two main challenges: (i) the nature of the sample itself may increase the level of technical variability and bring heterogeneity with respect to the reference database and (ii) the possibility of encountering polymicrobial samples that will yield a ‘mixed’ MS fingerprint. In this article, we introduce a new method to infer the composition of polymicrobial samples on the basis of a single mass spectrum. Our approach relies on a penalized non-negative linear regression framework making use of species-specific prototypes, which can be derived directly from the routine reference database of pure spectra. Results: A large spectral dataset obtained from in vitro mono- and bi-microbial samples allowed us to evaluate the performance of the method in a comprehensive way. Provided that the reference matrix-assisted laser desorption/ionization time-of-flight mass spectrometry fingerprints were sufficiently distinct for the individual species, the method automatically predicted which bacterial species were present in the sample. Only few samples (5.3%) were misidentified, and bi-microbial samples were correctly identified in up to 61.2% of the cases. This method could be used in routine clinical microbiology practice. Availability and implementation: The complete dataset including both the reference database and the mock-up mixture spectra is available at http://archive.ics.uci.edu/ml/datasets/MicroMass . Contact: pierre.mahe@biomerieux.com Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 3
    Publication Date: 2016-03-26
    Description: Motivation: Metagenomics characterizes the taxonomic diversity of microbial communities by sequencing DNA directly from an environmental sample. One of the main challenges in metagenomics data analysis is the binning step, where each sequenced read is assigned to a taxonomic clade. Because of the large volume of metagenomics datasets, binning methods need fast and accurate algorithms that can operate with reasonable computing requirements. While standard alignment-based methods provide state-of-the-art performance, compositional approaches that assign a taxonomic class to a DNA read based on the k -mers it contains have the potential to provide faster solutions. Results: We propose a new rank-flexible machine learning-based compositional approach for taxonomic assignment of metagenomics reads and show that it benefits from increasing the number of fragments sampled from reference genome to tune its parameters, up to a coverage of about 10, and from increasing the k -mer size to about 12. Tuning the method involves training machine learning models on about 10 8 samples in 10 7 dimensions, which is out of reach of standard softwares but can be done efficiently with modern implementations for large-scale machine learning. The resulting method is competitive in terms of accuracy with well-established alignment and composition-based tools for problems involving a small to moderate number of candidate species and for reasonable amounts of sequencing errors. We show, however, that machine learning-based compositional approaches are still limited in their ability to deal with problems involving a greater number of species and more sensitive to sequencing errors. We finally show that the new method outperforms the state-of-the-art in its ability to classify reads from species of lineage absent from the reference database and confirm that compositional approaches achieve faster prediction times, with a gain of 2–17 times with respect to the BWA-MEM short read mapper, depending on the number of candidate species and the level of sequencing noise. Availability and implementation: Data and codes are available at http://cbio.ensmp.fr/largescalemetagenomics . Contact: pierre.mahe@biomerieux.com Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 4
    Publication Date: 2013-12-19
    Description: Motivation: Paired-end sequencing allows circumventing the shortness of the reads produced by second generation sequencers and is essential for de novo assembly of genomes. However, obtaining a finished genome from short reads is still an open challenge. We present an algorithm that exploits the pairing information issued from inserts of potentially any length. The method determines paths through an overlaps graph by using a constrained search tree. We also present a method that automatically determines suited overlaps cutoffs according to the contextual coverage, reducing thus the need for manual parameterization. Finally, we introduce an interactive mode that allows querying an assembly at targeted regions. Results: We assess our methods by assembling two Staphylococcus aureus strains that were sequenced on the Illumina platform. Using 100 bp paired-end reads and minimal manual curation, we produce a finished genome sequence for the previously undescribed isolate SGH-10-168. Availability and implementation: The presented algorithms are implemented in the standalone Edena software, freely available under the General Public License (GPLv3) at www.genomic.ch/edena.php . Contact: david.hernandez@genomic.ch Supplementary Information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 5
    Publication Date: 2008-12-23
    Print ISSN: 0737-4038
    Electronic ISSN: 1537-1719
    Topics: Biology
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
Close ⊗
This website uses cookies and the analysis tool Matomo. More information can be found here...