ALBERT

All Library Books, journals and Electronic Records Telegrafenberg

Filter
  • Books
  • Articles  (23)
  • Computational Methods, Massively Parallel (Deep) Sequencing  (12)
  • Repair  (11)
  • Oxford University Press  (23)
  • MDPI Publishing
  • Nucleic Acids Research  (23)
  • 1
    Publication Date: 2016-02-20
    Description: Laser microirradiation is a powerful tool for real-time single-cell analysis of the DNA damage response (DDR). It is often found, however, that factor recruitment or modification profiles vary depending on the laser system employed. This is likely due to an incomplete understanding of how laser conditions/dosages affect the amounts and types of damage and the DDR. We compared different irradiation conditions using a femtosecond near-infrared laser and found distinct damage site recruitment thresholds for 53BP1 and TRF2 correlating with the dose-dependent increase of strand breaks and damage complexity. Low input-power microirradiation that induces relatively simple strand breaks led to robust recruitment of 53BP1 but not TRF2. In contrast, increased strand breaks with complex damage including crosslinking and base damage generated by high input-power microirradiation resulted in TRF2 recruitment to damage sites with no 53BP1 clustering. We found that poly(ADP-ribose) polymerase (PARP) activation distinguishes between the two damage states and that PARP activation is essential for rapid TRF2 recruitment while suppressing 53BP1 accumulation at damage sites. Thus, our results reveal that careful titration of laser irradiation conditions allows induction of varying amounts and complexities of DNA damage that are gauged by differential PARP activation regulating protein assembly at the damage site.
    Keywords: Repair
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 2
    Publication Date: 2016-03-01
    Description: The adaptive immune system includes populations of B and T cells capable of binding foreign epitopes via antigen specific receptors, called immunoglobulin (IG) for B cells and the T cell receptor (TCR) for T cells. In order to provide protection from a wide range of pathogens, these cells display highly diverse repertoires of IGs and TCRs. This is achieved through combinatorial rearrangement of multiple gene segments in addition, for B cells, to somatic hypermutation. Deep sequencing technologies have revolutionized analysis of the diversity of these repertoires; however, accurate TCR/IG diversity profiling requires specialist bioinformatics tools. Here we present LymAnalyzer, a software package that significantly improves the completeness and accuracy of TCR/IG profiling from deep sequence data and includes procedures to identify novel alleles of gene segments. On real and simulated data sets LymAnalyzer produces highly accurate and complete results. Although to date we have applied it to TCR/IG data from human and mouse, it can be applied to data from any species for which an appropriate database of reference genes is available. Implemented in Java, it includes both a command line version and a graphical user interface and is freely available at https://sourceforge.net/projects/lymanalyzer/ .
    Keywords: Computational Methods, Massively Parallel (Deep) Sequencing
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 3
    Publication Date: 2016-04-21
    Description: DNA double-strand breaks (DSBs) and their repair can cause extensive epigenetic changes. As a result, DSBs have been proposed to promote transcriptional and, ultimately, physiological dysfunction via both cell-intrinsic and cell-non-autonomous pathways. Studying the consequences of DSBs in higher organisms has, however, been hindered by a scarcity of tools for controlled DSB induction. Here, we describe a mouse model that allows for both tissue-specific and temporally controlled DSB formation at ~140 defined genomic loci. Using this model, we show that DSBs promote a DNA damage signaling-dependent decrease in gene expression in primary cells specifically at break-bearing genes, which is reversed upon DSB repair. Importantly, we demonstrate that restoration of gene expression can occur independently of cell cycle progression, underlining its relevance for normal tissue maintenance. Consistent with this, we observe no evidence for persistent transcriptional repression in response to a multi-day course of continuous DSB formation and repair in mouse lymphocytes in vivo. Together, our findings reveal an unexpected capacity of primary cells to maintain transcriptome integrity in response to DSBs, pointing to a limited role for DNA damage as a mediator of cell-autonomous epigenetic dysfunction.
    Keywords: Repair
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 4
    Publication Date: 2015-02-18
    Description: DNA-damage tolerance (DDT) via translesion DNA synthesis (TLS) or homology-dependent repair (HDR) functions to bypass DNA lesions encountered during replication, and is critical for maintaining genome stability. Here, we present piggyBlock, a new chromosomal assay that, using piggyBac transposition of DNA containing a known lesion, measures the division of labor between the two DDT pathways. We show that in the absence of DNA damage response, tolerance of the most common sunlight-induced DNA lesion, TT-CPD, is achieved by TLS in mouse embryo fibroblasts. Meanwhile, BP-G, a major smoke-induced DNA lesion, is bypassed primarily by HDR, providing the first evidence for this mechanism being the main tolerance pathway for a biologically important lesion in a mammalian genome. We also show that, far from being a last-resort strategy as it is sometimes portrayed, TLS operates alongside nucleotide excision repair, handling 40% of TT-CPDs in repair-proficient cells. Finally, DDT acts in mouse embryonic stem cells, exhibiting the same pattern—mutagenic TLS included—despite the risk of propagating mutations along all cell lineages. The new method highlights the importance of HDR, and provides an effective tool for studying DDT in mammalian cells.
    Keywords: Repair
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 5
    Publication Date: 2015-09-19
    Description: Structural variations (SVs) play a crucial role in genetic diversity. However, the alignments of reads near/across SVs are made inaccurate by the presence of polymorphisms. BatAlign is an algorithm that integrated two strategies called ‘Reverse-Alignment’ and ‘Deep-Scan’ to improve the accuracy of read-alignment. In our experiments, BatAlign was able to obtain the highest F-measures in read-alignments on mismatch-aberrant, indel-aberrant, concordantly/discordantly paired and SV-spanning data sets. On real data, the alignments of BatAlign were able to recover 4.3% more PCR-validated SVs with 73.3% less callings. These suggest BatAlign to be effective in detecting SVs and other polymorphic-variants accurately using high-throughput data. BatAlign is publicly available at https://goo.gl/a6phxB .
    Keywords: Computational Methods, Massively Parallel (Deep) Sequencing
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 6
    Publication Date: 2014-02-11
    Description: Correctly estimating isoform-specific gene expression is important for understanding complicated biological mechanisms and for mapping disease susceptibility genes. However, estimating isoform-specific gene expression is challenging because various biases present in RNA-Seq (RNA sequencing) data complicate the analysis, and if not appropriately corrected, can affect isoform expression estimation and downstream analysis. In this article, we present PennSeq, a statistical method that allows each isoform to have its own non-uniform read distribution. Instead of making parametric assumptions, we give adequate weight to the underlying data by the use of a non-parametric approach. Our rationale is that regardless what factors lead to non-uniformity, whether it is due to hexamer priming bias, local sequence bias, positional bias, RNA degradation, mapping bias or other unknown reasons, the probability that a fragment is sampled from a particular region will be reflected in the aligned data. This empirical approach thus maximally reflects the true underlying non-uniform read distribution. We evaluate the performance of PennSeq using both simulated data with known ground truth, and using two real Illumina RNA-Seq data sets including one with quantitative real time polymerase chain reaction measurements. Our results indicate superior performance of PennSeq over existing methods, particularly for isoforms demonstrating severe non-uniformity. PennSeq is freely available for download at http://sourceforge.net/projects/pennseq .
    Keywords: Repair
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 7
    Publication Date: 2014-02-28
    Description: The nucleotide excision repair pathway removes ultraviolet (UV) photoproducts from the human genome in the form of short oligonucleotides ~30 nt in length. Because there are limitations to many of the currently available methods for investigating UV photoproduct repair in vivo, we developed a convenient non-radioisotopic method to directly detect DNA excision repair events in human cells. The approach involves extraction of oligonucleotides from UV-irradiated cells, DNA end-labeling with biotin and streptavidin-mediated chemiluminescent detection of the excised UV photoproduct-containing oligonucleotides that are released from the genome during excision repair. Our novel approach is robust, with essentially no signal in the absence of UV or a functional excision repair system. Furthermore, our non-radioisotopic methodology allows for the sensitive detection of excision products within minutes following UV irradiation and does not require additional enrichment steps such as immunoprecipitation. Finally, this technique allows for quantitative measurements of excision repair in human cells. We suggest that the new techniques presented here will be a useful and powerful approach for studying the mechanism of human nucleotide excision repair in vivo.
    Keywords: Repair
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 8
    Publication Date: 2014-02-11
    Description: 5' strand resection at DNA double strand breaks (DSBs) is critical for homologous recombination (HR) and genomic stability. Here we develop a novel method to quantitatively measure single-stranded DNA intermediates in human cells and find that the 5' strand at endonuclease-generated break sites is resected up to 3.5 kb in a cell cycle–dependent manner. Depletion of CtIP, Mre11, Exo1 or SOSS1 blocks resection, while depletion of 53BP1, Ku or DNA-dependent protein kinase catalytic subunit leads to increased resection as measured by this method. While 53BP1 negatively regulates DNA end processing, depletion of Brca1 does not, suggesting that the role of Brca1 in HR is primarily to promote Rad51 filament formation, not to regulate end resection.
    Keywords: Repair
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 9
    Publication Date: 2013-06-08
    Description: Double-strand break (DSB) repair pathways are critical for the maintenance of genomic integrity and the prevention of tumorigenesis in mammalian cells. Here, we present the development and validation of a novel assay to measure mutagenic non-homologous end-joining (NHEJ) repair in living cells, which is inversely related to canonical NHEJ and is based on the sequence-altering repair of a single site-specific DSB at an intrachromosomal locus. We have combined this mutagenic NHEJ assay with an established homologous recombination (HR) assay such that both pathways can be monitored simultaneously. In addition, we report the development of a ligand-responsive I-SceI protein, in which the timing and kinetics of DSB induction can be precisely controlled by regulating protein stability and cellular localization in cells. Using this system, we report that mutagenic NHEJ repair is suppressed in growth-arrested and serum-deprived cells, suggesting that end-joining activity in proliferating cells is more likely to be mutagenic. Collectively, the novel DSB repair assay and inducible I-SceI will be useful tools to further elucidate the complexities of NHEJ and HR repair.
    Keywords: Repair
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 10
    Publication Date: 2013-09-06
    Description: MiST is a novel approach to variant calling from deep sequencing data, using the inverted mapping approach developed for Geoseq. Reads that can map to a targeted exonic region are identified using exact matches to tiles from the region. The reads are then aligned to the targets to discover variants. MiST carefully handles paralogous reads that map ambiguously to the genome and clonal reads arising from PCR bias, which are the two major sources of errors in variant calling. The reduced computational complexity of mapping selected reads to targeted regions of the genome improves speed, specificity and sensitivity of variant detection. Compared with variant calls from the GATK platform, MiST showed better concordance with SNPs from dbSNP and genotypes determined by an exonic-SNP array. Variant calls made only by MiST confirm at a high rate (>90%) by Sanger sequencing. Thus, MiST is a valuable alternative tool to analyse variants in deep sequencing data.
    Keywords: Computational Methods, Massively Parallel (Deep) Sequencing
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 11
    Publication Date: 2013-05-29
    Description: Sequencing of RNAs (RNA-Seq) has revolutionized the field of transcriptomics, but the reads obtained often contain errors. Read error correction can have a large impact on our ability to accurately assemble transcripts. This is especially true for de novo transcriptome analysis, where a reference genome is not available. Current read error correction methods, developed for DNA sequence data, cannot handle the overlapping effects of non-uniform abundance, polymorphisms and alternative splicing. Here we present SEquencing Error CorrEction in Rna-seq data (SEECER), a hidden Markov Model (HMM)–based method, which is the first to successfully address these problems. SEECER efficiently learns hundreds of thousands of HMMs and uses these to correct sequencing errors. Using human RNA-Seq data, we show that SEECER greatly improves on previous methods in terms of quality of read alignment to the genome and assembly accuracy. To illustrate the usefulness of SEECER for de novo transcriptome studies, we generated new RNA-Seq data to study the development of the sea cucumber Parastichopus parvimensis. Our corrected assembled transcripts shed new light on two important stages in sea cucumber development. Comparison of the assembled transcripts to known transcripts in other species has also revealed novel transcripts that are unique to sea cucumber, some of which we have experimentally validated. Supporting website: http://sb.cs.cmu.edu/seecer/ .
    Keywords: Computational Methods, Massively Parallel (Deep) Sequencing
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 12
    Publication Date: 2013-11-21
    Description: Detection of differential expression in RNA-Seq data is currently limited to studies in which two or more sample conditions are known a priori. However, these biological conditions are typically unknown in cohort, cross-sectional and nonrandomized controlled studies such as the HapMap, the ENCODE or the 1000 Genomes project. We present DEXUS for detecting differential expression in RNA-Seq data for which the sample conditions are unknown. DEXUS models read counts as a finite mixture of negative binomial distributions in which each mixture component corresponds to a condition. A transcript is considered differentially expressed if modeling of its read counts requires more than one condition. DEXUS decomposes read count variation into variation due to noise and variation due to differential expression. Evidence of differential expression is measured by the informative/noninformative (I/NI) value, which allows differentially expressed transcripts to be extracted at a desired specificity (significance level) or sensitivity (power). DEXUS performed excellently in identifying differentially expressed transcripts in data with unknown conditions. On 2400 simulated data sets, I/NI value thresholds of 0.025, 0.05 and 0.1 yielded average specificities of 92, 97 and 99% at sensitivities of 76, 61 and 38%, respectively. On real-world data sets, DEXUS was able to detect differentially expressed transcripts related to sex, species, tissue, structural variants or quantitative trait loci. The DEXUS R package is publicly available from Bioconductor and the scripts for all experiments are available at http://www.bioinf.jku.at/software/dexus/ .
    Keywords: Computational Methods, Massively Parallel (Deep) Sequencing
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 13
    Publication Date: 2013-09-06
    Description: Oxidized bases in DNA have been implicated in cancer, aging and neurodegenerative disease. We have developed an approach combining single-cell gel electrophoresis (comet) with fluorescence in situ hybridization (FISH) that enables the comparative quantification of low, physiologically relevant levels of DNA lesions in the respective strands of defined nucleotide sequences and in the genome overall. We have synthesized single-stranded probes targeting the termini of DNA segments of interest using a polymerase chain reaction-based method. These probes facilitate detection of damage at the single-molecule level, as the lesions are converted to DNA strand breaks by lesion-specific endonucleases or glycosylases. To validate our method, we have documented transcription-coupled repair of cyclobutane pyrimidine dimers in the ataxia telangiectasia-mutated (ATM) gene in human fibroblasts irradiated with 254 nm ultraviolet at 0.1 J/m², a dose ~100-fold lower than those typically used. The high specificity and sensitivity of our approach revealed that 7,8-dihydro-8-oxoguanine (8-oxoG) at an incidence of approximately three lesions per megabase is preferentially repaired in the transcribed strand of the ATM gene. We have also demonstrated that the hOGG1, XPA, CSB and UVSSA proteins, as well as actively elongating RNA polymerase II, are required for this process, suggesting cross-talk between DNA repair pathways.
    Keywords: Repair
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 14
    Publication Date: 2013-06-28
    Description: We report the development of simple fluorogenic probes that report on the activity of both bacterial and mammalian uracil–DNA glycosylase (UDG) enzymes. The probes are built from short, modified single-stranded oligonucleotides containing natural and unnatural bases. The combination of multiple fluorescent pyrene and/or quinacridone nucleobases yields fluorescence at 480 and 540 nm (excitation 340 nm), with large Stokes shifts of 140–200 nm, considerably greater than previous probes. They are strongly quenched by uracil bases incorporated into the sequence, and they yield light-up signals of up to 40-fold, or ratiometric signals with ratio changes of 82-fold, on enzymatic removal of these quenching uracils. We find that the probes are efficient reporters of bacterial UDG, human UNG2, and human SMUG1 enzymes in vitro, yielding complete signals in minutes. Further experiments establish that a probe can be used to image UDG activity by laser confocal microscopy in bacterial cells and in a human cell line, and that signals from a probe signalling UDG activity in human cells can be quantified by flow cytometry. Such probes may prove generally useful both in basic studies of these enzymes and in biomedical applications as well.
    Keywords: Repair
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 15
    Publication Date: 2013-04-14
    Description: With the advent of high-throughput sequencing technologies, the rapid generation and accumulation of large amounts of sequencing data pose an insurmountable demand for efficient algorithms for constructing whole-genome phylogenies. The existing phylogenomic methods all use assembled sequences, which are often not available owing to the difficulty of assembling short-reads; this obstructs phylogenetic investigations on species without a reference genome. In this report, we present co-phylog, an assembly-free phylogenomic approach that creates a ‘micro-alignment’ at each ‘object’ in the sequence using the ‘context’ of the object and calculates pairwise distances before reconstructing the phylogenetic tree based on those distances. We explored the parameters’ usages and the optimal working range of co-phylog, assessed co-phylog using the simulated next-generation sequencing (NGS) data and the real NGS raw data. We also compared co-phylog method with traditional alignment and alignment-free methods and illustrated the advantages and limitations of co-phylog method. In conclusion, we demonstrated that co-phylog is an efficient algorithm and that it delivers high resolution and accurate phylogenies using whole-genome unassembled sequencing data, especially in the case of closely related organisms, thereby significantly alleviating the computational burden in the genomic era.
    Keywords: Computational Methods, Massively Parallel (Deep) Sequencing
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
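The context/object idea in record 15 can be illustrated with a toy sketch. This is only one reading of the abstract, not the published co-phylog implementation: the function names, the symmetric flank length `flank`, the single-base object, and the rule of discarding contexts seen with more than one object are all assumptions made for illustration.

```python
def context_objects(seq: str, flank: int) -> dict:
    """Map each context (left flank, right flank) to the base between them.

    Contexts observed with more than one object base are dropped as ambiguous
    (an assumption of this sketch).
    """
    table = {}
    ambiguous = set()
    for i in range(flank, len(seq) - flank):
        ctx = (seq[i - flank:i], seq[i + 1:i + 1 + flank])
        obj = seq[i]
        if ctx in ambiguous:
            continue
        if ctx in table and table[ctx] != obj:
            del table[ctx]
            ambiguous.add(ctx)
        else:
            table[ctx] = obj
    return table


def context_object_distance(a: str, b: str, flank: int = 3) -> float:
    """Fraction of shared contexts whose object bases differ between a and b."""
    table_a = context_objects(a, flank)
    table_b = context_objects(b, flank)
    shared = table_a.keys() & table_b.keys()
    if not shared:
        raise ValueError("no shared contexts; try a shorter flank")
    differing = sum(table_a[c] != table_b[c] for c in shared)
    return differing / len(shared)


# Toy sequences (made up): seq_b differs from seq_a by one substitution.
seq_a = "ACGTTGCAAGCTTAGC"
seq_b = "ACGTTGCTAGCTTAGC"
d_same = context_object_distance(seq_a, seq_a)
d_diff = context_object_distance(seq_a, seq_b)
```

Because each context acts as its own small alignment anchor, no global alignment or assembly is needed; a matrix of such pairwise distances could then be passed to any distance-based tree-building method.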
  • 16
    Publication Date: 2012-10-10
    Description: Proteins are covalently trapped on DNA to form DNA–protein crosslinks (DPCs) when cells are exposed to DNA-damaging agents. DPCs interfere with many aspects of DNA transactions. The current DPC detection methods indirectly measure crosslinked proteins (CLPs) through DNA tethered to proteins. However, a major drawback of such methods is the non-linear relationship between the amounts of DNA and CLPs, which makes quantitative data interpretation difficult. Here we developed novel methods of DPC detection based on direct CLP measurement, whereby CLPs in DNA isolated from cells are labeled with fluorescein isothiocyanate (FITC) and quantified by fluorometry or western blotting using anti-FITC antibodies. Both formats successfully monitored the induction and elimination of DPCs in cultured cells exposed to aldehydes and mouse tumors exposed to ionizing radiation (carbon-ion beams). The fluorometric and western blotting formats require 30 and 0.3 μg of DNA, respectively. Analyses of the isolated genomic DPCs revealed that both aldehydes and ionizing radiation produce two types of DPC with distinct stabilities. The stable components of aldehyde-induced DPCs have half-lives of up to days. Interestingly, that of radiation-induced DPCs has an infinite half-life, suggesting that the stable DPC component exerts a profound effect on DNA transactions over many cell cycles.
    Keywords: Repair
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 17
    Publication Date: 2012-06-28
    Description: Engineered zinc finger nucleases (ZFNs) induce DNA double-strand breaks at specific recognition sequences and can promote efficient introduction of desired insertions, deletions or substitutions at or near the cut site via homology-directed repair (HDR) with a double- and/or single-stranded donor DNA template. However, mutagenic events caused by error-prone non-homologous end-joining (NHEJ)-mediated repair are introduced with equal or higher frequency at the nuclease cleavage site. Furthermore, unintended mutations can also result from NHEJ-mediated repair of off-target nuclease cleavage sites. Here, we describe a simple and general method for converting engineered ZFNs into zinc finger nickases (ZFNickases) by inactivating the catalytic activity of one monomer in a ZFN dimer. ZFNickases show robust strand-specific nicking activity in vitro. In addition, we demonstrate that ZFNickases can stimulate HDR at their nicking site in human cells, albeit at a lower frequency than by the ZFNs from which they were derived. Finally, we find that ZFNickases appear to induce greatly reduced levels of mutagenic NHEJ at their target nicking site. ZFNickases thus provide a promising means for inducing HDR-mediated gene modifications while reducing unwanted mutagenesis caused by error-prone NHEJ.
    Keywords: Repair
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 18
    Publication Date: 2012-05-13
    Description: Quantitative analyses of next-generation sequencing (NGS) data, such as the detection of copy number variations (CNVs), remain challenging. Current methods detect CNVs as changes in the depth of coverage along chromosomes. Technological or genomic variations in the depth of coverage thus lead to a high false discovery rate (FDR), even upon correction for GC content. In the context of association studies between CNVs and disease, a high FDR means many false CNVs, thereby decreasing the discovery power of the study after correction for multiple testing. We propose ‘Copy Number estimation by a Mixture Of PoissonS’ (cn.MOPS), a data processing pipeline for CNV detection in NGS data. In contrast to previous approaches, cn.MOPS incorporates modeling of depths of coverage across samples at each genomic position. Therefore, cn.MOPS is not affected by read count variations along chromosomes. Using a Bayesian approach, cn.MOPS decomposes variations in the depth of coverage across samples into integer copy numbers and noise by means of its mixture components and Poisson distributions, respectively. The noise estimate allows for reducing the FDR by filtering out detections having high noise that are likely to be false detections. We compared cn.MOPS with the five most popular methods for CNV detection in NGS data using four benchmark datasets: (i) simulated data, (ii) NGS data from a male HapMap individual with implanted CNVs from the X chromosome, (iii) data from HapMap individuals with known CNVs, (iv) high coverage data from the 1000 Genomes Project. cn.MOPS outperformed its five competitors in terms of precision (1–FDR) and recall for both gains and losses in all benchmark data sets. The software cn.MOPS is publicly available as an R package at http://www.bioinf.jku.at/software/cnmops/ and at Bioconductor.
    Keywords: Computational Methods, Massively Parallel (Deep) Sequencing
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
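The mixture-of-Poissons idea in record 18 can be sketched for a single read count at one genomic position. This is a simplified illustration, not the cn.MOPS package: it assumes the expected count `lam` at copy number 2 and the prior weights `priors` are already known, whereas a full model would estimate them across samples. The function names and the small `eps` rate used for copy number 0 are assumptions of this sketch.

```python
import math


def poisson_log_pmf(x: int, rate: float) -> float:
    """Log of the Poisson probability of observing x events at the given rate."""
    return x * math.log(rate) - rate - math.lgamma(x + 1)


def copy_number_posterior(x: int, lam: float, priors: list, eps: float = 0.05) -> list:
    """Posterior over copy numbers 0..len(priors)-1 for one read count x.

    lam is the expected read count at copy number 2; copy number i is modelled
    as Poisson with rate lam * i / 2 (copy number 0 uses lam * eps / 2 so the
    rate stays positive). All priors must be strictly positive.
    """
    log_terms = []
    for cn, prior in enumerate(priors):
        rate = lam * (cn / 2 if cn > 0 else eps / 2)
        log_terms.append(math.log(prior) + poisson_log_pmf(x, rate))
    # Normalise in log space to avoid underflow.
    top = max(log_terms)
    weights = [math.exp(t - top) for t in log_terms]
    total = sum(weights)
    return [w / total for w in weights]


# Toy numbers (made up): 100 reads expected at the normal copy number 2.
priors = [0.02, 0.10, 0.76, 0.10, 0.02]
post_gain = copy_number_posterior(148, 100.0, priors)
post_loss = copy_number_posterior(52, 100.0, priors)
post_normal = copy_number_posterior(100, 100.0, priors)
```

Taking the index of the largest posterior entry gives the most probable integer copy number for that sample; how spread out the posterior is gives a rough sense of how much of the read-count variation is explained by noise alone.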
  • 19
    Publication Date: 2012-05-23
    Description: GC content bias describes the dependence between fragment count (read coverage) and GC content found in Illumina sequencing data. This bias can dominate the signal of interest for analyses that focus on measuring fragment abundance within a genome, such as copy number estimation (DNA-seq). The bias is not consistent between samples; and there is no consensus as to the best methods to remove it in a single sample. We analyze regularities in the GC bias patterns, and find a compact description for this unimodal curve family. It is the GC content of the full DNA fragment, not only the sequenced read, that most influences fragment count. This GC effect is unimodal: both GC-rich fragments and AT-rich fragments are underrepresented in the sequencing results. This empirical evidence strengthens the hypothesis that PCR is the most important cause of the GC bias. We propose a model that produces predictions at the base pair level, allowing strand-specific GC-effect correction regardless of the downstream smoothing or binning. These GC modeling considerations can inform other high-throughput sequencing analyses such as ChIP-seq and RNA-seq.
    Keywords: Computational Methods, Massively Parallel (Deep) Sequencing
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
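The fragment-level GC effect described in record 19 can be explored with a small sketch that computes the GC fraction of whole fragments and averages observed counts within GC bins. It is not the authors' model; the function names, the equal-width binning, and the toy data are assumptions for illustration only.

```python
from collections import defaultdict


def gc_fraction(fragment: str) -> float:
    """Fraction of G and C bases in a DNA fragment (case-insensitive)."""
    if not fragment:
        raise ValueError("fragment must be non-empty")
    seq = fragment.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def mean_count_by_gc(fragments, counts, n_bins: int = 5) -> dict:
    """Average observed count per equal-width GC bin.

    Bin i covers GC fractions in [i / n_bins, (i + 1) / n_bins); a GC fraction
    of exactly 1.0 is placed in the last bin. Empty bins are omitted.
    """
    sums = defaultdict(float)
    sizes = defaultdict(int)
    for fragment, count in zip(fragments, counts):
        bin_index = min(int(gc_fraction(fragment) * n_bins), n_bins - 1)
        sums[bin_index] += count
        sizes[bin_index] += 1
    return {b: sums[b] / sizes[b] for b in sorted(sums)}


# Toy data (made up): counts are highest at intermediate GC content.
fragments = ["ATATATATAT", "ATGCATATAT", "ATGCATGCAT", "GCGCATGCGC", "GCGCGCGCGC"]
counts = [2, 8, 12, 7, 1]
profile = mean_count_by_gc(fragments, counts)
```

A unimodal profile of this kind, low at both AT-rich and GC-rich extremes, is the pattern the abstract describes; a correction would divide each observed count by the expected count for its GC bin.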
  • 20
    Publication Date: 2012-03-29
    Description: Recent advances in sequencing technology have enabled the rapid generation of billions of bases at relatively low cost. A crucial first step in many sequencing applications is to map those reads to a reference genome. However, when the reference genome is large, finding accurate mappings poses a significant computational challenge due to the sheer amount of reads, and because many reads map to the reference sequence approximately but not exactly. We introduce Hobbes, a new gram-based program for aligning short reads, supporting Hamming and edit distance. Hobbes implements two novel techniques, which yield substantial performance improvements: an optimized gram-selection procedure for reads, and a cache-efficient filter for pruning candidate mappings. We systematically tested the performance of Hobbes on both real and simulated data with read lengths varying from 35 to 100 bp, and compared its performance with several state-of-the-art read-mapping programs, including Bowtie, BWA, mrsFast and RazerS. Hobbes is faster than all other read mapping programs we have tested while maintaining high mapping quality. Hobbes is about five times faster than Bowtie and about 2–10 times faster than BWA, depending on read length and error rate, when asked to find all mapping locations of a read in the human genome within a given Hamming or edit distance, respectively. Hobbes supports the SAM output format and is publicly available at http://hobbes.ics.uci.edu .
    Keywords: Computational Methods, Massively Parallel (Deep) Sequencing
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
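Gram-based read mapping under Hamming distance, as mentioned in record 20, can be sketched with the standard pigeonhole filter: a read with at most k mismatches must contain at least one of k + 1 disjoint grams unchanged. The code below is a generic illustration of that filter-and-verify pattern, not the Hobbes implementation; the function names, the fixed gram length `q`, and the toy sequences are assumptions.

```python
from collections import defaultdict


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def build_gram_index(reference: str, q: int) -> dict:
    """Map every length-q substring of the reference to its start positions."""
    index = defaultdict(list)
    for pos in range(len(reference) - q + 1):
        index[reference[pos:pos + q]].append(pos)
    return index


def map_read(read: str, reference: str, index: dict, q: int, k: int) -> list:
    """Reference start positions where the read matches with at most k mismatches."""
    if len(read) < (k + 1) * q:
        raise ValueError("read too short for k + 1 disjoint grams of length q")
    candidates = set()
    # Filter: look up k + 1 disjoint grams; any true match shares at least one.
    for i in range(k + 1):
        offset = i * q
        for pos in index.get(read[offset:offset + q], ()):
            start = pos - offset
            if 0 <= start <= len(reference) - len(read):
                candidates.add(start)
    # Verify: keep only candidates within the Hamming threshold.
    return sorted(
        s for s in candidates
        if hamming(read, reference[s:s + len(read)]) <= k
    )


# Toy example (made up): the read differs from reference[8:18] at one position.
reference = "ACGTACGTTAGCCGATAGGCTA"
index = build_gram_index(reference, q=4)
hits = map_read("TAGCCGTTAG", reference, index, q=4, k=1)
```

The filter step touches only positions suggested by exact gram matches, so the comparatively expensive verification runs on a small candidate set rather than on every reference position.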
  • 21
    Publication Date: 2012-12-14
    Description: We present Quip, a lossless compression algorithm for next-generation sequencing data in the FASTQ and SAM/BAM formats. In addition to implementing reference-based compression, we have developed, to our knowledge, the first assembly-based compressor, using a novel de novo assembly algorithm. A probabilistic data structure is used to dramatically reduce the memory required by traditional de Bruijn graph assemblers, allowing millions of reads to be assembled very efficiently. Read sequences are then stored as positions within the assembled contigs. This is combined with statistical compression of read identifiers, quality scores, alignment information and sequences, effectively collapsing very large data sets to <15% of their original size with no loss of information. Availability: Quip is freely available under the 3-clause BSD license from http://cs.washington.edu/homes/dcjones/quip .
    Keywords: Computational Methods, Massively Parallel (Deep) Sequencing
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 22
    Publication Date: 2012-09-27
    Description: High-throughput immunoglobulin sequencing promises new insights into the somatic hypermutation and antigen-driven selection processes that underlie B-cell affinity maturation and adaptive immunity. The ability to estimate positive and negative selection from these sequence data not only has broad applications for understanding the immune response to pathogens, but is also critical to determining the role of somatic hypermutation in autoimmunity and B-cell cancers. Here, we develop a statistical framework for Bayesian estimation of Antigen-driven SELectIoN (BASELINe) based on the analysis of somatic mutation patterns. Our approach represents a fundamental advance over previous methods by shifting the problem from one of simply detecting selection to one of quantifying selection. Along with providing a more intuitive means to assess and visualize selection, our approach allows, for the first time, comparative analysis between groups of sequences derived from different germline V(D)J segments. Application of this approach to next-generation sequencing data demonstrates different selection pressures for memory cells of different isotypes. This framework can easily be adapted to analyze other types of DNA mutation patterns resulting from a mutator that displays hot/cold-spots, substitution preference or other intrinsic biases.
    Keywords: Computational Methods, Massively Parallel (Deep) Sequencing
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 23
    Publication Date: 2012-09-27
    Description: Competitive gene set tests are commonly used in molecular pathway analysis to test for enrichment of a particular gene annotation category amongst the differential expression results from a microarray experiment. Existing gene set tests that rely on gene permutation are shown here to be extremely sensitive to inter-gene correlation. Several data sets are analyzed to show that inter-gene correlation is non-ignorable even for experiments on homogeneous cell populations using genetically identical model organisms. A new gene set test procedure (CAMERA) is proposed based on the idea of estimating the inter-gene correlation from the data, and using it to adjust the gene set test statistic. An efficient procedure is developed for estimating the inter-gene correlation and characterizing its precision. CAMERA is shown to control the type I error rate correctly regardless of inter-gene correlations, yet retains excellent power for detecting genuine differential expression. Analysis of breast cancer data shows that CAMERA recovers known relationships between tumor subtypes in very convincing terms. CAMERA can be used to analyze specified sets or as a pathway analysis tool using a database of molecular signatures.
    Keywords: Computational Methods, Massively Parallel (Deep) Sequencing
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology