ALBERT

All Library Books, journals and Electronic Records Telegrafenberg

Filter
  • Computational Methods, Genomics  (67)
  • Synthetic Biology and Assembly Cloning  (46)
  • Oxford University Press  (113)
  • Irkutsk : Ross. Akad. Nauk, Sibirskoe Otd., Inst. Zemnoj Kory
  • Krefeld : Geologischer Dienst Nordrhein-Westfalen
  • Public Library of Science (PLoS)
  • 2010-2014  (113)
  • 2005-2009
  • 1
    Publication Date: 2014-11-07
    Description: A new functional gene database, FOAM (Functional Ontology Assignments for Metagenomes), was developed to screen environmental metagenomic sequence datasets. FOAM provides a new functional ontology dedicated to classifying gene functions relevant to environmental microorganisms based on Hidden Markov Models (HMMs). Sets of aligned protein sequences (i.e. ‘profiles’) were tailored to a large group of target KEGG Orthologs (KOs) from which HMMs were trained. The alignments were checked and curated to make them specific to the targeted KO. Within this process, sequence profiles were enriched with the most abundant sequences available to maximize the yield of accurate classifier models. An associated functional ontology was built to describe the functional groups and hierarchy. FOAM allows the user to select the target search space before HMM-based comparison steps and to easily organize the results into different functional categories and subcategories. FOAM is publicly available at http://portal.nersc.gov/project/m1317/FOAM/.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 2
    Publication Date: 2014-11-28
    Description: It is now known that unwanted noise and unmodeled artifacts such as batch effects can dramatically reduce the accuracy of statistical inference in genomic experiments. These sources of noise must be modeled and removed to accurately measure biological variability and to obtain correct statistical inference when performing high-throughput genomic analysis. We introduced surrogate variable analysis (sva) for estimating these artifacts by (i) identifying the part of the genomic data only affected by artifacts and (ii) estimating the artifacts with principal components or singular vectors of the subset of the data matrix. The resulting estimates of artifacts can be used in subsequent analyses as adjustment factors to correct analyses. Here I describe a version of the sva approach specifically created for count data or FPKMs from sequencing experiments based on appropriate data transformation. I also describe the addition of supervised sva (ssva) for using control probes to identify the part of the genomic data only affected by artifacts. I present a comparison between these versions of sva and other methods for batch effect estimation on simulated data, real count-based data and FPKM-based data. These updates are available through the sva Bioconductor package and I have made fully reproducible analysis using these methods available from: https://github.com/jtleek/svaseq.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
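Step (ii) of the approach in record 2, estimating artifacts from singular vectors of a data matrix, can be sketched in plain Python. This is a simplified illustration and not the sva package's algorithm. It assumes the values are already on a transformed (for example log) scale, it removes the known group signal by centering within groups rather than identifying an artifact-only subset of the data, and it extracts the leading singular vector by power iteration. All function names and the simulated data are hypothetical.

```python
import random

def center_within_groups(data, groups):
    """Remove the known group signal: subtract each group's mean per gene."""
    residuals = []
    for row in data:
        means = {}
        for g in set(groups):
            vals = [x for x, gg in zip(row, groups) if gg == g]
            means[g] = sum(vals) / len(vals)
        residuals.append([x - means[g] for x, g in zip(row, groups)])
    return residuals

def leading_sample_vector(residuals, n_iter=200):
    """Leading right singular vector (one entry per sample), by power iteration."""
    n_samples = len(residuals[0])
    v = [1.0 / (j + 1) for j in range(n_samples)]  # arbitrary non-constant start
    for _ in range(n_iter):
        # rv = R v, then w = R^T (R v)
        rv = [sum(r * x for r, x in zip(row, v)) for row in residuals]
        w = [sum(row[j] * c for row, c in zip(residuals, rv))
             for j in range(n_samples)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

# Simulated data (hypothetical): 100 genes x 12 samples,
# one known group variable and one hidden batch variable.
random.seed(0)
groups = [0] * 6 + [1] * 6
batch = [0, 1] * 6
data = []
for _ in range(100):
    group_effect = random.gauss(0, 1)
    batch_effect = random.gauss(0, 2)
    data.append([random.gauss(0, 1) + group_effect * g + batch_effect * b
                 for g, b in zip(groups, batch)])

surrogate = leading_sample_vector(center_within_groups(data, groups))
print(abs(pearson(surrogate, batch)))  # near 1 when the batch dominates the residuals
```

The estimated vector could then be added as an adjustment covariate in a downstream model, which is the use the abstract describes for the artifact estimates.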
  • 3
    Publication Date: 2014-11-28
    Description: High-throughput techniques have considerably increased the potential of comparative genomics whilst simultaneously posing many new challenges. One of those challenges involves efficiently mining the large amount of data produced and exploring the landscape of both conserved and idiosyncratic genomic regions across multiple genomes. Domains of application of these analyses are diverse: identification of evolutionary events, inference of gene functions, detection of niche-specific genes or phylogenetic profiling. Insyght is a comparative genomic visualization tool that combines three complementary displays: (i) a table for thoroughly browsing amongst homologues, (ii) a comparator of orthologue functional annotations and (iii) a genomic organization view designed to improve the legibility of rearrangements and distinctive loci. The latter display combines symbolic and proportional graphical paradigms. Synchronized navigation across multiple species and interoperability between the views are core features of Insyght. A gene filter mechanism is provided that helps the user to build a biologically relevant gene set according to multiple criteria such as presence/absence of homologues and/or various annotations. We illustrate the use of Insyght with scenarios. Currently, only Bacteria and Archaea are supported. A public instance is available at http://genome.jouy.inra.fr/Insyght. The tool is freely downloadable for private data set analysis.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 4
    Publication Date: 2014-11-28
    Description: The σ54 promoters are unique in prokaryotic genomes and responsible for transcribing carbon- and nitrogen-related genes. With the avalanche of genome sequences generated in the postgenomic age, it is highly desirable to develop automated methods for rapidly and effectively identifying the σ54 promoters. Here, a predictor called ‘iPro54-PseKNC’ was developed. In the predictor, the samples of DNA sequences were formulated by a novel feature vector called ‘pseudo k-tuple nucleotide composition’, which was further optimized by the incremental feature selection procedure. The performance of iPro54-PseKNC was examined by the rigorous jackknife cross-validation tests on a stringent benchmark data set. As a user-friendly web-server, iPro54-PseKNC is freely accessible at http://lin.uestc.edu.cn/server/iPro54-PseKNC. For the convenience of the vast majority of experimental scientists, a step-by-step protocol guide was provided on how to use the web-server to get the desired results without the need to follow the complicated mathematics that were presented in this paper just for its integrity. Meanwhile, we also discovered through an in-depth statistical analysis that the distribution of distances between the transcription start sites and the translation initiation sites was governed by the gamma distribution, which may provide a fundamental physical principle for studying the σ54 promoters.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
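The feature vector named in record 4 builds on plain k-tuple nucleotide composition, which can be sketched as follows. This is only the basic composition part. The additional "pseudo" components are not specified in the abstract and are deliberately left out. The function name and the example sequence are hypothetical.

```python
from itertools import product

def k_tuple_composition(seq, k):
    """Normalized frequencies of all 4**k k-tuples, in lexicographic order."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip windows with ambiguous bases such as N
            counts[kmer] += 1
    total = sum(counts.values())
    return [counts[m] / total if total else 0.0 for m in kmers]

# Dinucleotide (k = 2) composition of a short example sequence.
vector = k_tuple_composition("ACGTAC", 2)
print(len(vector))   # 16 features for k = 2
print(vector[1])     # frequency of "AC": 2 of 5 windows = 0.4
```

A fixed-length vector like this lets sequences of different lengths be fed to the same classifier, at the cost of discarding most sequence-order information.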
  • 5
    Publication Date: 2014-11-28
    Description: We present a discriminative learning method for pattern discovery of binding sites in nucleic acid sequences based on hidden Markov models. Sets of positive and negative example sequences are mined for sequence motifs whose occurrence frequency varies between the sets. The method offers several objective functions, but we concentrate on mutual information of condition and motif occurrence. We perform a systematic comparison of our method and numerous published motif-finding tools. Our method achieves the highest motif discovery performance, while being faster than most published methods. We present case studies of data from various technologies, including ChIP-Seq, RIP-Chip and PAR-CLIP, of embryonic stem cell transcription factors and of RNA-binding proteins, demonstrating practicality and utility of the method. For the alternative splicing factor RBM10, our analysis finds motifs known to be splicing-relevant. The motif discovery method is implemented in the free software package Discrover. It is applicable to genome- and transcriptome-scale data, makes use of available repeat experiments and aside from binary contrasts also more complex data configurations can be utilized.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 6
    Publication Date: 2014-03-13
    Description: To reveal the full potential of human pluripotent stem cells, new methods for rapid, site-specific genomic engineering are needed. Here, we describe a system for precise genetic modification of human embryonic stem cells (ESCs) and induced pluripotent stem cells (iPSCs). We identified a novel human locus, H11, located in a safe, intergenic, transcriptionally active region of chromosome 22, as the recipient site, to provide robust, ubiquitous expression of inserted genes. Recipient cell lines were established by site-specific placement of a ‘landing pad’ cassette carrying attP sites for phiC31 and Bxb1 integrases at the H11 locus by spontaneous or TALEN-assisted homologous recombination. Dual integrase cassette exchange (DICE) mediated by phiC31 and Bxb1 integrases was used to insert genes of interest flanked by phiC31 and Bxb1 attB sites at the H11 locus, replacing the landing pad. This system provided complete control over content, direction and copy number of inserted genes, with a specificity of 100%. A series of genes, including mCherry and various combinations of the neural transcription factors LMX1a, FOXA2 and OTX2, were inserted in recipient cell lines derived from H9 ESC, as well as iPSC lines derived from a Parkinson’s disease patient and a normal sibling control. The DICE system offers rapid, efficient and precise gene insertion in ESC and iPSC and is particularly well suited for repeated modifications of the same locus.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 7
    Publication Date: 2014-03-13
    Description: Recombineering, which is the use of homologous recombination for DNA engineering in Escherichia coli, usually uses antibiotic selection to identify the intended recombinant. When combined in a second step with counterselection using a small molecule toxin, seamless products can be obtained. Here, we report the advantages of a genetic strategy using CcdB as the counterselectable agent. Expression of CcdB is toxic to E. coli in the absence of the CcdA antidote so counterselection is initiated by the removal of CcdA expression. CcdB counterselection is robust and does not require titrations or experiment-to-experiment optimization. Because counterselection strategies necessarily differ according to the copy number of the target, we describe two variations. For multi-copy targets, we use two E. coli hosts so that counterselection is exerted by the transformation step that is needed to separate the recombined and unrecombined plasmids. For single copy targets, we put the ccdA gene onto the temperature-sensitive pSC101 Red expression plasmid so that counterselection is exerted by the standard temperature shift to remove the expression plasmid. To reduce unwanted intramolecular recombination, we also combined CcdB counterselection with Redα omission. These options improve the use of counterselection in recombineering with BACs, plasmids and the E. coli chromosome.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 8
    Publication Date: 2014-05-01
    Description: Molecular stratification of tumors is essential for developing personalized therapies. Although patient stratification strategies have been successful, computational methods to accurately translate the gene-signature from a high-throughput platform to a clinically adaptable low-dimensional platform are currently lacking. Here, we describe PIGExClass (platform-independent isoform-level gene-expression based classification-system), a novel computational approach to derive and then transfer gene-signatures from one analytical platform to another. We applied PIGExClass to design a reverse transcriptase-quantitative polymerase chain reaction (RT-qPCR) based molecular-subtyping assay for glioblastoma multiforme (GBM), the most aggressive primary brain tumor. Unsupervised clustering of TCGA (The Cancer Genome Atlas Consortium) GBM samples, based on isoform-level gene-expression profiles, recaptured the four known molecular subgroups but switched the subtype for 19% of the samples, resulting in significant (P = 0.0103) survival differences among the refined subgroups. The PIGExClass-derived four-class classifier, which requires only 121 transcript-variants, assigns GBM patients’ molecular subtype with 92% accuracy. This classifier was translated to an RT-qPCR assay and validated in an independent cohort of 206 GBM samples. Our results demonstrate the efficacy of PIGExClass in the design of clinically adaptable molecular subtyping assays and have implications for developing robust diagnostic assays for cancer patient stratification.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 9
    Publication Date: 2014-05-01
    Description: The ability to correlate chromosome conformation and gene expression gives a great deal of information regarding the strategies used by a cell to properly regulate gene activity. 4C-Seq is a relatively new and increasingly popular technology where the set of genomic interactions generated by a single point in the genome can be determined. 4C-Seq experiments generate large, complicated data sets and it is imperative that signal is properly distinguished from noise. Currently, there are a limited number of methods for analyzing 4C-Seq data. Here, we present a new method, fourSig, which in addition to being precise and simple to use also includes a new feature that prioritizes detected interactions. Our results demonstrate the efficacy of fourSig with previously published and novel 4C-Seq data sets and show that our significance prioritization correlates with the ability to reproducibly detect interactions among replicates.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 10
    Publication Date: 2014-02-28
    Description: DNA ‘assembly’ from ‘building blocks’ remains a cornerstone in synthetic biology, whether it be for gene synthesis (~1 kb), pathway engineering (~10 kb) or synthetic genomes (>100 kb). Despite numerous advances in the techniques used for DNA assembly, verification of the assembly is still a necessity, which becomes cost-prohibitive and a logistical challenge with increasing scale. Here we describe for the first time a comprehensive, high-throughput solution for structural DNA assembly verification by restriction digest using exhaustive in silico enzyme screening, rolling circle amplification of plasmid DNA, capillary electrophoresis and automated digest pattern recognition. This low-cost and robust methodology has been successfully used to screen over 31 000 clones of DNA constructs at <$1 per sample.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 11
    Publication Date: 2014-02-28
    Description: Synthetic biology requires effective methods to assemble DNA parts into devices and to modify these devices once made. Here we demonstrate a convenient rapid procedure for DNA fragment assembly using site-specific recombination by φC31 integrase. Using six orthogonal attP/attB recombination site pairs with different overlap sequences, we can assemble up to five DNA fragments in a defined order and insert them into a plasmid vector in a single recombination reaction. φC31 integrase-mediated assembly is highly efficient, allowing production of large libraries suitable for combinatorial gene assembly strategies. The resultant assemblies contain arrays of DNA cassettes separated by recombination sites, which can be used to manipulate the assembly by further recombination. We illustrate the utility of these procedures to (i) assemble functional metabolic pathways containing three, four or five genes; (ii) optimize productivity of two model metabolic pathways by combinatorial assembly with randomization of gene order or ribosome binding site strength; and (iii) modify an assembled metabolic pathway by gene replacement or addition.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 12
    Publication Date: 2014-04-03
    Description: Alternative transcript processing is an important mechanism for generating functional diversity in genes. However, little is known about the precise functions of individual isoforms. In fact, proteins (translated from transcript isoforms), not genes, are the function carriers. By integrating multiple human RNA-seq data sets, we carried out the first systematic prediction of isoform functions, enabling high-resolution functional annotation of human transcriptome. Unlike gene function prediction, isoform function prediction faces a unique challenge: the lack of the training data—all known functional annotations are at the gene level. To address this challenge, we modelled the gene–isoform relationships as multiple instance data and developed a novel label propagation method to predict functions. Our method achieved an average area under the receiver operating characteristic curve of 0.67 and assigned functions to 15 572 isoforms. Interestingly, we observed that different functions have different sensitivities to alternative isoform processing, and that the function diversity of isoforms from the same gene is positively correlated with their tissue expression diversity. Finally, we surveyed the literature to validate our predictions for a number of apoptotic genes. Strikingly, for the famous ‘TP53’ gene, we not only accurately identified the apoptosis regulation function of its five isoforms, but also correctly predicted the precise direction of the regulation.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
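Record 12 reports performance as an average area under the receiver operating characteristic curve (AUC). As a reminder of what that number measures, the sketch below computes AUC as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The function name and the toy scores are hypothetical and unrelated to the paper's data.

```python
def roc_auc(scores, labels):
    """AUC as P(score of random positive > score of random negative); ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: three of the four positive/negative pairs are ranked correctly.
auc = roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
print(auc)  # 0.75
```

On this scale 0.5 corresponds to random ranking and 1.0 to perfect ranking.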
  • 13
    Publication Date: 2014-04-03
    Description: A conditional gene expression system that is fast-acting, is tunable and achieves single-gene specificity was recently developed for yeast. A gene placed directly downstream of a modified GAL1 promoter containing six Zif268 binding sequences (with single nucleotide spacing) was shown to be selectively inducible in the presence of β-estradiol, so long as cells express the artificial transcription factor, Z3EV (a fusion of the Zif268 DNA binding domain, the ligand binding domain of the human estrogen receptor and viral protein 16). We show the strength of Z3EV-responsive promoters can be modified using straightforward design principles. By moving Zif268 binding sites toward the transcription start site, expression output can be nearly doubled. Despite the reported requirement of estrogen receptor dimerization for hormone-dependent activation, a single binding site suffices for target gene activation. Target gene expression levels correlate with promoter binding site copy number and we engineer a set of inducible promoter chassis with different input–output characteristics. Finally, the coupling between inducer identity and gene activation is flexible: the ligand specificity of Z3EV can be re-programmed to respond to a non-hormone small molecule with only five amino acid substitutions in the human estrogen receptor domain, which may prove useful for industrial applications.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 14
    Publication Date: 2014-10-10
    Description: Parallel analysis of RNA ends (PARE) is a technique utilizing high-throughput sequencing to profile uncapped, mRNA cleavage or decay products on a genome-wide basis. Tools currently available to validate miRNA targets using PARE data employ only annotated genes, whereas important targets may be found in unannotated genomic regions. To handle such cases and to scale to the growing availability of PARE data and genomes, we developed a new tool, ‘sPARTA’ (small RNA-PARE target analyzer) that utilizes a built-in, plant-focused target prediction module (aka ‘miRferno’). sPARTA not only exhibits an unprecedented gain in speed but also shows greater predictive power by validating more targets, compared to a popular alternative. In addition, the novel ‘seed-free’ mode, optimized to find targets irrespective of complementarity in the seed-region, identifies novel intergenic targets. To fully capitalize on the novelty and strengths of sPARTA, we developed a web resource, ‘comPARE’, for plant miRNA target analysis; this facilitates the systematic identification and analysis of miRNA-target interactions across multiple species, integrated with visualization tools. This collation of high-throughput small RNA and PARE datasets from different genomes further facilitates re-evaluation of existing miRNA annotations, resulting in a ‘cleaner’ set of microRNAs.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 15
    Publication Date: 2014-10-10
    Description: Identification of three-dimensional (3D) interactions between regulatory elements across the genome is crucial to unravel the complex regulatory machinery that orchestrates proliferation and differentiation of cells. ChIA-PET is a novel method to identify such interactions, where physical contacts between regions bound by a specific protein are quantified using next-generation sequencing. However, determining the significance of the observed interaction frequencies in such datasets is challenging, and few methods have been proposed. Despite the fact that regions that are close in linear genomic distance have a much higher tendency to interact by chance, no methods to date are capable of taking such dependency into account. Here, we propose a statistical model taking into account the genomic distance relationship, as well as the general propensity of anchors to be involved in contacts overall. Using both real and simulated data, we show that the previously proposed statistical test, based on Fisher's exact test, leads to invalid results when data are dependent on genomic distance. We also evaluate our method on previously validated cell-line specific and constitutive 3D interactions, and show that relevant interactions are significant, while avoiding over-estimating the significance of short nearby interactions.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
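Record 15 argues that a test based on Fisher's exact test is invalid when interaction frequency depends on genomic distance. For orientation, the sketch below shows a one-sided Fisher's exact test on a 2x2 table. It illustrates only the baseline test, not the authors' distance-aware model. The table layout for ChIA-PET data (paired-end tags touching both anchors, only one anchor, or neither) is an assumption made here for illustration, and the function name is hypothetical.

```python
from math import comb

def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    Returns the hypergeometric probability of observing `a` or more in the
    top-left cell when all row and column totals are held fixed.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(a, upper + 1))
    return tail / comb(n, col1)

# Assumed layout: a = tags linking anchors A and B, b = tags at A only,
# c = tags at B only, d = tags touching neither anchor.
p_value = fisher_exact_greater(3, 1, 1, 3)
print(round(p_value, 4))  # 0.2429
```

Nothing in this calculation depends on how far apart the two anchors are, so nearby anchor pairs that interact often by chance are tested against the same null as distant ones, which is the limitation the abstract highlights.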
  • 16
    Publication Date: 2014-10-10
    Description: Viral sequence classification has wide applications in clinical, epidemiological, structural and functional categorization studies. Most existing approaches rely on an initial alignment step followed by classification based on phylogenetic or statistical algorithms. Here we present an ultrafast alignment-free subtyping tool for human immunodeficiency virus type one (HIV-1) adapted from Prediction by Partial Matching compression. This tool, named COMET, was compared to the widely used phylogeny-based REGA and SCUEAL tools using synthetic and clinical HIV data sets (1 090 698 and 10 625 sequences, respectively). COMET's sensitivity and specificity were comparable to or higher than the two other subtyping tools on both data sets for known subtypes. COMET also excelled in detecting and identifying new recombinant forms, a frequent feature of the HIV epidemic. Runtime comparisons showed that COMET was almost as fast as USEARCH. This study demonstrates the advantages of alignment-free classification of viral sequences, which feature high rates of variation, recombination and insertions/deletions. COMET is free to use via an online interface.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 17
    Publication Date: 2014-09-27
    Description: The precise control of gene expression is essential in basic biological research as well as in biotechnological applications. Most regulated systems available in yeast enable only the overexpression of the target gene, excluding the possibility of intermediate or weak expression. Moreover, these systems are frequently toxic or depend on growth conditions. We constructed a heterologous transcription factor that overcomes these limitations. Our system is a fusion of the bacterial LexA DNA-binding protein, the human estrogen receptor (ER) and an activation domain (AD). The activity of this chimera, called LexA-ER-AD, is tightly regulated by the hormone β-estradiol. The selection of the AD proved to be crucial to avoid toxic effects and to define the range of activity that can be precisely tuned with β-estradiol. As our system is based on a heterologous DNA-binding domain, induction in different metabolic contexts is possible. Additionally, by controlling the number of LexA-binding sites in the target promoter, one can scale the expression levels up or down. Overall, our LexA-ER-AD system is a valuable tool to precisely control gene expression in different experimental contexts without toxic side effects.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 18
    Publication Date: 2014-09-27
    Description: Inspired by the developments of synthetic biology and the need for improved genetic tools to exploit cyanobacteria for the production of renewable bioproducts, we developed a versatile platform for the construction of broad-host-range vector systems. This platform includes the following features: (i) an efficient assembly strategy in which modules released from 3 to 4 donor plasmids or produced by polymerase chain reaction are assembled by isothermal assembly guided by short GC-rich overlap sequences; (ii) a growing library of molecular devices categorized in three major groups: (a) replication and chromosomal integration; (b) antibiotic resistance; (c) functional modules, which can be assembled in different combinations to construct a variety of autonomously replicating plasmids and suicide plasmids for gene knockout and knockin; and (iii) a web service, the CYANO-VECTOR assembly portal, which was built to organize the various modules, facilitate the in silico construction of plasmids, and encourage the use of this system. This work also resulted in the construction of an improved broad-host-range replicon derived from RSF1010, which replicates in several phylogenetically distinct strains including a new experimental model strain Synechocystis sp. WHSyn, and the characterization of nine antibiotic cassettes, four reporter genes, four promoters, and a ribozyme-based insulator in several diverse cyanobacterial strains.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 19
    Publication Date: 2014-11-28
    Description: Understanding how regulatory networks globally coordinate the response of a cell to changing conditions, such as perturbations by shifting environments, is an elementary challenge in systems biology which has yet to be met. Genome-wide gene expression measurements are high dimensional as these are reflecting the condition-specific interplay of thousands of cellular components. The integration of prior biological knowledge into the modeling process of systems-wide gene regulation enables the large-scale interpretation of gene expression signals in the context of known regulatory relations. We developed COGERE (http://mips.helmholtz-muenchen.de/cogere), a method for the inference of condition-specific gene regulatory networks in human and mouse. We integrated existing knowledge of regulatory interactions from multiple sources to a comprehensive model of prior information. COGERE infers condition-specific regulation by evaluating the mutual dependency between regulator (transcription factor or miRNA) and target gene expression using prior information. This dependency is scored by the non-parametric, nonlinear correlation coefficient η² (eta squared) that is derived by a two-way analysis of variance. We show that COGERE significantly outperforms alternative methods in predicting condition-specific gene regulatory networks on simulated data sets. Furthermore, by inferring the cancer-specific gene regulatory network from the NCI-60 expression study, we demonstrate the utility of COGERE to promote hypothesis-driven clinical research.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
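Record 19 scores regulator-target dependency with eta squared. The sketch below shows the statistic in its simplest, one-way form: the share of total variance explained by group membership. This is a deliberate simplification, because the abstract states that COGERE derives the coefficient from a two-way analysis of variance, and the details of that design are not given here. The function name and the toy groups are hypothetical.

```python
def eta_squared(groups):
    """One-way eta squared: SS_between / SS_total for a list of groups."""
    values = [x for g in groups for x in g]
    grand_mean = sum(values) / len(values)
    ss_total = sum((x - grand_mean) ** 2 for x in values)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups)
    return ss_between / ss_total if ss_total else 0.0

# Toy example: target expression split into two groups, for instance
# by low versus high regulator expression (an assumed grouping).
eta2 = eta_squared([[1, 2, 3], [4, 5, 6]])
print(round(eta2, 4))  # 0.7714
```

Because the statistic depends only on group means, it can register a non-monotonic dependence that a linear correlation coefficient would miss.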
  • 20
    Publication Date: 2014-11-28
    Description: Mammalian synthetic biology may provide novel therapeutic strategies, help decipher new paths for drug discovery and facilitate synthesis of valuable molecules. Yet, our capacity to genetically program cells is currently hampered by the lack of efficient approaches to streamline the design, construction and screening of synthetic gene networks. To address this problem, here we present a framework for modular and combinatorial assembly of functional (multi)gene expression vectors and their efficient and specific targeted integration into a well-defined chromosomal context in mammalian cells. We demonstrate the potential of this framework by assembling and integrating different functional mammalian regulatory networks including the largest gene circuit built and chromosomally integrated to date (6 transcription units, 27kb) encoding an inducible memory device. Using a library of 18 different circuits as a proof of concept, we also demonstrate that our method enables one-pot/single-flask chromosomal integration and screening of circuit libraries. This rapid and powerful prototyping platform is well suited for comparative studies of genetic regulatory elements, genes and multi-gene circuits as well as facile development of libraries of isogenic engineered cell lines.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 21
    Publication Date: 2014-12-17
    Description: Non-coding RNAs (ncRNAs) are known to play important functional roles in the cell. However, their identification and recognition in genomic sequences remain challenging. In silico methods, such as classification tools, offer a fast and reliable way for such screening and multiple classifiers have already been developed to predict well-defined subfamilies of RNA. So far, however, out of all the ncRNAs, only tRNA, miRNA and snoRNA can be predicted with a satisfying sensitivity and specificity. Here we present ptRNApred, a tool to detect and classify subclasses of non-coding RNA that are involved in the regulation of post-transcriptional modifications or DNA replication, which we here call post-transcriptional RNA (ptRNA). It (i) detects RNA sequences coding for post-transcriptional RNA from the genomic sequence with an overall sensitivity of 91% and a specificity of 94% and (ii) predicts ptRNA-subclasses that exist in eukaryotes: snRNA, snoRNA, RNase P, RNase MRP, Y RNA or telomerase RNA. AVAILABILITY: The ptRNApred software is open for public use on http://www.ptrnapred.org/ .
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 22
    Publication Date: 2014-12-17
    Description: Rapid development of next generation sequencing technology has enabled the identification of genomic alterations from short sequencing reads. There are a number of software pipelines available for calling single nucleotide variants from genomic DNA, but no comprehensive pipelines to identify, annotate and prioritize expressed SNVs (eSNVs) from non-directional paired-end RNA-Seq data. We have developed eSNV-Detect, a novel computational system, which utilizes data from multiple aligners to call, even at low read depths, and rank variants from RNA-Seq. Multi-platform comparisons with the eSNV-Detect variant candidates were performed. The method was first applied to RNA-Seq from a lymphoblastoid cell-line, achieving 99.7% precision and 91.0% sensitivity in the expressed SNPs for the matching HumanOmni2.5 BeadChip data. Comparison of RNA-Seq eSNV candidates from 25 ER+ breast tumors from The Cancer Genome Atlas (TCGA) project with whole exome coding data showed 90.6–96.8% precision and 91.6–95.7% sensitivity. Contrasting single-cell mRNA-Seq variants with matching traditional multicellular RNA-Seq data for the MDA-MB-231 breast cancer cell-line delineated variant heterogeneity among the single cells. Further, Sanger sequencing validation was performed for an ER+ breast tumor with paired normal adjacent tissue, validating 29 out of 31 candidate eSNVs. The source code and user manuals of the eSNV-Detect pipeline for Sun Grid Engine and virtual machine are available at http://bioinformaticstools.mayo.edu/research/esnv-detect/ .
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 23
    Publication Date: 2014-04-15
    Description: Heterogeneity in genetic networks across different signaling molecular contexts can suggest molecular regulatory mechanisms. Here we describe a comparative chi-square analysis (CPχ²) method, considerably more flexible and effective than other alternatives, to screen large gene expression data sets for conserved and differential interactions. CPχ² decomposes interactions across conditions to assess homogeneity and heterogeneity. Theoretically, we prove an asymptotic chi-square null distribution for the interaction heterogeneity statistic. Empirically, on synthetic yeast cell cycle data, CPχ² achieved much higher statistical power in detecting differential networks than alternative approaches. We applied CPχ² to Drosophila melanogaster wing gene expression arrays collected under normal conditions, and conditions with overexpressed E2F and Cabut, two transcription factor complexes that promote ectopic cell cycling. The resulting differential networks suggest a mechanism by which E2F and Cabut regulate distinct gene interactions, while still sharing a small core network. Thus, CPχ² is sensitive in detecting network rewiring, useful in comparing related biological systems.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
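The decomposition into homogeneity and heterogeneity described in this abstract can be illustrated with the classical heterogeneity chi-square for contingency tables. This is only a minimal sketch under stated assumptions: it presumes that expression values have already been discretized into one contingency table per condition for a given gene pair, and the function names (`pearson_chi2`, `heterogeneity_chi2`) are hypothetical. It is not the authors' implementation, and the degrees of freedom and null distribution derived in the paper are not reproduced here.

```python
def pearson_chi2(table):
    """Pearson chi-square statistic of a two-way contingency table."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    total = sum(row_sums)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_sums[i] * col_sums[j] / total
            if expected > 0:  # skip cells in empty rows or columns
                stat += (observed - expected) ** 2 / expected
    return stat


def heterogeneity_chi2(tables):
    """Split the summed per-condition chi-squares into pooled and heterogeneity parts.

    Returns (total, pooled, heterogeneity), where
    total = sum of the chi-squares computed separately for each condition,
    pooled = chi-square of the cell-wise summed table,
    heterogeneity = total - pooled.
    """
    total = sum(pearson_chi2(t) for t in tables)
    pooled_table = [
        [sum(cells) for cells in zip(*rows)]  # cell-wise sum across conditions
        for rows in zip(*tables)
    ]
    pooled = pearson_chi2(pooled_table)
    return total, pooled, total - pooled


if __name__ == "__main__":
    # Hypothetical counts: the association flips sign between the two conditions.
    rewired = [[[30, 10], [10, 30]], [[10, 30], [30, 10]]]
    print(heterogeneity_chi2(rewired))
```

In the toy input the two conditions show opposite associations, so the pooled table shows none and the whole signal ends up in the heterogeneity term. Two identical tables would give a heterogeneity term of zero.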
  • 24
    Publication Date: 2014-04-15
    Description: Sequence similarity search is a fundamental way of analyzing nucleotide sequences. Despite decades of research, this is not a solved problem because there exist many similarities that are not found by current methods. Search methods are typically based on a seed-and-extend approach, which has many variants (e.g. spaced seeds, transition seeds), and it remains unclear how to optimize this approach. This study designs and tests seeding methods for inter-mammal and inter-insect genome comparison. By considering substitution patterns of real genomes, we design sets of multiple complementary transition seeds, which have better performance (sensitivity per run time) than previous seeding strategies. Often the best seed patterns have more transition positions than those used previously. We also point out that recent computer memory sizes (e.g. 60 GB) make it feasible to use multiple (e.g. eight) seeds for whole mammal genomes. Interestingly, the most sensitive settings achieve diminishing returns for human–dog and melanogaster–pseudoobscura comparisons, but not for human–mouse, which suggests that we still miss many human–mouse alignments. Our optimized heuristics find ~20 000 new human–mouse alignments that are missing from the standard UCSC alignments. We tabulate seed patterns and parameters that work well so they can be used in future research.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
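The idea of a spaced seed with transition positions, as mentioned in this abstract, can be sketched as follows. The pattern alphabet used here is an assumption chosen for illustration: `1` requires an exact match, `0` ignores the position, and `T` accepts a match or a transition (A/G or C/T). The function names are hypothetical, and the brute-force scan stands in for the indexed lookup that a real aligner would use.

```python
# Purine-purine and pyrimidine-pyrimidine substitutions.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def seed_hit(pattern, x, y):
    """Return True if equal-length windows x and y satisfy the seed pattern."""
    if not (len(pattern) == len(x) == len(y)):
        raise ValueError("pattern and both windows must have the same length")
    for symbol, a, b in zip(pattern, x, y):
        if symbol == "1" and a != b:
            return False  # must-match position differs
        if symbol == "T" and a != b and (a, b) not in TRANSITIONS:
            return False  # transversion at a transition-tolerant position
        # symbol "0": any pair of bases is accepted
    return True


def find_seed_hits(pattern, query, target):
    """List all (query_offset, target_offset) pairs where the seed matches."""
    span = len(pattern)
    hits = []
    for i in range(len(query) - span + 1):
        for j in range(len(target) - span + 1):
            if seed_hit(pattern, query[i:i + span], target[j:j + span]):
                hits.append((i, j))
    return hits


if __name__ == "__main__":
    print(seed_hit("1T01", "ACGT", "ATAT"))    # C/T transition tolerated
    print(seed_hit("1T01", "ACGT", "AAGT"))    # C/A transversion rejected
    print(find_seed_hits("11", "ACG", "TACG"))
```

Using several complementary patterns, as the abstract describes, would amount to taking the union of the hits found with each pattern, at the cost of one index per pattern.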
  • 25
    Publication Date: 2014-04-15
    Description: Identifying differential features between conditions is a popular approach to understanding molecular features and their mechanisms underlying a biological process of particular interest. Although many tests for identifying differential expression of gene or gene sets have been proposed, there was limited success in developing methods for differential interactions of genes between conditions because of its computational complexity. We present a method for Evaluation of Dependency DifferentialitY (EDDY), which is a statistical test for differential dependencies of a set of genes between two conditions. Unlike previous methods focused on differential expression of individual genes or correlation changes of individual gene–gene interactions, EDDY compares two conditions by evaluating the probability distributions of dependency networks from genes. The method has been evaluated and compared with other methods through simulation studies, and application to glioblastoma multiforme data resulted in informative cancer and glioblastoma multiforme subtype-related findings. The comparison with Gene Set Enrichment Analysis, a differential expression-based method, revealed that EDDY identifies the gene sets that are complementary to those identified by Gene Set Enrichment Analysis. EDDY also showed much lower false positives than Gene Set Co-expression Analysis, a method based on correlation changes of individual gene–gene interactions, thus providing more informative results. The Java implementation of the algorithm is freely available to noncommercial users. Download from: http://biocomputing.tgen.org/software/EDDY .
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 26
    Publication Date: 2014-04-15
    Description: RGB marking and DNA barcoding are two cutting-edge technologies in the field of clonal cell marking. To combine the virtues of both approaches, we equipped LeGO vectors encoding red, green or blue fluorescent proteins with complex DNA barcodes carrying color-specific signatures. For these vectors, we generated highly complex plasmid libraries that were used for the production of barcoded lentiviral vector particles. In proof-of-principle experiments, we used barcoded vectors for RGB marking of cell lines and primary murine hepatocytes. We applied single-cell polymerase chain reaction to decipher barcode signatures of individual RGB-marked cells expressing defined color hues. This enabled us to prove clonal identity of cells with one and the same RGB color. Also, we made use of barcoded vectors to investigate clonal development of leukemia induced by ectopic oncogene expression in murine hematopoietic cells. In conclusion, by combining RGB marking and DNA barcoding, we have established a novel technique for the unambiguous genetic marking of individual cells in the context of normal regeneration as well as malignant outgrowth. Moreover, the introduction of color-specific signatures in barcodes will facilitate studies on the impact of different variables (e.g. vector type, transgenes, culture conditions) in the context of competitive repopulation studies.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 27
    Publication Date: 2014-04-15
    Description: Insertional oncogene activation and aberrant splicing have proved to be major setbacks for retroviral stem cell gene therapy. Integrase-deficient human immunodeficiency virus-1-derived vectors provide a potentially safer approach, but their circular genomes are rapidly lost during cell division. Here we describe a novel lentiviral vector (LV) that incorporates human β-interferon scaffold/matrix-associated region sequences to provide an origin of replication for long-term mitotic maintenance of the episomal LTR circles. The resulting ‘anchoring’ non-integrating lentiviral vector (aniLV) achieved initial transduction rates comparable with integrating vector followed by progressive establishment of long-term episomal expression in a subset of cells. Analysis of aniLV-transduced single cell-derived clones maintained without selective pressure for >100 rounds of cell division showed sustained transgene expression from episomes and provided molecular evidence for long-term episome maintenance. To evaluate aniLV performance in primary cells, we transduced lineage-depleted murine hematopoietic progenitor cells, observing GFP expression in clonogenic progenitor colonies and peripheral blood leukocyte chimerism following transplantation into conditioned hosts. In aggregate, our studies suggest that scaffold/matrix-associated region elements can serve as molecular anchors for non-integrating lentivector episomes, providing sustained gene expression through successive rounds of cell division and progenitor differentiation in vitro and in vivo.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 28
    Publication Date: 2014-11-12
    Description: Assembly of DNA ‘parts’ to create larger constructs is an essential enabling technique for bioengineering and synthetic biology. Here we describe a simple method, PaperClip, which allows flexible assembly of multiple DNA parts from currently existing libraries cloned in any vector. No restriction enzymes, mutagenesis of internal restriction sites, or reamplification to add end homology are required. Order of assembly is directed by double stranded oligonucleotides—‘Clips’. Clips are formed by ligation of pairs of oligonucleotides corresponding to the ends of each part. PaperClip assembly can be performed by polymerase chain reaction or by cell extract-mediated recombination. Once multi-use Clips have been prepared, assembly of at least six DNA parts in any order can be accomplished with high efficiency within several hours.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 29
    Publication Date: 2014-09-02
    Description: Inundation of evolutionary markers expedited in Human Genome Project and 1000 Genome Consortium has necessitated pruning of redundant and dependent variables. Various computational tools based on machine-learning and data-mining methods like feature selection/extraction have been proposed to escape the curse of dimensionality in large datasets. Incidentally, evolutionary studies, primarily based on sequentially evolved variations, have remained un-facilitated by such advances to date. Here, we present a novel approach of recursive feature selection for hierarchical clustering of Y-chromosomal SNPs/haplogroups to select a minimal set of independent markers, sufficient to infer population structure as precisely as deduced by a larger number of evolutionary markers. To validate the applicability of our approach, we optimally designed MALDI-TOF mass spectrometry-based multiplex to accommodate independent Y-chromosomal markers in a single multiplex and genotyped two geographically distinct Indian populations. An analysis of 105 world-wide populations reflected that 15 independent variations/markers were optimal in defining population structure parameters, such as F_ST, molecular variance and correlation-based relationship. A subsequent addition of randomly selected markers had a negligible effect (close to zero, i.e. 1 × 10⁻³) on these parameters. The study proves efficient in tracing complex population structures and deriving relationships among world-wide populations in a cost-effective and expedient manner.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 30
    Publication Date: 2014-09-17
    Description: Developing a quantitative view of how biological pathways are regulated in response to environmental factors is central for understanding of disease phenotypes. We present a computational framework, named Multivariate Inference of Pathway Activity (MIPA), which quantifies degree of activity induced in a biological pathway by computing five distinct measures from transcriptomic profiles of its member genes. Statistical significance of inferred activity is examined using multiple independent self-contained tests followed by a competitive analysis. The method incorporates a new algorithm to identify a subset of genes that may regulate the extent of activity induced in a pathway. We present an in-depth evaluation of specificity, robustness, and reproducibility of our method. We benchmarked MIPA's false positive rate at less than 1%. Using transcriptomic profiles representing distinct physiological and disease states, we illustrate applicability of our method in (i) identifying gene–gene interactions in autophagy-dependent response to Salmonella infection, (ii) uncovering gene–environment interactions in host response to bacterial and viral pathogens and (iii) identifying driver genes and processes that contribute to wound healing and response to anti-TNFα therapy. We provide relevant experimental validation that corroborates the accuracy and advantage of our method.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 31
    Publication Date: 2014-09-17
    Description: Viral recombination is a key evolutionary mechanism, aiding escape from host immunity, contributing to changes in tropism and possibly assisting transmission across species barriers. The ability to determine whether recombination has occurred and to locate associated specific recombination junctions is thus of major importance in understanding emerging diseases and pathogenesis. This paper describes a method for determining recombinant mosaics (and their proportions) originating from two parent genomes, using high-throughput sequence data. The method involves setting the problem geometrically and the use of appropriately constrained quadratic programming. Recombinants of the honeybee deformed wing virus and the Varroa destructor virus-1 are inferred to illustrate the method from both siRNAs and reads sampling the viral genome population (cDNA library); our results are confirmed experimentally. Matlab software (MosaicSolver) is available.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
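Estimating mosaic proportions by constrained quadratic programming, as this abstract describes, can be sketched as a least-squares fit over the probability simplex. Everything specific below is an assumption made for illustration: each candidate mosaic is encoded as a 0/1 vector (1 means the site carries the parent A allele), the observation is the fraction of reads supporting parent A at each site, and the solver is plain projected gradient descent rather than whatever MosaicSolver uses. The function names are hypothetical.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w_i >= 0, sum(w) = 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for j, u_j in enumerate(u, start=1):
        cumulative += u_j
        candidate = (cumulative - 1.0) / j
        if u_j - candidate > 0:
            theta = candidate  # keep the threshold of the last valid index
    return [max(x - theta, 0.0) for x in v]


def estimate_proportions(mosaics, observed, iterations=5000):
    """Minimize sum_j (sum_k w_k * mosaics[k][j] - observed[j])^2 over the simplex."""
    k, n = len(mosaics), len(observed)
    # The trace of M M^T bounds its largest eigenvalue, so this step size is safe.
    trace = sum(x * x for row in mosaics for x in row) or 1.0
    step = 1.0 / (2.0 * trace)
    w = [1.0 / k] * k
    for _ in range(iterations):
        predicted = [sum(w[i] * mosaics[i][j] for i in range(k)) for j in range(n)]
        residual = [predicted[j] - observed[j] for j in range(n)]
        gradient = [
            2.0 * sum(mosaics[i][j] * residual[j] for j in range(n))
            for i in range(k)
        ]
        w = project_to_simplex([w[i] - step * gradient[i] for i in range(k)])
    return w


if __name__ == "__main__":
    # Hypothetical candidates over four sites: pure A, pure B, and an A-to-B recombinant.
    mosaics = [[1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 0, 0]]
    observed = [0.8, 0.8, 0.5, 0.5]  # fraction of reads matching parent A per site
    print(estimate_proportions(mosaics, observed))
```

The non-negativity and sum-to-one constraints are what make this a constrained quadratic program. For problems of realistic size a dedicated QP solver would be the more sensible choice than this fixed-step iteration.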
  • 32
    Publication Date: 2014-08-15
    Description: Synthetic biology has significantly advanced the design of mammalian trigger-inducible transgene-control devices that are able to programme complex cellular behaviour. Fruit-based benzoate derivatives licensed as food additives, such as flavours (e.g. vanillate) and preservatives (e.g. benzoate), are a particularly attractive class of trigger compounds for orthogonal mammalian transgene control devices because of their innocuousness, physiological compatibility and simple oral administration. Capitalizing on the genetic componentry of the soil bacterium Comamonas testosteroni, which has evolved to catabolize a variety of aromatic compounds, we have designed different mammalian gene expression systems that could be induced and repressed by the food additives benzoate and vanillate. When implanting designer cells engineered for gene switch-driven expression of the human placental secreted alkaline phosphatase (SEAP) into mice, blood SEAP levels of treated animals directly correlated with a benzoate-enriched drinking programme. Additionally, the benzoate-/vanillate-responsive device was compatible with other transgene control systems and could be assembled into higher-order control networks providing expression dynamics reminiscent of a lap-timing stopwatch. Designer gene switches using licensed food additives as trigger compounds to achieve antagonistic dual-input expression profiles and provide novel control topologies and regulation dynamics may advance future gene- and cell-based therapies.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 33
    Publication Date: 2014-08-01
    Description: Gene set enrichment testing can enhance the biological interpretation of ChIP-seq data. Here, we develop a method, ChIP-Enrich, for this analysis which empirically adjusts for gene locus length (the length of the gene body and its surrounding non-coding sequence). Adjustment for gene locus length is necessary because it is often positively associated with the presence of one or more peaks and because many biologically defined gene sets have an excess of genes with longer or shorter gene locus lengths. Unlike alternative methods, ChIP-Enrich can account for the wide range of gene locus length-to-peak presence relationships (observed in ENCODE ChIP-seq data sets). We show that ChIP-Enrich has a well-calibrated type I error rate using permuted ENCODE ChIP-seq data sets; in contrast, two commonly used gene set enrichment methods, Fisher's exact test and the binomial test implemented in Genomic Regions Enrichment of Annotations Tool (GREAT), can have highly inflated type I error rates and biases in ranking. We identify DNA-binding proteins, including CTCF, JunD and glucocorticoid receptor α (GRα), that show different enrichment patterns for peaks closer to versus further from transcription start sites. We also identify known and potential new biological functions of GRα. ChIP-Enrich is available as a web interface ( http://chip-enrich.med.umich.edu ) and Bioconductor package.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 34
    Publication Date: 2013-09-26
    Description: Tandem repeats (TRs) are often present in proteins with crucial functions, responsible for resistance, pathogenicity and associated with infectious or neurodegenerative diseases. This motivates numerous studies of TRs and their evolution, requiring accurate multiple sequence alignment. TRs may be lost or inserted at any position of a TR region by replication slippage or recombination, but current methods assume fixed unit boundaries, and yet are of high complexity. We present a new global graph-based alignment method that does not restrict TR unit indels by unit boundaries. TR indels are modeled separately and penalized using the phylogeny-aware alignment algorithm. This ensures enhanced accuracy of reconstructed alignments, disentangling TRs and measuring indel events and rates in a biologically meaningful way. Our method detects not only duplication events but also all changes in TR regions owing to recombination, strand slippage and other events inserting or deleting TR units. We evaluate our method by simulation incorporating TR evolution, by either sampling TRs from a profile hidden Markov model or by mimicking strand slippage with duplications. The new method is illustrated on a family of type III effectors, a pathogenicity determinant in agriculturally important bacteria Ralstonia solanacearum. We show that TR indel rate variation contributes to the diversification of this protein family.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 35
    Publication Date: 2013-09-26
    Description: In reverse genetics, a gene’s function is elucidated through targeted modifications in the coding region or associated DNA cis-regulatory elements. To this purpose, recently developed customizable transcription activator-like effector nucleases (TALENs) have proven an invaluable tool, allowing introduction of double-strand breaks at predetermined sites in the genome. Here we describe a practical and efficient method for the targeted genome engineering in Drosophila. We demonstrate TALEN-mediated targeted gene integration and efficient identification of mutant flies using a traceable marker phenotype. Furthermore, we developed an easy TALEN assembly (easyT) method relying on simultaneous reactions of DNA BaeI digestion and ligation, enabling construction of complete TALENs from a monomer unit library in a single day. Taken together, our strategy with easyT and TALEN-plasmid microinjection simplifies mutant generation and enables isolation of desired mutant fly lines in the F1 generation.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 36
    Publication Date: 2013-06-08
    Description: Transcription activator-like effector nucleases (TALENs) are a powerful new approach for targeted gene disruption in various animal models, but little is known about their activities in Mus musculus, the widely used mammalian model organism. Here, we report that direct injection of in vitro transcribed messenger RNA of TALEN pairs into mouse zygotes induced somatic mutations, which were stably passed to the next generation through germ-line transmission. With one TALEN pair constructed for each of 10 target genes, mutant F0 mice for each gene were obtained, with mutation rates ranging from 13 to 67% and averaging ~40% of total healthy newborns, with no significant differences between the C57BL/6 and FVB/N genetic backgrounds. One TALEN pair with a single mismatch to its intended target sequence on each side failed to yield any mutation. Furthermore, highly efficient germ-line transmission was obtained, as all the F0 founders tested transmitted the mutations to F1 mice. In addition, we observed that one bi-allelic mutant founder of the Lepr gene, encoding the leptin receptor, had a diabetic phenotype similar to that of the db/db mouse. Together, our results suggest that TALENs are an effective genetic tool for rapid gene disruption with high efficiency and heritability in mice with distinct genetic backgrounds.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 37
    Publication Date: 2013-06-08
    Description: We describe a new cell-free protein synthesis (CFPS) method for site-specific incorporation of non-natural amino acids (nnAAs) into proteins in which the orthogonal tRNA (o-tRNA) and the modified protein (i.e. the protein containing the nnAA) are produced simultaneously. Using this method, 0.9–1.7 mg/ml of modified soluble super-folder green fluorescent protein (sfGFP) containing either p-azido-L-phenylalanine (pAzF) or p-propargyloxy-L-phenylalanine (pPaF) accumulated in the CFPS solutions; these yields correspond to 50–88% suppression efficiency. The o-tRNA can be transcribed either from a linearized plasmid or from a crude PCR product. Comparison of two different o-tRNAs suggests that the new platform is not limited by Ef-Tu recognition of the acylated o-tRNA at sufficiently high o-tRNA template concentrations. Analysis of nnAA incorporation across 12 different sites in sfGFP suggests that modified protein yields and suppression efficiencies (i.e. the position effect) do not correlate with any of the reported trends. Sites that were ineffectively suppressed with the original o-tRNA were better suppressed with an optimized o-tRNA (o-tRNAopt) that was evolved to be better recognized by Ef-Tu. This new platform can also be used to screen scissile ribozymes for improved catalysis.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 38
    Publication Date: 2013-06-08
    Description: The introduction of next generation sequencing methods in genome studies has made it possible to shift research from a gene-centric approach to a genome wide view. Although methods and tools to detect single nucleotide polymorphisms are becoming more mature, methods to identify and visualize structural variation (SV) are still in their infancy. Most genome browsers can only compare a given sequence to a reference genome; therefore, direct comparison of multiple individuals still remains a challenge. Therefore, the implementation of efficient approaches to explore and visualize SVs and directly compare two or more individuals is desirable. In this article, we present a visualization approach that uses space-filling Hilbert curves to explore SVs based on both read-depth and pair-end information. An interactive open-source Java application, called Meander, implements the proposed methodology, and its functionality is demonstrated using two cases. With Meander, users can explore variations at different levels of resolution and simultaneously compare up to four different individuals against a common reference. The application was developed using Java version 1.6 and Processing.org and can be run on any platform. It can be found at http://homes.esat.kuleuven.be/~bioiuser/meander .
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
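The space-filling layout mentioned in this abstract can be illustrated with the standard conversion from a one-dimensional index to coordinates on a Hilbert curve. This is a minimal sketch, not code from Meander (which is a Java application): the function names, the binning of genomic positions and the grid size are all assumptions made for the example.

```python
def hilbert_d2xy(n, d):
    """Map index d (0 <= d < n*n) to (x, y) on an n-by-n Hilbert curve.

    n must be a power of two.
    """
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    if not 0 <= d < n * n:
        raise ValueError("d out of range")
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:
                # Flip the sub-square before the swap below.
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def position_to_cell(position, bin_size, n):
    """Place a genomic position into a grid cell via fixed-size bins (hypothetical helper)."""
    return hilbert_d2xy(n, position // bin_size)


if __name__ == "__main__":
    # First four cells of the smallest curve, then one binned position.
    print([hilbert_d2xy(2, d) for d in range(4)])
    print(position_to_cell(2500, 1000, 4))
```

Consecutive indices always land in neighbouring cells, which is why a contiguous region such as a deletion or duplication shows up as a compact patch in the image instead of a thin streak.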
  • 39
    Publication Date: 2013-02-20
    Description: While it has been long recognized that genes are not randomly positioned along the genome, the degree to which its 3D structure influences the arrangement of genes has remained elusive. In particular, several lines of evidence suggest that actively transcribed genes are spatially co-localized, forming transcription factories; however, a generalized systematic test has hitherto not been described. Here we reveal transcription factories using a rigorous definition of genomic structure based on Saccharomyces cerevisiae chromosome conformation capture data, coupled with an experimental design controlling for the primary gene order. We develop a data-driven method for the interpolation and the embedding of such datasets and introduce statistics that enable the comparison of the spatial and genomic densities of genes. Combining these, we report evidence that co-regulated genes are clustered in space, beyond their observed clustering in the context of gene order along the genome and show this phenomenon is significant for 64 out of 117 transcription factors. Furthermore, we show that those transcription factors with high spatially co-localized targets are expressed higher than those whose targets are not spatially clustered. Collectively, our results support the notion that, at a given time, the physical density of genes is intimately related to regulatory activity.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 40
    Publication Date: 2013-12-07
    Description: The two-step process of selection and counter-selection is a standard way to enable genetic modification and engineering of bacterial genomes using homologous recombination methods. The tetA and sacB genes are contained in a DNA cassette and confer a novel dual counter-selection system. Expression of tetA confers bacterial resistance to tetracycline (TcR) and also causes sensitivity to the lipophilic chelator fusaric acid; sacB causes sensitivity to sucrose. These two genes are introduced as a joint DNA cassette into Escherichia coli by selection for TcR. A medium containing both fusaric acid and sucrose has been developed, in which coexpression of tetA-sacB is orders of magnitude more sensitive as a counter-selection agent than either gene alone. In conjunction with the homologous recombination methods of recombineering and P1 transduction, this powerful system has been used to select changes in the bacterial genome that cannot be directly detected by other counter-selection systems.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 41
    Publication Date: 2013-09-06
    Description: Protein-binding microarray (PBM) is a high-throughput platform that can measure the DNA-binding preference of a protein in a comprehensive and unbiased manner. A typical PBM experiment can measure binding signal intensities of a protein to all the possible DNA k-mers (k = 8–10); such comprehensive binding affinity data usually need to be reduced and represented as motif models before they can be further analyzed and applied. Since proteins can often bind to DNA in multiple modes, one of the major challenges is to decompose the comprehensive affinity data into multimodal motif representations. Here, we describe a new algorithm that uses Hidden Markov Models (HMMs) and can derive precise and multimodal motifs using belief propagations. We describe an HMM-based approach using belief propagations (kmerHMM), which accepts and preprocesses PBM probe raw data into median-binding intensities of individual k-mers. The k-mers are ranked and aligned for training an HMM as the underlying motif representation. Multiple motifs are then extracted from the HMM using belief propagations. Comparisons of kmerHMM with other leading methods on several data sets demonstrated its effectiveness and uniqueness. In particular, it achieved the best performance on more than half of the data sets. In addition, the multiple binding modes derived by kmerHMM are biologically meaningful and will be useful in interpreting other genome-wide data such as those generated from ChIP-seq. The executables and source codes are available at the authors’ websites: e.g. http://www.cs.toronto.edu/~wkc/kmerHMM .
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 42
    Publication Date: 2013-09-06
    Description: We developed a framework for quick and reliable construction of complex gene circuits for genetically engineering mammalian cells. Our hierarchical framework is based on a novel nucleotide addressing system for defining the position of each part in an overall circuit. With this framework, we demonstrate construction of synthetic gene circuits of up to 64 kb in size comprising 11 transcription units and 33 basic parts. We show robust gene expression control of multiple transcription units by small molecule inducers in human cells with transient transfection and stable chromosomal integration of these circuits. This framework enables development of complex gene circuits for engineering mammalian cells with unprecedented speed, reliability and scalability and should have broad applicability in a variety of areas including mammalian cell fermentation, cell fate reprogramming and cell-based assays.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 43
    Publication Date: 2013-07-16
    Description: Synthetic biology has significantly advanced the design of synthetic control devices, gene circuits and networks that can reprogram mammalian cells in a trigger-inducible manner. Prokaryotic helix-turn-helix motifs have become the standard resource to design synthetic mammalian transcription factors that tune chimeric promoters in a small molecule-responsive manner. We have identified a family of Actinomycetes transcriptional repressor proteins showing a tandem TetR-family signature and have used a synthetic biology-inspired approach to reveal the potential control dynamics of these bi-partite regulators. Daisy-chain assembly of well-characterized prokaryotic repressor proteins such as TetR, ScbR, TtgR or VanR and fusion to either the Herpes simplex transactivation domain VP16 or the Krueppel-associated box domain (KRAB) of the human kox-1 gene resulted in synthetic bi- and even tri-partite mammalian transcription factors that could reversibly program their individual chimeric or hybrid promoters for trigger-adjustable transgene expression using tetracycline (TET), γ-butyrolactones, phloretin and vanillic acid. Detailed characterization of the bi-partite ScbR-TetR-VP16 (ST-TA) transcription factor revealed independent control of TET- and γ-butyrolactone-responsive promoters at high and double-pole double-throw (DPDT) relay switch qualities at low intracellular concentrations. Similar to electromagnetically operated mechanical DPDT relay switches that control two electric circuits by a fully isolated low-power signal, TET programs ST-TA to progressively switch from TetR-specific promoter-driven expression of transgene one to ScbR-specific promoter-driven transcription of transgene two while ST-TA flips back to exclusive transgene one expression in the absence of the trigger antibiotic. We suggest that natural repressors and activators with tandem TetR-family signatures may also provide independent as well as DPDT-mediated control of two sets of transgenes in bacteria, and that their synthetic transcription-factor analogs may enable the design of compact therapeutic gene circuits for gene and cell-based therapies.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 44
    Publication Date: 2013-06-08
    Description: An appreciable fraction of introns is thought to have some function, but there is no obvious way to predict which specific intron is likely to be functional. We hypothesize that functional introns experience a different selection regime than non-functional ones and will therefore show distinct evolutionary histories. In particular, we expect functional introns to be more resistant to loss, and that this would be reflected in high conservation of their position with respect to the coding sequence. To test this hypothesis, we focused on introns whose function comes about from microRNAs and snoRNAs that are embedded within their sequence. We built a data set of orthologous genes across 28 eukaryotic species, reconstructed the evolutionary histories of their introns and compared functional introns with the rest of the introns. We found that, indeed, the position of microRNA- and snoRNA-bearing introns is significantly more conserved. In addition, we found that both families of RNA genes settled within introns early during metazoan evolution. We identified several easily computable intronic properties that can be used to detect functional introns in general, thereby suggesting a new strategy to pinpoint non-coding cellular functions.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 45
    Publication Date: 2013-11-21
    Description: Traditional methods that aim to identify biomarkers that distinguish between two groups, like Significance Analysis of Microarrays or the t-test, perform optimally when such biomarkers show homogeneous behavior within each group and differential behavior between the groups. However, in many applications, this is not the case. Instead, a subgroup of samples in one group shows differential behavior with respect to all other samples. To successfully detect markers showing such imbalanced patterns of differential signal, a different approach is required. We propose a novel method, specifically designed for the Detection of Imbalanced Differential Signal (DIDS). We use an artificial dataset and a human breast cancer dataset to measure its performance and compare it with three traditional methods and four approaches that take imbalanced signal into account. Supported by extensive experimental results, we show that DIDS outperforms all other approaches in terms of power and positive predictive value. In a mouse breast cancer dataset, DIDS is the only approach that detects a functionally validated marker of chemotherapy resistance. DIDS can be applied to any continuous value data, including gene expression data, and in any context where imbalanced differential signal is manifested.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 46
    Publication Date: 2013-01-20
    Description: Identification of differentially expressed subnetworks from protein–protein interaction (PPI) networks has become increasingly important to our global understanding of the molecular mechanisms that drive cancer. Several methods have been proposed for PPI subnetwork identification, but the dependency among network member genes is not explicitly considered, leaving many important hub genes largely unidentified. We present a new method, based on a bagging Markov random field (BMRF) framework, to improve subnetwork identification for mechanistic studies of breast cancer. The method follows a maximum a posteriori principle to form a novel network score that explicitly considers pairwise gene interactions in PPI networks, and it searches for subnetworks with maximal network scores. To improve their robustness across data sets, a bagging scheme based on bootstrapping samples is implemented to statistically select high confidence subnetworks. We first compared the BMRF-based method with existing methods on simulation data to demonstrate its improved performance. We then applied our method to breast cancer data to identify PPI subnetworks associated with breast cancer progression and/or tamoxifen resistance. The experimental results show that not only an improved prediction performance can be achieved by the BMRF approach when tested on independent data sets, but biologically meaningful subnetworks can also be revealed that are relevant to breast cancer and tamoxifen resistance.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 47
    Publication Date: 2013-01-20
    Description: miRDeep and its varieties are widely used to quantify known and novel microRNA (miRNA) from small RNA sequencing (RNAseq). This article describes miRDeep*, our integrated miRNA identification tool, which is modeled on miRDeep but improves the precision of detecting novel miRNAs by introducing new strategies to identify precursor miRNAs. miRDeep* has a user-friendly graphic interface and accepts raw data in FastQ and Sequence Alignment Map (SAM) or the binary equivalent (BAM) format. Known and novel miRNA expression levels, as measured by the number of reads, are displayed in an interface, which shows each RNAseq read relative to the pre-miRNA hairpin. The secondary pre-miRNA structure and read locations for each predicted miRNA are shown and kept in a separate figure file. Moreover, the target genes of known and novel miRNAs are predicted using the TargetScan algorithm, and the targets are ranked according to the confidence score. miRDeep* is an integrated standalone application where sequence alignment, pre-miRNA secondary structure calculation and graphical display are purely Java coded. This application tool can be executed using a normal personal computer with 1.5 GB of memory. Further, we show that miRDeep* outperformed existing miRNA prediction tools using our LNCaP and other small RNAseq datasets. miRDeep* is freely available online at http://www.australianprostatecentre.org/research/software/mirdeep-star .
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 48
    Publication Date: 2013-01-20
    Description: The mRNA export complex TREX is known to contain Aly, UAP56, Tex1 and the THO complex, among which UAP56 is required for TREX assembly. Here, we systematically investigated the role of each human TREX component in TREX assembly and its association with the mRNA. We found that Tex1 is essentially a subunit of the THO complex. Aly, THO and UAP56 are all required for assembly of TREX, in which Aly directly interacts with THO subunits Thoc2 and Thoc5. Both Aly and THO function in linking UAP56 to the cap-binding protein CBP80. Interestingly, association of UAP56 with the spliced mRNA, but not with the pre-mRNA, requires Aly and THO. Unexpectedly, we found that Aly and THO require each other to associate with the spliced mRNA. Consistent with these biochemical results, similar to Aly and UAP56, THO plays critical roles in mRNA export. Together, we propose that Aly, THO and UAP56 form a highly integrated unit to associate with the spliced mRNA and function in mRNA export.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 49
    Publication Date: 2013-01-20
    Description: Synthetic RNA control devices that use ribozymes as gene-regulatory components have been applied to controlling cellular behaviors in response to environmental signals. Quantitative measurement of the in vitro cleavage rate constants associated with ribozyme-based devices is essential for advancing the molecular design and optimization of this class of gene-regulatory devices. One of the key challenges encountered in ribozyme characterization is the efficient generation of full-length RNA from in vitro transcription reactions, where conditions generally lead to significant ribozyme cleavage. Current methods for generating full-length ribozyme-encoding RNA rely on a trans-blocking strategy, which requires a laborious gel separation and extraction step. Here, we develop a simple two-step gel-free process including cis-blocking and trans-activation steps to support scalable generation of functional full-length ribozyme-encoding RNA. We demonstrate our strategy on various types of natural ribozymes and synthetic ribozyme devices, and the cleavage rate constants obtained for the RNA generated from our strategy are comparable with those generated through traditional methods. We further develop a rapid, label-free ribozyme cleavage assay based on surface plasmon resonance, which allows continuous, real-time monitoring of ribozyme cleavage. The surface plasmon resonance-based characterization assay will complement the versatile cis-blocking and trans-activation strategy to broadly advance our ability to characterize and engineer ribozyme-based devices.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 50
    Publication Date: 2013-08-28
    Description: The ability to artificially control transcription is essential both to the study of gene function and to the construction of synthetic gene networks with desired properties. Cas9 is an RNA-guided double-stranded DNA nuclease that participates in the CRISPR-Cas immune defense against prokaryotic viruses. We describe the use of a Cas9 nuclease mutant that retains DNA-binding activity and can be engineered as a programmable transcription repressor by preventing the binding of the RNA polymerase (RNAP) to promoter sequences or as a transcription terminator by blocking the running RNAP. In addition, a fusion between the omega subunit of the RNAP and a Cas9 nuclease mutant directed to bind upstream promoter regions can achieve programmable transcription activation. The simple and efficient modulation of gene expression achieved by this technology is a useful asset for the study of gene networks and for the development of synthetic biology and biotechnological applications.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 51
    Publication Date: 2013-08-28
    Description: Antisense RNA transcription attenuators are a key component of the synthetic biology toolbox, with their ability to serve as building blocks for both signal integration logic circuits and transcriptional cascades. However, a central challenge to building more sophisticated RNA genetic circuitry is creating larger families of orthogonal attenuators that function independently of each other. Here, we overcome this challenge by developing a modular strategy to create chimeric fusions between the engineered transcriptional attenuator from plasmid pT181 and natural antisense RNA translational regulators. Using in vivo gene expression assays in Escherichia coli, we demonstrate our ability to create chimeric attenuators by fusing sequences from five different translational regulators. Mutagenesis of these functional attenuators allowed us to create a total of 11 new chimeric attenuators. A comprehensive orthogonality test of these culminated in a 7 × 7 matrix of mutually orthogonal regulators. A comparison between all chimeras tested led to design principles that will facilitate further engineering of orthogonal RNA transcription regulators, and may help elucidate general principles of non-coding RNA regulation. We anticipate that our strategy will accelerate the development of even larger families of orthogonal RNA transcription regulators, and thus create breakthroughs in our ability to construct increasingly sophisticated RNA genetic circuitry.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 52
    Publication Date: 2013-04-23
    Description: Studying complex biological processes such as cancer development, stem cell induction and transdifferentiation requires the modulation of multiple genes or pathways at one time in a single cell. Herein, we describe straightforward methods for rapid and efficient assembly of bacterial marker free multigene cassettes containing up to six complementary DNAs/short hairpin RNAs. We have termed this method RecWay assembly, as it makes use of both Cre recombinase and the commercially available Gateway cloning system. Further, because RecWay assembly uses truly modular components, it allows for the generation of randomly assembled multigene vector libraries. These multigene vectors are integratable, and later excisable, using the highly efficient piggyBac (PB) DNA transposon system. Moreover, we have dramatically improved the expression of stably integrated multigene vectors by incorporation of insulator elements to prevent promoter interference seen with multigene vectors. We demonstrate that insulated multigene PB transposons can stably integrate and faithfully express up to five fluorescent proteins and the puromycin-thymidine kinase resistance gene in vitro, with up to 70-fold higher gene expression compared with analogous uninsulated vectors. RecWay assembly of multigene transposon vectors allows for widely applicable modelling of highly complex biological processes and can be easily performed by other research laboratories.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 53
    Publication Date: 2013-04-23
    Description: Techniques for assembly of designed DNA sequences are important for synthetic biology. So far, a few methods have been developed towards high-throughput seamless DNA assembly in vitro, including both the homologous sequences-based system and the type IIS-mediated system. Here, we describe a novel method designated ‘MASTER Ligation’, by which multiple DNA sequences can be seamlessly assembled through a simple and sequence-independent hierarchical procedure. The key restriction endonuclease used, MspJI, shares both type IIM and type IIS properties; thus, it only recognizes the methylation-specific 4-bp sites, mCNNR (R = A or G), and cuts DNA outside of the recognition sequences. This method was tested via successful assembly of either multiple polymerase chain reaction amplicons or restriction fragments of the actinorhodin biosynthetic cluster of Streptomyces coelicolor (~29 kb), which was further heterologously expressed in a fast-growing and moderately thermophilic strain, Streptomyces sp. 4F.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 54
    Publication Date: 2013-02-20
    Description: Bacterial operons are nature’s tool for regulating and coordinating multi-gene expression in prokaryotes. They are also a gene architecture commonly used in the biosynthesis of many pharmaceutically important compounds and industrially useful chemicals. Despite being an important eukaryotic production host, Saccharomyces cerevisiae has never had such gene architecture. Here, we report the development of a system to assemble and regulate a multi-gene pathway in S. cerevisiae. Full pathways can be constructed using pre-made parts from a plasmid toolbox. Subsequently, through the use of a yeast strain containing a stably integrated gene switch, the assembled pathway can be regulated using a readily available and inexpensive compound, estradiol, with extremely high sensitivity (10 nM). To demonstrate the use of the system, we assembled the five-gene zeaxanthin biosynthetic pathway in a single step and showed the ligand-dependent coordinated expression of all five genes as well as the tightly regulated production of zeaxanthin. Compared with a previously reported constitutive zeaxanthin pathway, our inducible pathway was shown to have a 50-fold higher production level.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 55
    Publication Date: 2013-02-20
    Description: High-throughput sequencing is increasingly being used in combination with bisulfite (BS) assays to study DNA methylation at nucleotide resolution. Although several programmes provide genome-wide alignment of BS-treated reads, the resulting information is not readily interpretable and often requires further bioinformatic steps for meaningful analysis. Current post-alignment BS-sequencing programmes are generally focused on the gene-specific level, a restrictive feature when analysis in the non-coding regions, such as enhancers and intergenic microRNAs, is required. Here, we present Genome Bisulfite Sequencing Analyser (GBSA— http://ctrad-csi.nus.edu.sg/gbsa ), a free open-source software capable of analysing whole-genome bisulfite sequencing data with either a gene-centric or gene-independent focus. Through analysis of the largest published data sets to date, we demonstrate GBSA’s features in providing sequencing quality assessment, methylation scoring, functional data management and visualization of genomic methylation at nucleotide resolution. Additionally, we show that GBSA’s output can be easily integrated with other high-throughput sequencing data, such as RNA-Seq or ChIP-seq, to elucidate the role of methylated intergenic regions in gene regulation. In essence, GBSA allows an investigator to explore not only known loci but also all the genomic regions, for which methylation studies could lead to the discovery of new regulatory mechanisms.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 56
    Publication Date: 2013-02-20
    Description: Computationally identifying effective biomarkers for cancers from gene expression profiles is an important and challenging task. The challenge lies in the complicated pathogenesis of cancers that often involve the dysfunction of many genes and regulatory interactions. Thus, a sophisticated classification model is urgently needed. In this study, we proposed an efficient approach, called ellipsoidFN (ellipsoid Feature Net), to model the disease complexity by ellipsoids and seek a set of heterogeneous biomarkers. Our approach achieves a non-linear classification scheme for the mixed samples by the ellipsoid concept, and at the same time uses a linear programming framework to efficiently select biomarkers from high-dimensional space. ellipsoidFN reduces the redundancy and improves the complementariness between the identified biomarkers, thus significantly enhancing the distinctiveness between cancers and normal samples, and even between cancer types. Numerical evaluation on real prostate cancer, breast cancer and leukemia gene expression datasets suggested that ellipsoidFN outperforms the state-of-the-art biomarker identification methods, and it can serve as a useful tool for cancer biomarker identification in the future. The Matlab code of ellipsoidFN is freely available from http://doc.aporc.org/wiki/EllipsoidFN .
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 57
    Publication Date: 2013-02-20
    Description: Zinc-finger nucleases (ZFNs) have been used for genome engineering in a wide variety of organisms; however, it remains challenging to design effective ZFNs for many genomic sequences using publicly available zinc-finger modules. This limitation is in part because of potential finger–finger incompatibility generated on assembly of modules into zinc-finger arrays (ZFAs). Herein, we describe the validation of a new set of two-finger modules that can be used for building ZFAs via conventional assembly methods or a new strategy, finger stitching, that increases the diversity of genomic sequences targetable by ZFNs. Instead of assembling ZFAs based on units of the zinc-finger structural domain, our finger stitching method uses units that span the finger–finger interface to ensure compatibility of neighbouring recognition helices. We tested this approach by generating and characterizing eight ZFAs, and we found their DNA-binding specificities reflected the specificities of the component modules used in their construction. Four pairs of ZFNs incorporating these ZFAs generated targeted lesions in vivo, demonstrating that stitching yields ZFAs with robust recognition properties.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 58
    Publication Date: 2013-02-02
    Description: Designing effective antisense sequences is a formidable problem. A method for predicting efficacious antisense holds the potential to provide fundamental insight into this biophysical process. More practically, such an understanding increases the chance of successful antisense design as well as saving considerable time, money and labor. The secondary structure of an mRNA molecule is believed to be in a constant state of flux, sampling several different suboptimal states. We hypothesized that particularly volatile regions might provide better accessibility for antisense targeting. A computational framework, GenAVERT, was developed to evaluate this hypothesis. GenAVERT used UNAFold and RNAforester to generate and compare the predicted suboptimal structures of mRNA sequences. Subsequent analysis revealed regions that were particularly volatile in terms of intramolecular hydrogen bonding, and thus potentially superior antisense targets due to their high accessibility. Several mRNA sequences with known natural antisense target sites as well as artificial antisense target sites were evaluated. Upon comparison, antisense sequences predicted based upon the volatility hypothesis closely matched those of the naturally occurring antisense, as well as those artificial target sites that provided efficient down-regulation. These results suggest that this strategy may provide a powerful new approach to antisense design.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 59
    Publication Date: 2013-02-02
    Description: The existence of some extra-genetic (epigenetic) codes has been postulated since the discovery of the primary genetic code. Evident effects of histone post-translational modifications or DNA methylation on the efficiency and the regulation of DNA processes support this postulation. EMdeCODE is an original algorithm that approximates the genomic distribution of given DNA features (e.g. promoter, enhancer, viral integration) by identifying relevant ChIPSeq profiles of post-translational histone marks or DNA binding proteins and combining them in a supermark. The EMdeCODE kernel is essentially a two-step procedure: (i) an expectation-maximization process calculates the mixture of epigenetic factors that maximizes the sensitivity (recall) of the association with the feature under study; (ii) the approximated density is then recursively trimmed with respect to a control dataset to increase the precision by reducing the number of false positives. EMdeCODE densities significantly improve the prediction of enhancer loci and retroviral integration sites with respect to previous methods. Importantly, it can also be used to extract distinctive factors between two arbitrary conditions. Indeed, EMdeCODE identifies unexpected epigenetic profiles specific for coding versus non-coding RNA, pointing towards a new role for H3R2me1 in coding regions.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 60
    Publication Date: 2013-02-02
    Description: Insertion and deletion polymorphisms (indels) are an important source of genomic variation in plant and animal genomes, but accurate genotyping from low-coverage and exome next-generation sequence data remains challenging. We introduce an efficient population clustering algorithm for diploids and polyploids which was tested on a dataset of 2000 exomes. Compared with existing methods, we report a 4-fold reduction in overall indel genotype error rates with a 9-fold reduction in low coverage regions.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 61
    Publication Date: 2013-02-02
    Description: Conventional approaches to predict transcriptional regulatory interactions usually rely on the definition of a shared motif sequence on the target genes of a transcription factor (TF). These efforts have been frustrated by the limited availability and accuracy of TF binding site motifs, usually represented as position-specific scoring matrices, which may match large numbers of sites and produce an unreliable list of target genes. To improve the prediction of binding sites, we propose to additionally use the unrelated knowledge of the genome layout. Indeed, it has been shown that co-regulated genes tend to be either neighbors or periodically spaced along the whole chromosome. This study demonstrates that respective gene positioning carries significant information. This novel type of information is combined with traditional sequence information by a machine learning algorithm called PreCisIon. To optimize this combination, PreCisIon builds a strong gene target classifier by adaptively combining weak classifiers based on either local binding sequence or global gene position. This strategy generically paves the way to the optimized incorporation of any future advances in gene target prediction based on local sequence, genome layout or on novel criteria. With the current state of the art, PreCisIon consistently improves methods based on sequence information only. This is shown by implementing a cross-validation analysis of the 20 major TFs from two phylogenetically remote model organisms. For Bacillus subtilis and Escherichia coli, respectively, PreCisIon achieves on average an area under the receiver operating characteristic curve of 70 and 60%, a sensitivity of 80 and 70% and a specificity of 60 and 56%. The newly predicted gene targets are demonstrated to be functionally consistent with previously known targets, as assessed by analysis of Gene Ontology enrichment or of the relevant literature and databases.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 62
    Publication Date: 2013-02-02
    Description: microRNAs (miRNAs) are short non-coding regulatory RNA molecules. The activity of a miRNA in a biological process can often be reflected in the expression program that characterizes the outcome of the activity. We introduce a computational approach that infers such activity from high-throughput data using a novel statistical methodology, called minimum-mHG (mmHG), that examines mutual enrichment in two ranked lists. Based on this methodology, we provide a user-friendly web application that supports the statistical assessment of miRNA target enrichment analysis (miTEA) in the top of a ranked list of genes or proteins. Using miTEA, we analyze several target prediction tools by examining performance on public miRNA constitutive expression data. We also apply miTEA to analyze several integrative biology data sets, including a novel matched miRNA/mRNA data set covering nine human tissue types. Our novel findings include proposed direct activity of miR-519 in placenta, a direct activity of the oncogenic miR-15 in different healthy tissue types and a direct activity of the poorly characterized miR-768 in both healthy tissue types and cancer cell lines. The miTEA web application is available at http://cbl-gorilla.cs.technion.ac.il/miTEA/ .
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
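The name mmHG suggests that the statistic builds on the minimum hypergeometric (mHG) score for a single ranked list. The sketch below shows only that simpler single-list score, as an assumption about the underlying idea; it is not the two-list mmHG statistic of miTEA, and the function names are hypothetical. The returned value is a raw minimum over cutoffs and would still need correction for multiple testing before being read as a p-value.

```python
from math import comb


def hypergeom_tail(N, B, n, b):
    """P(X >= b) when drawing n items from N items of which B are hits."""
    total = comb(N, n)
    favourable = sum(
        comb(B, k) * comb(N - B, n - k) for k in range(b, min(n, B) + 1)
    )
    return favourable / total


def mhg_score(labels):
    """Minimum hypergeometric tail over all cutoffs of a ranked 0/1 list.

    `labels` is ordered from top-ranked to bottom-ranked; 1 marks a target.
    """
    N, B = len(labels), sum(labels)
    best, hits = 1.0, 0
    for n, label in enumerate(labels, start=1):
        hits += label
        best = min(best, hypergeom_tail(N, B, n, hits))
    return best
```

A list with all targets at the top, such as `[1, 1, 0, 0]`, gives a small score, while `[0, 0, 1, 1]` gives 1.0.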
  • 63
    Publication Date: 2013-02-02
    Description: Sequence alignment of proteins and nucleic acids is a routine task in bioinformatics. Although the comparison of complete peptides, genes or genomes can be undertaken with a great variety of tools, the alignment of short DNA sequences and motifs entails pitfalls that have not been fully addressed yet. Here we confront the structural superposition of transcription factors with the sequence alignment of their recognized cis elements. Our goals are (i) to test TFcompare (http://floresta.eead.csic.es/tfcompare), a structural alignment method for protein–DNA complexes; (ii) to benchmark the pairwise alignment of regulatory elements; (iii) to define the confidence limits and the twilight zone of such alignments and (iv) to evaluate the relevance of these thresholds with elements obtained experimentally. We find that the structure of cis elements and protein–DNA interfaces is significantly more conserved than their sequence and measure how this correlates with alignment errors when only sequence information is considered. Our results confirm that DNA motifs in the form of matrices produce better alignments than individual sequences. Finally, we report that empirical and theoretically derived twilight thresholds are useful for estimating the natural plasticity of regulatory sequences, and hence for filtering out unreliable alignments.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 64
    Publication Date: 2013-02-02
    Description: To mine gene expression data sets effectively, analysis frameworks need to incorporate methods that identify intergenic relationships within enriched biologically relevant subpathways. For this purpose, we developed the Topology Enrichment Analysis frameworK (TEAK). TEAK employs a novel in-house algorithm and a tailor-made Clique Percolation Method to extract linear and nonlinear KEGG subpathways, respectively. TEAK scores subpathways using the Bayesian Information Criterion for context specific data and the Kullback-Leibler divergence for case–control data. In this article, we utilized TEAK with experimental studies to analyze microarray data sets profiling stress responses in the model eukaryote Saccharomyces cerevisiae. Using a public microarray data set, we identified via TEAK linear sphingolipid metabolic subpathways activated during the yeast response to nitrogen stress, and phenotypic analyses of the corresponding deletion strains indicated previously unreported fitness defects for the dpl1 and lag1 mutants under conditions of nitrogen limitation. In addition, we studied the yeast filamentous response to nitrogen stress by profiling changes in transcript levels upon deletion of two key filamentous growth transcription factors, FLO8 and MSS11. Via TEAK we identified a nonlinear glycerophospholipid metabolism subpathway involving the SLC1 gene, which we found via mutational analysis to be required for yeast filamentous growth.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 65
    Publication Date: 2013-02-02
    Description: Structural variation is an important class of genetic variation in mammals. High-throughput sequencing (HTS) technologies promise to revolutionize copy-number variation (CNV) detection but present substantial analytic challenges. Converging evidence suggests that multiple types of CNV-informative data (e.g. read-depth, read-pair, split-read) need to be considered, and that sophisticated methods are needed for more accurate CNV detection. We observed that various sources of experimental biases in HTS confound read-depth estimation, and note that bias correction has not been adequately addressed by existing methods. We present a novel read-depth–based method, GENSENG, which uses a hidden Markov model and negative binomial regression framework to identify regions of discrete copy-number changes while simultaneously accounting for the effects of multiple confounders. Based on extensive calibration using multiple HTS data sets, we conclude that our method outperforms existing read-depth–based CNV detection algorithms. The concept of simultaneous bias correction and CNV detection can serve as a basis for combining read-depth with other types of information such as read-pair or split-read in a single analysis. A user-friendly and computationally efficient implementation of our method is freely available.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 66
    Publication Date: 2013-05-04
    Description: The reliable forward engineering of genetic systems remains limited by the ad hoc reuse of many types of basic genetic elements. Although a few intrinsic prokaryotic transcription terminators are used routinely, termination efficiencies have not been studied systematically. Here, we developed and validated a genetic architecture that enables reliable measurement of termination efficiencies. We then assembled a collection of 61 natural and synthetic terminators that collectively encode termination efficiencies across an ~800-fold dynamic range within Escherichia coli. We simulated co-transcriptional RNA folding dynamics to identify competing secondary structures that might interfere with terminator folding kinetics or impact termination activity. We found that structures extending beyond the core terminator stem are likely to increase terminator activity. By excluding terminators encoding such context-confounding elements, we were able to develop a linear sequence-function model that can be used to estimate termination efficiencies (r = 0.9, n = 31) better than models trained on all terminators (r = 0.67, n = 54). The resulting systematically measured collection of terminators should improve the engineering of synthetic genetic systems and also advance quantitative modeling of transcription termination.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 67
    Publication Date: 2013-05-04
    Description: Tumor formation is partially driven by DNA copy number changes, which are typically measured using array comparative genomic hybridization, SNP arrays and DNA sequencing platforms. Many techniques are available for detecting recurring aberrations across multiple tumor samples, including CMAR, STAC, GISTIC and KC-SMART. GISTIC is widely used and detects both broad and focal (potentially overlapping) recurring events. However, GISTIC performs false discovery rate control on probes instead of events. Here we propose Analytical Multi-scale Identification of Recurrent Events (ADMIRE), a multi-scale Gaussian smoothing approach, for the detection of both broad and focal (potentially overlapping) recurring copy number alterations. Importantly, false discovery rate control is performed analytically (no need for permutations) on events rather than probes. The method does not require segmentation or calling on the input dataset and therefore reduces the potential loss of information due to discretization. An important characteristic of the approach is that the error rate is controlled across all scales and that the algorithm outputs a single profile of significant events selected from the appropriate scales. We perform extensive simulations and showcase its utility on a glioblastoma SNP array dataset. Importantly, ADMIRE detects focal events that are missed by GISTIC, including two events involving known glioma tumor-suppressor genes: CDKN2C and NF1.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 68
    Publication Date: 2013-06-28
    Description: Recombineering in bacteria is a powerful technique for genome reconstruction, but until now, it was not generally applicable for development of small-molecule producers because of the inconspicuous phenotype of most compounds of biotechnological relevance. Here, we establish recombineering for Corynebacterium glutamicum using RecT of prophage Rac and combine this with our recently developed nanosensor technology, which enables the detection and isolation of productive mutants at the single-cell level via fluorescence-activated cell sorting (FACS). We call this new technology RecFACS, which we use for genomic site-directed saturation mutagenesis without relying on pre-constructed libraries to directly isolate L-lysine-producing cells. A mixture of 19 different oligonucleotides was used targeting codon 81 in murE of the wild-type, at a locus where one single mutation is known to cause L-lysine production. Using RecFACS, productive mutants were screened and isolated. Sequencing revealed 12 different amino acid exchanges in the targeted murE codon, which caused different L-lysine production titers. Apart from introducing a rapid genome construction technology for C. glutamicum, the present work demonstrates that RecFACS is suitable to simply create producers as well as genetic diversity in one single step, thus establishing a new general concept in synthetic biology.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 69
    Publication Date: 2013-11-02
    Description: The type II CRISPR/Cas system from Streptococcus pyogenes and its simplified derivative, the Cas9/single guide RNA (sgRNA) system, have emerged as potent new tools for targeted gene knockout in bacteria, yeast, fruit fly, zebrafish and human cells. Here, we describe adaptations of these systems leading to successful expression of the Cas9/sgRNA system in two dicot plant species, Arabidopsis and tobacco, and two monocot crop species, rice and sorghum. Agrobacterium tumefaciens was used for delivery of genes encoding Cas9, sgRNA and a non-functional, mutant green fluorescent protein (GFP) to Arabidopsis and tobacco. The mutant GFP gene contained target sites in its 5' coding regions that were successfully cleaved by a CAS9/sgRNA complex that, along with error-prone DNA repair, resulted in creation of functional GFP genes. DNA sequencing confirmed Cas9/sgRNA-mediated mutagenesis at the target site. Rice protoplast cells transformed with Cas9/sgRNA constructs targeting the promoter region of the bacterial blight susceptibility genes, OsSWEET14 and OsSWEET11, were confirmed by DNA sequencing to contain mutated DNA sequences at the target sites. Successful demonstration of the Cas9/sgRNA system in model plant and crop species bodes well for its near-term use as a facile and powerful means of plant genetic engineering for scientific and agricultural applications.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 70
    Publication Date: 2013-11-02
    Description: Cas9 is an RNA-guided double-stranded DNA nuclease that participates in clustered regularly interspaced short palindromic repeats (CRISPR)-mediated adaptive immunity in prokaryotes. CRISPR–Cas9 has recently been used to generate insertion and deletion mutations in Caenorhabditis elegans, but not to create tailored changes (knock-ins). We show that the CRISPR–CRISPR-associated (Cas) system can be adapted for efficient and precise editing of the C. elegans genome. The targeted double-strand breaks generated by CRISPR are substrates for transgene-instructed gene conversion. This allows customized changes in the C. elegans genome by homologous recombination: sequences contained in the repair template (the transgene) are copied by gene conversion into the genome. The possibility to edit the C. elegans genome at selected locations will facilitate the systematic study of gene function in this widely used model organism.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 71
    Publication Date: 2013-11-02
    Description: The generation of genome-modified animals is a powerful approach to analyze gene functions. The CAS9/guide RNA (gRNA) system is expected to become widely used for the efficient generation of genome-modified animals, but detailed studies on optimum conditions and availability are limited. In the present study, we attempted to generate large-scale genome-modified mice with an optimized CAS9/gRNA system, and confirmed the transmission of these mutations to the next generations. A comparison of different types of gRNA indicated that the target loci of almost all pups were modified successfully by the use of long-type gRNAs with CAS9. We showed that this system has much higher mutation efficiency and much lower off-target effect compared to zinc-finger nuclease. We propose that most of these off-target effects can be avoided by the careful control of CAS9 mRNA concentration and that the genome-modification efficiency depends rather on the gRNA concentration. Under optimized conditions, large-scale (~10 kb) genome-modified mice can be efficiently generated by modifying two loci on a single chromosome using two gRNAs at once in mouse zygotes. In addition, the normal transmission of these CAS9/gRNA-induced mutations to the next generation was confirmed. These results indicate that the CAS9/gRNA system can become a highly effective tool for the generation of genome-modified animals.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 72
    Publication Date: 2013-04-14
    Description: Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) and CRISPR-associated (Cas) systems in bacteria and archaea use RNA-guided nuclease activity to provide adaptive immunity against invading foreign nucleic acids. Here, we report the use of type II bacterial CRISPR-Cas system in Saccharomyces cerevisiae for genome engineering. The CRISPR-Cas components, Cas9 gene and a designer genome targeting CRISPR guide RNA (gRNA), show robust and specific RNA-guided endonuclease activity at targeted endogenous genomic loci in yeast. Using constitutive Cas9 expression and a transient gRNA cassette, we show that targeted double-strand breaks can increase homologous recombination rates of single- and double-stranded oligonucleotide donors by 5-fold and 130-fold, respectively. In addition, co-transformation of a gRNA plasmid and a donor DNA in cells constitutively expressing Cas9 resulted in near 100% donor DNA recombination frequency. Our approach provides foundations for a simple and powerful genome engineering tool for site-specific mutagenesis and allelic replacement in yeast.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 73
    Publication Date: 2013-04-14
    Description: In this article, we focus on the analysis of competitive gene set methods for detecting the statistical significance of pathways from gene expression data. Our main result is to demonstrate that some of the most frequently used gene set methods, GSEA, GSEArot and GAGE, are severely influenced by the filtering of the data in a way that such an analysis is no longer reconcilable with the principles of statistical inference, rendering the obtained results in the worst case inexpressive. A possible consequence of this is that these methods can increase their power by the addition of unrelated data and noise. Our results are obtained within a bootstrapping framework that allows a rigorous assessment of the robustness of results and enables power estimates. Our results indicate that when using competitive gene set methods, it is imperative to apply a stringent gene filtering criterion. However, even when genes are filtered appropriately, for gene expression data from chips that do not provide a genome-scale coverage of the expression values of all mRNAs, this is not enough for GSEA, GSEArot and GAGE to ensure the statistical soundness of the applied procedure. For this reason, for biomedical and clinical studies, we strongly advise against using GSEA, GSEArot and GAGE for such data sets.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 74
    Publication Date: 2013-08-09
    Description: Customized TALENs and Cas9/gRNAs have been used for targeted mutagenesis in zebrafish to induce indels into protein-coding genes. However, indels are usually not sufficient to disrupt the function of non-coding genes, gene clusters or regulatory sequences, whereas large genomic deletions or inversions are more desirable for this purpose. By injecting two pairs of TALEN mRNAs or two gRNAs together with Cas9 mRNA targeting distal DNA sites of the same chromosome, we obtained predictable genomic deletions or inversions with sizes ranging from several hundred bases to nearly 1 Mb. We have successfully achieved this type of modification for 11 chromosomal loci by TALENs and 2 by Cas9/gRNAs with different combinations of gRNA pairs, including clusters of miRNA and protein-coding genes. Seven of eight TALEN-targeted lines transmitted the deletions and one transmitted the inversion through the germ line. Our findings indicate that both TALENs and Cas9/gRNAs can be used as an efficient tool to engineer genomes to achieve large deletions or inversions, including fragments covering multiple genes and non-coding sequences. To facilitate the analyses and application of existing ZFN, TALEN and CRISPR/Cas data, we have updated our EENdb database to provide a chromosomal view of all reported engineered endonucleases targeting human and zebrafish genomes.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 75
    Publication Date: 2012-12-14
    Description: Pan-genome ortholog clustering tool (PanOCT) is a tool for pan-genomic analysis of closely related prokaryotic species or strains. PanOCT uses conserved gene neighborhood information to separate recently diverged paralogs into orthologous clusters where homology-only clustering methods cannot. The results from PanOCT and three commonly used graph-based ortholog-finding programs were compared using a set of four publicly available strains of the same bacterial species. All four methods agreed on ~70% of the clusters and ~86% of the proteins. The clusters that did not agree were inspected for evidence of correctness resulting in 85 high-confidence manually curated clusters that were used to compare all four methods.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 76
    Publication Date: 2012-09-27
    Description: Genome-scale engineering of living organisms requires precise and economical methods to efficiently modify many loci within chromosomes. One such example is the directed integration of chemically synthesized single-stranded deoxyribonucleic acid (oligonucleotides) into the chromosome of Escherichia coli during replication. Herein, we present a general co-selection strategy in multiplex genome engineering that yields highly modified cells. We demonstrate that disparate sites throughout the genome can be easily modified simultaneously by leveraging selectable markers within 500 kb of the target sites. We apply this technique to the modification of 80 sites in the E. coli genome.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 77
    Publication Date: 2012-10-10
    Description: A novel ab initio parameter-tuning-free system to identify transcriptional factor (TF) binding motifs (TFBMs) in genome DNA sequences was developed. It is based on the comparison of two types of frequency distributions with respect to the TFBM candidates in the target DNA sequences and the non-candidates in the background sequence, with the latter generated by utilizing the intergenic sequences. For benchmark tests, we used DNA sequence datasets extracted by ChIP-on-chip and ChIP-seq techniques and identified 65 yeast and four mammalian TFBMs, with the latter including gaps. The accuracy of our system was compared with those of other available programs (i.e. MEME, Weeder, BioProspector, MDscan and DME) and was the best among them, even without tuning of the parameter set for each TFBM and pre-treatment/editing of the target DNA sequences. Moreover, with respect to some TFs for which the identified motifs are inconsistent with those in the references, our results were revealed to be correct, by comparing them with other existing experimental data. Thus, our identification system does not need any other biological information except for gene positions, and is also expected to be applicable to genome DNA sequences to identify unknown TFBMs as well as known ones.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 78
    Publication Date: 2012-10-10
    Description: MicroRNAs (miRNAs) are major regulators of gene expression in multicellular organisms. They recognize their targets by sequence complementarity and guide them to cleavage or translational arrest. It is generally accepted that plant miRNAs have extensive complementarity to their targets and their prediction usually relies on the use of empirical parameters deduced from known miRNA–target interactions. Here, we developed a strategy to identify miRNA targets which is mainly based on the conservation of the potential regulation in different species. We applied the approach to expressed sequence tags datasets from angiosperms. Using this strategy, we predicted many new interactions and experimentally validated previously unknown miRNA targets in Arabidopsis thaliana. Newly identified targets that are broadly conserved include auxin regulators, transcription factors and transporters. Some of them might participate in the same pathways as the targets known before, suggesting that some miRNAs might control different aspects of a biological process. Furthermore, this approach can be used to identify targets present in a specific group of species, and, as a proof of principle, we analyzed Solanaceae-specific targets. The presented strategy can be used alone or in combination with other approaches to find miRNA targets in plants.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 79
    Publication Date: 2012-04-15
    Description: We address the challenge of regulatory sequence alignment with a new method, Pro-Coffee, a multiple aligner specifically designed for homologous promoter regions. Pro-Coffee uses a dinucleotide substitution matrix estimated on alignments of functional binding sites from TRANSFAC. We designed a validation framework using several thousand families of orthologous promoters. This dataset was used to evaluate the accuracy for predicting true human orthologs among their paralogs. We found that whereas other methods achieve on average 73.5% accuracy, and 77.6% when trained on that same dataset, the figure goes up to 80.4% for Pro-Coffee. We then applied a novel validation procedure based on multi-species ChIP-seq data. Trained and untrained methods were tested for their capacity to correctly align experimentally detected binding sites. Whereas the average number of correctly aligned sites for two transcription factors is 284 for default methods and 316 for trained methods, Pro-Coffee achieves 331, 16.5% above the default average. We find a high correlation between a method's performance when classifying orthologs and its ability to correctly align proven binding sites. Not only has this interesting biological consequences, it also allows us to conclude that any method that is trained on the ortholog data set will result in functionally more informative alignments.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 80
    Publication Date: 2012-04-15
    Description: MCScan is an algorithm able to scan multiple genomes or subgenomes in order to identify putative homologous chromosomal regions, and align these regions using genes as anchors. The MCScanX toolkit implements an adjusted MCScan algorithm for detection of synteny and collinearity that extends the original software by incorporating 14 utility programs for visualization of results and additional downstream analyses. Applications of MCScanX to several sequenced plant genomes and gene families are shown as examples. MCScanX can be used to effectively analyze chromosome structural changes, and reveal the history of gene family expansions that might contribute to the adaptation of lineages and taxa. An integrated view of various modes of gene duplication can supplement the traditional gene tree analysis in specific families. The source code and documentation of MCScanX are freely available at http://chibba.pgml.uga.edu/mcscan2/ .
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 81
    Publication Date: 2012-07-22
    Description: Cytosines in genomic DNA are sometimes methylated. This affects many biological processes and diseases. The standard way of measuring methylation is to use bisulfite, which converts unmethylated cytosines to thymines, then sequence the DNA and compare it to a reference genome sequence. We describe a method for the critical step of aligning the DNA reads to the correct genomic locations. Our method builds on classic alignment techniques, including likelihood-ratio scores and spaced seeds. In a realistic benchmark, our method has a better combination of sensitivity, specificity and speed than nine other high-throughput bisulfite aligners. This study enables more accurate and rational analysis of DNA methylation. It also illustrates how to adapt general-purpose alignment methods to a special case with distorted base patterns: this should be informative for other special cases such as ancient DNA and AT-rich genomes.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
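The bisulfite read-out described in this abstract can be illustrated with a minimal sketch. It covers only the conversion rule (an unmethylated C is read as T, a methylated C stays C) and the per-position comparison against a reference for a read that is already aligned, on the forward strand only. It does not implement the alignment method of the paper, and the function names are hypothetical.

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Simulate the sequenced read-out of `seq` after bisulfite treatment.

    An unmethylated C is read as T; a C whose index is listed in
    `methylated_positions` is protected and stays C.
    """
    out = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            out.append("T")
        else:
            out.append(base)
    return "".join(out)


def call_methylation(reference, read):
    """Call methylation at reference cytosines from an aligned read.

    Assumes `read` has the same length as `reference` and is already
    aligned without gaps (forward strand only).
    """
    calls = {}
    for i, (ref_base, read_base) in enumerate(zip(reference.upper(), read.upper())):
        if ref_base == "C":
            if read_base == "C":
                calls[i] = "methylated"
            elif read_base == "T":
                calls[i] = "unmethylated"
    return calls
```

Because converted reads contain T where the reference has C, a general-purpose aligner would count these positions as mismatches, which is why the alignment step needs special treatment.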
  • 82
    Publication Date: 2012-09-13
    Description: Prophages are phages in lysogeny that are integrated into, and replicated as part of, the host bacterial genome. These mobile elements can have tremendous impact on their bacterial hosts’ genomes and phenotypes, which may lead to strain emergence and diversification, increased virulence or antibiotic resistance. However, finding prophages in microbial genomes remains a problem with no definitive solution. The majority of existing tools rely on detecting genomic regions enriched in protein-coding genes with known phage homologs, which hinders the de novo discovery of phage regions. In this study, a weighted phage detection algorithm, PhiSpy was developed based on seven distinctive characteristics of prophages, i.e. protein length, transcription strand directionality, customized AT and GC skew, the abundance of unique phage words, phage insertion points and the similarity of phage proteins. The first five characteristics are capable of identifying prophages without any sequence similarity with known phage genes. PhiSpy locates prophages by ranking genomic regions enriched in distinctive phage traits, which leads to the successful prediction of 94% of prophages in 50 complete bacterial genomes with a 6% false-negative rate and a 0.66% false-positive rate.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
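Two of the characteristics listed in this abstract are AT and GC skew. The sketch below computes the textbook definitions, (G - C)/(G + C) and (A - T)/(A + T), over sliding windows. The abstract says PhiSpy uses "customized" skews, so this standard form, the window parameters and the function names are assumptions for illustration rather than PhiSpy's actual computation.

```python
def skew(seq, a, b):
    """Return (count(a) - count(b)) / (count(a) + count(b)), or 0.0 if both are absent."""
    seq = seq.upper()
    na, nb = seq.count(a), seq.count(b)
    total = na + nb
    return (na - nb) / total if total else 0.0


def windowed_skews(seq, window, step):
    """Return (start, GC skew, AT skew) for each full window along `seq`."""
    results = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        results.append((start, skew(chunk, "G", "C"), skew(chunk, "A", "T")))
    return results
```

Abrupt changes in these values along a genome are one kind of signal that does not depend on sequence similarity to known phage genes.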
  • 83
    Publication Date: 2012-06-06
    Description: A chemistry-based artificial restriction DNA cutter (ARCUT) was recently prepared from Ce(IV)/EDTA complex and a pair of pseudo-complementary peptide nucleic acids. This cutter has freely tunable scission-site and site specificity. In this article, homologous recombination (HR) in human cells was promoted by cutting a substrate DNA with ARCUT, and the efficiency of this bioprocess was optimized by various chemical and biological approaches. Of two kinds of terminal structure formed by ARCUT, 3'-overhang termini provided 1.7-fold higher efficiency than 5'-overhang termini. A longer homology length (e.g. 698 bp) was about 2-fold more favorable than a shorter one (e.g. 100 bp). When the cell cycle was synchronized to G2/M phase with nocodazole, the HR was promoted by about 2-fold. Repression of the NHEJ-relevant proteins Ku70 and Ku80 by siRNA increased the efficiency by 2- to 3-fold. It was indicated that an appropriate combination of all these chemical and biological approaches should be very effective to promote ARCUT-mediated HR in human cells.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 84
    Publication Date: 2012-06-06
    Description: Messenger RNA sequences possess specific nucleotide patterns distinguishing them from non-coding genomic sequences. In this study, we explore the utilization of modified Markov models to analyze sequences up to 44 bp, far beyond the 8-bp limit of conventional Markov models, for exon/intron discrimination. In order to analyze nucleotide sequences of this length, their information content is first reduced by conversion into shorter binary patterns via the application of numerous abstraction schemes. After the conversion of genomic sequences to binary strings, homogeneous Markov models trained on the binary sequences are used to discriminate between exons and introns. We term this approach the Binary Abstraction Markov Model (BAMM). High-quality abstraction schemes for exon/intron discrimination are selected using optimization algorithms on supercomputers. The best Markov model classifiers are then combined using support vector machines into a single classifier. With this approach, over 95% classification accuracy is achieved without taking reading frame into account. With further development, the BAMM approach can be applied to sequences lacking the genetic code such as ncRNAs and 5'-untranslated regions.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 85
    Publication Date: 2012-04-24
    Description: We describe a novel cloning method termed SLiCE (Seamless Ligation Cloning Extract) that utilizes easy-to-generate bacterial cell extracts to assemble multiple DNA fragments into recombinant DNA molecules in a single in vitro recombination reaction. SLiCE overcomes the sequence limitations of traditional cloning methods, facilitates seamless cloning by recombining short end homologies (≥15 bp) with or without flanking heterologous sequences and provides an effective strategy for directional subcloning of DNA fragments from Bacterial Artificial Chromosomes (BACs) or other sources. SLiCE is highly cost effective as a number of standard laboratory bacterial strains can serve as sources for SLiCE extract. In addition, the cloning efficiencies and capabilities of these strains can be greatly improved by simple genetic modifications. As an example, we modified the DH10B Escherichia coli strain to express an optimized prophage Red recombination system. This strain, termed PPY, facilitates SLiCE with very high efficiencies and demonstrates the versatility of the method.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 86
    Publication Date: 2012-04-24
    Description: Targeted gene addition to mammalian genomes is central to biotechnology, basic research and gene therapy. For example, gene targeting to the ROSA26 locus by homologous recombination in embryonic stem cells is commonly used for mouse transgenesis to achieve ubiquitous and persistent transgene expression. However, conventional methods are not readily adaptable to gene targeting in other cell types. The emerging zinc finger nuclease (ZFN) technology facilitates gene targeting in diverse species and cell types, but an optimal strategy for engineering highly active ZFNs is still unclear. We used a modular assembly approach to build ZFNs that target the ROSA26 locus. ZFN activity was dependent on the number of modules in each zinc finger array. The ZFNs were active in a variety of cell types in a time- and dose-dependent manner. The ZFNs directed gene addition to the ROSA26 locus, which enhanced the level of sustained gene expression, the uniformity of gene expression within clonal cell populations and the reproducibility of gene expression between clones. These ZFNs are a promising resource for cell engineering, mouse transgenesis and pre-clinical gene therapy studies. Furthermore, this characterization of the modular assembly method provides general insights into the implementation of the ZFN technology.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 87
    Publication Date: 2012-05-13
    Description: A simple approach for creating libraries of circularly permuted proteins is described that is called PERMutation Using Transposase Engineering (PERMUTE). In PERMUTE, the transposase MuA is used to randomly insert a minitransposon that can function as a protein expression vector into a plasmid that contains the open reading frame (ORF) being permuted. A library of vectors that express different permuted variants of the ORF-encoded protein is created by: (i) using bacteria to select for target vectors that acquire an integrated minitransposon; (ii) excising the ensemble of ORFs that contain an integrated minitransposon from the selected vectors; and (iii) circularizing the ensemble of ORFs containing integrated minitransposons using intramolecular ligation. Construction of a Thermotoga neapolitana adenylate kinase (AK) library using PERMUTE revealed that this approach produces vectors that express circularly permuted proteins with distinct sequence diversity from existing methods. In addition, selection of this library for variants that complement the growth of Escherichia coli with a temperature-sensitive AK identified functional proteins with novel architectures, suggesting that PERMUTE will be useful for the directed evolution of proteins with new functions.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 88
    Publication Date: 2012-05-13
    Description: Insertional mutagenesis screens in mice are used to identify individual genes that drive tumor formation. In these screens, candidate cancer genes are identified if their genomic location is proximal to a common insertion site (CIS) defined by high rates of transposon or retroviral insertions in a given genomic window. In this article, we describe a new method for defining CISs based on a Poisson distribution, the Poisson Regression Insertion Model, and show that this new method is an improvement over previously described methods. We also describe a modification of the method that can identify pairs and higher orders of co-occurring common insertion sites. We apply these methods to two data sets, one generated in a transposon-based screen for gastrointestinal tract cancer genes and another based on the set of retroviral insertions in the Retroviral Tagged Cancer Gene Database. We show that the new methods identify more relevant candidate genes and candidate gene pairs than found using previous methods. Identification of the biologically relevant set of mutations that occur in a single cell and cause tumor progression will aid in the rational design of single and combinatorial therapies in the upcoming age of personalized cancer therapy.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
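    Example (illustrative, not part of the catalog record): the abstract above defines a common insertion site as a genomic window with an unusually high number of insertions and bases the test on a Poisson distribution. The sketch below is a deliberately simplified stand-in, assuming a uniform background rate and a plain Poisson tail test with Bonferroni correction; it is not the Poisson Regression Insertion Model itself. The function names, window size and threshold are hypothetical.

```python
import math

def poisson_sf(k, lam):
    """Upper tail P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    cdf = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - cdf)

def candidate_windows(positions, genome_length, window, alpha=0.05):
    """Return (window_start, count, p_value) for windows with excess insertions.

    Assumption: insertions fall uniformly at random under the null, so every
    window shares the same expected count. Real screens need covariates.
    """
    n_windows = math.ceil(genome_length / window)
    lam = len(positions) / n_windows          # expected insertions per window
    counts = {}
    for p in positions:
        counts[p // window] = counts.get(p // window, 0) + 1
    hits = []
    for w, c in sorted(counts.items()):
        pval = poisson_sf(c, lam)
        if pval * n_windows < alpha:          # Bonferroni over all windows
            hits.append((w * window, c, pval))
    return hits

# Toy data: six insertions clustered near the start, two scattered elsewhere.
positions = [100, 120, 130, 140, 150, 160, 5000, 9000]
hits = candidate_windows(positions, genome_length=10_000, window=1_000)
```

    In this toy input only the first window carries more insertions than the uniform background would plausibly produce.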
  • 89
    Publication Date: 2012-02-28
    Description: Synthetic scaffolds that permit spatial and temporal organization of enzymes in living cells are a promising post-translational strategy for controlling the flow of information in both metabolic and signaling pathways. Here, we describe the use of plasmid DNA as a stable, robust and configurable scaffold for arranging biosynthetic enzymes in the cytoplasm of Escherichia coli. This involved conversion of individual enzymes into custom DNA-binding proteins by genetic fusion to zinc-finger domains that specifically bind unique DNA sequences. When expressed in cells that carried a rationally designed DNA scaffold comprising corresponding zinc finger binding sites, the titers of diverse metabolic products, including resveratrol, 1,2-propanediol and mevalonate, were increased as a function of the scaffold architecture. These results highlight the utility of DNA scaffolds for assembling biosynthetic enzymes into functional metabolic structures. Beyond metabolism, we anticipate that DNA scaffolds may be useful in sequestering different types of enzymes for specifying the output of biological signaling pathways or for coordinating other assembly-line processes such as protein folding, degradation and post-translational modifications.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 90
    Publication Date: 2012-03-29
    Description: Broadly, computational ortholog assignment is a three-step process: (i) identify all putative homologs between the genomes, (ii) identify gene anchors and (iii) link anchors to identify best gene matches given their order and context. In this article, we engineer two methods to improve two important aspects of this pipeline [specifically steps (ii) and (iii)]. First, computing sequence similarity data [step (i)] is a computationally intensive task for large sequence sets, creating a bottleneck in the ortholog assignment pipeline. We have designed a fast and highly scalable sort-join method (afree) based on k-mer counts to rapidly compare all pairs of sequences in a large protein sequence set to identify putative homologs. Second, the availability of complex genomes containing large gene families with a prevalence of complex evolutionary events, such as duplications, has made the task of assigning orthologs and co-orthologs difficult. Here, we have developed an iterative graph matching strategy where at each iteration the best gene assignments are identified, resulting in a set of orthologs and co-orthologs. We find that the afree algorithm is faster than existing methods and maintains high accuracy in identifying similar genes. The iterative graph matching strategy also showed high accuracy in identifying complex gene relationships. Standalone afree is available from http://vbc.med.monash.edu.au/~kmahmood/afree . EGM2, the complete ortholog assignment pipeline (including afree and the iterative graph matching method), is available from http://vbc.med.monash.edu.au/~kmahmood/EGM2 .
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
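    Example (illustrative, not part of the catalog record): the abstract above describes comparing all sequence pairs through a sort-join over k-mer counts. The sketch below shows the general idea only, assuming one (k-mer, sequence id) record per distinct k-mer, a sort on the k-mer, and a join within each run of equal k-mers. It is not the afree implementation, and the function name, the value of k and the toy sequences are hypothetical.

```python
from collections import Counter
from itertools import combinations, groupby

def shared_kmer_counts(seqs, k=3):
    """Count distinct k-mers shared by each pair of sequences.

    seqs maps a sequence id to its residue string. Sorting the
    (k-mer, id) records brings identical k-mers together, so pairs are
    only formed among sequences that share at least one k-mer.
    """
    records = sorted(
        {(s[i:i + k], sid) for sid, s in seqs.items()
         for i in range(len(s) - k + 1)}
    )
    shared = Counter()
    for _, group in groupby(records, key=lambda rec: rec[0]):
        ids = [sid for _, sid in group]       # already sorted by id
        for a, b in combinations(ids, 2):
            shared[(a, b)] += 1
    return shared

# Toy protein fragments: p1 and p2 share the k-mers MKV and KVL.
toy = {"p1": "MKVLAA", "p2": "MKVLGG", "p3": "WWWWWW"}
pairs = shared_kmer_counts(toy)
```

    Pairs with a high shared count would then be passed on as putative homologs; a production method would also bound the size of very common k-mer groups to avoid a quadratic blow-up.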
  • 91
    Publication Date: 2012-03-29
    Description: We demonstrate a system for cloning and modifying the chloroplast genome from the green alga, Chlamydomonas reinhardtii. Through extensive use of sequence stabilization strategies, the ex vivo genome is assembled in yeast from a collection of overlapping fragments. The assembled genome is then moved into bacteria for large-scale preparations and transformed into C. reinhardtii cells. This system also allows for the generation of simultaneous, systematic and complex genetic modifications at multiple loci in vivo. We use this system to substitute genes encoding core subunits of the photosynthetic apparatus with orthologs from a related alga, Scenedesmus obliquus. Once transformed into algae, the substituted genome recombines with the endogenous genome, resulting in a hybrid plastome comprising modifications in disparate loci. The in vivo function of the genomes described herein demonstrates that simultaneous engineering of multiple sites within the chloroplast genome is now possible. This work represents the first steps toward a novel approach for creating genetic diversity in any or all regions of a chloroplast genome.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 92
    Publication Date: 2012-03-29
    Description: With the availability of next-generation sequencing (NGS) technology, it is expected that sequence variants may be called on a genomic scale. Here, we demonstrate that a deeper understanding of the distribution of the variant call frequencies at heterozygous loci in NGS data sets is a prerequisite for sensitive variant detection. We model the crucial steps in an NGS protocol as a stochastic branching process and derive a mathematical framework for the expected distribution of alleles at heterozygous loci before measurement, that is, before sequencing. We confirm our theoretical results by analyzing technical replicates of human exome data and demonstrate that the variance of allele frequencies at heterozygous loci is higher than expected from a simple binomial distribution. Due to this high variance, mutation callers relying on binomially distributed priors are less sensitive for heterozygous variants that deviate strongly from the expected mean frequency. Our results also indicate that error rates can be reduced to a greater degree by technical replicates than by increasing sequencing depth.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
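    Example (illustrative, not part of the catalog record): the abstract above argues that amplification, modeled as a stochastic branching process, makes allele frequencies at heterozygous loci more variable than binomial read sampling alone would predict. The toy simulation below is one way to see that effect. All parameters (five starting copies per allele, ten cycles, 70% duplication efficiency, depth 100) are assumptions chosen for illustration, not values from the study, and the model is far simpler than the framework the authors derive.

```python
import random
import statistics

def amplify(copies, cycles, efficiency, rng):
    """Branching process: each molecule duplicates with probability `efficiency` per cycle."""
    for _ in range(cycles):
        copies += sum(rng.random() < efficiency for _ in range(copies))
    return copies

def simulate_allele_fraction(rng, start_copies=5, cycles=10,
                             efficiency=0.7, depth=100):
    """Amplify both alleles independently, then sample `depth` reads."""
    ref = amplify(start_copies, cycles, efficiency, rng)
    alt = amplify(start_copies, cycles, efficiency, rng)
    p_alt = alt / (ref + alt)                 # allele fraction before sequencing
    alt_reads = sum(rng.random() < p_alt for _ in range(depth))
    return alt_reads / depth

def variance_ratio(trials=400, depth=100, seed=1):
    """Observed variance of the allele fraction divided by the binomial expectation."""
    rng = random.Random(seed)
    fractions = [simulate_allele_fraction(rng, depth=depth) for _ in range(trials)]
    observed = statistics.pvariance(fractions)
    binomial = 0.25 / depth                   # p(1 - p) / n with p = 0.5
    return observed / binomial

ratio = variance_ratio()
```

    A ratio clearly above 1 means the simulated allele fractions are overdispersed relative to a pure binomial model, which is the qualitative point the abstract makes.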
  • 93
    Publication Date: 2012-03-14
    Description: An approach to infer the unknown microbial population structure within a metagenome is to cluster nucleotide sequences based on common patterns in base composition, otherwise referred to as binning. When functional roles are assigned to the identified populations, a deeper understanding of microbial communities can be attained, more so than gene-centric approaches that explore overall functionality. In this study, we propose an unsupervised, model-based binning method with two clustering tiers, which uses a novel transformation of the oligonucleotide frequency-derived error gradient and GC content to generate coarse groups at the first tier of clustering; and tetranucleotide frequency to refine these groups at the secondary clustering tier. The proposed method has a demonstrated improvement over PhyloPythia, S-GSOM, TACOA and TaxSOM on all three benchmarks that were used for evaluation in this study. The proposed method is then applied to a pyrosequenced metagenomic library of mud volcano sediment sampled in southwestern Taiwan, with the inferred population structure validated against complementary sequencing of 16S ribosomal RNA marker genes. Finally, the proposed method was further validated against four publicly available metagenomes, including a highly complex Antarctic whale-fall bone sample, which was previously assumed to be too complex for binning prior to functional analysis.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
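    Example (illustrative, not part of the catalog record): the abstract above clusters sequences on composition features, including GC content and tetranucleotide frequency. The sketch below computes only those two raw features for a single sequence; the error-gradient transformation and the two-tier clustering are not reproduced. Function names are hypothetical, and windows containing ambiguous bases are skipped by assumption.

```python
from itertools import product

# All 256 tetranucleotides in a fixed order, so vectors are comparable.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]

def gc_content(seq):
    """Fraction of G and C among unambiguous bases."""
    seq = seq.upper()
    acgt = sum(seq.count(b) for b in "ACGT")
    return (seq.count("G") + seq.count("C")) / acgt if acgt else 0.0

def tetranucleotide_freq(seq):
    """Normalized counts of overlapping 4-mers, as a 256-element list."""
    seq = seq.upper()
    counts = dict.fromkeys(TETRAMERS, 0)
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if kmer in counts:                    # skips windows containing N etc.
            counts[kmer] += 1
    total = sum(counts.values())
    return [counts[k] / total if total else 0.0 for k in TETRAMERS]

features = tetranucleotide_freq("ACGTACGTTTGACA")
```

    Each contig yields one such vector, and a clustering method of choice can then group contigs with similar composition.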
  • 94
    Publication Date: 2012-02-17
    Description: We introduce the software tool NTRFinder to search for a complex repetitive structure in DNA we call a nested tandem repeat (NTR). An NTR is a recurrence of two or more distinct tandem motifs interspersed with each other. We propose that NTRs can be used as phylogenetic and population markers. We have tested our algorithm on both real and simulated data, and present some real NTRs of interest. NTRFinder can be downloaded from http://www.maths.otago.ac.nz/~aamatroud/ .
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 95
    Publication Date: 2012-02-17
    Description: The increasing interest in genetic manipulation of bacterial host metabolic pathways for protein or small molecule production has led to a need to add new genes to a chromosome quickly and easily without leaving behind a selectable marker. The present report describes a vector and four-day procedure that enable site-specific chromosomal insertion of cloned genes in a context insulated from external transcription, usable once in a construction series. The use of rhamnose-inducible transcription from rhaBp allows regulation of the inserted genes independently of the commonly used IPTG and arabinose strategies. Using lacZ as a reporter, we first show that expression from the rhamnose promoter is tightly regulatable, exhibiting very low leakage of background expression compared with background, and moderate rhamnose-induced expression compared with IPTG-induced expression from lacp. Second, the expression of a DNA methyltransferase was used to show that rhamnose regulation yielded on-off expression of this enzyme, such that a resident high-copy plasmid was either fully sensitive or fully resistant to isoschizomer restriction enzyme cleavage. In both cases, growth medium manipulation allows intermediate levels of expression. The vehicle can also be adapted as an ORF-cloning vector.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 96
    Publication Date: 2012-02-17
    Description: The development of economical and high-throughput gene synthesis technology has been hampered by the high occurrence of errors in the synthesized products, which requires expensive labor and time to correct. Here, we describe an error correction reaction (ECR), which employs Surveyor, a mismatch-specific DNA endonuclease, to remove errors from synthetic genes. In ECR reactions, errors are revealed as mismatches by re-annealing of the synthetic gene products. Mismatches are recognized and excised by a combination of mismatch-specific endonuclease and 3'→5' exonuclease activities in the reaction mixture. Finally, overlap extension polymerase chain reaction (OE-PCR) re-assembles the resulting fragments into intact genes. The process can be iterated for increased fidelity. With two iterations, we were able to reduce errors in synthetic genes by >16-fold, yielding a final error rate of ~1 in 8700 bp.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 97
    Publication Date: 2012-10-10
    Description: A major challenge in metabolic engineering and synthetic biology is to balance the flux of an engineered heterologous metabolic pathway to achieve high yield and productivity in a target organism. Here, we report a simple, efficient and programmable approach named ‘customized optimization of metabolic pathways by combinatorial transcriptional engineering (COMPACTER)’ for rapid tuning of gene expression in a heterologous pathway under distinct metabolic backgrounds. Specifically, a library of mutant pathways is created by de novo assembly of promoter mutants of varying strengths for each pathway gene in a target organism, followed by high-throughput screening/selection. To demonstrate this approach, a single round of COMPACTER was used to generate both a xylose-utilizing pathway with near-highest efficiency and a cellobiose-utilizing pathway with the highest efficiency ever reported in the literature, for both laboratory and industrial yeast strains. Interestingly, these engineered xylose- and cellobiose-utilizing pathways were all host-specific. Therefore, COMPACTER represents a powerful approach to tailor-make metabolic pathways for different strain backgrounds, which is difficult if not impossible to achieve by existing pathway engineering methods.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 98
    Publication Date: 2012-10-10
    Description: We developed a highly scalable ‘shotgun’ DNA synthesis technology by utilizing microchip oligonucleotides, shotgun assembly and next-generation sequencing technology. A pool of microchip oligonucleotides targeting a penicillin biosynthetic gene cluster was assembled into numerous random fragments, which were tagged with 20 bp degenerate barcode primer pairs. An optimal set of error-free fragments was identified by high-throughput DNA sequencing, selectively amplified using the barcode sequences, and successfully assembled into the target gene cluster.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 99
    Publication Date: 2012-12-14
    Description: Multivalent molecular interactions can be exploited to dramatically enhance the performance of an affinity reagent. The enhancement in affinity and specificity achieved with a multivalent construct depends critically on the effectiveness of the scaffold that joins the ligands, as this determines their positions and orientations with respect to the target molecule. Currently, no generalizable design rules exist for construction of an optimal multivalent ligand for targets with known structures, and the design challenge remains an insurmountable obstacle for the large number of proteins whose structures are not known. As an alternative to such design-based strategies, we report here a directed evolution-based method for generating optimal bivalent aptamers. To demonstrate this approach, we fused two thrombin aptamers with a randomized DNA sequence and used a microfluidic in vitro selection strategy to isolate scaffolds with exceptionally high affinities. Within five rounds of selection, we generated a bivalent aptamer that binds thrombin with an apparent dissociation constant (Kd) <10 pM, representing a ~200-fold improvement in binding affinity over the monomeric aptamers and a ~15-fold improvement over the best designed bivalent construct. The process described here can be used to produce high-affinity multivalent aptamers and could potentially be adapted to other classes of biomolecules.
    Keywords: Synthetic Biology and Assembly Cloning
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 100
    Publication Date: 2012-06-28
    Description: Identification of transcriptional regulatory regions and tracing their internal organization are important for understanding the eukaryotic cell machinery. Cis-regulatory modules (CRMs) of higher eukaryotes are believed to possess a regulatory ‘grammar’, or preferred arrangement of binding sites, that is crucial for proper regulation and thus tends to be evolutionarily conserved. Here, we present a method, CORECLUST (COnservative REgulatory CLUster STructure), that predicts CRMs based on a set of positional weight matrices. Given regulatory regions of orthologous and/or co-regulated genes, CORECLUST constructs a CRM model by revealing the conserved rules that describe the relative location of binding sites. The constructed model may subsequently be used for the genome-wide prediction of similar CRMs, and thus the detection of co-regulated genes, and for the investigation of the regulatory grammar of the system. Compared with related methods, CORECLUST shows better performance in identifying CRMs conferring muscle-specific gene expression in vertebrates and early-developmental CRMs in Drosophila.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology