ALBERT

All Library Books, journals and Electronic Records Telegrafenberg


Export
Filter
  • Bioinformatics  (3,731)
  • Genome Biology and Evolution  (1,147)
  • Monthly Notices of the Royal Astronomical Society / Letters  (720)
Collection
  • Articles  (28,222)
Publisher
  • Oxford University Press  (28,222)
  • Blackwell Publishing Ltd
Years
Topic
  • 1
    Publication Date: 2020-08-27
    Description: We present CO observations towards a sample of six H i-rich Ultradiffuse galaxies (UDGs) as well as one UDG (VLSB-A) in the Virgo Cluster with the Institut de RadioAstronomie Millimétrique (IRAM) 30-m telescope. CO J = 1–0 is marginally detected at 4σ level in AGC 122966, as the first detection of CO emission in UDGs. We estimate upper limits of molecular mass in other galaxies from the non-detection of CO lines. These upper limits and the marginal CO detection in AGC 122966 indicate low mass ratios between molecular and atomic gas masses. With the star formation efficiency derived from the molecular gas, we suggest that the inefficiency of star formation in such H i-rich UDGs is likely caused by the low efficiency in converting molecules from atomic gas, instead of low efficiency in forming stars from molecular gas.
    Print ISSN: 1745-3925
    Electronic ISSN: 1745-3933
    Topics: Physics
  • 2
    Publication Date: 2020-07-16
    Description: We find that the minor axes of the ultra-diffuse galaxies (UDGs) in Abell 2634 tend to be aligned with the major axis of the central dominant galaxy, at a $\gtrsim 95$ per cent confidence level. This alignment is produced by the bright UDGs with the absolute magnitudes $M_r < -15.3$ mag, and outer-region UDGs with $R > 0.5R_{200}$. The alignment signal implies that these bright, outer-region UDGs are very likely to acquire their angular momenta from the vortices around the large-scale filament before they were accreted into A2634, and form their extended stellar bodies outside of the cluster; in this scenario, the orientations of their primordial angular momenta, which are roughly shown by their minor axes on the images, should tend to be parallel to the elongation of the large-scale filament. When these UDGs fell into the unrelaxed cluster A2634 along the filament, they could still preserve their primordial alignment signal before violent relaxation and encounters. These bright, outer-region UDGs in A2634 are very unlikely to be the descendants of the high-surface-brightness dwarf progenitors under tidal interactions with the central dominant galaxy in the cluster environment. Our results indicate that the primordial alignment could be a useful probe of the origin of UDGs in large-scale structures.
    Print ISSN: 1745-3925
    Electronic ISSN: 1745-3933
    Topics: Physics
  • 3
    Publication Date: 2020-07-11
    Description: We present a detailed analysis of the gaseous component of the Si K edge using high-resolution Chandra spectra of low-mass X-ray binaries. We fit the spectra with a modified version of the ISMabs model, including new photoabsorption cross-sections computed for all Si ionic species. We estimate column densities for Si i, Si ii, Si iii, Si xii, and Si xiii, which trace the warm, intermediate temperature, and hot phases of the Galactic interstellar medium. We find that the ionic fractions of the first two phases are similar. This may be due to the physical state of the plasma determined by the temperature or due to the presence of absorber material in the close vicinity of the sources. Our findings highlight the need for accurate modelling of the gaseous component before attempting to address the solid component.
    Print ISSN: 1745-3925
    Electronic ISSN: 1745-3933
    Topics: Physics
  • 4
    Publication Date: 2020-07-11
    Description: Using state-of-the-art high-resolution fully GPU N-body simulations, we demonstrate for the first time that the infall of a dark matter-rich satellite naturally explains a present black hole offset by subparsecs in M31. Observational data of the tidal features provide stringent constraints on the initial conditions of our simulations. The heating of the central region of M31 by the satellite via dynamical friction entails a significant black hole offset after the first pericentric passage. After having reached its maximum offset, the massive black hole sinks towards the M31 centre due to dynamical friction and it is determined to be offset by subparsecs as derived by observations.
    Print ISSN: 1745-3925
    Electronic ISSN: 1745-3933
    Topics: Physics
  • 5
    Publication Date: 2020-06-12
    Description: In two recent papers published in MNRAS, Namouni and Morais claimed evidence for the interstellar origin of some small Solar system bodies, including: (i) objects in retrograde co-orbital motion with the giant planets and (ii) the highly inclined Centaurs. Here, we discuss the flaws of those papers that invalidate the authors’ conclusions. Numerical simulations backwards in time are not representative of the past evolution of real bodies. Instead, these simulations are only useful as a means to quantify the short dynamical lifetime of the considered bodies and the fast decay of their population. In light of this fast decay, if the observed bodies were the survivors of populations of objects captured from interstellar space in the early Solar system, these populations should have been implausibly large (e.g. about 10 times the current main asteroid belt population for the retrograde co-orbital of Jupiter). More likely, the observed objects are just transient members of a population that is maintained in quasi-steady state by a continuous flux of objects from some parent reservoir in the distant Solar system. We identify in the Halley-type comets and the Oort cloud the most likely sources of retrograde co-orbitals and highly inclined Centaurs.
    Print ISSN: 1745-3925
    Electronic ISSN: 1745-3933
    Topics: Physics
  • 6
    Publication Date: 2020-06-18
    Description: Strong gravitational lensing has been a powerful probe of cosmological models and gravity. To date, constraints in either domain have been obtained separately. We propose a new methodology through which the cosmological model, specifically the Hubble constant, and post-Newtonian parameter can be simultaneously constrained. Using the time-delay cosmography from strong lensing combined with the stellar kinematics of the deflector lens, we demonstrate that the Hubble constant and post-Newtonian parameter are incorporated in two distance ratios that reflect the lensing mass and dynamical mass, respectively. Through the re-analysis of the four publicly released lenses distance posteriors from the H0LiCOW (H0 Lenses in COSMOGRAIL’s Wellspring) collaboration, the simultaneous constraints of Hubble constant and post-Newtonian parameter are obtained. Our results suggest no deviation from the general relativity; $\gamma_{\rm PPN}=0.87^{+0.19}_{-0.17}$ with a Hubble constant that favours the local Universe value, $H_0=73.65^{+1.95}_{-2.26}$ km s$^{-1}$ Mpc$^{-1}$. Finally, we forecast the robustness of gravity tests by using the time-delay strong lensing for constraints we expect in the next few years. We find that the joint constraint from 40 lenses is able to reach the order of 7.7 per cent for the post-Newtonian parameter and 1.4 per cent for the Hubble constant.
    Print ISSN: 1745-3925
    Electronic ISSN: 1745-3933
    Topics: Physics
  • 7
    Publication Date: 2020-07-10
    Description: One of the proposed channels of binary black hole mergers involves dynamical interactions of three black holes. In such scenarios, it is possible that all three black holes merge in a so-called hierarchical merger chain, where two of the black holes merge first and then their remnant subsequently merges with the remaining single black hole. Depending on the dynamical environment, it is possible that both mergers will appear within the observable time window. Here, we perform a search for such merger pairs in the publicly available LIGO and Virgo data from the O1/O2 runs. Using a frequentist p-value assignment statistics, we do not find any significant merger pair candidates, the most significant being the GW170809-GW151012 pair. Assuming no observed candidates in O3/O4, we derive upper limits on merger pairs to be ∼11–110 yr$^{-1}$ Gpc$^{-3}$, corresponding to a rate that relative to the total merger rate is ∼0.1–1.0. From this, we argue that both a detection and a non-detection within the next few years can be used to put useful constraints on some dynamical progenitor models.
    Print ISSN: 1745-3925
    Electronic ISSN: 1745-3933
    Topics: Physics
  • 8
    Publication Date: 2020-07-10
    Description: We present extremely deep upper limits on the radio emission from 4U 1957+11, an X-ray binary that is generally believed to be a persistently accreting black hole that is almost always in the soft state. We discuss a more comprehensive search for Type I bursts than in past work, revealing a stringent upper limit on the burst rate, bolstering the case for a black hole accretor. The lack of detection of this source at the 1.07 μJy/beam noise level indicates jet suppression that is stronger than expected even in the most extreme thin disc models for radio jet production – the radio power here is 1500–3700 times lower than the extrapolation of the hard state radio/X-ray correlation, with the uncertainties depending primarily on the poorly constrained source distance. We also discuss the location and velocity of the source and show that it must have either formed in the halo or with a strong asymmetric natal kick.
    Print ISSN: 1745-3925
    Electronic ISSN: 1745-3933
    Topics: Physics
  • 9
    Publication Date: 2020-06-12
    Description: Swift J004427.3−734801 is an X-ray source in the Small Magellanic Cloud (SMC) that was first discovered as part of the Swift S-CUBED programme in 2020 January. It was not detected in any of the previous 3 yr worth of observations. The accurate positional determination from the X-ray data has permitted an optical counterpart to be identified that has the characteristics of an O9V−B2III star. Evidence for the presence of an infrared excess and significant I-band variability strongly suggests that this is an OBe-type star. Over 17 yr worth of optical monitoring by the OGLE (Optical Gravitational Lensing Experiment) project reveals periods of time in which quasi-periodic optical flares occur at intervals of ∼21.5 d. The X-ray data obtained from the S-CUBED project reveal a very soft spectrum, too soft to be that from accretion on to a neutron star or black hole. It is suggested here that this is a rarely identified Be star–white dwarf binary in the SMC.
    Print ISSN: 1745-3925
    Electronic ISSN: 1745-3933
    Topics: Physics
  • 10
    Publication Date: 2020-07-10
    Description: Gravitational microlensing can detect isolated stellar-mass black holes (BHs), which are believed to be the dominant form of Galactic BHs according to population synthesis models. Previous searches for BH events in microlensing data focused on long time-scale events with significant microlensing parallax detections. Here we show that, although BH events preferentially have long time-scales, the microlensing parallax amplitudes are so small that in most cases the parallax signals cannot be detected statistically significantly. We then identify OGLE-2006-BLG-044 to be a candidate BH event because of its long time-scale and small microlensing parallax. Our findings have implications to future BH searches in microlensing data.
    Print ISSN: 1745-3925
    Electronic ISSN: 1745-3933
    Topics: Physics
  • 11
  • 12
    Publication Date: 2007-06-01
    Print ISSN: 1745-3925
    Electronic ISSN: 1745-3933
    Topics: Physics
  • 13
    Publication Date: 2007-06-01
    Print ISSN: 1745-3925
    Electronic ISSN: 1745-3933
    Topics: Physics
  • 14
    Publication Date: 2015-08-08
    Description: As sequencing becomes cheaper and more widely available, there is a greater need to quickly and effectively analyze large-scale genomic data. While the functionality of AVIA v1.0, whose implementation was based on ANNOVAR, was comparable with other annotation web servers, AVIA v2.0 represents an enhanced web-based server that extends genomic annotations to cell-specific transcripts and protein-level functional annotations. With AVIA’s improved interface, users can better visualize their data, perform comprehensive searches and categorize both coding and non-coding variants. Availability and implementation: AVIA is freely available through the web at http://avia.abcc.ncifcrf.gov. Contact: Hue.Vuong@fnlcr.nih.gov Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 15
    Publication Date: 2015-08-08
    Description: As new methods for multivariate analysis of genome wide association studies become available, it is important to be able to combine results from different cohorts in a meta-analysis. The R package MultiMeta provides an implementation of the inverse-variance-based method for meta-analysis, generalized to an n-dimensional setting. Availability and implementation: The R package MultiMeta can be downloaded from CRAN. Contact: dragana.vuckovic@burlo.trieste.it; vi1@sanger.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 16
    Publication Date: 2015-08-12
    Description: Some of the most dangerous pathogens such as Mycobacterium tuberculosis and Yersinia pestis evolve clonally. This means that little or no recombination occurs between strains belonging to these species. Paradoxically, although different members of these species show extreme sequence similarity of orthologous genes, some show considerable intraspecies phenotypic variation, the source of which remains elusive. To examine the possible sources of phenotypic variation within clonal pathogenic bacterial species, we carried out an extensive genomic and pan-genomic analysis of the sources of genetic variation available to a large collection of clonal and nonclonal pathogenic bacterial species. We show that while nonclonal species diversify through a combination of changes to gene sequences, gene loss and gene gain, gene loss completely dominates as a source of genetic variation within clonal species. Indeed, gene loss is so prevalent within clonal species as to lead to levels of gene content variation comparable to those found in some nonclonal species that are much more diverged in their gene sequences and that acquire a substantial number of genes horizontally. Gene loss therefore needs to be taken into account as a potential dominant source of phenotypic variation within clonal bacterial species.
    Electronic ISSN: 1759-6653
    Topics: Biology
  • 17
    Publication Date: 2015-08-12
    Description: Obligate bacterial symbionts are widespread in many invertebrates, where they are often confined to specialized host cells and are transmitted directly from mother to progeny. Increasing numbers of these bacteria are being characterized but questions remain about their population structure and evolution. Here we take a comparative genomics approach to investigate two prominent bacterial symbionts (BFo1 and BFo2) isolated from geographically separated populations of western flower thrips, Frankliniella occidentalis. Our multifaceted approach to classifying these symbionts includes concatenated multilocus sequence analysis (MLSA) phylogenies, ribosomal multilocus sequence typing (rMLST), construction of whole-genome phylogenies, and in-depth genomic comparisons. We showed that the BFo1 genome clusters more closely to species in the genus Erwinia, and is a putative close relative to Erwinia aphidicola. BFo1 is also likely to have shared a common ancestor with Erwinia pyrifoliae/Erwinia amylovora and the nonpathogenic Erwinia tasmaniensis and genetic traits similar to Erwinia billingiae. The BFo1 genome contained virulence factors found in the genus Erwinia but represented a divergent lineage. In contrast, we showed that BFo2 belongs within the Enterobacteriales but does not group closely with any currently known bacterial species. Concatenated MLSA phylogenies indicate that it may have shared a common ancestor to the Erwinia and Pantoea genera, and based on the clustering of rMLST genes, it was most closely related to Pantoea ananatis but represented a divergent lineage. We reconstructed a core genome of a putative common ancestor of Erwinia and Pantoea and compared this with the genomes of BFo bacteria. BFo2 possessed none of the virulence determinants that were omnipresent in the Erwinia and Pantoea genera. Taken together, these data are consistent with BFo2 representing a highly novel species that may be related to known Pantoea.
    Electronic ISSN: 1759-6653
    Topics: Biology
  • 18
    Publication Date: 2015-08-14
    Description: We present a sub-100 pc-scale analysis of the CO molecular gas emission and kinematics of the gravitational lens system SDP.81 at redshift 3.042 using Atacama Large Millimetre/submillimetre Array (ALMA) science verification data and a visibility-plane lens reconstruction technique. We find clear evidence for an excitation-dependent structure in the unlensed molecular gas distribution, with emission in CO (5–4) being significantly more diffuse and structured than in CO (8–7). The intrinsic line luminosity ratio is $r_{8-7/5-4} = 0.30 \pm 0.04$, which is consistent with other low-excitation starbursts at z ~ 3. An analysis of the velocity fields shows evidence for a star-forming disc with multiple velocity components that is consistent with a merger/post-coalescence merger scenario, and a dynamical mass of $M(<1.56\,{\rm kpc}) = (1.6 \pm 0.6) \times 10^{10}\,M_\odot$. Source reconstructions from ALMA and the Hubble Space Telescope show that the stellar component is offset from the molecular gas and dust components. Together with Karl G. Jansky Very Large Array CO (1–0) data, they provide corroborative evidence for a complex ~2 kpc-scale starburst that is embedded within a larger ~15 kpc structure.
    Print ISSN: 1745-3925
    Electronic ISSN: 1745-3933
    Topics: Physics
  • 19
    Publication Date: 2015-08-16
    Description: Gene expression evolution occurs through changes in cis- or trans-regulatory elements or both. Interactions between transcription factors (TFs) and their binding sites (TFBSs) constitute one of the most important points where these two regulatory components intersect. In this study, we investigated the evolution of TFBSs in the promoter regions of different Saccharomyces strains and species. We divided the promoter of a gene into the proximal region and the distal region, which are defined, respectively, as the 200-bp region upstream of the transcription starting site and as the 200-bp region upstream of the proximal region. We found that the predicted TFBSs in the proximal promoter regions tend to be evolutionarily more conserved than those in the distal promoter regions. Additionally, Saccharomyces cerevisiae strains used in the fermentation of alcoholic drinks have experienced more TFBS losses than gains compared with strains from other environments (wild strains, laboratory strains, and clinical strains). We also showed that differences in TFBSs correlate with the cis component of gene expression evolution between species (comparing S. cerevisiae and its sister species Saccharomyces paradoxus) and within species (comparing two closely related S. cerevisiae strains).
    Electronic ISSN: 1759-6653
    Topics: Biology
  • 20
    Publication Date: 2015-08-16
    Description: Gene duplication is a key factor contributing to phenotype diversity across and within species. Although the availability of complete genomes has led to the extensive study of genomic duplications, the dynamics and variability of gene duplications mediated by retrotransposition are not well understood. Here, we predict mRNA retrotransposition and use comparative genomics to investigate their origin and variability across primates. Analyzing seven anthropoid primate genomes, we found a similar number of mRNA retrotranspositions (~7,500 retrocopies) in Catarrhini (Old World Monkeys, including humans), but a surprisingly large number of retrocopies (~10,000) in Platyrrhini (New World Monkeys), which may be a by-product of higher long interspersed nuclear element 1 activity in these genomes. By inferring retrocopy orthology, we dated most of the primate retrocopy origins, and estimated a decrease in the fixation rate in recent primate history, implying a smaller number of species-specific retrocopies. Moreover, using RNA-Seq data, we identified approximately 3,600 expressed retrocopies. As expected, most of these retrocopies are located near or within known genes, present tissue-specific and even species-specific expression patterns, and no expression correlation to their parental genes. Taken together, our results provide further evidence that mRNA retrotransposition is an active mechanism in primate evolution and suggest that retrocopies may not only introduce great genetic variability between lineages but also create a large reservoir of potentially functional new genomic loci in primate genomes.
    Electronic ISSN: 1759-6653
    Topics: Biology
  • 21
    Publication Date: 2015-08-06
    Description: Viruses rely completely on the hosts’ machinery for translation of viral transcripts. However, for most viruses infecting humans, codon usage preferences (CUPrefs) do not match those of the host. Human papillomaviruses (HPVs) are a showcase to tackle this paradox: they present a large genotypic diversity and a broad range of phenotypic presentations, from asymptomatic infections to productive lesions and cancer. By applying phylogenetic inference and dimensionality reduction methods, we demonstrate first that genes in HPVs are poorly adapted to the average human CUPrefs, the only exception being capsid genes in viruses causing productive lesions. Phylogenetic relationships between HPVs explained only a small proportion of CUPrefs variation. Instead, the most important explanatory factor for viral CUPrefs was infection phenotype, as orthologous genes in viruses with similar clinical presentation displayed similar CUPrefs. Moreover, viral genes with similar spatiotemporal expression patterns also showed similar CUPrefs. Our results suggest that CUPrefs in HPVs reflect either variations in the mutation bias or differential selection pressures depending on the clinical presentation and expression timing. We propose that poor viral CUPrefs may be central to a trade-off between strong viral gene expression and the potential for eliciting protective immune response.
    Electronic ISSN: 1759-6653
    Topics: Biology
  • 22
    Publication Date: 2015-08-08
    Description: Motivation: Stem cell differentiation is largely guided by master transcriptional regulators, but it also depends on the expression of other types of genes, such as cell cycle genes, signaling genes, metabolic genes, trafficking genes, etc. Traditional approaches to understanding gene expression patterns across multiple conditions, such as principal components analysis or K-means clustering, can group cell types based on gene expression, but they do so without knowledge of the differentiation hierarchy. Hierarchical clustering can organize cell types into a tree, but in general this tree is different from the differentiation hierarchy itself. Methods: Given the differentiation hierarchy and gene expression data at each node, we construct a weighted Euclidean distance metric such that the minimum spanning tree with respect to that metric is precisely the given differentiation hierarchy. We provide a set of linear constraints that are provably sufficient for the desired construction and a linear programming approach to identify sparse sets of weights, effectively identifying genes that are most relevant for discriminating different parts of the tree. Results: We apply our method to microarray gene expression data describing 38 cell types in the hematopoiesis hierarchy, constructing a weighted Euclidean metric that uses just 175 genes. However, we find that there are many alternative sets of weights that satisfy the linear constraints. Thus, in the style of random-forest training, we also construct metrics based on random subsets of the genes and compare them to the metric of 175 genes. We then report on the selected genes and their biological functions. Our approach offers a new way to identify genes that may have important roles in stem cell differentiation. Contact: tperkins@ohri.ca Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 23
    Publication Date: 2015-08-08
    Description: Motivation: Principal component analysis (PCA) is a basic tool often used in bioinformatics for visualization and dimension reduction. However, it is known that PCA may not consistently estimate the true direction of maximal variability in high-dimensional, low sample size settings, which are typical for molecular data. Assuming that the underlying signal is sparse, i.e. that only a fraction of features contribute to a principal component (PC), this estimation consistency can be retained. Most existing sparse PCA methods use L1-penalization, i.e. the lasso, to perform feature selection. But, the lasso is known to lack variable selection consistency in high dimensions and therefore a subsequent interpretation of selected features can give misleading results. Results: We present S4VDPCA, a sparse PCA method that incorporates a subsampling approach, namely stability selection. S4VDPCA can consistently select the truly relevant variables contributing to a sparse PC while also consistently estimate the direction of maximal variability. The performance of the S4VDPCA is assessed in a simulation study and compared to other PCA approaches, as well as to a hypothetical oracle PCA that ‘knows’ the truly relevant features in advance and thus finds optimal, unbiased sparse PCs. S4VDPCA is computationally efficient and performs best in simulations regarding parameter estimation consistency and feature selection consistency. Furthermore, S4VDPCA is applied to a publicly available gene expression data set of medulloblastoma brain tumors. Features contributing to the first two estimated sparse PCs represent genes significantly over-represented in pathways typically deregulated between molecular subgroups of medulloblastoma. Availability and implementation: Software is available at https://github.com/mwsill/s4vdpca. Contact: m.sill@dkfz.de Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 24
    Publication Date: 2015-08-08
    Description: Motivation: Glycans play critical roles in many biological processes, and their structural diversity is key for specific protein-glycan recognition. Comparative structural studies of biological molecules provide useful insight into their biological relationships. However, most computational tools are designed for protein structure, and despite their importance, there is no currently available tool for comparing glycan structures in a sequence order- and size-independent manner. Results: A novel method, GS-align, is developed for glycan structure alignment and similarity measurement. GS-align generates possible alignments between two glycan structures through iterative maximum clique search and fragment superposition. The optimal alignment is then determined by the maximum structural similarity score, GS-score, which is size-independent. Benchmark tests against the Protein Data Bank (PDB) N-linked glycan library and PDB homologous/non-homologous N-glycoprotein sets indicate that GS-align is a robust computational tool to align glycan structures and quantify their structural similarity. GS-align is also applied to template-based glycan structure prediction and monosaccharide substitution matrix generation to illustrate its utility. Availability and implementation: http://www.glycanstructure.org/gsalign. Contact: wonpil@ku.edu Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 25
    Publication Date: 2015-08-08
    Description: Motivation: Impedance-based technologies are advancing methods for measuring proliferation of adherent cell cultures non-invasively and in real time. The analysis of the resulting data has so far been hampered by inappropriate computational methods and the lack of systematic data to evaluate the characteristics of the assay. Results: We used a commercially available system for impedance-based growth measurement (xCELLigence) and compared the reported cell index with data from microscopy. We found that the measured signal correlates linearly with the cell number throughout the time of an experiment with sufficient accuracy in subconfluent cell cultures. The resulting growth curves for various colon cancer cells could be well described with the empirical Richards growth model, which allows for extracting quantitative parameters (such as characteristic cycle times). We found that frequently used readouts like the cell index at a specific time or the area under the growth curve cannot be used to faithfully characterize growth inhibition. We propose to calculate the average growth rate of selected time intervals to accurately estimate time-dependent IC50 values of drugs from growth curves. Contact: nils.bluethgen@charite.de Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
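The interval-based readout proposed in the abstract above can be illustrated with a short calculation. This is a minimal sketch, assuming the average growth rate over an interval is the difference in log signal divided by the elapsed time; the function names and the example cell-index values are hypothetical and are not taken from the paper.

```python
import math


def average_growth_rate(t1, n1, t2, n2):
    """Mean exponential growth rate (per unit time) between two time points.

    Assumes the signal (e.g. a cell index) is proportional to cell number,
    so the rate is (ln n2 - ln n1) / (t2 - t1).
    """
    if t2 <= t1:
        raise ValueError("t2 must be later than t1")
    if n1 <= 0 or n2 <= 0:
        raise ValueError("signal values must be positive")
    return (math.log(n2) - math.log(n1)) / (t2 - t1)


def doubling_time(rate):
    """Doubling time implied by a positive exponential growth rate."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return math.log(2) / rate


# Hypothetical readings: cell index 1.0 at 24 h and 4.0 at 72 h.
rate = average_growth_rate(24.0, 1.0, 72.0, 4.0)
print(rate, doubling_time(rate))
```

Comparing such rates between treated and untreated cultures over the same interval is one way a time-dependent inhibition estimate could be built; the exact procedure used by the authors is not stated in this record.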
  • 26
    Publication Date: 2015-08-24
    Description: Regardless of the physical origin of stellar magnetic fields (fossil or dynamo induced), an inclination angle between the magnetic and rotation axes is very often observed. The absence of observational evidence of such an inclination in the solar case has led to the general assumption that the Sun's global magnetic field and rotation axes are well aligned. We present the detection of a monthly periodic signal of the photospheric solar magnetic field at all latitudes, and especially near the poles, revealing that the main axis of the Sun's magnetic field is not aligned with the surface rotation axis. This result reinforces the view of our Sun as a common intermediate-mass star. Furthermore, this detection challenges modern solar dynamo theories and imposes a strong observational constraint on them.
    Print ISSN: 1745-3925
    Electronic ISSN: 1745-3933
    Topics: Physics
  • 27
    Publication Date: 2015-08-25
    Description: Motivation: The storage and transmission of high-throughput sequencing data consumes significant resources. As our capacity to produce such data continues to increase, this burden will only grow. One approach to reduce storage and transmission requirements is to compress this sequencing data. Results: We present a novel technique to boost the compression of sequencing data, based on the concept of bucketing similar reads so that they appear nearby in the file. We demonstrate that, by adopting a data-dependent bucketing scheme and employing a number of encoding ideas, we can achieve substantially better compression ratios than existing de novo sequence compression tools, including other bucketing and reordering schemes. Our method, Mince, achieves up to a 45% reduction in file sizes (28% on average) compared with existing state-of-the-art de novo compression schemes. Availability and implementation: Mince is written in C++11, is open source and has been made available under the GPLv3 license. It is available at http://www.cs.cmu.edu/~ckingsf/software/mince . Contact: carlk@cs.cmu.edu Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
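The idea of bucketing similar reads so that they sit next to each other can be sketched as follows. This is an illustration only, not Mince's actual algorithm: it assumes reads are grouped by their lexicographically smallest k-mer (a minimizer), and the function names, the value of k and the example reads are all hypothetical.

```python
from collections import defaultdict


def minimizer(read, k):
    """Lexicographically smallest k-mer of a read (the read itself if shorter than k)."""
    if len(read) < k:
        return read
    return min(read[i:i + k] for i in range(len(read) - k + 1))


def bucket_reads(reads, k=4):
    """Group reads by minimizer and return them reordered bucket by bucket."""
    buckets = defaultdict(list)
    for read in reads:
        buckets[minimizer(read, k)].append(read)
    # Emit buckets in sorted key order so reads sharing a k-mer become adjacent.
    reordered = [read for key in sorted(buckets) for read in buckets[key]]
    return reordered, dict(buckets)


reads = ["ACGTTT", "GGGGCC", "TTACGT", "CCGGGG"]
reordered, buckets = bucket_reads(reads, k=4)
print(reordered)
```

Reads that share a k-mer end up adjacent, which tends to help a downstream general-purpose compressor find repeated substrings; any encoding applied within each bucket is outside the scope of this sketch.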
  • 28
    Publication Date: 2015-08-25
    Description: Current methods for motif discovery from chromatin immunoprecipitation followed by sequencing (ChIP-seq) data often identify non-targeted transcription factor (TF) motifs, and are even further limited when peak sequences are similar due to common ancestry rather than common binding factors. The latter aspect particularly affects a large number of proteins from the Cys2His2 zinc finger (C2H2-ZF) class of TFs, as their binding sites are often dominated by endogenous retroelements that have highly similar sequences. Here, we present recognition code-assisted discovery of regulatory elements (RCADE) for motif discovery from C2H2-ZF ChIP-seq data. RCADE combines predictions from a DNA recognition code of C2H2-ZFs with ChIP-seq data to identify models that represent the genuine DNA binding preferences of C2H2-ZF proteins. We show that RCADE is able to identify generalizable binding models even from peaks that are exclusively located within the repeat regions of the genome, where state-of-the-art motif finding approaches largely fail. Availability and implementation: RCADE is available as a webserver and also for download at http://rcade.ccbr.utoronto.ca/ . Supplementary information: Supplementary data are available at Bioinformatics online. Contact: t.hughes@utoronto.ca
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 29
    Publication Date: 2015-08-25
    Description: Motivation: Phylogenetic estimates from published studies can be archived using general platforms like Dryad (Vision, 2010) or TreeBASE (Sanderson et al., 1994). Such services fulfill a crucial role in ensuring transparency and reproducibility in phylogenetic research. However, digital tree data files often require some editing (e.g. rerooting) to improve the accuracy and reusability of the phylogenetic statements. Furthermore, establishing the mapping between tip labels used in a tree and taxa in a single common taxonomy dramatically improves the ability of other researchers to reuse phylogenetic estimates. As the process of curating a published phylogenetic estimate is not error-free, retaining a full record of the provenance of edits to a tree is crucial for openness, allowing editors to receive credit for their work and making errors introduced during curation easier to correct. Results: Here, we report the development of software infrastructure to support the open curation of phylogenetic data by the community of biologists. The backend of the system provides an interface for the standard database operations of creating, reading, updating and deleting records by making commits to a git repository. The record of the history of edits to a tree is preserved by git’s version control features. Hosting this data store on GitHub ( http://github.com/ ) provides open access to the data store using tools familiar to many developers. We have deployed a server running the ‘phylesystem-api’, which wraps the interactions with git and GitHub. The Open Tree of Life project has also developed and deployed a JavaScript application that uses the phylesystem-api and other web services to enable input and curation of published phylogenetic statements. Availability and implementation: Source code for the web service layer is available at https://github.com/OpenTreeOfLife/phylesystem-api . The data store can be cloned from: https://github.com/OpenTreeOfLife/phylesystem . A web application that uses the phylesystem web services is deployed at http://tree.opentreeoflife.org/curator . Code for that tool is available from https://github.com/OpenTreeOfLife/opentree . Contact: mtholder@gmail.com
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 30
    Publication Date: 2015-08-25
    Description: Motivation: Chromatin Immunoprecipitation followed by sequencing (ChIP-seq) detects genome-wide DNA–protein interactions and chromatin modifications, returning enriched regions (ERs), usually associated with a significance score. Moderately significant interactions can correspond to true, weak interactions, or to false positives; replicates of a ChIP-seq experiment can provide co-localised evidence to decide between the two cases. We designed a general methodological framework to rigorously combine the evidence of ERs in ChIP-seq replicates, with the option to set a significance threshold on the repeated evidence and a minimum number of samples bearing this evidence. Results: We applied our method to Myc transcription factor ChIP-seq datasets in K562 cells available in the ENCODE project. Using replicates, we could increase the number of ERs up to 3-fold relative to single-sample analysis with an equivalent significance threshold. We validated the ‘rescued’ ERs by checking for the overlap with open chromatin regions and for the enrichment of the motif that Myc binds with strongest affinity; we compared our results with alternative methods (IDR and jMOSAiCS), obtaining more validated peaks than the former and fewer peaks than the latter, but with a better validation. Availability and implementation: An implementation of the proposed method and its source code under GPLv3 license are freely available at http://www.bioinformatics.deib.polimi.it/MSPC/ and http://mspc.codeplex.com/ , respectively. Contact: marco.morelli@iit.it Supplementary information: Supplementary Material is available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
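Combining the significance of overlapping enriched regions across replicates can be sketched with a standard p-value combination rule. The record above does not name the rule the authors use, so this example assumes Fisher's combined probability test and independent replicates; the function name, the minimum-replicate parameter and the threshold are hypothetical.

```python
import math


def fisher_combined_pvalue(pvalues):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution with
    2k degrees of freedom under the null; for even degrees of freedom the
    survival function has the closed form exp(-X/2) * sum_{i<k} (X/2)^i / i!.
    """
    k = len(pvalues)
    if k == 0:
        raise ValueError("need at least one p-value")
    if any(p <= 0.0 or p > 1.0 for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    half_stat = -sum(math.log(p) for p in pvalues)  # X / 2
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


def rescued(pvalues, min_replicates=2, threshold=1e-3):
    """Keep a region if enough replicates support it and the combined p-value passes."""
    return len(pvalues) >= min_replicates and fisher_combined_pvalue(pvalues) <= threshold


# Two moderately significant overlapping peaks from two replicates.
print(fisher_combined_pvalue([0.05, 0.05]))
```

Two weak peaks at p = 0.05 combine to roughly p = 0.017, which shows how repeated moderate evidence can become more convincing than either replicate alone.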
  • 31
    Publication Date: 2015-08-25
    Description: We announce the release of kSNP3.0, a program for SNP identification and phylogenetic analysis without genome alignment or the requirement for reference genomes. kSNP3.0 is a significantly improved version of kSNP v2. Availability and implementation: kSNP3.0 is implemented as a package of stand-alone executables for Linux and Mac OS X under the open-source BSD license. The executable packages, source code and a full User Guide are freely available at https://sourceforge.net/projects/ksnp/files/ Contact: barryghall@gmail.com
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 32
    Publication Date: 2015-08-25
    Description: Motivation: We have created an R package named phylogeo that provides a set of geographic utilities for sequencing-based microbial ecology studies. Although the geographic location of samples is an important aspect of environmental microbiology, none of the major software packages used in processing microbiome data include utilities that allow users to map and explore the spatial dimension of their data. phylogeo solves this problem by providing a set of plotting and mapping functions that can be used to visualize the geographic distribution of samples, to look at the relatedness of microbiomes using ecological distance, and to map the geographic distribution of particular sequences. By extending the popular phyloseq package and using the same data structures and command formats, phylogeo allows users to easily map and explore the geographic dimensions of their data from the R programming language. Availability and Implementation: phylogeo is documented and freely available at http://zachcp.github.io/phylogeo Contact: zcharlop@rockefeller.edu
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 33
    Publication Date: 2015-08-25
    Description: Gener is a development module for programming chemical controllers based on DNA strand displacement. Gener is developed with the aim of providing a simple interface that minimizes the opportunities for programming errors: Gener allows the user to test the computations of the DNA programs based on a simple two-domain strand displacement algebra, the minimal available so far. The tool allows the user to perform stepwise computations with respect to the rules of the algebra as well as exhaustive search of the computation space with different options for exploration and visualization. Gener can be used in combination with existing tools, and in particular, its programs can be exported to Microsoft Research’s DSD tool as well as to LaTeX. Availability and implementation: Gener is available for download at the Cosbi website at http://www.cosbi.eu/research/prototypes/gener as a Windows executable that can be run on Mac OS X and Linux by using Mono. Contact: ozan@cosbi.eu
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 34
    Publication Date: 2015-08-25
    Description: Motivation: Molecular dynamics simulations provide atomic insight into the physicochemical characteristics of lipid membranes and hence, a wide range of force field families capable of modelling various lipid types have been developed in recent years. To model membranes in a biologically realistic lipid composition, simulation systems containing multiple different lipids must be assembled. Results: We present a new web service called MemGen that is capable of setting up simulation systems of heterogeneous lipid membranes. MemGen is not restricted to certain lipid force fields or lipid types, but instead builds membranes from uploaded structure files which may contain any kind of amphiphilic molecule. MemGen works with any all-atom or united-atom lipid representation. Availability and implementation: MemGen is freely available without registration at http://memgen.uni-goettingen.de . Contact: jhub@gwdg.de Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 35
    Publication Date: 2015-08-21
    Description: We use the ‘Evolution and Assembly of GaLaxies and their Environments’ (EAGLE) suite of hydrodynamical cosmological simulations to measure offsets between the centres of stellar and dark matter components of galaxies. We find that the vast majority (>95 per cent) of the simulated galaxies display an offset smaller than the gravitational softening length of the simulations (Plummer-equivalent ε = 700 pc), both for field galaxies and satellites in clusters and groups. We also find no systematic trailing or leading of the dark matter along a galaxy's direction of motion. The offsets are consistent with being randomly drawn from a Maxwellian distribution with σ ≤ 196 pc. Since astrophysical effects produce no feasible analogues for the $1.62^{+0.47}_{-0.49}$ kpc offset recently observed in Abell 3827, the observational result is in tension with the collisionless cold dark matter model assumed in our simulations.
    Print ISSN: 1745-3925
    Electronic ISSN: 1745-3933
    Topics: Physics
  • 36
    Publication Date: 2015-08-24
    Description: The solar wind magnetic field contains rotations at a broad range of scales, which have been extensively studied in the magnetohydrodynamics range. Here, we present an extension of this analysis to the range between ion and electron kinetic scales. The distribution of rotation angles was found to be approximately lognormal, shifting to smaller angles at smaller scales almost self-similarly, but with small, statistically significant changes of shape. The fraction of energy in fluctuations with angles larger than α was found to drop approximately exponentially with α, with e-folding angle 9.8° at ion scales and 0.66° at electron scales, showing that large angles (α > 30°) do not contain a significant amount of energy at kinetic scales. Implications for kinetic turbulence theory and the dissipation of solar wind turbulence are discussed.
    Print ISSN: 1745-3925
    Electronic ISSN: 1745-3933
    Topics: Physics
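The statement that angles above 30° carry little energy follows directly from the quoted e-folding angles. The short calculation below is a sketch that assumes a pure exponential, F(α) = exp(−α/α₀), normalised to 1 at α = 0; the abstract only says the drop is approximately exponential, so the numbers are order-of-magnitude illustrations, and the function name is hypothetical.

```python
import math


def energy_fraction_above(alpha_deg, efold_deg):
    """Fraction of fluctuation energy at rotation angles larger than alpha_deg,
    assuming an exponential fall-off with e-folding angle efold_deg."""
    if alpha_deg < 0 or efold_deg <= 0:
        raise ValueError("alpha must be non-negative and the e-folding angle positive")
    return math.exp(-alpha_deg / efold_deg)


ion = energy_fraction_above(30.0, 9.8)        # e-folding angle at ion scales
electron = energy_fraction_above(30.0, 0.66)  # e-folding angle at electron scales
print(f"ion scales: {ion:.3f}, electron scales: {electron:.1e}")
```

Under this assumption only a few per cent of the energy lies above 30° at ion scales, and a negligible fraction at electron scales.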
  • 37
    Publication Date: 2015-08-25
    Description: Motivation: In practice, identifying and interpreting the functional impacts of the regulatory relationships between micro-RNA and messenger-RNA is non-trivial. The sheer scale of possible micro-RNA and messenger-RNA interactions can make the interpretation of results difficult. Results: We propose a supervised framework, pMim, built upon concepts of significance combination, for jointly ranking regulatory micro-RNA and their potential functional impacts with respect to a condition of interest. Here, pMim directly tests if a micro-RNA is differentially expressed and if its predicted targets, which lie in a common biological pathway, have changed in the opposite direction. We leverage the information within existing micro-RNA target and pathway databases to stabilize the estimation and annotation of micro-RNA regulation making our approach suitable for datasets with small sample sizes. In addition to outputting meaningful and interpretable results, we demonstrate in a variety of datasets that the micro-RNA identified by pMim, in comparison to simpler existing approaches, are also more concordant with what is described in the literature. Availability and implementation: This framework is implemented as an R function, pMim , in the package sydSeq available from http://www.ellispatrick.com/r-packages . Contact: jean.yang@sydney.edu.au Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 38
    Publication Date: 2015-08-25
    Description: Motivation: Cellular mRNA levels originate from the combined action of multiple regulatory processes, which can be recapitulated by the rates of pre-mRNA synthesis, pre-mRNA processing and mRNA degradation. Recent experimental and computational advances set the basis to study these intertwined levels of regulation. Nevertheless, software for the comprehensive quantification of RNA dynamics is still lacking. Results: INSPEcT is an R package for the integrative analysis of RNA- and 4sU-seq data to study the dynamics of transcriptional regulation. INSPEcT provides gene-level quantification of these rates, and a modeling framework to identify which of these regulatory processes are most likely to explain the observed mRNA and pre-mRNA concentrations. Software performance is tested on a synthetic dataset, instrumental to guide the choice of the modeling parameters and the experimental design. Availability and implementation: INSPEcT is submitted to Bioconductor and is currently available as Supplementary Additional File S1 . Contact: mattia.pelizzola@iit.it Supplementary Information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 39
    Publication Date: 2015-08-25
    Description: Motivation: Experimentally determined gene regulatory networks can be enriched by computational inference from high-throughput expression profiles. However, the prediction of regulatory interactions is severely impaired by indirect and spurious effects, particularly for eukaryotes. Recently published methods report improved predictions by exploiting the a priori known targets of a regulator (its local topology) in addition to expression profiles. Results: We find that methods exploiting known targets show an unexpectedly high rate of false discoveries. This leads to inflated performance estimates and the prediction of an excessive number of new interactions for regulators with many known targets. These issues are hidden from common evaluation and cross-validation setups because of Simpson’s paradox. We suggest a confidence score recalibration method (CoRe) that reduces the false discovery rate and enables a reliable performance estimation. Conclusions: CoRe considerably improves the results of network inference methods that exploit known targets. Predictions then display the biological process specificity of regulators more correctly and enable the inference of accurate genome-wide regulatory networks in eukaryotes. For yeast, we propose a network with more than 22 000 confident interactions. We point out that machine learning approaches outside of the area of network inference may be affected as well. Availability and implementation: Results, executable code and networks are available via our website http://www.bio.ifi.lmu.de/forschung/CoRe . Contact: robert.kueffner@helmholtz-muenchen.de Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 40
    Publication Date: 2015-08-25
    Description: Motivation: Stoichiometric and constraint-based methods of computational strain design have become an important tool for rational metabolic engineering. One of those relies on the concept of constrained minimal cut sets (cMCSs). However, like most other techniques, cMCSs may consider only reaction (or gene) knockouts to achieve a desired phenotype. Results: We generalize the cMCSs approach to constrained regulatory MCSs (cRegMCSs), where up/downregulation of reaction rates can be combined along with reaction deletions. We show that flux up/downregulations can virtually be treated as cuts allowing their direct integration into the algorithmic framework of cMCSs. Because of vastly enlarged search spaces in genome-scale networks, we developed strategies to (optionally) preselect suitable candidates for flux regulation and novel algorithmic techniques to further enhance efficiency and speed of cMCSs calculation. We illustrate the cRegMCSs approach with a simple example network and then apply it to identify strain designs for ethanol production in a genome-scale metabolic model of Escherichia coli. The results clearly show that cRegMCSs combining reaction deletions and flux regulations provide a much larger number of suitable strain designs, many of which are significantly smaller relative to cMCSs involving only knockouts. Furthermore, with cRegMCSs, one may also enable the fine tuning of desired behaviours in a narrower range. The new cRegMCSs approach may thus accelerate the implementation of model-based strain designs for the bio-based production of fuels and chemicals. Availability and implementation: MATLAB code and the examples can be downloaded at http://www.mpi-magdeburg.mpg.de/projects/cna/etcdownloads.html . Contact: krishna.mahadevan@utoronto.ca or klamt@mpi-magdeburg.mpg.de Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 41
    Publication Date: 2015-08-25
    Description: Motivation: Lipids are a large and diverse group of biological molecules with roles in membrane formation, energy storage and signaling. Cellular lipidomes may contain tens of thousands of structures, a staggering degree of complexity whose significance is not yet fully understood. High-throughput mass spectrometry-based platforms provide a means to study this complexity, but the interpretation of lipidomic data and its integration with prior knowledge of lipid biology suffers from a lack of appropriate tools to manage the data and extract knowledge from it. Results: To facilitate the description and exploration of lipidomic data and its integration with prior biological knowledge, we have developed a knowledge resource for lipids and their biology—SwissLipids. SwissLipids provides curated knowledge of lipid structures and metabolism which is used to generate an in silico library of feasible lipid structures. These are arranged in a hierarchical classification that links mass spectrometry analytical outputs to all possible lipid structures, metabolic reactions and enzymes. SwissLipids provides a reference namespace for lipidomic data publication, data exploration and hypothesis generation. The current version of SwissLipids includes over 244 000 known and theoretically possible lipid structures, over 800 proteins, and curated links to published knowledge from over 620 peer-reviewed publications. We are continually updating the SwissLipids hierarchy with new lipid categories and new expert curated knowledge. Availability: SwissLipids is freely available at http://www.swisslipids.org/ . Contact: alan.bridge@isb-sib.ch Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 42
    Publication Date: 2015-08-25
    Description: Motivation: Both the quantitative real-time polymerase chain reaction (qPCR) and quantitative isothermal amplification (qIA) are standard methods for nucleic acid quantification. Numerous real-time read-out technologies have been developed. Despite the continuous interest in amplification-based techniques, there are only a few tools for pre-processing of amplification data. However, a transparent tool for precise control of raw data is indispensable in several scenarios, for example, during the development of new instruments. Results: chipPCR is an R package for the pre-processing and quality analysis of raw data of amplification curves. The package takes advantage of R’s S4 object model and offers an extensible environment. chipPCR contains tools for raw data exploration: normalization, baselining, imputation of missing values, a powerful wrapper for amplification curve smoothing and a function to detect the start and end of an amplification curve. The capabilities of the software are enhanced by the implementation of algorithms unavailable in R, such as a 5-point stencil for derivative interpolation. Simulation tools, statistical tests, plots for data quality management, amplification efficiency/quantification cycle calculation, and datasets from qPCR and qIA experiments are part of the package. Core functionalities are integrated in GUIs (web-based and standalone shiny applications), thus streamlining analysis and report generation. Availability and implementation: http://cran.r-project.org/web/packages/chipPCR . Source code: https://github.com/michbur/chipPCR . Contact: stefan.roediger@b-tu.de Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 43
    Publication Date: 2015-08-25
    Description: A key to understanding RNA function is to uncover its complex 3D structure. Experimental methods used for determining RNA 3D structures are technologically challenging and laborious, which makes the development of computational prediction methods of substantial interest. Previously, we developed the iFoldRNA server that allows accurate prediction of short (<50 nt) tertiary RNA structures starting from primary sequences. Here, we present a new version of the iFoldRNA server that permits the prediction of tertiary structure of RNAs as long as a few hundred nucleotides. This substantial increase in the server capacity is achieved by utilization of experimental information such as base-pairing and hydroxyl-radical probing. We demonstrate a significant benefit provided by integration of experimental data and computational methods. Availability and implementation: http://ifoldrna.dokhlab.org Contact: dokh@unc.eu
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 44
    Publication Date: 2015-08-25
    Description: The ms-data-core-api is a free, open-source library for developing computational proteomics tools and pipelines. The Application Programming Interface, written in Java, enables rapid tool creation by providing a robust, pluggable programming interface and common data model. The data model is based on controlled vocabularies/ontologies and captures the whole range of data types included in common proteomics experimental workflows, going from spectra to peptide/protein identifications to quantitative results. The library contains readers for three of the most used Proteomics Standards Initiative standard file formats: mzML, mzIdentML, and mzTab. In addition to mzML, it also supports other common mass spectra data formats: dta, ms2, mgf, pkl, apl (text-based), mzXML and mzData (XML-based). Also, it can be used to read PRIDE XML, the original format used by the PRIDE database, one of the world-leading proteomics resources. Finally, we present a set of algorithms and tools whose implementation illustrates the simplicity of developing applications using the library. Availability and implementation: The software is freely available at https://github.com/PRIDE-Utilities/ms-data-core-api . Supplementary information: Supplementary data are available at Bioinformatics online. Contact: juan@ebi.ac.uk
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 45
    Publication Date: 2015-08-25
    Description: Scanning probe microscopy (SPM) is already a relevant tool in biological research at the nanoscale. We present ‘Flatten plus’, a recent and helpful implementation in the well-known WSxM free software package. ‘Flatten plus’ reduces low-frequency noise in SPM images in a semi-automated way, preventing the appearance of typical artifacts associated with such filters. Availability and implementation: WSxM is free software implemented in C++ and supported on MS Windows, but it can also be run under Mac or Linux using emulators such as Wine or Parallels. WSxM can be downloaded from http://www.wsxmsolutions.com/ . Contact: ignacio.horcas@wsxmsolutions.com
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 46
    Publication Date: 2015-08-25
    Description: Despite the plethora of methods available for the functional analysis of omics data, obtaining a comprehensive yet detailed understanding of the results remains challenging. This is mainly due to the lack of publicly available tools for the visualization of this type of information. Here we present an R package called GOplot, based on ggplot2, for enhanced graphical representation. Our package takes the output of any general enrichment analysis and generates plots at different levels of detail: from a general overview to identify the most enriched categories (bar plot, bubble plot) to a more detailed view displaying different types of information for molecules in a given set of categories (circle plot, chord plot, cluster plot). The package provides a deeper insight into omics data and allows scientists to generate insightful plots with only a few lines of code to easily communicate the findings. Availability and Implementation: The R package GOplot is available via CRAN (The Comprehensive R Archive Network): http://cran.r-project.org/web/packages/GOplot . The shiny web application of the Venn diagram can be found at: https://wwalter.shinyapps.io/Venn/ . A detailed manual of the package with sample figures can be found at https://wencke.github.io/ Contact: fscabo@cnic.es or mricote@cnic.es
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 47
    Publication Date: 2015-08-25
    Description: Motivation: Horizontal transfer of transposable elements (HTT) among eukaryotes was discovered in the mid-1980s. Since then, >300 new cases have been described. New findings about HTT are revealing the evolutionary impact of this phenomenon on host genomes. In order to provide an up-to-date, interactive and expandable database for such events, we developed the HTT-DB database. Results: HTT-DB allows easy access to most of the HTT cases reported, along with rich information about each case. Moreover, it allows the user to generate tables and graphs based on searches using transposable element and/or host species classification and export them in several formats. Availability and implementation: This database is freely available on the web at http://lpa.saogabriel.unipampa.edu.br:8080/httdatabase . HTT-DB was developed based on Java and MySQL with all major browsers supported. Tools and software packages used are free for personal or non-profit projects. Contact: bdotto82@gmail.com or gabriel.wallau@gmail.com
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 48
    Publication Date: 2015-08-12
    Description: Historically, genome-wide and molecular characterization of the genus Listeria has concentrated on the important human pathogen Listeria monocytogenes and a small number of closely related species, together termed Listeria sensu strictu. More recently, a number of genome sequences for more basal, and nonpathogenic, members of the Listeria genus have become available, facilitating a wider perspective on the evolution of pathogenicity and genome level evolutionary dynamics within the entire genus (termed Listeria sensu lato). Here, we have sequenced the genomes of additional Listeria fleischmannii and Listeria newyorkensis isolates and explored the dynamics of genome evolution in Listeria sensu lato. Our analyses suggest that acquisition of genetic material through gene duplication and divergence as well as through lateral gene transfer (mostly from outside Listeria ) is widespread throughout the genus. Novel genetic material is apparently subject to rapid turnover. Multiple lines of evidence point to significant differences in evolutionary dynamics between the most basal Listeria subclade and all other congeners, including both sensu strictu and other sensu lato isolates. Strikingly, these differences are likely attributable to stochastic, population-level processes and contribute to observed variation in genome size across the genus. Notably, our analyses indicate that the common ancestor of Listeria sensu lato lacked flagella, which were acquired by lateral gene transfer by a common ancestor of Listeria grayi and Listeria sensu strictu, whereas a recently functionally characterized pathogenicity island, responsible for the capacity to produce cobalamin and utilize ethanolamine/propane-2-diol, was acquired in an ancestor of Listeria sensu strictu.
    Electronic ISSN: 1759-6653
    Topics: Biology
  • 49
    Publication Date: 2015-08-12
    Description: Taeniid cestodes (including the human parasites Echinococcus spp. and Taenia solium) have very few mobile genetic elements (MGEs) in their genome, despite lacking a canonical PIWI pathway. The MGEs of these parasites are virtually unexplored, and nothing is known about their expression and silencing. In this work, we report the discovery of a novel family of small nonautonomous long terminal repeat retrotransposons (also known as terminal-repeat retrotransposons in miniature, TRIMs) which we have named ta-TRIM (taeniid TRIM). ta-TRIMs are only the second family of TRIM elements discovered in animals, and are likely the result of convergent reductive evolution in different taxonomic groups. These elements originated at the base of the taeniid tree and have expanded during taeniid diversification, including after the divergence of closely related species such as Echinococcus multilocularis and Echinococcus granulosus. They are massively expressed in larval stages, from a small proportion of full-length copies and from isolated terminal repeats that show transcriptional read-through into downstream regions, generating novel noncoding RNAs and transcriptional fusions to coding genes. In E. multilocularis, ta-TRIMs are specifically expressed in the germinative cells (the somatic stem cells) during asexual reproduction of metacestode larvae. This would provide a developmental mechanism for insertion of ta-TRIMs into cells that will eventually generate the adult germ line. Future studies of active and inactive ta-TRIM elements could give the first clues on MGE silencing mechanisms in cestodes.
    Electronic ISSN: 1759-6653
    Topics: Biology
  • 50
    Publication Date: 2015-08-16
    Description: Even though mitochondrial genomes, which characterize eukaryotic cells, were first discovered more than 50 years ago, mitochondrial genomics remains an important topic in molecular biology and genome sciences. The Phylum Alveolata comprises three major groups (ciliates, apicomplexans, and dinoflagellates), the mitochondrial genomes of which have diverged widely. Even though the gene content of dinoflagellate mitochondrial genomes is reportedly comparable to that of apicomplexans, the highly fragmented and rearranged genome structures of dinoflagellates have frustrated whole genomic analysis. Consequently, noncoding sequences and gene arrangements of dinoflagellate mitochondrial genomes have not been well characterized. Here we report that the continuous assembled genome (~326 kb) of the dinoflagellate, Symbiodinium minutum, is AT-rich (~64.3%) and that it contains three protein-coding genes. Based upon in silico analysis, the remaining 99% of the genome comprises transcriptomic noncoding sequences. RNA edited sites and unique, possible start and stop codons clarify conserved regions among dinoflagellates. Our massive transcriptome analysis shows that almost all regions of the genome are transcribed, including 27 possible fragmented ribosomal RNA genes and 12 uncharacterized small RNAs that are similar to mitochondrial RNA genes of the malarial parasite, Plasmodium falciparum. Gene map comparisons show that gene order is only slightly conserved between S. minutum and P. falciparum. However, small RNAs and intergenic sequences share sequence similarities with P. falciparum, suggesting that the function of noncoding sequences has been preserved despite development of very different genome structures.
    Electronic ISSN: 1759-6653
    Topics: Biology
  • 51
    Publication Date: 2015-08-06
    Description: Evolutionary studies usually use a two-step process to investigate sequence data. Step one estimates a multiple sequence alignment (MSA) and step two applies phylogenetic methods to ask evolutionary questions of that MSA. Modern phylogenetic methods infer evolutionary parameters using maximum likelihood or Bayesian inference, mediated by a probabilistic substitution model that describes sequence change over a tree. The statistical properties of these methods mean that more data directly translates to an increased confidence in downstream results, providing the substitution model is adequate and the MSA is correct. Many studies have investigated the robustness of phylogenetic methods in the presence of substitution model misspecification, but few have examined the statistical properties of those methods when the MSA is unknown. This simulation study examines the statistical properties of the complete two-step process when inferring sequence divergence and the phylogenetic tree topology. Both nucleotide and amino acid analyses are negatively affected by the alignment step, both through inaccurate guide tree estimates and through overfitting to that guide tree. For many alignment tools these effects become more pronounced when additional sequences are added to the analysis. Nucleotide sequences are particularly susceptible, with MSA errors leading to statistical support for long-branch attraction artifacts, which are usually associated with gross substitution model misspecification. Amino acid MSAs are more robust, but do tend to arbitrarily resolve multifurcations in favor of the guide tree. No inference strategies produce consistently accurate estimates of divergence between sequences, although amino acid MSAs are again more accurate than their nucleotide counterparts. We conclude with some practical suggestions about how to limit the effect of MSA uncertainty on evolutionary inference.
    Electronic ISSN: 1759-6653
    Topics: Biology
  • 52
    Publication Date: 2015-08-06
    Description: The evolution of mitochondrial information processing pathways, including replication, transcription and translation, is characterized by the gradual replacement of mitochondrial-encoded proteins with nuclear-encoded counterparts of diverse evolutionary origins. Although the ancestral enzymes involved in mitochondrial transcription and replication have been replaced early in eukaryotic evolution, mitochondrial translation is still carried out by an apparatus largely inherited from the α-proteobacterial ancestor. However, variation in the complement of mitochondrial-encoded molecules involved in translation, including transfer RNAs (tRNAs), provides evidence for the ongoing evolution of mitochondrial protein synthesis. Here, we investigate the evolution of the mitochondrial translational machinery using recent genomic and transcriptomic data from animals that have experienced the loss of mt-tRNAs, including phyla Cnidaria and Ctenophora, as well as some representatives of all four classes of Porifera. We focus on four sets of mitochondrial enzymes that directly interact with tRNAs: aminoacyl-tRNA synthetases, glutamyl-tRNA amidotransferase, tRNA(Ile) lysidine synthetase, and RNase P. Our results support the observation that the fate of nuclear-encoded mitochondrial proteins is influenced by the evolution of molecules encoded in mitochondrial DNA, but in a more complex manner than appreciated previously. The data also suggest that relaxed selection on mitochondrial translation rather than coevolution between mitochondrial and nuclear subunits is responsible for elevated rates of evolution in mitochondrial translational proteins.
    Electronic ISSN: 1759-6653
    Topics: Biology
  • 53
    Publication Date: 2015-08-06
    Description: The expansion of DUF1220 domain copy number during human evolution is a dramatic example of rapid and repeated domain duplication. Although patterns of expression, homology, and disease associations suggest a role in cortical development, this hypothesis has not been robustly tested using phylogenetic methods. Here, we estimate DUF1220 domain counts across 12 primate genomes using a nucleotide Hidden Markov Model. We then test a series of hypotheses designed to examine the potential evolutionary significance of DUF1220 copy number expansion. Our results suggest a robust association with brain size, and more specifically neocortex volume. In contradiction to previous hypotheses, we find a strong association with postnatal brain development but not with prenatal brain development. Our results provide further evidence of a conserved association between specific loci and brain size across primates, suggesting that human brain evolution may have occurred through a continuation of existing processes.
    Electronic ISSN: 1759-6653
    Topics: Biology
  • 54
    Publication Date: 2015-08-08
    Description: Motivation: The majority of variation identified by genome wide association studies falls in non-coding genomic regions and is hypothesized to impact regulatory elements that modulate gene expression. Here we present a statistically rigorous software tool GREGOR (Genomic Regulatory Elements and Gwas Overlap algoRithm) for evaluating enrichment of any set of genetic variants with any set of regulatory features. Using variants from five phenotypes, we describe a data-driven approach to determine the tissue and cell types most relevant to a trait of interest and to identify the subset of regulatory features likely impacted by these variants. Last, we experimentally evaluate six predicted functional variants at six lipid-associated loci and demonstrate significant evidence for allele-specific impact on expression levels. GREGOR systematically evaluates enrichment of genetic variation with the vast collection of regulatory data available to explore novel biological mechanisms of disease and guide us toward the functional variant at trait-associated loci. Availability and implementation: GREGOR, including source code, documentation, examples, and executables, is available at http://genome.sph.umich.edu/wiki/GREGOR . Contact: cristen@umich.edu Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 55
    Publication Date: 2015-08-08
    Description: Motivation: Genome and transcriptome analyses can be used to explore cancers comprehensively, and it is increasingly common to have multiple omics data measured from each individual. Furthermore, there are rich functional data such as predicted impact of mutations on protein coding and gene/protein networks. However, integration of the complex information across the different omics and functional data is still challenging. Clinical validation, particularly based on patient outcomes such as survival, is important for assessing the relevance of the integrated information and for comparing different procedures. Results: An analysis pipeline is built for integrating genomic and transcriptomic alterations from whole-exome and RNA sequence data and functional data from protein function prediction and gene interaction networks. The method accumulates evidence for the functional implications of mutated potential driver genes found within and across patients. A driver-gene score (DGscore) is developed to capture the cumulative effect of such genes. To contribute to the score, a gene has to be frequently mutated, with high or moderate mutational impact at protein level, exhibiting an extreme expression and functionally linked to many differentially expressed neighbors in the functional gene network. The pipeline is applied to 60 matched tumor and normal samples of the same patient from The Cancer Genome Atlas breast-cancer project. In clinical validation, patients with high DGscores have worse survival than those with low scores ( P = 0.001). Furthermore, the DGscore outperforms the established expression-based signatures MammaPrint and PAM50 in predicting patient survival. In conclusion, integration of mutation, expression and functional data allows identification of clinically relevant potential driver genes in cancer. 
    Availability and implementation: The documented pipeline, including annotated sample scripts, can be found at http://fafner.meb.ki.se/biostatwiki/driver-genes/ . Contact: yudi.pawitan@ki.se Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 56
    Publication Date: 2015-08-08
    Description: Motivation: With improvements in next-generation sequencing technologies and reductions in price, ordered RNA-seq experiments are becoming common. Of primary interest in these experiments is identifying genes that are changing over time or space, for example, and then characterizing the specific expression changes. A number of robust statistical methods are available to identify genes showing differential expression among multiple conditions, but most assume conditions are exchangeable and thereby sacrifice power and precision when applied to ordered data. Results: We propose an empirical Bayes mixture modeling approach called EBSeq-HMM. In EBSeq-HMM, an auto-regressive hidden Markov model is implemented to accommodate dependence in gene expression across ordered conditions. As demonstrated in simulation and case studies, the output proves useful in identifying differentially expressed genes and in specifying gene-specific expression paths. EBSeq-HMM may also be used for inference regarding isoform expression. Availability and implementation: An R package containing examples and sample datasets is available at Bioconductor. Contact: kendzior@biostat.wisc.edu Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 57
    Oxford University Press
    Publication Date: 2015-08-08
    Description: Motivation: Sequence discovery tools play a central role in several fields of computational biology. In the framework of Transcription Factor binding studies, most of the existing motif finding algorithms are computationally demanding, and they may not be able to support the increasingly large datasets produced by modern high-throughput sequencing technologies. Results: We present FastMotif, a new motif discovery algorithm that is built on a recent machine learning technique referred to as Method of Moments. Based on spectral decompositions, our method is robust to model misspecifications and is not prone to locally optimal solutions. We obtain an algorithm that is extremely fast and designed for the analysis of big sequencing data. On HT-Selex data, FastMotif extracts motif profiles that match those computed by various state-of-the-art algorithms, but one order of magnitude faster. We provide a theoretical and numerical analysis of the algorithm’s robustness and discuss its sensitivity with respect to the free parameters. Availability and implementation: The Matlab code of FastMotif is available from http://lcsb-portal.uni.lu/bioinformatics . Contact: vlassis@adobe.com Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 58
    Publication Date: 2015-08-08
    Description: Motivation: Next-generation high-throughput sequencing has become a state-of-the-art technique in genome assembly. Scaffolding is one of the main stages of the assembly pipeline. During this stage, contigs assembled from the paired-end reads are merged into bigger chains called scaffolds. Because of a high level of statistical noise, chimeric reads, and genome repeats, scaffolding is a challenging task. Current scaffolding software packages vary widely in their quality and are highly dependent on the read data quality and genome complexity. There are no clear winners and multiple opportunities for further improvements of the tools still exist. Results: This article presents an efficient scaffolding algorithm, ScaffMatch, that is able to handle reads with both short (<600 bp) and long (>35 000 bp) insert sizes, producing high-quality scaffolds. We evaluate our scaffolding tool with the F score and other metrics (N50, corrected N50) on eight datasets, comparing it with most available packages. Our experiments show that ScaffMatch is the tool of preference for most datasets. Availability and implementation: The source code is available at http://alan.cs.gsu.edu/NGS/?q=content/scaffmatch . Contact: mandric@cs.gsu.edu Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
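The N50 metric named in the abstract above can be illustrated with a minimal sketch. It uses the common definition of N50 (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled length). The function name `n50` and the example lengths are illustrative assumptions and are not taken from ScaffMatch; the "corrected N50" used by the authors is not reproduced here.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    Assumes the common definition: walk the lengths from longest to
    shortest and return the length at which the running total first
    reaches half of the total assembled length.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical scaffold lengths (in bp), chosen only for illustration.
scaffold_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(scaffold_lengths))
```

Because N50 depends only on the length distribution, it says nothing about whether the scaffolds are correctly joined, which is presumably why the abstract also reports a corrected N50 and the F score.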
  • 59
    Publication Date: 2015-08-08
    Description: Motivation: Identifying protein subchloroplast localization in the chloroplast organelle is very helpful for understanding the function of chloroplast proteins. A few computational prediction methods for protein subchloroplast localization already exist. However, these existing works have ignored proteins with multiple subchloroplast locations when constructing prediction models, so that they can predict only one of all subchloroplast locations of this kind of multilabel protein. Results: To address this problem, by utilizing label-specific features and label correlations simultaneously, a novel multilabel classifier was developed for predicting protein subchloroplast location(s) with both single and multiple location sites. As an initial study, the overall accuracy of our proposed algorithm reaches 55.52%, which is high enough to make it a promising tool for further studies. Availability and implementation: An online web server for our proposed algorithm, named MultiP-SChlo, was developed, which is freely accessible at http://biomed.zzuli.edu.cn/bioinfo/multip-schlo/ . Contact: pandaxiaoxi@gmail.com or gzli@tongji.edu.cn Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 60
    Publication Date: 2015-08-08
    Description: Motivation: Loops in proteins are often involved in biochemical functions. Their irregularity and flexibility make experimental structure determination and computational modeling challenging. Most current loop modeling methods focus on modeling single loops. In protein structure prediction, multiple loops often need to be modeled simultaneously. As interactions among loops in spatial proximity can be rather complex, sampling the conformations of multiple interacting loops is a challenging task. Results: In this study, we report a new method called multi-loop Distance-guided Sequential chain-Growth Monte Carlo (M-DiSGro) for prediction of the conformations of multiple interacting loops in proteins. Our method achieves an average RMSD of 1.93 Å for lowest energy conformations of 36 pairs of interacting protein loops with the total length ranging from 12 to 24 residues. We further constructed a data set containing proteins with 2, 3 and 4 interacting loops. For the most challenging target proteins with four loops, the average RMSD of the lowest energy conformations is 2.35 Å. Our method is also tested for predicting multiple loops in β-barrel membrane proteins. For outer-membrane protein G, the lowest energy conformation has an RMSD of 2.62 Å for the three extracellular interacting loops with a total length of 34 residues (12, 12 and 10 residues in each loop). Availability and implementation: The software is freely available at: tanto.bioe.uic.edu/m-DiSGro. Contact: jinfeng@stat.fsu.edu or jliang@uic.edu Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
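The RMSD values quoted in the abstract above measure how far a predicted loop lies from the reference structure. As a minimal sketch, assuming both coordinate sets list the same atoms in the same order and are already expressed in the same frame, RMSD is the square root of the mean squared distance between matched atoms. The function name `rmsd` and the coordinates are illustrative assumptions; the abstract does not state which atoms M-DiSGro compares or how structures are superimposed.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched coordinate lists.

    Each argument is a sequence of (x, y, z) tuples in ångströms.
    No superposition is performed: the two structures are assumed to
    share a frame of reference already.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical two-atom example: one atom matches, the other is 2 Å away.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
print(round(rmsd(reference, predicted), 3))
```

Since the deviations are squared before averaging, a single badly placed atom raises the RMSD more than several slightly misplaced ones.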
  • 61
    Publication Date: 2015-08-08
    Description: Motivation: Network comparison is a computationally intractable problem with important applications in systems biology and other domains. A key challenge is to properly quantify similarity between wiring patterns of two networks in an alignment-free fashion. Also, alignment-based methods exist that aim to identify an actual node mapping between networks and as such serve a different purpose. Various alignment-free methods that use different global network properties (e.g. degree distribution) have been proposed. Methods based on small local subgraphs called graphlets perform the best in the alignment-free network comparison task, due to high level of topological detail that graphlets can capture. Among different graphlet-based methods, Graphlet Correlation Distance (GCD) was shown to be the most accurate for comparing networks. Recently, a new graphlet-based method called NetDis was proposed, which was claimed to be superior. We argue against this, as the performance of NetDis was not properly evaluated to position it correctly among the other alignment-free methods. Results : We evaluate the performance of available alignment-free network comparison methods, including GCD and NetDis. We do this by measuring accuracy of each method (in a systematic precision-recall framework) in terms of how well the method can group (cluster) topologically similar networks. By testing this on both synthetic and real-world networks from different domains, we show that GCD remains the most accurate, noise-tolerant and computationally efficient alignment-free method. That is, we show that NetDis does not outperform the other methods, as originally claimed, while it is also computationally more expensive. Furthermore, since NetDis is dependent on the choice of a network null model (unlike the other graphlet-based methods), we show that its performance is highly sensitive to the choice of this parameter. 
    Finally, we find that its performance is not independent of network sizes and densities, as originally claimed. Contact: natasha@imperial.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 62
    Publication Date: 2015-08-08
    Description: Motivation: The functional impact of small molecules is increasingly being assessed in different eukaryotic species through large-scale phenotypic screening initiatives. Identifying the targets of these molecules is crucial to mechanistically understand their function and uncover new therapeutically relevant modes of action. However, despite extensive work carried out in model organisms and human, it is still unclear to what extent one can use information obtained in one species to make predictions in other species. Results: Here, for the first time, we explore and validate at a large scale the use of protein homology relationships to predict the targets of small molecules across different species. Our results show that exploiting target homology can significantly improve the predictions, especially for molecules experimentally tested in other species. Interestingly, when considering separately orthology and paralogy relationships, we observe that mapping small molecule interactions among orthologs improves prediction accuracy, while including paralogs does not improve and even sometimes worsens the prediction accuracy. Overall, our results provide a novel approach to integrate chemical screening results across multiple species and highlight the promises and remaining challenges of using protein homology for small molecule target identification. Availability and implementation: Homology-based predictions can be tested on our website http://www.swisstargetprediction.ch . Contact: david.gfeller@unil.ch or vincent.zoete@isb-sib.ch . Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 63
    Publication Date: 2015-06-05
    Description: Alternative splicing and gene duplication are the two main processes responsible for expanding protein functional diversity. Although gene duplication can generate new genes and alternative splicing can introduce variation through alternative gene products, the interplay between the two processes is complex and poorly understood. Here, we have carried out a study of the evolution of alternatively spliced exons after gene duplication to better understand the interaction between the two processes. We created a manually curated set of 97 human genes with mutually exclusively spliced homologous exons and analyzed the evolution of these exons across five distantly related vertebrates (lamprey, spotted gar, zebrafish, fugu, and coelacanth). Most of these exons had an ancient origin (more than 400 Ma). We found examples supporting two extreme evolutionary models for the behaviour of homologous exons after gene duplication. We observed 11 events in which gene duplication was accompanied by splice isoform separation, that is, each paralog specifically conserved just one distinct ancestral homologous exon. At the other extreme, we identified genes in which the homologous exons were always conserved within paralogs, suggesting that the alternative splicing event cannot easily be separated from the function in these genes. That many homologous exons fall in between these two extremes highlights the diversity of biological systems and suggests that the subtle balance between alternative splicing and gene duplication is adjusted to the specific cellular context of each gene.
    Electronic ISSN: 1759-6653
    Topics: Biology
  • 64
    Publication Date: 2015-06-05
    Description: There is widespread interest today in understanding enhancers, which are regulatory elements typically harboring several transcription factor binding sites and mediating the combinatorial effect of transcription factors on gene expression. The evolution of enhancers poses interesting unanswered questions, for example, the evolutionary time taken for a typical enhancer to emerge or the factors shaping its evolution. Existing approaches to cis-regulatory evolution have often ignored the combinatorial nature and varied biochemical mechanisms of gene regulation encoded in enhancers. We report on our investigation of enhancer evolution through the use of PEBCRES, a framework for evolutionary simulation of enhancers that employs a mechanistic and well-supported sequence-to-expression model to assign fitness to the evolving enhancer genotype. We estimated the time necessary to evolve, from genomic background, enhancers capable of driving complex gene expression patterns similar to those involved in early development in Drosophila. We found the time-to-evolve to range between 0.5 and 10 Myr, and to vary greatly with the target expression pattern, complexity of the real enhancer known to encode that pattern, and the strength of input from specific transcription factors. To our knowledge, this is the first estimate of waiting times for realistic enhancers to evolve. The in silico evolved enhancers had, with a few interesting exceptions, site compositions similar to those seen in real enhancers for the same patterns. Our simulations also revealed that certain features of an enhancer might evolve not due to their biological function but as aids to the evolutionary process itself.
    Electronic ISSN: 1759-6653
    Topics: Biology
  • 65
    Publication Date: 2015-06-05
    Description: Despite significant progress in the structural and functional characterization of the human genome, understanding of the mechanisms underlying the genetic basis of human phenotypic uniqueness remains limited. Here, I report that transposable element-derived sequences, most notably LTR7/HERV-H, LTR5_Hs, and L1HS, harbor 99.8% of the candidate human-specific regulatory loci (HSRL) with putative transcription factor-binding sites in the genome of human embryonic stem cells (hESC). A total of 4,094 candidate HSRL display selective and site-specific binding of critical regulators (NANOG [Nanog homeobox], POU5F1 [POU class 5 homeobox 1], CCCTC-binding factor [CTCF], Lamin B1), and are preferentially located within the matrix of transcriptionally active DNA segments that are hypermethylated in hESC. hESC-specific NANOG-binding sites are enriched near the protein-coding genes regulating brain size, pluripotency long noncoding RNAs, hESC enhancers, and 5-hydroxymethylcytosine-harboring regions immediately adjacent to binding sites. Sequences of only 4.3% of hESC-specific NANOG-binding sites are present in the Neanderthal genome, suggesting that a majority of these regulatory elements emerged in Modern Humans. Comparisons of estimated creation rates of novel TF-binding sites revealed a 49.7-fold acceleration of creation rates of NANOG-binding sites in the genomes of Chimpanzees compared with mouse genomes and a further 5.7-fold acceleration in the genomes of Modern Humans compared with Chimpanzee genomes. Preliminary estimates suggest that the emergence of one novel NANOG-binding site detectable in hESC required 466 years of evolution. Pathway analysis of coding genes that have hESC-specific NANOG-binding sites within gene bodies or near gene boundaries revealed their association with physiological development and functions of nervous and cardiovascular systems, embryonic development, behavior, as well as development of a diverse spectrum of pathological conditions such as cancer, diseases of cardiovascular and reproductive systems, metabolic diseases, multiple neurological and psychological disorders. A proximity placement model is proposed explaining how a 33–47% excess of NANOG, CTCF, and POU5F1 proteins immobilized on a DNA scaffold may play a functional role at distal regulatory elements.
    Electronic ISSN: 1759-6653
    Topics: Biology
  • 66
    Publication Date: 2015-06-05
    Description: Organisms can adapt to local environmental conditions as a plastic response or become adapted through natural selection on genetic variation. The ability to adapt to increased water temperatures will be of paramount importance for many fish species as the climate continues to warm and water resources become limited. Because increased water temperatures will reduce the dissolved oxygen available for fish, we hypothesized that adaptation to low oxygen environments would involve improved respiration through oxidative phosphorylation (OXPHOS). To test this hypothesis, we subjected individuals from two ecologically divergent populations of inland (redband) rainbow trout ( Oncorhynchus mykiss gairdneri ) with historically different temperature regimes (desert and montane) and their F1 progeny to diel cycles of temperature stress and then examined gene expression data for 80 nuclear- and mitochondrial-encoded OXPHOS subunits that participate in respiration. Of the 80 transcripts, 7 showed ≥ 2-fold difference in expression levels in gill tissue from desert fish under heat stress whereas the montane fish had none and the F1 only had one differentially expressed gene. A structural analysis of the proteins encoded by those genes suggests that the response could coordinate the formation of supercomplexes and oligomers. Supercomplexes may increase the efficiency of respiration because complexes I, III, and IV are brought into close proximity and oligomerization of complex V alters the macrostructure of mitochondria to improve respiration. Significant differences in gene expression patterns in response to heat stress in a common environment indicate that the response was not due to plasticity but had a genetic basis.
    Electronic ISSN: 1759-6653
    Topics: Biology
  • 67
    Publication Date: 2015-08-08
    Description: Motivation: Very large studies are required to provide sufficiently big sample sizes for adequately powered association analyses. This can be an expensive undertaking and it is important that an accurate sample size is identified. For more realistic sample size calculation and power analysis, the impact of unmeasured aetiological determinants and the quality of measurement of both outcome and explanatory variables should be taken into account. Conventional methods to analyse power use closed-form solutions that are not flexible enough to cater for all of these elements easily. They often result in a potentially substantial overestimation of the actual power. Results: In this article, we describe the Estimating Sample-size and Power in R by Exploring Simulated Study Outcomes tool that allows assessment errors in power calculation under various biomedical scenarios to be incorporated. We also report a real world analysis where we used this tool to answer an important strategic question for an existing cohort. Availability and implementation: The software is available for online calculation and downloads at http://espresso-research.org . The code is freely available at https://github.com/ESPRESSO-research . Contact: louqman@gmail.com Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 68
    Publication Date: 2015-08-08
    Description: Motivation: Given the importance of non-coding RNAs to cellular regulatory functions, it would be highly desirable to have accurate computational prediction of RNA 3D structure, a task which remains challenging. Even for a short RNA sequence, the space of tertiary conformations is immense; existing methods to identify native-like conformations mostly resort to random sampling of conformations to achieve computational feasibility. However, native conformations may not be examined and prediction accuracy may be compromised due to sampling. State-of-the-art methods have yet to deliver satisfactory predictions for RNAs of length beyond 50 nucleotides. Results: This paper presents a method to tackle a key step in the RNA 3D structure prediction problem, the prediction of the nucleotide interactions that constitute the desired 3D structure. The research is based on a novel graph model, called a backbone k-tree , to tightly constrain the nucleotide interaction relationships considered for RNA 3D structures. It is shown that the new model makes it possible to efficiently predict the optimal set of nucleotide interactions (including the non-canonical interactions in all recently revealed families) from the query sequence along with known or predicted canonical basepairs. The preliminary results indicate that in most cases the new method can predict with a high accuracy the nucleotide interactions that constitute the 3D structure of the query sequence. It thus provides a useful tool for the accurate prediction of RNA 3D structure. Availability and Implementation: The source package for BkTree is available at http://rna-informatics.uga.edu/index.php?f=software&p=BkTree . Contact: lding@uga.edu or cai@cs.uga.edu Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 69
    Publication Date: 2015-08-08
    Description: Motivation: RNAs fold into complex structures that are integral to the diverse mechanisms underlying RNA regulation of gene expression. Recent development of transcriptome-wide RNA structure profiling through the application of structure-probing enzymes or chemicals combined with high-throughput sequencing has opened a new field that greatly expands the amount of in vitro and in vivo RNA structural information available. The resultant datasets provide the opportunity to investigate RNA structural information on a global scale. However, the analysis of high-throughput RNA structure profiling data requires considerable computational effort and expertise. Results: We present a new platform, StructureFold, that provides an integrated computational solution designed specifically for large-scale RNA structure mapping and reconstruction across any transcriptome. StructureFold automates the processing and analysis of raw high-throughput RNA structure profiling data, allowing the seamless incorporation of wet-bench structural information from chemical probes and/or ribonucleases to restrain RNA secondary structure prediction via the RNAstructure and ViennaRNA package algorithms. StructureFold performs reads mapping and alignment, normalization and reactivity derivation, and RNA structure prediction in a single user-friendly web interface or via local installation. The variation in transcript abundance and length that prevails in living cells and consequently causes variation in the counts of structure-probing events between transcripts is accounted for. Accordingly, StructureFold is applicable to RNA structural profiling data obtained in vivo as well as to in vitro or in silico datasets. StructureFold is deployed via the Galaxy platform. Availability and Implementation: StructureFold is freely available as a component of Galaxy available at: https://usegalaxy.org/ . Contact: yxt148@psu.edu or sma3@psu.edu Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 70
    Publication Date: 2015-08-08
    Description: Structural variations (SVs) are large genomic rearrangements that vary significantly in size, making them challenging to detect with the relatively short reads from next-generation sequencing (NGS). Different SV detection methods have been developed; however, each is limited to specific kinds of SVs with varying accuracy and resolution. Previous works have attempted to combine different methods, but they still suffer from poor accuracy particularly for insertions. We propose MetaSV, an integrated SV caller which leverages multiple orthogonal SV signals for high accuracy and resolution. MetaSV proceeds by merging SVs from multiple tools for all types of SVs. It also analyzes soft-clipped reads from alignment to detect insertions accurately since existing tools underestimate insertion SVs. Local assembly in combination with dynamic programming is used to improve breakpoint resolution. Paired-end and coverage information is used to predict SV genotypes. Using simulation and experimental data, we demonstrate the effectiveness of MetaSV across various SV types and sizes. Availability and implementation: Code in Python is at http://bioinform.github.io/metasv/ . Contact: rd@bina.com Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 71
    Publication Date: 2015-08-08
    Description: We present a web server to predict the functional effect of single or multiple amino acid substitutions, insertions and deletions using the prediction tool PROVEAN. The server provides rapid analysis of protein variants from any organism, and also supports high-throughput analysis for human and mouse variants at both the genomic and protein levels. Availability and implementation: The web server is freely available and open to all users with no login requirements at http://provean.jcvi.org . Contact: achan@jcvi.org Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 72
    Publication Date: 2015-08-08
    Description: Motivation: In attempts to determine the genetic causes of human disease, researchers are often faced with a large number of candidate genes. Linkage studies can point to a genomic region containing hundreds of genes, while the high-throughput sequencing approach will often identify a great number of non-synonymous genetic variants. Since systematic experimental verification of each such candidate gene is not feasible, a method is needed to decide which genes are worth investigating further. Computational gene prioritization presents itself as a solution to this problem, systematically analyzing and sorting each gene from the most to least likely to be the disease-causing gene, in a fraction of the time it would take a researcher to perform such queries manually. Results: Here, we present Gene TIssue Expression Ranker (GeneTIER), a new web-based application for candidate gene prioritization. GeneTIER replaces knowledge-based inference traditionally used in candidate disease gene prioritization applications with experimental data from tissue-specific gene expression datasets and thus largely overcomes the bias toward the better characterized genes/diseases that commonly afflict other methods. We show that our approach is capable of accurate candidate gene prioritization and illustrate its strengths and weaknesses using case study examples. Availability and Implementation: Freely available on the web at http://dna.leeds.ac.uk/GeneTIER/. Contact: umaan@leeds.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 73
    Publication Date: 2015-08-08
    Description: Motivation: The role of personalized medicine and targeted treatment in the clinical management of cancer patients has become increasingly important in recent years. This has made the task of precise histological substratification of cancers crucial. Increasingly, genomic data are being seen as a valuable classifier. Specifically, copy number alteration (CNA) profiles generated by next-generation sequencing (NGS) can become a determinant for tumour subtyping. The principal purpose of this study is to devise a model with good prediction capability for tumour histological subtypes as a function of both the patients' covariates and their genome-wide CNA profiles from NGS data. Results: We investigate a logistic regression for modelling tumour histological subtypes as a function of the patients’ covariates and their CNA profiles, in a mixed model framework. The covariates, such as age and gender, are considered as fixed predictors and the genome-wide CNA profiles are considered as random predictors. We illustrate the application of this model in lung and oral cancer datasets, and the results indicate that the tumour histological subtypes can be modelled with a good fit. Our cross-validation indicates that the logistic regression exhibits the best prediction relative to other classification methods we considered in this study. The model also exhibits the best agreement in the prediction between smooth-segmented and circular binary-segmented CNA profiles. Availability and implementation: An R package to run a logistic regression is available at http://www1.maths.leeds.ac.uk/~arief/R/CNALR/ . Contact: a.gusnanto@leeds.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 74
    Publication Date: 2015-08-08
    Description: Metabolic network mapping is a widely used approach for integration of metabolomic experimental results with biological domain knowledge. However, current approaches can be limited by biochemical domain or pathway knowledge which results in sparse disconnected graphs for real world metabolomic experiments. MetaMapR integrates enzymatic transformations with metabolite structural similarity, mass spectral similarity and empirical associations to generate richly connected metabolic networks. This open source, web-based or desktop software, written in the R programming language, leverages KEGG and PubChem databases to derive associations between metabolites even in cases where biochemical domain or molecular annotations are unknown. Network calculation is enhanced through an interface to the Chemical Translation System, which allows metabolite identifier translation between >200 common biochemical databases. Analysis results are presented as interactive visualizations or can be exported as high-quality graphics and numerical tables which can be imported into common network analysis and visualization tools. Availability and Implementation: Freely available at http://dgrapov.github.io/MetaMapR/ . Requires R and a modern web browser. Installation instructions, tutorials and application examples are available at http://dgrapov.github.io/MetaMapR/ . Contact: ofiehn@ucdavis.edu
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 75
    Publication Date: 2015-08-08
    Description: Motivation: The Cellular Phenotype Database (CPD) is a repository for data derived from high-throughput systems microscopy studies. The aims of this resource are: (i) to provide easy access to cellular phenotype and molecular localization data for the broader research community; (ii) to facilitate integration of independent phenotypic studies by means of data aggregation techniques, including use of an ontology and (iii) to facilitate development of analytical methods in this field. Results: In this article we present CPD, its data structure and user interface, propose a minimal set of information describing RNA interference experiments, and suggest a generic schema for management and aggregation of outputs from phenotypic or molecular localization experiments. The database has a flexible structure for management of data from heterogeneous sources of systems microscopy experimental outputs generated by a variety of protocols and technologies and can be queried by gene, reagent, gene attribute, study keywords, phenotype or ontology terms. Availability and implementation: CPD is developed as part of the Systems Microscopy Network of Excellence and is accessible at http://www.ebi.ac.uk/fg/sym . Contact: jes@ebi.ac.uk or ugis@ebi.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 76
    Publication Date: 2015-08-16
    Description: The enigmatic monocot family Triuridaceae provides a potentially useful model system for studying the effects of an ancient loss of photosynthesis on the plant plastid genome, as all of its members are mycoheterotrophic and achlorophyllous. However, few studies have placed the family in a comparative context, and its phylogenetic placement is only partly resolved. It was also unclear whether any taxa in this family have retained a plastid genome. Here, we used genome survey sequencing to retrieve plastid genome data for Sciaphila densiflora (Triuridaceae) and ten autotrophic relatives in the orders Dioscoreales and Pandanales. We recovered a highly reduced plastome for Sciaphila that is nearly colinear with Carludovica palmata , a photosynthetic relative that belongs to its sister group in Pandanales, Cyclanthaceae–Pandanaceae. This phylogenetic placement is well supported and robust to a broad range of analytical assumptions in maximum-likelihood inference, and is congruent with recent findings based on nuclear and mitochondrial evidence. The 28 genes retained in the S. densiflora plastid genome are involved in translation and other nonphotosynthetic functions, and we demonstrate that nearly all of the 18 protein-coding genes are under strong purifying selection. Our study confirms the utility of whole plastid genome data in phylogenetic studies of highly modified heterotrophic plants, even when they have substantially elevated rates of substitution.
    Electronic ISSN: 1759-6653
    Topics: Biology
  • 77
    Publication Date: 2015-08-08
    Description: Specific recognition of DNA by proteins is a crucial step of many biological processes. PDIviz is a plugin for the PyMOL molecular visualization system that analyzes protein–DNA binding interfaces by comparing the solvent accessible surface area of the complex against the free protein and free DNA. The plugin provides three distinct three-dimensional visualization modes to highlight interactions with DNA bases and backbone, major and minor groove, and with atoms of different pharmacophoric type (hydrogen bond donors/acceptors, hydrophobic and thymine methyl). Each mode comes in three styles to focus the visual analysis on the protein or DNA side of the interface, or on the nucleotide sequence. PDIviz allows for the generation of publication-quality images; all calculated data can be written to disk, and a command line interface is provided for automating tasks. The plugin may be helpful for the detailed identification of regions involved in DNA base and shape readout, and can be particularly useful in rapidly pinpointing the overall mode of interaction. Availability and implementation: Freely available at http://melolab.org/pdiviz/ as a PyMOL plugin. Tested with incentive, educational, and open source versions of PyMOL on Windows, Mac and Linux systems. Contact: aschueller@bio.puc.cl Supplementary Information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 78
    Publication Date: 2015-08-12
    Description: We report the broad-band spectral properties of the X-ray pulsar Cep X-4 by using a Suzaku observation in 2014 July. The 0.8–70 keV spectrum was found to be well described by three continuum models – Negative and Positive power-law with Exponential cut-off (NPEX), high-energy cut-off power-law and CompTT models. Additional components such as a cyclotron line at ~28 keV and two Gaussian components for iron lines at 6.4 and 6.9 keV were required in the spectral fitting. Apart from these, an additional absorption feature at ~45 keV was clearly detected in residuals obtained from the spectral fitting. This additional feature at ~45 keV was clearly seen in phase-resolved spectra of the pulsar. We identified this feature as the first harmonic of the fundamental cyclotron line at ~28 keV. The ratio between the first harmonic and fundamental line energies (1.7) was found to be in disagreement with the conventional factor of 2, indicating that the heights of line-forming regions are different or viewed at larger angles. The phase-resolved spectroscopy of the fundamental and first harmonic cyclotron lines shows significant pulse-phase variation of the line parameters. This can be interpreted as the effect of viewing angle or the role of complicated magnetic field of the pulsar.
    Print ISSN: 1745-3925
    Electronic ISSN: 1745-3933
    Topics: Physics
  • 79
    Publication Date: 2015-08-14
    Description: The spin-down of a neutron star, e.g. due to magneto-dipole losses, results in compression of the stellar matter and induces nuclear reactions at phase transitions between different nuclear species in the crust. We show that this mechanism is effective in heating recycled pulsars, in which the previous accretion process has already been compressing the crust, so it is not in nuclear equilibrium. We calculate the corresponding emissivity and confront it with available observations, showing that it might account for the likely thermal ultraviolet emission of PSR J0437–4715.
    Print ISSN: 1745-3925
    Electronic ISSN: 1745-3933
    Topics: Physics
  • 80
    Publication Date: 2015-08-14
    Description: New insights into the formation of interstellar formamide, a species of great relevance in prebiotic chemistry, are provided by electronic structure and kinetic calculations for the reaction NH2 + H2CO → NH2CHO + H. Contrary to what was previously suggested, this reaction is essentially barrierless and can, therefore, occur under the low-temperature conditions of interstellar objects, thus providing a facile formation route for formamide. The rate coefficient parameters for the reaction channel leading to NH2CHO + H have been calculated to be A = 2.6 × 10^–12 cm^3 s^–1, β = –2.1 and γ = 26.9 K in the temperature range 10–300 K. Including these new kinetic data in a refined astrochemical model, we show that the proposed mechanism can well reproduce the abundances of formamide observed in two very different interstellar objects: the cold envelope of the Sun-like protostar IRAS16293–2422 and the molecular shock L1157-B2. Therefore, the major conclusion of this Letter is that there is no need to invoke grain-surface chemistry to explain the presence of formamide provided that its precursors, NH2 and H2CO, are available in the gas phase.
    Print ISSN: 1745-3925
    Electronic ISSN: 1745-3933
    Topics: Physics
  • 81
    Publication Date: 2015-08-16
    Description: We report the identification of a novel gene family (named MgCRP-I) encoding short secreted cysteine-rich peptides in the Mediterranean mussel Mytilus galloprovincialis. These peptides display a highly conserved pre-pro region and a hypervariable mature peptide comprising six invariant cysteine residues arranged in three intramolecular disulfide bridges. Although their cysteine pattern is similar to cysteine-rich neurotoxic peptides of distantly related protostomes such as cone snails and arachnids, the different organization of the disulfide bridges observed in synthetic peptides and phylogenetic analyses revealed MgCRP-I as a novel protein family. Genome- and transcriptome-wide searches for orthologous sequences in other bivalve species indicated the unique presence of this gene family in Mytilus spp. Like many antimicrobial peptides and neurotoxins, MgCRP-I peptides are produced as pre-propeptides, usually have a net positive charge and likely derive from similar evolutionary mechanisms, that is, gene duplication and positive selection within the mature peptide region; however, synthetic MgCRP-I peptides did not display significant toxicity in cultured mammalian cells, insecticidal, antimicrobial, or antifungal activities. The functional role of MgCRP-I peptides in mussel physiology still remains puzzling.
    Electronic ISSN: 1759-6653
    Topics: Biology
  • 82
    Publication Date: 2015-08-16
    Description: Most sequenced eukaryotic genomes show a large excess of recent duplicates. As duplicates age, both the population genetic process of failed fixation and the mutation-driven process of nonfunctionalization act to reduce the observed number of duplicates. Understanding the processes generating the age distributions of recent duplicates is important to also understand the role of duplicate genes in the functional divergence of genomes. To date, mechanistic models for duplicate gene retention only account for the mutation-driven nonfunctionalization process. Here, a neutral model for the distribution of synonymous substitutions in duplicated genes which are segregating and expected to never fix in a population is introduced. This model enables differentiation of neutral loss due to failed fixation from loss due to mutation-driven nonfunctionalization. The model has been validated on simulated data and subsequent analysis with the model on genomic data from human and mouse shows that conclusions about the underlying mechanisms for duplicate gene retention can be sensitive to consideration of population genetic processes.
    Electronic ISSN: 1759-6653
    Topics: Biology
  • 83
    Publication Date: 2015-09-11
    Description: In next generation sequencing (NGS)-based genetic studies, researchers typically perform genotype calling first and then apply standard genotype-based methods for association testing. However, such a two-step approach ignores genotype calling uncertainty in the association testing step and may incur power loss and/or inflated type-I error. In the recent literature, a few robust and efficient likelihood based methods including both likelihood ratio test (LRT) and score test have been proposed to carry out association testing without intermediate genotype calling. These methods take genotype calling uncertainty into account by directly incorporating genotype likelihood function (GLF) of NGS data into association analysis. However, existing LRT methods are computationally demanding or do not allow covariate adjustment; while existing score tests are not applicable to markers with low minor allele frequency (MAF). We provide an LRT allowing flexible covariate adjustment, develop a statistically more powerful score test and propose a combination strategy (UNC combo) to leverage the advantages of both tests. We have carried out extensive simulations to evaluate the performance of our proposed LRT and score test. Simulations and real data analysis demonstrate the advantages of our proposed combination strategy: it offers a satisfactory trade-off in terms of computational efficiency, applicability (accommodating both common variants and variants with low MAF) and statistical power, particularly for the analysis of quantitative trait where the power gain can be up to ~60% when the causal variant is of low frequency (MAF < 0.01). Availability and implementation: UNC combo and the associated R files, including documentation, examples, are available at http://www.unc.edu/~yunmli/UNCcombo/ . Contact: yunli@med.unc.edu Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 84
    Publication Date: 2015-09-11
    Description: Motivation: Recombined T- and B-cell receptor repertoires are increasingly being studied using next generation sequencing (NGS) in order to interrogate the repertoire composition as well as changes in the distribution of receptor clones under different physiological and disease states. This type of analysis requires efficient and unambiguous clonotype assignment to a large number of NGS read sequences, including the identification of the incorporated V and J gene segments and the CDR3 sequence. Current tools have deficits with respect to performance, accuracy and documentation of their underlying algorithms and usage. Results: We present IMSEQ, a method to derive clonotype repertoires from NGS data with sophisticated routines for handling errors stemming from PCR and sequencing artefacts. The application can handle different kinds of input data originating from single- or paired-end sequencing in different configurations and is generic regarding the species and gene of interest. We have carefully evaluated our method with simulated and real world data and show that IMSEQ is superior to other tools with respect to its clonotyping as well as standalone error correction and runtime performance. Availability and implementation: IMSEQ was implemented in C++ using the SeqAn library for efficient sequence analysis. It is freely available under the GPLv2 open source license and can be downloaded at www.imtools.org . Supplementary information: Supplementary data are available at Bioinformatics online. Contact: lkuchenb@inf.fu-berlin.de or peter.robinson@charite.de
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 85
    Publication Date: 2015-09-11
    Description: Motivation: Interactions between amino acids are important determinants of the structure, stability and function of proteins. Several tools have been developed for the identification and analysis of such interactions in proteins based on the extensive studies carried out on high-resolution structures from Protein Data Bank (PDB). Although these tools allow users to identify and analyze interactions, analysis can only be performed on one structure at a time. This makes it difficult and time consuming to study the significance of these interactions on a large scale. Results: SpeeDB is a web-based tool for the identification of protein structures based on structural properties. SpeeDB queries are executed on all structures in the PDB at once, quickly enough for interactive use. SpeeDB includes standard queries based on published criteria for identifying various structures: disulphide bonds, catalytic triads and aromatic–aromatic, sulphur–aromatic, cation–π and ionic interactions. Users can also construct custom queries in the user interface without any programming. Results can be downloaded in a Comma Separated Value (CSV) format for further analysis with other tools. Case studies presented in this article demonstrate how SpeeDB can be used to answer various biological questions. Analysis of human proteases revealed that disulphide bonds are the predominant type of interaction and are located close to the active site, where they promote substrate specificity. When comparing the two homologous G protein-coupled receptors and the two protein kinase paralogs analyzed, the differences in the types of interactions responsible for stability account for the differences in specificity and functionality of the structures. Availability and implementation: SpeeDB is available at http://www.parallelcomputing.ca as a web service. Contact: d@drobilla.net Supplementary Information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 86
    Publication Date: 2015-09-11
    Description: Seq2pathway is an R/Python wrapper for pathway (or functional gene-set) analysis of genomic loci, adapted for advances in genome research. Seq2pathway associates the biological significance of genomic loci with their target transcripts and then summarizes the quantified values on the gene-level into pathway scores. It is designed to isolate systematic disturbances and common biological underpinnings from next-generation sequencing (NGS) data. Seq2pathway offers Bioconductor users enhanced capability in discovering collective pathway effects caused by both coding genes and cis-regulation of non-coding elements. Availability and implementation: The package is freely available at http://www.bioconductor.org/packages/release/bioc/html/seq2pathway.html . Contact: xyang2@uchicago.edu Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 87
    Oxford University Press
    Publication Date: 2015-09-11
    Description: Aggregation plots are frequently used to evaluate signal distributions at user-interested points in ChIP-Seq data analysis. agplus, a new and simple command-line tool, enables rapid and flexible generation of text tables tailored for aggregation plots from which users can easily design multiple groups based on user-definitions such as regulatory regions or transcription initiation sites. Availability and Implementation: This software is implemented in Ruby, supported on Linux and Mac OSX, and freely available at http://github.com/kazumits/agplus Contact: yohkawa@epigenetics.med.kyushu-u.ac.jp
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 88
    Publication Date: 2015-09-11
    Description: We devise a novel inference algorithm to effectively solve the cancer progression model reconstruction problem. Our empirical analysis of the accuracy and convergence rate of our algorithm, CAncer PRogression Inference (CAPRI), shows that it outperforms the state-of-the-art algorithms addressing similar problems. Motivation: Several cancer-related genomic data have become available (e.g. The Cancer Genome Atlas, TCGA) typically involving hundreds of patients. At present, most of these data are aggregated in a cross-sectional fashion providing all measurements at the time of diagnosis. Our goal is to infer cancer ‘progression’ models from such data. These models are represented as directed acyclic graphs (DAGs) of collections of ‘selectivity’ relations, where a mutation in a gene A ‘selects’ for a later mutation in a gene B. Gaining insight into the structure of such progressions has the potential to improve both the stratification of patients and personalized therapy choices. Results: The CAPRI algorithm relies on a scoring method based on a probabilistic theory developed by Suppes, coupled with bootstrap and maximum likelihood inference. The resulting algorithm is efficient, achieves high accuracy and has good complexity, also, in terms of convergence properties. CAPRI performs especially well in the presence of noise in the data, and with limited sample sizes. Moreover CAPRI, in contrast to other approaches, robustly reconstructs different types of confluent trajectories despite irregularities in the data. We also report on an ongoing investigation using CAPRI to study atypical Chronic Myeloid Leukemia, in which we uncovered non trivial selectivity relations and exclusivity patterns among key genomic events. Availability and implementation: CAPRI is part of the TRanslational ONCOlogy R package and is freely available on the web at: http://bimib.disco.unimib.it/index.php/Tronco Contact: daniele.ramazzotti@disco.unimib.it Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 89
    Publication Date: 2015-09-11
    Description: Motivation: Model organisms play critical roles in biomedical research of human diseases and drug development. An imperative task is to translate information/knowledge acquired from model organisms to humans. In this study, we address a trans-species learning problem: predicting human cell responses to diverse stimuli, based on the responses of rat cells treated with the same stimuli. Results: We hypothesized that rat and human cells share a common signal-encoding mechanism but employ different proteins to transmit signals, and we developed a bimodal deep belief network and a semi-restricted bimodal deep belief network to represent the common encoding mechanism and perform trans-species learning. These ‘deep learning’ models include hierarchically organized latent variables capable of capturing the statistical structures in the observed proteomic data in a distributed fashion. The results show that the models significantly outperform two current state-of-the-art classification algorithms. Our study demonstrated the potential of using deep hierarchical models to simulate cellular signaling systems. Availability and implementation: The software is available at the following URL: http://pubreview.dbmi.pitt.edu/TransSpeciesDeepLearning/ . The data are available through SBV IMPROVER website, https://www.sbvimprover.com/challenge-2/overview , upon publication of the report by the organizers. Contact : xinghua@pitt.edu Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 90
    Publication Date: 2015-09-11
    Description: Motivation: Identification of differentially expressed genes is an important step in extracting knowledge from gene expression profiling studies. The raw expression data from microarray and other high-throughput technologies is deposited into the Gene Expression Omnibus (GEO) and served as Simple Omnibus Format in Text (SOFT) files. However, to extract and analyze differentially expressed genes from GEO requires significant computational skills. Results: Here we introduce GEO2Enrichr, a browser extension for extracting differentially expressed gene sets from GEO and analyzing those sets with Enrichr, an independent gene set enrichment analysis tool containing over 70 000 annotated gene sets organized into 75 gene-set libraries. GEO2Enrichr adds JavaScript code to GEO web-pages; this code scrapes user selected accession numbers and metadata, and then, with one click, users can submit this information to a web-server application that downloads the SOFT files, parses, cleans and normalizes the data, identifies the differentially expressed genes, and then pipes the resulting gene lists to Enrichr for downstream functional analysis. GEO2Enrichr opens a new avenue for adding functionality to major bioinformatics resources such as GEO by integrating tools and resources without the need for a plug-in architecture. Importantly, GEO2Enrichr helps researchers to quickly explore hypotheses with little technical overhead, lowering the barrier of entry for biologists by automating data processing steps needed for knowledge extraction from the major repository GEO. Availability and implementation: GEO2Enrichr is an open source tool, freely available for installation as browser extensions at the Chrome Web Store and FireFox Add-ons. Documentation and a browser independent web application can be found at http://amp.pharm.mssm.edu/g2e/ . Contact: avi.maayan@mssm.edu
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 91
    Publication Date: 2015-09-11
    Description: We report the creation of Drug Signatures Database (DSigDB), a new gene set resource that relates drugs/compounds and their target genes, for gene set enrichment analysis (GSEA). DSigDB currently holds 22 527 gene sets, consisting of 17 389 unique compounds covering 19 531 genes. We also developed an online DSigDB resource that allows users to search, view and download drugs/compounds and gene sets. DSigDB gene sets provide seamless integration to GSEA software for linking gene expressions with drugs/compounds for drug repurposing and translational research. Availability and implementation: DSigDB is freely available for non-commercial use at http://tanlab.ucdenver.edu/DSigDB . Supplementary information: Supplementary data are available at Bioinformatics online. Contact: aikchoon.tan@ucdenver.edu
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 92
    Publication Date: 2015-09-11
    Description: We describe the implementation of the method introduced by Chambaz et al. in 2012. We also demonstrate its genome-wide application to the integrative search of new regions with strong association between DNA copy number and gene expression accounting for DNA methylation in breast cancers. Availability and implementation: An open-source R package tmle.npvi is available from CRAN ( http://cran.r-project.org/ ). Contact: pierre.neuvial@genopole.cnrs.fr
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 93
    Publication Date: 2015-09-11
    Description: Stenotrophomonas maltophilia, a ubiquitous Gram-negative γ-proteobacterium, has emerged as an important opportunistic pathogen responsible for nosocomial infections. A major characteristic of clinical isolates is their high intrinsic or acquired antibiotic resistance level. The aim of this study was to decipher the genetic determinism of antibiotic resistance among strains from different origins (i.e., natural environment and clinical origin) showing various antibiotic resistance profiles. To this purpose, we selected three strains isolated from soil collected in France or Burkina Faso that showed contrasting antibiotic resistance profiles. After whole-genome sequencing, the phylogenetic relationships of these 3 strains and 11 strains with available genome sequences were determined. Results showed that the strains’ phylogeny did not match their origin or antibiotic resistance profiles. Numerous antibiotic resistance coding genes and efflux pump operons were revealed by the genome analysis, with 57% of the identified genes not previously described. No major variation in the antibiotic resistance gene content was observed between strains irrespective of their origin and antibiotic resistance profiles. Although environmental strains generally carry as many multidrug resistant (MDR) efflux pumps as clinical strains, the absence of resistance–nodulation–division (RND) pumps (i.e., SmeABC) previously described to be specific to S. maltophilia was revealed in two environmental strains (BurA1 and PierC1). Furthermore the genome analysis of the environmental MDR strain BurA1 showed the absence of SmeABC but the presence of another putative MDR RND efflux pump, named EbyCAB on a genomic island probably acquired through horizontal gene transfer.
    Electronic ISSN: 1759-6653
    Topics: Biology
  • 94
    Publication Date: 2015-09-11
    Description: Motivation: Genes with indispensable functions are identified as essential; however, the traditional gene-level studies of essentiality have several limitations. In this study, we characterized gene essentiality from a new perspective of protein domains, the independent structural or functional units of a polypeptide chain. Results: To identify such essential domains, we have developed an Expectation–Maximization (EM) algorithm-based Essential Domain Prediction (EDP) Model. With simulated datasets, the model provided convergent results given different initial values and offered accurate predictions even with noise. We then applied the EDP model to six microbial species and predicted 1879 domains to be essential in at least one species, ranging 10–23% in each species. The predicted essential domains were more conserved than either non-essential domains or essential genes. Comparing essential domains in prokaryotes and eukaryotes revealed an evolutionary distance consistent with that inferred from ribosomal RNA. When utilizing these essential domains to reproduce the annotation of essential genes, we received accurate results that suggest protein domains are more basic units for the essentiality of genes. Furthermore, we presented several examples to illustrate how the combination of essential and non-essential domains can lead to genes with divergent essentiality. In summary, we have described the first systematic analysis on gene essentiality on the level of domains. Contact: huilu.bioinfo@gmail.com or Long.Lu@cchmc.org Supplementary Information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 95
    Publication Date: 2015-09-11
    Description: Motivation: Information-theoretic and compositional analysis of biological sequences, in terms of k-mer dictionaries, has a well established role in genomic and proteomic studies. Much less so in epigenomics, although the role of k-mers in chromatin organization and nucleosome positioning is particularly relevant. Fundamental questions concerning the informational content and compositional structure of nucleosome favouring and disfavoring sequences with respect to their basic building blocks still remain open. Results: We present the first analysis on the role of k-mers in the composition of nucleosome enriched and depleted genomic regions (NER and NDR for short) that is: (i) exhaustive and within the bounds dictated by the information-theoretic content of the sample sets we use and (ii) informative for comparative epigenomics. We analyze four different organisms and we propose a paradigmatic formalization of k-mer dictionaries, providing two different and complementary views of the k-mers involved in NER and NDR. The first extends well known studies in this area, its comparative nature being its major merit. The second, very novel, brings to light the rich variety of k-mers involved in influencing nucleosome positioning, for which an initial classification in terms of clusters is also provided. Although such a classification offers many insights, the following deserves to be singled out: short poly(dA:dT) tracts are reported in the literature as fundamental for nucleosome depletion, however a global quantitative look reveals that their role is much less prominent than one would expect based on previous studies. Availability and implementation: Dictionaries, clusters and Supplementary Material are available online at http://math.unipa.it/rombo/epigenomics/ . Contact: simona.rombo@unipa.it Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 96
    Publication Date: 2015-09-11
    Description: Motivation: The number of reported genetic variants is rapidly growing, empowered by ever faster accumulation of next-generation sequencing data. A major issue is comparability. Standards that address the combined problem of inaccurately predicted breakpoints and repeat-induced ambiguities are missing. This decisively lowers the quality of ‘consensus’ callsets and hampers the removal of duplicate entries in variant databases, which can have deleterious effects in downstream analyses. Results: We introduce a sound framework for comparison of deletions that captures both tool-induced inaccuracies and repeat-induced ambiguities. We present a maximum matching algorithm that outputs virtual duplicates among two sets of predictions/annotations. We demonstrate that our approach is clearly superior over ad hoc criteria, like overlap, and that it can reduce the redundancy among callsets substantially. We also identify large amounts of duplicate entries in the Database of Genomic Variants, which points out the immediate relevance of our approach. Availability and implementation: Implementation is open source and available from https://bitbucket.org/readdi/readdi Contact: roland.wittler@uni-bielefeld.de or t.marschall@mpi-inf.mpg.de Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 97
    Publication Date: 2015-09-11
    Description: Motivation: Deep sequencing of clinical samples is now an established tool for the detection of infectious pathogens, with direct medical applications. The large amount of data generated produces an opportunity to detect species even at very low levels, provided that computational tools can effectively profile the relevant metagenomic communities. Data interpretation is complicated by the fact that short sequencing reads can match multiple organisms and by the lack of completeness of existing databases, in particular for viral pathogens. Here we present metaMix, a Bayesian mixture model framework for resolving complex metagenomic mixtures. We show that the use of parallel Monte Carlo Markov chains for the exploration of the species space enables the identification of the set of species most likely to contribute to the mixture. Results: We demonstrate the greater accuracy of metaMix compared with relevant methods, particularly for profiling complex communities consisting of several related species. We designed metaMix specifically for the analysis of deep transcriptome sequencing datasets, with a focus on viral pathogen detection; however, the principles are generally applicable to all types of metagenomic mixtures. Availability and implementation: metaMix is implemented as a user friendly R package, freely available on CRAN: http://cran.r-project.org/web/packages/metaMix Contact: sofia.morfopoulou.10@ucl.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 98
    Publication Date: 2015-09-11
    Description: Motivation: A detailed analysis of multidimensional NMR spectra of macromolecules requires the identification of individual resonances (peaks). This task can be tedious and time-consuming and often requires support by experienced users. Automated peak picking algorithms were introduced more than 25 years ago, but there are still major deficiencies/flaws that often prevent complete and error free peak picking of biological macromolecule spectra. The major challenges of automated peak picking algorithms are both the distinction of artifacts from real peaks, particularly from those with irregular shapes, and also picking peaks in spectral regions with overlapping resonances, which are very hard to resolve by existing computer algorithms. In both of these cases a visual inspection approach could be more effective than a ‘blind’ algorithm. Results: We present a novel approach using computer vision (CV) methodology which could be better adapted to the problem of peak recognition. After suitable ‘training’ we successfully applied the CV algorithm to spectra of medium-sized soluble proteins up to molecular weights of 26 kDa and to a 130 kDa complex of a tetrameric membrane protein in detergent micelles. Our CV approach outperforms commonly used programs. With suitable training datasets the application of the presented method can be extended to automated peak picking in multidimensional spectra of nucleic acids or carbohydrates and adapted to solid-state NMR spectra. Availability and implementation: CV-Peak Picker is available upon request from the authors. Contact: gsw@mol.biol.ethz.ch; michal.walczak@mol.biol.ethz.ch; adam.gonczarek@pwr.edu.pl Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 99
    Publication Date: 2015-09-11
    Description: PsyGeNET (Psychiatric disorders and Genes association NETwork) is a knowledge platform for the exploratory analysis of psychiatric diseases and their associated genes. PsyGeNET is composed of a database and a web interface supporting data search, visualization, filtering and sharing. PsyGeNET integrates information from DisGeNET and data extracted from the literature by text mining, which has been curated by domain experts. It currently contains 2642 associations between 1271 genes and 37 psychiatric disease concepts. In its first release, PsyGeNET is focused on three psychiatric disorders: major depression, alcohol and cocaine use disorders. PsyGeNET represents a comprehensive, open access resource for the analysis of the molecular mechanisms underpinning psychiatric disorders and their comorbidities. Availability and implementation: The PsyGeNET platform is freely available at http://www.psygenet.org/ . The PsyGeNET database is made available under the Open Database License ( http://opendatacommons.org/licenses/odbl/1.0/ ). Contact: lfurlong@imim.es Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
  • 100
    Publication Date: 2015-09-19
    Description: The expansion of Bantu-speaking agropastoralist populations had a great impact on the genetic, linguistic, and cultural variation of sub-Saharan Africa. It is generally accepted that Bantu languages originated in an area around the present border between Cameroon and Nigeria approximately 5,000 years ago, from where they spread South and East, becoming the largest African linguistic branch. The demic consequences of this event are reflected in the relatively high genetic homogeneity observed across most sub-Saharan African populations. In this work, we explored genome-wide single nucleotide polymorphism data from 28 populations to characterize the genetic components present in sub-Saharan African populations. Combining novel data from four Southern African populations with previously published results, we reject the hypothesis that the "non-Bantu" genetic component reported in South-Eastern Africa (Mozambique) reflects extensive gene flow between incoming agriculturalist and resident hunter-gatherer communities. We alternatively suggest that this novel component is the result of demographic dynamics associated with the Bantu dispersal.
    Electronic ISSN: 1759-6653
    Topics: Biology