ALBERT

All Library Books, journals and Electronic Records Telegrafenberg

  • 1
    American Meteorological Society
    Publication Date: 2017-04-04
    Description: We study the quasigeostrophic merging dynamics of axisymmetric baroclinic vortices to understand how baroclinicity affects merging rates and the development of the nonlinear cascade of enstrophy. The initial vortices are taken to simulate closely the horizontal and vertical structure of Gulf Stream rings. A quasigeostrophic model is set with a horizontal resolution of 9 km and 6 vertical levels to resolve the mean stratification of the Gulf Stream region. The results show that the baroclinic merging is slower than the purely barotropic process. The merging is shown to occur in two phases: the first, which produces clove-shaped vortices and diffusive mixing of vorticity contours; and the second, which consists of the sliding of the remaining vorticity cores with a second diffusive mixing of the internal vorticity field. Comparison among Nof, Cushman-Roisin, Polvani et al., and Dewar and Killworth merging events indicates a substantial agreement in the kinematics of the process. Parameter sensitivity experiments show that the decrease of the baroclinicity parameter of the system, Γ^2 [defined as Γ^2 = (D^2 f_0^2)/(N_0^2 H^2)], increases the speed of merging while its increase slows down the merging. However, the halting effect of baroclinicity (large Γ^2 or small Rossby radii of deformation) reaches a saturation level where the merging becomes insensitive to larger Γ^2 values. Furthermore, we show that a regime of small Γ^2 exists at which the merged baroclinic vortex is unstable (metastable) and breaks again into two new vortices. Thus, in the baroclinic case the range of Γ^2 determines the stability of the merged vortex. We analyze these results by local energy and vorticity balances, showing that the horizontal divergence of pressure work term [∇ · (pv)] and the relative-vorticity advection term [v · ∇(∇^2 φ)] trigger the merging during the first phase.
Due to this horizontal redistribution process, a net kinetic to gravitational energy conversion occurs via buoyancy work in the region external to the cores of the vortices. The second phase of merging is dominated by a direct baroclinic conversion of available gravitational energy into kinetic energy, which in turn triggers a horizontal energy redistribution producing the final fusion of the vortex centers. This energy and vorticity analysis supports the hypothesis that merging is an internal mixing process triggered by a horizontal redistribution of kinetic energy.
    Description: The work has been financed by a grant from the Progetto Finalizzato "Calcolo Parallelo"
    Description: Published
    Description: 1618/1637
    Description: 4A. Clima e Oceani
    Description: JCR Journal
    Description: restricted
    Keywords: Ocean modeling ; Vortex dynamics ; Baroclinicity ; Eddies ; 03. Hydrosphere::03.01. General::03.01.01. Analytical and numerical modeling
    Repository Name: Istituto Nazionale di Geofisica e Vulcanologia (INGV)
    Type: article
  • 2
    Publication Date: 2022-05-25
    Description: Author Posting. © American Meteorological Society, 2010. This article is posted here by permission of American Meteorological Society for personal use, not for redistribution. The definitive version was published in Journal of Physical Oceanography 40 (2010): 789-801, doi:10.1175/2009JPO4039.1.
    Description: The issue of internal wave–mesoscale eddy interactions is revisited. Previous observational work identified the mesoscale eddy field as a possible source of internal wave energy. Characterization of the coupling as a viscous process provides a smaller horizontal transfer coefficient than previously obtained, with ν_h ≈ 50 m^2 s^−1 in contrast to ν_h ≈ 200–400 m^2 s^−1, and a vertical transfer coefficient bounded away from zero, with ν_v + (f^2/N^2)K_h ≈ 2.5 ± 0.3 × 10^−3 m^2 s^−1 in contrast to ν_v + (f^2/N^2)K_h = 0 ± 2 × 10^−2 m^2 s^−1. Current meter data from the Local Dynamics Experiment of the PolyMode field program indicate mesoscale eddy–internal wave coupling through horizontal interactions (i) is a significant sink of eddy energy and (ii) plays an O(1) role in the energy budget of the internal wave field.
    Keywords: Eddies ; Internal waves ; Mesoscale processes
    Repository Name: Woods Hole Open Access Server
    Type: Article
    Format: application/pdf
  • 3
    Publication Date: 2022-05-25
    Description: Author Posting. © American Meteorological Society, 2008. This article is posted here by permission of American Meteorological Society for personal use, not for redistribution. The definitive version was published in Journal of Physical Oceanography 38 (2008): 2556-2574, doi:10.1175/2008JPO3666.1.
    Description: Vertical profiles of horizontal velocity obtained during the Mid-Ocean Dynamics Experiment (MODE) provided the first published estimates of the high vertical wavenumber structure of horizontal velocity. The data were interpreted as being representative of the background internal wave field, and thus, despite some evidence of excess downward energy propagation associated with coherent near-inertial features that was interpreted in terms of atmospheric generation, these data provided the basis for a revision to the Garrett and Munk spectral model. These data are reinterpreted through the lens of 30 years of research. Rather than representing the background wave field, atmospheric generation, or even near-inertial wave trapping, the coherent high wavenumber features are characteristic of internal wave capture in a mesoscale strain field. Wave capture represents a generalization of critical layer events for flows lacking the spatial symmetry inherent in a parallel shear flow or isolated vortex.
    Description: Salary support for this analysis was provided by Woods Hole Oceanographic Institution bridge support funds.
    Keywords: Eddies ; Ocean dynamics ; Internal waves ; Ocean variability
    Repository Name: Woods Hole Open Access Server
    Type: Article
    Format: application/pdf
  • 4
    Publication Date: 2022-05-25
    Description: Author Posting. © American Meteorological Society, 2011. This article is posted here by permission of American Meteorological Society for personal use, not for redistribution. The definitive version was published in Journal of Physical Oceanography 41 (2011): 889–910, doi:10.1175/2010JPO4496.1.
    Description: This paper examines interaction between a barotropic point vortex and a steplike topography with a bay-shaped shelf. The interaction is governed by two mechanisms: propagation of topographic Rossby waves and advection by the forcing vortex. Topographic waves are supported by the potential vorticity (PV) jump across the topography and propagate along the step only in one direction, having higher PV on the right. Near one side boundary of the bay, which is in the wave propagation direction and has a narrow shelf, waves are blocked by the boundary, inducing strong out-of-bay transport in the form of detached crests. The wave–boundary interaction as well as out-of-bay transport is strengthened as the minimum shelf width is decreased. The two control mechanisms are related differently in anticyclone- and cyclone-induced interactions. In anticyclone-induced interactions, the PV front deformations are moved in opposite directions by the point vortex and topographic waves; a topographic cyclone forms out of the balance between the two opposing mechanisms and is advected by the forcing vortex into the deep ocean. In cyclone-induced interactions, the PV front deformations are moved in the same direction by the two mechanisms; a topographic cyclone forms out of the wave–boundary interaction but is confined to the coast. Therefore, anticyclonic vortices are more capable of driving water off the topography. The anticyclone-induced transport is enhanced for smaller vortex–step distance or smaller topography when the vortex advection is relatively strong compared to the wave propagation mechanism.
    Description: Y. Zhang acknowledges the support of the MIT-WHOI Joint Program in Physical Oceanography, NSF OCE-9901654 and OCE-0451086. J. Pedlosky acknowledges the support of NSF OCE-9901654 and OCE-0451086.
    Keywords: Transport ; Eddies ; Barotropic flow ; Topographic effects ; Vortices ; Currents ; Potential vorticity ; Rossby waves
    Repository Name: Woods Hole Open Access Server
    Type: Article
    Format: application/pdf
  • 5
    Publication Date: 2022-05-25
    Description: Author Posting. © American Meteorological Society, 2011. This article is posted here by permission of American Meteorological Society for personal use, not for redistribution. The definitive version was published in Journal of Climate 24 (2011): 4844–4858, doi:10.1175/2011JCLI4130.1.
    Description: The factors that determine the heat transport and overturning circulation in marginal seas subject to wind forcing and heat loss to the atmosphere are explored using a combination of a high-resolution ocean circulation model and a simple conceptual model. The study is motivated by the exchange between the subpolar North Atlantic Ocean and the Nordic Seas, a region that is of central importance to the oceanic thermohaline circulation. It is shown that mesoscale eddies formed in the marginal sea play a major role in determining the mean meridional heat transport and meridional overturning circulation across the sill. The balance between the oceanic eddy heat flux and atmospheric cooling, as characterized by a nondimensional number, is shown to be the primary factor in determining the properties of the exchange. Results from a series of eddy-resolving primitive equation model calculations for the meridional heat transport, overturning circulation, density of convective waters, and density of exported waters compare well with predictions from the conceptual model over a wide range of parameter space. Scaling and model results indicate that wind effects are small and the mean exchange is primarily buoyancy forced. These results imply that one must accurately resolve or parameterize eddy fluxes in order to properly represent the mean exchange between the North Atlantic and the Nordic Seas, and thus between the Nordic Seas and the atmosphere, in climate models.
    Description: This study was supported by the National Science Foundation under Grants OCE-0726339 and OCE-0850416.
    Keywords: Eddies ; Forcing ; Meridional overturning circulation ; Transport ; North Atlantic Ocean ; Seas/gulfs/bays
    Repository Name: Woods Hole Open Access Server
    Type: Article
    Format: application/pdf
  • 6
    Publication Date: 2022-05-25
    Description: Author Posting. © American Meteorological Society, 2013. This article is posted here by permission of American Meteorological Society for personal use, not for redistribution. The definitive version was published in Journal of Physical Oceanography 43 (2013): 283–300, doi:10.1175/JPO-D-11-0240.1.
    Description: Motivated by the recent interest in ocean energetics, the widespread use of horizontal eddy viscosity in models, and the promise of high horizontal resolution data from the planned wide-swath satellite altimeter, this paper explores the impacts of horizontal eddy viscosity and horizontal grid resolution on geostrophic turbulence, with a particular focus on spectral kinetic energy fluxes Π(K) computed in the isotropic wavenumber (K) domain. The paper utilizes idealized two-layer quasigeostrophic (QG) models, realistic high-resolution ocean general circulation models, and present-generation gridded satellite altimeter data. Adding horizontal eddy viscosity to the QG model results in a forward cascade at smaller scales, in apparent agreement with results from present-generation altimetry. Eddy viscosity is taken to roughly represent coupling of mesoscale eddies to internal waves or to submesoscale eddies. Filtering the output of either the QG or realistic models before computing Π(K) also greatly increases the forward cascade. Such filtering mimics the smoothing inherent in the construction of present-generation gridded altimeter data. It is therefore difficult to say whether the forward cascades seen in present-generation altimeter data are due to real physics (represented here by eddy viscosity) or to insufficient horizontal resolution. The inverse cascade at larger scales remains in the models even after filtering, suggesting that its existence in the models and in altimeter data is robust. However, the magnitude of the inverse cascade is affected by filtering, suggesting that the wide-swath altimeter will allow a more accurate determination of the inverse cascade at larger scales as well as providing important constraints on smaller-scale dynamics.
    Description: BKA received support from Office of Naval Research Grant N00014-11-1-0487, National Science Foundation (NSF) Grants OCE-0924481 and OCE-09607820, and University of Michigan startup funds. KLP acknowledges support from Woods Hole Oceanographic Institution bridge support funds. RBS acknowledges support from NSF grants OCE-0960834 and OCE-0851457, a contract with the National Oceanography Centre, Southampton, and a NASA subcontract to Boston University. JFS and JGR were supported by the projects “Global and remote littoral forcing in global ocean models” and “Ageostrophic vorticity dynamics of the ocean,” respectively, both sponsored by the Office of Naval Research under program element 601153N.
    Description: 2013-08-01
    Keywords: Eddies ; Nonlinear dynamics ; Ocean dynamics ; Satellite observations ; Ocean models
    Repository Name: Woods Hole Open Access Server
    Type: Article
    Format: application/pdf
  • 7
    Publication Date: 2022-05-25
    Description: Author Posting. © American Meteorological Society, 2008. This article is posted here by permission of American Meteorological Society for personal use, not for redistribution. The definitive version was published in Journal of Physical Oceanography 38 (2008): 133–145, doi:10.1175/2007JPO3782.1.
    Description: Five ice-tethered profilers (ITPs), deployed between 2004 and 2006, have provided detailed potential temperature θ and salinity S profiles from 21 anticyclonic eddy encounters in the central Canada Basin of the Arctic Ocean. The 12–35-m-thick eddies have center depths between 42 and 69 m in the Arctic halocline, and are shallower and less dense than the majority of eddies observed previously in the central Canada Basin. They are characterized by anomalously cold θ and low stratification, and have horizontal scales on the order of, or less than, the Rossby radius of deformation (about 10 km). Maximum azimuthal speeds estimated from dynamic heights (assuming cyclogeostrophic balance) are between 9 and 26 cm s^−1, an order of magnitude larger than typical ambient flow speeds in the central basin. Eddy θ–S and potential vorticity properties, as well as horizontal and vertical scales, are consistent with their formation by instability of a surface front at about 80°N that appears in historical CTD and expendable CTD (XCTD) measurements. This would suggest eddy lifetimes longer than 6 months. While the baroclinic instability of boundary currents cannot be ruled out as a generation mechanism, it is less likely since deeper eddies that would originate from the deeper-reaching boundary flows are not observed in the survey region.
    Description: The engineering design work for the ITP was initiated by the Cecil H. and Ida M. Green Technology Innovation Program (an internal program at the Woods Hole Oceanographic Institution). Prototype development and construction were funded jointly by the U.S. National Science Foundation (NSF) Oceanographic Technology and Interdisciplinary Coordination Program and Office of Polar Programs (OPP) under Award OCE-0324233. Continued support has been provided by the OPP Arctic Sciences Section under Award ARC-0519899 and internal WHOI funding.
    Keywords: Arctic ; Eddies ; Profilers ; Stability ; Salinity
    Repository Name: Woods Hole Open Access Server
    Type: Article
    Format: application/pdf
  • 8
    Publication Date: 2022-05-25
    Description: Author Posting. © American Meteorological Society, 2008. This article is posted here by permission of American Meteorological Society for personal use, not for redistribution. The definitive version was published in Journal of Physical Oceanography 38 (2008): 1644-1668, doi:10.1175/2007JPO3829.1.
    Description: The mean structure and time-dependent behavior of the shelfbreak jet along the southern Beaufort Sea, and its ability to transport properties into the basin interior via eddies are explored using high-resolution mooring data and an idealized numerical model. The analysis focuses on springtime, when weakly stratified winter-transformed Pacific water is being advected out of the Chukchi Sea. When winds are weak, the observed jet is bottom trapped with a low potential vorticity core and has maximum mean velocities of O(25 cm s^−1) and an eastward transport of 0.42 Sv (1 Sv ≡ 10^6 m^3 s^−1). Despite the absence of winds, the current is highly time dependent, with relative vorticity and twisting vorticity often important components of the Ertel potential vorticity. An idealized primitive equation model forced by dense, weakly stratified waters flowing off a shelf produces a mean middepth boundary current similar in structure to that observed at the mooring site. The model boundary current is also highly variable, and produces numerous strong, small anticyclonic eddies that transport the shelf water into the basin interior. Analysis of the energy conversion terms in both the mooring data and the numerical model indicates that the eddies are formed via baroclinic instability of the boundary current. The structure of the eddies in the basin interior compares well with observations from drifting ice platforms. The results suggest that eddies shed from the shelfbreak jet contribute significantly to the offshore flux of heat, salt, and other properties, and are likely important for the ventilation of the halocline in the western Arctic Ocean. Interaction with an anticyclonic basin-scale circulation, meant to represent the Beaufort gyre, enhances the offshore transport of shelf water and results in a loss of mass transport from the shelfbreak jet.
    Description: This study was supported by the National Science Foundation Office of Polar Programs under Grants 0421904 and 035268 (MS), and by the Office of Naval Research Grant N00014-02-1-0317 (RP and PF). Analysis by AJP was supported by the Office of Naval Research under Grant N00014-97-1-0135 and by the National Science Foundation under Grant OPP-9815303.
    Keywords: Arctic ; Eddies ; Transport ; Currents ; Jets
    Repository Name: Woods Hole Open Access Server
    Type: Article
    Format: application/pdf
  • 9
    Publication Date: 2022-05-25
    Description: Author Posting. © American Meteorological Society, 2008. This article is posted here by permission of American Meteorological Society for personal use, not for redistribution. The definitive version was published in Journal of Physical Oceanography 38 (2008): 1992-2002, doi:10.1175/2008JPO3669.1.
    Description: This paper extends A. Bracco and J. Pedlosky’s investigation of the eddy-formation mechanism in the eastern Labrador Sea by including a more realistic depiction of the boundary current. The quasigeostrophic model consists of a meridional, coastally trapped current with three vertical layers. The current configuration and topographic domain are chosen to match, as closely as possible, the observations of the boundary current and the varying topographic slope along the West Greenland coast. The role played by the bottom-intensified component of the boundary current on the formation of the Labrador Sea Irminger Rings is explored. Consistent with the earlier study, a short, localized bottom-trapped wave is responsible for most of the perturbation energy growth. However, for the instability to occur in the three-layer model, the deepest component of the boundary current must be sufficiently strong, highlighting the importance of the near-bottom flow. The model is able to reproduce important features of the observed vortices in the eastern Labrador Sea, including the polarity, radius, rate of formation, and vertical structure. At the time of formation, the eddies have a surface signature as well as a strong circulation at depth, possibly allowing for the transport of both surface and near-bottom water from the boundary current into the interior basin. This work also supports the idea that changes in the current structure could be responsible for the observed interannual variability in the number of Irminger Rings formed.
    Description: AB is supported by WHOI unrestricted funds, JP by the National Science Foundation OCE 85108600, and RP by 0450658.
    Keywords: Eddies ; Boundary currents ; Quasigeostrophic models ; North Atlantic ; Coastlines
    Repository Name: Woods Hole Open Access Server
    Type: Article
    Format: application/pdf
  • 10
    Publication Date: 2022-05-25
    Description: Author Posting. © American Meteorological Society, 2007. This article is posted here by permission of American Meteorological Society for personal use, not for redistribution. The definitive version was published in Journal of Physical Oceanography 37 (2007): 1103-1121, doi:10.1175/jpo3041.1.
    Description: The role of mesoscale oceanic eddies is analyzed in a quasigeostrophic coupled ocean–atmosphere model operating at a large Reynolds number. The model dynamics are characterized by decadal variability that involves nonlinear adjustment of the ocean to coherent north–south shifts of the atmosphere. The oceanic eddy effects are diagnosed by the dynamical decomposition method adapted for nonstationary external forcing. The main effects of the eddies are an enhancement of the oceanic eastward jet separating the subpolar and subtropical gyres and a weakening of the gyres. The flow-enhancing effect is due to nonlinear rectification driven by fluctuations of the eddy forcing. This is a nonlocal process involving generation of the eddies by the flow instabilities in the western boundary current and the upstream part of the eastward jet. The eddies are advected by the mean current to the east, where they backscatter into the rectified enhancement of the eastward jet. The gyre-weakening effect, which is due to the time-mean buoyancy component of the eddy forcing, is a result of the baroclinic instability of the westward return currents. The diagnosed eddy forcing is parameterized in a non-eddy-resolving ocean model, as a nonstationary random process, in which the corresponding parameters are derived from the control coupled simulation. The key parameter of the random process—its variance—is related to the large-scale flow baroclinicity index. It is shown that the coupled model with the non-eddy-resolving ocean component and the parameterized eddies correctly simulates climatology and low-frequency variability of the control eddy-resolving coupled solution.
    Description: Funding for this work came from NSF Grants OCE 02-221066 and OCE 03-44094. Additional funding for PB was provided by the U.K. Royal Society Fellowship and by WHOI Grants 27100056 and 52990035.
    Keywords: Ocean dynamics ; Ocean models ; Eddies ; Jets ; Coupled models
    Repository Name: Woods Hole Open Access Server
    Type: Article
    Format: application/pdf
  • 11
    Publication Date: 2022-05-25
    Description: Author Posting. © American Meteorological Society, 2011. This article is posted here by permission of American Meteorological Society for personal use, not for redistribution. The definitive version was published in Journal of Physical Oceanography 41 (2011): 2168–2186, doi:10.1175/JPO-D-11-08.1.
    Description: This paper studies the interaction of an Antarctic Circumpolar Current (ACC)–like wind-driven channel flow with a continental slope and a flat-bottomed bay-shaped shelf near the channel’s southern boundary. Interaction between the model ACC and the topography in the second layer induces local changes of the potential vorticity (PV) flux, which further causes the formation of a first-layer PV front near the base of the topography. Located between the ACC and the first-layer slope, the newly formed PV front is constantly perturbed by the ACC and in turn forces the first-layer slope with its own variability in an intermittent but persistent way. The volume transport of the slope water across the first-layer slope edge is mostly directly driven by eddies and meanders of the new front, and its magnitude is similar to the maximum Ekman transport in the channel. Near the bay’s opening, the effect of the topographic waves, excited by offshore variability, dominates the cross-isobath exchange and induces a mean clockwise shelf circulation. The waves’ propagation is only toward the west and tends to be blocked by the bay’s western boundary in the narrow-shelf region. The ensuing wave–coast interaction amplifies the wave amplitude and the cross-shelf transport. Because the interaction only occurs near the western boundary, the shelf water in the west of the bay is more readily carried offshore than that in the east and the mean shelf circulation is also intensified along the bay’s western boundary.
    Description: Y. Zhang acknowledges the support of the MIT-WHOI Joint Program in Physical Oceanography and NSF OCE-9901654 and OCE-0451086. J. Pedlosky acknowledges the support of NSF OCE-9901654 and OCE-0451086.
    Keywords: Baroclinic flows ; Eddies ; Fronts ; Mass fluxes/transport ; Mesoscale processes ; Topographic effects
    Repository Name: Woods Hole Open Access Server
    Type: Article
    Format: application/pdf
  • 12
    Publication Date: 2022-05-25
    Description: Author Posting. © American Meteorological Society, 2012. This article is posted here by permission of American Meteorological Society for personal use, not for redistribution. The definitive version was published in Journal of Physical Oceanography 42 (2012): 2206–2228, doi:10.1175/JPO-D-11-0191.1.
    Description: This study investigates the anisotropic properties of the eddy-induced material transport in the near-surface North Atlantic from two independent datasets, one simulated from the sea surface height altimetry and one derived from real-ocean surface drifters, and systematically examines the interactions between the mean- and eddy-induced material transport in the region. The Lagrangian particle dispersion, which is widely used to characterize the eddy-induced tracer fluxes, is quantified by constructing the “spreading ellipses.” The analysis consistently demonstrates that this dispersion is spatially inhomogeneous and strongly anisotropic. The spreading is larger and more anisotropic in the subtropical than in the subpolar gyre, and the largest ellipses occur in the Gulf Stream vicinity. Even at times longer than half a year, the spreading exhibits significant nondiffusive behavior in some parts of the domain. The eddies in this study are defined as deviations from the long-term time-mean. The contributions from the climatological annual cycle, interannual, and subannual (shorter than one year) variability are investigated, and the latter is shown to have the strongest effect on the anisotropy of particle spreading. The influence of the mean advection on the eddy-induced particle spreading is investigated using the “eddy-following-full-trajectories” technique and is found to be significant. The role of the Ekman advection is, however, secondary. The pronounced anisotropy of particle dispersion is expected to have important implications for distributing oceanic tracers, and for parameterizing eddy-induced tracer transfer in non-eddy-resolving models.
    Description: IR was supported by Grant NSF-OCE-0725796. IK would like to acknowledge support by the National Science Foundation Grant OCE-0842834.
    Description: 2013-06-01
    Keywords: North Atlantic Ocean ; Diffusion ; Dispersion ; Eddies ; Lagrangian circulation/transport ; Trajectories
    Repository Name: Woods Hole Open Access Server
    Type: Article
    Format: application/pdf
  • 13
    Publication Date: 2022-05-25
    Description: Author Posting. © American Meteorological Society, 2013. This article is posted here by permission of American Meteorological Society for personal use, not for redistribution. The definitive version was published in Journal of Physical Oceanography 43 (2013): 905–919, doi:10.1175/JPO-D-12-0150.1.
    Description: Interactions between vortices and a shelfbreak current are investigated, with particular attention to the exchange of waters between the continental shelf and slope. The nonlinear, three-dimensional interaction between an anticyclonic vortex and the shelfbreak current is studied in the laboratory while varying the ratio ε of the maximum azimuthal velocity in the vortex to the maximum alongshelf velocity in the shelfbreak current. Strong interactions between the shelfbreak current and the vortex are observed when ε > 1; weak interactions are found when ε < 1. When the anticyclonic vortex comes in contact with the shelfbreak front during a strong interaction, a streamer of shelf water is drawn offshore and wraps anticyclonically around the vortex. Measurements of the offshore transport and identification of the particle trajectories in the shelfbreak current drawn offshore from the vortex allow quantification of the fraction of the shelfbreak current that is deflected onto the slope; this fraction increases for increasing values of ε. Experimental results in the laboratory are strikingly similar to results obtained from observations in the Middle Atlantic Bight (MAB); after proper scaling, measurements of offshore transport and offshore displacement of shelf water for vortices in the MAB that span a range of values of ε agree well with laboratory predictions.
    Description: Laboratory work was supported by the National Science Foundation through Grant OCE-0081756. Glider observations in March–April 2006 were supported by the National Science Foundation through Grant OCE-0220769. Glider observations in July–October 2007 were supported by a grant from Raytheon. RET was supported by the Postdoctoral Scholar Program at the Woods Hole Oceanographic Institution, with funding provided by the Cooperative Institute for the North Atlantic Region. The REMUS observations were funded by the Office of Naval Research. GGG was supported by the National Science Foundation through Grant OCE-1129125 for analysis and writing.
    Description: 2013-11-01
    Keywords: Continental shelf/slope ; Eddies ; Fronts ; Transport ; Laboratory/physical models
    Repository Name: Woods Hole Open Access Server
    Type: Article
    Format: application/pdf
  • 14
    Publication Date: 2022-05-25
    Description: Author Posting. © American Meteorological Society, 2013. This article is posted here by permission of American Meteorological Society for personal use, not for redistribution. The definitive version was published in Journal of Climate 26 (2013): 9839–9859, doi:10.1175/JCLI-D-12-00647.1.
    Description: Spatial and temporal covariability between the atmospheric transient eddy heat fluxes (i.e., υ′T′ and υ′q′) in the Northern Hemisphere winter (January–March) and the paths of the Gulf Stream (GS), Kuroshio Extension (KE), and Oyashio Extension (OE) are examined based on an atmospheric reanalysis and ocean observations for 1979–2009. For the climatological winter mean, the northward heat fluxes by the synoptic (2–8 days) transient eddies exhibit canonical storm tracks with their maxima collocated with the GS and KE/OE. The intraseasonal (8 days–3 months) counterpart, while having overall similar amplitude, shows a spatial pattern with more localized maxima near the major orography and blocking regions. Lateral heat flux divergence by transient eddies as the sum of the two frequency bands exhibits very close coupling with the exact locations of the ocean fronts. Linear regression is used to examine the lead–lag relationship between interannual changes in the northward heat fluxes by the transient eddies and the meridional changes in the paths of the GS, KE, and OE, respectively. One to three years prior to the northward shifts of each ocean front, the atmospheric storm tracks shift northward and intensify, which is consistent with wind-driven changes of the ocean. Following the northward shifts of the ocean fronts, the synoptic storm tracks weaken in all three cases. The zonally integrated northward heat transport by the synoptic transient eddies increases by ~5% of its maximum mean value prior to the northward shift of each ocean front and decreases to a similar amplitude afterward.
    Description: Support from the National Aeronautics and Space Administration (NASA) Physical Oceanography Program (NNX09AF35G to TJ and Y-OK) and the Department of Energy (DOE) Climate and Environmental Sciences Division (DE-SC0007052 to Y-OK) is gratefully acknowledged.
    Description: 2014-06-15
    Keywords: Atmosphere-ocean interaction ; Eddies ; Energy transport ; Storm tracks ; Heat budgets/fluxes
    Repository Name: Woods Hole Open Access Server
    Type: Article
    Format: application/pdf
  • 15
    Publication Date: 2022-05-25
    Description: Author Posting. © American Meteorological Society, 2009. This article is posted here by permission of American Meteorological Society for personal use, not for redistribution. The definitive version was published in Journal of Physical Oceanography 39 (2009): 1551-1573, doi:10.1175/2008JPO4152.1.
    Description: A conceptually simple model is presented for predicting the amplitude and periodicity of eddies generated by a steady poleward outflow in a 1½-layer β-plane formulation. The prediction model is rooted in linear quasigeostrophic dynamics but is capable of predicting the amplitude of the β plume generated by outflows in the nonlinear range. Oscillations in the plume amplitude are seen to represent a near-zero group velocity response to an adjustment process that can be traced back to linear dynamics. When the plume-amplitude oscillations become large enough so that the coherent β plume is replaced by a robust eddy field, the eddy amplitude is still constrained by the plume-amplitude prediction model. The eddy periodicity remains close to that of the predictable, near-zero group-velocity linear oscillations. Striking similarities between the patterns of variability in the model and observations south of Indonesia’s Lombok Strait suggest that the processes investigated in this study may play an important role in the generation of the observed eddy field of the Indo-Australian Basin.
    Description: This work was completed at the Woods Hole Oceanographic Institution while TS Durland was supported by the Ocean and Climate Change Institute. MA Spall was supported by NSF Grant OCE-0423975 and J Pedlosky by NSF Grant OCE-0451086. TS Durland acknowledges additional report preparation support from NASA Grant NNG05GN98G.
    Keywords: Eddies ; Intraseasonal variability ; Nonlinear models ; Shallow-water equations ; Plumes
    Repository Name: Woods Hole Open Access Server
    Type: Article
    Format: application/pdf
  • 16
    Publication Date: 2022-05-25
    Description: Author Posting. © American Meteorological Society, 2014. This article is posted here by permission of American Meteorological Society for personal use, not for redistribution. The definitive version was published in Journal of Physical Oceanography 44 (2014): 229–245, doi:10.1175/JPO-D-12-0218.1.
    Description: Data from a mooring deployed at the edge of the East Greenland shelf south of Denmark Strait from September 2007 to October 2008 are analyzed to investigate the processes by which dense water is transferred off the shelf. It is found that water denser than 27.7 kg m^−3—as dense as water previously attributed to the adjacent East Greenland Spill Jet—resides near the bottom of the shelf for most of the year with no discernible seasonality. The mean velocity in the central part of the water column is directed along the isobaths, while the deep flow is bottom intensified and veers offshore. Two mechanisms for driving dense spilling events are investigated, one due to offshore forcing and the other associated with wind forcing. Denmark Strait cyclones propagating southward along the continental slope are shown to drive off-shelf flow at their leading edges and are responsible for much of the triggering of individual spilling events. Northerly barrier winds also force spilling. Local winds generate an Ekman downwelling cell. Nonlocal winds also excite spilling, which is hypothesized to be the result of southward-propagating coastally trapped waves, although definitive confirmation is still required. The combined effect of the eddies and barrier winds results in the strongest spilling events, while in the absence of winds a train of eddies causes enhanced spilling.
    Description: The authors wish to thank Paula Fratantoni, Frank Bahr, and Dan Torres for processing the mooring data. The mooring array was capably deployed by the crew of the R/V Arni Fridriksson and recovered by the crew of the R/V Knorr. We thank Hedinn Valdimarsson for his assistance in the field work. Ken Brink provided valuable insights regarding the dynamics of shelf waves. Funding for the study was provided by National Science Foundation Grant OCE-0722694 and the Arctic Research Initiative of the Woods Hole Oceanographic Institution. We also wish to thank the Natural Environment Research Council for Ph.D. studentship funding, and the University of East Anglia’s Roberts Fund and Royal Meteorological Society for supporting travel for collaboration.
    Description: 2014-07-01
    Keywords: Geographic location/entity ; Continental shelf/slope ; Circulation/Dynamics ; Meridional overturning circulation ; Upwelling/downwelling ; Atm/Ocean Structure/Phenomena ; Eddies ; Extreme events ; Physical Meteorology and Climatology ; Air-sea interaction
    Repository Name: Woods Hole Open Access Server
    Type: Article
    Format: application/pdf
  • 17
    American Meteorological Society
    Publication Date: 2022-05-25
    Description: Author Posting. © American Meteorological Society, 2010. This article is posted here by permission of American Meteorological Society for personal use, not for redistribution. The definitive version was published in Journal of Physical Oceanography 40 (2010): 2341–2347, doi:10.1175/2010JPO4465.1.
    Description: The mean downwelling in an eddy-resolving model of a convective basin is concentrated near the boundary where eddies are shed from the cyclonic boundary current into the interior. It is suggested that the buoyancy-forced downwelling in the Labrador Sea and the Lofoten Basin is similarly concentrated in analogous eddy formation regions along their eastern boundaries. Use of a transformed Eulerian mean depiction of the density transport reveals the central role eddy fluxes play in maintaining the adiabatic nature of the flow in a nonperiodic region where heat is lost from the boundary current. The vorticity balance in the downwelling region is primarily between stretching of planetary vorticity and eddy flux divergence of relative vorticity, although a narrow viscous boundary layer is ultimately important in closing the regional vorticity budget. This overall balance is similar in some ways to the diffusive–viscous balance represented in previous boundary layer theories, and suggests that the downwelling in convective basins may be properly represented in low-resolution climate models if eddy flux parameterizations are adiabatic, identify localized regions of eddy formations, and allow density to be transported far from the region of eddy formations.
    Description: This study was supported by the National Science Foundation under Grants OCE-0726339 and OCE-0850416.
    Keywords: Eddies ; Convection ; Boundary layer ; Climate models ; Thermohaline circulation ; Vorticity
    Repository Name: Woods Hole Open Access Server
    Type: Article
    Format: application/pdf
  • 18
    Publication Date: 2022-05-25
    Description: Author Posting. © American Meteorological Society, 2009. This article is posted here by permission of American Meteorological Society for personal use, not for redistribution. The definitive version was published in Journal of Physical Oceanography 39 (2009): 3162-3175, doi:10.1175/2009JPO4239.1.
    Description: This study analyzes anisotropic properties of the material transport by eddies and eddy-driven zonal jets in a general circulation model of the North Atlantic through the analysis of Lagrangian particle trajectories. Spreading rates—defined here as half the rate of change in the particle dispersion—in the zonal direction systematically exceed the meridional rates by an order of magnitude. Area-averaged values for the upper-ocean zonal and meridional spreading rates are approximately 8100 and 1400 m^2 s^−1, respectively, and in the deep ocean they are 2400 and 200 m^2 s^−1. The results demonstrate that this anisotropy is mainly due to the action of the transient eddies and not to the shear dispersion associated with the time-mean jets. This property is consistent with the fact that eddies in this study have zonally elongated shapes. With the exception of the upper-ocean subpolar gyre, eddies also cause the superdiffusive zonal spreading, significant variations in the spreading rate in the vertical and meridional directions, and the difference between the westward and eastward spreading.
    Description: Funding for IK was provided by NSF Grants OCE 0346178, 0749722, and 0842834. Funding for PB was provided by NSF Grants OCE 0344094 and OCE 0725796 and by the research grant from the Newton Trust of the University of Cambridge. For JP the acknowledgement is to NSF OCE-0451086.
    Keywords: Eddies ; Transport ; Currents ; North Atlantic Ocean
    Repository Name: Woods Hole Open Access Server
    Type: Article
    Format: application/pdf
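    The spreading rate defined in the abstract of this record (half the rate of change in the particle dispersion) can be illustrated with a minimal Python sketch. It assumes, as one common convention that the record itself does not state, that dispersion is the variance of particle displacements about their ensemble mean. The function name spreading_rate and the synthetic tracks are hypothetical and are not taken from the article.

```python
def spreading_rate(tracks, dt):
    """Estimate the spreading rate (m^2 s^-1) along one coordinate.

    tracks[k][i] is the coordinate (m) of particle i at time step k,
    and dt is the time step (s). Assumption: dispersion is the variance
    of displacements about the ensemble-mean displacement.
    Returns one value per interior time step (centred differences).
    """
    n = len(tracks[0])
    dispersion = []
    for row in tracks:
        # Displacement of each particle from its release position.
        disp = [x - x0 for x, x0 in zip(row, tracks[0])]
        mean = sum(disp) / n
        dispersion.append(sum((d - mean) ** 2 for d in disp) / n)
    # Spreading rate = half the time derivative of the dispersion.
    return [0.5 * (dispersion[k + 1] - dispersion[k - 1]) / (2 * dt)
            for k in range(1, len(dispersion) - 1)]


# Synthetic example: two particles moving apart at 1 m/s, sampled every 2 s.
tracks = [[-t, t] for t in (0, 2, 4, 6, 8)]
rates = spreading_rate(tracks, dt=2.0)
print(rates)  # [2.0, 4.0, 6.0]
```

    In this synthetic case the rate grows linearly in time, a ballistic rather than diffusive regime; a constant rate would indicate diffusive spreading.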
  • 19
    Publication Date: 2022-05-25
    Description: Author Posting. © American Meteorological Society, 2009. This article is posted here by permission of American Meteorological Society for personal use, not for redistribution. The definitive version was published in Journal of Physical Oceanography 39 (2009): 1361-1379, doi:10.1175/2008JPO4096.1.
    Description: Multiple zonal jets are observed in satellite data–based estimates of oceanic velocities, float measurements, and high-resolution numerical simulations of the ocean circulation. This study makes a step toward understanding the dynamics of these jets in the real ocean by analyzing the vertical structure and dynamical balances within multiple zonal jets simulated in an eddy-resolving primitive equation model of the North Atlantic. In particular, the authors focus on the role of eddy flux convergences (“eddy forcing”) in supporting the buoyancy and relative/potential vorticity (PV) anomalies associated with the jets. The results suggest a central role of baroclinic eddies in the barotropic and baroclinic dynamics of the jets, and significant differences in the effects of eddy forcing between the subtropical and subpolar gyres. Additionally, diabatic potential vorticity sources and sinks, associated with vertical diffusion, are shown to play an important role in supporting the potential vorticity anomalies. The resulting potential vorticity profile does not resemble a “PV staircase”—a distinct meridional structure observed in some idealized studies of geostrophic turbulence.
    Description: Funding for IK was provided by NSF Grants OCE 0346178 and 0749722. Funding for PB was provided by NSF Grants OCE 0344094 and OCE 0725796 and by the research grant from the Newton Trust of the University of Cambridge. For JP the acknowledgement is to NSF OCE-0451086.
    Keywords: Eddies ; Forcing ; Dynamics ; Jets ; North Atlantic Ocean
    Repository Name: Woods Hole Open Access Server
    Type: Article
    Format: application/pdf
  • 20
    Publication Date: 2022-05-25
    Description: Author Posting. © American Meteorological Society, 2011. This article is posted here by permission of American Meteorological Society for personal use, not for redistribution. The definitive version was published in Journal of Physical Oceanography 41 (2011): 1182–1208, doi:10.1175/2010JPO4564.1.
    Description: The authors use data collected by a line of tall current meter moorings deployed across the axis of the Kuroshio Extension (KE) jet at the location of maximum time-mean eddy kinetic energy to characterize the mean jet structure, the eddy variability, and the nature of eddy–mean flow interactions observed during the Kuroshio Extension System Study (KESS). A picture of the 2-yr record mean jet structure is presented in both geographical and stream coordinates, revealing important contrasts in jet strength, width, vertical structure, and flanking recirculation structure. Eddy variability observed is discussed in the context of some of its various sources: jet meandering, rings, waves, and jet instability. Finally, various scenarios for eddy–mean flow interaction consistent with the observations are explored. It is shown that the observed cross-jet distributions of Reynolds stresses at the KESS location are consistent with wave radiation away from the jet, with the sense of the eddy feedback effect on the mean consistent with eddy driving of the observed recirculations. The authors consider these results in the context of a broader description of eddy–mean flow interactions in the larger KE region using KESS data in combination with in situ measurements from past programs in the region and satellite altimetry. This demonstrates important consistencies in the along-stream development of time-mean and eddy properties in the KE with features of an idealized model of a western boundary current (WBC) jet used to understand the nature and importance of eddy–mean flow interactions in WBC jet systems.
    Description: This work was supported by National Science Foundation funding for the KESS program under Grants OCE-0220161 (SW, NGH, and SRJ), OCE-0825550 (SW), OCE-0850744 (NGH), and OCE-0849808 (SRJ). SW was also supported by the MIT Presidential Fellowship. The financial assistance of the Houghton Fund, the MIT Student Assistance Fund, and WHOI Academic Programs is also gratefully acknowledged.
    Keywords: Eddies ; Boundary currents ; Jets
    Repository Name: Woods Hole Open Access Server
    Type: Article
    Format: application/pdf
  • 21
    Publication Date: 2022-05-25
    Description: Author Posting. © American Meteorological Society, 2013. This article is posted here by permission of American Meteorological Society for personal use, not for redistribution. The definitive version was published in Journal of Physical Oceanography 43 (2013): 1611–1626, doi:10.1175/JPO-D-12-0204.1.
    Description: A new method is proposed for extrapolating subsurface velocity and density fields from sea surface density and sea surface height (SSH). In this, the surface density is linked to the subsurface fields via the surface quasigeostrophic (SQG) formalism, as proposed in several recent papers. The subsurface field is augmented by the addition of the barotropic and first baroclinic modes, whose amplitudes are determined by matching to the sea surface height (pressure), after subtracting the SQG contribution. An additional constraint is that the bottom pressure anomaly vanishes. The method is tested for three regions in the North Atlantic using data from a high-resolution numerical simulation. The decomposition yields strikingly realistic subsurface fields. It is particularly successful in energetic regions like the Gulf Stream extension and at high latitudes where the mixed layer is deep, but it also works in less energetic eastern subtropics. The demonstration highlights the possibility of reconstructing three-dimensional oceanic flows using a combination of satellite fields, for example, sea surface temperature (SST) and SSH, and sparse (or climatological) estimates of the regional depth-resolved density. The method could be further elaborated to integrate additional subsurface information, such as mooring measurements.
    Description: JW and AM were supported by NASA (NNX12AD47G) and NSF (OCE 0928617). JLM was supported by the Office of Naval Research and the Office of Science (BER), U.S. Department of Energy under DE-GF0205ER64119. GRF is supported by OCE-0752346 and JHL by NORSEE (Nordic Seas Eddy Exchanges) funded by the Norwegian Research Council.
    Description: 2014-02-01
    Keywords: Eddies ; Ocean dynamics ; Potential vorticity ; Surface pressure ; Surface temperature ; Inverse methods
    Repository Name: Woods Hole Open Access Server
    Type: Article
    Format: application/pdf
  • 22
    Publication Date: 2022-05-26
    Description: Author Posting. © American Meteorological Society, 2014. This article is posted here by permission of American Meteorological Society for personal use, not for redistribution. The definitive version was published in Journal of Physical Oceanography 44 (2014): 2593–2616, doi:10.1175/JPO-D-13-0120.1.
    Description: The first direct estimate of the rate at which geostrophic turbulence mixes tracers across the Antarctic Circumpolar Current is presented. The estimate is computed from the spreading of a tracer released upstream of Drake Passage as part of the Diapycnal and Isopycnal Mixing Experiment in the Southern Ocean (DIMES). The meridional eddy diffusivity, a measure of the rate at which the area of the tracer spreads along an isopycnal across the Antarctic Circumpolar Current, is 710 ± 260 m^2 s^−1 at 1500-m depth. The estimate is based on an extrapolation of the tracer-based diffusivity using output from numerical tracers released in a one-twentieth of a degree model simulation of the circulation and turbulence in the Drake Passage region. The model is shown to reproduce the observed spreading rate of the DIMES tracer and suggests that the meridional eddy diffusivity is weak in the upper kilometer of the water column with values below 500 m^2 s^−1 and peaks at the steering level, near 2 km, where the eddy phase speed is equal to the mean flow speed. These vertical variations are not captured by ocean models presently used for climate studies, but they significantly affect the ventilation of different water masses.
    Description: NSF support through Awards OCE-1233832, OCE-1232962, and OCE-1048926 is gratefully acknowledged.
    Description: 2015-04-01
    Keywords: Geographic location/entity ; Southern Ocean ; Circulation/Dynamics ; Diffusion ; Eddies ; Ocean circulation ; Turbulence ; Physical Meteorology and Climatology ; Isopycnal mixing
    Repository Name: Woods Hole Open Access Server
    Type: Article
    Format: application/pdf
  • 23
    Publication Date: 2022-05-26
    Description: Author Posting. © American Meteorological Society, 2014. This article is posted here by permission of American Meteorological Society for personal use, not for redistribution. The definitive version was published in Journal of Climate 27 (2014): 2842–2860, doi:10.1175/JCLI-D-13-00227.1.
    Description: Mooring measurements from the Kuroshio Extension System Study (June 2004–June 2006) and from the ongoing Kuroshio Extension Observatory (June 2004–present) are combined with float measurements of the Argo network to study the variability of the North Pacific Subtropical Mode Water (STMW) across the entire gyre, on time scales from days, to seasons, to a decade. The top of the STMW follows a seasonal cycle, although observations reveal that it primarily varies in discrete steps associated with episodic wind events. The variations of the STMW bottom depth are tightly related to the sea surface height (SSH), reflecting mesoscale eddies and large-scale variations of the Kuroshio Extension and recirculation gyre systems. Using the observed relationship between SSH and STMW, gridded SSH products and in situ estimates from floats are used to construct weekly maps of STMW thickness, providing nonbiased estimates of STMW total volume, annual formation and erosion volumes, and seasonal and interannual variability for the past decade. Year-to-year variations are detected, particularly a significant decrease of STMW volume in 2007–10 primarily attributable to a smaller volume formed. Variability of the heat content in the mode water region is dominated by the seasonal cycle and mesoscale eddies; there is only a weak link to STMW on interannual time scales, and no long-term trends in heat content and STMW thickness between 2002 and 2011 are detected. Weak lagged correlations among air–sea fluxes, oceanic heat content, and STMW thickness are found when averaged over the northwestern Pacific recirculation gyre region.
    Description: This work was sponsored by the National Science Foundation (Grants OCE-0220161, OCE-0825152, and OCE-0827125).
    Description: 2014-10-15
    Keywords: Atmosphere-ocean interaction ; Mesoscale processes ; Mesoscale systems ; Ocean dynamics ; Eddies ; Water masses
    Repository Name: Woods Hole Open Access Server
    Type: Article
    Format: application/pdf
  • 24
    American Meteorological Society
    Publication Date: 2022-05-26
    Description: Author Posting. © American Meteorological Society, 2012. This article is posted here by permission of American Meteorological Society for personal use, not for redistribution. The definitive version was published in Journal of Physical Oceanography 42 (2012): 1684–1700, doi:10.1175/JPO-D-11-0230.1.
    Description: The influences of precipitation on water mass transformation and the strength of the meridional overturning circulation in marginal seas are studied using theoretical and idealized numerical models. Nondimensional equations are developed for the temperature and salinity anomalies of deep convective water masses, making explicit their dependence on both geometric parameters such as basin area, sill depth, and latitude, as well as on the strength of atmospheric forcing. In addition to the properties of the convective water, the theory also predicts the magnitude of precipitation required to shut down deep convection and switch the circulation into the haline mode. High-resolution numerical model calculations compare well with the theory for the properties of the convective water mass, the strength of the meridional overturning circulation, and also the shutdown of deep convection. However, the numerical model also shows that, for precipitation levels that exceed this critical threshold, the circulation retains downwelling and northward heat transport, even in the absence of deep convection.
    Description: This study was supported by the National Science Foundation under Grants OCE-0850416, OCE-0959381, and OCE-0859381.
    Description: 2013-04-01
    Keywords: Boundary currents ; Deep convection ; Eddies ; Meridional overturning circulation ; Ocean dynamics ; Stability
    Repository Name: Woods Hole Open Access Server
    Type: Article
    Format: application/pdf
  • 25
    Publication Date: 2013-09-26
    Description: Tandem repeats (TRs) are often present in proteins with crucial functions, responsible for resistance, pathogenicity and associated with infectious or neurodegenerative diseases. This motivates numerous studies of TRs and their evolution, requiring accurate multiple sequence alignment. TRs may be lost or inserted at any position of a TR region by replication slippage or recombination, but current methods assume fixed unit boundaries, and yet are of high complexity. We present a new global graph-based alignment method that does not restrict TR unit indels by unit boundaries. TR indels are modeled separately and penalized using the phylogeny-aware alignment algorithm. This ensures enhanced accuracy of reconstructed alignments, disentangling TRs and measuring indel events and rates in a biologically meaningful way. Our method detects not only duplication events but also all changes in TR regions owing to recombination, strand slippage and other events inserting or deleting TR units. We evaluate our method by simulation incorporating TR evolution, by either sampling TRs from a profile hidden Markov model or by mimicking strand slippage with duplications. The new method is illustrated on a family of type III effectors, a pathogenicity determinant in agriculturally important bacteria Ralstonia solanacearum. We show that TR indel rate variation contributes to the diversification of this protein family.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 26
    Publication Date: 2013-06-08
    Description: The introduction of next generation sequencing methods in genome studies has made it possible to shift research from a gene-centric approach to a genome wide view. Although methods and tools to detect single nucleotide polymorphisms are becoming more mature, methods to identify and visualize structural variation (SV) are still in their infancy. Most genome browsers can only compare a given sequence to a reference genome; therefore, direct comparison of multiple individuals still remains a challenge. Therefore, the implementation of efficient approaches to explore and visualize SVs and directly compare two or more individuals is desirable. In this article, we present a visualization approach that uses space-filling Hilbert curves to explore SVs based on both read-depth and pair-end information. An interactive open-source Java application, called Meander, implements the proposed methodology, and its functionality is demonstrated using two cases. With Meander, users can explore variations at different levels of resolution and simultaneously compare up to four different individuals against a common reference. The application was developed using Java version 1.6 and Processing.org and can be run on any platform. It can be found at http://homes.esat.kuleuven.be/~bioiuser/meander.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
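    The space-filling Hilbert curve mentioned in the abstract of this record maps a one-dimensional index (for example, a genomic bin number) to a cell of a square grid so that neighbouring bins stay adjacent on screen. The Python sketch below uses the standard textbook index-to-coordinate conversion; it is not Meander's own code (Meander is written in Java), and the function name hilbert_d2xy is hypothetical.

```python
def hilbert_d2xy(side, d):
    """Map index d along a Hilbert curve to (x, y) on a side x side grid.

    side must be a power of two and 0 <= d < side * side.
    Standard iterative conversion; illustrative only.
    """
    x = y = 0
    t = d
    s = 1
    while s < side:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate/flip the quadrant so sub-curves join end to end.
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


# The four cells of a 2 x 2 grid, visited in curve order.
cells = [hilbert_d2xy(2, d) for d in range(4)]
print(cells)  # [(0, 0), (0, 1), (1, 1), (1, 0)]
```

    Consecutive indices always land in edge-adjacent cells, which is why a local change in read depth shows up as a compact patch rather than a scattered set of pixels.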
  • 27
    Publication Date: 2014-11-07
    Description: A new functional gene database, FOAM (Functional Ontology Assignments for Metagenomes), was developed to screen environmental metagenomic sequence datasets. FOAM provides a new functional ontology dedicated to classify gene functions relevant to environmental microorganisms based on Hidden Markov Models (HMMs). Sets of aligned protein sequences (i.e. ‘profiles’) were tailored to a large group of target KEGG Orthologs (KOs) from which HMMs were trained. The alignments were checked and curated to make them specific to the targeted KO. Within this process, sequence profiles were enriched with the most abundant sequences available to maximize the yield of accurate classifier models. An associated functional ontology was built to describe the functional groups and hierarchy. FOAM allows the user to select the target search space before HMM-based comparison steps and to easily organize the results into different functional categories and subcategories. FOAM is publicly available at http://portal.nersc.gov/project/m1317/FOAM/.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 28
    Publication Date: 2014-11-28
    Description: It is now known that unwanted noise and unmodeled artifacts such as batch effects can dramatically reduce the accuracy of statistical inference in genomic experiments. These sources of noise must be modeled and removed to accurately measure biological variability and to obtain correct statistical inference when performing high-throughput genomic analysis. We introduced surrogate variable analysis (sva) for estimating these artifacts by (i) identifying the part of the genomic data only affected by artifacts and (ii) estimating the artifacts with principal components or singular vectors of the subset of the data matrix. The resulting estimates of artifacts can be used in subsequent analyses as adjustment factors to correct analyses. Here I describe a version of the sva approach specifically created for count data or FPKMs from sequencing experiments based on appropriate data transformation. I also describe the addition of supervised sva (ssva) for using control probes to identify the part of the genomic data only affected by artifacts. I present a comparison between these versions of sva and other methods for batch effect estimation on simulated data, real count-based data and FPKM-based data. These updates are available through the sva Bioconductor package and I have made fully reproducible analysis using these methods available from: https://github.com/jtleek/svaseq.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 29
    Publication Date: 2014-11-28
    Description: High-throughput techniques have considerably increased the potential of comparative genomics whilst simultaneously posing many new challenges. One of those challenges involves efficiently mining the large amount of data produced and exploring the landscape of both conserved and idiosyncratic genomic regions across multiple genomes. Domains of application of these analyses are diverse: identification of evolutionary events, inference of gene functions, detection of niche-specific genes or phylogenetic profiling. Insyght is a comparative genomic visualization tool that combines three complementary displays: (i) a table for thoroughly browsing amongst homologues, (ii) a comparator of orthologue functional annotations and (iii) a genomic organization view designed to improve the legibility of rearrangements and distinctive loci. The latter display combines symbolic and proportional graphical paradigms. Synchronized navigation across multiple species and interoperability between the views are core features of Insyght. A gene filter mechanism is provided that helps the user to build a biologically relevant gene set according to multiple criteria such as presence/absence of homologues and/or various annotations. We illustrate the use of Insyght with scenarios. Currently, only Bacteria and Archaea are supported. A public instance is available at http://genome.jouy.inra.fr/Insyght . The tool is freely downloadable for private data set analysis.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 30
    Publication Date: 2014-11-28
    Description: The σ54 promoters are unique in prokaryotic genomes and responsible for transcribing carbon- and nitrogen-related genes. With the avalanche of genome sequences generated in the postgenomic age, it is highly desirable to develop automated methods for rapidly and effectively identifying the σ54 promoters. Here, a predictor called ‘iPro54-PseKNC’ was developed. In the predictor, the samples of DNA sequences were formulated by a novel feature vector called ‘pseudo k-tuple nucleotide composition’, which was further optimized by the incremental feature selection procedure. The performance of iPro54-PseKNC was examined by rigorous jackknife cross-validation tests on a stringent benchmark data set. As a user-friendly web-server, iPro54-PseKNC is freely accessible at http://lin.uestc.edu.cn/server/iPro54-PseKNC . For the convenience of the vast majority of experimental scientists, a step-by-step protocol guide was provided on how to use the web-server to get the desired results without the need to follow the complicated mathematics that were presented in this paper just for its integrity. Meanwhile, we also discovered through an in-depth statistical analysis that the distribution of distances between the transcription start sites and the translation initiation sites was governed by the gamma distribution, which may provide a fundamental physical principle for studying the σ54 promoters.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 31
    Publication Date: 2014-11-28
    Description: We present a discriminative learning method for pattern discovery of binding sites in nucleic acid sequences based on hidden Markov models. Sets of positive and negative example sequences are mined for sequence motifs whose occurrence frequency varies between the sets. The method offers several objective functions, but we concentrate on mutual information of condition and motif occurrence. We perform a systematic comparison of our method and numerous published motif-finding tools. Our method achieves the highest motif discovery performance, while being faster than most published methods. We present case studies of data from various technologies, including ChIP-Seq, RIP-Chip and PAR-CLIP, of embryonic stem cell transcription factors and of RNA-binding proteins, demonstrating practicality and utility of the method. For the alternative splicing factor RBM10, our analysis finds motifs known to be splicing-relevant. The motif discovery method is implemented in the free software package Discrover. It is applicable to genome- and transcriptome-scale data, makes use of available repeat experiments and, aside from binary contrasts, can also utilize more complex data configurations.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 32
    Publication Date: 2013-02-20
    Description: While it has been long recognized that genes are not randomly positioned along the genome, the degree to which its 3D structure influences the arrangement of genes has remained elusive. In particular, several lines of evidence suggest that actively transcribed genes are spatially co-localized, forming transcription factories; however, a generalized systematic test has hitherto not been described. Here we reveal transcription factories using a rigorous definition of genomic structure based on Saccharomyces cerevisiae chromosome conformation capture data, coupled with an experimental design controlling for the primary gene order. We develop a data-driven method for the interpolation and the embedding of such datasets and introduce statistics that enable the comparison of the spatial and genomic densities of genes. Combining these, we report evidence that co-regulated genes are clustered in space, beyond their observed clustering in the context of gene order along the genome and show this phenomenon is significant for 64 out of 117 transcription factors. Furthermore, we show that those transcription factors with high spatially co-localized targets are expressed higher than those whose targets are not spatially clustered. Collectively, our results support the notion that, at a given time, the physical density of genes is intimately related to regulatory activity.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 33
    Publication Date: 2012-12-14
    Description: Pan-genome ortholog clustering tool (PanOCT) is a tool for pan-genomic analysis of closely related prokaryotic species or strains. PanOCT uses conserved gene neighborhood information to separate recently diverged paralogs into orthologous clusters where homology-only clustering methods cannot. The results from PanOCT and three commonly used graph-based ortholog-finding programs were compared using a set of four publicly available strains of the same bacterial species. All four methods agreed on ~70% of the clusters and ~86% of the proteins. The clusters that did not agree were inspected for evidence of correctness, resulting in 85 high-confidence manually curated clusters that were used to compare all four methods.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 34
    Publication Date: 2012-10-10
    Description: A novel ab initio parameter-tuning-free system to identify transcriptional factor (TF) binding motifs (TFBMs) in genome DNA sequences was developed. It is based on the comparison of two types of frequency distributions with respect to the TFBM candidates in the target DNA sequences and the non-candidates in the background sequence, with the latter generated by utilizing the intergenic sequences. For benchmark tests, we used DNA sequence datasets extracted by ChIP-on-chip and ChIP-seq techniques and identified 65 yeast and four mammalian TFBMs, with the latter including gaps. The accuracy of our system was compared with those of other available programs (i.e. MEME, Weeder, BioProspector, MDscan and DME) and was the best among them, even without tuning of the parameter set for each TFBM and pre-treatment/editing of the target DNA sequences. Moreover, with respect to some TFs for which the identified motifs are inconsistent with those in the references, our results were revealed to be correct, by comparing them with other existing experimental data. Thus, our identification system does not need any other biological information except for gene positions, and is also expected to be applicable to genome DNA sequences to identify unknown TFBMs as well as known ones.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 35
    Publication Date: 2012-10-10
    Description: MicroRNAs (miRNAs) are major regulators of gene expression in multicellular organisms. They recognize their targets by sequence complementarity and guide them to cleavage or translational arrest. It is generally accepted that plant miRNAs have extensive complementarity to their targets and their prediction usually relies on the use of empirical parameters deduced from known miRNA–target interactions. Here, we developed a strategy to identify miRNA targets which is mainly based on the conservation of the potential regulation in different species. We applied the approach to expressed sequence tags datasets from angiosperms. Using this strategy, we predicted many new interactions and experimentally validated previously unknown miRNA targets in Arabidopsis thaliana. Newly identified targets that are broadly conserved include auxin regulators, transcription factors and transporters. Some of them might participate in the same pathways as the targets known before, suggesting that some miRNAs might control different aspects of a biological process. Furthermore, this approach can be used to identify targets present in a specific group of species, and, as a proof of principle, we analyzed Solanaceae-specific targets. The presented strategy can be used alone or in combination with other approaches to find miRNA targets in plants.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 36
    Publication Date: 2012-04-15
    Description: We address the challenge of regulatory sequence alignment with a new method, Pro-Coffee, a multiple aligner specifically designed for homologous promoter regions. Pro-Coffee uses a dinucleotide substitution matrix estimated on alignments of functional binding sites from TRANSFAC. We designed a validation framework using several thousand families of orthologous promoters. This dataset was used to evaluate the accuracy for predicting true human orthologs among their paralogs. We found that whereas other methods achieve on average 73.5% accuracy, and 77.6% when trained on that same dataset, the figure goes up to 80.4% for Pro-Coffee. We then applied a novel validation procedure based on multi-species ChIP-seq data. Trained and untrained methods were tested for their capacity to correctly align experimentally detected binding sites. Whereas the average number of correctly aligned sites for two transcription factors is 284 for default methods and 316 for trained methods, Pro-Coffee achieves 331, 16.5% above the default average. We find a high correlation between a method's performance when classifying orthologs and its ability to correctly align proven binding sites. Not only has this interesting biological consequences, it also allows us to conclude that any method that is trained on the ortholog data set will result in functionally more informative alignments.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 37
    Publication Date: 2012-04-15
    Description: MCScan is an algorithm able to scan multiple genomes or subgenomes in order to identify putative homologous chromosomal regions, and align these regions using genes as anchors. The MCScanX toolkit implements an adjusted MCScan algorithm for detection of synteny and collinearity that extends the original software by incorporating 14 utility programs for visualization of results and additional downstream analyses. Applications of MCScanX to several sequenced plant genomes and gene families are shown as examples. MCScanX can be used to effectively analyze chromosome structural changes, and reveal the history of gene family expansions that might contribute to the adaptation of lineages and taxa. An integrated view of various modes of gene duplication can supplement the traditional gene tree analysis in specific families. The source code and documentation of MCScanX are freely available at http://chibba.pgml.uga.edu/mcscan2/ .
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 38
    Publication Date: 2012-07-22
    Description: Cytosines in genomic DNA are sometimes methylated. This affects many biological processes and diseases. The standard way of measuring methylation is to use bisulfite, which converts unmethylated cytosines to thymines, then sequence the DNA and compare it to a reference genome sequence. We describe a method for the critical step of aligning the DNA reads to the correct genomic locations. Our method builds on classic alignment techniques, including likelihood-ratio scores and spaced seeds. In a realistic benchmark, our method has a better combination of sensitivity, specificity and speed than nine other high-throughput bisulfite aligners. This study enables more accurate and rational analysis of DNA methylation. It also illustrates how to adapt general-purpose alignment methods to a special case with distorted base patterns: this should be informative for other special cases such as ancient DNA and AT-rich genomes.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
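    The bisulfite read-out described in this record can be illustrated with a toy sketch. It is not the aligner from the paper: it assumes the read is already placed on the forward strand at a known offset, and the function names `bisulfite_convert` and `call_methylation` are hypothetical. It only shows why an unconverted C in a read indicates methylation.

```python
def bisulfite_convert(seq, methylated):
    """Simulate bisulfite sequencing of the forward strand.

    Unmethylated C is read as T; C at a position listed in `methylated`
    (0-based indices, an illustrative input) stays C.
    """
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def call_methylation(reference, read, offset):
    """For each reference C covered by the read, report True if the read kept C."""
    calls = {}
    for i, base in enumerate(read):
        ref_pos = offset + i
        if reference[ref_pos] == "C" and base in "CT":
            calls[ref_pos] = base == "C"
    return calls


reference = "ACGTCCGA"
read = bisulfite_convert(reference, methylated={1})
print(read)
print(call_methylation(reference, read, offset=0))
```

    The hard part addressed by the paper, finding the correct genomic location for a read whose C's may have turned into T's, is deliberately left out of this sketch.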
  • 39
    Publication Date: 2012-09-13
    Description: Prophages are phages in lysogeny that are integrated into, and replicated as part of, the host bacterial genome. These mobile elements can have tremendous impact on their bacterial hosts’ genomes and phenotypes, which may lead to strain emergence and diversification, increased virulence or antibiotic resistance. However, finding prophages in microbial genomes remains a problem with no definitive solution. The majority of existing tools rely on detecting genomic regions enriched in protein-coding genes with known phage homologs, which hinders the de novo discovery of phage regions. In this study, a weighted phage detection algorithm, PhiSpy, was developed based on seven distinctive characteristics of prophages, i.e. protein length, transcription strand directionality, customized AT and GC skew, the abundance of unique phage words, phage insertion points and the similarity of phage proteins. The first five characteristics are capable of identifying prophages without any sequence similarity with known phage genes. PhiSpy locates prophages by ranking genomic regions enriched in distinctive phage traits, which leads to the successful prediction of 94% of prophages in 50 complete bacterial genomes with a 6% false-negative rate and a 0.66% false-positive rate.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
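    Two of the characteristics listed in this record, AT and GC skew, can be sketched as follows. This uses the textbook definitions (G - C)/(G + C) and (A - T)/(A + T) over fixed windows; the "customized" skew used by PhiSpy is not specified in the abstract, so the function names and windowing here are assumptions made for illustration.

```python
def skew(window, first, second):
    """Return (n_first - n_second) / (n_first + n_second), or 0.0 if both are absent."""
    n_first, n_second = window.count(first), window.count(second)
    total = n_first + n_second
    return (n_first - n_second) / total if total else 0.0


def sliding_skews(seq, size, step):
    """Yield (start, GC skew, AT skew) for each full window of `size` bases."""
    results = []
    for start in range(0, len(seq) - size + 1, step):
        window = seq[start:start + size]
        results.append((start, skew(window, "G", "C"), skew(window, "A", "T")))
    return results


toy_genome = "GGGCAAAT" + "CCCGTTTA"
for row in sliding_skews(toy_genome, size=8, step=8):
    print(row)
```

    A sign change in skew between neighbouring windows is the kind of signal such features expose; how strongly it is weighted against the other characteristics is specific to the published method.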
  • 40
    Publication Date: 2012-06-06
    Description: Messenger RNA sequences possess specific nucleotide patterns distinguishing them from non-coding genomic sequences. In this study, we explore the utilization of modified Markov models to analyze sequences up to 44 bp, far beyond the 8-bp limit of conventional Markov models, for exon/intron discrimination. In order to analyze nucleotide sequences of this length, their information content is first reduced by conversion into shorter binary patterns via the application of numerous abstraction schemes. After the conversion of genomic sequences to binary strings, homogeneous Markov models trained on the binary sequences are used to discriminate between exons and introns. We term this approach the Binary Abstraction Markov Model (BAMM). High-quality abstraction schemes for exon/intron discrimination are selected using optimization algorithms on supercomputers. The best MM classifiers are then combined using support vector machines into a single classifier. With this approach, over 95% classification accuracy is achieved without taking reading frame into account. With further development, the BAMM approach can be applied to sequences lacking the genetic code such as ncRNAs and 5'-untranslated regions.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
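    The idea of abstracting nucleotides to binary symbols and then scoring with a homogeneous Markov model can be sketched in a few lines. Everything concrete here is an assumption for illustration: the purine/pyrimidine scheme, the first-order model, the pseudocount, and the toy training strings are not taken from the paper, which selects schemes by optimization and combines many classifiers with support vector machines.

```python
import math
from collections import Counter

# Hypothetical abstraction scheme: purines -> "1", pyrimidines -> "0".
SCHEME = {"A": "1", "G": "1", "C": "0", "T": "0"}


def abstract(seq):
    """Convert a nucleotide string into its binary abstraction."""
    return "".join(SCHEME[base] for base in seq)


def train(binary_seqs, pseudocount=1.0):
    """Estimate log transition probabilities of a first-order Markov model."""
    counts = Counter()
    for s in binary_seqs:
        counts.update(zip(s, s[1:]))
    model = {}
    for prev in "01":
        total = sum(counts[(prev, nxt)] for nxt in "01") + 2 * pseudocount
        for nxt in "01":
            model[(prev, nxt)] = math.log((counts[(prev, nxt)] + pseudocount) / total)
    return model


def log_likelihood(model, binary_seq):
    """Sum the log transition probabilities along the binary string."""
    return sum(model[pair] for pair in zip(binary_seq, binary_seq[1:]))


def classify(seq, exon_model, intron_model):
    """Label a sequence by comparing its likelihood under the two models."""
    b = abstract(seq)
    if log_likelihood(exon_model, b) >= log_likelihood(intron_model, b):
        return "exon"
    return "intron"


# Toy training data, chosen only so the two classes differ after abstraction.
exon_model = train([abstract(s) for s in ["ACACACAC", "GTGTGTGT"]])
intron_model = train([abstract(s) for s in ["AAGGCCTT", "GGAATTCC"]])
print(classify("ATATAT", exon_model, intron_model))
print(classify("AAAACC", exon_model, intron_model))
```

    Because each base collapses to one bit, a model of a given order spans far more bases for the same number of parameters, which is the motivation stated in the abstract for reaching 44-bp contexts.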
  • 41
    Publication Date: 2012-05-13
    Description: Insertional mutagenesis screens in mice are used to identify individual genes that drive tumor formation. In these screens, candidate cancer genes are identified if their genomic location is proximal to a common insertion site (CIS) defined by high rates of transposon or retroviral insertions in a given genomic window. In this article, we describe a new method for defining CISs based on a Poisson distribution, the Poisson Regression Insertion Model, and show that this new method is an improvement over previously described methods. We also describe a modification of the method that can identify pairs and higher orders of co-occurring common insertion sites. We apply these methods to two data sets, one generated in a transposon-based screen for gastrointestinal tract cancer genes and another based on the set of retroviral insertions in the Retroviral Tagged Cancer Gene Database. We show that the new methods identify more relevant candidate genes and candidate gene pairs than found using previous methods. Identification of the biologically relevant set of mutations that occur in a single cell and cause tumor progression will aid in the rational design of single and combinatorial therapies in the upcoming age of personalized cancer therapy.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
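    As a rough illustration of why a Poisson distribution is a natural basis for defining common insertion sites, the sketch below tests one genomic window against a uniform background rate. This is a simplification and not the Poisson Regression Insertion Model itself, which the abstract does not spell out; the function names and the uniform-rate assumption are hypothetical.

```python
import math


def poisson_sf(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam)."""
    cdf_below_k = sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k))
    return 1.0 - cdf_below_k


def window_p_value(observed, total_insertions, window_bp, genome_bp):
    """P-value for seeing at least `observed` insertions in one window,

    assuming insertions fall uniformly at random along the genome.
    """
    expected = total_insertions * window_bp / genome_bp
    return poisson_sf(observed, expected)


# 1000 insertions over a 10 Mb toy genome give 1 expected insertion per 10 kb window.
p = window_p_value(observed=5, total_insertions=1000, window_bp=10_000, genome_bp=10_000_000)
print(round(p, 5))
```

    A real screen tests many windows, so a multiple-testing correction would be needed before calling any window a common insertion site.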
  • 42
    Publication Date: 2014-05-01
    Description: Molecular stratification of tumors is essential for developing personalized therapies. Although patient stratification strategies have been successful, computational methods to accurately translate a gene signature from a high-throughput platform to a clinically adaptable low-dimensional platform are currently lacking. Here, we describe PIGExClass (platform-independent isoform-level gene-expression based classification-system), a novel computational approach to derive and then transfer gene-signatures from one analytical platform to another. We applied PIGExClass to design a reverse transcriptase-quantitative polymerase chain reaction (RT-qPCR) based molecular-subtyping assay for glioblastoma multiforme (GBM), the most aggressive primary brain tumor. Unsupervised clustering of TCGA (The Cancer Genome Atlas Consortium) GBM samples, based on isoform-level gene-expression profiles, recaptured the four known molecular subgroups but switched the subtype for 19% of the samples, resulting in significant (P = 0.0103) survival differences among the refined subgroups. The PIGExClass-derived four-class classifier, which requires only 121 transcript-variants, assigns GBM patients’ molecular subtype with 92% accuracy. This classifier was translated to an RT-qPCR assay and validated in an independent cohort of 206 GBM samples. Our results demonstrate the efficacy of PIGExClass in the design of clinically adaptable molecular subtyping assays and have implications for developing robust diagnostic assays for cancer patient stratification.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 43
    Publication Date: 2014-05-01
    Description: The ability to correlate chromosome conformation and gene expression gives a great deal of information regarding the strategies used by a cell to properly regulate gene activity. 4C-Seq is a relatively new and increasingly popular technology where the set of genomic interactions generated by a single point in the genome can be determined. 4C-Seq experiments generate large, complicated data sets and it is imperative that signal is properly distinguished from noise. Currently, there are a limited number of methods for analyzing 4C-Seq data. Here, we present a new method, fourSig, which in addition to being precise and simple to use also includes a new feature that prioritizes detected interactions. Our results demonstrate the efficacy of fourSig with previously published and novel 4C-Seq data sets and show that our significance prioritization correlates with the ability to reproducibly detect interactions among replicates.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 44
    Publication Date: 2014-04-03
    Description: Alternative transcript processing is an important mechanism for generating functional diversity in genes. However, little is known about the precise functions of individual isoforms. In fact, proteins (translated from transcript isoforms), not genes, are the function carriers. By integrating multiple human RNA-seq data sets, we carried out the first systematic prediction of isoform functions, enabling high-resolution functional annotation of human transcriptome. Unlike gene function prediction, isoform function prediction faces a unique challenge: the lack of the training data—all known functional annotations are at the gene level. To address this challenge, we modelled the gene–isoform relationships as multiple instance data and developed a novel label propagation method to predict functions. Our method achieved an average area under the receiver operating characteristic curve of 0.67 and assigned functions to 15 572 isoforms. Interestingly, we observed that different functions have different sensitivities to alternative isoform processing, and that the function diversity of isoforms from the same gene is positively correlated with their tissue expression diversity. Finally, we surveyed the literature to validate our predictions for a number of apoptotic genes. Strikingly, for the famous ‘TP53’ gene, we not only accurately identified the apoptosis regulation function of its five isoforms, but also correctly predicted the precise direction of the regulation.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 45
    Publication Date: 2012-03-29
    Description: Broadly, computational ortholog assignment is a three-step process: (i) identify all putative homologs between the genomes, (ii) identify gene anchors and (iii) link anchors to identify best gene matches given their order and context. In this article, we engineer two methods to improve two important aspects of this pipeline [specifically steps (ii) and (iii)]. First, computing sequence similarity data [step (i)] is a computationally intensive task for large sequence sets, creating a bottleneck in the ortholog assignment pipeline. We have designed a fast and highly scalable sort-join method (afree) based on k-mer counts to rapidly compare all pairs of sequences in a large protein sequence set to identify putative homologs. Second, the availability of complex genomes containing large gene families with a prevalence of complex evolutionary events, such as duplications, has made the task of assigning orthologs and co-orthologs difficult. Here, we have developed an iterative graph matching strategy where at each iteration the best gene assignments are identified, resulting in a set of orthologs and co-orthologs. We find that the afree algorithm is faster than existing methods and maintains high accuracy in identifying similar genes. The iterative graph matching strategy also showed high accuracy in identifying complex gene relationships. Standalone afree is available from http://vbc.med.monash.edu.au/~kmahmood/afree . EGM2, the complete ortholog assignment pipeline (including afree and the iterative graph matching method), is available from http://vbc.med.monash.edu.au/~kmahmood/EGM2 .
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
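    The general shape of a sort-join over k-mers, as mentioned in this record, can be sketched as below. This is not the afree implementation: the scoring (a plain count of shared distinct k-mers), the function name `shared_kmer_counts`, and the toy sequences are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations, groupby


def shared_kmer_counts(seqs, k):
    """Count distinct k-mers shared by each pair of sequences.

    Sort step: build (k-mer, sequence name) records and sort them, so that
    all sequences containing the same k-mer become adjacent.
    Join step: walk each run of equal k-mers and credit every pair in it.
    """
    records = sorted(
        {
            (seq[i:i + k], name)
            for name, seq in seqs.items()
            for i in range(len(seq) - k + 1)
        }
    )
    pair_counts = Counter()
    for _, group in groupby(records, key=lambda record: record[0]):
        names = [name for _, name in group]
        pair_counts.update(combinations(names, 2))
    return pair_counts


proteins = {"p1": "MKVLAAG", "p2": "MKVLSAG", "p3": "QQQQQQQ"}
print(shared_kmer_counts(proteins, k=3))
```

    Pairs that never share a k-mer are simply never generated, which is what lets this kind of join avoid an explicit all-against-all comparison when most sequences are unrelated.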
  • 46
    Publication Date: 2012-03-29
    Description: With the availability of next-generation sequencing (NGS) technology, it is expected that sequence variants may be called on a genomic scale. Here, we demonstrate that a deeper understanding of the distribution of the variant call frequencies at heterozygous loci in NGS data sets is a prerequisite for sensitive variant detection. We model the crucial steps in an NGS protocol as a stochastic branching process and derive a mathematical framework for the expected distribution of alleles at heterozygous loci before measurement, that is, before sequencing. We confirm our theoretical results by analyzing technical replicates of human exome data and demonstrate that the variance of allele frequencies at heterozygous loci is higher than expected from a simple binomial distribution. Due to this high variance, mutation callers relying on binomially distributed priors are less sensitive for heterozygous variants that deviate strongly from the expected mean frequency. Our results also indicate that error rates can be reduced to a greater degree by technical replicates than by increasing sequencing depth.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 47
    Publication Date: 2012-03-14
    Description: An approach to infer the unknown microbial population structure within a metagenome is to cluster nucleotide sequences based on common patterns in base composition, otherwise referred to as binning. When functional roles are assigned to the identified populations, a deeper understanding of microbial communities can be attained, more so than gene-centric approaches that explore overall functionality. In this study, we propose an unsupervised, model-based binning method with two clustering tiers, which uses a novel transformation of the oligonucleotide frequency-derived error gradient and GC content to generate coarse groups at the first tier of clustering; and tetranucleotide frequency to refine these groups at the secondary clustering tier. The proposed method has a demonstrated improvement over PhyloPythia, S-GSOM, TACOA and TaxSOM on all three benchmarks that were used for evaluation in this study. The proposed method is then applied to a pyrosequenced metagenomic library of mud volcano sediment sampled in southwestern Taiwan, with the inferred population structure validated against complementary sequencing of 16S ribosomal RNA marker genes. Finally, the proposed method was further validated against four publicly available metagenomes, including a highly complex Antarctic whale-fall bone sample, which was previously assumed to be too complex for binning prior to functional analysis.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
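    Two of the composition features named in this record, GC content and tetranucleotide frequency, can be computed as in the sketch below. It covers only feature extraction; the error-gradient transformation and the two-tier clustering of the proposed method are not reproduced, and the function names are hypothetical.

```python
from itertools import product

# All 256 tetranucleotides in a fixed order, so vectors are comparable across sequences.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def gc_content(seq):
    """Fraction of bases that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def tetranucleotide_frequencies(seq):
    """Relative frequency of each tetranucleotide, skipping windows with other symbols."""
    counts = dict.fromkeys(TETRAMERS, 0)
    total = 0
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if kmer in counts:
            counts[kmer] += 1
            total += 1
    return [counts[t] / total if total else 0.0 for t in TETRAMERS]


contig = "ACGTACGT"
vector = tetranucleotide_frequencies(contig)
print(gc_content(contig), vector[TETRAMERS.index("ACGT")])
```

    Vectors of this kind are what a composition-based binning method would then cluster; any standard clustering routine could be applied to them as a first experiment.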
  • 48
    Publication Date: 2012-02-17
    Description: We introduce the software tool NTRFinder to search for a complex repetitive structure in DNA we call a nested tandem repeat (NTR). An NTR is a recurrence of two or more distinct tandem motifs interspersed with each other. We propose that NTRs can be used as phylogenetic and population markers. We have tested our algorithm on both real and simulated data, and present some real NTRs of interest. NTRFinder can be downloaded from http://www.maths.otago.ac.nz/~aamatroud/ .
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 49
    Publication Date: 2014-10-10
    Description: Parallel analysis of RNA ends (PARE) is a technique utilizing high-throughput sequencing to profile uncapped, mRNA cleavage or decay products on a genome-wide basis. Tools currently available to validate miRNA targets using PARE data employ only annotated genes, whereas important targets may be found in unannotated genomic regions. To handle such cases and to scale to the growing availability of PARE data and genomes, we developed a new tool, ‘sPARTA’ (small RNA-PARE target analyzer), that utilizes a built-in, plant-focused target prediction module (aka ‘miRferno’). sPARTA not only exhibits an unprecedented gain in speed but also shows greater predictive power by validating more targets, compared to a popular alternative. In addition, the novel ‘seed-free’ mode, optimized to find targets irrespective of complementarity in the seed-region, identifies novel intergenic targets. To fully capitalize on the novelty and strengths of sPARTA, we developed a web resource, ‘comPARE’, for plant miRNA target analysis; this facilitates the systematic identification and analysis of miRNA-target interactions across multiple species, integrated with visualization tools. This collation of high-throughput small RNA and PARE datasets from different genomes further facilitates re-evaluation of existing miRNA annotations, resulting in a ‘cleaner’ set of microRNAs.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 50
    Publication Date: 2014-10-10
    Description: Identification of three-dimensional (3D) interactions between regulatory elements across the genome is crucial to unravel the complex regulatory machinery that orchestrates proliferation and differentiation of cells. ChIA-PET is a novel method to identify such interactions, where physical contacts between regions bound by a specific protein are quantified using next-generation sequencing. However, determining the significance of the observed interaction frequencies in such datasets is challenging, and few methods have been proposed. Despite the fact that regions that are close in linear genomic distance have a much higher tendency to interact by chance, no methods to date are capable of taking such dependency into account. Here, we propose a statistical model taking into account the genomic distance relationship, as well as the general propensity of anchors to be involved in contacts overall. Using both real and simulated data, we show that the previously proposed statistical test, based on Fisher's exact test, leads to invalid results when data are dependent on genomic distance. We also evaluate our method on previously validated cell-line specific and constitutive 3D interactions, and show that relevant interactions are significant, while avoiding over-estimating the significance of short nearby interactions.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 51
    Publication Date: 2014-10-10
    Description: Viral sequence classification has wide applications in clinical, epidemiological, structural and functional categorization studies. Most existing approaches rely on an initial alignment step followed by classification based on phylogenetic or statistical algorithms. Here we present an ultrafast alignment-free subtyping tool for human immunodeficiency virus type one (HIV-1) adapted from Prediction by Partial Matching compression. This tool, named COMET, was compared to the widely used phylogeny-based REGA and SCUEAL tools using synthetic and clinical HIV data sets (1 090 698 and 10 625 sequences, respectively). COMET's sensitivity and specificity were comparable to or higher than the two other subtyping tools on both data sets for known subtypes. COMET also excelled in detecting and identifying new recombinant forms, a frequent feature of the HIV epidemic. Runtime comparisons showed that COMET was almost as fast as USEARCH. This study demonstrates the advantages of alignment-free classification of viral sequences, which feature high rates of variation, recombination and insertions/deletions. COMET is free to use via an online interface.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 52
    Publication Date: 2014-11-28
    Description: Understanding how regulatory networks globally coordinate the response of a cell to changing conditions, such as perturbations by shifting environments, is an elementary challenge in systems biology which has yet to be met. Genome-wide gene expression measurements are high dimensional because they reflect the condition-specific interplay of thousands of cellular components. The integration of prior biological knowledge into the modeling process of systems-wide gene regulation enables the large-scale interpretation of gene expression signals in the context of known regulatory relations. We developed COGERE ( http://mips.helmholtz-muenchen.de/cogere ), a method for the inference of condition-specific gene regulatory networks in human and mouse. We integrated existing knowledge of regulatory interactions from multiple sources into a comprehensive model of prior information. COGERE infers condition-specific regulation by evaluating the mutual dependency between regulator (transcription factor or miRNA) and target gene expression using prior information. This dependency is scored by the non-parametric, nonlinear correlation coefficient η² (eta squared) that is derived by a two-way analysis of variance. We show that COGERE significantly outperforms alternative methods in predicting condition-specific gene regulatory networks on simulated data sets. Furthermore, by inferring the cancer-specific gene regulatory network from the NCI-60 expression study, we demonstrate the utility of COGERE to promote hypothesis-driven clinical research.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
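As an illustration of the eta-squared score named in the COGERE abstract above, the following is a minimal sketch of the one-way form, η² = SS_between / SS_total. The abstract states that COGERE derives η² from a two-way analysis of variance, so this one-way version is a deliberate simplification. The function name `eta_squared` and the grouping of target-gene expression values by regulator level are assumptions made for the example, not part of COGERE.

```python
def eta_squared(groups):
    """Return SS_between / SS_total for a list of non-empty groups of numbers.

    Hypothetical helper: each group holds target-gene expression values
    observed at one level (e.g. low / high) of a regulator.
    """
    values = [v for g in groups for v in g]
    grand_mean = sum(values) / len(values)
    # Total variation of all observations around the grand mean.
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    if ss_total == 0:
        return 0.0  # no variation at all, so nothing to explain
    # Variation of the group means around the grand mean, weighted by group size.
    ss_between = sum(
        len(g) * ((sum(g) / len(g)) - grand_mean) ** 2 for g in groups
    )
    return ss_between / ss_total


if __name__ == "__main__":
    low_regulator = [1.0, 2.0, 3.0]
    high_regulator = [7.0, 8.0, 9.0]
    print(eta_squared([low_regulator, high_regulator]))
```

A value near 1 means the grouping explains most of the variance in the target's expression; a value near 0 means it explains almost none.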
  • 53
    Publication Date: 2014-12-17
    Description: Non-coding RNAs (ncRNAs) are known to play important functional roles in the cell. However, their identification and recognition in genomic sequences remains challenging. In silico methods, such as classification tools, offer a fast and reliable way for such screening and multiple classifiers have already been developed to predict well-defined subfamilies of RNA. So far, however, out of all the ncRNAs, only tRNA, miRNA and snoRNA can be predicted with a satisfying sensitivity and specificity. We here present ptRNApred, a tool to detect and classify subclasses of non-coding RNA that are involved in the regulation of post-transcriptional modifications or DNA replication, which we here call post-transcriptional RNA (ptRNA). It (i) detects RNA sequences coding for post-transcriptional RNA from the genomic sequence with an overall sensitivity of 91% and a specificity of 94% and (ii) predicts ptRNA-subclasses that exist in eukaryotes: snRNA, snoRNA, RNase P, RNase MRP, Y RNA or telomerase RNA. AVAILABILITY: The ptRNApred software is open for public use on http://www.ptrnapred.org/ .
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 54
    Publication Date: 2014-12-17
    Description: Rapid development of next generation sequencing technology has enabled the identification of genomic alterations from short sequencing reads. There are a number of software pipelines available for calling single nucleotide variants from genomic DNA, but no comprehensive pipelines to identify, annotate and prioritize expressed SNVs (eSNVs) from non-directional paired-end RNA-Seq data. We have developed eSNV-Detect, a novel computational system, which utilizes data from multiple aligners to call, even at low read depths, and rank variants from RNA-Seq. Multi-platform comparisons with the eSNV-Detect variant candidates were performed. The method was first applied to RNA-Seq from a lymphoblastoid cell-line, achieving 99.7% precision and 91.0% sensitivity in the expressed SNPs for the matching HumanOmni2.5 BeadChip data. Comparison of RNA-Seq eSNV candidates from 25 ER+ breast tumors from The Cancer Genome Atlas (TCGA) project with whole exome coding data showed 90.6–96.8% precision and 91.6–95.7% sensitivity. Contrasting single-cell mRNA-Seq variants with matching traditional multicellular RNA-Seq data for the MD-MB231 breast cancer cell-line delineated variant heterogeneity among the single-cells. Further, Sanger sequencing validation was performed for an ER+ breast tumor with paired normal adjacent tissue validating 29 out of 31 candidate eSNVs. The source code and user manuals of the eSNV-Detect pipeline for Sun Grid Engine and virtual machine are available at http://bioinformaticstools.mayo.edu/research/esnv-detect/ .
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 55
    Publication Date: 2014-04-15
    Description: Heterogeneity in genetic networks across different signaling molecular contexts can suggest molecular regulatory mechanisms. Here we describe a comparative chi-square analysis (CPχ²) method, considerably more flexible and effective than other alternatives, to screen large gene expression data sets for conserved and differential interactions. CPχ² decomposes interactions across conditions to assess homogeneity and heterogeneity. Theoretically, we prove an asymptotic chi-square null distribution for the interaction heterogeneity statistic. Empirically, on synthetic yeast cell cycle data, CPχ² achieved much higher statistical power in detecting differential networks than alternative approaches. We applied CPχ² to Drosophila melanogaster wing gene expression arrays collected under normal conditions, and conditions with overexpressed E2F and Cabut, two transcription factor complexes that promote ectopic cell cycling. The resulting differential networks suggest a mechanism by which E2F and Cabut regulate distinct gene interactions, while still sharing a small core network. Thus, CPχ² is sensitive in detecting network rewiring, useful in comparing related biological systems.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 56
    Publication Date: 2013-09-06
    Description: Protein-binding microarray (PBM) is a high-throughput platform that can measure the DNA-binding preference of a protein in a comprehensive and unbiased manner. A typical PBM experiment can measure binding signal intensities of a protein to all the possible DNA k-mers (k = 8–10); such comprehensive binding affinity data usually need to be reduced and represented as motif models before they can be further analyzed and applied. Since proteins can often bind to DNA in multiple modes, one of the major challenges is to decompose the comprehensive affinity data into multimodal motif representations. Here, we describe a new algorithm that uses Hidden Markov Models (HMMs) and can derive precise and multimodal motifs using belief propagations. We describe an HMM-based approach using belief propagations (kmerHMM), which accepts and preprocesses PBM probe raw data into median-binding intensities of individual k-mers. The k-mers are ranked and aligned for training an HMM as the underlying motif representation. Multiple motifs are then extracted from the HMM using belief propagations. Comparisons of kmerHMM with other leading methods on several data sets demonstrated its effectiveness and uniqueness. In particular, it achieved the best performance on more than half of the data sets. In addition, the multiple binding modes derived by kmerHMM are biologically meaningful and will be useful in interpreting other genome-wide data such as those generated from ChIP-seq. The executables and source codes are available at the authors’ websites: e.g. http://www.cs.toronto.edu/~wkc/kmerHMM .
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
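The preprocessing step mentioned in the kmerHMM abstract above (reducing probe-level PBM data to a median binding intensity per k-mer, then ranking the k-mers) can be sketched roughly as follows. This is not the kmerHMM code: the function name `kmer_median_intensities`, the `(sequence, intensity)` input layout, and the choice to count a k-mer once per probe are all assumptions for illustration, and reverse complements are not merged here.

```python
from collections import defaultdict
from statistics import median


def kmer_median_intensities(probes, k):
    """Map each k-mer to the median intensity of the probes containing it.

    probes: iterable of (sequence, intensity) pairs (hypothetical layout).
    """
    by_kmer = defaultdict(list)
    for seq, intensity in probes:
        # A set ensures a k-mer repeated within one probe is counted once.
        kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        for kmer in kmers:
            by_kmer[kmer].append(intensity)
    return {kmer: median(vals) for kmer, vals in by_kmer.items()}


if __name__ == "__main__":
    toy_probes = [("ACGT", 10.0), ("CGTA", 20.0), ("ACGA", 40.0)]
    medians = kmer_median_intensities(toy_probes, k=3)
    # Rank k-mers from strongest to weakest median signal.
    for kmer in sorted(medians, key=medians.get, reverse=True):
        print(kmer, medians[kmer])
```

Using the median rather than the mean keeps a single outlier probe from dominating the score of a k-mer.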
  • 57
    Publication Date: 2014-04-15
    Description: Sequence similarity search is a fundamental way of analyzing nucleotide sequences. Despite decades of research, this is not a solved problem because there exist many similarities that are not found by current methods. Search methods are typically based on a seed-and-extend approach, which has many variants (e.g. spaced seeds, transition seeds), and it remains unclear how to optimize this approach. This study designs and tests seeding methods for inter-mammal and inter-insect genome comparison. By considering substitution patterns of real genomes, we design sets of multiple complementary transition seeds, which have better performance (sensitivity per run time) than previous seeding strategies. Often the best seed patterns have more transition positions than those used previously. We also point out that recent computer memory sizes (e.g. 60 GB) make it feasible to use multiple (e.g. eight) seeds for whole mammal genomes. Interestingly, the most sensitive settings achieve diminishing returns for human–dog and melanogaster–pseudoobscura comparisons, but not for human–mouse, which suggests that we still miss many human–mouse alignments. Our optimized heuristics find ~20 000 new human–mouse alignments that are missing from the standard UCSC alignments. We tabulate seed patterns and parameters that work well so they can be used in future research.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
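To make the seed terminology in the abstract above concrete, here is a small sketch of how a spaced seed with transition positions decides whether two windows count as a hit. The pattern alphabet is an assumption chosen for the example: `1` requires an exact match, `0` ignores the position, and `T` accepts a match or a transition (A/G or C/T). The quadratic scan in `find_hits` is for illustration only; practical tools index seeds rather than comparing every pair of windows, and the seed patterns tabulated in the paper are not reproduced here.

```python
# Purine-purine and pyrimidine-pyrimidine substitutions.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def seed_hit(pattern, a, b):
    """Return True if windows a and b match under the seed pattern."""
    if not (len(pattern) == len(a) == len(b)):
        raise ValueError("pattern and windows must have equal length")
    for p, x, y in zip(pattern, a, b):
        if p == "1" and x != y:
            return False  # exact-match position violated
        if p == "T" and x != y and (x, y) not in TRANSITIONS:
            return False  # mismatch here is a transversion
        # p == "0": any pair of bases is accepted
    return True


def find_hits(pattern, query, target):
    """Return all (query_offset, target_offset) pairs that form a seed hit."""
    w = len(pattern)
    return [
        (i, j)
        for i in range(len(query) - w + 1)
        for j in range(len(target) - w + 1)
        if seed_hit(pattern, query[i:i + w], target[j:j + w])
    ]


if __name__ == "__main__":
    print(seed_hit("1T01", "ACGT", "ATCT"))
    print(find_hits("1T1", "ACG", "TATGA"))
```

Allowing transitions at selected positions lets a seed tolerate the most common kind of substitution without giving up as much specificity as a fully ignored position would.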
  • 58
    Publication Date: 2014-04-15
    Description: Identifying differential features between conditions is a popular approach to understanding molecular features and their mechanisms underlying a biological process of particular interest. Although many tests for identifying differential expression of gene or gene sets have been proposed, there was limited success in developing methods for differential interactions of genes between conditions because of its computational complexity. We present a method for Evaluation of Dependency DifferentialitY (EDDY), which is a statistical test for differential dependencies of a set of genes between two conditions. Unlike previous methods focused on differential expression of individual genes or correlation changes of individual gene–gene interactions, EDDY compares two conditions by evaluating the probability distributions of dependency networks from genes. The method has been evaluated and compared with other methods through simulation studies, and application to glioblastoma multiforme data resulted in informative cancer and glioblastoma multiforme subtype-related findings. The comparison with Gene Set Enrichment Analysis, a differential expression-based method, revealed that EDDY identifies the gene sets that are complementary to those identified by Gene Set Enrichment Analysis. EDDY also showed much lower false positives than Gene Set Co-expression Analysis, a method based on correlation changes of individual gene–gene interactions, thus providing more informative results. The Java implementation of the algorithm is freely available to noncommercial users. Download from: http://biocomputing.tgen.org/software/EDDY .
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 59
    Publication Date: 2014-09-02
    Description: Inundation of evolutionary markers expedited by the Human Genome Project and the 1000 Genome Consortium has necessitated pruning of redundant and dependent variables. Various computational tools based on machine-learning and data-mining methods like feature selection/extraction have been proposed to escape the curse of dimensionality in large datasets. Incidentally, evolutionary studies, primarily based on sequentially evolved variations, have remained un-facilitated by such advances to date. Here, we present a novel approach of recursive feature selection for hierarchical clustering of Y-chromosomal SNPs/haplogroups to select a minimal set of independent markers, sufficient to infer population structure as precisely as deduced by a larger number of evolutionary markers. To validate the applicability of our approach, we optimally designed MALDI-TOF mass spectrometry-based multiplex to accommodate independent Y-chromosomal markers in a single multiplex and genotyped two geographically distinct Indian populations. An analysis of 105 world-wide populations reflected that 15 independent variations/markers were optimal in defining population structure parameters, such as F_ST, molecular variance and correlation-based relationship. A subsequent addition of randomly selected markers had a negligible effect (close to zero, i.e. 1 × 10⁻³) on these parameters. The study proves efficient in tracing complex population structures and deriving relationships among world-wide populations in a cost-effective and expedient manner.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 60
    Publication Date: 2014-09-17
    Description: Developing a quantitative view of how biological pathways are regulated in response to environmental factors is central for understanding of disease phenotypes. We present a computational framework, named Multivariate Inference of Pathway Activity (MIPA), which quantifies degree of activity induced in a biological pathway by computing five distinct measures from transcriptomic profiles of its member genes. Statistical significance of inferred activity is examined using multiple independent self-contained tests followed by a competitive analysis. The method incorporates a new algorithm to identify a subset of genes that may regulate the extent of activity induced in a pathway. We present an in-depth evaluation of specificity, robustness, and reproducibility of our method. We benchmarked MIPA's false positive rate at less than 1%. Using transcriptomic profiles representing distinct physiological and disease states, we illustrate applicability of our method in (i) identifying gene–gene interactions in autophagy-dependent response to Salmonella infection, (ii) uncovering gene–environment interactions in host response to bacterial and viral pathogens and (iii) identifying driver genes and processes that contribute to wound healing and response to anti-TNFα therapy. We provide relevant experimental validation that corroborates the accuracy and advantage of our method.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 61
    Publication Date: 2014-09-17
    Description: Viral recombination is a key evolutionary mechanism, aiding escape from host immunity, contributing to changes in tropism and possibly assisting transmission across species barriers. The ability to determine whether recombination has occurred and to locate associated specific recombination junctions is thus of major importance in understanding emerging diseases and pathogenesis. This paper describes a method for determining recombinant mosaics (and their proportions) originating from two parent genomes, using high-throughput sequence data. The method involves setting the problem geometrically and the use of appropriately constrained quadratic programming. Recombinants of the honeybee deformed wing virus and the Varroa destructor virus-1 are inferred to illustrate the method from both siRNAs and reads sampling the viral genome population (cDNA library); our results are confirmed experimentally. Matlab software (MosaicSolver) is available.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 62
    Publication Date: 2013-06-08
    Description: An appreciable fraction of introns is thought to have some function, but there is no obvious way to predict which specific intron is likely to be functional. We hypothesize that functional introns experience a different selection regime than non-functional ones and will therefore show distinct evolutionary histories. In particular, we expect functional introns to be more resistant to loss, and that this would be reflected in high conservation of their position with respect to the coding sequence. To test this hypothesis, we focused on introns whose function comes about from microRNAs and snoRNAs that are embedded within their sequence. We built a data set of orthologous genes across 28 eukaryotic species, reconstructed the evolutionary histories of their introns and compared functional introns with the rest of the introns. We found that, indeed, the position of microRNA- and snoRNA-bearing introns is significantly more conserved. In addition, we found that both families of RNA genes settled within introns early during metazoan evolution. We identified several easily computable intronic properties that can be used to detect functional introns in general, thereby suggesting a new strategy to pinpoint non-coding cellular functions.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 63
    Publication Date: 2012-06-28
    Description: Identification of transcriptional regulatory regions and tracing their internal organization are important for understanding the eukaryotic cell machinery. Cis-regulatory modules (CRMs) of higher eukaryotes are believed to possess a regulatory ‘grammar’, or preferred arrangement of binding sites, that is crucial for proper regulation and thus tends to be evolutionarily conserved. Here, we present a method, CORECLUST (COnservative REgulatory CLUster STructure), that predicts CRMs based on a set of positional weight matrices. Given regulatory regions of orthologous and/or co-regulated genes, CORECLUST constructs a CRM model by revealing the conserved rules that describe the relative location of binding sites. The constructed model may subsequently be used for the genome-wide prediction of similar CRMs, and thus detection of co-regulated genes, and for the investigation of the regulatory grammar of the system. Compared with related methods, CORECLUST shows better performance at identification of CRMs conferring muscle-specific gene expression in vertebrates and early-developmental CRMs in Drosophila.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 64
    Publication Date: 2012-08-23
    Description: The field of regulatory genomics today is characterized by the generation of high-throughput data sets that capture genome-wide transcription factor (TF) binding, histone modifications, or DNAseI hypersensitive regions across many cell types and conditions. In this context, a critical question is how to make optimal use of these publicly available datasets when studying transcriptional regulation. Here, we address this question in Drosophila melanogaster for which a large number of high-throughput regulatory datasets are available. We developed i-cisTarget (where the ‘i’ stands for integrative), for the first time enabling the discovery of different types of enriched ‘regulatory features’ in a set of co-regulated sequences in one analysis, being either TF motifs or ‘in vivo’ chromatin features, or combinations thereof. We have validated our approach on 15 co-expressed gene sets, 21 ChIP data sets, 628 curated gene sets and multiple individual case studies, and show that meaningful regulatory features can be confidently discovered; that bona fide enhancers can be identified, both by in vivo events and by TF motifs; and that combinations of in vivo events and TF motifs further increase the performance of enhancer prediction.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 65
    Publication Date: 2013-11-21
    Description: Traditional methods that aim to identify biomarkers that distinguish between two groups, like Significance Analysis of Microarrays or the t-test, perform optimally when such biomarkers show homogeneous behavior within each group and differential behavior between the groups. However, in many applications, this is not the case. Instead, a subgroup of samples in one group shows differential behavior with respect to all other samples. To successfully detect markers showing such imbalanced patterns of differential signal, a different approach is required. We propose a novel method, specifically designed for the Detection of Imbalanced Differential Signal (DIDS). We use an artificial dataset and a human breast cancer dataset to measure its performance and compare it with three traditional methods and four approaches that take imbalanced signal into account. Supported by extensive experimental results, we show that DIDS outperforms all other approaches in terms of power and positive predictive value. In a mouse breast cancer dataset, DIDS is the only approach that detects a functionally validated marker of chemotherapy resistance. DIDS can be applied to any continuous value data, including gene expression data, and in any context where imbalanced differential signal is manifested.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 66
    Publication Date: 2014-08-01
    Description: Gene set enrichment testing can enhance the biological interpretation of ChIP-seq data. Here, we develop a method, ChIP-Enrich, for this analysis which empirically adjusts for gene locus length (the length of the gene body and its surrounding non-coding sequence). Adjustment for gene locus length is necessary because it is often positively associated with the presence of one or more peaks and because many biologically defined gene sets have an excess of genes with longer or shorter gene locus lengths. Unlike alternative methods, ChIP-Enrich can account for the wide range of gene locus length-to-peak presence relationships (observed in ENCODE ChIP-seq data sets). We show that ChIP-Enrich has a well-calibrated type I error rate using permuted ENCODE ChIP-seq data sets; in contrast, two commonly used gene set enrichment methods, Fisher's exact test and the binomial test implemented in Genomic Regions Enrichment of Annotations Tool (GREAT), can have highly inflated type I error rates and biases in ranking. We identify DNA-binding proteins, including CTCF, JunD and glucocorticoid receptor α (GRα), that show different enrichment patterns for peaks closer to versus further from transcription start sites. We also identify known and potential new biological functions of GRα. ChIP-Enrich is available as a web interface ( http://chip-enrich.med.umich.edu ) and Bioconductor package.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 67
    Publication Date: 2013-01-20
    Description: Identification of differentially expressed subnetworks from protein–protein interaction (PPI) networks has become increasingly important to our global understanding of the molecular mechanisms that drive cancer. Several methods have been proposed for PPI subnetwork identification, but the dependency among network member genes is not explicitly considered, leaving many important hub genes largely unidentified. We present a new method, based on a bagging Markov random field (BMRF) framework, to improve subnetwork identification for mechanistic studies of breast cancer. The method follows a maximum a posteriori principle to form a novel network score that explicitly considers pairwise gene interactions in PPI networks, and it searches for subnetworks with maximal network scores. To improve their robustness across data sets, a bagging scheme based on bootstrapping samples is implemented to statistically select high confidence subnetworks. We first compared the BMRF-based method with existing methods on simulation data to demonstrate its improved performance. We then applied our method to breast cancer data to identify PPI subnetworks associated with breast cancer progression and/or tamoxifen resistance. The experimental results show that not only an improved prediction performance can be achieved by the BMRF approach when tested on independent data sets, but biologically meaningful subnetworks can also be revealed that are relevant to breast cancer and tamoxifen resistance.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 68
    Publication Date: 2013-01-20
    Description: miRDeep and its varieties are widely used to quantify known and novel micro RNA (miRNA) from small RNA sequencing (RNAseq). This article describes miRDeep*, our integrated miRNA identification tool, which is modeled off miRDeep, but the precision of detecting novel miRNAs is improved by introducing new strategies to identify precursor miRNAs. miRDeep* has a user-friendly graphic interface and accepts raw data in FastQ and Sequence Alignment Map (SAM) or the binary equivalent (BAM) format. Known and novel miRNA expression levels, as measured by the number of reads, are displayed in an interface, which shows each RNAseq read relative to the pre-miRNA hairpin. The secondary pre-miRNA structure and read locations for each predicted miRNA are shown and kept in a separate figure file. Moreover, the target genes of known and novel miRNAs are predicted using the TargetScan algorithm, and the targets are ranked according to the confidence score. miRDeep* is an integrated standalone application where sequence alignment, pre-miRNA secondary structure calculation and graphical display are purely Java coded. This application tool can be executed using a normal personal computer with 1.5 GB of memory. Further, we show that miRDeep* outperformed existing miRNA prediction tools using our LNCaP and other small RNAseq datasets. miRDeep* is freely available online at http://www.australianprostatecentre.org/research/software/mirdeep-star .
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 69
    Publication Date: 2013-01-20
    Description: The mRNA export complex TREX (TREX) is known to contain Aly, UAP56, Tex1 and the THO complex, among which UAP56 is required for TREX assembly. Here, we systematically investigated the role of each human TREX component in TREX assembly and its association with the mRNA. We found that Tex1 is essentially a subunit of the THO complex. Aly, THO and UAP56 are all required for assembly of TREX, in which Aly directly interacts with THO subunits Thoc2 and Thoc5. Both Aly and THO function in linking UAP56 to the cap-binding protein CBP80. Interestingly, association of UAP56 with the spliced mRNA, but not with the pre-mRNA, requires Aly and THO. Unexpectedly, we found that Aly and THO require each other to associate with the spliced mRNA. Consistent with these biochemical results, similar to Aly and UAP56, THO plays critical roles in mRNA export. Together, we propose that Aly, THO and UAP56 form a highly integrated unit to associate with the spliced mRNA and function in mRNA export.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 70
    Publication Date: 2012-09-27
    Description: Due to advances in high-throughput biotechnologies biological information is being collected in databases at an amazing rate, requiring novel computational approaches that process collected data into new knowledge in a timely manner. In this study, we propose a computational framework for discovering modular structure, relationships and regularities in complex data. The framework utilizes a semantic-preserving vocabulary to convert records of biological annotations of an object, such as an organism, gene, chemical or sequence, into networks (Anets) of the associated annotations. An association between a pair of annotations in an Anet is determined by the similarity of their co-occurrence pattern with all other annotations in the data. This feature captures associations between annotations that do not necessarily co-occur with each other and facilitates discovery of the most significant relationships in the collected data through clustering and visualization of the Anet. To demonstrate this approach, we applied the framework to the analysis of metadata from the Genomes OnLine Database and produced a biological map of sequenced prokaryotic organisms with three major clusters of metadata that represent pathogens, environmental isolates and plant symbionts.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 71
    Publication Date: 2012-09-27
    Description: We describe here a novel method for integrating gene and miRNA expression profiles in cancer using feed-forward loops (FFLs) consisting of transcription factors (TFs), miRNAs and their common target genes. The dChip-GemiNI (Gene and miRNA Network-based Integration) method statistically ranks computationally predicted FFLs by their explanatory power to account for differential gene and miRNA expression between two biological conditions such as normal and cancer. GemiNI integrates not only gene and miRNA expression data but also computationally derived information about TF–target gene and miRNA–mRNA interactions. Literature validation shows that the integrated modeling of expression data and FFLs better identifies cancer-related TFs and miRNAs compared to existing approaches. We have utilized GemiNI for analyzing six data sets of solid cancers (liver, kidney, prostate, lung and germ cell) and found that top-ranked FFLs account for ~20% of transcriptome changes between normal and cancer. We have identified common FFL regulators across multiple cancer types, such as known FFLs consisting of MYC and miR-15/miR-17 families, and novel FFLs consisting of ARNT, CREB1 and their miRNA partners. The results and analysis web server are available at http://www.canevolve.org/dChip-GemiNi .
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 72
    Publication Date: 2012-10-24
    Description: Recent technology has made it possible to simultaneously perform multi-platform genomic profiling (e.g. DNA methylation (DM) and gene expression (GE)) of biological samples, resulting in so-called ‘multi-dimensional genomic data’. Such data provide unique opportunities to study the coordination between regulatory mechanisms on multiple levels. However, integrative analysis of multi-dimensional genomics data for the discovery of combinatorial patterns is currently lacking. Here, we adopt a joint matrix factorization technique to address this challenge. This method projects multiple types of genomic data onto a common coordinate system, in which heterogeneous variables weighted highly in the same projected direction form a multi-dimensional module (md-module). Genomic variables in such modules are characterized by significant correlations and likely functional associations. We applied this method to the DM, GE, and microRNA expression data of 385 ovarian cancer samples from The Cancer Genome Atlas project. These md-modules revealed perturbed pathways that would have been overlooked with only a single type of data, uncovered associations between different layers of cellular activities and allowed the identification of clinically distinct patient subgroups. Our study provides a useful protocol for uncovering hidden patterns and their biological implications in multi-dimensional ‘omic’ data.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 73
    Publication Date: 2012-10-24
    Description: Tandem repeats occur frequently in biological sequences. They are important for studying genome evolution and human disease. A number of methods have been designed to detect a single tandem repeat in a sliding window. In this article, we focus on the case that an unknown number of tandem repeat segments of the same pattern are dispersively distributed in a sequence. We construct a probabilistic generative model for the tandem repeats, where the sequence pattern is represented by a motif matrix. A Bayesian approach is adopted to compute this model. Markov chain Monte Carlo (MCMC) algorithms are used to explore the posterior distribution as an effort to infer both the motif matrix of tandem repeats and the location of repeat segments. Reversible jump Markov chain Monte Carlo (RJMCMC) algorithms are used to address the transdimensional model selection problem raised by the variable number of repeat segments. Experiments on both synthetic data and real data show that this new approach is powerful in detecting dispersed short tandem repeats. As far as we know, it is the first work to adopt RJMCMC algorithms in the detection of tandem repeats.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 74
    Publication Date: 2012-11-04
    Description: Genomic experiments (e.g. differential gene expression, single-nucleotide polymorphism association) typically produce ranked lists of genes. We present a simple but powerful approach which uses protein–protein interaction data to detect sub-networks within such ranked lists of genes or proteins. We performed an exhaustive study of network parameters that allowed us to conclude that the average number of components and the average number of nodes per component are the parameters that best discriminate between real and random networks. A novel aspect that increases the efficiency of this strategy in finding sub-networks is that, in addition to direct connections, connections mediated by intermediate nodes are also considered to build up the sub-networks. The possibility of using such intermediate nodes makes this approach more robust to noise. It also overcomes some limitations intrinsic to experimental designs based on differential expression, in which some nodes are invariant across conditions. The proposed approach can also be used for candidate disease-gene prioritization. Here, we demonstrate the usefulness of the approach by means of several case examples that include a differential expression analysis in Fanconi Anemia, a genome-wide association study of bipolar disorder and a genome-scale study of essentiality in cancer genes. An efficient and easy-to-use web interface (available at http://www.babelomics.org) based on HTML5 technologies is also provided to run the algorithm and represent the network.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 75
    Publication Date: 2012-11-04
    Description: An important step in ‘metagenomics’ analysis is the assembly of multiple genomes from mixed sequence reads of multiple species in a microbial community. Most conventional pipelines use a single-genome assembler with carefully optimized parameters. A limitation of a single-genome assembler for de novo metagenome assembly is that sequences of highly abundant species are likely misidentified as repeats in a single genome, resulting in a number of small fragmented scaffolds. We extended a single-genome assembler for short reads, known as ‘Velvet’, to metagenome assembly, which we called ‘MetaVelvet’, for mixed short reads of multiple species. Our fundamental concept was to first decompose a de Bruijn graph constructed from mixed short reads into individual sub-graphs, and second, to build scaffolds based on each decomposed de Bruijn sub-graph as an isolate species genome. We made use of two features, the coverage (abundance) difference and graph connectivity, for the decomposition of the de Bruijn graph. For simulated datasets, MetaVelvet succeeded in generating significantly higher N50 scores than any single-genome assembler. MetaVelvet also reconstructed relatively low-coverage genome sequences as scaffolds. On real datasets of human gut microbial read data, MetaVelvet produced longer scaffolds and increased the number of predicted genes.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 76
    Publication Date: 2012-11-04
    Description: Tandem repeats (TRs) represent one of the most prevalent features of genomic sequences. Due to their abundance and functional significance, a plethora of detection tools has been devised over the last two decades. Despite the longstanding interest, TR detection is still not resolved. Our large-scale tests reveal that current detectors produce different, often nonoverlapping inferences, reflecting characteristics of the underlying algorithms rather than the true distribution of TRs in genomic data. Our simulations show that the power of detecting TRs depends on the degree of their divergence, and repeat characteristics such as the length of the minimal repeat unit and their number in tandem. To reconcile the diverse predictions of current algorithms, we propose and evaluate several statistical criteria for measuring the quality of predicted repeat units. In particular, we propose a model-based phylogenetic classifier, entailing a maximum-likelihood estimation of the repeat divergence. Applied in conjunction with state-of-the-art detectors, our statistical classification scheme for inferred repeats allows false-positive predictions to be filtered out. Since different algorithms appear to specialize in predicting TRs with certain properties, we advise applying multiple detectors with subsequent filtering to obtain the most complete set of genuine repeats.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 77
    Publication Date: 2012-11-25
    Description: MicroRNAs (miRs) function primarily as post-transcriptional negative regulators of gene expression through binding to their mRNA targets. Reliable prediction of a miR’s targets is a considerable bioinformatic challenge of great importance for inferring the miR’s function. Sequence-based prediction algorithms have high false-positive rates, are not in agreement, and are not biological context specific. Here we introduce CoSMic (Context-Specific MicroRNA analysis), an algorithm that combines sequence-based prediction with miR and mRNA expression data. CoSMic differs from existing methods: it identifies miRs that play active roles in the specific biological system of interest and predicts their functional targets with fewer false positives. We applied CoSMic to search for miRs that regulate the migratory response of human mammary cells to epidermal growth factor (EGF) stimulation. Several such miRs, whose putative targets were significantly enriched for migration processes, were identified. We tested three of these miRs experimentally, and showed that they indeed affected the migratory phenotype; we also tested three negative controls. In comparison to other algorithms, CoSMic indeed filters out false positives and allows improved identification of context-specific targets. CoSMic can greatly facilitate miR research in general and, in particular, advance our understanding of individual miRs’ function in a specific context.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 78
    Publication Date: 2013-02-20
    Description: High-throughput sequencing is increasingly being used in combination with bisulfite (BS) assays to study DNA methylation at nucleotide resolution. Although several programmes provide genome-wide alignment of BS-treated reads, the resulting information is not readily interpretable and often requires further bioinformatic steps for meaningful analysis. Current post-alignment BS-sequencing programmes are generally focused on the gene-specific level, a restrictive feature when analysis in the non-coding regions, such as enhancers and intergenic microRNAs, is required. Here, we present Genome Bisulfite Sequencing Analyser (GBSA— http://ctrad-csi.nus.edu.sg/gbsa ), a free open-source software capable of analysing whole-genome bisulfite sequencing data with either a gene-centric or gene-independent focus. Through analysis of the largest published data sets to date, we demonstrate GBSA’s features in providing sequencing quality assessment, methylation scoring, functional data management and visualization of genomic methylation at nucleotide resolution. Additionally, we show that GBSA’s output can be easily integrated with other high-throughput sequencing data, such as RNA-Seq or ChIP-seq, to elucidate the role of methylated intergenic regions in gene regulation. In essence, GBSA allows an investigator to explore not only known loci but also all the genomic regions, for which methylation studies could lead to the discovery of new regulatory mechanisms.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 79
    Publication Date: 2013-02-20
    Description: Computationally identifying effective biomarkers for cancers from gene expression profiles is an important and challenging task. The challenge lies in the complicated pathogenesis of cancers that often involve the dysfunction of many genes and regulatory interactions. Thus, a sophisticated classification model is urgently needed. In this study, we proposed an efficient approach, called ellipsoidFN (ellipsoid Feature Net), to model the disease complexity by ellipsoids and seek a set of heterogeneous biomarkers. Our approach achieves a non-linear classification scheme for the mixed samples by the ellipsoid concept, and at the same time uses a linear programming framework to efficiently select biomarkers from high-dimensional space. ellipsoidFN reduces the redundancy and improves the complementariness between the identified biomarkers, thus significantly enhancing the distinctiveness between cancers and normal samples, and even between cancer types. Numerical evaluation on real prostate cancer, breast cancer and leukemia gene expression datasets suggested that ellipsoidFN outperforms the state-of-the-art biomarker identification methods, and it can serve as a useful tool for cancer biomarker identification in the future. The Matlab code of ellipsoidFN is freely available from http://doc.aporc.org/wiki/EllipsoidFN.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 80
    Publication Date: 2013-02-02
    Description: Designing effective antisense sequences is a formidable problem. A method for predicting efficacious antisense holds the potential to provide fundamental insight into this biophysical process. More practically, such an understanding increases the chance of successful antisense design and saves considerable time, money and labor. The secondary structure of an mRNA molecule is believed to be in a constant state of flux, sampling several different suboptimal states. We hypothesized that particularly volatile regions might provide better accessibility for antisense targeting. A computational framework, GenAVERT, was developed to evaluate this hypothesis. GenAVERT used UNAFold and RNAforester to generate and compare the predicted suboptimal structures of mRNA sequences. Subsequent analysis revealed regions that were particularly volatile in terms of intramolecular hydrogen bonding, and thus potentially superior antisense targets due to their high accessibility. Several mRNA sequences with known natural antisense target sites as well as artificial antisense target sites were evaluated. Upon comparison, antisense sequences predicted based upon the volatility hypothesis closely matched those of the naturally occurring antisense, as well as those artificial target sites that provided efficient down-regulation. These results suggest that this strategy may provide a powerful new approach to antisense design.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 81
    Publication Date: 2013-02-02
    Description: The existence of some extra-genetic (epigenetic) codes has been postulated since the discovery of the primary genetic code. Evident effects of histone post-translational modifications or DNA methylation on the efficiency and the regulation of DNA processes support this postulation. EMdeCODE is an original algorithm that approximates the genomic distribution of given DNA features (e.g. promoter, enhancer, viral integration) by identifying relevant ChIPSeq profiles of post-translational histone marks or DNA binding proteins and combining them in a supermark. The EMdeCODE kernel is essentially a two-step procedure: (i) an expectation-maximization process calculates the mixture of epigenetic factors that maximizes the sensitivity (recall) of the association with the feature under study; (ii) the approximated density is then recursively trimmed with respect to a control dataset to increase the precision by reducing the number of false positives. EMdeCODE densities significantly improve the prediction of enhancer loci and retroviral integration sites with respect to previous methods. Importantly, it can also be used to extract distinctive factors between two arbitrary conditions. Indeed, EMdeCODE identifies unexpected epigenetic profiles specific for coding versus non-coding RNA, pointing towards a new role for H3R2me1 in coding regions.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 82
    Publication Date: 2013-02-02
    Description: Insertion and deletion polymorphisms (indels) are an important source of genomic variation in plant and animal genomes, but accurate genotyping from low-coverage and exome next-generation sequence data remains challenging. We introduce an efficient population clustering algorithm for diploids and polyploids which was tested on a dataset of 2000 exomes. Compared with existing methods, we report a 4-fold reduction in overall indel genotype error rates with a 9-fold reduction in low coverage regions.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 83
    Publication Date: 2013-02-02
    Description: Conventional approaches to predict transcriptional regulatory interactions usually rely on the definition of a shared motif sequence on the target genes of a transcription factor (TF). These efforts have been frustrated by the limited availability and accuracy of TF binding site motifs, usually represented as position-specific scoring matrices, which may match large numbers of sites and produce an unreliable list of target genes. To improve the prediction of binding sites, we propose to additionally use the unrelated knowledge of the genome layout. Indeed, it has been shown that co-regulated genes tend to be either neighbors or periodically spaced along the whole chromosome. This study demonstrates that respective gene positioning carries significant information. This novel type of information is combined with traditional sequence information by a machine learning algorithm called PreCisIon. To optimize this combination, PreCisIon builds a strong gene target classifier by adaptively combining weak classifiers based on either local binding sequence or global gene position. This strategy generically paves the way to the optimized incorporation of any future advances in gene target prediction based on local sequence, genome layout or on novel criteria. With the current state of the art, PreCisIon consistently improves methods based on sequence information only. This is shown by implementing a cross-validation analysis of the 20 major TFs from two phylogenetically remote model organisms. For Bacillus subtilis and Escherichia coli, respectively, PreCisIon achieves on average an area under the receiver operating characteristic curve of 70 and 60%, a sensitivity of 80 and 70% and a specificity of 60 and 56%. The newly predicted gene targets are demonstrated to be functionally consistent with previously known targets, as assessed by analysis of Gene Ontology enrichment or of the relevant literature and databases.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 84
    Publication Date: 2013-02-02
    Description: microRNAs (miRNAs) are short non-coding regulatory RNA molecules. The activity of a miRNA in a biological process can often be reflected in the expression program that characterizes the outcome of the activity. We introduce a computational approach that infers such activity from high-throughput data using a novel statistical methodology, called minimum-mHG (mmHG), that examines mutual enrichment in two ranked lists. Based on this methodology, we provide a user-friendly web application that supports the statistical assessment of miRNA target enrichment analysis (miTEA) in the top of a ranked list of genes or proteins. Using miTEA, we analyze several target prediction tools by examining performance on public miRNA constitutive expression data. We also apply miTEA to analyze several integrative biology data sets, including a novel matched miRNA/mRNA data set covering nine human tissue types. Our novel findings include proposed direct activity of miR-519 in placenta, a direct activity of the oncogenic miR-15 in different healthy tissue types and a direct activity of the poorly characterized miR-768 in both healthy tissue types and cancer cell lines. The miTEA web application is available at http://cbl-gorilla.cs.technion.ac.il/miTEA/ .
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 85
    Publication Date: 2013-02-02
    Description: Sequence alignment of proteins and nucleic acids is a routine task in bioinformatics. Although the comparison of complete peptides, genes or genomes can be undertaken with a great variety of tools, the alignment of short DNA sequences and motifs entails pitfalls that have not been fully addressed yet. Here we confront the structural superposition of transcription factors with the sequence alignment of their recognized cis elements. Our goals are (i) to test TFcompare (http://floresta.eead.csic.es/tfcompare), a structural alignment method for protein–DNA complexes; (ii) to benchmark the pairwise alignment of regulatory elements; (iii) to define the confidence limits and the twilight zone of such alignments and (iv) to evaluate the relevance of these thresholds with elements obtained experimentally. We find that the structure of cis elements and protein–DNA interfaces is significantly more conserved than their sequence and measure how this correlates with alignment errors when only sequence information is considered. Our results confirm that DNA motifs in the form of matrices produce better alignments than individual sequences. Finally, we report that empirical and theoretically derived twilight thresholds are useful for estimating the natural plasticity of regulatory sequences, and hence for filtering out unreliable alignments.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 86
    Publication Date: 2013-02-02
    Description: To mine gene expression data sets effectively, analysis frameworks need to incorporate methods that identify intergenic relationships within enriched biologically relevant subpathways. For this purpose, we developed the Topology Enrichment Analysis frameworK (TEAK). TEAK employs a novel in-house algorithm and a tailor-made Clique Percolation Method to extract linear and nonlinear KEGG subpathways, respectively. TEAK scores subpathways using the Bayesian Information Criterion for context specific data and the Kullback-Leibler divergence for case–control data. In this article, we utilized TEAK with experimental studies to analyze microarray data sets profiling stress responses in the model eukaryote Saccharomyces cerevisiae. Using a public microarray data set, we identified via TEAK linear sphingolipid metabolic subpathways activated during the yeast response to nitrogen stress, and phenotypic analyses of the corresponding deletion strains indicated previously unreported fitness defects for the dpl1 and lag1 mutants under conditions of nitrogen limitation. In addition, we studied the yeast filamentous response to nitrogen stress by profiling changes in transcript levels upon deletion of two key filamentous growth transcription factors, FLO8 and MSS11. Via TEAK we identified a nonlinear glycerophospholipid metabolism subpathway involving the SLC1 gene, which we found via mutational analysis to be required for yeast filamentous growth.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 87
    Publication Date: 2013-02-02
    Description: Structural variation is an important class of genetic variation in mammals. High-throughput sequencing (HTS) technologies promise to revolutionize copy-number variation (CNV) detection but present substantial analytic challenges. Converging evidence suggests that multiple types of CNV-informative data (e.g. read-depth, read-pair, split-read) need be considered, and that sophisticated methods are needed for more accurate CNV detection. We observed that various sources of experimental biases in HTS confound read-depth estimation, and note that bias correction has not been adequately addressed by existing methods. We present a novel read-depth–based method, GENSENG, which uses a hidden Markov model and negative binomial regression framework to identify regions of discrete copy-number changes while simultaneously accounting for the effects of multiple confounders. Based on extensive calibration using multiple HTS data sets, we conclude that our method outperforms existing read-depth–based CNV detection algorithms. The concept of simultaneous bias correction and CNV detection can serve as a basis for combining read-depth with other types of information such as read-pair or split-read in a single analysis. A user-friendly and computationally efficient implementation of our method is freely available.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 88
    Publication Date: 2013-05-04
    Description: Tumor formation is partially driven by DNA copy number changes, which are typically measured using array comparative genomic hybridization, SNP arrays and DNA sequencing platforms. Many techniques are available for detecting recurring aberrations across multiple tumor samples, including CMAR, STAC, GISTIC and KC-SMART. GISTIC is widely used and detects both broad and focal (potentially overlapping) recurring events. However, GISTIC performs false discovery rate control on probes instead of events. Here we propose Analytical Multi-scale Identification of Recurrent Events (ADMIRE), a multi-scale Gaussian smoothing approach, for the detection of both broad and focal (potentially overlapping) recurring copy number alterations. Importantly, false discovery rate control is performed analytically (no need for permutations) on events rather than probes. The method does not require segmentation or calling on the input dataset and therefore reduces the potential loss of information due to discretization. An important characteristic of the approach is that the error rate is controlled across all scales and that the algorithm outputs a single profile of significant events selected from the appropriate scales. We perform extensive simulations and showcase its utility on a glioblastoma SNP array dataset. Importantly, ADMIRE detects focal events that are missed by GISTIC, including two events involving known glioma tumor-suppressor genes: CDKN2C and NF1.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 89
    Publication Date: 2013-04-14
    Description: In this article, we focus on the analysis of competitive gene set methods for detecting the statistical significance of pathways from gene expression data. Our main result is to demonstrate that some of the most frequently used gene set methods, GSEA, GSEArot and GAGE, are severely influenced by the filtering of the data in a way that such an analysis is no longer reconcilable with the principles of statistical inference, rendering the obtained results in the worst case inexpressive. A possible consequence of this is that these methods can increase their power by the addition of unrelated data and noise. Our results are obtained within a bootstrapping framework that allows a rigorous assessment of the robustness of results and enables power estimates. Our results indicate that when using competitive gene set methods, it is imperative to apply a stringent gene filtering criterion. However, even when genes are filtered appropriately, for gene expression data from chips that do not provide a genome-scale coverage of the expression values of all mRNAs, this is not enough for GSEA, GSEArot and GAGE to ensure the statistical soundness of the applied procedure. For this reason, for biomedical and clinical studies, we strongly advise against using GSEA, GSEArot and GAGE for such data sets.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 90
    Publication Date: 2012-11-25
    Description: Large portions of higher eukaryotic proteomes are intrinsically disordered, and abundant evidence suggests that these unstructured regions of proteins are rich in regulatory interaction interfaces. A major class of disordered interaction interfaces are the compact and degenerate modules known as short linear motifs (SLiMs). As a result of the difficulties associated with the experimental identification and validation of SLiMs, our understanding of these modules is limited, advocating the use of computational methods to focus experimental discovery. This article evaluates the use of evolutionary conservation as a discriminatory technique for motif discovery. A statistical framework is introduced to assess the significance of relatively conserved residues, quantifying the likelihood a residue will have a particular level of conservation given the conservation of the surrounding residues. The framework is expanded to assess the significance of groupings of conserved residues, a metric that forms the basis of SLiMPrints (short linear motif fingerprints), a de novo motif discovery tool. SLiMPrints identifies relatively overconstrained proximal groupings of residues within intrinsically disordered regions, indicative of putatively functional motifs. Finally, the human proteome is analysed to create a set of highly conserved putative motif instances, including a novel site on translation initiation factor eIF2A that may regulate translation through binding of eIF4E.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology
  • 91
    Publication Date: 2012-11-25
    Description: The current method for reconstructing gene regulatory networks faces a dilemma concerning the study of bio-medical problems. On the one hand, static approaches assume that genes are expressed in a steady state and thus cannot exploit and describe the dynamic patterns of an evolving process. On the other hand, approaches that can describe the dynamical behaviours require time-course data, which are normally not available in many bio-medical studies. To overcome the limitations of both the static and dynamic approaches, we propose a dynamic cascaded method (DCM) to reconstruct dynamic gene networks from sample-based transcriptional data. Our method is based on the intra-stage steady-rate assumption and the continuity assumption, which can properly characterize the dynamic and continuous nature of gene transcription in a biological process. Our simulation study showed that, compared with static approaches, the DCM not only can reconstruct dynamical networks but also can significantly improve network inference performance. We further applied our method to reconstruct the dynamic gene networks of hepatocellular carcinoma (HCC) progression. The derived HCC networks were verified by functional analysis and network enrichment analysis. Furthermore, it was shown that the modularity and network rewiring in the HCC networks can clearly characterize the dynamic patterns of HCC progression.
    Keywords: Computational Methods, Genomics
    Print ISSN: 0305-1048
    Electronic ISSN: 1362-4962
    Topics: Biology