ALBERT

All Library Books, journals and Electronic Records Telegrafenberg

feed icon rss

Your email was sent successfully. Check your inbox.

An error occurred while sending the email. Please try again.

Proceed reservation?

Export
  • 1
    Publication Date: 2011-08-24
    Description: Our purpose is to describe some recent progress in applying fractal concepts to systems of relevance to biology and medicine. We review several biological systems characterized by fractal geometry, with a particular focus on the long-range power-law correlations found recently in DNA sequences containing noncoding material. Furthermore, we discuss the finding that the exponent alpha quantifying these long-range correlations ("fractal complexity") is smaller for coding than for noncoding sequences. We also discuss the application of fractal scaling analysis to the dynamics of heartbeat regulation, and report the recent finding that the normal heart is characterized by long-range "anticorrelations" which are absent in the diseased heart.
    Keywords: Life Sciences (General)
    Type: Chaos, solitons, and fractals (ISSN 0960-0779); Volume 6; 171-201
    Format: text
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 2
    Publication Date: 2011-08-24
    Description: We present evidence supporting the idea that the DNA sequence in genes containing noncoding regions is correlated, and that the correlation is remarkably long range--indeed, base pairs thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the "non-stationary" feature of the sequence of base pairs by applying a new algorithm called Detrended Fluctuation Analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and noncoding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to all eukaryotic DNA sequences (33 301 coding and 29 453 noncoding) in the entire GenBank database. We describe a simple model to account for the presence of long-range power-law correlations which is based upon a generalization of the classic Levy walk. Finally, we describe briefly some recent work showing that the noncoding sequences have certain statistical features in common with natural languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts, and the Shannon approach to quantifying the "redundancy" of a linguistic text in terms of a measurable entropy function. We suggest that noncoding regions in plants and invertebrates may display a smaller entropy and larger redundancy than coding regions, further supporting the possibility that noncoding regions of DNA may carry biological information.
    Keywords: Life Sciences (General)
    Type: Fractals (ISSN 0218-348X); Volume 3; 2; 269-84
    Format: text
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 3
    Publication Date: 2011-08-24
    Description: DNA sequences have been analysed using models, such as an n-step Markov chain, that incorporate the possibility of short-range nucleotide correlations. We propose here a method for studying the stochastic properties of nucleotide sequences by constructing a 1:1 map of the nucleotide sequence onto a walk, which we term a 'DNA walk'. We then use the mapping to provide a quantitative measure of the correlation between nucleotides over long distances along the DNA chain. Thus we uncover in the nucleotide sequence a remarkably long-range power law correlation that implies a new scale-invariant property of DNA. We find such long-range correlations in intron-containing genes and in nontranscribed regulatory DNA sequences, but not in complementary DNA sequences or intron-less genes.
    Keywords: Life Sciences (General)
    Type: Nature (ISSN 0028-0836); Volume 356; 6365; 168-70
    Format: text
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 4
    Publication Date: 2011-08-24
    Description: Recently, it was observed that noncoding regions of DNA sequences possess long-range power-law correlations, whereas coding regions typically display only short-range correlations. We develop an algorithm based on this finding that enables investigators to perform a statistical analysis on long DNA sequences to locate possible coding regions. The algorithm is particularly successful in predicting the location of lengthy coding regions. For example, for the complete genome of yeast chromosome III (315,344 nucleotides), at least 82% of the predictions correspond to putative coding regions; the algorithm correctly identified all coding regions larger than 3000 nucleotides, 92% of coding regions between 2000 and 3000 nucleotides long, and 79% of coding regions between 1000 and 2000 nucleotides. The predictive ability of this new algorithm supports the claim that there is a fundamental difference in the correlation property between coding and noncoding sequences. This algorithm, which is not species-dependent, can be implemented with other techniques for rapidly and accurately locating relatively long coding regions in genomic sequences.
    Keywords: Life Sciences (General)
    Type: Biophysical journal (ISSN 0006-3495); Volume 67; 1; 64-70
    Format: text
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 5
    Publication Date: 2011-08-24
    Description: We analyze the fluctuations in the correlation exponents obtained for noncoding DNA sequences. We find prominent sample-to-sample variations as well as variations within a single sample in the scaling exponent. To determine if these fluctuations may result from finite system size, we generate correlated random sequences of comparable length and study the fluctuations in this control system. We find that the DNA exponent fluctuations are consistent with those obtained from the control sequences having long-range power-law correlations. Finally, we compare our exponents for the DNA sequences with the exponents obtained from power-spectrum analysis and correlation-function techniques, and demonstrate that the original "DNA-walk" method is intrinsically more accurate due to reduced noise.
    Keywords: Life Sciences (General)
    Type: Physical review. E, Statistical physics, plasmas, fluids, and related interdisciplinary topics (ISSN 1063-651X); Volume 47; 5; 3730-3
    Format: text
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 6
    Publication Date: 2011-08-24
    Description: We propose a generalized Levy walk to model fractal landscapes observed in noncoding DNA sequences. We find that this model provides a very close approximation to the empirical data and explains a number of statistical properties of genomic DNA sequences such as the distribution of strand-biased regions (those with an excess of one type of nucleotide) as well as local changes in the slope of the correlation exponent alpha. The generalized Levy-walk model simultaneously accounts for the long-range correlations in noncoding DNA sequences and for the apparently paradoxical finding of long subregions of biased random walks (length lj) within these correlated sequences. In the generalized Levy-walk model, the lj are chosen from a power-law distribution P(lj) varies as lj(-mu). The correlation exponent alpha is related to mu through alpha = 2-mu/2 if 2 〈 mu 〈 3. The model is consistent with the finding of "repetitive elements" of variable length interspersed within noncoding DNA.
    Keywords: Life Sciences (General)
    Type: Physical review. E, Statistical physics, plasmas, fluids, and related interdisciplinary topics (ISSN 1063-651X); Volume 47; 6; 4514-23
    Format: text
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 7
    Publication Date: 2011-08-24
    Description: We compare the statistical properties of coding and noncoding regions in eukaryotic and viral DNA sequences by adapting two tests developed for the analysis of natural languages and symbolic sequences. The data set comprises all 30 sequences of length above 50 000 base pairs in GenBank Release No. 81.0, as well as the recently published sequences of C. elegans chromosome III (2.2 Mbp) and yeast chromosome XI (661 Kbp). We find that for the three chromosomes we studied the statistical properties of noncoding regions appear to be closer to those observed in natural languages than those of coding regions. In particular, (i) a n-tuple Zipf analysis of noncoding regions reveals a regime close to power-law behavior while the coding regions show logarithmic behavior over a wide interval, while (ii) an n-gram entropy measurement shows that the noncoding regions have a lower n-gram entropy (and hence a larger "n-gram redundancy") than the coding regions. In contrast to the three chromosomes, we find that for vertebrates such as primates and rodents and for viral DNA, the difference between the statistical properties of coding and noncoding regions is not pronounced and therefore the results of the analyses of the investigated sequences are less conclusive. After noting the intrinsic limitations of the n-gram redundancy analysis, we also briefly discuss the failure of the zeroth- and first-order Markovian models or simple nucleotide repeats to account fully for these "linguistic" features of DNA. Finally, we emphasize that our results by no means prove the existence of a "language" in noncoding DNA.
    Keywords: Life Sciences (General)
    Type: Physical review. E, Statistical physics, plasmas, fluids, and related interdisciplinary topics (ISSN 1063-651X); Volume 52; 3; 2939-50
    Format: text
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 8
    Publication Date: 2011-08-24
    Description: Long-range power-law correlations have been reported recently for DNA sequences containing noncoding regions. We address the question of whether such correlations may be a trivial consequence of the known mosaic structure ("patchiness") of DNA. We analyze two classes of controls consisting of patchy nucleotide sequences generated by different algorithms--one without and one with long-range power-law correlations. Although both types of sequences are highly heterogenous, they are quantitatively distinguishable by an alternative fluctuation analysis method that differentiates local patchiness from long-range correlations. Application of this analysis to selected DNA sequences demonstrates that patchiness is not sufficient to account for long-range correlation properties.
    Keywords: Life Sciences (General)
    Type: Physical review. E, Statistical physics, plasmas, fluids, and related interdisciplinary topics (ISSN 1063-651X); Volume 49; 2; 1685-9
    Format: text
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 9
    Publication Date: 2011-08-24
    Description: An open question in computational molecular biology is whether long-range correlations are present in both coding and noncoding DNA or only in the latter. To answer this question, we consider all 33301 coding and all 29453 noncoding eukaryotic sequences--each of length larger than 512 base pairs (bp)--in the present release of the GenBank to dtermine whether there is any statistically significant distinction in their long-range correlation properties. Standard fast Fourier transform (FFT) analysis indicates that coding sequences have practically no correlations in the range from 10 bp to 100 bp (spectral exponent beta=0.00 +/- 0.04, where the uncertainty is two standard deviations). In contrast, for noncoding sequences, the average value of the spectral exponent beta is positive (0.16 +/- 0.05) which unambiguously shows the presence of long-range correlations. We also separately analyze the 874 coding and the 1157 noncoding sequences that have more than 4096 bp and find a larger region of power-law behavior. We calculate the probability that these two data sets (coding and noncoding) were drawn from the same distribution and we find that it is less than 10(-10). We obtain independent confirmation of these findings using the method of detrended fluctuation analysis (DFA), which is designed to treat sequences with statistical heterogeneity, such as DNA's known mosaic structure ("patchiness") arising from the nonstationarity of nucleotide concentration. The near-perfect agreement between the two independent analysis methods, FFT and DFA, increases the confidence in the reliability of our conclusion.
    Keywords: Life Sciences (General)
    Type: Physical review. E, Statistical physics, plasmas, fluids, and related interdisciplinary topics (ISSN 1063-651X); Volume 51; 5; 5084-91
    Format: text
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 10
    Publication Date: 2011-08-24
    Description: We review evidence supporting the idea that the DNA sequence in genes containing non-coding regions is correlated, and that the correlation is remarkably long range--indeed, nucleotides thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the "non-stationarity" feature of the sequence of base pairs by applying a new algorithm called detrended fluctuation analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and non-coding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to every DNA sequence (33301 coding and 29453 non-coding) in the entire GenBank database. Finally, we describe briefly some recent work showing that the non-coding sequences have certain statistical features in common with natural and artificial languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts. These statistical properties of non-coding sequences support the possibility that non-coding regions of DNA may carry biological information.
    Keywords: Life Sciences (General)
    Type: Physica A (ISSN 0378-4371); Volume 221; 180-92
    Format: text
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
Close ⊗
This website uses cookies and the analysis tool Matomo. More information can be found here...