ALBERT

All Library Books, journals and Electronic Records Telegrafenberg

  • 1
    ISSN: 1522-9602
    Source: Springer Online Journal Archives 1860-2000
    Topics: Biology, Mathematics
    Notes: Abstract: The self-complementary subset $\mathcal{T}_0 = \mathcal{X}_0 \cup \{\mathrm{AAA}, \mathrm{TTT}\}$ with $\mathcal{X}_0$ = {AAC, AAT, ACC, ATC, ATT, CAG, CTC, CTG, GAA, GAC, GAG, GAT, GCC, GGC, GGT, GTA, GTC, GTT, TAC, TTC} of 22 trinucleotides has a preferential occurrence in the frame 0 (reading frame established by the ATG start trinucleotide) of protein (coding) genes of both prokaryotes and eukaryotes. The subsets $\mathcal{T}_1 = \mathcal{X}_1 \cup \{\mathrm{CCC}\}$ and $\mathcal{T}_2 = \mathcal{X}_2 \cup \{\mathrm{GGG}\}$ of 21 trinucleotides have a preferential occurrence in the shifted frames 1 and 2 respectively (frame 0 shifted by one and two nucleotides respectively in the 5′-3′ direction). $\mathcal{T}_1$ and $\mathcal{T}_2$ are complementary to each other. The subset $\mathcal{T}_0$ contains the subset $\mathcal{X}_0$ which has the rarity property ($6 \times 10^{-8}$) to be a complementary maximal circular code with two permutated maximal circular codes $\mathcal{X}_1$ and $\mathcal{X}_2$ in the frames 1 and 2 respectively. $\mathcal{X}_0$ is called a $C^3$ code. A quantitative study of these three subsets $\mathcal{T}_0, \mathcal{T}_1, \mathcal{T}_2$ in the three frames 0, 1, 2 of protein genes, and the 5′ and 3′ regions of eukaryotes, shows that their occurrence frequencies are constant functions of the trinucleotide positions in the sequences. The frequencies of $\mathcal{T}_0, \mathcal{T}_1, \mathcal{T}_2$ in the frame 0 of protein genes are 49, 28.5 and 22.5% respectively. In contrast, the frequencies of $\mathcal{T}_0, \mathcal{T}_1, \mathcal{T}_2$ in the 5′ and 3′ regions of eukaryotes are independent of the frame. Indeed, the frequency of $\mathcal{T}_0$ in the three frames of 5′ (respectively 3′) regions is equal to 35.5% (respectively 38%) and is greater than the frequencies of $\mathcal{T}_1$ and $\mathcal{T}_2$, both equal to 32.25% (respectively 31%) in the three frames. Several frequency asymmetries unexpectedly observed (e.g. the frequency difference between $\mathcal{T}_1$ and $\mathcal{T}_2$ in the frame 0) are related to a new property of the subset $\mathcal{T}_0$ involving substitutions. An evolutionary analytical model at three parameters $(p, q, t)$ based on an independent mixing of the 22 codons (trinucleotides in frame 0) of $\mathcal{T}_0$ with equiprobability (1/22) followed by $t \approx 4$ substitutions per codon according to the proportions $p \approx 0.1$, $q \approx 0.1$ and $r = 1 - p - q \approx 0.8$ in the three codon sites respectively, retrieves the frequencies of $\mathcal{T}_0, \mathcal{T}_1, \mathcal{T}_2$ observed in the three frames of protein genes and explains these asymmetries. Furthermore, the same model $(0.1, 0.1, t)$ after $t \approx 22$ substitutions per codon retrieves the statistical properties observed in the three frames of the 5′ and 3′ regions. The complex behaviour of these analytical curves is totally unexpected and a priori difficult to imagine.
    Type of Medium: Electronic Resource
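
The set relations stated in the abstract can be checked mechanically. The following Python sketch is illustrative only and rests on two assumptions that the abstract does not spell out: "complementary" is read as the reverse complement (A↔T, C↔G, read in the opposite direction), and the "permutated" codes X1 and X2 are taken to be the circular shifts of the X0 trinucleotides by one and two letters. The helper names (`reverse_complement`, `rotate`, `subset_frequencies`) and the toy sequence are hypothetical and do not come from the paper.

```python
# The 20 trinucleotides of X0, copied from the abstract.
X0 = set(
    "AAC AAT ACC ATC ATT CAG CTC CTG GAA GAC "
    "GAG GAT GCC GGC GGT GTA GTC GTT TAC TTC".split()
)

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(word):
    """Assumed meaning of 'complementary': complement each base, then reverse."""
    return word.translate(COMPLEMENT)[::-1]


def rotate(word, k):
    """Circular shift of a trinucleotide by k letters (assumed 'permutation')."""
    return word[k:] + word[:k]


def complement_set(words):
    return {reverse_complement(w) for w in words}


# Assumption: X1 and X2 are X0 shifted by one and two letters.
X1 = {rotate(w, 1) for w in X0}
X2 = {rotate(w, 2) for w in X0}

# The three subsets as defined in the abstract.
T0 = X0 | {"AAA", "TTT"}
T1 = X1 | {"CCC"}
T2 = X2 | {"GGG"}
SUBSETS = {"T0": T0, "T1": T1, "T2": T2}


def subset_frequencies(seq, frame):
    """Fraction of trinucleotides of seq, read in the given frame, in each subset."""
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    return {
        name: sum(c in words for c in codons) / len(codons)
        for name, words in SUBSETS.items()
    }


if __name__ == "__main__":
    print("T0 self-complementary:", complement_set(T0) == T0)
    print("T1 and T2 complementary:", complement_set(T1) == T2)
    print("sizes:", len(T0), len(T1), len(T2))
    # Toy sequence, not real data.
    print(subset_frequencies("ATGGAAGTTTAC", frame=0))
```

Under these assumptions the three subsets are pairwise disjoint and together contain all 64 trinucleotides, which is consistent with the reported frequencies of T0, T1 and T2 summing to 100% in each frame.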