ISSN:
1522-9602
Source:
Springer Online Journal Archives 1860-2000
Subject:
Biology, Mathematics
Notes:
Abstract: A new measure of subalignment similarity is introduced. Specifically, similarity s(l, c) is defined as the logarithm to the base p of the probability of finding c or fewer mismatches in a subalignment of length l, where p is the probability of a match. Previous algorithms cannot use this measure to find locally optimal subalignments because, unlike Needleman-Wunsch and Sellers similarities, this measure is nonlinear. A new pattern recognition algorithm is described for finding all locally optimal subalignments of two nucleotide sequences. The DD algorithm can use s(l, c) or any other reasonable similarity function to assess the relative interest of subalignments. The DD algorithm searches only the diagonal graph, which lacks insertions and deletions. This search strategy greatly decreases the computation time and does not require an arbitrary choice of gap cost. The paths of the resulting DD graph usually draw attention to likely locations for insertions and deletions. A heuristic formula is derived for estimating significance levels for s(l, c) in the context of the lengths of the two aligned sequences. The DD algorithm has been used to find interesting subalignments between the nucleotide sequences for human and murine interleukin 2.
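The definition of s(l, c) in the abstract can be illustrated with a minimal sketch. It assumes, beyond what the abstract states, that positions match independently, so the number of mismatches in a subalignment of length l follows a binomial distribution with mismatch probability 1 - p. The function name `similarity` and the default p = 0.25 (four equally likely nucleotides) are hypothetical choices made for illustration, not taken from the paper.

```python
import math


def similarity(l, c, p=0.25):
    """Sketch of s(l, c): log base p of P(at most c mismatches in length l).

    Assumption (not from the abstract): positions are independent, so the
    mismatch count is Binomial(l, 1 - p), where p is the match probability.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if not 0 <= c <= l:
        raise ValueError("c must satisfy 0 <= c <= l")
    q = 1.0 - p  # probability of a mismatch at one position
    # Cumulative binomial probability of observing c or fewer mismatches.
    tail = sum(math.comb(l, k) * q**k * p**(l - k) for k in range(c + 1))
    # Change of base: log_p(x) = ln(x) / ln(p).
    return math.log(tail) / math.log(p)
```

Under this assumption a perfect subalignment (c = 0) has probability p^l, so its similarity equals its length l, and allowing more mismatches at a fixed length lowers the score. The score is not a sum of per-position terms, which is consistent with the abstract's remark that the measure is nonlinear.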
Material type:
Digital media
URL:
http://dx.doi.org/10.1007/BF02462327