ALBERT

All Library Books, journals and Electronic Records Telegrafenberg

feed icon rss

Your email was sent successfully. Check your inbox.

An error occurred while sending the email. Please try again.

Proceed reservation?

Export
  • 1
    Electronic Resource
    Electronic Resource
    Springer
    Algorithmica 12 (1994), S. 345-374 
    ISSN: 1432-0541
    Keywords: Approximate match ; Dynamic programming ; Index ; word neighborhood
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science , Mathematics
    Notes: Abstract Given a relatively short query stringW of lengthP, a long subject stringA of lengthN, and a thresholdD, theapproximate keyword search problem is to find all substrings ofA that align withW with not more than D insertions, deletions, and mismatches. In typical applications, such as searching a DNA sequence database, the size of the “database”A is much larger than that of the queryW, e.g.,N is on the order of millions or billions andP is a hundred to a thousand. In this paper we present an algorithm that given a precomputedindex of the databaseA, finds rare matches in time that issublinear inN, i.e.,N c for somec〈1. The sequenceA must be overa. finite alphabet σ. More precisely, our algorithm requires 0(DN pow(ɛ) logN) expected-time where ɛ=D/P is the maximum number of differences as a percentage of query length, and pow(ɛ) is an increasing and concave function that is 0 when ɛ=0. Thus the algorithm is superior to current O(DN) algorithms when ɛ is small enough to guarantee that pow(ɛ) 〈 1. As seen in the paper, this is true for a wide range of ɛ, e.g., ɛ. up to 33% for DNA sequences (¦⌆¦=4) and 56% for proteins sequences (¦⌆¦=20). In preliminary practical experiments, the approach gives a 50-to 500-fold improvement over previous algorithms for prolems of interest in molecular biology.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 2
    Electronic Resource
    Electronic Resource
    Springer
    Algorithmica 13 (1995), S. 7-51 
    ISSN: 1432-0541
    Keywords: Computational biology ; Branch- and-bound algorithms ; Approximation algorithms ; Fragment assembly ; Sequence reconstruction
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science , Mathematics
    Notes: Abstract The trend toward very large DNA sequencing projects, such as those being undertaken as part of the Human Genome Program, necessitates the development of efficient and precise algorithms for assembling a long DNA sequence from the fragments obtained by shotgun sequencing or other methods. The sequence reconstruction problem that we take as our formulation of DNA sequence assembly is a variation of the shortest common superstring problem, complicated by the presence of sequencing errors and reverse complements of fragments. Since the simpler superstring problem is NP-hard, any efficient reconstruction procedure must resort to heuristics. In this paper, however, a four-phase approach based on rigorous design criteria is presented, and has been found to be very accurate in practice. Our method is robust in the sense that it can accommodate high sequencing error rates, and list a series of alternate solutions in the event that several appear equally good. Moreover, it uses a limited form of multiple sequence alignment to detect, and often correct, errors in the data. Our combined algorithm has successfully reconstructed nonrepetitive sequences of length 50,000 sampled at error rates of as high as 10%.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 3
    Electronic Resource
    Electronic Resource
    Springer
    Algorithmica 14 (1995), S. 85-121 
    ISSN: 1432-0541
    Keywords: Pattern matching ; Regular expressions ; Concave gap penalties ; Approximate matching
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science , Mathematics
    Notes: Abstract Given a sequenceA of lengthM and a regular expressionR of lengthP, an approximate regular expression pattern-matching algorithm computes the score of the optimal alignment betweenA and one of the sequencesB exactly matched byR. An alignment between sequencesA=a1a2 ... aM andB=b1b2... bN is a list of ordered pairs, 〈(i1,j1), (i2j2), ..., (it,jtt)〉 such that ik 〈 ik+1 and jk 〈 jk+1. In this case the alignmentaligns symbols aik and bjk, and leaves blocks of unaligned symbols, orgaps, between them. A scoring schemeS associates costs for each aligned symbol pair and each gap. The alignment's score is the sum of the associated costs, and an optimal alignment is one of minimal score. There are a variety of schemes for scoring alignments. In a concave gap penalty scoring schemeS={δ, w}, a function δ(a, b) gives the score of each aligned pair of symbolsa andb, and aconcave function w(k) gives the score of a gap of lengthk. A function w is concave if and only if it has the property that, for allk 〉 1, w(k + 1) −w(k) ≤w(k) −w(k −1). In this paper we present an O(MP(logM + log2 P)) algorithm for approximate regular expression matching for an arbitraryδ and any concavew.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 4
    Electronic Resource
    Electronic Resource
    Springer
    Algorithmica 13 (1995), S. 1-6 
    ISSN: 1432-0541
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science , Mathematics
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 5
    Electronic Resource
    Electronic Resource
    Springer
    Algorithmica 13 (1995), S. 211-243 
    ISSN: 1432-0541
    Keywords: Pattern matching ; Dynamic programming ; Extended regular expressions ; Molecular biology ; Gene recognition
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science , Mathematics
    Notes: Abstract Some recognition problems are either too complex or too ambiguous to be expressed as a simple pattern matching problem using a sequence or regular expression pattern. In these cases, a richer environment is needed to describe the “patterns” and recognition techniques used to perform the recognition. Some researchers have turned to artificial-intelligence techniques and multistep matching approaches for the problems of gene recognition [5], [7], [18], protein structure recognition [13], and on-line character recognition [6]. This paper presents a class of problems which involve finding matches to “patterns of patterns,” orsuper- patterns, given solutions to the lower-level patterns. The expressiveness of this problem class rivals that of traditional artificial-intelligence characterizations, and yet polynomial-time algorithms are described for each problem in the class.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 6
    Publication Date: 1994-11-01
    Print ISSN: 0178-4617
    Electronic ISSN: 1432-0541
    Topics: Computer Science , Mathematics
    Published by Springer
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 7
    Publication Date: 1995-02-01
    Print ISSN: 0178-4617
    Electronic ISSN: 1432-0541
    Topics: Computer Science , Mathematics
    Published by Springer
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 8
    Publication Date: 1995-07-01
    Print ISSN: 0178-4617
    Electronic ISSN: 1432-0541
    Topics: Computer Science , Mathematics
    Published by Springer
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 9
    Publication Date: 1995-02-01
    Print ISSN: 0178-4617
    Electronic ISSN: 1432-0541
    Topics: Computer Science , Mathematics
    Published by Springer
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 10
    Publication Date: 1995-02-01
    Print ISSN: 0178-4617
    Electronic ISSN: 1432-0541
    Topics: Computer Science , Mathematics
    Published by Springer
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
Close ⊗
This website uses cookies and the analysis tool Matomo. More information can be found here...