Publication Date:
2019
Description:
〈h3〉Abstract〈/h3〉
〈p〉We consider the construction of the 〈em〉suffix tree〈/em〉 and the 〈em〉directed acyclic word graph〈/em〉 (〈em〉DAWG〈/em〉) indexing data structures for a collection 〈span〉
〈span〉\(\mathcal {T}\)〈/span〉
〈/span〉 of texts, where a new symbol may be appended to any text in 〈span〉
〈span〉\(\mathcal {T} = \{T_1, \ldots , T_K\}\)〈/span〉
〈/span〉, 〈em〉at any time〈/em〉. This 〈em〉fully-online〈/em〉 scenario, which arises in dynamically indexing multi-sensor data, is a natural generalization of the long solved 〈em〉semi-online〈/em〉 text indexing problem, where texts 〈span〉
〈span〉\(T_1, \ldots , T_{k}\)〈/span〉
〈/span〉 are permanently fixed before the next text 〈span〉
〈span〉\(T_{k+1}\)〈/span〉
〈/span〉 is processed for each 〈em〉k〈/em〉 (〈span〉
〈span〉\(1 \le k 〈 K\)〈/span〉
〈/span〉). We first show that a direct application of Weiner’s right-to-left online construction for the suffix tree of a single text to fully-online multiple texts requires superlinear time. This also means that Blumer et al.’s left-to-right online construction for the DAWG of a single text requires superlinear time in the fully-online setting. We then present our fully-online versions of these algorithms that run in 〈span〉
〈span〉\(O(N \log \sigma )\)〈/span〉
〈/span〉 time and 〈em〉O〈/em〉(〈em〉N〈/em〉) space, where 〈em〉N〈/em〉 is the total length of the texts in 〈span〉
〈span〉\(\mathcal {T}\)〈/span〉
〈/span〉 and 〈span〉
〈span〉\(\sigma \)〈/span〉
〈/span〉 is their alphabet size. We then show how to extend Ukkonen’s left-to-right online suffix tree construction to fully-online multiple strings, with the aid of Weiner’s suffix tree for the reversed texts.
〈/p〉
Print ISSN:
0178-4617
Electronic ISSN:
1432-0541
Topics:
Computer Science
,
Mathematics