
Johannes Fischer, Volker Heun, and Stefan Kramer (2005)

Fast Frequent String Mining Using Suffix Arrays

In: ICDM '05: Proceedings of the Fifth IEEE International Conference on Data Mining, pp. 609-612, Washington, DC, USA, IEEE Computer Society Press.

Mining frequent strings in databases has many interesting applications, e.g., in computational biology. We focus on a special kind of constraint-based frequent string mining, namely computing all strings that are frequent in one database and infrequent in another. We present a method to find such strings using suffix and LCP arrays, which can be computed extremely fast and space-efficiently, and which also exhibit good locality behavior. We test our method on several biologically relevant data sets and show that it outperforms existing methods in terms of time and space.
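
To make the problem statement concrete, the following Python sketch illustrates the mining task on toy data. It is not the algorithm from the paper: the suffix array is built by naive sorting, candidates are enumerated by brute force, and only support counting uses the suffix array. All names (`suffix_array`, `lcp_array`, `StringIndex`, `mine`) and the thresholds `min_pos` and `max_neg` are assumptions chosen for illustration, as is the definition of support as the number of database strings containing a pattern.

```python
from typing import List


def suffix_array(text: str) -> List[int]:
    """Naive construction by sorting suffixes (illustration only, not fast)."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def lcp_array(text: str, sa: List[int]) -> List[int]:
    """Kasai's algorithm: lcp[r] is the longest common prefix length of the
    suffixes at ranks r-1 and r in the suffix array (lcp[0] is 0)."""
    n = len(text)
    rank = [0] * n
    for r, start in enumerate(sa):
        rank[start] = r
    lcp = [0] * n
    h = 0
    for i in range(n):
        if rank[i] == 0:
            h = 0
            continue
        j = sa[rank[i] - 1]
        while i + h < n and j + h < n and text[i + h] == text[j + h]:
            h += 1
        lcp[rank[i]] = h
        if h > 0:
            h -= 1
    return lcp


class StringIndex:
    """Suffix array over a database of strings joined by a separator.

    Assumption: the separator character does not occur in any string."""

    SEP = "\x00"

    def __init__(self, strings: List[str]) -> None:
        self.text = self.SEP.join(strings) + self.SEP
        # owner[p] is the index of the database string covering position p.
        self.owner: List[int] = []
        for k, s in enumerate(strings):
            self.owner.extend([k] * (len(s) + 1))
        self.sa = suffix_array(self.text)

    def _bound(self, pattern: str, strict: bool) -> int:
        # Binary search over suffixes, compared on their first len(pattern) chars.
        lo, hi = 0, len(self.sa)
        while lo < hi:
            mid = (lo + hi) // 2
            prefix = self.text[self.sa[mid]:self.sa[mid] + len(pattern)]
            if prefix < pattern or (strict and prefix == pattern):
                lo = mid + 1
            else:
                hi = mid
        return lo

    def support(self, pattern: str) -> int:
        """Number of distinct database strings that contain the pattern."""
        lo = self._bound(pattern, strict=False)
        hi = self._bound(pattern, strict=True)
        return len({self.owner[self.sa[r]] for r in range(lo, hi)})


def mine(pos: List[str], neg: List[str], min_pos: int, max_neg: int) -> List[str]:
    """Substrings with support >= min_pos in pos and <= max_neg in neg."""
    pos_index, neg_index = StringIndex(pos), StringIndex(neg)
    # Brute-force candidate enumeration; the paper avoids this step.
    candidates = {s[i:j] for s in pos
                  for i in range(len(s)) for j in range(i + 1, len(s) + 1)}
    return sorted(c for c in candidates
                  if pos_index.support(c) >= min_pos
                  and neg_index.support(c) <= max_neg)
```

For example, `mine(["acgt", "acga", "tacg"], ["ggtt", "cgcg"], min_pos=3, max_neg=0)` asks for every substring that occurs in all three strings of the first database and in none of the second. The LCP array is shown only to illustrate the data structure named in the abstract; this sketch does not use it for mining.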

String Mining