Jana Schmidt and Stefan Kramer (2011)
The Augmented Itemset Tree: A Data Structure for Online Maximum Frequent Pattern Mining
In: Discovery Science - 14th International Conference, DS 2011, pp. 277-291, Springer.
This paper introduces an approach for incremental maximal frequent pattern (MFP)
mining in sparse binary data, where instances are observed one by one. For this purpose,
we propose the Augmented Itemset Tree (AIST), a data structure that incorporates features
of the FP-tree into the itemset tree.
In the given setting, we assume that just the data structure
is maintained in main memory, and each instance is
observed only once.
The AIST
not only stores observed frequent
patterns, but also allows for quick frequency updates of relevant subpatterns. In order
to quickly identify the current set of exact MFPs, potential candidates are extracted from
former MFPs and patterns that occur in the new instance. The presented approach is evaluated
concerning the runtime and memory requirements depending on the number of instances, minimum
support and different settings of pattern properties. The obtained results suggest that AISTs
are useful for mining maximal frequent itemsets in an online setting whenever larger
patterns can be expected.
