Jana Schmidt, Sonja Ansorge, and Stefan Kramer (2012)

Scalable Induction of Probabilistic Real-Time Automata Using Maximum Frequent Pattern Based Clustering

In: Proceedings of the Twelfth SIAM International Conference on Data Mining, pp. to appear, SIAM / Omnipress.

The paper presents a scalable method for learning probabilistic real-time automata (PRTAs), a new type of model that captures the dynamics of multi-dimensional event logs. In such logs, each event is described by several features rather than by a single symbol, and the set of events that occur in a log is not known in advance. The method learns a PRTA for an event log by merging the states of a prefix tree acceptor; the merging is guided by a clustering that determines the states of the automaton. To make the overall approach scalable, an online clustering method based on maximum frequent patterns (MFPs) is used. The approach is evaluated on a synthetic, a biological, and a medical data set. The results show that inducing automata with MFP-based clustering yields stable, easy-to-understand automata and, most importantly, scales to large data sets.
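The general idea of clustering-guided state merging can be illustrated with a small Python sketch. It is not the authors' algorithm: the function names are hypothetical, the pattern mining is a brute-force batch computation rather than the online MFP clustering of the paper, and time delays between events are ignored. It assumes that each event is a set of features, that an event is assigned to the largest maximal frequent feature set it contains, and that all prefix tree nodes whose last event falls into the same cluster are merged into one state.

```python
from collections import Counter, defaultdict
from itertools import combinations


def frequent_patterns(events, min_support):
    """Feature subsets contained in at least min_support events (brute force)."""
    counts = Counter()
    for event in events:
        for size in range(1, len(event) + 1):
            for subset in combinations(sorted(event), size):
                counts[frozenset(subset)] += 1
    return {p for p, c in counts.items() if c >= min_support}


def maximal_patterns(patterns):
    """Keep only the patterns that have no frequent proper superset."""
    return {p for p in patterns if not any(p < q for q in patterns)}


def cluster_label(event, maximal):
    """Assumption: an event belongs to the largest maximal pattern it contains."""
    candidates = [p for p in maximal if p <= event]
    if not candidates:
        return frozenset()  # events matching no pattern share one fallback cluster
    return max(candidates, key=lambda p: (len(p), sorted(p)))


def build_prefix_tree(sequences):
    """Prefix tree acceptor as edge counts between prefixes (tuples of events)."""
    edge_counts = Counter()
    for seq in sequences:
        prefix = ()
        for event in seq:
            child = prefix + (event,)
            edge_counts[(prefix, child)] += 1
            prefix = child
    return edge_counts


def merge_states(edge_counts, maximal):
    """Merge tree nodes by the cluster of their last event; return transition probabilities."""
    def state_of(prefix):
        return "START" if not prefix else cluster_label(prefix[-1], maximal)

    merged = defaultdict(Counter)
    for (parent, child), n in edge_counts.items():
        merged[state_of(parent)][state_of(child)] += n
    return {
        state: {target: n / sum(out.values()) for target, n in out.items()}
        for state, out in merged.items()
    }


def induce_automaton(sequences, min_support):
    """Cluster all events, build the prefix tree, then merge its states."""
    events = [event for seq in sequences for event in seq]
    maximal = maximal_patterns(frequent_patterns(events, min_support))
    return merge_states(build_prefix_tree(sequences), maximal)
```

Calling `induce_automaton(sequences, min_support)` on a list of event sequences, each event being a `frozenset` of feature strings, returns a nested dictionary that maps each merged state to its outgoing transition probabilities. The brute-force subset enumeration is exponential in the number of features per event, so the sketch is only suitable for small toy inputs.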