Marcel H. Schulz, David Weese, Tobias Rausch, Andreas Döring, Knut Reinert and Martin Vingron. Fast and Adaptive Variable Order Markov Chain Construction. Proceedings of the 8th International Workshop in Algorithms in Bioinformatics (WABI'08) , page 306-317.
Variable order Markov chains (VOMCs) are a flexible class of models that extend the well-known Markov chains. They have been applied to a variety of problems in computational biology, e.g. protein family classification. A linear time and space construction algorithm has been published in 2000 by Apostolico and Bejerano. However, neither a report of the actual running time nor an implementation of it have been published since. In this paper we use the lazy suffix tree and the enhanced suffix array to improve upon the algorithm of Apostolico and Bejerano. We introduce the PISA toolbox which is orders of magnitude faster than current tools for building VOMCs, and is suitable for large scale sequence analysis.
The new algorithms and additional tools for VOMC handling are implemented in the PISA toolbox.
Supported platforms are: Windows and Linux. Please take a look at the pisaDoc.pdf file for usage instructions and examples.
Last Update 4. March 2012