JUCHMME, an acronym for Java Utility for Class Hidden Markov Models and Extensions, is a tool developed for biological sequence analysis. The overall aim of this work has been to develop a software tool capable of offering a large collection of standard algorithms for Hidden Markov Models (HMMs), as well as a number of extensions, and to evaluate these models on various biological problems. The JUCHMME framework is characterized by:
- Flexibility: Ease of use and customization for various problems. The user can create models of any architecture and any alphabet (DNA, protein or other), all without requiring programming skills; settings are made through a configuration file.
- Training methods: JUCHMME integrates a wide range of training algorithms for HMMs with labeled sequences. Models of this kind are often called “class HMMs” and are commonly trained under the Maximum Likelihood (ML) criterion to model within-class data distributions. The tool supports the Baum-Welch algorithm and the extension necessary to handle labeled data. Other alternatives are also supported, namely the gradient-descent algorithm proposed by Baldi and Chauvin, and Viterbi training (also known as “segmental k-means”). Additionally, the Conditional Maximum Likelihood (CML) criterion, which corresponds to discriminative training, is supported. CML training can be performed only with gradient-based algorithms, and to this end a fast and robust algorithm for individual learning-rate adaptation has been implemented. The same algorithm is available for training Hidden Neural Networks.
- Decoding: It integrates a wide range of decoding algorithms, such as Viterbi, N-Best, posterior-Viterbi and the Optimal Accuracy Posterior Decoder. Moreover, decoding of partially labeled data is offered with all algorithms, in order to allow the incorporation of experimental information.
- Training Procedures: It contains built-in model creation and evaluation procedures, with options for independent tests, self-consistency tests, jackknife tests, k-fold cross-validation and early stopping. All the prediction algorithms also incorporate several reliability measures, as well as standard performance indicators such as the correlation coefficient, Q and SOV.
- HMM Extensions: To overcome the limitations of standard HMMs and class HMMs, a number of extensions have been developed, such as segmental k-means (Viterbi training) for both Maximum Likelihood (ML) and Conditional Maximum Likelihood (CML), Hidden Neural Networks (HNNs), models that condition on previous observations, and a method for semi-supervised learning of HMMs that can incorporate labeled, unlabeled and partially labeled data.
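To give a feel for the ML training machinery mentioned above, here is a minimal Java sketch of the scaled forward recursion, the core computation behind Baum-Welch. All class and method names are illustrative, not JUCHMME's actual API; the model is a plain discrete-emission HMM.

```java
// Scaled forward algorithm for a discrete-emission HMM (illustrative sketch,
// not JUCHMME code). Rescaling at each step avoids numerical underflow on
// long sequences; the sum of the log scaling factors is log P(obs | model).
public class ForwardSketch {

    // init[i]: P(start in state i); trans[i][j]: P(j | i);
    // emit[i][o]: P(symbol o | state i); obs: observed symbol indices.
    public static double logLikelihood(double[] init, double[][] trans,
                                       double[][] emit, int[] obs) {
        int n = obs.length, k = init.length;
        double[] alpha = new double[k];
        double logL = 0.0;
        for (int i = 0; i < k; i++) alpha[i] = init[i] * emit[i][obs[0]];
        logL += rescale(alpha);
        for (int t = 1; t < n; t++) {
            double[] next = new double[k];
            for (int j = 0; j < k; j++) {
                double s = 0.0;
                for (int i = 0; i < k; i++) s += alpha[i] * trans[i][j];
                next[j] = s * emit[j][obs[t]];
            }
            alpha = next;
            logL += rescale(alpha);
        }
        return logL;
    }

    // Normalizes v to sum to 1 and returns the log of the scaling factor.
    private static double rescale(double[] v) {
        double s = 0.0;
        for (double x : v) s += x;
        for (int i = 0; i < v.length; i++) v[i] /= s;
        return Math.log(s);
    }
}
```

Baum-Welch combines this forward pass with a symmetric backward pass to compute expected state and transition counts, which are then renormalized into updated parameters.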
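Among the decoders listed above, Viterbi is the simplest to illustrate. The sketch below (again with invented names, not JUCHMME's API) recovers the most likely state path by dynamic programming in log space.

```java
// Viterbi decoding for a discrete-emission HMM (illustrative sketch,
// not JUCHMME code). Works in log space so probabilities never underflow;
// a zero probability simply becomes -Infinity and loses every comparison.
public class ViterbiSketch {

    // init[i]: P(start in state i); trans[i][j]: P(j | i);
    // emit[i][o]: P(symbol o | state i). Returns the most likely state path.
    public static int[] viterbi(double[] init, double[][] trans,
                                double[][] emit, int[] obs) {
        int n = obs.length, k = init.length;
        double[][] logp = new double[n][k];
        int[][] back = new int[n][k];
        for (int i = 0; i < k; i++)
            logp[0][i] = Math.log(init[i]) + Math.log(emit[i][obs[0]]);
        for (int t = 1; t < n; t++) {
            for (int j = 0; j < k; j++) {
                double best = Double.NEGATIVE_INFINITY;
                int arg = 0;
                for (int i = 0; i < k; i++) {
                    double s = logp[t - 1][i] + Math.log(trans[i][j]);
                    if (s > best) { best = s; arg = i; }
                }
                logp[t][j] = best + Math.log(emit[j][obs[t]]);
                back[t][j] = arg;   // remember the best predecessor
            }
        }
        // Trace back from the best final state.
        int[] path = new int[n];
        double best = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < k; i++)
            if (logp[n - 1][i] > best) { best = logp[n - 1][i]; path[n - 1] = i; }
        for (int t = n - 1; t > 0; t--) path[t - 1] = back[t][path[t]];
        return path;
    }
}
```

The other decoders (N-Best, posterior-Viterbi, Optimal Accuracy) differ in what they optimize, but share the same dynamic-programming backbone.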
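Of the performance indicators mentioned (correlation coefficient, Q, SOV), the per-residue correlation coefficient (Matthews correlation coefficient) is easy to show directly; this is a generic sketch of the standard formula, not code taken from JUCHMME.

```java
// Matthews correlation coefficient from binary confusion counts
// (illustrative sketch, not JUCHMME code). Returns a value in [-1, 1];
// 1 is perfect prediction, 0 is random, -1 is total disagreement.
public class MccSketch {

    public static double mcc(long tp, long tn, long fp, long fn) {
        double num = (double) tp * tn - (double) fp * fn;
        // Multiply square roots instead of taking one root of the product,
        // to avoid overflow for large counts.
        double den = Math.sqrt((double) (tp + fp))
                   * Math.sqrt((double) (tp + fn))
                   * Math.sqrt((double) (tn + fp))
                   * Math.sqrt((double) (tn + fn));
        // Conventionally defined as 0 when any marginal count is zero.
        return den == 0.0 ? 0.0 : num / den;
    }
}
```

Q (fraction of correctly predicted residues) and SOV (segment overlap) are computed from the same predicted-versus-observed labelings, at residue and segment level respectively.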
Source code: https://github.com/pbagos/juchmme
JUCHMME User's Guide: [PDF]
Download: juchmme_v1.0.3.zip (5462k) (Pantelis Bagos, May 4, 2019)