What is MiRduplexSVM?

We address the problem of predicting the position of the miRNA:miRNA* duplex on a microRNA hairpin by developing and applying a novel SVM-based methodology. Predicting the miRNA:miRNA* duplex is a first step towards identifying the mature miRNA, suggesting possible miRNA targets and, ultimately, reducing experimental effort, time and cost. The proposed methodology combines a unique problem representation with an unbiased protocol for optimizing the SVM hyper-parameters, in order to learn an accurate predictive model from the latest miRBase release. The resulting model, termed MiRduplexSVM, is the first to provide precise information about all four ends of the miRNA duplex. We show that: (a) our method greatly outperforms four state-of-the-art tools, namely MaturePred, MiRPara, MatureBayes and MiRdup, as well as a Simple Geometric Locator, when trained on the same training datasets employed for each tool and evaluated on a common blind test set; (b) in all comparisons, MiRduplexSVM shows superior performance, achieving a 10-60% increase in prediction accuracy for mammalian hairpins, and it generalizes very well to plant hairpins without any special optimization; (c) MiRduplexSVM can also be used to accurately predict the miRNA or the miRNA* given the opposite strand of a duplex. Its performance on this task is superior to that of the 2-nt overhang rule commonly used in computational studies and similar to that of a comparative genomics approach, without the need for prior knowledge or the complexity of performing multiple alignments.
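For readers unfamiliar with the 2-nt overhang rule used as a baseline above, the following is a minimal sketch of one common way to apply it; it is not the MiRduplexSVM method and not code from the authors. It assumes the hairpin's secondary structure is available in dot-bracket notation and that the known strand is given as 0-based, inclusive hairpin coordinates. The function names (`pair_table`, `partner`, `opposite_strand`) and the fallback used for unpaired positions are hypothetical choices made for illustration.

```python
def pair_table(structure):
    """Map each paired position to its partner, from a dot-bracket string."""
    stack, pairs = [], {}
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            j = stack.pop()
            pairs[i], pairs[j] = j, i
    return pairs


def partner(pairs, pos, length):
    """Partner of `pos` on the opposite arm.

    Assumption for illustration: if `pos` is unpaired, extrapolate from the
    nearest paired base, treating the two arms as running one-to-one.
    """
    for d in range(length):
        for q in (pos - d, pos + d):
            if q in pairs:
                # Moving +1 along one arm corresponds to -1 on the other arm.
                return pairs[q] + (q - pos)
    raise ValueError("structure contains no base pairs")


def opposite_strand(structure, start, end, overhang=2):
    """Predict the opposite duplex strand with the 2-nt 3' overhang rule.

    `start` and `end` are 0-based inclusive hairpin coordinates of the known
    strand. In hairpin coordinates the same +overhang shift applies whether
    the known strand lies on the 5' arm or on the 3' arm.
    """
    n = len(structure)
    pairs = pair_table(structure)
    opp_start = partner(pairs, end, n) + overhang
    opp_end = partner(pairs, start, n) + overhang
    # Clamp to the hairpin boundaries.
    return max(0, opp_start), min(n - 1, opp_end)


# Toy hairpin: a perfect 26-bp stem closed by a 4-nt loop (not real data).
toy_structure = "(" * 26 + "...." + ")" * 26
print(opposite_strand(toy_structure, 2, 23))
```

The rule only shifts the positions paired with the known strand by a fixed offset, so it cannot account for bulges, internal loops or variable overhang lengths; this is the kind of limitation a learned model is intended to address.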