Predict the pathogenicity of amino acid substitutions and their molecular mechanisms

MutPred2 is a standalone and web application developed to classify amino acid substitutions as pathogenic or benign in humans. In addition, it predicts their impact on over 50 different protein properties and thus enables the inference of molecular mechanisms of pathogenicity.


The original MutPred web application can still be accessed here.
For non-frameshift insertions and deletions, please use our new standalone and web application MutPred-Indel here.
For frameshift and stop-gain variants, please use our new standalone and web application MutPred-LOF here.
For splice variants, we have developed MutPred Splice here.

MutPred2 web server

The web server is intended for small-scale data sets: at most 100 substitutions, and no input sequence longer than 35,000 residues. The site currently runs MutPred2.0. It requires protein sequences in FASTA format, a list of amino acid substitutions in the corresponding FASTA headers (separated by spaces only), and a valid email address. The protein ID cannot contain spaces, semicolons, or commas.
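The input constraints above can be checked before submitting a job. The following is a minimal Python sketch, not part of MutPred2 itself. It assumes that each FASTA header consists of the protein ID followed by space-separated substitutions, that substitutions are written as reference residue, position, and new residue (for example A2T), and that the 100-substitution limit applies to the whole submission. The function names and the regular expression are hypothetical and chosen for illustration only.

```python
import re

MAX_SUBSTITUTIONS = 100       # limit stated for the web server
MAX_SEQUENCE_LENGTH = 35_000  # longest input sequence the web server accepts

# Assumed notation: reference residue, 1-based position, new residue (e.g. A2T).
SUBSTITUTION_RE = re.compile(r"^[ACDEFGHIKLMNPQRSTVWY]\d+[ACDEFGHIKLMNPQRSTVWY]$")


def parse_records(fasta_text):
    """Return a list of (protein_id, substitutions, sequence) tuples."""
    records = []
    tokens, seq_parts = None, []
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if tokens is not None:
                records.append((tokens[0], tokens[1:], "".join(seq_parts)))
            # Split on single spaces only; tabs or commas stay inside a token
            # and are reported later as malformed substitutions.
            tokens = [t for t in line[1:].split(" ") if t] or [""]
            seq_parts = []
        elif tokens is not None:
            seq_parts.append(line)
    if tokens is not None:
        records.append((tokens[0], tokens[1:], "".join(seq_parts)))
    return records


def check_input(fasta_text):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    total = 0
    for protein_id, substitutions, sequence in parse_records(fasta_text):
        if not protein_id:
            problems.append("header without a protein ID")
        if ";" in protein_id or "," in protein_id:
            problems.append(f"protein ID {protein_id!r} contains ';' or ','")
        if len(sequence) > MAX_SEQUENCE_LENGTH:
            problems.append(f"{protein_id}: sequence longer than {MAX_SEQUENCE_LENGTH} residues")
        for sub in substitutions:
            if not SUBSTITUTION_RE.match(sub):
                problems.append(f"{protein_id}: malformed substitution {sub!r}")
        total += len(substitutions)
    if total > MAX_SUBSTITUTIONS:
        problems.append(f"{total} substitutions exceed the limit of {MAX_SUBSTITUTIONS}")
    return problems
```

Calling `check_input` on the text of a FASTA file returns a list of messages. The sketch checks only the format and the size limits; it does not verify that the reference residue in each substitution matches the sequence.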


Download standalone

The standalone executable can be used for genome-scale data sets. In addition to the standard MutPred2 input format (see above), the MutPred2 software also accepts the output file from ANNOVAR's coding_change.pl program, which makes it straightforward to go from VCF files to MutPred2 input. To install and run MutPred2, you will need about 50 GB of hard disk space and at least 4 GB of RAM. Click on the link below to download.

For Linux (64-bit)

Download data files and source code

The data files include a tab-delimited file containing the freely shareable subset of MutPred2's training data, binary data files that MutPred2 depends on, and binary files containing the trained machine learning models that MutPred2 uses to generate features and make predictions. The MutPred2 code requires MATLAB release 2017b or earlier to work properly. Details about the structure and setup of the code are provided in the README file.

Training data (does not contain variants that are exclusively in HGMD due to licensing restrictions)

GitHub repository (for MutPred2.0, the version in the bioRxiv preprint and the main text of the paper)

Model and data files (required for source code to work properly)