Antimicrobial peptides (AMPs) are promising candidates in the fight against multidrug-resistant pathogens because of their broad range of activities and low toxicity. Some AMPs also display antitumor and antiviral functions, making them alternative drug candidates for these important diseases. To facilitate the discovery of AMPs and their functions, we provide this one-stop server for predicting antimicrobial and other activities of unknown sequences. Three methods are currently available:
- AmPEP: Predict antimicrobial activity
- Deep-AmPEP30: Convolutional neural network model for short sequences (<=30 residues)
- RF-AmPEP30: Random forest model for short sequences (<=30 residues)
Our methods and server are under constant development. How often is our server accessed? See the statistics page.
Deep-AmPEP30: Short Antimicrobial Peptide Prediction
Short AMPs are considered better drug options because they have enhanced antimicrobial activity, higher stability, and lower manufacturing cost. Because existing AMP prediction methods often mix long and short sequences in both training and validation, we found that their prediction accuracies for short AMPs are surprisingly low (60-77%). To meet the need for short AMP prediction, we developed Deep-AmPEP30, a sequence-based classification method that uses selected types of PseKRAAC reduced amino acid composition as features (see Figure 3) and a convolutional neural network as the learning algorithm. Deep-AmPEP30 was tuned to optimize the prediction of short AMPs of 30 amino acids or fewer, and in testing it achieved an accuracy of 83%, an AUC-ROC of 0.92, and an AUC-PR of 0.94.
Molecular Therapy - Nucleic Acids 2020, 20, 882-894.
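To make the idea of a reduced amino acid composition feature more concrete, the following is a minimal Python sketch of a k-tuple composition over a reduced alphabet. The six-group clustering in `CLUSTERS` and the function names are assumptions chosen for illustration; they are not the PseKRAAC types selected for Deep-AmPEP30, and the full PseKRAAC scheme offers more cluster types and options than this plain contiguous k-tuple count.

```python
from collections import Counter
from itertools import product

# Illustrative clustering of the 20 standard amino acids into 6 groups.
# ASSUMPTION: this grouping is hypothetical and is not one of the
# PseKRAAC cluster types used by Deep-AmPEP30.
CLUSTERS = ["AVLIMC", "FWYH", "STNQ", "KR", "DE", "GP"]

# Map every residue to the index of its cluster, stored as a one-character label.
REDUCE = {aa: str(i) for i, group in enumerate(CLUSTERS) for aa in group}


def reduce_sequence(seq):
    """Rewrite a peptide in the reduced alphabet; reject non-standard residues."""
    try:
        return "".join(REDUCE[aa] for aa in seq.upper())
    except KeyError as err:
        raise ValueError(f"non-standard residue: {err.args[0]}") from None


def reduced_ktuple_composition(seq, k=2):
    """Return normalized frequencies of all contiguous k-tuples
    over the reduced alphabet (one feature per possible k-tuple)."""
    reduced = reduce_sequence(seq)
    labels = [str(i) for i in range(len(CLUSTERS))]
    keys = ["".join(t) for t in product(labels, repeat=k)]
    windows = [reduced[i:i + k] for i in range(len(reduced) - k + 1)]
    counts = Counter(windows)
    total = len(windows)
    return {key: (counts[key] / total if total else 0.0) for key in keys}


if __name__ == "__main__":
    features = reduced_ktuple_composition("KKLLKK", k=2)
    nonzero = {key: value for key, value in features.items() if value > 0}
    print(len(features), nonzero)
```

With 6 clusters and k=2 the feature vector has 36 entries regardless of peptide length, which is what makes this kind of encoding convenient as fixed-size input to a classifier.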
AmPEP: Antimicrobial Peptide Prediction
AmPEP is a sequence-based classification method for AMPs using a random forest. The prediction model is based on the distribution patterns of amino acid properties along the sequence.
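One common way to encode how a property is distributed along a sequence is to record, for each property class, the relative positions at which the first, 25%, 50%, 75% and 100% of its residues occur. The Python sketch below illustrates this for a single property. The three hydrophobicity classes, the rounding convention, and the function name are assumptions made for illustration and may differ from the descriptors used in AmPEP.

```python
import math

# ASSUMPTION: a three-class hydrophobicity grouping used only for illustration.
HYDROPHOBICITY_CLASSES = {
    "polar": "RKEDQN",
    "neutral": "GASTPHY",
    "hydrophobic": "CLVIMFW",
}

# Fractions of a class's occurrences at which the position is recorded.
CHECKPOINTS = (("first", 0.0), ("25", 0.25), ("50", 0.5), ("75", 0.75), ("100", 1.0))


def distribution_descriptor(seq, classes=HYDROPHOBICITY_CLASSES):
    """For each class, return the position (as a percentage of sequence length)
    of the residue that completes each checkpoint fraction of that class."""
    seq = seq.upper()
    n = len(seq)
    features = {}
    for name, members in classes.items():
        # 1-based positions of residues belonging to this class.
        positions = [i + 1 for i, aa in enumerate(seq) if aa in members]
        for label, frac in CHECKPOINTS:
            if not positions:
                value = 0.0  # class absent from the sequence
            else:
                # ASSUMPTION: round up, and use the first occurrence for 0%.
                rank = max(1, math.ceil(frac * len(positions)))
                value = 100.0 * positions[rank - 1] / n
            features[f"{name}_{label}"] = value
    return features


if __name__ == "__main__":
    for key, value in distribution_descriptor("KLKL").items():
        print(key, value)
```

Each property contributes 3 classes x 5 checkpoints = 15 values in this sketch, so the descriptor length depends only on the number of properties, not on the peptide length.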
Using our large and diverse collection of AMP/non-AMP data (3268/166791 sequences), we evaluated 19 random forest classifiers with different positive:negative data ratios by 10-fold cross-validation. Our optimal model, AmPEP with a 1:3 data ratio, achieved a very high accuracy of 96%, an MCC of 0.9, an AUC-ROC of 0.99, and a Kappa statistic of 0.9. Descriptor analysis by Pearson correlation coefficients of the AMP/non-AMP distributions revealed that reduced feature sets (from the full set of 105 features down to a minimal set of 23) achieve comparable performance in all aspects except for some reduction in precision. Furthermore, AmPEP achieved high performance on a benchmark dataset in terms of AUC-ROC (0.995), AUC-PR (0.957), MCC (0.921), and kappa (0.962). This performance is 1-5% better than that of two published methods, iAMPpred and iAMP-2L.
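For readers less familiar with the metrics quoted above, the following Python sketch shows how accuracy, MCC and Cohen's kappa follow from a binary confusion matrix, and how a negative set might be subsampled to a chosen positive:negative ratio. The function names, the fixed seed, and the example counts are hypothetical; the counts are not results from AmPEP.

```python
import math
import random


def subsample_negatives(positives, negatives, ratio=3, seed=0):
    """Randomly draw up to `ratio` negatives per positive (e.g. 1:3)."""
    rng = random.Random(seed)
    k = min(len(negatives), ratio * len(positives))
    return rng.sample(negatives, k)


def binary_metrics(tp, fp, tn, fn):
    """Compute accuracy, Matthews correlation coefficient and Cohen's kappa
    from the four cells of a binary confusion matrix."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    accuracy = (tp + tn) / total

    # MCC: correlation between predicted and true labels, in [-1, 1].
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    # Kappa: agreement beyond what the class marginals would give by chance.
    p_yes = ((tp + fp) / total) * ((tp + fn) / total)
    p_no = ((tn + fn) / total) * ((tn + fp) / total)
    expected = p_yes + p_no
    kappa = (accuracy - expected) / (1 - expected) if expected != 1 else 0.0

    return {"accuracy": accuracy, "mcc": mcc, "kappa": kappa}


if __name__ == "__main__":
    # Hypothetical counts for 100 positives and 300 negatives (a 1:3 ratio).
    print(binary_metrics(tp=90, fp=20, tn=280, fn=10))
```

Unlike accuracy, MCC and kappa both account for class imbalance, which is why they are informative when comparing classifiers trained at different positive:negative ratios.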
This online prediction model has been reimplemented in R and tested to achieve accuracy very close to that of the original MATLAB implementation used for the publication. If you want to run the MATLAB code yourself, feel free to download it from here. A Python re-implementation of AmPEP is also available here.
Scientific Reports 2018, 8, 1697.