Mutantelec: An In Silico mutation simulation platform for comparative electrostatic potential profiling of proteins

The electrostatic potential plays a key role in many biological processes like determining the affinity of a ligand to a given protein target, and they are responsible for the catalytic activity of many enzymes. Understanding the effect that amino acid mutations will have on the electrostatic potential of a protein, will allow a thorough understanding of which residues are the most important in a protein. MutantElec, is a friendly web application for in silico generation of site‐directed mutagenesis of proteins and the comparison of electrostatic potential between the wild type protein and the mutant(s), based on the three‐dimensional structure of the protein. The effect of the mutation is evaluated using different approach to the traditional surface map. MutantElec provides a graphical display of the results that allows the visualization of changes occurring at close distance from the mutation and thus uncovers the local and global impact of a specific change. © 2017 Wiley Periodicals, Inc.


Introduction
The electrostatic potential (EP) at the surface of biological macromolecules such as proteins and nucleic acids plays a key role in many biological processes.Electrostatic interactions (EIs) and hydrophobic interfaces guide substrates and ligands to their designated location, and govern all protein interactions with other macromolecules. [1,2]EIs play a key role in determining the affinity of a ligand to a given protein target, and they are responsible for the catalytic activity of many enzymes.One well-studied example is the Saccharomyces cerevisiae phosphoenolpyruvate carboxykinase where the binding of the Mn 12 required for catalysis, is due to the EIs to the side chains of a Lys213. [3,4]The reversible interaction between protein and membrane is critical to many biological processes [5] and these associations have been shown to be partly mediated by EIs. [6,7]Also, electrostatic charge distribution of interacting protein surfaces determines the formation or stabilization of many protein complexes.Thereafter, the mutation of a few or even a single residue can induce the destabilization of the interface. [8]This is the case of the yeast mitochondrial malate dehydrogenase where mutation of His46, located in the a-C helix of the binding interface between monomers, results in the loss of polar interactions at the subunit interface and ultimately the dissociation of the dimer. [9]Substantial evidence of the effect of charge distributions on the functional characteristics of proteins can also be found in the literature.12] The EPs of biological macromolecules can be estimated using Poisson-Boltzmann (PB) equations, and several software packages, such as APBS [13] and Delphi, [14] have been developed to solve these equations.Many of these software take advantage of novel computational approaches, such as Grid computing for distributed calculations, [15] and the use of graphic processor units for EP calculations, [16] which enable calculations to be done in a matter of minutes.EPs are most conveniently displayed as color-coded surface representations using modern graphical programs such as JSmol (http://www.jmol.org/),Pymol (http://www.pymol.org/),and VMD. [17]However, none of these tools reveals the net effect on the EP of the target proteins, nor on the global charge distribution of in silico mutant variants of the protein of interest.
Site-directed mutagenesis is a standard experimental technique used to generate site-specific mutations in known protein-coding genes. [18][21][22][23] However, this technique is very laborious and time consuming, and in silico tools to help direct the selection of candidate residues are scarce.
In spite of tremendous advances in molecular modeling and bioinformatics software, the in silico design of single-site mutant variants and their evaluation requires significant expertise in various sophisticated bioinformatics tools designed for protein modeling and ligand analyses. [24]Molecular modeling software packages (e.g., VMD, [17] ICM, [25] SwissPDB, [26] VegaZZ [27] ) can generate the in silico mutants by replacement of residues, one at a time, but they do not calculate the new electrostatic surface automatically, and do not provide an automated pipeline to generate a series of mutations.Only recently, programs such as SAAMBE [28] and SAAFEC [29] have been developed, that allow the prediction of the changes in the free energy of binding and protein folding respectively, caused by amino acids mutations In addition, programs like BeAtMuSiC [30] and PoPMuSiC-2.0 [31] can predict the changes in protein-protein binding affinity and on protein stability as a consequence of in silico mutation.However, none of those programs allow the prediction of the changes that mutations will have in the EP of a protein.In turn, Delphi [14] performs versatile EP calculations, but lacks the ability to generate in silico mutations.Importantly, most programs do not take into account non-natural amino acids, for example, those are phosphorylated and dephosphorylated as part of the regulatory mechanism of the cell.Being such an important process in cellular regulatory networks, [32,33] exclusion of phosphorylated residues is an important caveat that limits adequate study of the effect of mutations in EP calculations.
Despite the many virtues of the alluded software, most of them are not user friendly and require considerable expertise from the user.For example, some lack the option to input a PDB identifier to initiate the job.Others lack simplicity in their mutation interface: the users must specify manually the code, chain and number of the residue they wish to mutate, along with the new residue.In most applications this step is not automatically validated, and represents a source of errors to biologists with little or none computational knowledge.Furthermore, except for Delphi, none of the applications discussed has an online side-by-side comparison interface between the wild type and the mutant protein To tackle the impairments alluded to above, we designed MutantElec, a web-based application for the study of the effects that mutations have in the EP of a protein of interest.It simulates in silico site-directed mutations and analyzes their effect on the EP distribution of the mutated protein.MutantElec provides a graphical display that allows the visualization of changes occurring at any distance from the introduced point mutation and, thus, uncovers both the local and global effects of a site-specific change.This application is user friend and easy to use, and has thus the potential to control frequent errors associated to misusage.We are confident that MutantElec will prove particularly valuable in the analysis of mutations involving amino acids with similar physicochemical properties and those that are target of posttranslational modifications (such as phosphorylation), where currently available graphical representations of the whole protein surface obscure small changes of the EP.

Methods
The MutantElect web application is based on an intuitive graphical user interface with three main component layers: (1) an INPUT layer, through which a protein structure in PDB format is selected and uploaded into the system, (2) an ANALYSIS layer comprising several subroutines that simulate the mutation(s) chosen by the user and estimates the EP configuration of the wild type and mutated variants of the target protein, and (3) an OUTPUT layer, involving a set of subroutines for the display of the results in a user-friendly and intuitive graphical environment.
Multiple in-house scripts written in TCL and Perl programming languages were created to process and connect the input and output files of the various programs used along the pipeline, including the software used for mutation (Modeller [34] ), for EP calculations (APBS [13] ), as well as for automating the process (Fig. 1).

INPUT layer: consulting web interface
The first part of the system comprises the web interface that allows the user to choose the type of analysis and upload the target protein.The input data upload and consultation web interface was built using HTML, JavaScript and PHP languages, in a simple and intuitive web environment (Fig. 2).The user can choose between two basic analyses: (1) site-specific mutagenesis and (2) mutagenesis of a specific region.The "sitespecific mutagenesis" option enables the user to select the residue that is to be mutated and to specify the desired amino-acid exchange.Furthermore, through this option, any residue can be substituted by all the other 19 common amino acids using a scanning procedure (Supporting Information Figure 1S, Additional file 1), facilitating exploration of the most perturbing and least perturbing changes.In turn, the "mutagenesis of a specific region" option enables the user to mutate up to 10 amino acids in a row, in any selected region of the target protein (Supporting Information Figure 2S, Additional file 1).Alternatively, the user can upload a native and a mutant protein obtained experimentally, to compare the EPs of both using the tool without simulating mutations in silico.After choosing the type of analysis, the UPLOAD query form can be accessed for the user to upload a PDB file containing the 3D coordinates of the protein of interest.The coordinate file can be obtained from the PDB database or from a model generated by the user through comparative modeling, following the instructions provided.The file can be then processed to check the amino acid composition.If a non-standard residue is found, a warning is displayed indicating this residue was deleted of the analysis.The user must modify the PDB file by exchanging or removing the nonstandard residues (e.g., HETATM, ligand, etc) previously.This is important because the force field calculations do not generate appropriate parameters for non-standard residues.
Once the PDB file has been uploaded, the web interface displays all the amino acids in the target protein as a list, ordered numerically from the N-to the C-terminus.The user has then the option to select the residues to be mutated and the corresponding substitutions.Most commonly phosphorylated residues (phosphoserine, phosphothreonine, phosphotyrosine) available for the Force Field CHARMM [35] are also considered in the Muta-ntElec pipeline.Additionally, the user can modify the parameters used to calculate the EP through the APBS software (dielectric constant of the biomolecule, dielectric constant of the solvent, temperature).The default options used in the subsequent analyses are those of proteins under mesophilic conditions having water as solvent, i.e. temperature 5 258C and dielectric constant of water 5 78.5.Once the job is submitted, calculations are sent to the job queue, where the request is processed.

ANALYSIS layer: mutation and electrostatic properties calculation
Amino acid mutation in the 3D structure of the protein is performed by the Mutate module from the Modeller software. [32]he conformation of the mutant sidechain is optimized by Combinatorial chart displaying the electrostatic potential of the input and mutant proteins, and the difference in potential between both.b) Scheme of the environment surrounding the mutated residue (Glu100Ile) for the protein Fur PA (PDB_ID:1MZB). [12]In red are residues around the mutated residue (in blue) within the selected sphere of 15A ˚ratio used for the analysis.This distance can be modified by the user in the parameter setting web page with the option "Distance cutoff."[Color figure can be viewed at wileyonlinelibrary.com] energy minimization (conjugate gradient) and refined using a small number of steps of molecular dynamics as implemented in Modeller. [33]The generated 3D coordinates are then processed using the PDB2PQR software [35] to assign charge and radius parameters for each atom.This information is stored in the pqr file and used for further calculation of the EP using the PB equations as implemented in the APBS software. [13]hen, the ANALYSIS subroutines estimate the EP of each amino acid residue in the input protein and the mutant variant, based on the known structural conformation of the input protein.Next, the variation in the EP distribution caused by the mutation introduced is calculated.These changes are not only related to the point mutation introduced in the target protein (i.e., change of electrostatic charge associated to a single residue) but also to conformational changes of the mutant variant.These changes need to be assessed before the EP distribution of the mutant protein is recalculated.For this purpose, the system does a fast optimization using the Optimizers Module from Modeller [36] through the conjugate gradients method and molecular dynamics simulation.After the spatial conformation of the mutant variant is estimated, its EP distribution is calculated.A detailed report of the observed changes in the EP associated with the point mutation is then provided to the user.Multiple single mutations can be queried simultaneously without limitation and results are conveniently delivered as independent reports for each request processed.

OUTPUT layer: information retrieval
The results are processed to generate files containing the EP per residue per protein.This is done because the APBS results contain the EP per atom, and having the information in this format would make interpreting the results a difficult task.For this, an in-house script was written, that takes the output from APBS, and calculates the "per residue EP" by summing the individual contributions of each atom of every residue of a given protein.Results are presented as numerical and graphical outputs, displayed online as a set of charts which plot the per residue EP distribution of the input protein (Supporting Information Figure 3Sa, Additional file 1) and the mutant variant (Supporting Information Figure 3Sb, Additional file 1), individually or combined (Supporting Information Figure 3Sc, Additional file 1).The difference in EP between the input protein and the mutant variant generated by MutantElec are also displayed (Supporting Information Figure 3Sd, Additional file 1).A summative chart is plotted, which integrates all the above results in a single figure (Fig. 3a) which is accompanied by a close-up representation of the mutation site and its neighboring residues within a sphere of selection of 15A ˚(a default value, that can be manipulated at will by the user) This is exemplified in Figure 3b, using the Fur from P. aeruginosa Fur PA (PDB_ID:1MZB), [12] and a simulated substitution of residue Glu100 for Ile.
To evaluate the significance of the differences in the EP uncovered between the input and the mutant protein, Muta-ntElec performs the non-parametric Wilcoxon Signed-Rank Test, with a confidence level of 0.05. [37]Additionally, the EP maps of the surfaces are produced using JSmol (Supporting Information Figure 4S, Additional file 1).This facilitates the analysis of the results without the need to install and master additional programs, allowing the user to identify all potentially important changes in specific regions on mutation, and to predict the impact of site directed mutations on protein structure and/or function.Finally, the application generates a compressed file (file *.tar.gz) with all the results, for the users to download and perform their own local analyses if deemed necessary.This file also contains the input and output files that were generated during the analysis run (Table 1).The user is notified via an email that includes links to preview and download of the results.All results are also available in plain text format and can be exported into a spreadsheet for further analysis.

Website
MutantElec is freely available for non-commercial use at http:// structuralbio.utalca.cl/mutantelec/.This server is supported by the Center of Bioinformatics at the University of Talca, and will be constantly updated and maintained to ensure reliable and continuous operation.
EXAMPLE: p53 protein.The tumor suppressor protein p53 is a small transcription factor (393 amino acid) that binds to specific DNA sequences and regulates the expression of genes involved in DNA repair, cell cycle arrest, and apoptotic cell death. [38]This protein has been shown to play a key role in many human cancers and it is now estimated that approximately 50% of human tumors contain mutations in this gene. [39]It has several functional domains, including a transcriptional activation domain at its N-terminus, a sequence- File containing information about atomic charge and radius information, necessary for the calculations with APBS software.

_aminopot.txt
Electrostatic potential for each amino acid expressed in mVolts.

_atompot.txt
Electrostatic potential for each atom expressed in mVolts.

_map.dx
Electrostatic maps to display in Pymol or VMD _ distances.txt File with the distances from the mutated residue and the remaining amino acids of the protein _detailedValues.csvElectrostatic potential differences between the input and mutant protein _testResults.txt File with the result of the Wilcoxon Signed-Rank Test specific DNA-binding domain (core domain), an oligomerization domain and a regulatory basic domain at the C-terminus.A wide range of structural and computational studies have contributed to unraveling the structural basis of activation and DNA binding. [38,40,41]lterations in the human p53 protein have been shown to result in a partial or total loss of its ability to bind DNA and correlated with increased probabilities of developing tumors. [42]][45][46] However, evaluation of these mutations require complex analyses in the areas of biochemistry, molecular biology and biophysics that are typically time-consuming and are only possible with significant resources. [39]The results of such studies show that certain mutations in p53 can affect its structure or modify its noncovalent interactions, causing conformational changes that lead to non-functional proteins. [39]In turn, tools like CellDesigner, have been used successfully to aid in the selection and evaluation of drug targets for p53. [47]However, this approach does not address the effects of the point mutations introduced on the proteins EP distribution or its potential effects.
As an example of the capabilities of MutantElec, the potential effect of mutating residue Arg249 in the DNA-binding domain of p53 was evaluated.This residue has one of the highest mutation rates present in patients that have Difference in the calculated electrostatic potential between "wild type" p53 and 19 different mutants of residue R249.The change in the profile is shown for every residue located within 15 A ˚of the mutated residue.These residues are ordered according to the distance to the R249 starting from the nearest one.The profiles indicate that the changes have an impact beyond the position of the mutation altering residues found in distinct functional domains of the p53 protein, this is shown in the highest peaks.In the bottom of the figure is shown a schema of the domain and residues present in p53. [50][Color figure can be viewed at wileyonlinelibrary.com] Figure 5. Calculated electrostatic potential for the residue Arg249 and the 19 variants generated with the scanning mode of the MutantElec system using the input PDB_ID:1TUP. [40]It is possible to observe the major change in the electrostatic potential for the residues with negative charge like Asp y Glu.developed cancer. [48]In the IARC TP53 database, [49] the following mutations of residue 249 have been described R249S/G/I/ K/M/N/T/W.Interestingly, although residue R249 does not directly mediate p53 binding to DNA, [46] its mutation inactivates its function, suggesting that alterations of the neighboring regions caused by the point mutation.
Using MutantElec and the PDB_ID 1TUP [28] containing the 3D structure of the human p53 core domain as input protein, all the above mentioned mutations were generated in silico.The Arg249 residue was replaced with all of the other 19 natural amino-acids, using the "scanning" option of the program (Supporting Information Figure 1S, Additional file 1) and the EPs and the difference in EP between the wild type and each mutant versions of p53 were calculated and analyzed.The changes in the EP profiles for residues located less than 15 A åpart from Arg249 are shown in Figure 4.The majority of mutations resulted in negative EP differences in the neighborhood of Arg249, with Glu249 and Asp249 being the two mutations that produced the most significant EP changes in the vicinity of the target residue.The profiles obtained indicate that these changes have an impact beyond the mutated residue, an effect that can also be observed in the EP map (Supporting Information Figure 4S, Additional file 1).Further analyses on the effect of mutations of Arg249 with known inactivating effects on p53 (R249S/G/I/K/M/N/T/W) are displayed in Figure 5.The residues that produced major local changes in the EP were the negatively charged ones (Glu and Asp).The Lys249 mutation, in turn, was the only one to produce a positive change in the local EP.
The analysis with MutantElec also revealed which amino acids were most affected by the in silico mutation procedure.These residues were Tyr163, His168, Met246, Glu171, Ser166, Glu285 in decreasing order of magnitude, all of which have polar side chains with the exception of Met246 (Fig. 6a).The spatial distribution of these residues (Fig. 6b) reveals that they are less than 5 A ˚away from the mutated Arg249 with the sole exception of residue Glu285 (<10 A ˚).According to the IARC TP53 database, the most frequently encountered missense mutations that inactivate p53 are Met246, Glu171, His168, and Tyr163 (Table 2S, Supporting Information).These same four mutations, plus Val173 and Ser166, were uncovered by Muta-ntElec as the ones with the most perturbing effects in the EP of the DNA binding domain, and thus potentially affecting the capacity of p53 to bind its target DNA sequence.This provides an example of the value of MutantElec to predict and explain important biological changes due to single site variations or systematic scanning of perturbations in a protein sequence.

Conclusions
MutantElec is a rapid and simple bioinformatic tool that can be used to predict and evaluate changes in the charge distribution of a protein after mutating, in silico, one or several amino acids.This information can be extremely useful for understanding the contribution and importance of specific amino acids to protein function.In addition, MutantElec can aid the understanding of how mutations can cause malfunctions of enzymes and the resulting physiological changes that cause diseases.It is also a useful tool for the rational design of site-directed mutagenesis experiments.MutantElec is user friendly and can be used by scientists who do not have extensive training in bioinformatics and structural biology.It is expected that MutantElec will be a useful tool for teaching and training in protein science and medicine.

Acknowledgment
MutantElec is supported by Center of Bioinformatics from the University of Talca and will be constantly updated and maintained to ensure reliable operation.

Figure 1 .
Figure 1.Workflow of the MutantElect web application.The three layers are color-coded and the components of each layer are represented accordingly.External software packages used in the pipeline are represented by the parallelogram while in-house developments (scripts and subroutines) are shown as rectangles.INPUT layer (N8 1).Routine used to upload the input PDB file and to define the mutations to be performed.ANALYSIS layer (N8 2).Module employed in the generation of the in silico mutations and the calculation of the electrostatic potential.OUTPUT layer (N8 3).Subroutines in charge of performing the statistical analyses and comparisons and in generating the output.[Color figure can be viewed at wileyonlinelibrary.com]

Figure 2 .
Figure 2. Consulting web interface of MutantElec.a) Analysis options, namely "site-specific" or "specific region" of the target protein and entry form for input data upload.b) Calculation parameters and default values (temperature 5 298.15K (mesophilic conditions); dielectric constant of the water 5 78.54; distance cutoff (radio of selection for the analysis) from the residue mutated 5 15A ˚. c) Site-directed mutations selection scroll-down menu.The option "All (Scanning)" allows the calculation to be repeated 19 times for every other of the 19 possible residues.[Color figure can be viewed at wileyonlinelibrary.com]

Figure 3 .
Figure 3. MutantElec analyzes.a)Combinatorial chart displaying the electrostatic potential of the input and mutant proteins, and the difference in potential between both.b) Scheme of the environment surrounding the mutated residue (Glu100Ile) for the protein Fur PA (PDB_ID:1MZB).[12]In red are residues around the mutated residue (in blue) within the selected sphere of 15A ˚ratio used for the analysis.This distance can be modified by the user in the parameter setting web page with the option "Distance cutoff."[Color figure can be viewed at wileyonlinelibrary.com]

Figure 4 .
Figure 4. Difference in the calculated electrostatic potential between "wild type" p53 and 19 different mutants of residue R249.The change in the profile is shown for every residue located within 15 A ˚of the mutated residue.These residues are ordered according to the distance to the R249 starting from the nearest one.The profiles indicate that the changes have an impact beyond the position of the mutation altering residues found in distinct functional domains of the p53 protein, this is shown in the highest peaks.In the bottom of the figure is shown a schema of the domain and residues present in p53.[50] [Color figure can be viewed at wileyonlinelibrary.com]

Figure 6 .
Figure 6.Analysis of the more significant change for the residue 249 for p53 protein.a) show amino acids that were most affected by the respective in silico mutations with respect to the wild type protein, this are the residues Tyr163, His168, Met246, Glu171, Ser166, Glu285 in decreasing order of magnitude.b) Spatial distribution of these residues, it is possible to observe some of these residues are located at less than 5 A ˚to the mutated Arg249 except the residue Glu285 (>10 A ˚). [Color figure can be viewed at wileyonlinelibrary.com]

Table 1 .
Description of the files that are sent to the user after the calculations have finished.Configuration file for APBS calculations protein.pdbCoordinate file uploaded by the user proteinWT-Mutant.pdbCoordinate file for the mutant protein generated by Mutator.proteinWT-Mutant.pqr