TSTMP: Target Selection for human TransMembrane Proteins

Case studies for modeling TMPs using TSTMP as a starting point

Structure modelling is commonly used for TMPs because the number of experimentally solved structures is rather low. The following examples show that running modelling services with their default settings may lead to erroneous predictions, since these methods do not take the correct topology or the correct alignment of TMHs into account. We also show that the accuracy of these predictions can be increased by using TSTMP to suggest possible template structures.

Figure 1:
Homology modeling of the V-type proton ATPase

For homology modeling

We used SWISS-MODEL [1] to build a homology model of the V-type proton ATPase 21 kDa proteolipid subunit (UniProt ID: VATO_HUMAN), which consists of 5 TMSs [2]. The server offers several modes, including automatic template search and alignment. However, when this mode was run with the sequence alone, the built-in search of SWISS-MODEL returned structures with 4 or fewer TMSs, in which at least the first TMS was missing; the resulting alignments and models were therefore erroneous (see the model and the alignment of the first hit (PDB: 3j9v_X) in Supplementary Figure 3A). To overcome this problem, we used the structure suggested by TSTMP (PDB: 4wib_B) together with the alignment from TMFoldWeb and ClustalO. When this alignment was supplied through the "target-template alignment" option, SWISS-MODEL successfully modelled the protein with all 5 TMSs (Supplementary Figure 3B). Note that, in contrast to ABCG2 (see below), we have no initial hints about the structure of this protein; however, it is plausible that the model based on TSTMP’s suggestion is closer to the as yet unsolved structure than the default one.
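The kind of problem described above (a template that leaves the first TMS unaligned) can be detected before a model is built. The following is a minimal sketch, not part of TSTMP or SWISS-MODEL: the function name `tms_coverage`, the toy sequences, the segment positions and the 0.8 cut-off are all hypothetical and chosen only for illustration.

```python
def tms_coverage(target_aln, template_aln, tm_segments):
    """Per TM segment, return the fraction of target residues that are
    aligned to a template residue rather than to a gap.

    target_aln, template_aln: equal-length aligned strings, '-' marks a gap.
    tm_segments: (start, end) pairs, 1-based inclusive, in ungapped target
    numbering.
    """
    if len(target_aln) != len(template_aln):
        raise ValueError("aligned sequences must have equal length")
    # For every ungapped target position, record whether the template
    # has a residue in the same alignment column.
    aligned = [s != "-" for t, s in zip(target_aln, template_aln) if t != "-"]
    return [sum(aligned[start - 1:end]) / (end - start + 1)
            for start, end in tm_segments]


# Toy alignment: hypothetical sequences, NOT VATO_HUMAN or 4wib_B.
target = "MLLLLVVVVGGKKAAAAIIIIPP"
template = "-------VVGGKRAAAALIIIPP"
segments = [(2, 9), (14, 21)]  # hypothetical TM segments of the target

coverage = tms_coverage(target, template, segments)
# Segments with less than 80% aligned residues (arbitrary cut-off).
poorly_covered = [i + 1 for i, c in enumerate(coverage) if c < 0.8]
```

In this toy case the first segment is only 25% covered and is flagged, which mirrors the situation where the automatically selected template lacks the first TMS.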




For ab initio structure prediction

Figure 2:
De novo prediction of ABCG2 protein using EVFold

We used EVFold to predict the structure of the ABCG2 protein (UniProt ID: ABCG2_HUMAN). The pre-calculated structure of ABCG2 on the EVFold website [3] consists of seven transmembrane regions, whereas the correct topology has six transmembrane helices; the provided structure is therefore wrong.

When we used the alignment of the ABC2_membrane Pfam family (PF01061), EVFold produced a structure with five transmembrane helices, even when the correct topology with six transmembrane helices was supplied, because the PF01061 family alignment covers only five transmembrane helices.
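Whether a family alignment spans every helix of the target can be checked from residue ranges alone. The sketch below assumes the alignment covers one contiguous range of the target; the function name `helices_covered`, the coordinates and the 0.8 cut-off are invented for illustration and are not the actual ABCG2 or PF01061 boundaries.

```python
def helices_covered(helices, aln_start, aln_end, min_fraction=0.8):
    """For each helix, report whether at least min_fraction of its residues
    fall inside the aligned range [aln_start, aln_end] (1-based, inclusive)."""
    flags = []
    for start, end in helices:
        # Number of helix residues that overlap the aligned range.
        overlap = min(end, aln_end) - max(start, aln_start) + 1
        flags.append(max(overlap, 0) / (end - start + 1) >= min_fraction)
    return flags


# Hypothetical helix positions and alignment range, for illustration only.
helices = [(10, 30), (45, 65), (80, 100), (110, 130), (140, 160), (175, 195)]
flags = helices_covered(helices, aln_start=40, aln_end=200)
n_covered = sum(flags)
```

With these made-up numbers only five of the six helices lie inside the aligned range, so any method relying on this alignment has no information about the first helix.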

Finally, we used the correct topology with six transmembrane helices together with an alignment containing enough sequences over all six TMSs, and raised the parameter ‘Maximum gaps allowed in a column to be included in EC calculation (%)’ from the default 30% to 70%. This resulted in a structure that could be structurally aligned to the newly solved structure of the human ABCG5 protein (PDB ID: 5do7) with an RMSD of 5.549 (Figure 2; 5do7: magenta, ABCG2: yellow). (The EVFold results can be downloaded from here).
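The effect of the gap threshold can be illustrated with a small column filter. This is a sketch of the general idea only, not EVFold's implementation: the function name `columns_kept`, the toy alignment and the choice of an inclusive comparison (`<=`) are assumptions.

```python
def columns_kept(msa, max_gap_percent):
    """Return the indices of alignment columns whose percentage of gaps
    does not exceed max_gap_percent."""
    n_seqs = len(msa)
    kept = []
    for col in range(len(msa[0])):
        gaps = sum(1 for seq in msa if seq[col] == "-")
        if 100.0 * gaps / n_seqs <= max_gap_percent:
            kept.append(col)
    return kept


# Toy alignment (hypothetical): the first two columns are gapped in 60%
# of the sequences, as can happen for a poorly covered terminal helix.
msa = [
    "MKLVAGTL",
    "--LVAGTL",
    "M-LVAG-L",
    "--LIAGTL",
    "-KLVSGTL",
]
strict = columns_kept(msa, 30)   # threshold at the default value
relaxed = columns_kept(msa, 70)  # threshold at the raised value
```

At 30% the two sparsely populated columns are dropped, while at 70% all eight columns are retained. In the same way, a higher threshold lets columns from a weakly covered helix contribute to the calculation, at the cost of including noisier columns.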

These results show that modeling of transmembrane proteins as accurately as possible requires precise knowledge of the topology of the transmembrane domain and a multiple sequence alignment that covers all of the known or predicted transmembrane regions.



References

  1. Biasini M, Bienert S, Waterhouse A, Arnold K, Studer G, Schmidt T, et al. SWISS-MODEL: modelling protein tertiary and quaternary structure using evolutionary information. Nucleic Acids Res. 2014;42: W252–8. doi:10.1093/nar/gku340
  2. Flannery AR, Graham LA, Stevens TH. Topological characterization of the c, c′, and c″ subunits of the vacuolar ATPase from the yeast Saccharomyces cerevisiae. J Biol Chem. 2004;279: 39856–62. doi:10.1074/jbc.M406767200
  3. Marks DS, Hopf TA, Sander C. Protein structure prediction from sequence variation. Nat Biotechnol. 2012;30: 1072–80. doi:10.1038/nbt.2419
