Usage of the server
- Browse menu
- The database can be browsed by identifier, number of transmembrane segments, evidence level, TargetTrack status and cluster ID. Entries are selected by ticking the box next to them; ticking the checkbox in the header selects all entries in the category. The selected items can be downloaded by clicking the 'Download selected item(s)' button.
- Download menu
- Besides downloading entries as separate XML files from the Browse menu, users can download selected subsets (for example all 3D or all modelable proteins) as a single XML file from the Download menu.
- Search by sequence
- Sequences can be submitted in FASTA format. In addition to the identical sequence, similar sequences are searched for using BLAST.
- Search by identifier
- The user can enter any identifier, or part of an identifier, that is indexed in the TSTMP database, including a TSTMP ID (e.g. TSTMP_004387), HTP ID (e.g. HTP_004387), UniProt ID (e.g. TSN2_HUMAN), UniProt AC (e.g. O60636) or PDB/PDBTM identifier (e.g. 3odu).
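The identifier types listed above can be told apart by their shape. The following is a minimal sketch in Python of how a client script might classify a complete identifier before submitting it. It is not the server's own code. The TSTMP, HTP and PDB patterns are assumptions inferred from the examples above (six digits after the prefix, four characters starting with a digit). The UniProt accession pattern follows the format published by UniProt. The names `PATTERNS` and `classify_identifier` are hypothetical.

```python
import re

# Patterns are illustrative assumptions inferred from the example identifiers,
# except the UniProt accession pattern, which follows UniProt's documented format.
PATTERNS = [
    ("TSTMP ID", re.compile(r"TSTMP_\d{6}")),
    ("HTP ID", re.compile(r"HTP_\d{6}")),
    ("UniProt AC", re.compile(
        r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}")),
    ("UniProt ID", re.compile(r"[A-Z0-9]{1,5}_[A-Z0-9]{1,5}")),
    ("PDB/PDBTM ID", re.compile(r"[0-9][A-Za-z0-9]{3}")),
]


def classify_identifier(text):
    """Return the label of the first pattern matching the whole query, or None."""
    query = text.strip()
    for label, pattern in PATTERNS:
        if pattern.fullmatch(query):
            return label
    return None
```

This sketch only recognizes complete identifiers. The search form itself also accepts parts of identifiers, which a classifier of this kind cannot label reliably.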
- Search by number of homologues
- The user can specify the number of homologues of the target proteins.
- Search by number of TargetTrack entries
- The user can specify the number of homologues in the TargetTrack database.
- Search by TargetTrack status
- The user can specify the status of TargetTrack homologues. Multiple options can be selected at the same time.
- Search result panel
- The results of the various searches can be seen in the search result panel (Figure 1).
It contains the following columns:
- Identifier: unique TSTMP identifier, which is numbered the same way as entries in the HTP database.
- Name: Name of the protein
- TM segments: number of transmembrane segments predicted by CCTOP
- TargetTrack: summary of TargetTrack homologue status. The colored bars show the progress achieved on the protein: selected (red), soluble (yellow), crystallized (blue), diffraction data available (green), modeled (grey), work stopped (black).
- Homologues (targets): number of homologues of the protein, with the number of target homologues in parentheses.
- Cluster: Proteins were divided into clusters based on sequence identity. Proteins from the same cluster are likely to be homologues.
- Entry viewer panel
This panel summarizes the information collected and/or predicted for each protein in the human transmembrane proteome.
- Identifier: TSTMP identifier of the browsed protein
- Evidence: the evidence level of the entry (3D/modelable/target)
- Name: Name of the protein
- Cross-references: cross references to UniProt, PDBTM (only available in case of '3D' or 'modelable' evidence level), HTP and TargetTrack databases.
- Cluster: cluster ID, size of the cluster, and list of proteins with the most homologues from this cluster (the cluster size and the number of homologues are not equal, as the most distant relatives cannot be used to model each other).
- Sequence: amino acid sequence of the protein.
- by 'most wanted'
- The 'most wanted' list shows how the human transmembrane proteome could be modeled in the minimum number of steps (i.e. with the smallest number of crystallized proteins). We sorted the proteins by the number of target homologues they have and selected the one with the most relatives. We then removed all entries from the list that could be modeled based on the selected protein(s) and iterated this process until no human transmembrane protein was left.
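The selection procedure described above is a greedy covering strategy. The following is a minimal sketch in Python under stated assumptions. The input format (a mapping from each protein to the set of proteins that could be modeled from its structure) and the function name `most_wanted` are hypothetical. It is also an assumption that the homologue counts are recomputed against the remaining proteins in every round, and that ties are broken alphabetically.

```python
def most_wanted(homologues):
    """Greedy ordering of proteins by how many unmodeled proteins they cover.

    `homologues` maps each protein ID to the set of protein IDs that could be
    modeled from its structure (hypothetical input format).
    Returns a list of (protein, newly_covered_count) in selection order.
    """
    remaining = set(homologues)
    ranking = []

    def coverage(protein):
        # A solved protein covers itself plus its homologues,
        # counting only proteins that are still unmodeled.
        return (homologues[protein] | {protein}) & remaining

    while remaining:
        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(remaining), key=lambda p: len(coverage(p)))
        covered = coverage(best)
        ranking.append((best, len(covered)))
        remaining -= covered
    return ranking


example = {
    "A": {"B", "C"}, "B": {"A"}, "C": {"A"},
    "D": {"E"}, "E": {"D"}, "F": set(),
}
print(most_wanted(example))  # [('A', 3), ('D', 2), ('F', 1)]
```

A greedy choice of this kind does not guarantee the smallest possible set of structures, but it is a common and simple approximation for covering problems.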
- by evidence
- We defined three types of evidence for the entries. Proteins that could be matched with at least one PDB entry were categorized as '3D'. Proteins that could be assigned to a PDBTM structure by TMFoldRec and HHBlits received evidence type 'modelable'. The rest of the entries received evidence type 'target' (see Description of the database for details).
- by number of homologues
- The list of the most wanted transmembrane proteins shows how the human transmembrane proteome could be modeled in the minimum number of steps (i.e. with the smallest number of crystallized proteins). We sorted the proteins by the number of target homologues they have and selected the one with the most relatives. We then removed all entries from the list that could be modeled based on the selected protein(s) and iterated this process until no human transmembrane protein was left. The upper figure shows the cluster sizes from the most wanted list. The lower figure depicts the cumulative number of homologues, i.e. how the coverage of the as yet unmodeled human transmembrane proteome changes when structures are solved from the most wanted list (red), from those proteins that can be found in the TargetTrack database (blue) or from targets chosen at random (orange). According to the graph, 1546 and 1731 of the uncharacterized proteins could be modeled if the structures of the first 50 and 100 proteins from the list were solved, respectively.
- by TargetTrack status
- TSTMP sequences were searched against TargetTrack trial sequences to find homologous entries. The bars show the distribution of TSTMP sequences based on their TargetTrack status (if available).
- by the growing number of 3D structures
- Cumulative number of new transmembrane protein structures since 2007 according to PDBTM (blue). Since these data are highly redundant (e.g. the same protein was often crystallized multiple times), the number of different proteins related to these structures is shown as well (red). Finally, the number of different human proteins revealed by these PDB entries is shown (green). Exponential curves were fitted to each plot in a similar way as described by Dickerson in the PDB Newsletter (1978), using the year in the exponent, starting from 2007 (black dashed lines). The equations are $e^{0.1712\,(\mathrm{Year}-2006)+6.0812}$, $e^{0.1425\,(\mathrm{Year}-2006)+5.2258}$ and $e^{0.2504\,(\mathrm{Year}-2006)+2.2690}$ for 3D, unique 3D and human unique 3D transmembrane structures, respectively. Considering only polytopic transmembrane proteins, the equations are $e^{0.1965\,(\mathrm{Year}-2006)+5.5821}$, $e^{0.1684\,(\mathrm{Year}-2006)+4.6498}$ and $e^{0.3448\,(\mathrm{Year}-2006)+0.8605}$ for 3D, unique 3D and human unique 3D transmembrane structures, respectively. According to this figure, even though the number of solved membrane protein structures grows exponentially, the curve is rather flat when only different proteins are considered.
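To make the fitted curves easier to compare, the sketch below evaluates them in Python and derives the doubling time implied by each slope. It assumes that the whole expression, slope times (Year-2006) plus intercept, sits in the exponent, which is how the equations are read here. The coefficients are copied from the text for all transmembrane proteins. The names `FITS`, `fitted_count` and `doubling_time` are hypothetical.

```python
import math

# (slope, intercept) pairs copied from the fitted equations in the text
# (all transmembrane proteins; the polytopic-only fits are analogous).
FITS = {
    "3D": (0.1712, 6.0812),
    "Unique 3D": (0.1425, 5.2258),
    "Human unique 3D": (0.2504, 2.2690),
}


def fitted_count(year, slope, intercept):
    """Value of the fitted curve exp(slope * (year - 2006) + intercept)."""
    return math.exp(slope * (year - 2006) + intercept)


def doubling_time(slope):
    """Years needed for the fitted curve to double: ln(2) / slope."""
    return math.log(2) / slope


for name, (slope, intercept) in FITS.items():
    print(f"{name}: fitted value for 2007 = {fitted_count(2007, slope, intercept):.0f}, "
          f"doubling time = {doubling_time(slope):.1f} years")
```

The doubling time depends only on the slope, so it is a convenient single number for comparing how quickly each curve rises.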
The search result panel
The entry viewer panel
TargetTrack statuses
|selected, cloned or expressed|
|solubilized or purified|
|crystallized or HSQC satisfactory|
|XRAY, NMR or ERAY data collected|
|in structure database|