Home  |   Tutorials  |   DynOmics 1.0  |   Yang's Lab  |   Server Status  |   Example


1. Workflow

2. User submission

3. Result rendering

4. Algorithm

5. References


1.      Workflow

The workflow in DRDOCK can be summarized into four modules: structure preprocessing, docking/ranking 2016 FDA-approved drugs, 10 ns NPT MD simulation in water, and result rendering (Figure 1). The user-submitted protein target in PDB format is first preprocessed before docking, including the determination of the pKa and protonation state of ionizable residues, optimizing the orientation of rotatable residue side-chains, and disulfide bond detection [1]. As an optional service, the user can choose to relax the target structure, without bound co-crystallizers, by 1 ns MD simulation in water before docking. The service then docks 2016 FDA-approved drugs to the target, and features inferred from docking poses are used to calculate the LOD scores for drug ranking [1]. Users then view the docking poses online on the results page and submit the interested docking poses for 10 ns NPT MD simulations of the protein-drug complexes in water. The resulting trajectories are used for assessing the drug binding stability (i.e., whether the drugs stay in the binding site through the simulation) and binding free energy by MM/GBSA [1]. The trajectories are then updated to the results page for the playback of manipulatable 3D animations.


Figure 1. The workflow in DRDOCK. DRDOCK begins with preprocessing the user-submitted protein target. Users can choose to directly use the input structure for docking or optionally perform a one ns MD simulation to minimally relax the structure in water before docking. The preprocessed structure is then used to dock 2016 FDA-approved drugs, where the LOD scores calculated from the docking poses are used for drug ranking. Users then view the docking poses on the online results page and pick interesting docking poses to simulate the dynamic protein-drug interaction in 10 ns NPT MD simulations with water. The resulting trajectory is used for binding free energy calculation by MM/GBSA and can be played on the online results page. The figure is reproduced from Tsai & Yang, 2021 [1].


2.      User submission

DRDOCK requires users to submit a well-prepared single-chain protein structure in PDB format and the corresponding residues IDs of the target sites, such as active sites for enzymes, that are used to identify the closest docking poses the users should interest in (Figure 2). Since the integrity of the protein structure is essential for meaningful MD simulations, the protein structure should be prepared for preventing any missing backbone heavy atoms (CA, C, N, O). The missing of any side-chain heavy atoms is allowed since they can be automatically patched according to pre-defined amino acid templates but may slightly distort the predicted binding poses in docking. DRDOCK requires the protein structure to only comprise 20 standard amino acids, of which the force constants and atom charges are already well developed and calibrated in AMBER ff14SB force field (Maier et al., 2015) [2]. For those structures containing some of the standard amino acid-derived residues, DRDOCK can tolerate that by mutating the residues back to their closest standard amino acids. Otherwise, the users should manually modify and mutate back those residues according to the original protein sequence. The most convenient way to achieve this is to submit the desired target protein structure with non-standard amino acids with its corresponding original peptide sequence to the SWISS-MODEL web server (Waterhouse et al., 2018) [3] under the “template mode”, which can automatically mutate any residue that is different from the original peptide sequence back and patch any missing heavy atoms, which is our recommended way to prepare a submission-ready target structure.


Figure 2. The user submission form and results page of DRDOCK. (A) The job submission form in the main page of DRDOCK. The user is required to provide a well-prepared single-chain protein structure (“PDB file”). The user can assign a specific chain ID, model number for NMR-resolved structures, a desired segment in the specified protein chain (in “Residues range”). Users are required to specify the IDs of residues that comprise a target site (e.g. enzyme active site residues). Lastly, the user should provide a valid email address for receiving the job status and results from DRDOCK. The orange fields are required while grey fields are optional. (B) The results page of DRDOCK. The drugs are rank-ordered based on our ranking algorithm that ranks docked poses as described in Tsai & Yang, 2021 [1]. The drug name, 2D structure, description, basic properties, and PubChem link are shown in the drug panel. The rank-ordered drug list in CSV format can be downloaded (the download button in the purple bar atop). To submit poses for MD simulations, click the flask icon to enter the simulation submission mode, where a submission panel and a “submit” button will show up in the functional bar atop (beneath the navigation bar). A checkmark also shows up and replaces the flask icon for chosen drugs. Users can also switch the docking poses to submit different poses for the same drug. Finally, click the “submit” button in the submission panel to submit all the checked poses to the web server. Users can browse the rank-ordered drugs based on the simulation results [1] by toggling the “MD” at the upper right corner (highlighted in the black box in Figure 2B). Other details about DRDOCK, including that our file parser can allow several types of non-standard amino acids, can be found in Tsai & Yang, 2021 [1]. The figure is reproduced from Tsai & Yang, 2021 [1].


3.      Result rendering

DRDOCK is designed for allowing real-time access to the latest calculation results while the entire drug screening is still in the process. The drugs and their docking poses shown in the results page are rank-ordered based on the LOD scores, described in Tsai & Yang, 2021 [1]. The drug name, 2D structure, drug properties, and external link to PubChem (Kim et al., 2019) [4] can be viewed in each drug panel (Figure 3, top). Docking poses for a drug can be selected by the switch at the top right (Figure 3, top). Any docking pose can be further submitted for a 10 ns MD simulation (Figure 2 and 3). When the simulation is finished, a superposed trajectory can be visualized in the embedded NGL viewer by clicking the “view trajectory” button (Figure 3, bottom), and the corresponding features and the estimated binding free energies for each frame can be seen in the floated left panel (Figure 3, bottom). For a better use of the computing resources, simulations for at most 20 selected poses can be submitted for each uploaded protein target. Finally, a complete list of rank-ordered drugs, all the docking poses, MD simulation trajectories, calculated features can be downloaded by clicking the “download” button (Figure 2 and 3).


Figure 3. The functionality and info-display in the drug panel. The user can visualize the docking pose by clicking the eye-shape “view docking pose” button, and an extended space showing the binding mode of protein and drug will show up (top), clicking the “view docking features” button would render us a list of properties derived from the docking results (middle). Clicking the video-shape of “view trajectory” button pops open an NGL viewer (Rose et al., 2018) [5] and plays the MD animation for the interacting drug-target complex if the pose were chosen for simulations. With this, MM/GBSA-estimated binding free energy is listed on the side, which is important for re-ranking the docking results. The pose and trajectory can be downloaded from the “download” button. The figure is reproduced from Tsai & Yang, 2021 [1].


4.      Algorithm

We treat the drug screening problem as a ranking problem to reduce the disparity coming from different docking software used, different sampling depth and parameterization details of the drug force field, etc. The ranking of drugs is decided by a score based on the distance of pose features’ statistics distributions, sampled within our benchmark set, from the true binders and the decoys. The features include the original value of Vina-predicted binding affinity (affinity), the distance between COMs of a drug pose and assigned target residues (termed as “COM distance”), and the number of poses belonging to the top ten of best affinities in the group clustered based on an adjacency-based clustering method [1] (adjacency-based clustering). To get such statistics, we pooled all the generated poses from the 16 protein-drug complexes in the training set, derived from a 2016 drug library (see Methods in [1]), and derived the distribution of the true binding poses and decoys for each of the features associated with each pose (Figure 4). These distributions provided the statistical basis to annotate a sampled docking pose to be a true binder or decoy, based on its relative probability. Specifically, we calculated a log-odds (LOD) score for each of the sampled poses according to the following equation (Eq. 1):




, where x represents a sampled docking pose; f represents one of the specified features, e.g. the pose affinity, the pose’s distance to the target site, and the size of the poses cluster.  is the value of feature f for the pose  located.  and are the probability distributions of feature f for docking poses sampled from the true binders and the decoys, respectively. Noted that  are binned values and therefore different  values within the same bin have the same  or values. If there is no count for a bin, a small probability (10-7) is assigned for that bin. The LOD score takes the idea of relative entropy, representing how more or less likely a given sampled pose can result from a true binder than from a decoy. A higher LOD score means the pose is more likely from a true binder. The 2016 drugs are then rank-ordered based on their best LOD score among the 20 sampled poses.


Figure 4. The difference in the distributions of the affinities (top), COM distance (middle), and adjacency-based clustering (bottom) between the “answer” poses (the true binder pose, left columns) and decoys (right columns).


While docking serves as a quick screening tool for in silico drug screening, as a trade-off, the lack of considering the solvation effect and molecular dynamics involved in the protein-ligand interaction introduces false positives [6]. Explicit solvent MD simulations using classical force fields optimized for biological molecules are recognized to better describe the protein-ligand binding dynamics in a timescale that allowed for weak binders to dissociate [7, 8, 9, 10]. The MD simulations for user-selected protein-pose complexes were performed in an explicit solvent, 310K, and 1 atm NPT ensemble for 10 ns after preliminary energy minimization and equilibration of the system. During the simulation, a snapshot is taken every 0.1 ns, resulting in a trajectory composed of 100 frames. The simulation results were summarized using the designed indicators, including the mean binding free energy from MM/GBSA calculation over sampled snapshots, the drug leaving time when the drug center of mass moving away from the target site more than 10 Å, and the largest distance of the drug COM to the target sites sampled during the simulations. A simulated drug got a higher rank if having a favorable mean binding free energy and staying long enough in the target site (or never leaving) during the 10 ns simulation. For more detail, please refer to [1].


5.      References

[1] Tsai, K.-L., Yang, L.-W., 2021. DRDOCK: A drug repurposing platform integrating automated docking, simulations and a log-odds-based drug ranking scheme. doi:10.1101/2021.01.31.429052

[2] Maier, J.A., Martinez, C., Kasavajhala, K., Wickstrom, L., Hauser, K.E., Simmerling, C., 2015. ff14SB: Improving the Accuracy of Protein Side Chain and Backbone Parameters from ff99SB. Journal of Chemical Theory and Computation 11, 3696–3713. doi:10.1021/acs.jctc.5b00255

[3] Waterhouse, A., Bertoni, M., Bienert, S., Studer, G., Tauriello, G., Gumienny, R., Heer, F.T., de Beer, T.A.P., Rempfer, C., Bordoli, L., Lepore, R., Schwede, T., 2018. SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids Research 46. doi:10.1093/nar/gky427

[4] Kim, S., Chen, J., Cheng, T., Gindulyte, A., He, J., He, S., Li, Q., Shoemaker, B.A., Thiessen, P.A., Yu, B., Zaslavsky, L., Zhang, J., Bolton, E.E., 2018. PubChem 2019 update: improved access to chemical data. Nucleic Acids Research 47. doi:10.1093/nar/gky1033

[5] Rose, A.S., Bradley, A.R., Valasatava, Y., Duarte, J.M., Prlić, A., Rose, P.W., 2018. NGL viewer: web-based molecular graphics for large complexes. Bioinformatics 34, 3755–3758. doi:10.1093/bioinformatics/bty419

[6] Pantsar, T., Poso, A., 2018. Binding Affinity via Docking: Fact and Fiction. Molecules 23, 1899. doi:10.3390/molecules23081899

[7] Huang, D., Caflisch, A., 2011. Small Molecule Binding to Proteins: Affinity and Binding/Unbinding Dynamics from Atomistic Simulations. ChemMedChem 6, 1578–1580. doi:10.1002/cmdc.201100237

[8] Liu, K., Kokubo, H., 2017. Exploring the Stability of Ligand Binding Modes to Proteins by Molecular Dynamics Simulations: A Cross-docking Study. Journal of Chemical Information and Modeling 57, 2514–2522. doi:10.1021/acs.jcim.7b00412

[9] Liu, P.-F., Tsai, K.-L., Hsu, C.-J., Tsai, W.-L., Cheng, J.-S., Chang, H.-W., Shiau, C.-W., Goan, Y.-G., Tseng, H.-H., Wu, C.-H., Reed, J.C., Yang, L.-W., Shu, C.-W., 2018. Drug Repurposing Screening Identifies Tioconazole as an ATG4 Inhibitor that Suppresses Autophagy and Sensitizes Cancer Cells to Chemotherapy. Theranostics 8, 830–845. doi:10.7150/thno.22012

[10] Mollica, L., Decherchi, S., Zia, S.R., Gaspari, R., Cavalli, A., Rocchia, W., 2015. Kinetics of protein-ligand unbinding via smoothed potential molecular dynamics simulations. Scientific Reports 5. doi:10.1038/srep11539



Reference:  Tsai,KL. and and Yang,LW. (2021) DRDOCK: A DrugRepurposing platform integrating automated docking, simulation and a log-odds-based drug ranking scheme. doi: 10.1101/2021.01.31.429052 (To be peer-reviewed)

Contact:  The server is maintained by the Yang Lab at the Institute of Bioinformatics and Structural Biology at National Tsing Hua University, Taiwan.

For questions and comments please contact Kun-Lin Tsai.