DRDOCK

Tutorials

 

1. Submit protein target for drug screening

2. Reassess the drug binding by MD simulations

3. User’s guide

3-1. Workflow

3-2. User submission

3-3. Result rendering

3-4. Algorithm

4. References

 

1.      Submit protein target for drug screening

DRDOCK automatically screens 2016 FDA-approved drugs against a user-submitted protein target. DRDOCK accepts two types of submission: 1) a 3D protein structure file in PDB format and 2) a protein sequence written in one-letter amino acid codes. In this tutorial, we target the SARS-CoV-2 non-structural protein 16 (Nsp16) to find drugs that compete for the binding site of the methyl donor S-adenosylmethionine (SAM), which is required for the post-transcriptional modification of the 5’ cap structure of nascent viral mRNA.

 

To submit 1) a 3D protein structure, a file containing the atom coordinates of the protein target in standard PDB format is required. An example PDB file of SARS-CoV-2 Nsp16 can be downloaded from the Protein Data Bank (PDB code: 6WVN) or directly by this link. The PDB file contains two peptide chains, one for Nsp16 (chain A) and the other for non-structural protein 10 (Nsp10, chain B), co-crystallized with several small molecules and ions, including SAM, which is assigned to chain A with a residue ID of 7112. This information can be found on the PDB page, where the structure is shown in the 3D structure viewer. This PDB file is used as the starting point for DRDOCK submission.
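To locate a co-crystallized ligand such as SAM without the online viewer, the HETATM records of the downloaded file can be scanned directly. The following is a minimal sketch, not part of DRDOCK; the function name `list_hetero_residues` is hypothetical, and the code only assumes the fixed-column layout of the standard PDB format.

```python
from collections import defaultdict


def list_hetero_residues(pdb_text):
    """Return {chain_id: sorted list of (residue_name, residue_id)} for HETATM records.

    Hypothetical helper for illustration; water molecules (HOH) are skipped.
    """
    found = defaultdict(set)
    for line in pdb_text.splitlines():
        if not line.startswith("HETATM"):
            continue
        res_name = line[17:20].strip()  # columns 18-20: residue name
        chain_id = line[21]             # column 22: chain identifier
        res_id = int(line[22:26])       # columns 23-26: residue sequence number
        if res_name != "HOH":
            found[chain_id].add((res_name, res_id))
    return {chain: sorted(residues) for chain, residues in found.items()}
```

Calling `list_hetero_residues(open("6wvn.pdb").read())` (the file name is an assumption) would list the hetero residues of each chain, which is one way to confirm the residue ID of the ligand before filling in the submission form.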

 

Now, visit the home page of DRDOCK, which by default presents a form for protein structure file submission. Locate the button in the first item of the form, labeled “PDB file” and marked with an orange brick on the left, which indicates that the field is a required input. Click the upload button and choose the downloaded PDB file; the file name will then be shown to the right of the button.

 

Our target protein, Nsp16, is present in the PDB file as chain “A”. For a PDB file containing multiple peptide chains, specify the chain ID, which is “A” in our case, in the left field of the second item, labeled “Chain ID”. If this optional field is left empty, DRDOCK selects the first peptide chain in the PDB file by default, which is still chain “A” in our case.

 

Next, DRDOCK requires the user to assign one or more target residues, whose center of mass defines the target site where the drugs are expected to dock. Enter the target residue in the field of the fourth item as “7112”, the residue ID of SAM in the PDB file. To define the target site with multiple residues (for example, a catalytic triad), type the residue IDs separated by commas (,), or use a dash (-) for a range of consecutive residue IDs.
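The residue ID syntax and the center-of-mass definition of the target site can be sketched as follows. This is an illustration under stated assumptions, not DRDOCK's own parser: the function names are hypothetical, residue IDs are assumed to be positive integers, and the center of mass is assumed to be mass-weighted over all listed atoms using a small element mass table.

```python
# Approximate atomic masses for common elements (illustrative table).
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "S": 32.06, "P": 30.974}


def parse_residue_ids(spec):
    """Expand a string such as "41,145,163-166" into a sorted list of residue IDs."""
    ids = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            # A dash denotes an inclusive range of consecutive residue IDs.
            start, end = (int(p) for p in part.split("-"))
            ids.update(range(start, end + 1))
        else:
            ids.add(int(part))
    return sorted(ids)


def center_of_mass(atoms):
    """Return the mass-weighted mean position of (element, x, y, z) tuples."""
    total = sx = sy = sz = 0.0
    for element, x, y, z in atoms:
        mass = ATOMIC_MASS[element]
        total += mass
        sx += mass * x
        sy += mass * y
        sz += mass * z
    return (sx / total, sy / total, sz / total)
```

For the tutorial input, `parse_residue_ids("7112")` gives a single residue, while a string such as `"10-12,20"` expands to residues 10, 11, 12, and 20.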

 

If you would like to receive an email notification with the link to the result page after a successful submission and again when the drug screening is completed (see below), enter an email address in the optional field of the fifth item of the form. If you would like DRDOCK to perform a short MD simulation to relax the protein structure in a water box before the drug screening, click the “Relax the structure by MD before docking” option.

 


 

 

Click the “submit” button below the form for job submission. A processing icon will appear with the status indicated below.

 


 

If the input protein structure and the provided information are processed successfully, DRDOCK automatically redirects to the result page at the URL “https://dyn.life.nthu.edu.tw/drdock/results/<your job ID>/”. You can note down the job ID and revisit the result page later, after the drug screening is completed.

 


 

If you provide an email address in the submission form, you will also receive an email notification once your input has been successfully processed, with the link to the result page included.

 


 

To check the job status, click the “Server Status” tab in the navigation bar above. All of the submitted job IDs queued or running in the server, as indicated by “Q” or “R” in the “Status” column, are listed.


 

If the submission fails to be processed, a notification page will be shown, where the possible reasons and solutions are provided, with the processing log and error messages presented in the box below.


 

To submit 2) a protein target sequence, provide a peptide sequence written in one-letter amino acid codes. An example in FASTA format can be found under PDB code 6WVN and accessed by this link. Copy the first two lines, since the Nsp16 protein is chain “A”. Go to the home page and click “AlphaFold2 modeling” below the submission form to switch to sequence submission mode. Paste the copied FASTA sequence into the text field below the “Protein sequence” label.

 

Because the FASTA sequence of the protein target does not contain the coordinates of the SAM molecule, we instead use residue IDs 104 and 135, whose center of mass approximates the location of SAM and serves as the target site. Note that residue IDs are numbered according to the submitted peptide sequence, starting from 1. Click “Submit” to submit the job. The target protein structure is modeled from the submitted sequence by AlphaFold2 before the subsequent drug screening.
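Since residue IDs in sequence mode count from 1 along the submitted sequence, it can help to check which amino acids a given pair of IDs points to before submitting. The sketch below is illustrative only; both function names are hypothetical and the parser handles plain FASTA text with `>` header lines.

```python
def read_first_fasta_record(fasta_text):
    """Return (header, sequence) of the first record in a FASTA-formatted string."""
    header = None
    chunks = []
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                break  # a second header marks the end of the first record
            header = line[1:]
        elif header is not None:
            chunks.append(line)
    return header, "".join(chunks)


def residues_at(sequence, residue_ids):
    """Map 1-based residue IDs to their one-letter amino acid codes."""
    result = {}
    for residue_id in residue_ids:
        if not 1 <= residue_id <= len(sequence):
            raise ValueError(f"residue ID {residue_id} is outside 1..{len(sequence)}")
        result[residue_id] = sequence[residue_id - 1]  # convert to 0-based index
    return result
```

With the first record of the 6WVN FASTA file as input, `residues_at(sequence, [104, 135])` would report the two residues used as the target site in this tutorial.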

 


 

Once the drug screening is completed, the state icon at the top left, below the purple functional bar, will change from “Running” to “Complete”.

 

 

If an optional email address is provided in the submission form, an email notification of job completion will also be sent.

 

 

The drug screening results are presented in the “docking” tab of the top-right toggle. Drugs are listed in ranked order according to the LOD score of each drug's top-ranked pose and the ranking rules of DRDOCK.
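As a rough illustration of this ordering, drugs can be sorted by the best LOD score found among their poses. This sketch covers only that one criterion; any additional ranking rules applied by DRDOCK are not reproduced here, and the function name and drug names are hypothetical.

```python
def rank_drugs_by_best_lod(pose_scores):
    """Rank drugs by their best pose LOD score, highest first.

    pose_scores maps a drug name to the list of LOD scores of its poses.
    Returns a list of (drug name, best LOD score) tuples.
    """
    best = {drug: max(scores) for drug, scores in pose_scores.items() if scores}
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


# Example with made-up scores for three hypothetical drugs.
ranking = rank_drugs_by_best_lod({
    "drug_a": [1.2, -0.5],
    "drug_b": [3.4, 2.0],
    "drug_c": [0.1],
})
```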

 

 

To view the docking pose with the protein target in 3D, click the “view pose” button represented by an eye icon. The target protein is shown in cartoon, the drug docking pose in thick sticks, and the protein residues in contact with the drug docking pose in thin sticks. The target site is indicated by a yellow ball.

 

Rotate the 3D structure by mouse-clicking and dragging inside the area of the expanded structure viewer. Drag and move the entire complex by right-clicking and dragging. Resize the molecules using the mouse wheel.

 

 

Click the “list icon” to show the feature values calculated from the docking results, including the pose LOD score, the affinity, the distance to the target site, and the pose cluster size.

 

 

Click the “download icon” to download the preprocessed protein target PDB file used for docking, the resulting docking pose (in PDB and PDBQT formats), and the run log file produced by the docking software.

 

 

Click the pose switch to browse other lower-ranked poses.

 

 

2.      Reassess the drug binding by MD simulations

For each submitted protein target, DRDOCK allows up to 20 selected protein-drug docking poses to be reassessed by MD simulations. To submit a drug docking pose for MD simulation and binding free energy calculation, click the “MD icon” of each pose of interest.

 

 

For selected poses, the “MD icon” is marked with a tick. Clicking the tick again cancels the selection.

 

 

When a pose is selected for MD simulation, the number of currently selected poses and the number of poses still available for simulation are shown in the functional bar above.

 

 

You can also choose to let DRDOCK automatically select the top-ranked pose of each top-ranked drug for the remaining quota.
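The automatic selection can be pictured with the sketch below. It is an assumption-laden illustration rather than DRDOCK's actual logic: the function name is hypothetical, and it assumes that drugs with a manually selected pose are skipped and that pose rank 1 denotes the top-ranked pose of a drug.

```python
MAX_MD_POSES = 20  # simulation quota per submitted protein target, as stated in this tutorial


def auto_fill_selection(ranked_drugs, selected, quota=MAX_MD_POSES):
    """Fill the remaining quota with the top-ranked pose of each top-ranked drug.

    ranked_drugs: drug names ordered from best to worst rank.
    selected: list of (drug name, pose rank) tuples already chosen by the user.
    """
    chosen = list(selected)
    drugs_taken = {drug for drug, _ in chosen}
    for drug in ranked_drugs:
        if len(chosen) >= quota:
            break
        if drug not in drugs_taken:  # assumption: one auto-selected pose per drug
            chosen.append((drug, 1))
            drugs_taken.add(drug)
    return chosen
```

For example, with a quota of 3 and one pose of the second-ranked drug already selected, the two remaining slots go to the top-ranked poses of the first- and third-ranked drugs.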

 

 

To deselect all the selected poses, click the close icon on the right.

 

 

To submit the selected poses for MD simulations, click the “submit” button.

 

 

When the submitted simulations are completed, the results can be accessed at the same URL as the result page, where the “MD icon” is replaced with a “video player icon”.

 

 

If an optional email address is provided in the submission form, an email notification of job completion with the link to the result page will also be sent.

 

 

You can also click the “MD” toggle in the top-right corner to show only the drugs and poses with simulation results, rank-ordered according to the binding free energy calculated from the simulation trajectories and the ranking rules of DRDOCK.

 

 

Click the “video player icon” to open the simulation player. Play the simulation trajectory by clicking the “play button” on the player control island in the bottom-right corner.

 

 

Drag the “progress bar” to switch to the specific snapshot of the trajectory as indicated by the number shown on the right of the player control island.

 

 

The representation of the protein target, drug, and target site is the same as that in the view of the docking pose. The feature island floating on the top-left corner shows the calculated feature values per snapshot (the top panel) and the summarized feature values for the entire trajectory (the bottom panel).

 

The trajectory can be downloaded, including the topology files (supplied in both AMBER (.parm7) and CHARMM (.psf) formats) and the coordinate files (supplied in both NetCDF and DCD formats). These files are added to the folder available from the updated download link. Click the “download icon” to download the trajectory files.

 

 

3.      User’s guide

3-1. Workflow

The workflow in DRDOCK can be summarized into four modules: structure preprocessing, docking and ranking of 2016 FDA-approved drugs, 10 ns NPT MD simulation in water, and result rendering (Figure 1). The user-submitted protein target in PDB format is first preprocessed before docking, which includes determining the pKa and protonation state of ionizable residues, optimizing the orientation of rotatable residue side chains, and detecting disulfide bonds [1]. As an optional service, the user can choose to relax the target structure, without bound co-crystallized molecules, by a 1 ns MD simulation in water before docking. The service then docks 2016 FDA-approved drugs to the target, and features inferred from the docking poses are used to calculate the LOD scores for drug ranking [1]. Users then view the docking poses online on the results page and submit the docking poses of interest for 10 ns NPT MD simulations of the protein-drug complexes in water. The resulting trajectories are used to assess the drug binding stability (i.e., whether the drug stays in the binding site throughout the simulation) and the binding free energy by MM/GBSA [1]. The trajectories are then uploaded to the results page for playback as manipulatable 3D animations.

 

Figure 1. The workflow in DRDOCK. DRDOCK begins with preprocessing the user-submitted protein target. Users can choose to use the input structure directly for docking or optionally perform a 1 ns MD simulation to minimally relax the structure in water before docking. The preprocessed structure is then used to dock 2016 FDA-approved drugs, where the LOD scores calculated from the docking poses are used for drug ranking. Users then view the docking poses on the online results page and pick docking poses of interest to simulate the dynamic protein-drug interaction in 10 ns NPT MD simulations with water. The resulting trajectory is used for binding free energy calculation by MM/GBSA and can be played on the online results page. The figure is reproduced from Tsai & Yang, 2021 [1].

 

3-2. User submission

DRDOCK requires users to submit a well-prepared single-chain protein structure in PDB format and the residue IDs of the target site, such as the active site of an enzyme, which are used to identify the docking poses closest to the site of interest (Figure 2). Since the integrity of the protein structure is essential for meaningful MD simulations, the structure should be prepared so that no backbone heavy atoms (CA, C, N, O) are missing. Missing side-chain heavy atoms are allowed, since they are automatically patched according to pre-defined amino acid templates, but this may slightly distort the binding poses predicted by docking. DRDOCK requires the protein structure to comprise only the 20 standard amino acids, for which the force constants and atomic charges are well developed and calibrated in the AMBER ff14SB force field (Maier et al., 2015) [2]. For structures containing certain residues derived from standard amino acids, DRDOCK mutates those residues back to their closest standard amino acids. Otherwise, users should manually mutate those residues back according to the original protein sequence. The most convenient way to do this, and our recommended way to prepare a submission-ready target structure, is to submit the target protein structure containing non-standard amino acids, together with its original peptide sequence, to the SWISS-MODEL web server (Waterhouse et al., 2018) [3] under the “template mode”, which automatically mutates any residue that differs from the original peptide sequence and patches any missing heavy atoms.
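A quick local check for missing backbone heavy atoms before submission might look like the sketch below. It is not DRDOCK's file parser; the function name is hypothetical, and it only inspects ATOM records of one chain using the fixed-column PDB layout. Residues that are absent altogether (chain gaps) are not detected by this check.

```python
BACKBONE_ATOMS = {"N", "CA", "C", "O"}


def missing_backbone_atoms(pdb_text, chain_id="A"):
    """Return {residue_id: sorted missing backbone atom names} for one chain."""
    seen = {}
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[21] == chain_id:
            res_id = int(line[22:26])        # columns 23-26: residue sequence number
            atom_name = line[12:16].strip()  # columns 13-16: atom name
            seen.setdefault(res_id, set()).add(atom_name)
    return {
        res_id: sorted(BACKBONE_ATOMS - names)
        for res_id, names in seen.items()
        if BACKBONE_ATOMS - names
    }
```

An empty dictionary means every residue found in the chain carries all four backbone heavy atoms; any reported residue should be repaired, for example with SWISS-MODEL as recommended above, before submission.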

 

Figure 2. The user submission form and results page of DRDOCK. (A) The job submission form in the main page of DRDOCK. The user is required to provide a well-prepared single-chain protein structure (“PDB file”). The user can assign a specific chain ID, model number for NMR-resolved structures, a desired segment in the specified protein chain (in “Residues range”). Users are required to specify the IDs of residues that comprise a target site (e.g. enzyme active site residues). Lastly, the user should provide a valid email address for receiving the job status and results from DRDOCK. The orange fields are required while grey fields are optional. (B) The results page of DRDOCK. The drugs are rank-ordered based on our ranking algorithm that ranks docked poses as described in Tsai & Yang, 2021 [1]. The drug name, 2D structure, description, basic properties, and PubChem link are shown in the drug panel. The rank-ordered drug list in CSV format can be downloaded (the download button in the purple bar atop). To submit poses for MD simulations, click the flask icon to enter the simulation submission mode, where a submission panel and a “submit” button will show up in the functional bar atop (beneath the navigation bar). A checkmark also shows up and replaces the flask icon for chosen drugs. Users can also switch the docking poses to submit different poses for the same drug. Finally, click the “submit” button in the submission panel to submit all the checked poses to the web server. Users can browse the rank-ordered drugs based on the simulation results [1] by toggling the “MD” at the upper right corner (highlighted in the black box in Figure 2B). Other details about DRDOCK, including that our file parser can allow several types of non-standard amino acids, can be found in Tsai & Yang, 2021 [1]. The figure is reproduced from Tsai & Yang, 2021 [1].

 

3-3. Result rendering

DRDOCK is designed to allow real-time access to the latest calculation results while the drug screening is still in progress. The drugs and their docking poses shown on the results page are rank-ordered based on the LOD scores described in Tsai & Yang, 2021 [1]. The drug name, 2D structure, drug properties, and external link to PubChem (Kim et al., 2019) [4] can be viewed in each drug panel (Figure 3, top). Docking poses for a drug can be selected with the switch at the top right (Figure 3, top). Any docking pose can be further submitted for a 10 ns MD simulation (Figures 2 and 3). When the simulation is finished, a superposed trajectory can be visualized in the embedded NGL viewer by clicking the “view trajectory” button (Figure 3, bottom), and the corresponding features and the estimated binding free energies for each frame can be seen in the floating left panel (Figure 3, bottom). To make better use of the computing resources, simulations for at most 20 selected poses can be submitted for each uploaded protein target. Finally, a complete list of rank-ordered drugs, all the docking poses, the MD simulation trajectories, and the calculated features can be downloaded by clicking the “download” button (Figures 2 and 3).

 

Figure 3. The functionality and information display in the drug panel. Clicking the eye-shaped “view docking pose” button opens an extended space showing the binding mode of the protein and drug (top). Clicking the “view docking features” button shows a list of properties derived from the docking results (middle). If the pose was chosen for simulation, clicking the video-shaped “view trajectory” button opens an NGL viewer (Rose et al., 2018) [5] and plays the MD animation of the interacting drug-target complex. The MM/GBSA-estimated binding free energy, which is important for re-ranking the docking results, is listed on the side. The pose and trajectory can be downloaded with the “download” button. The figure is reproduced from Tsai & Yang, 2021 [1].

 

3-4. Algorithm

We treat the drug screening problem as a ranking problem to reduce the disparity arising from the choice of docking software, the sampling depth, the parameterization details of the drug force field, etc. The ranking of drugs is decided by a score based on how the statistical distributions of pose features, sampled within our benchmark set, differ between the true binders and the decoys. The features include the Vina-predicted binding affinity (affinity), the distance between the centers of mass (COMs) of a drug pose and the assigned target residues (termed “COM distance”), and the number of poses in the pose's cluster, defined by an adjacency-based clustering method [1], that rank among the ten best affinities (adjacency-based clustering). To obtain these statistics, we pooled all the generated poses from the 16 protein-drug complexes in the training set, derived from a 2016 drug library (see Methods in [1]), and derived the distributions of the true binding poses and the decoys for each of the features associated with each pose (Figure 4). These distributions provide the statistical basis to annotate a sampled docking pose as a true binder or a decoy, based on its relative probability. Specifically, we calculated a log-odds (LOD) score for each of the sampled poses according to the following equation (Eq. 1):

$$\mathrm{LOD}(x) = \sum_{f} \log \frac{P_f^{T}(x_f)}{P_f^{D}(x_f)} \qquad (1)$$

where $x$ represents a sampled docking pose and $f$ represents one of the specified features, e.g. the pose affinity, the pose’s distance to the target site, and the size of the pose cluster. $x_f$ is the value of feature $f$ for the pose $x$. $P_f^{T}$ and $P_f^{D}$ are the probability distributions of feature $f$ for docking poses sampled from the true binders and the decoys, respectively. Note that $P_f^{T}$ and $P_f^{D}$ are binned values, and therefore different $x_f$ values within the same bin have the same $P_f^{T}(x_f)$ or $P_f^{D}(x_f)$ values. If there is no count for a bin, a small probability ($10^{-7}$) is assigned to that bin. The LOD score takes the idea of relative entropy, representing how much more or less likely a given sampled pose is to result from a true binder than from a decoy. A higher LOD score means the pose is more likely to come from a true binder. The 2016 drugs are then rank-ordered based on their best LOD score among the 20 sampled poses.
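The LOD calculation can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the DRDOCK implementation: the score is assumed to sum the log ratio over the features, the natural logarithm is assumed, bins are treated as half-open intervals, values outside the histogram range are treated like empty bins, and all names and histogram values are hypothetical.

```python
import math
from bisect import bisect_right

PSEUDO_PROBABILITY = 1e-7  # probability assigned to a bin with no counts


def binned_probability(value, bin_edges, probabilities):
    """Return the probability of the bin [edge_i, edge_{i+1}) that contains value."""
    index = bisect_right(bin_edges, value) - 1
    if index < 0 or index >= len(probabilities):
        return PSEUDO_PROBABILITY  # assumption: out-of-range values count as empty bins
    probability = probabilities[index]
    return probability if probability > 0 else PSEUDO_PROBABILITY


def lod_score(pose_features, binder_hist, decoy_hist):
    """Sum log(P_binder / P_decoy) over the features of one pose.

    pose_features: {feature name: value}
    binder_hist, decoy_hist: {feature name: (bin_edges, probabilities)}
    """
    score = 0.0
    for feature, value in pose_features.items():
        p_binder = binned_probability(value, *binder_hist[feature])
        p_decoy = binned_probability(value, *decoy_hist[feature])
        score += math.log(p_binder / p_decoy)
    return score
```

Because every value in a bin maps to the same probability, two poses whose features fall into the same bins receive identical LOD scores, and the pseudo-probability keeps the logarithm finite when one of the two histograms has an empty bin.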

 

Figure 4. The difference in the distributions of the affinities (top), COM distance (middle), and adjacency-based clustering (bottom) between the “answer” poses (the true binder pose, left columns) and decoys (right columns).

 

While docking serves as a quick tool for in silico drug screening, it does not account for the solvation effect or the molecular dynamics involved in the protein-ligand interaction, and this trade-off introduces false positives [6]. Explicit-solvent MD simulations using classical force fields optimized for biological molecules are recognized to better describe the protein-ligand binding dynamics on a timescale that allows weak binders to dissociate [7, 8, 9, 10]. The MD simulations for user-selected protein-pose complexes are performed in explicit solvent in the NPT ensemble at 310 K and 1 atm for 10 ns, after preliminary energy minimization and equilibration of the system. During the simulation, a snapshot is taken every 0.1 ns, resulting in a trajectory composed of 100 frames. The simulation results are summarized using designed indicators, including the mean binding free energy from the MM/GBSA calculation over the sampled snapshots, the drug leaving time, defined as the time when the drug center of mass moves more than 10 Å away from the target site, and the largest distance between the drug COM and the target site sampled during the simulation. A simulated drug receives a higher rank if it has a favorable mean binding free energy and stays long enough in the target site (or never leaves) during the 10 ns simulation. For more detail, please refer to [1].
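The three summary indicators can be illustrated with the sketch below, which takes per-frame values as plain lists. It is an assumption-based illustration: the function name is hypothetical, the first frame is assumed to correspond to 0.1 ns, the leaving time is assumed to be the first frame exceeding the cutoff, and the binding energies are assumed to be given in kcal/mol.

```python
FRAME_INTERVAL_NS = 0.1  # one snapshot every 0.1 ns
LEAVING_CUTOFF_A = 10.0  # drug COM farther than 10 Å from the target site


def summarize_trajectory(com_distances, binding_energies):
    """Summarize a trajectory from per-frame COM distances (Å) and MM/GBSA energies."""
    leaving_time = None  # None means the drug never left the target site
    for frame, distance in enumerate(com_distances, start=1):
        if distance > LEAVING_CUTOFF_A:
            leaving_time = frame * FRAME_INTERVAL_NS
            break
    return {
        "mean_binding_free_energy": sum(binding_energies) / len(binding_energies),
        "leaving_time_ns": leaving_time,
        "max_com_distance": max(com_distances),
    }
```

A drug with a favorable (more negative) mean binding free energy and a leaving time of `None` would be a strong candidate under the ranking idea described above.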

 

4.      References

[1] Tsai, K.-L., Yang, L.-W., 2021. DRDOCK: A drug repurposing platform integrating automated docking, simulations and a log-odds-based drug ranking scheme. doi:10.1101/2021.01.31.429052

[2] Maier, J.A., Martinez, C., Kasavajhala, K., Wickstrom, L., Hauser, K.E., Simmerling, C., 2015. ff14SB: Improving the Accuracy of Protein Side Chain and Backbone Parameters from ff99SB. Journal of Chemical Theory and Computation 11, 3696–3713. doi:10.1021/acs.jctc.5b00255

[3] Waterhouse, A., Bertoni, M., Bienert, S., Studer, G., Tauriello, G., Gumienny, R., Heer, F.T., de Beer, T.A.P., Rempfer, C., Bordoli, L., Lepore, R., Schwede, T., 2018. SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids Research 46. doi:10.1093/nar/gky427

[4] Kim, S., Chen, J., Cheng, T., Gindulyte, A., He, J., He, S., Li, Q., Shoemaker, B.A., Thiessen, P.A., Yu, B., Zaslavsky, L., Zhang, J., Bolton, E.E., 2019. PubChem 2019 update: improved access to chemical data. Nucleic Acids Research 47. doi:10.1093/nar/gky1033

[5] Rose, A.S., Bradley, A.R., Valasatava, Y., Duarte, J.M., Prlić, A., Rose, P.W., 2018. NGL viewer: web-based molecular graphics for large complexes. Bioinformatics 34, 3755–3758. doi:10.1093/bioinformatics/bty419

[6] Pantsar, T., Poso, A., 2018. Binding Affinity via Docking: Fact and Fiction. Molecules 23, 1899. doi:10.3390/molecules23081899

[7] Huang, D., Caflisch, A., 2011. Small Molecule Binding to Proteins: Affinity and Binding/Unbinding Dynamics from Atomistic Simulations. ChemMedChem 6, 1578–1580. doi:10.1002/cmdc.201100237

[8] Liu, K., Kokubo, H., 2017. Exploring the Stability of Ligand Binding Modes to Proteins by Molecular Dynamics Simulations: A Cross-docking Study. Journal of Chemical Information and Modeling 57, 2514–2522. doi:10.1021/acs.jcim.7b00412

[9] Liu, P.-F., Tsai, K.-L., Hsu, C.-J., Tsai, W.-L., Cheng, J.-S., Chang, H.-W., Shiau, C.-W., Goan, Y.-G., Tseng, H.-H., Wu, C.-H., Reed, J.C., Yang, L.-W., Shu, C.-W., 2018. Drug Repurposing Screening Identifies Tioconazole as an ATG4 Inhibitor that Suppresses Autophagy and Sensitizes Cancer Cells to Chemotherapy. Theranostics 8, 830–845. doi:10.7150/thno.22012

[10] Mollica, L., Decherchi, S., Zia, S.R., Gaspari, R., Cavalli, A., Rocchia, W., 2015. Kinetics of protein-ligand unbinding via smoothed potential molecular dynamics simulations. Scientific Reports 5. doi:10.1038/srep11539



Reference:  Tsai, KL. and Yang, LW. (2021) DRDOCK: A Drug Repurposing platform integrating automated docking, simulation and a log-odds-based drug ranking scheme. doi: 10.1101/2021.01.31.429052 (To be peer-reviewed)

Contact:  The server is maintained by the Yang Lab at the Institute of Bioinformatics and Structural Biology at National Tsing Hua University, Taiwan.

For questions and comments please contact Kun-Lin Tsai.