TDR Targets

Acknowledgements

The WHO/TDR Targets Network would like to acknowledge the contributions of the following labs and people, in no particular order.

Anna Gaulton and John Overington, for help with integrating ChEMBL data, and for early access to a pre-release version of ChEMBL.
Feng Chen, University of Pennsylvania, for help with integrating OrthoMCL data.
Gaia Paolini and Andrew Hopkins, Pfizer Sandwich, for the druggability and compound desirability data.
Bissan Al-Lazikani, Edith Chan and John Overington, Inpharmatica, for the mapping of pathogen proteins against Starlite and the druggability index.
Ursula Pieper and Andrej Sali, University of California at San Francisco, for the genome-wide modelling of the pathogen proteins.
Tilde Carlow and Kshitiz Chaudhary, New England Biolabs, for the C. elegans RNAi data.
Bob Campbell, Marine Biological Laboratory, for the mapping of parasite genes against proteins in DrugBank.
Peter Ertl, Novartis Institutes for Biomedical Research, for the JME Java Molecule Editor.

Data sources

Following is a list of the data sources that were used in TDR Targets Release 4.0, and the corresponding references for citation purposes.

Genes, proteins, and annotation
Essential genes
Druggability
3D Models
Ortholog groups
Metabolic Pathways, EC numbers
Chemicals, compounds, drugs

Genes, proteins and annotation

Plasmodium falciparum 3D7 genes and annotation were obtained from PlasmoDB release 6.3.

Trypanosoma brucei, Leishmania major, and Trypanosoma cruzi genes and annotation were obtained from TriTrypDB release 2.0.

Mycobacterium tuberculosis genes and annotation were obtained from Tuberculist release 2.3.

Mycobacterium leprae genes and annotation were obtained from Leprosy mycobrowser release 2.1.

Toxoplasma gondii genes and annotation were obtained from ToxoDB release 6.0.

Schistosoma mansoni genes and annotation were obtained from GeneDB (Smansoni v4.0); additional annotation, including EC numbers, was obtained from SchistoDB release 2.0.

Brugia malayi genes and annotation were obtained from GenBank.

References:
PlasmoDB v5: new looks, new genomes. Stoeckert CJ Jr, Fischer S, Kissinger JC, Heiges M, Aurrecoechea C, Gajria B and Roos DS. 2006. Trends Parasitol 22: 543-6.
GeneDB: a resource for prokaryotic and eukaryotic organisms. Hertz-Fowler C, et al. 2004. Nucleic Acids Res. 32: D339-43.
The genome of the kinetoplastid parasite, Leishmania major. Ivens AC et al. 2005. Science 309: 436-42.
The genome of the African trypanosome Trypanosoma brucei. Berriman M et al. 2005. Science 309: 416-22.
The genome sequence of Trypanosoma cruzi, etiologic agent of Chagas disease. El-Sayed N et al. 2005. Science 309: 409-15.
Deciphering the biology of Mycobacterium tuberculosis from the complete genome sequence. Cole ST et al. 1998. Nature 393: 537-544.
Learning from the genome sequence of Mycobacterium tuberculosis H37Rv. Cole ST. 1999. FEBS Lett 452: 7-10.
The genome of the blood fluke Schistosoma mansoni. Berriman M et al. 2009. Nature 460: 352.
SchistoDB: a Schistosoma mansoni genome resource. Zerlotini A et al. 2009. Nucleic Acids Research 37: D579.
Draft Genome of the Filarial Nematode Parasite Brugia malayi. Ghedin E et al. 2007. Science 317: 1756.

Essentiality data

Information about essential genes in Escherichia coli has been obtained from a number of genome-wide mutation and/or knockout studies: see Profiling of E. coli chromosome (PEC), National Institute of Genetics, Japan; The Keio Collection; and the references below.

Information about essential genes in Saccharomyces cerevisiae was also derived from a number of genome-wide studies and has been obtained from the Saccharomyces Genome Database (SGD).

Information about essential genes in Mycobacterium tuberculosis has been obtained from the National Microbial Pathogen Data Resource (NMPDR), see references below.

Information about phenotypes caused by knockdown experiments (RNAi) in Caenorhabditis elegans was derived from a number of studies. The information was downloaded from WormBase and organized by Kshitiz Chaudhary and Tilde Carlow (New England Biolabs).

References:
Systematic mutagenesis of the Escherichia coli genome. Kang Y, Durfee T, Glasner JD, Qiu Y, Frisch D, Winterberg KM, Blattner FR. 2004. J Bacteriol. 186: 4921-30.
Experimental determination and system level analysis of essential genes in Escherichia coli MG1655. Gerdes SY et al. 2003. J Bacteriol. 185: 5673-84.
Construction of consecutive deletions of the Escherichia coli chromosome. Kato J and Hashimoto M. 2007. Molecular Systems Biology 3: 132.
Cell size and nucleoid organization of engineered Escherichia coli cells with a reduced genome. Hashimoto M et al. 2005. Mol Microbiol. 55: 137-49.
Construction of Escherichia coli K-12 in-frame, single-gene knockout mutants: the Keio collection. Baba T. et al. 2006. Molecular Systems Biology 2: 2006.0008.
Genes required for mycobacterial growth defined by high density mutagenesis. Sassetti CM, Boyd DH, Rubin EJ. 2003. Mol Microbiol 48: 77-84.
Functional profiling of the Saccharomyces cerevisiae genome. Giaever G, et al. 2002. Nature 418: 387-91.
Mechanisms of haploinsufficiency revealed by genome-wide profiling in yeast. Deutschbauer AM, et al. 2005. Genetics 169: 1915-25.
Large-scale analysis of gene function in Caenorhabditis elegans by high-throughput RNAi. Maeda I, et al. 2001. Current Biology 11: 171-176.

Druggability data

Druggability index (Dindex)
The mapping of pathogen genes to known druggable targets and the Dindex values for those targets have been provided by Bissan Al-Lazikani, Edith Chan and John Overington (Inpharmatica).

Compound desirability index
The compound desirability index for all compounds associated with known druggable targets has been calculated by Gaia Paolini and Andrew Hopkins (Pfizer Sandwich).

References:
The druggable genome. Hopkins AL and Groom CR. 2002. Nature Rev Drug Discov 1: 727-730.
Global mapping of pharmacological space. Paolini GV, Shapland RH, van Hoorn WP, Mason JS, Hopkins AL. 2006. Nat Biotechnol 24: 805-15.

Three-dimensional models

The three-dimensional models of pathogen proteins were provided by MODBASE.

References:
MODBASE, a database of annotated comparative protein structure models, and associated resources. Ursula Pieper, Narayanan Eswar, Fred Davis, M.S. Madhusudhan, Andrea Rossi, Marc A. Marti-Renom, Rachel Karchin, Ben Webb, David Eramian, Min-Yi Shen, Libusha Kelly, Francisco Melo and Andrej Sali. 2006. Nucleic Acids Research 34: D291-D295.

Ortholog groups

The OrthoMCL-based ortholog predictions for proteins were obtained from OrthoMCL v4.

References:
OrthoMCL-DB: querying a comprehensive multi-species collection of ortholog groups. Chen F, Mackey AJ, Stoeckert CJ Jr and Roos DS. 2006. Nucleic Acids Res 34: D363.
OrthoMCL: identification of ortholog groups for eukaryotic genomes. Li L, Stoeckert CJ Jr and Roos DS. 2003. Genome Res 13: 2178.

Metabolic Pathways, EC numbers

Mappings of genes to metabolic pathways and to the IUBMB Enzyme Classification were obtained from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database.

References:
KEGG for representation and analysis of molecular networks involving diseases and drugs. Kanehisa M et al. 2010. Nucleic Acids Research 38: D355.

Chemical compounds

Chemical compounds listed in our database come from the ChEMBL, PubChem, and DrugBank databases.

The Tres Cantos Antimalarial TCAMS dataset (GSK), the Novartis-GNF Malaria Box dataset, and the St. Jude Children's Research Hospital Malaria dataset were obtained from ChEMBL-NTD.

References:
DrugBank: a comprehensive resource for in silico drug discovery and exploration. Wishart DS, Knox C, Guo AC, Shrivastava S, Hassanali M, Stothard P, Chang Z, Woolsey J. 2006. Nucleic Acids Res 34: D668.
GSK: Thousands of chemical starting points for antimalarial lead identification. Gamo F-J et al. 2010. Nature 465: 305.
Novartis-GNF Malaria Box: K Gagaring, R Borboa, C Francek, Z Chen, J Buenviaje, D Plouffe, E Winzeler, A Brinker, T Diagana, J Taylor, R Glynne, A Chatterjee, K Kuhen. Genomics Institute of the Novartis Research Foundation (GNF), 10675 John Jay Hopkins Drive, San Diego CA 92121, USA and Novartis Institute for Tropical Disease, 10 Biopolis Road, Chromos # 05-01, 138 670 Singapore
St. Jude Malaria Dataset: Chemical genetics of Plasmodium falciparum. Guiguemde WA, et al. 2010. Nature 465: 311.