2. The drug development process - KAIST

Impact of genomics and proteomics on Biotechnology / Biology
Genomics : Systematic study of the entire genome of an organism
- To sequence the entire genome and to physically map the genome arrangement (assign exact positions of the genes / non-coding regions in the genome)
- Before the 1990s, sequencing and study were done at the single-gene level: a laborious and time-consuming task
- Development of high-throughput sequencing technologies and highly automated hardware systems: faster (in excess of 1 kb/h), cheaper, and more accurate
- Sequencing a human whole genome: ~ $1,000
- Genome sequences of more than 2,000 organisms
- Genomes of various animals and plants: mouse, rat, sheep, pig, monkey, dog, chicken, wheat, barley, Arabidopsis

Human genome project
- Started in 1990
- Completed in 2003 : ~ 3.2 gigabases (Gb), 1,000 times larger than a typical bacterial genome
- Less than 1/3 of the genome is transcribed into mRNA
- Only 5 % of the RNA encodes polypeptides
- Number of polypeptide-encoding genes : ~ 30,000
Significance of genome data in Biotechnology/Biology
Provide full sequence information of every protein:
- Identification of undiscovered proteins : Understanding their functions
- Discovery of new drug targets
Current drugs on the market target one of at most 500 proteins: Major targets are proteins (e.g., kinases)

Sequence data of many human pathogens (e.g., Helicobacter pylori, Mycobacterium tuberculosis, Vibrio cholerae)
- Provide drug targets against pathogens (e.g., gene products essential for pathogen viability or infectivity)
- Offer some clues to the underlying mechanisms of diseases
- Development of a new drug
New methods/tools in Biotechnology, Biology, and Medical sciences
- The ability to interrogate the human genome has altered our approach to studying complex diseases and developing therapies.
- The emergence of genome-wide analysis tools has opened the door to investigating the function of each gene, genomic biomarker discovery, validation, and pharmacogenomics.
- Leading medical/clinical researchers: actively studying genomic approaches to understand diseases, and learning how these can be translated into medical and clinical settings : Translational research
Functional genomics
Issues
- Biological functions of between one-third and half of sequenced gene products remain unknown
- Assessment of biological functions of the sequenced genes: crucial to understanding the relationship between genotype and phenotype as well as direct identification of drug targets
- Shift in the focus of genome research

Elucidation of biological function of genes
- In the narrow sense : Biological function/activity of the isolated gene product
- In the broader sense :
  - Where in the cell the product acts, and what other cellular elements it interacts with : Interactome
  - How such interactions contribute to the overall physiology of the organism : Systems Biology
General definition of functional genomics : Determining the function of proteins deduced from genome sequence is a central goal in the post-genome era

Elucidating the biological function of gene products
Assignment of function of gene products (Proteins)
- Biochemical (molecular) function
- Assignment based on sequence homology
- Based on structure
- Based on ligand-binding specificity
- Based on cellular process
- Based on biological process
- Based on proteomics or high-throughput functional genomics
Conventional approaches
- Clone and express a gene to produce the protein encoded by the gene
- Try to purify the protein to homogeneity: size, charge, hydrophobicity, oligomeric state, glycosylation
- Develop an assay for its function
- Identify the activity / function
- Grow crystals, solve structure
Time-consuming and laborious for huge numbers of genes
Assignment of function to the sequenced gene products
Comparison of sequence/structure data in a high-throughput manner
Sequence homology study
- Computer-based sequence comparison between a gene of unknown function and genes whose functions (or gene product functions) have been assigned
- High homology : high similarity in function
- Assigning a putative function to 40 - 60 % of all new gene sequences
Phylogenetic profiling

A phylogenetic tree or evolutionary tree : a diagram showing the evolutionary interrelations of a group of organisms, genes, or proteins derived from a common ancestral form, based upon similarities and differences in their physical and/or genetic characteristics
Youtube: www.youtube.com/watch?v=xwuhmMIIspo
Phylogenetic profiling : Study of evolutionary relationships among various biological species or other entities based on similarities and differences in their physical and/or genetic characteristics
- Closely related species should be expected to have very similar sets of genes
- Proteins that function in the same cellular context frequently have similar phylogenetic profiles : During evolution, all such functionally linked proteins tend to be either preserved or eliminated in a new species
- Proteins with similar profiles are likely to belong to a common group of functionally linked proteins.
Establishing a pattern of presence or absence of a particular gene coding for a protein of unknown function across a range of different organisms whose genomes have been sequenced:
- Discovery of previously unknown enzymes in metabolic pathways, transcription factors that bind to conserved regulatory sites, explanations for the roles of certain mutations in human disease, and plant-specific gene functions
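The presence/absence comparison described above can be illustrated with a minimal Python sketch. The gene names and profiles are hypothetical, and Hamming distance is only one simple way to compare profiles; it is an assumption made for illustration, not a method named in the slides.

```python
# Hypothetical presence (1) / absence (0) profiles across five sequenced genomes.
# Gene names and values are invented for illustration only.
PROFILES = {
    "geneA": (1, 0, 1, 1, 0),
    "geneB": (1, 0, 1, 1, 0),
    "geneC": (0, 1, 1, 0, 1),
    "geneD": (1, 0, 1, 0, 0),
}


def hamming_distance(profile_a, profile_b):
    """Count the genomes in which one gene is present and the other is absent."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same set of genomes")
    return sum(a != b for a, b in zip(profile_a, profile_b))


def rank_by_profile_similarity(query, profiles):
    """Return the other genes, ordered from most to least similar profile."""
    others = [name for name in profiles if name != query]
    return sorted(
        others,
        key=lambda name: hamming_distance(profiles[query], profiles[name]),
    )
```

Calling `rank_by_profile_similarity("geneA", PROFILES)` puts genes with identical or near-identical profiles first; these would be the candidates for a functional link with the query gene.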

Rosetta Stone Approach
Hypothesis: Some pairs of interacting proteins are encoded by two separate genes in some genomes and by a single fused gene in other genomes
- Two separate polypeptides (X and Y) found in one organism may occur in a different organism as a single fused protein (XY)
- The function of an unknown gene in one organism can be deduced from the function of the fused gene in a different organism
Gyrase : Relieves strain while double-stranded DNA is being unwound by helicase
Type II topoisomerase (heterodimer) : catalyzes the introduction of negative supercoils in DNA in the presence of ATP.
Gyrase (bacterial topoisomerase II) : heterotetramer made up of 2 gyrA (97 kDa) subunits and 2 gyrB (90 kDa) subunits.
Knock-out animal study
- Generation and study of mice in which a specific gene has been deleted
- Phenotype observation
Structural genomics approach
- Resolution of 3-D structure of proteins

Pathway maps
- Linked set of biochemical reactions catalyzed by enzymes
Questions:
- Is the extrapolation between species valid?
- Have orthologs been identified accurately?
Orthologs: Genes in different species that evolved from a common ancestral gene by speciation, retaining the same function in the course of evolution.
- Identification of orthologs is critical for reliable prediction of gene function in newly sequenced genomes.

Homologs : Genes related to each other by descent from a common ancestral DNA sequence, regardless of their functions
Paralogs: Genes related by duplication within a genome. Paralogs evolve new functions during the course of evolution, even if these are related to the original one.
DNA microarray technology : DNA chip
- Sequence data provide a map and the possibility of assigning putative functions to the genes in a genome based on sequence comparisons
- Information regarding which genes are expressed and functionally active under any given circumstance and time
- DNA microarray data : Provide clues as to the biological function of the corresponding genes : Starting point
- Offer an approach to search for disease biomarkers and drug targets
  ex) If a particular mRNA is produced only by a cancer cell and not by a normal cell, the mRNA (or its polypeptide product) may be a target for basic cancer research, a good target for a new anti-cancer drug, or a biomarker for diagnosis.
Microarrays: Tool for gene expression profiling
DNA microarray (Gene chip) : Comparison of mRNA expression levels between a sample (cancer cell) and a reference (normal cell) in a high-throughput way : mRNA expression profiling
- cDNA chip : mRNA expression profiling
- Oligo chip ( ~ 50 mers) : mRNA expression profiling
- SNPs (Single Nucleotide Polymorphisms)

- Complementary probes are designed from gene sequence
Solid support (such as a glass microscope slide) on which DNA of known sequence (probes) is deposited in a grid-like array
- 250,000 different short oligonucleotide probes per cm²
- 10,000 full-length cDNAs per cm²
General procedure for mRNA expression profiling
- mRNA is isolated from matched samples of interest.
- mRNA is typically converted to cDNA and labeled with fluorescent dyes (Cy3 and Cy5) or radioactivity
- Hybridization with the complementary probes on the DNA chip
- Analysis and comparison of expression levels of mRNAs between a sample and a reference : mRNA expression profiling
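As a small illustration of what "complementary probe" means in the procedure above, the following Python sketch derives the reverse complement of a target sequence. The function names are hypothetical, and the perfect-match check is a deliberate simplification: real hybridization tolerates mismatches and depends on temperature and salt conditions.

```python
# Map each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    seq = sequence.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("sequence may contain only A, C, G, T")
    return seq.translate(_COMPLEMENT)[::-1]


def is_perfect_match(probe, target):
    """True if the probe is the exact reverse complement of the target.

    Simplifying assumption: only perfect, full-length pairing counts.
    """
    return probe.upper() == reverse_complement(target)
```

For example, a probe for the target `ATGC` would be `reverse_complement("ATGC")`, and `is_perfect_match` confirms that the two strands pair base for base.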

Microarrays: array surface (Page 173)
DNA Microarray Methodology - Flash Animation : www.bio.davidson.edu/Courses/genomics/chip/chip.html
Organisms represented on microarrays
- Metazoans: human, mouse, rat, worm, insect
- Fungi: yeast
- Plants: Arabidopsis
- Other organisms: e.g. bacteria, viruses
Questions to be addressed using microarrays
- Wild-type versus mutant cells
- Cultured cells with or without drug
- Physiological states (hibernation, cell polarity formation)
- Normal versus disease tissues (cancer, autism)
mRNA expression profiling using cDNA microarray
[Figure: workflow diagram. Labels: cDNA clones, PCR amplification, purification, robotic printing; sample and reference mRNA, reverse transcriptase, Cy3, Cy5; hybridize targets to microarray; excitation by Laser 1 and Laser 2, emission; computer analysis]
- Cy3 : ex 550 nm / em 570 nm
- Cy5 : ex 649 nm / em 670 nm
- Green : up-regulated in a sample
- Red : up-regulated in a reference
- Yellow : equally expressed
Commercially available DNA chip
Overall procedure
- Experimental design
- Sample acquisition
- Purify RNA, label
- Data acquisition
- Hybridize, wash, image
- Data analysis
- Data storage
- Data confirmation
- Biological insight
Stage 1: Experimental design
- Biological samples: technical and biological replicates
- mRNA extraction, conversion, labeling, hybridization
- Arrangement of array elements on a surface
PCR (Polymerase Chain Reaction)
- Developed in 1983 by Kary Mullis
- Nobel Prize in Chemistry in 1993

- Melting at 95 °C
- Annealing at 55 °C
- Elongation at 72 °C
- Thermostable DNA polymerase from a thermophilic bacterium
Stage 2: RNA and probe preparation
- Confirm purity by running an agarose gel
- Measure the absorbance at 260 and 280 nm and calculate the ratio to confirm purity and quantity
- Synthesis of cDNA and labeling using reverse transcriptase
Stage 3: Hybridization to DNA arrays
- Mixing of equal amounts of cDNA from a reference and a sample
- Load the solution onto the DNA microarray
- Incubation for hybridization followed by washing and drying
Stage 4: Image analysis
- Gene expression levels are quantified
- Fluorescence intensities are measured with a scanner, or radioactivity with a phosphorimage analyzer
Example of an approximately 37,500 probe-spotted oligo microarray with enlarged inset to show detail (Fig. 6.20)
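One common way to turn the two measured fluorescence intensities of a spot into an expression call is a log2 ratio. The sketch below is a minimal illustration under stated assumptions: the pseudocount, the two-fold threshold, and the function names are all hypothetical choices, and background subtraction and normalization are left out.

```python
import math


def log2_ratio(sample_intensity, reference_intensity, pseudocount=1.0):
    """Log2 of sample over reference intensity for one spot.

    The pseudocount (an assumption) avoids division by zero on empty spots.
    """
    if sample_intensity < 0 or reference_intensity < 0:
        raise ValueError("intensities must be non-negative")
    return math.log2(
        (sample_intensity + pseudocount) / (reference_intensity + pseudocount)
    )


def classify_spot(ratio, threshold=1.0):
    """Label a spot using a symmetric log2 threshold (1.0 means two-fold)."""
    if ratio >= threshold:
        return "up in sample"
    if ratio <= -threshold:
        return "up in reference"
    return "equally expressed"
```

A spot with intensities 799 (sample) and 199 (reference) gives a log2 ratio of 2, a four-fold difference, and is classified as up in the sample; equal intensities give 0 and are classified as equally expressed.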

Stage 5: Microarray data analysis (Page 181)
- How can arrays be compared?
- Which genes are regulated?
- Are differences authentic?
- What are the criteria for statistical significance?
- Are there meaningful patterns in the data (such as groups)?
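On the question of statistical significance: when thousands of genes are tested at once, some low p-values arise by chance, so a multiple-testing correction is usually applied. The slides do not name a method; the sketch below assumes the Benjamini-Hochberg false discovery rate procedure as one widely used option, applied to per-gene p-values that are taken as already computed.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    if any(p < 0.0 or p > 1.0 for p in p_values):
        raise ValueError("p-values must lie in [0, 1]")
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def significant_genes(gene_names, p_values, fdr=0.05):
    """Names of genes whose adjusted p-value is at or below the chosen FDR."""
    adjusted = benjamini_hochberg(p_values)
    return [name for name, q in zip(gene_names, adjusted) if q <= fdr]
```

The FDR cutoff of 0.05 is a conventional default rather than a value from the slides; genes that pass it are still candidates to be confirmed by an independent assay.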

Stage 6: Biological confirmation
Microarray experiments can be thought of as hypothesis-generating experiments : Clues
Differential up- or down-regulation of specific genes can be confirmed using independent assays :
- Northern blots
- Polymerase chain reaction (RT-PCR)
- in situ hybridization
Use of DNA microarray
Comparison of gene expression levels
- Different tissues
- Different environmental conditions (treated with drug)
- Normal and cancer cells
Outcome of data analysis
- Search for a specific gene(s) responsible for a biological phenomenon : Starting point for basic research
- Search for biopharmaceuticals/drug targets
- Identification of potential biomarkers for diagnosis
- SNP detection
But need validation
Search for a gene responsible for a disease
Rett syndrome is a childhood neuro-developmental disorder characterized by normal early development followed by loss of purposeful use of the hands, distinctive hand movements, slowed brain and head growth, gait abnormalities, seizures, and mental retardation. It affects females almost exclusively.
[Figure panel: Control]

[Figure panel: Rett]
αB-crystallin is over-expressed in Rett syndrome
mRNA expression profiling in a lung cancer patient and a normal person using DNA microarray (Choi et al., J Thorac Oncol (2006), 1, 622-628)
Gene Name: alcohol dehydrogenase IB (class I), beta polypeptide; mucolipin 1; null; creatine kinase, brain; artemin; SP110 nuclear body protein; apoptosis antagonizing transcription factor; killer cell lectin-like receptor subfamily D, member 1; fatty acid desaturase 3; SH2 domain protein 2A; cholinergic receptor, nicotinic, epsilon polypeptide; ribosomal protein L29; TGFB-induced factor 2 (TALE family homeobox); ectonucleoside triphosphate diphosphohydrolase 2; null; TBC1 domain family, member 8 (with GRAM domain); 3-hydroxymethyl-3-methylglutaryl-Coenzyme A lyase (hydroxymethylglutaricaciduria); null; homeo box D4; null; eukaryotic translation initiation factor 3, subunit 8, 110kDa; Rho guanine nucleotide exchange factor (GEF) 10; aquaporin 5
Regulation: DOWN, DOWN, UP, DOWN, DOWN, DOWN, UP, DOWN
Gene Name: cytochrome b-561; TATA box binding protein (TBP)-associated factor, 32kDa; glypican 4; AT rich interactive domain 4A (RBP1-like); TEA domain family member 4; G protein-coupled receptor 50; ret finger protein 2; chromosome 11 open reading frame 24; null; null; null; null; minichromosome maintenance deficient 3 (S. cerevisiae); null; null; null; null; defensin, alpha 6, Paneth cell-specific; null; small nuclear ribonucleoprotein polypeptide C; null; HLA-B associated transcript 3; mitogen-activated protein kinase kinase kinase 6; null
Regulation: DOWN, DOWN, DOWN, DOWN, UP, DOWN, DOWN, UP, DOWN, DOWN, UP, DOWN, UP, UP, DOWN, DOWN, DOWN, DOWN, DOWN, UP, DOWN, UP, UP, DOWN
Expression profiles under different nutritional conditions

cDNA microarray chip containing 2,500 genes from yeast
Expression profiling of genes from yeast grown on 2% galactose and glucose
- Green : up-regulation in yeast grown on galactose
- Red : up-regulation in yeast grown on glucose
- Yellow : equally expressed
Lashkari et al., PNAS (1997)
Advantages of microarray experiments

- Fast : Data on >20,000 genes in several weeks
- Comprehensive : Entire yeast, mouse, and human genome on a chip
- Flexible :
  - As more genomes are sequenced, more arrays can be made.
  - Custom arrays can be made to represent genes of interest
- Easy : Submit RNA samples to a core facility
- Cheap : Chip representing 20,000 genes for $350; robotic spotter/scanner cost $50,000
Disadvantages of microarray experiments
- Cost : Some researchers can't afford to do appropriate controls, replicates
- mRNA significance : Final products of gene expression are proteins
- Quality control :
  - Impossible to assess elements on array surface
  - Artifacts with image analysis
  - Artifacts with data analysis
Proteomics
- Proteins are directly involved in most biological functions
- Drug targets : mostly proteins
- Protein expression levels cannot be accurately detected or measured via DNA array technology
  - mRNA levels are not directly correlated with those of the mRNA-encoded polypeptide
  - A significant proportion of eukaryotic mRNA undergoes differential splicing, and can yield more than one polypeptide product
  - No detailed information regarding how the functional activity of expressed proteins will be regulated (e.g., post-translational modifications : phosphorylation, glycosylation, sumoylation, proteolysis)
Proteomics : Proteome-based analysis
- Proteins are responsible for specific functions, drug targets, or potential biomarkers : More successfully identified by direct analysis of the expressed proteins in the cell
- Systematic and comprehensive analysis of the proteins (proteome) expressed in the cell and their functions
  - Direct comparison of protein expression levels
  - Changes in cellular protein profiles with cellular conditions

Proteomics approach by 2-D protein gels
General procedure
- Extraction of the total protein content from the target cell/tissue
- Separation of proteins by 2-D gel electrophoresis
  - Dimension one: isoelectric focusing
  - Dimension two: SDS-PAGE (polyacrylamide gel)
- Elution of protein spots
- Analysis of eluted proteins for identification
2-D gel electrophoresis between two different conditions
How do you figure out which spot is what?
- Protein micro-sequencing using Edman degradation protocol (partial amino acid sequence) : laborious and time-consuming
- Protein analysis using mass spectrometry
  - Molecular mass of protein : MALDI-TOF
  - Digestion pattern by trypsin : MALDI-TOF
  - Amino acid sequence of a digested peptide : Tandem mass spectrometry
- Usually have a core facility do these, or collaborate with an expert
- Identification or assignment of protein function by sequence homology search
Basic components of a mass spectrometer
- Ion source : Converts sample molecules into ions (ionization)
- Mass analyzer : Sorts the ions by their masses by applying electromagnetic fields
- Detector : Measures the value of an indicator quantity and thus provides data for calculating the abundances of each ion present
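The "digestion pattern by trypsin" used for spot identification can be predicted from a candidate protein sequence. The sketch below assumes the commonly cited cleavage rule for trypsin (cut after lysine K or arginine R, except when the next residue is proline P) and no missed cleavages; the function name and the example sequence are hypothetical.

```python
def trypsin_digest(protein):
    """Split a one-letter amino acid sequence into predicted tryptic peptides.

    Assumed rule: cleave C-terminal to K or R unless the next residue is P.
    Missed cleavages are not modeled.
    """
    sequence = protein.upper()
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        is_last = i + 1 == len(sequence)
        if residue in "KR" and (is_last or sequence[i + 1] != "P"):
            peptides.append(sequence[start:i + 1])
            start = i + 1
    # Keep the C-terminal peptide if the sequence does not end in K or R.
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides
```

In a peptide mass fingerprinting workflow, the masses of such predicted peptides would be compared with the masses measured by MALDI-TOF; computing those masses is outside this sketch.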

Mass spectrometers
- Ion source : MALDI (Matrix Assisted Laser Desorption Ionization), ESI (Electrospray Ionization), EI (Electron Ionization), CI (Chemical Ionization), FAB (Fast Atom Bombardment)
- Mass analyzer : TOF (Time of Flight), Quadrupole, FT-ICR (FTMS), Ion Trap
- Detector
Time of Flight in mass spectrometry
- Ions are accelerated by an electric field of known strength.
  - This acceleration results in an ion having the same kinetic energy as any other ion that has the same charge.
  - The velocity of the ion depends on the mass-to-charge ratio.
- The time for the ion to reach a detector at a known distance is measured.
  - This time will depend on the mass-to-charge ratio of the ions.
  - Time of flight : the elapsed time from the instant a particle leaves a source to the instant it reaches a detector.
- From this time and the known experimental parameters, the mass-to-charge ratio of the ion is determined.
Time of Flight
- When a charged particle is accelerated into the time-of-flight tube by the voltage U, its kinetic energy, whatever its mass, is $\frac{1}{2}mv^2$
- The smaller the molecular mass, the higher the velocity of a molecule; calculate the m/z by measuring the flight time
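The relation between flight time and m/z can be made concrete with a short Python sketch. It assumes an idealized instrument: an ion of charge ze gains kinetic energy zeU, so zeU = (1/2)mv², and then drifts through a field-free tube of length L, giving t = L·sqrt(m / (2zeU)). Time spent in the acceleration region, initial velocity spread, and reflectron geometry are ignored, and the function names and example settings are hypothetical.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs
DALTON = 1.66053906660e-27           # kilograms per unified atomic mass unit


def flight_time(mz, voltage, tube_length):
    """Drift time in seconds for an ion with the given m/z (Da per charge).

    voltage is the accelerating potential U in volts; tube_length is in meters.
    """
    if mz <= 0 or voltage <= 0 or tube_length <= 0:
        raise ValueError("mz, voltage and tube_length must be positive")
    return tube_length * math.sqrt(mz * DALTON / (2 * ELEMENTARY_CHARGE * voltage))


def mz_from_flight_time(time_s, voltage, tube_length):
    """Invert the relation: recover m/z (Da per charge) from a measured time."""
    if time_s <= 0 or voltage <= 0 or tube_length <= 0:
        raise ValueError("time_s, voltage and tube_length must be positive")
    return 2 * ELEMENTARY_CHARGE * voltage * (time_s / tube_length) ** 2 / DALTON
```

Because the time scales with the square root of m/z, an ion with four times the m/z takes twice as long to reach the detector under the same voltage and tube length.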

Mass-to-charge (m/z) ratio of a molecule is determined by measuring the flight time in the tube
MALDI-TOF Mass Spectrometry
Matrix-assisted laser desorption/ionization (MALDI)
- Soft ionization technique that allows the analysis of biomolecules (such as proteins, peptides, and sugars) and large organic molecules, which tend to be fragile and to fragment when ionized by more conventional ionization methods
- The matrix absorbs the laser energy, and the matrix is ionized (by addition of a proton) by this event.
- The matrix then transfers a proton to the analyte molecules (e.g., protein molecules), thus charging the analyte
Commonly used matrices
- 3,5-dimethoxy-4-hydroxycinnamic acid (sinapinic acid)
- α-cyano-4-hydroxycinnamic acid (alpha-cyano or alpha-matrix)
- 2,5-dihydroxybenzoic acid (DHB)
Time-of-Flight mass spectrometry :
- Ions are accelerated by an electrical field to the same kinetic energy
- The velocity of the ion depends on the mass-to-charge ratio.
- From the elapsed time to reach a detector, the mass-to-charge ratio can be determined.
Use of proteomics
Identification of a protein responsible for cellular function under specific conditions :
- Treatment of drugs, stress etc.
- Identification of key enzymes in metabolic pathways
- Construction of new strains
Discovery of disease biomarkers
- Comparison of protein levels between patient and normal person

Protein profiling
Use of proteomics in metabolic pathway engineering
L-Threonine
- Essential amino acid
- Feed and food additives
- Raw material for synthesis of various medicines
World-wide production of amino acids

Amino acid       Production (MT / annum)   Capacity (MT / annum)
L-Lysine-HCl     583,000                   704,000
DL-Methionine    496,000                   680,000
L-Threonine      27,000                    49,000
Source: Feedinfo. 2002

Biosynthetic pathway of L-Thr in E. coli
[Figure: pathway diagram. Metabolites: Glucose, Phosphoenolpyruvate, Pyruvate, Oxaloacetate, L-Aspartate, L-Aspartyl phosphate, L-Aspartate semialdehyde, Homoserine, Homoserine phosphate, L-Threonine, L-Isoleucine, L-Lysine, L-Methionine, TCA cycle. Gene labels: ppc, aspC, thrA, metL, lysC, asd, thrB, thrC, ilvA, dapA, metA, mdh, aceBAK. Annotation: Feedback repression]
Development of an L-Threonine-overproducing strain

- Conventional mutagenic method
- Use of protein expression profiles in the biosynthetic pathway between the parent and an L-threonine-producing strain
Production level of L-threonine
- W3110 (Wild-type E. coli) : < 0.001 g/L
- TF5015 (Mutant) : ~ 20 g/L
Proteome analysis of two strains
- Identification of protein spots by MALDI-TOF
[Figure: proteome comparison of W3110 and TF5015, with a bar chart of expression level (arbitrary units, axis 0 to 7) for protein spots 1 to 14. Protein labels: AldA, IcdA, AceA, ArgG, ThrC, OppA, LeuC, Udp, LeuD, YfiD]
Lee et al., J Bacteriol (2004)
