Sunday, December 31, 2017

The Upcoming Lectures in Computational Biophysics

Date | Speaker | Lecture Title | Related Papers
25-July-2017 | Rebecca Alford (Jeff Gray Lab) | A deep-dive into the Rosetta all-atom energy function | p1
8-August-2017 | Dr. Ozge Yoluk | Elucidating the Gating Mechanism of Cys-Loop Receptors | p1

Tuesday, August 8, 2017

Lecture 22: Özge Yoluk (Alex MacKerell Lab)

Özge Yoluk is a postdoctoral researcher in Alex MacKerell’s lab at the University of Maryland School of Pharmacy. She obtained her bachelor's degree in Molecular Biology and Genetics from Istanbul University. She then moved to Uppsala and obtained her master’s degree in Applied Biotechnology. During her master's studies, she transitioned to computational work and decided to build her career in this field. During her graduate studies in Erik Lindahl’s lab at KTH, she learned and applied computational methods to understand the gating mechanism of ion channels. She is currently working on proteins involved in the base-excision repair pathway and on force field development.

Tuesday, July 25, 2017

Lecture 21: Rebecca Alford (Jeff Gray Lab)

A deep-dive into the Rosetta all-atom energy function
Over the past decade, the Rosetta biomolecular modeling suite has informed various biological questions and engineering challenges ranging from folding and docking to the interpretation of low-resolution structural data and the design of nanomaterials and vaccines. Central to Rosetta’s success is the energy function: a set of mathematical models used to approximate the free energy of a macromolecule. In this presentation, Rebecca will describe the concepts and calculations that underlie the Rosetta energy function. She will present the mathematics and origin of the major energy terms (van der Waals, implicit solvation, electrostatics, hydrogen bonding, and design reference energies) and explain where numerical stability and efficiency limitations have led to modifications of these functions. Applying these concepts, she will explain how to use a Rosetta energy calculation to select and analyze the features of output models. Finally, she will discuss the latest advances in the energy function that extend its capabilities from soluble proteins to membranes, DNA, RNA, and other macromolecules.
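To make the idea of an energy function built from separate terms concrete, here is a minimal Python sketch. It treats the total energy as a weighted sum of per-term energies and uses a standard 12-6 Lennard-Jones potential as a stand-in for the van der Waals term. The function names, weights, and energy values are hypothetical illustrations, not Rosetta's API or parameters, and Rosetta's own terms are modified for numerical stability as described in the abstract.

```python
def lennard_jones(r, sigma, epsilon):
    """Standard 12-6 Lennard-Jones energy with its minimum (-epsilon) at r = sigma."""
    ratio = sigma / r
    return epsilon * (ratio ** 12 - 2.0 * ratio ** 6)


def total_energy(term_values, weights):
    """Weighted linear combination of per-term energies."""
    return sum(weights[name] * value for name, value in term_values.items())


# Hypothetical weights and per-term energies for two output models (arbitrary units).
weights = {"vdw": 1.0, "solvation": 1.0, "electrostatics": 1.0, "hbond": 1.0}
models = {
    "model_a": {"vdw": -120.0, "solvation": 60.0, "electrostatics": -15.0, "hbond": -30.0},
    "model_b": {"vdw": -110.0, "solvation": 70.0, "electrostatics": -10.0, "hbond": -25.0},
}

# Score every model and pick the one with the lowest total energy.
scores = {name: total_energy(terms, weights) for name, terms in models.items()}
best_model = min(scores, key=scores.get)
```

Selecting the lowest-scoring model, as in the last line, is the simplest way to use such a calculation to choose among output models; inspecting the individual terms in `models` shows which interactions drive the difference.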

Rebecca Alford is a Chemical and Biomolecular Engineering Ph.D. student in Jeff Gray’s lab at Johns Hopkins University. Her overall goal is to develop computational tools to investigate biology and disease at the molecular level. As an undergraduate, she created RosettaMP, a suite of tools to investigate membrane protein structures. Currently, she is developing computational models of cell membrane environments to improve energy functions for structure prediction and design. Rebecca is funded by a Hertz Foundation Fellowship and a National Science Foundation Graduate Research Fellowship.

Tuesday, May 23, 2017

Lecture 20: Dr. Osman Yogurtcu (Margaret Johnson Lab)

Membrane Recruitment Enables Weak Binding Endocytic Proteins to Form Stable Complexes
Membrane targeting and assembly of proteins is required for vesicle trafficking and receptor-mediated signaling, but it is not known to what extent the proteins recruited to these events may have evolved to exploit the 2D surface for assembly, versus pre-assembling in solution. We show that the phospholipid-targeting proteins of clathrin-mediated endocytosis dramatically enhance their effective binding strength and subsequent complex formation with one another after surface recruitment in yeast and metazoans. For proteins such as clathrin that do not directly bind lipids, the enhancement is still achieved by using three distinct binding sites to stabilize clathrin on peripheral membrane proteins at the surface. We derive simple formulas that quantify the degree of binding enhancement as a function of the protein and lipid concentrations, binding constants, and, critically, the ratio of volume to membrane surface area. Our results thus apply to any cell type or geometry, including in vitro systems and the targeting of internal organelles from the cytoplasm. With a sufficient concentration of lipid recruiters, such as PIP2, we show that the effective binding strength is enhanced by orders of magnitude and becomes, surprisingly, independent of the protein-protein binding strength. We quantify how this effect varies for proteins involved in later stages of vesicle trafficking and cell division in yeast. Coupled with detailed spatially and structurally resolved simulations, we have further measured the effect of membrane recruitment on the speed of assembly, and the influence of crowding and diffusion on this process.

Osman N. Yogurtcu began his career in science as an undergraduate at Koc University, Turkey, using computational polymer models to study protein-drug interactions. He received his M.Sc. in computational science and engineering from the same institution. During his Ph.D. in the Mechanical Engineering Department at Johns Hopkins University, his research focused on the mechanical properties of biofilaments, such as actin, that are crucial for cell viability. After graduation, he joined Prof. Margaret Johnson's lab in the Johns Hopkins biophysics department, where he worked on computational modeling of receptor-mediated endocytosis.

Tuesday, May 16, 2017

Lecture 19: Jennifer Lu (Steven Salzberg Lab)

KrakEN and Bracken.

Metagenomics is a rapidly growing field of study, driven in part by our ability to generate enormous amounts of DNA sequence rapidly and inexpensively. Since the human genome was first published in 2001 (The International Human Genome Sequencing Consortium, 2001; Venter et al., 2001), sequencing technology has become approximately one million times faster and cheaper, making it possible for individual labs to generate as much sequence data as the entire Human Genome Project in just a few days. In the context of metagenomics experiments, this makes it possible to sample a complex mixture of microbes by “shotgun” sequencing, which involves simply isolating DNA, preparing the DNA for sequencing, and sequencing the mixture as deeply as possible. Shotgun sequencing is relatively unbiased compared to targeted sequencing methods (Venter et al., 2004), including widely-used 16S ribosomal RNA sequencing, and it has the additional advantage that it captures any species with a DNA-based genome, including eukaryotes that lack a 16S rRNA gene. Because it is unbiased, shotgun sequencing can also be used to estimate the abundance of each taxon (species, genus, phylum, etc.) in the original sample, by counting the number of reads belonging to each taxon. 
Along with the technological advances, the number of finished and draft genomes has also grown exponentially over the past decade. At present, there are thousands of complete bacterial genomes, 20,000 draft bacterial genomes, and 80,000 full or partial virus genomes in the public GenBank archive (Benson et al., 2015). This rich resource of sequenced genomes now makes it possible to sequence uncultured, unprocessed microbial DNA from almost any environment, ranging from soil to the deep ocean to the human body, and use computational sequence comparisons to identify many of the formerly hidden species in these environments (Riesenfeld, Schloss & Handelsman, 2004). Several methods have appeared that can align a sequence “read” to a database of microbial genomes rapidly and accurately, but this step alone is not sufficient to estimate how much of a species is present. Complications arise when closely related species are present in the same sample, a situation that arises quite frequently, because many reads align equally well to more than one species. Resolving this ambiguity requires a separate abundance estimation algorithm. In their recent article [1], Jennifer Lu and Steven Salzberg from Johns Hopkins University and their colleagues describe a new method, Bracken, that goes beyond simply classifying individual reads and computes the abundance of species, genera, or other taxonomic categories from the DNA sequences collected in a metagenomics experiment.
Number of reads within the Mycobacterium genus as assigned by Kraken (blue), estimated by Bracken (purple) and compared to the true read counts (green)[1].
Bracken (Bayesian Reestimation of Abundance after Classification with KrakEN) uses the taxonomic assignments made by Kraken, a very fast read-level classifier, along with information about the genomes themselves to estimate abundance at the species level, the genus level, or above. The authors of the study demonstrate that Bracken can produce accurate species- and genus-level abundance estimates even when a sample contains multiple near-identical species.
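The re-estimation idea can be illustrated with a small Python sketch. This is a simplified illustration in the spirit of the description above, not Bracken's actual implementation: it takes reads that a classifier could only place at the genus level and redistributes them among species with Bayes' rule. The function name, the read counts, and the per-species `genus_fraction` values (the assumed fraction of each species' reads that land at the genus level) are hypothetical, and the sketch assumes at least one read was classified at the species level.

```python
def redistribute_genus_reads(species_reads, genus_reads, genus_fraction):
    """Reassign genus-level reads to species using Bayes' rule.

    species_reads: reads classified directly at each species
    genus_reads: number of reads classified only at the genus level
    genus_fraction: for each species, the assumed fraction of its reads that
        a classifier would place at the genus level, i.e. P(genus | species)
    """
    total_species_reads = sum(species_reads.values())
    # Prior P(species), estimated from the reads classified at the species level.
    prior = {s: n / total_species_reads for s, n in species_reads.items()}
    # Unnormalized posterior: P(genus | species) * P(species).
    unnormalized = {s: genus_fraction[s] * prior[s] for s in species_reads}
    evidence = sum(unnormalized.values())
    # Each species keeps its own reads and gains its share of the genus reads.
    return {
        s: species_reads[s] + genus_reads * unnormalized[s] / evidence
        for s in species_reads
    }


# Hypothetical counts for two closely related species in one genus.
species_reads = {"species_a": 300, "species_b": 100}
genus_fraction = {"species_a": 0.5, "species_b": 0.5}
estimates = redistribute_genus_reads(species_reads, 200, genus_fraction)
```

With these made-up inputs, the 200 genus-level reads are split 3:1 in favor of the species that already has more uniquely assigned reads, and the total read count is preserved.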

[1] Lu J, Breitwieser FP, Thielen P, Salzberg SL. Bracken: Estimating species abundance in metagenomics data. PeerJ Computer Science. 2017 Jan 2;3:e104.
[2] Wood DE, Salzberg SL. Kraken: ultrafast metagenomic sequence classification using exact alignments. Genome biology. 2014 Mar 3;15(3):R46.
Jennifer Lu is a Biomedical Engineering Ph.D. candidate in Professor Steven Salzberg's lab at the Center for Computational Biology at Johns Hopkins University. With a background in Chemical and Biomolecular Engineering and Computer Science, Jennifer began her Ph.D. with the intent of applying her knowledge of computer science to biomedical research. Currently, her research is focused on computational genomics and the use of next-generation sequencing for diagnosing bacterial, fungal, or viral infections relating to human health and disease. As part of this research, she develops and uses various computational methods for quantifying DNA sequence similarities and analyzing the genomes of human pathogens.

Tuesday, April 11, 2017

Lecture 18: Jeliazko Jeliazkov (Jeff Gray Lab)

Computational Modeling and Docking of Antibody Structures.

The vertebrate adaptive immune system is capable of promoting cells to degranulate or phagocytose nearly any foreign pathogen by producing immunoglobulin G (IgG) proteins (antibodies) that recognize a specific region (epitope) of a pathogenic molecule (antigen). The ability to bind diverse antigens requires a diverse population of antibodies, which is achieved through complex processes in bone marrow and lymphatic tissues, namely V(D)J recombination and somatic hypermutation. The diversity of antibodies is astonishing; the size of the theoretical naive antibody repertoire is estimated to be >1e13 in humans. In addition to their biological importance, antibodies are routinely used in biotechnology as probes and diagnostics, and dozens of antibodies have been approved as therapeutics.

Therapeutic monoclonal antibodies are a class of biopharmaceuticals that has benefited healthcare in fields ranging from oncology to immune and inflammatory disorders. Development of successful novel therapeutic antibodies requires an understanding of drug and disease mechanisms and the ability to stabilize, affinity mature, and humanize antibodies. Antibody structures can help overcome these challenges by providing atomic-level insights into structure–function relationships and the antibody–antigen interaction. However, experimental techniques for obtaining antibody structures, like X-ray crystallography and nuclear magnetic resonance, are laborious, time-consuming, and costly. Computational antibody structure prediction provides a fast and inexpensive route to obtain structures, including those that are not obtainable otherwise.

A schematic [1] of the modeling protocols (full flowcharts for RosettaAntibody and Rosetta SnugDock are available in the original publications).
In their recent Nature Protocols article, Jeliazko Jeliazkov and Jeff Gray from the Johns Hopkins Department of Chemical and Biomolecular Engineering and their international collaborators describe Rosetta-based computational protocols for predicting the 3D structure of an antibody from its sequence (RosettaAntibody) and then docking the antibody to protein antigens (SnugDock). Antibody modeling leverages canonical loop conformations to graft large segments from experimentally determined structures, as well as offering (i) energetic calculations to minimize loops, (ii) docking methodology to refine the VL–VH relative orientation and (iii) de novo prediction of the elusive complementarity determining region (CDR) H3 loop. To alleviate model uncertainty, antibody–antigen docking resamples CDR loop conformations and can use multiple models to represent an ensemble of conformations for the antibody, the antigen or both. These protocols can be run fully automated via the ROSIE web server or manually on a computer with user control of individual steps. For best results, the protocol requires roughly 1,000 CPU-hours for antibody modeling and 250 CPU-hours for antibody–antigen docking. Tasks can be completed in under a day by using public supercomputers. In the figure, the structure on the left shows the FV antibody domains predicted by homology modeling (heavy chain in dark blue with CDR H1 and H2 loops in orange and CDR H3 loop in red; light chain in yellow with its CDR loops in light blue). The structure on the right depicts an antibody–antigen structure output by docking (antigen in green).
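As a rough back-of-the-envelope check on the "under a day" figure, the short Python sketch below converts the CPU-hour estimates quoted above into a core count. It assumes perfect parallel scaling and ignores queue time and I/O, so the result is a lower bound rather than a planning number; the function name is hypothetical.

```python
import math

MODELING_CPU_HOURS = 1000  # antibody modeling, from the text above
DOCKING_CPU_HOURS = 250    # antibody-antigen docking, from the text above


def cores_needed(cpu_hours, wall_clock_hours):
    """Smallest whole number of cores that finishes within the wall-clock
    limit, assuming perfect parallel scaling."""
    return math.ceil(cpu_hours / wall_clock_hours)


total_cpu_hours = MODELING_CPU_HOURS + DOCKING_CPU_HOURS
cores_for_one_day = cores_needed(total_cpu_hours, 24)
```

Under these idealized assumptions, a few dozen cores are enough to finish both stages within 24 hours, which is consistent with the statement that a public supercomputer allocation suffices.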

[1] Weitzner BD, Jeliazkov JR, Lyskov S, Marze N, Kuroda D, Frick R, Adolf-Bryfogle J, Biswas N, Dunbrack Jr RL, Gray JJ. Modeling and docking of antibody structures with Rosetta. Nature Protocols. 2017 Feb 1;12(2):401-16.
[2] Sircar A, Kim ET, Gray JJ. RosettaAntibody: antibody variable region homology modeling server. Nucleic acids research. 2009 May 20:gkp387.

Jeliazko Jeliazkov received his B.S. in Physics from the University of Illinois at Urbana-Champaign. After graduating, he joined the Program in Molecular Biophysics, where he is pursuing a Ph.D. under the supervision of Prof. Jeffrey Gray. His research involves the computational prediction of protein–protein interactions, focusing in particular on antibody–antigen, disordered–ordered protein domain, and crystallographic (non-biological) protein–protein interactions.

Tuesday, April 4, 2017

Lecture 17: Prof. Sagar Khare

A New Weapon in the Fight Against Cancer and Viral Infections: Custom-Designed Enzymes
Arising out of natural selection, the structures of proteins (and their complexes with small molecules, nucleic acids, and other proteins) display exquisitely fine-tuned molecular recognition, which is critical for life to operate. Under selection conditions, accurate molecular recognition must be robust to random perturbations such as mutations. Yet, natural proteins are also evolvable — variation in a few amino acids can lead to profound changes in function, e.g. a new enzymatic activity can arise in an “old” enzyme. In other words, these molecular interactions have the fascinating property of being simultaneously functionally robust and plastic.
Fitness scoring function [1].
Characterizing the substrate specificity of protease enzymes is critical for illuminating the molecular basis of their diverse and complex roles in a wide array of biological processes. Rapid and accurate prediction of their extended substrate specificity would also aid in the design of custom proteases capable of selectively and controllably cleaving biotechnologically or therapeutically relevant targets. However, current in silico approaches for protease specificity prediction rely on, and are therefore limited by, machine learning of sequence patterns in known experimental data. In his talk, Prof. Sagar Khare described a general approach for predicting peptidase substrates de novo using protein structure modeling and biophysical evaluation of enzyme–substrate complexes. His research team constructed atomic-resolution models of thousands of candidate substrate–enzyme complexes for each of five model proteases belonging to the four major protease mechanistic classes (serine, cysteine, aspartyl, and metallo-proteases) and developed a discriminatory scoring function using enzyme design modules from Rosetta and AMBER's MMPBSA. They ranked putative substrates based on the calculated interaction energy with a modeled near-attack conformation of the enzyme active site. Their results show that the energetic patterns obtained from these simulations can be used to robustly rank and classify known cleaved and uncleaved peptides, and that these structural-energetic patterns have greater discriminatory power than purely sequence-based statistical inference. Combining sequence and energetic patterns using machine-learning algorithms further improves classification performance, and analysis of structural models provides physical insight into the structural basis for the observed specificities.
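The ranking and classification step described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not the scoring function from the paper: candidate peptides are sorted by a precomputed interaction energy, and discriminatory power is summarized as the area under the ROC curve, computed as the probability that a cleaved peptide scores lower than an uncleaved one. The peptide names, energies, and labels are hypothetical.

```python
def rank_by_energy(energies):
    """Return peptide names sorted from most to least favorable (lowest energy first)."""
    return sorted(energies, key=energies.get)


def roc_auc(energies, cleaved):
    """Probability that a randomly chosen cleaved peptide has a lower energy
    than a randomly chosen uncleaved one (ties count as one half)."""
    positives = [e for name, e in energies.items() if name in cleaved]
    negatives = [e for name, e in energies.items() if name not in cleaved]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p < n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical interaction energies (arbitrary units) and cleavage labels.
energies = {"pep1": -12.0, "pep2": -9.5, "pep3": -9.8, "pep4": -10.0, "pep5": -2.5}
cleaved = {"pep1", "pep2", "pep4"}

ranking = rank_by_energy(energies)
auc = roc_auc(energies, cleaved)
```

An AUC of 1.0 would mean every cleaved peptide scores better than every uncleaved one, while 0.5 corresponds to random ranking; comparing this number between an energy-based score and a sequence-based score is one simple way to express "greater discriminatory power".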
Summary of the CPG2 circular permutations [2].
In the second part of his talk, Prof. Khare discussed the spatio-temporal design of enzymes. Carboxypeptidase G2 (CPG2) is a Food and Drug Administration (FDA)-approved enzyme drug used to treat methotrexate (MTX) toxicity in cancer patients receiving MTX treatment. It has also been used in directed enzyme-prodrug chemotherapy, but this strategy has been hampered by off-site activation of the prodrug by the circulating enzyme. The development of a tumor protease-activatable CPG2, which could be achieved using a circular permutation of CPG2 fused to an inactivating ‘prodomain’, would aid in these applications. The research team reported the development of a protease accessibility-based screen to identify candidate sites for circular permutation in proximity of the CPG2 active site. The resulting six circular permutants showed similar expression, structure, thermal stability, and, in four cases, activity levels compared to the wild-type enzyme. They rationalized these results based on structural models of the permutants obtained using the Rosetta software. They also developed a cell growth-based selection system and demonstrated that, when fused to periplasm-directing signal peptides, one of the circular permutants confers MTX resistance in Escherichia coli with the same efficiency as the wild-type enzyme. As the permutants have similar properties to wild-type CPG2, these enzymes are promising starting points for the development of autoinhibited, protease-activatable zymogen forms of CPG2 for use in therapeutic contexts [2].
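For readers unfamiliar with circular permutation, the operation on a sequence can be sketched as follows in Python. The original N- and C-termini are joined (optionally through a linker) and new termini are opened at a chosen position. The function name, the toy sequence, and the linker are hypothetical and are not taken from the CPG2 work.

```python
def circular_permutation(sequence, new_start, linker=""):
    """Return a circularly permuted sequence.

    The residue at 0-based index new_start becomes the new N-terminus; the
    original C-terminus is joined to the original N-terminus through linker.
    """
    if not 0 < new_start < len(sequence):
        raise ValueError("new_start must fall strictly inside the sequence")
    return sequence[new_start:] + linker + sequence[:new_start]


# Hypothetical toy sequence and linker, not the CPG2 sequence.
toy = "MKTAYIAKQR"
permuted = circular_permutation(toy, 4, linker="GSG")
```

Each candidate permutation site corresponds to a different value of `new_start`, which is why a screen for suitable sites is needed before any permutant is built.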
Related References:
[1] Pethe, M. A., Rubenstein, A. B., & Khare, S. D. (2017). Large-scale Structure-based Prediction and Identification of Novel Protease Substrates using Computational Protein Design. Journal of Molecular Biology, 429(2), 220-236
[2] Yachnin, B. J., & Khare, S. D. (2017). Engineering carboxypeptidase G2 circular permutations for the design of an autoinhibited enzyme. Protein engineering, design & selection: PEDS, 1.
Sagar Khare is an Assistant Professor at Rutgers University. He teaches Chemistry and Chemical Biology and seeks to understand the structural determinants of enzymatic specificity and reactivity using a combination of computational protein design and experimental characterization. His research team's goal is to develop a quantitative and predictive understanding of specificity at protein-ligand and protein-peptide interfaces, which will inform various therapeutic and synthetic applications. Prior to working at Rutgers, Prof. Khare was a Postdoctoral Fellow at the University of Washington in Seattle, Washington. He was also a Software Engineer at Affymax Research Institute in Bangalore, India.