Tuesday, February 28, 2017

Lecture 15: Prof. Alex MacKerell

From Lemkul, J. A. et al., Chemical Reviews, 2016 [2].
Recent developments in polarizable force field research
Atomistic molecular dynamics (MD) simulations have become an integral component of the tool set used to examine biomolecular systems. Since the first MD simulation of a protein in 1977 [1], the time scales and sizes of computationally tractable simulations have grown by orders of magnitude. Systems may now be simulated on the microsecond or millisecond time scale and contain over a million atoms due to increased computing power and ever-improving algorithms for parallel and graphical processing unit (GPU)-based computing. Such capabilities also allow for more rigorous testing of the accuracy of the models used for such MD simulations.
Molecular mechanics force fields that explicitly account for induced polarization represent the next generation of physical models for molecular dynamics simulations. Several methods exist for modeling induced polarization, and in this seminar, Prof. MacKerell reviewed the classical Drude oscillator model, in which electronic degrees of freedom are modeled by charged particles attached to the nuclei of their core atoms by harmonic springs. He described the latest developments in Drude force field parametrization and application, primarily in the last 15 years. Emphasis was placed on the Drude-2013 polarizable force field for proteins, DNA, lipids, and carbohydrates. He discussed its parametrization protocol, development history, and recent simulations of biologically interesting systems, highlighting specific studies in which induced polarization plays a critical role in reproducing experimental observables and understanding physical behavior. As the Drude oscillator model is computationally tractable and available in a wide range of simulation packages, it is anticipated that use of these more complex physical models will lead to new and important discoveries of the physical forces driving a range of chemical and biological phenomena.
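The harmonic-spring picture described above can be made concrete with a minimal sketch. For a Drude particle of charge q_D tethered to its core atom by a spring of force constant k_D, a uniform field E displaces the particle until the spring force balances the electrostatic force, giving an induced dipole of (q_D^2 / k_D) E and hence an isotropic polarizability of q_D^2 / k_D. The function names and numbers below are illustrative assumptions in arbitrary consistent units; they are not Drude-2013 parameters.

```python
def drude_displacement(q_d, k_d, field):
    """Equilibrium displacement d where the spring force k_d * d balances q_d * field."""
    return q_d * field / k_d


def induced_dipole(q_d, k_d, field):
    """Induced dipole mu = q_d * d for a single Drude oscillator in a uniform field."""
    return q_d * drude_displacement(q_d, k_d, field)


def polarizability(q_d, k_d):
    """Isotropic polarizability alpha = q_d**2 / k_d implied by the harmonic spring."""
    return q_d ** 2 / k_d


if __name__ == "__main__":
    # Illustrative values only (assumed, not taken from any force field).
    q_d, k_d, field = -1.5, 1000.0, 2.0
    print(induced_dipole(q_d, k_d, field), polarizability(q_d, k_d) * field)
```

The two printed numbers agree, which is simply the statement that the induced dipole is linear in the field with slope alpha.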
Related References:
[1] McCammon, J. A., Gelin, B. R., & Karplus, M. (1977). Dynamics of folded proteins. Nature, 267(5612), 585.
[2] Lemkul, J. A., Huang, J., Roux, B., & MacKerell Jr, A. D. (2016). An empirical polarizable force field based on the classical Drude oscillator model: development history and recent applications. Chemical Reviews, 116(9), 4983-5013.
__________________________________________________________________________
Alex MacKerell received an A.S. in biology in 1979 from Gloucester County College, Sewell, NJ, followed by a B.S. in chemistry in 1981 from the University of Hawaii, Honolulu, HI, and a Ph.D. in biochemistry in 1985 from Rutgers University, New Brunswick, NJ. Subsequent training involved postdoctoral fellowships in the Department of Medical Biophysics, Karolinska Institutet, Stockholm, Sweden, in the area of experimental and theoretical biophysics and in the Department of Chemistry, Harvard University in theoretical chemistry. Following one year as a visiting professor at Swarthmore College, Swarthmore, PA, he assumed his faculty position in the School of Pharmacy, University of Maryland, Baltimore in 1993. MacKerell is currently the Grollman-Glick Professor of Pharmaceutical Sciences in the School of Pharmacy and the Director of the University of Maryland Computer-Aided Drug Design Center. MacKerell is also cofounder and Chief Scientific Officer of SilcsBio LLC. Research interests include the development of theoretical chemistry methods, with emphasis on empirical force field development; structure–function studies of proteins, carbohydrates, and nucleic acids; and the application of theoretical methods to drug discovery.

Thursday, December 1, 2016

Lecture 14: David Holland (Margaret Johnson Lab)

Finding the right partner in a crowded world

Mammalian cells contain ∼21,000 genes encoding upwards of 100,000 protein types. Macromolecules occupy 5-40% of cell volume, which makes it harder for cell proteins to locate functional partners and increases the risk of nonspecific (nonfunctional) interactions. Overexpressed proteins, in particular, saturate their functional partners, leaving the excess free to bind nonspecifically. Eukaryotic cells have evolved various methods to help proteins function reliably, including compartmentalization, allostery, and structural and chemical properties of binding sites. Cells may also optimize specificity through their binding networks: first through concentration balance, then through network structure. Using the Gillespie algorithm, David Holland and Dr. Margaret Johnson from Johns Hopkins Biophysics simulated specific and nonspecific binding in 500 networks of 90-200 nodes with varying topological properties under equal, random, and stoichiometrically balanced protein concentrations. Binding affinities for all specific and nonspecific interactions were determined using a coarse-grained protein sequence model. The research team found that concentration balance significantly reduced the number of nonspecific interactions, as did local topological patterns (motifs) that allowed a larger difference between specific and nonspecific binding affinities.
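For readers unfamiliar with the Gillespie algorithm mentioned above, the following is a minimal sketch for a single reversible binding reaction A + B <-> AB. It is not the network model used in the study: the function name, the rate constants, and the copy numbers are assumptions chosen for illustration, and a real network simulation would carry one propensity per specific or nonspecific pair.

```python
import math
import random


def gillespie_binding(n_a, n_b, n_ab, k_on, k_off, t_end, seed=0):
    """Stochastic trajectory of A + B <-> AB using the Gillespie direct method."""
    rng = random.Random(seed)
    t = 0.0
    trajectory = [(t, n_a, n_b, n_ab)]
    while True:
        a_bind = k_on * n_a * n_b      # propensity of A + B -> AB
        a_unbind = k_off * n_ab        # propensity of AB -> A + B
        a_total = a_bind + a_unbind
        if a_total == 0.0:
            break                      # nothing can react any more
        # Exponentially distributed waiting time to the next reaction.
        t += -math.log(1.0 - rng.random()) / a_total
        if t > t_end:
            break
        # Pick which reaction fires, with probability proportional to its propensity.
        if rng.random() * a_total < a_bind:
            n_a, n_b, n_ab = n_a - 1, n_b - 1, n_ab + 1
        else:
            n_a, n_b, n_ab = n_a + 1, n_b + 1, n_ab - 1
        trajectory.append((t, n_a, n_b, n_ab))
    return trajectory


if __name__ == "__main__":
    traj = gillespie_binding(50, 40, 0, k_on=0.01, k_off=0.1, t_end=10.0, seed=1)
    print("events:", len(traj) - 1, "final state:", traj[-1])
```

Because each event either forms or breaks exactly one complex, the totals A + AB and B + AB are conserved along the trajectory, which is a convenient sanity check.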
Schematic illustrating (left) the binding of three proteins, where matching features indicate binding interfaces, (center) the corresponding interface-interaction network with orange and green indicating shared and non-shared binding interfaces, respectively, and (right) the protein-protein interaction network (from [1]).
To test whether these motifs are selected for in real networks, they sampled possible interface-interaction networks (IINs) for the clathrin-mediated endocytosis PPI network in yeast, using a Monte Carlo algorithm and a fitness function to select for features that allow increased specificity. The fitness function reproduced several features of the real IIN, including fragmentation, a scale-free degree distribution, the presence of square motifs, and a low number of chain and triangle motifs. These features differed significantly from unbiased sampling. The results suggest that there is selective pressure on the evolution of real IINs to avoid nonfunctional interactions, and that non-optimal features of the real IIN (chains, larger network components) likely serve a functional purpose.

Relevant Publications:
[1] Johnson, ME, & G Hummer (2013) “Evolutionary Pressure on the Topology of Protein Interface Interaction Networks.” J Phys Chem B 117:13098-106.
[2] Johnson, ME, & G Hummer (2013) “Interface-resolved network of protein-protein interactions.” PLoS Comput Biol 9(5):e1003065.
[3] Keil, C, E Verschueren, J Yang, & L Serrano (2013) “Integration of protein abundance and structure data reveals competition in the ErbB signaling network.” Science Signaling 6(306):ra109.
[4] Johnson, ME, & G Hummer (2011) “Nonspecific binding limits the number of proteins in a cell and shapes their interaction networks.” PNAS 108(2):603-8.
[5] Zhang, J, S Maslov, & EI Shakhnovich (2008) “Constraints imposed by non-functional protein-protein interactions on gene expression and proteome size.” Mol Sys Biol 4:210.
[6] Vavouri, T, JI Semple, R Garcia-Verdugo, & B Lehner (2009) “Intrinsic protein disorder and interaction promiscuity are widely associated with dosage sensitivity.” Cell 138:198-208.
_________________________________________________________________________________

David Holland received his B.Sc. in Biomedical Engineering from the University of Virginia in 2011. After graduating, he joined the Department of Biomedical Engineering at Johns Hopkins. He is currently earning his Ph.D. under Margaret Johnson, studying the effects of protein abundance and protein-protein interaction network structure on protein mis-interactions. In his spare time, David practices taekwondo and teaches a course on network science.

Tuesday, November 8, 2016

Lecture 13: Athena Chen (Margaret Johnson Lab)

Spatial Cell Modeling Methods 
Due to the complexity of cells, it is useful to use computational tools to understand and predict mechanisms of biological processes. Ordinary differential equations (ODEs) and partial differential equations (PDEs) have proven to be successful at modeling various large systems. However, to understand certain biological processes such as bacterial cell division, high spatial resolution is necessary to discern the underlying mechanisms and interactions. ODEs and PDEs, along with the Gillespie algorithm for stochastic modeling, do not provide spatial resolution at a single-particle level; they are all concentration-based methods. MCell, the Free Propagator Reweighting Algorithm (FPR), and Smoldyn are algorithms that provide single-particle resolution, but may be computationally expensive and may not necessarily yield accurate protein dynamics.

Change in concentration of molecule A in an irreversible 3D reaction A+A -> 0. In this parameter set, the initial distances between molecules cause an increased initial reaction rate.
In her recent work, Athena Chen and her mentors, Dr. Margaret Johnson and Dr. Osman Yogurtcu of the Johns Hopkins Biophysics department, analyzed the strengths and limitations of ODEs, PDEs, Gillespie, MCell, Smoldyn, and FPR by establishing and running a set of benchmark tests. Though all modeling methods extracted the correct equilibrium concentrations for most of the tested reactions, the resulting protein dynamics were not necessarily correct. Simulations at high rates and large densities showed that, despite providing single-particle resolution, Smoldyn and MCell do not pick up single-particle effects where the distance between two molecules affects the probability of binding. Furthermore, the dynamics given by Smoldyn and MCell depend on the time step selected. FPR, on the other hand, correctly identified single-particle effects and yielded dynamics independent of the selected time step.
As an example of the effects of molecular geometry, diffusion, and stochasticity on protein dynamics, we examined a model for bacterial cell division. Using oscillations in protein concentrations, the cell can locate its center, ensuring identical offspring and the division of genetic information.
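As a point of reference for the benchmark reaction A + A -> 0, the concentration-based (well-mixed ODE) description can be written down in closed form. The sketch below assumes the convention dA/dt = -k A^2 (other texts absorb a factor of 2 into k), for which A(t) = A0 / (1 + k A0 t), and compares it with a simple forward Euler integration. The function names and parameter values are illustrative assumptions. By construction this description has no notion of inter-particle distance, so it cannot show the accelerated early-time kinetics that single-particle methods are benchmarked on.

```python
def analytic_a(a0, k, t):
    """Closed-form solution of dA/dt = -k * A**2 with A(0) = a0."""
    return a0 / (1.0 + k * a0 * t)


def euler_a(a0, k, t_end, dt):
    """Forward Euler integration of the same rate equation up to t_end."""
    a = a0
    n_steps = int(round(t_end / dt))
    for _ in range(n_steps):
        a += -k * a * a * dt   # mass-action loss term, well-mixed assumption
    return a


if __name__ == "__main__":
    a0, k, t_end = 1.0, 1.0, 1.0   # arbitrary illustrative units
    print(analytic_a(a0, k, t_end), euler_a(a0, k, t_end, dt=1e-4))
```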
Written by Athena Chen
Relevant Articles:
1-Yogurtcu, Osman N., and Margaret E. Johnson. "Theory of bi-molecular association dynamics in 2D for accurate model and experimental parameterization of binding rates." The Journal of chemical physics 143.8 (2015): 084117.
2-Andrews, Steven S., et al. "Detailed simulations of cell biology with Smoldyn 2.1." PLoS Comput Biol 6.3 (2010): e1000705.
4-Kerr, Rex A., et al. "Fast Monte Carlo simulation methods for biological reaction-diffusion systems in solution and on surfaces." SIAM journal on scientific computing 30.6 (2008): 3126-3149.
_________________________________________________________________________________
Athena Chen is currently working on her Bachelor of Arts in Biophysics and Bachelor of Science in Applied Mathematics and Statistics at Johns Hopkins University. As part of Dr. Margaret Johnson’s lab in the Department of Biophysics, she analyzes the accuracy of methods for modeling the dynamics of protein interactions. In her free time, she enjoys yoga and figure skating.

Tuesday, October 25, 2016

Lecture 12: Andrei Kucharavy (Rong Li Lab)

Cellular Adaptation Under Stress
Diagram for a cell population adaptation.
Whether they are products of human activity or arise naturally in social, economic, or biological settings, complex systems share the same properties. They are composed of a large number of individual components that have no straightforward relation to the properties of the whole and that often interact with one another in unexpected ways. Because of that, different instances of the same complex system are built from slightly different components. Such differences give rise to heterogeneity within a population, which in turn makes these systems significantly harder to study. In biology, this has been formalized as Fisher’s geometric model, which was introduced in the 1930s and has since been independently rediscovered in unrelated domains as algorithms for ergodic exploration of multidimensional spaces in search of the optimum of a value function.

Andrei Kucharavy and Dr. Rong Li from Johns Hopkins Medicine propose an enhancement of Fisher’s geometric model that explains a range of previously unexplained observations in biology. Mathematical analysis of their enhancement provides a set of rules applicable to the optimization of a large class of ergodic exploration algorithms.
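To give a feel for the classical Fisher's geometric model (not the enhancement proposed in the talk), the sketch below treats a phenotype as a point in an n-dimensional space with the optimum at the origin, uses a Gaussian fitness function, and applies random mutations of fixed size in random directions, keeping only those that increase fitness. The Gaussian fitness form, the fixed mutation size, and the strict "accept only if better" rule are simplifying assumptions made for illustration.

```python
import math
import random


def fitness(z):
    """Gaussian fitness that peaks at the optimum phenotype (the origin)."""
    return math.exp(-0.5 * sum(x * x for x in z))


def random_mutation(n_dim, size, rng):
    """Mutation vector of fixed length `size` in a uniformly random direction."""
    v = [rng.gauss(0.0, 1.0) for _ in range(n_dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [size * x / norm for x in v]


def adaptive_walk(z0, size, n_steps, seed=0):
    """Accept a mutation only if it increases fitness (strong-selection sketch)."""
    rng = random.Random(seed)
    z = list(z0)
    history = [fitness(z)]
    for _ in range(n_steps):
        step = random_mutation(len(z), size, rng)
        candidate = [a + b for a, b in zip(z, step)]
        if fitness(candidate) > fitness(z):
            z = candidate
        history.append(fitness(z))
    return z, history


if __name__ == "__main__":
    z_final, history = adaptive_walk([2.0] * 5, size=0.3, n_steps=500, seed=1)
    print("initial fitness:", history[0], "final fitness:", history[-1])
```

Read as an optimization routine, this is a random local search toward the maximum of a value function, which is the connection to exploration algorithms drawn in the summary above.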

Related Journal Articles:
1. H. A. Orr, The genetic theory of adaptation: a brief history. Nat. Rev. Genet. 6, 119–127 (2005) 
2. H. A. Orr, R. L. Unckless, The Population Genetics of Evolutionary Rescue. PLoS Genet. 10, e1004551 (2014). 
3. P. S. Pennings, Standing genetic variation and the evolution of drug resistance in HIV. PLoS Comput. Biol. 8 (2012).
_________________________________________________________________________________
Andrei Kucharavy received his Engineer’s Degree in Physics, Mathematics, Programming and Bioinformatics from Ecole Polytechnique, France in 2011. After graduation, he completed a master's degree in computational biology at Ecole Polytechnique Fédérale de Lausanne, Switzerland. Currently, Andrei is a Ph.D. student under the joint direction of Dr. Rong Li from Johns Hopkins and Dr. Gilles Fischer, exploring the molecular mechanisms of aneuploidy-enabled stress adaptation and the drug resistance it enables in yeast and cancer. He works on developing computational methods for the systematic analysis of molecular mechanisms underlying complex traits. He is particularly interested in the interface of biological network analysis and evolution theory.

Tuesday, October 11, 2016

Lecture 11: Dr. Ana Damjanović

Simulating with the right pH.
Schematic representation of a constant-pH simulation with the two-dimensional EDS-HREM method.
Solution pH is one of the most important environmental factors that affect the structure and dynamics of proteins. Almost all biologically relevant properties of proteins are affected by pH: stability, folding and assembly, interactions with ligands and other biological molecules, solubility, aggregation properties, and enzymatic activity. A change in pH may induce a change in the protonation state of ionizable groups, which in turn can cause structural changes in proteins. Structural changes triggered by protonation/deprotonation can be exploited for function, such as in the case of ATP synthase, bacteriorhodopsin, cytochrome c oxidase, or the photoactive yellow protein.

Superimposed 1 ns trajectories of snake cardiotoxin from the (A) 2D and (B) 1D constant-pH EDS-HREM simulations at pH = 2.
Ana Damjanovic from Johns Hopkins Biophysics and her colleagues from NIH present a new method for enhanced sampling for constant-pH simulations in explicit water based on a two-dimensional (2D) replica exchange scheme. The new method is a significant extension of a previously developed constant-pH simulation method, which is based on enveloping distribution sampling (EDS) coupled with a one-dimensional (1D) Hamiltonian exchange method (HREM). EDS constructs a hybrid Hamiltonian from multiple discrete end state Hamiltonians that, in this case, represent different protonation states of the system. The ruggedness and heights of the hybrid Hamiltonian’s energy barriers can be tuned by the smoothness parameter. Within the context of the 1D EDS-HREM method, exchanges are performed between replicas with different smoothness parameters, allowing frequent protonation-state transitions and sampling of conformations that are favored by the end-state Hamiltonians. In this work, the 1D method is extended to 2D with an additional dimension, external pH. Within the context of the 2D method (2D EDS-HREM), exchanges are performed on a lattice of Hamiltonians with different pH conditions and smoothness parameters. The research team demonstrates that both the 1D and 2D methods exactly reproduce the thermodynamic properties of the semi-grand canonical (SGC) ensemble of a system at a given pH. They have tested the new 2D method on aspartic acid, glutamic acid, lysine, a four-residue peptide (sequence KAAE), and snake cardiotoxin. In all cases, the 2D method converges faster and without loss of precision; the only limitation is a loss of flexibility in how CPU time is employed. The results for snake cardiotoxin demonstrate that the 2D method enhances protonation-state transitions, samples a wider conformational space with the same amount of computational resources, and converges significantly faster overall than the original 1D method.
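The hybrid Hamiltonian described above can be illustrated with the form commonly given in the general EDS literature, H_R(x) = -(1 / (beta * s)) * ln( sum_i exp(-beta * s * (H_i(x) - E_i)) ), where s is the smoothness parameter, beta is the inverse temperature, and E_i are per-state energy offsets. The sketch below evaluates this expression for a list of end-state energies at a single configuration. The function name and the numbers are assumptions, and how the external pH enters the offsets in the constant-pH method is not spelled out in this summary, so it is left out.

```python
import math


def eds_reference_energy(end_state_energies, offsets, beta, s):
    """Hybrid (reference) energy built from discrete end-state energies.

    Uses a log-sum-exp shift so that large energies do not overflow exp().
    """
    exponents = [-beta * s * (h - e) for h, e in zip(end_state_energies, offsets)]
    m = max(exponents)
    log_sum = m + math.log(sum(math.exp(x - m) for x in exponents))
    return -log_sum / (beta * s)


if __name__ == "__main__":
    energies = [5.0, 1.0]          # illustrative end-state energies (assumed units)
    offsets = [0.0, 0.0]
    for s in (1.0, 0.3, 0.05):     # smaller s gives a smoother hybrid surface
        print(s, eds_reference_energy(energies, offsets, beta=1.0, s=s))
```

With a single end state the expression reduces to H_1 - E_1, and for large s it approaches the lowest offset-corrected end-state energy, which is why lowering s flattens the barriers between protonation states.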
_________________________________________________________________________________
Ana Damjanovic received her B.Sc. in Physics from Belgrade University (1995), and a Ph.D. in Physics from University of Illinois at Urbana-Champaign (2001). For her Ph.D. she worked with Prof. Klaus Schulten on QM description of energy transfer in light-harvesting complexes in various photosynthetic organisms. She did postdocs at UC Berkeley and Johns Hopkins University. At JHU she worked on understanding hydration and conformational changes in proteins through molecular dynamics simulations. She is presently an Associate Research Scientist and Lecturer in the Dept. of Biophysics at Johns Hopkins University. Her current research interests are in the area of development and applications of molecular dynamics simulations at constant pH.

Tuesday, September 27, 2016

Lecture 10: Collin Tokheim (Rachel Karchin Lab)

Hotspots in Cancer.
Missense mutations are perhaps the most difficult mutation type to interpret in human cancers. Truncating loss-of-function mutations and structural rearrangements generate major changes in the protein product of a gene, but a single missense mutation yields only a small change in protein chemistry. The impact of a missense mutation on protein function, cellular behavior, cancer etiology, and progression may be negligible or profound, for reasons that are not yet well understood. Missense mutations are frequent in most cancer types, accounting for approximately 85% of the somatic mutations observed in solid human tumors, and the cancer genomics community has prioritized the task of identifying important missense mutations discovered in sequencing studies. Whole exome sequencing (WES) studies of cancer have created new opportunities to better understand the importance of missense mutations. This enormous collection of data now allows detection of patterns with a statistical power that was unheard of a few years ago.

Comparison of hotspot detection in the TSG FBXW7 in 1D and 3D [1].
In their recent work, Collin Tokheim from Johns Hopkins Biomedical Engineering and his colleagues used The Cancer Genome Atlas mutation data and identified 3D clusters of cancer mutations ("hotspot regions") at amino-acid-residue resolution in 91 genes, of which 56 are known cancer-associated genes. The hotspot regions identified by their method are smaller than a protein domain or protein–protein interface and in many cases can be linked precisely with functional features such as binding sites, active sites, and sites of experimentally characterized mutations. The hotspot regions are shown to be biologically relevant to cancer, and they discovered that there are characteristic differences between regions in the two types of driver genes, oncogenes and tumor suppressor genes (TSG). These differences include region size, mutational diversity, evolutionary conservation, and amino acid residue physicochemistry. For the first time, the research team quantifies why the great majority of well-known hotspot regions occur in oncogenes. Because hotspot regions in TSGs are larger and more heterogeneous than those in oncogenes, they are more difficult to detect using protein sequence alone and are likely to be underreported. The results indicate that protein structure–based 3D mutation clustering increases power to find hotspot regions, particularly in TSGs.
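The intuition behind 3D clustering can be shown with a toy sketch: residues that are far apart in sequence may sit next to each other in the folded structure, so counting mutations within a spatial radius can reveal clusters that a 1D window would miss. The code below is not the authors' method and includes no statistical test; the function name, coordinates, mutation counts, and the 10-unit radius are all hypothetical.

```python
import math


def neighbor_mutation_counts(coords, mutation_counts, radius):
    """For each residue, sum the mutations on all residues within `radius` (itself included)."""
    totals = []
    for ci in coords:
        total = 0
        for j, cj in enumerate(coords):
            if math.dist(ci, cj) <= radius:
                total += mutation_counts[j]
        totals.append(total)
    return totals


if __name__ == "__main__":
    # Hypothetical residue positions: the first and last residues are distant in
    # sequence but adjacent in space, while the middle residue is far from both.
    coords = [(0.0, 0.0, 0.0), (30.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    mutation_counts = [2, 0, 3]
    print(neighbor_mutation_counts(coords, mutation_counts, radius=10.0))
```

In a real analysis the observed neighborhood counts would then be compared against a null model of randomly placed mutations to decide which regions are significant.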

Good reads on the subject:
[1] Tokheim, Collin, et al. "Exome-scale discovery of hotspot mutation regions in human cancer using 3D protein structure." Cancer research (2016): canres-3190.
[2] Perdigão, Nelson, et al. "Unexpected features of the dark proteome." Proceedings of the National Academy of Sciences 112.52 (2015): 15898-15903.
[3] Kamburov, Atanas, et al. "Comprehensive assessment of cancer missense mutation clustering in protein structures." Proceedings of the National Academy of Sciences 112.40 (2015): E5486-E5495.
_________________________________________________________________________________
Collin Tokheim received his bachelor's degree in biomedical engineering at the University of Iowa. After a short stint working in an RNA genomics lab, he came to Hopkins to work on a Ph.D. in Biomedical Engineering. He currently works in Rachel Karchin's lab doing computational research applied to cancer genomics. Collin's current research focuses on how protein structures, used at exome scale, can inform which missense mutations are likely drivers of cancer.




Tuesday, September 13, 2016

Lecture 9: Max C. Klein (Elijah Roberts Lab)

An epigenetic landscape (from Epigenetics Unraveled).
Analyzing rare events in biology. 
Changes in cellular phenotype can often be tightly correlated with changes in biochemistry. The macroscopic (phenotypical) and the microscopic (biochemical) descriptions of these phenotype switching events can be unified through the theoretical framework of epigenetic landscapes (EL). In approximate terms, the EL of a cellular system is a map that describes every possible system state (expressed in terms of a count of relevant DNA, proteins, etc.) and the probability of the system being in each of these states. An EL can be used to completely describe the dynamics of its associated system, allowing for a deep level of understanding and, potentially, control of the system. It is possible to take the list of chemical species and reactions involved in a cellular process and, using theory and modeling, generate the corresponding EL. Unfortunately, the computer time required to generate an EL using the standard stochastic simulation method, called Brute Force Sampling (BFS), makes it difficult to generate the EL of even a relatively simple multi-state system.
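The idea of a landscape as "states plus their probabilities" can be sketched directly from brute-force sampling output. The code below assumes a list of system states recorded at equal time intervals from a long stochastic trajectory, estimates the probability of each state from its frequency, and reports -ln P as a landscape height (a common convention, assumed here rather than taken from the talk). The function name and the toy data are illustrative.

```python
import math
from collections import Counter


def landscape_from_samples(states):
    """Estimate state probabilities and -ln(P) heights from equally spaced samples."""
    counts = Counter(states)
    n = len(states)
    probs = {state: c / n for state, c in counts.items()}
    heights = {state: -math.log(p) for state, p in probs.items()}
    return probs, heights


if __name__ == "__main__":
    # Toy samples: each state is a tuple of species counts (hypothetical data).
    samples = [(10, 0)] * 70 + [(5, 5)] * 2 + [(0, 10)] * 28
    probs, heights = landscape_from_samples(samples)
    for state in sorted(probs):
        print(state, probs[state], heights[state])
```

The rarely visited intermediate state in the toy data also shows why brute-force sampling is expensive: the barrier regions that control switching are precisely the states that are sampled least often.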

The Forward Flux Sampling (FFS) method, which belongs to a larger family of Enhanced Sampling methods, was previously developed by ten Wolde and coworkers to address this computational limitation. Although FFS can speed up EL simulations by multiple orders of magnitude, there is a commensurate increase in the complexity of simulation setup. Specifically, there are a number of novel free parameters that the simulation user must specify. Each of these parameters has a significant and indirect influence on the precision of an FFS simulation’s final results. It is, therefore, desirable to develop a version of the FFS algorithm in which the relationship of parameter choice to error is made explicit.
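In forward flux sampling, the rate constant for switching from state A to state B is estimated as the flux of trajectories leaving A through the first interface, multiplied by the conditional probabilities of reaching each subsequent interface from the previous one. The sketch below only performs this final bookkeeping step, given per-interface success and trial counts; the function name and the numbers are assumptions, and the number of interfaces and trials per interface are examples of the free parameters mentioned above.

```python
def ffs_rate_constant(initial_flux, successes, trials):
    """Rate estimate k_AB = flux through first interface * product of P(next | current)."""
    rate = initial_flux
    for n_success, n_trial in zip(successes, trials):
        rate *= n_success / n_trial   # estimated probability of reaching the next interface
    return rate


if __name__ == "__main__":
    # Hypothetical run: flux of 2.0 crossings per unit time, then two interface stages.
    print(ffs_rate_constant(2.0, successes=[50, 25], trials=[100, 100]))
```

Because the estimate is a product of independently sampled probabilities, the statistical error of each stage propagates into the final rate, which is why the choice of interfaces and trial counts affects precision.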

Max Klein and Dr. Elijah Roberts from Johns Hopkins Biophysics have designed and implemented a supercomputer-compatible version of the Forward Flux Enhanced Sampling method for stochastic simulation of biochemical networks. Their version greatly simplifies parameter choice and simulation setup relative to the base Forward Flux algorithm. A user of the program needs only to specify the desired level of precision error. The program will determine a set of parameters to use that will achieve this target error level while optimizing the simulation run time. The Forward Flux implementation has been tested and verified in terms of its ability to calculate switching rate constants and epigenetic landscapes. A preview build of the code is currently available here.

Related Articles:
[1] Dickson, Alex, and Aaron R. Dinner. "Enhanced sampling of nonequilibrium steady states." Annual review of physical chemistry 61 (2010): 441-459.
[2] Allen, Rosalind J., Daan Frenkel, and Pieter Rein ten Wolde. "Forward flux sampling-type schemes for simulating rare events: Efficiency analysis." The Journal of chemical physics 124.19 (2006): 194111.
[3] Becker, Nils B., and Pieter Rein ten Wolde. "Rare switching events in non-stationary systems." The Journal of chemical physics 136.17 (2012): 174119.
[4] Gardner, Timothy S., Charles R. Cantor, and James J. Collins. "Construction of a genetic toggle switch in Escherichia coli." Nature 403.6767 (2000): 339-342.
_________________________________________________________________________________
Max C. Klein received his B.A. in Physics from Reed College, Oregon in 2013. After graduation, he joined the Program in Molecular Biophysics at Johns Hopkins University. Currently, with Dr. Elijah Roberts from the Biophysics Department, Max is developing new methods to simulate decision making in cells using a hybrid multi-CPU/multi-GPU computational architecture.