
Research Highlights

The microorganisms living in the human gut, collectively known as the human gut microbiome, have been shown to have an enormous effect on human health. In the January 2012 issue of the Proceedings of the National Academy of Sciences, researchers in the Borenstein Lab introduced a novel computational framework for studying this microbial community, analyzing it as a single biological system rather than a set of separate species. To do so, the researchers reconstructed complex metabolic networks of the interactions among microbial enzymes in a microbiome, and examined how the topology of these networks differs across human hosts. They found significant topological differences in networks from the microbiomes of obese hosts and hosts with inflammatory bowel disease, compared to healthy human hosts. Their findings suggest that obesity and inflammatory bowel disease may be related to changes in the way the microbiome interacts with the gut environment, rather than disruption of core metabolic processes.
Associated Faculty:
Interactions among genes and their gene products comprise a regulatory network. The inference of such networks from large-scale genomics data poses a computational challenge. In the November 2011 issue of PNAS, Yeung and colleagues presented a methodology to construct gene regulatory networks from time series expression data, integrating various types of external biological knowledge available from public repositories. They generated microarray data measuring time-dependent gene-expression levels in 95 genotyped yeast segregants subjected to a drug perturbation. Their algorithm can generate feedback loops, and they showed that the inferred network recovered both existing and novel regulatory relationships. In addition, they generated independent microarray data on selected deletion mutants to prospectively test network predictions.
Associated Faculty:
Recent advances in sequencing technology have enabled highly sensitive measurement of whole transcriptomes. In the September 2011 issue of mBio, the Katze Lab reported the first use of next-generation sequencing to measure changes in total mRNA from an HIV-1-infected T cell line. This approach identified novel viral RNA splice variants and was able to detect small changes in mRNA abundance, including the suppression of cellular genes associated with T cell activation early after infection. In addition, HIV-1 infection resulted in the altered expression of multiple forms of noncoding RNAs, providing insights into the regulation of microRNAs. These findings give new insights into the panoply of changes that occur in host cells infected with HIV-1 and provide the groundwork for using new sequencing technologies in future studies investigating the host response to virus infection.
Associated Faculty:
In the April 2011 issue of Biochemistry, the Qian Lab proposed a new perspective for protein science that considers both intrinsically disordered proteins and functional promiscuity as consequences of macromolecular ensemble dynamics. In terms of conformational ensembles, they showed that certain enzymes can have multiple substrates, often in substrate mixtures, and can form multiple products from a single substrate. The idea is discussed in the context of enzymes in detoxification.
Associated Faculty:
The STEP HIV-1 vaccine trial was concluded in 2009 and involved 3,000 volunteers from 34 sites worldwide. Overall, the vaccine failed to protect people from infection. In the February 2011 issue of Nature Medicine, the Mullins Lab reported on the results of a large viral genome sequencing and analysis effort to determine whether the vaccine nevertheless provided a benefit that may have been overshadowed by the negative effects of the formulation. They found that the vaccine did indeed exert a partially protective effect by blocking strains of HIV antigenically similar to the vaccine. Their results were the first to demonstrate a positive impact of a vaccine on HIV-1 infection. Although the selective pressures detected were not sufficient to prevent infection, they do provide a new benchmark to evaluate the impact of forthcoming vaccines and provide impetus for designing novel vaccine inserts to block a larger segment of the HIV population from infecting vaccine recipients. A parallel study was conducted on breakthrough infections that occurred in the most recent HIV vaccine trial called RV144, one that did have a small protective effect, with the gene sequencing results due to be released in Spring 2012.
Associated Faculty:
Kinetic studies of self-regulating gene networks in single cells have shown that slow binding and unbinding of transcription factors and DNA can lead to stochastic gene expression, with implications for isogenetic variations. A resonance effect, that is, optimal transition rates between two isogenetic states of a cell, has been suggested as a function of the rate of binding and unbinding. In the February 2011 issue of The Journal of Chemical Physics, the Qian Lab studied this phenomenon and developed a 5-state kinetic model for the resonance effect. Rapid transitions between different isogenetic states may be a mechanism for cell fate control and cell differentiation.
Associated Faculty:
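As a rough illustration of how slow promoter switching produces stochastic gene expression, the following minimal sketch runs a Gillespie simulation of a generic two-state gene. It is not the Qian Lab's 5-state model and it omits self-regulation. The function name, the rate parameters (k_on, k_off, k_syn, k_deg), and their values are all assumptions chosen for illustration.

```python
import random

def gillespie_telegraph(k_on, k_off, k_syn, k_deg, t_end, seed=0):
    """Simulate a two-state gene (hypothetical rates, not the 5-state model).

    The promoter flips between OFF and ON; protein is made only while ON
    and is degraded at a rate proportional to its copy number.
    Returns a list of (time, gene_on, protein_count) tuples.
    """
    rng = random.Random(seed)
    t, gene_on, protein = 0.0, False, 0
    trajectory = [(t, gene_on, protein)]
    while True:
        rates = [
            k_off if gene_on else k_on,  # promoter switches state
            k_syn if gene_on else 0.0,   # synthesis only while ON
            k_deg * protein,             # first-order degradation
        ]
        total = sum(rates)
        # Waiting time to the next event is exponential with rate `total`.
        t += rng.expovariate(total)
        if t >= t_end:
            break
        # Choose which event fires, with probability proportional to its rate.
        pick = rng.random() * total
        if pick < rates[0]:
            gene_on = not gene_on
        elif pick < rates[0] + rates[1]:
            protein += 1
        else:
            protein -= 1
        trajectory.append((t, gene_on, protein))
    return trajectory

# Slow switching (0.1) relative to synthesis (5.0) and degradation (1.0).
trajectory = gillespie_telegraph(0.1, 0.1, 5.0, 1.0, t_end=50.0)
```

With switching rates much smaller than the synthesis and degradation rates, as assumed here, the protein count tends to alternate between long low and high episodes rather than settling at a single average level.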
The molecular motor myosin drives the contraction of muscle, but doesn't just produce force along the axis of shortening. Models of muscle contraction have primarily treated myosin as a simple spring oriented parallel to its direction of movement. This assumption does not allow prediction of the relationship between the forces produced and the spacing between contractile filaments or of radial forces, perpendicular to the axis of shortening, that are observed during muscle contraction. In the December 2010 issue of PLoS Computational Biology, C. Dave Williams, Mike Regnier, and Tom Daniel developed an alternative model, still computationally efficient enough to be used in simulations, that incorporates springs that are both extensional and torsional (angle-dependent, like those found in a watch). Their model captures much of the spacing-dependent kinetics and forces that are missing from single-spring models.
Associated Faculty:
Transcribed regions in the human genome differ from adjacent intergenic regions in transposable element density, crossover rates, and asymmetric substitution and sequence composition patterns. In the November 2010 issue of Genome Research, members of the Green Lab tested whether these differences reflect selection or are instead a byproduct of germline transcription. Crossover rate shows a strong negative correlation with gene expression in meiotic tissues, suggesting that crossover is inhibited by transcription. Strand-biased composition (G+T content) and A → G versus T → C substitution asymmetry are both positively correlated with germline gene expression. They found no evidence for a strand bias in allele frequency data, implying that the substitution asymmetry reflects a mutation rather than a fixation bias. The density of transposable elements is positively correlated with germline expression, suggesting that such elements preferentially insert into regions that are actively transcribed. For each of the features examined, their analyses favor a nonselective explanation for the observed trends and point to the role of germline gene expression in shaping the mammalian genome.
Associated Faculty:
The two most common diseases in humans that cause irreversible damage, destructive periodontitis and dental caries, involve disturbance of the interface between protein and hydroxyapatite surfaces. In the October 2010 issue of Pattern Recognition Letters, the Samudrala Lab created a variety of bioinformatic protein sequence analysis methods designed to identify functional mechanisms of mineralization proteins. They applied these methods to define meta-functional signatures that describe the form and function of amelogenin, a protein highly conserved across vertebrates but with unique sequence and function with respect to other proteins in nature. They predicted physiologic mechanisms of amelogenin that were prospectively verified in the laboratory, including nucleation of calcium phosphate seed crystals, maturation, and physiologic binding of enamel hydroxyapatite.
Associated Faculty:
Ram Samudrala received the 2010 NIH Director’s Pioneer Award for a novel and unique computational multitarget fragment-based docking protocol. Its purpose is to implement a comprehensive drug discovery pipeline with higher efficiency, lower cost, and increased success rates compared to current approaches. The Samudrala Lab is applying this protocol to evaluate how all approved drugs bind to all known disease target protein structures. The top predictions are verified in the laboratory and clinic to repurpose drugs approved for other indications as new therapeutics, particularly for underserved diseases.
Associated Faculty:
The Qian Lab works on biochemical reaction systems in small volumes such as cells, in which many important proteins have small copy numbers and reaction kinetics are stochastic and discrete. In the September 2010 issue of the Journal of Physical Chemistry B, they proposed a new mechanism for regulating protein functions called Cyclic Conformational Modification. In contrast to the classical ligand-induced conformational change, Cyclic Conformational Modification only requires a catalytic amount of ligand as an activator, with an associated energy expenditure.
Associated Faculty:
Predicting a stable three-dimensional structure from any given amino acid sequence using first physical principles remains a formidable computational challenge. Aiming to recruit human visual and strategic powers to the task, the Baker Lab and the Department of Computer Science & Engineering created an online multiplayer game called Foldit, in which thousands of nonscientists compete and collaborate to produce a rich set of search strategies for protein structure refinement. This work, published in the August 2010 issue of Nature, shows that even computationally complex scientific problems can be effectively crowd-sourced using interactive multiplayer games.
Associated Faculty:
In the June 2010 issue of Nature Biotechnology, Xiaoyu Chen and Martin Tompa assessed four expert ENCODE alignments, each of which aligns 28 vertebrate sequences on 554,000,000 base-pairs of total input sequence. They reported a disturbing lack of agreement among the alignments not only in species distant from human, but even in mouse, a well-studied model organism. Overall, the assessment showed that the Pecan alignment method produced the most accurate or nearly most accurate alignment in all species and genomic location categories, while still providing coverage comparable to or better than that of the other alignments.
Associated Faculty:
The Noble Lab's three-dimensional model of the yeast genome was published in the May 2010 issue of Nature. They developed a method to globally capture intra- and inter-chromosomal interactions, and applied it to generate a map at kilobase resolution of the haploid genome of Saccharomyces cerevisiae. The map recapitulates known features of genome organization and identifies new features.
Associated Faculty:
Genomic DNA is tightly packaged by histone proteins to form nucleosomes, which must be mobilized to allow regulatory proteins to bind and polymerases to replicate and transcribe DNA. These active processes can cause the histones to turn over. A study published in the May 2010 issue of Science by the Henikoff Lab describes a new method for measuring histone turnover genome-wide using a metabolic labeling approach. Termed "CATCH-IT", for Covalent Attachment of Tags to Capture Histones and Identify Turnover, this method was used to show that histones at epigenetic regulatory elements turn over much faster than a single cell cycle. This implies that histone marks do not themselves transmit epigenetic information, but rather are continually replenished during the cell cycle.
Associated Faculty:
Associated Publications:
- R. Deal, J. Henikoff, S. Henikoff, "Genome-wide kinetics of nucleosome turnover determined by metabolic labeling of histones", Science, vol. 328 (2010) 1161-4. Pubmed 20508129.
- M. Muers, "Epigenomics: Catching nucleosomes in action", Nature Reviews Genetics, vol. 11 (2010) 457. Pubmed 20531368.
The Dynameomics Database was featured in the April 2010 issue of Structure. The Dynameomics Project is the Daggett Lab's initiative to simulate the folding pathway of all known protein folds. This article was also highlighted by Nature Methods as an important tool for studying protein dynamics and the development of disease.
Associated Faculty:
One of the best-studied transcription factors, MyoD, is a master regulator of myogenesis, the process by which precursor cells differentiate into skeletal muscle cells. Indeed, expressing this one protein in a fibroblast (skin cell) causes extensive remodeling of the cell, resulting in a facsimile of a muscle cell. In the April 2010 issue of Developmental Cell, Walter L. Ruzzo and collaborators at the Fred Hutchinson Cancer Research Center assayed genome-wide MyoD binding in differentiating and mature muscle cells. They used "ChIP-seq" technology -- chromatin immunoprecipitation followed by next-generation sequencing. As expected, MyoD is found near start sites of the several hundred genes that are known to be driven by this powerful regulator. Surprisingly, however, these sites comprise only a tiny fraction of the sites bound by MyoD, many thousands of which occur far from any gene. These data suggest that MyoD is unexpectedly multifunctional, initiating changes in chromatin state that result in broad reprogramming of the cell, in addition to its canonical role in activating genes immediately adjacent to some of its target sites.
Associated Faculty:
Associated Publications:
- M. Biggin, "MyoD, a lesson in widespread DNA binding", Dev. Cell, vol. 18 (2010) 505-6. Pubmed 20412764.
- Y. Cao, Z. Yao, D. Sarkar, M. Lawrence, G. Sanchez, M. Parker, K. MacQuarrie, J. Davison, M. Morgan, W. Ruzzo, R. Gentleman, S. Tapscott, "Genome-wide MyoD binding in skeletal muscle cells: a potential for broad cellular reprogramming", Dev. Cell, vol. 18 (2010) 662-74. Pubmed 20412780.
In chemistry and biology, the formalism of mass action equations is commonly used to model and understand the time evolution of complex molecular systems. In a typical application one is given a system of interacting molecules and then attempts to build a chemical reaction model based on that system. In the March 2010 issue of the Proceedings of the National Academy of Sciences, the Seelig Lab took the opposite approach and asked, "Given a formal chemical reaction network with a desired dynamical behavior, can one find molecules that implement this behavior?" They demonstrated that this is in fact possible, and proposed a specific DNA-based implementation for arbitrary chemical reaction networks. In their approach, the formalism of chemical reaction networks becomes a prescriptive "programming language" rather than a descriptive modeling language.
Associated Faculty:
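To make the mass action formalism in the highlight above concrete, here is a minimal sketch of the conventional, descriptive direction: given a formal reaction network, integrate its mass action equations forward in time. The example network (A + B <-> C), the rate constants, and the function names are assumptions for illustration only. The sketch says nothing about the DNA-based implementation proposed by the Seelig Lab.

```python
from math import prod

# Each reaction is (reactants, products, rate constant); the dicts map a
# species name to its stoichiometric count. Hypothetical example network:
# A + B -> C with k = 2.0, and C -> A + B with k = 1.0.
REACTIONS = [
    ({"A": 1, "B": 1}, {"C": 1}, 2.0),
    ({"C": 1}, {"A": 1, "B": 1}, 1.0),
]

def mass_action_rates(conc, reactions):
    """Return the time derivative of each concentration under mass action."""
    deriv = {species: 0.0 for species in conc}
    for reactants, products, k in reactions:
        # Flux = k times the product of reactant concentrations, each raised
        # to its stoichiometric count.
        flux = k * prod(conc[s] ** n for s, n in reactants.items())
        for s, n in reactants.items():
            deriv[s] -= n * flux
        for s, n in products.items():
            deriv[s] += n * flux
    return deriv

def simulate(conc0, reactions, dt=1e-3, steps=20000):
    """Integrate the mass action equations with forward Euler steps."""
    conc = dict(conc0)
    for _ in range(steps):
        deriv = mass_action_rates(conc, reactions)
        for s in conc:
            conc[s] += dt * deriv[s]
    return conc

final = simulate({"A": 1.0, "B": 1.0, "C": 0.0}, REACTIONS)
```

For this assumed network the concentrations relax toward the equilibrium where 2[A][B] = [C], which with the starting values above is [A] = [B] = [C] = 0.5. Forward Euler is used only for brevity; a stiff network would call for a proper ODE solver.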
Nuclear Magnetic Resonance (NMR) is a powerful method for determining protein structures in the physiologically relevant solution state. For larger proteins, however, the NMR spectrum gets extremely crowded, rendering it virtually impossible to assign the spectrum. In the February 2010 issue of Science, the Baker Lab demonstrated that sparse, backbone-only NMR data can be combined with the Rosetta protein structure prediction program to computationally determine NMR structures of larger proteins, up to 200 amino acids.
Associated Faculty:
The size, shape, and behavior of the modern domesticated dog have been sculpted by artificial selection for at least 14,000 years. The genetic substrates of selective breeding, however, remain largely unknown. In the January 2010 issue of the Proceedings of the National Academy of Sciences, the Akey Lab reported the first genome-wide scan for artificial selection in 275 dogs from 10 phenotypically diverse breeds that were genotyped for over 21,000 autosomal SNPs. In total, 155 regions that possess strong signatures of recent selection were identified. These regions contain candidate genes for phenotypes that vary most conspicuously among breeds, including size, coat color and texture, behavior, skeletal morphology, and physiology. In particular, their work demonstrated a significant association between the HAS2 gene and skin wrinkling in the Shar-Pei, and provided evidence that regulatory evolution has played a prominent role in the phenotypic diversification of modern dog breeds.
Associated Faculty:
Hepatitis C virus is notorious for its ability to stay under the immune system radar and cause chronic and often debilitating liver disease. In the January 2010 issue of PLoS Pathogens, the Katze Lab showed how the virus disrupts the normal metabolic functioning of the liver. This was the first study to use sensitive mass spectrometry and computational methods to analyze changes in the protein and lipid profiles of liver cells during hepatitis C virus infection. The approach led to the identification of mitochondrial fatty acid enzymes as key cellular proteins targeted by hepatitis C virus. These findings raise the prospect that these enzymes may be useful as diagnostic indicators or possibly even as drug targets.
Associated Faculty:
Recent work implicates the human and mouse PRDM9 genes in specifying meiotic recombination hotspots. PRDM9 encodes a protein with histone H3(K4) trimethyltransferase activity, a KRAB domain, and a DNA-binding domain consisting of multiple tandem C2H2 zinc finger domains. In the December 2009 issue of PLoS One, the Thomas Lab analyzed human coding polymorphism and interspecies evolutionary changes in PRDM9. They found that the zinc finger domains are evolving very rapidly, with compelling evidence of positive selection in primates. Positively selected amino acids are predominantly those known to make nucleotide-specific contacts in C2H2 zinc fingers. These results suggest that PRDM9 is subject to recurrent selection to change DNA-binding specificity. The human PRDM9 protein is highly polymorphic in its zinc finger domains and nearly all polymorphisms affect the same nucleotide contact residues that are subject to positive selection. Zinc finger domain nucleotide sequences are strongly homogenized within species, indicating that interfinger recombination contributes to their evolution. These findings fit with the fact that recombination hotspots move around rapidly during mammalian evolution, with PRDM9 acting as either a driver or a follower of this movement.
Associated Faculty:
Studying genomic patterns of human population structure provides important insights into human evolutionary history and the relationship among populations, and it has significant practical implications for disease-gene mapping. In the May 2009 issue of The American Journal of Human Genetics, the Akey Lab used principal component analysis to study intra-continental population structure in humans. Their methodology was applied to a dataset of 650,000 SNPs genotyped in 944 unrelated individuals from 52 populations. They demonstrated that substantial information about population structure is contained in the lower-ranked principal components. In total, they identified 18 significant principal components, some of which distinguish individual populations. They also estimated the set of all SNPs significantly correlated with each of the most informative axes of variation. These polymorphisms, unlike ancestry-informative markers, constitute a much larger set of loci that drive genomic signatures of population structure.
Associated Faculty:
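The principal component analysis described in the highlight above can be sketched on toy data. The example below simulates genotypes (0, 1, or 2 copies of an allele) for two hypothetical populations with different allele frequencies, then extracts the first principal component by power iteration. The population sizes, allele frequencies, and function names are all assumptions. This is not the Akey Lab's pipeline, and it recovers only the top component, whereas the study stresses the information carried by lower-ranked ones.

```python
import random

def simulate_genotypes(n_per_pop, n_snps, freq_a, freq_b, seed=0):
    """Toy data: two populations that differ only in allele frequency."""
    rng = random.Random(seed)

    def individual(freq):
        # A genotype is the allele count over two chromosome copies (0, 1, 2).
        return [(rng.random() < freq) + (rng.random() < freq)
                for _ in range(n_snps)]

    return ([individual(freq_a) for _ in range(n_per_pop)]
            + [individual(freq_b) for _ in range(n_per_pop)])

def first_pc_scores(genotypes, iterations=200, seed=1):
    """Return a unit vector proportional to each individual's PC1 score."""
    n, m = len(genotypes), len(genotypes[0])
    # Center each SNP (column) on its mean allele count.
    means = [sum(row[j] for row in genotypes) / n for j in range(m)]
    centered = [[row[j] - means[j] for j in range(m)] for row in genotypes]
    # Individual-by-individual Gram matrix; its leading eigenvector gives
    # the PC1 scores up to a scale factor.
    gram = [[sum(a * b for a, b in zip(r1, r2)) for r2 in centered]
            for r1 in centered]
    rng = random.Random(seed)
    vec = [rng.random() - 0.5 for _ in range(n)]
    for _ in range(iterations):
        # Power iteration: multiply by the Gram matrix, then renormalize.
        vec = [sum(g * v for g, v in zip(row, vec)) for row in gram]
        norm = sum(v * v for v in vec) ** 0.5
        vec = [v / norm for v in vec]
    return vec

genotypes = simulate_genotypes(n_per_pop=10, n_snps=200, freq_a=0.1, freq_b=0.9)
scores = first_pc_scores(genotypes)
```

In this toy setting the first ten scores share one sign and the last ten share the other, so the first component separates the two simulated populations. Finding further components would require deflation or a full eigendecomposition, which a numerical library would normally provide.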
The Samudrala Lab displayed one of the very best performances for metal ion binding site prediction of over 100 groups in the 2008 international blinded experiment, the Critical Assessment of Techniques for Protein Structure Prediction (CASP8). Shown here is the accurate prediction of Listeria monocytogenes histidinol phosphatase HisK (red) binding zinc and iron ions at the catalytic site.
Associated Faculty:
When measurable characters in a group of species are studied, they must be considered using the phylogeny of the species. We can test statistically whether a pair of characters have evolved independently, or have evolved in a correlated manner. In ongoing work described by Joe Felsenstein in his 2008 Sir Julian Huxley Lecture to the Systematics Association, he and Fred Bookstein extend this to (x,y) landmarks collected automatically from digitized forms. One of the objectives is to find ways of placing fossil forms in a phylogeny which is constructed for present-day forms from molecular data.
Associated Faculty: