Results 1 - 20 of 121
1.
Cell ; 186(5): 923-939.e14, 2023 03 02.
Article in English | MEDLINE | ID: mdl-36868214

ABSTRACT

We conduct high coverage (>30×) whole-genome sequencing of 180 individuals from 12 indigenous African populations. We identify millions of unreported variants, many predicted to be functionally important. We observe that the ancestors of southern African San and central African rainforest hunter-gatherers (RHG) diverged from other populations >200 kya and maintained a large effective population size. We observe evidence for ancient population structure in Africa and for multiple introgression events from "ghost" populations with highly diverged genetic lineages. Although currently geographically isolated, we observe evidence for gene flow between eastern and southern Khoesan-speaking hunter-gatherer populations lasting until ∼12 kya. We identify signatures of local adaptation for traits related to skin color, immune response, height, and metabolic processes. We identify a positively selected variant in the lightly pigmented San that influences pigmentation in vitro by regulating the enhancer activity and gene expression of PDPK1.


Subject(s)
Acclimatization, Skin Pigmentation, Humans, Whole Genome Sequencing, Population Density, Africa, 3-Phosphoinositide-Dependent Protein Kinases
2.
Nat Methods ; 20(8): 1232-1236, 2023 08.
Article in English | MEDLINE | ID: mdl-37386188

ABSTRACT

Phylogenetic models of molecular evolution are central to numerous biological applications spanning diverse timescales, from hundreds of millions of years involving orthologous proteins to just tens of days relating to single cells within an organism. A fundamental problem in these applications is estimating model parameters, for which maximum likelihood estimation is typically employed. Unfortunately, maximum likelihood estimation is a computationally expensive task, in some cases prohibitively so. To address this challenge, we here introduce CherryML, a broadly applicable method that achieves several orders of magnitude speedup by using a quantized composite likelihood over cherries in the trees. The massive speedup offered by our method should enable researchers to consider more complex and biologically realistic models than previously possible. Here we demonstrate CherryML's utility by applying it to estimate a general 400 × 400 rate matrix for residue-residue coevolution at contact sites in three-dimensional protein structures; we estimate that using current state-of-the-art methods such as the expectation-maximization algorithm for the same task would take >100,000 times longer.


Subject(s)
Molecular Evolution, Proteins, Phylogeny, Likelihood Functions, Algorithms, Genetic Models
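
The CherryML abstract above builds its composite likelihood over "cherries" in the trees. As a minimal sketch of that one notion, assuming the standard definition of a cherry (two leaves that share a parent), the snippet below lists the cherries of a small rooted binary tree. The nested-tuple tree encoding and the function name `find_cherries` are illustrative assumptions, and the snippet does not reproduce CherryML's likelihood or its quantization of branch lengths, which the abstract does not spell out.

```python
# Illustrative sketch only: a rooted binary tree is a nested 2-tuple,
# and a leaf is a string label. This encoding is an assumption.

def find_cherries(tree):
    """Return every cherry (pair of sibling leaves) in a rooted binary tree."""
    cherries = []

    def visit(node):
        if isinstance(node, str):
            return  # a single leaf cannot contain a cherry
        left, right = node
        if isinstance(left, str) and isinstance(right, str):
            cherries.append((left, right))  # both children are leaves
        else:
            visit(left)
            visit(right)

    visit(tree)
    return cherries


if __name__ == "__main__":
    example_tree = ((("A", "B"), "C"), ("D", "E"))
    print(find_cherries(example_tree))
```

Each returned pair could then contribute one pairwise term to a composite likelihood; how the pairs are weighted or discretized is left open here.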
3.
Proc Natl Acad Sci U S A ; 120(44): e2311219120, 2023 Oct 31.
Article in English | MEDLINE | ID: mdl-37883436

ABSTRACT

The expanding catalog of genome-wide association studies (GWAS) provides biological insights across a variety of species, but identifying the causal variants behind these associations remains a significant challenge. Experimental validation is both labor-intensive and costly, highlighting the need for accurate, scalable computational methods to predict the effects of genetic variants across the entire genome. Inspired by recent progress in natural language processing, unsupervised pretraining on large protein sequence databases has proven successful in extracting complex information related to proteins. These models showcase their ability to learn variant effects in coding regions using an unsupervised approach. Expanding on this idea, we here introduce the Genomic Pre-trained Network (GPN), a model designed to learn genome-wide variant effects through unsupervised pretraining on genomic DNA sequences. Our model also successfully learns gene structure and DNA motifs without any supervision. To demonstrate its utility, we train GPN on unaligned reference genomes of Arabidopsis thaliana and seven related species within the Brassicales order and evaluate its ability to predict the functional impact of genetic variants in A. thaliana by utilizing allele frequencies from the 1001 Genomes Project and a comprehensive database of GWAS. Notably, GPN outperforms predictors based on popular conservation scores such as phyloP and phastCons. Our predictions for A. thaliana can be visualized as sequence logos in the UCSC Genome Browser (https://genome.ucsc.edu/s/gbenegas/gpn-arabidopsis). We provide code (https://github.com/songlab-cal/gpn) to train GPN for any given species using its DNA sequence alone, enabling unsupervised prediction of variant effects across the entire genome.


Subject(s)
Arabidopsis, Arabidopsis/genetics, Genome-Wide Association Study, Genomics, Genome, DNA
4.
PLoS Pathog ; 19(3): e1011230, 2023 03.
Article in English | MEDLINE | ID: mdl-36940219

ABSTRACT

In Brazil, Leishmania braziliensis is the main causative agent of the neglected tropical disease, cutaneous leishmaniasis (CL). CL presents on a spectrum of disease severity with a high rate of treatment failure. Yet the parasite factors that contribute to disease presentation and treatment outcome are not well understood, in part because successfully isolating and culturing parasites from patient lesions remains a major technical challenge. Here we describe the development of selective whole genome amplification (SWGA) for Leishmania and show that this method enables culture-independent analysis of parasite genomes obtained directly from primary patient skin samples, allowing us to circumvent artifacts associated with adaptation to culture. We show that SWGA can be applied to multiple Leishmania species residing in different host species, suggesting that this method is broadly useful in both experimental infection models and clinical studies. SWGA carried out directly on skin biopsies collected from patients in Corte de Pedra, Bahia, Brazil, showed extensive genomic diversity. Finally, as a proof-of-concept, we demonstrated that SWGA data can be integrated with published whole genome data from cultured parasite isolates to identify variants unique to specific geographic regions in Brazil where treatment failure rates are known to be high. SWGA provides a relatively simple method to generate Leishmania genomes directly from patient samples, unlocking the potential to link parasite genetics with host clinical phenotypes.


Subject(s)
Protozoan Genome, Cutaneous Leishmaniasis, Parasitology, Skin, Protozoan Genome/genetics, Humans, Population Genetics, Skin/parasitology, Brazil, Cutaneous Leishmaniasis/parasitology, Parasitology/methods, Leishmania braziliensis/genetics
5.
Proc Natl Acad Sci U S A ; 119(46): e2210247119, 2022 Nov 16.
Article in English | MEDLINE | ID: mdl-36343260

ABSTRACT

Genetic variants in SLC22A5, encoding the membrane carnitine transporter OCTN2, cause the rare metabolic disorder Carnitine Transporter Deficiency (CTD). CTD is potentially lethal but actionable if detected early, with confirmatory diagnosis involving sequencing of SLC22A5. Interpretation of missense variants of uncertain significance (VUSs) is a major challenge. In this study, we sought to characterize the largest set to date (n = 150) of OCTN2 variants identified in diverse ancestral populations, with the goals of furthering our understanding of the mechanisms leading to OCTN2 loss-of-function (LOF) and creating a protein-specific variant effect prediction model for OCTN2 function. Uptake assays with 14C-carnitine revealed that 105 variants (70%) significantly reduced transport of carnitine compared to wild-type OCTN2, and 37 variants (25%) severely reduced function to less than 20%. All ancestral populations harbored LOF variants; 62% of green fluorescent protein (GFP)-tagged variants impaired OCTN2 localization to the plasma membrane of human embryonic kidney (HEK293T) cells, and subcellular localization significantly associated with function, revealing a major LOF mechanism of interest for CTD. With these data, we trained a model to classify variants as functional (>20% function) or LOF (<20% function). Our model outperformed existing state-of-the-art methods as evaluated by multiple performance metrics, with mean area under the receiver operating characteristic curve (AUROC) of 0.895 ± 0.025. In summary, in this study we generated a rich dataset of OCTN2 variant function and localization, revealed important disease-causing mechanisms, and improved upon machine learning-based prediction of OCTN2 variant function to aid in variant interpretation in the diagnosis and treatment of CTD.


Subject(s)
Carnitine, Organic Cation Transport Proteins, Humans, Solute Carrier Family 22 Member 5/genetics, Solute Carrier Family 22 Member 5/metabolism, Organic Cation Transport Proteins/genetics, Organic Cation Transport Proteins/metabolism, HEK293 Cells, Carnitine/genetics, Carnitine/metabolism, Genomics
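
The OCTN2 abstract above labels variants as functional (>20% function) or LOF (<20% function) and reports performance as AUROC. The sketch below illustrates both ideas in plain Python. The function names and example numbers are hypothetical, the handling of a variant at exactly 20% is an assumption (the abstract leaves it undefined), and the AUROC routine is the generic rank-based definition rather than the authors' evaluation code.

```python
def classify(percent_function, threshold=20.0):
    """Label a variant 'LOF' below the threshold, otherwise 'functional'.

    Assumption: a variant at exactly the threshold is treated as functional.
    """
    return "LOF" if percent_function < threshold else "functional"


def auroc(scores, labels):
    """AUROC as the probability that a random positive outscores a random
    negative, with ties counted as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical measurements, for illustration only.
    print(classify(5.0), classify(80.0))
    print(auroc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```

The pairwise formulation is quadratic in the number of variants, which is acceptable for a set of about 150 variants as described in the abstract.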
6.
Genome Res ; 31(10): 1794-1806, 2021 10.
Article in English | MEDLINE | ID: mdl-34301624

ABSTRACT

Direct comparison of bulk gene expression profiles is complicated by distinct cell type mixtures in each sample that obscure whether observed differences are actually caused by changes in the expression levels themselves or are simply a result of differing cell type compositions. Single-cell technology has made it possible to measure gene expression in individual cells, achieving higher resolution at the expense of increased noise. If carefully incorporated, such single-cell data can be used to deconvolve bulk samples to yield accurate estimates of the true cell type proportions, thus enabling one to disentangle the effects of differential expression and cell type mixtures. Here, we propose a generative model and a likelihood-based inference method that uses asymptotic statistical theory and a novel optimization procedure to perform deconvolution of bulk RNA-seq data to produce accurate cell type proportion estimates. We show the effectiveness of our method, called RNA-Sieve, across a diverse array of scenarios involving real data and discuss extensions made uniquely possible by our probabilistic framework, including a demonstration of well-calibrated confidence intervals.


Subject(s)
RNA, Transcriptome, Gene Expression Profiling/methods, Likelihood Functions, RNA-Seq, RNA Sequence Analysis, Single-Cell Analysis/methods
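
To make the deconvolution problem in the RNA-Sieve abstract above concrete, here is a deliberately simplified sketch: a bulk profile is modeled as a mixture of two reference cell-type profiles, and the mixing fraction is recovered by ordinary least squares. This is a swapped-in toy technique, not RNA-Sieve's generative model or likelihood-based inference; the function name, the two-cell-type restriction, and the example profiles are all assumptions.

```python
def estimate_proportion(bulk, ref_a, ref_b):
    """Least-squares estimate of the fraction p of cell type A, assuming
    bulk = p * ref_a + (1 - p) * ref_b gene by gene (noise-free toy model)."""
    diff = [a - b for a, b in zip(ref_a, ref_b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("reference profiles are identical")
    p = sum((x - b) * d for x, b, d in zip(bulk, ref_b, diff)) / denom
    return min(1.0, max(0.0, p))  # clip to a valid proportion


if __name__ == "__main__":
    # Hypothetical mean expression of three genes in two cell types.
    ref_a = [10.0, 0.0, 5.0]
    ref_b = [0.0, 8.0, 5.0]
    bulk = [0.25 * a + 0.75 * b for a, b in zip(ref_a, ref_b)]
    print(estimate_proportion(bulk, ref_a, ref_b))
```

A point estimate like this carries no uncertainty; the abstract's emphasis on well-calibrated confidence intervals is exactly what a least-squares toy cannot provide.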
7.
Genome Res ; 31(2): 239-250, 2021 Feb.
Article in English | MEDLINE | ID: mdl-33361114

ABSTRACT

Biosynthetic gene clusters (BGCs) are operonic sets of microbial genes that synthesize specialized metabolites with diverse functions, including siderophores and antibiotics, which often require export to the extracellular environment. For this reason, genes for transport across cellular membranes are essential for the production of specialized metabolites and are often genomically colocalized with BGCs. Here, we conducted a comprehensive computational analysis of transporters associated with characterized BGCs. In addition to known exporters, in BGCs we found many importer-specific transmembrane domains that co-occur with substrate binding proteins possibly for uptake of siderophores or metabolic precursors. Machine learning models using transporter gene frequencies were predictive of known siderophore activity, molecular weights, and a measure of lipophilicity (log P) for corresponding BGC-synthesized metabolites. Transporter genes associated with BGCs were often equally or more predictive of metabolite features than biosynthetic genes. Given the importance of siderophores as pathogenicity factors, we used transporters specific for siderophore BGCs to identify both known and uncharacterized siderophore-like BGCs in genomes from metagenomes from the infant and adult gut microbiome. We find that 23% of microbial genomes from premature infant guts have siderophore-like BGCs, but only 3% of those assembled from adult gut microbiomes do. Although siderophore-like BGCs from the infant gut are predominantly associated with Enterobacteriaceae and Staphylococcus, siderophore-like BGCs can be identified from taxa in the adult gut microbiome that have rarely been recognized for siderophore production. Taken together, these results show that consideration of BGC-associated transporter genes can inform predictions of specialized metabolite structure and function.

8.
Nat Methods ; 18(8): 903-911, 2021 08.
Article in English | MEDLINE | ID: mdl-34354295

ABSTRACT

The development of DNA-barcoded antibodies to tag cell surface molecules has enabled the use of droplet-based single-cell sequencing (dsc-seq) to profile protein abundances from thousands of cells simultaneously. As compared to flow and mass cytometry, the high per cell cost of current dsc-seq-based workflows precludes their use in clinical applications and large-scale pooled screens. Here, we introduce SCITO-seq, a workflow that uses splint oligonucleotides (oligos) to enable combinatorially indexed dsc-seq of DNA-barcoded antibodies from over 10^5 cells per reaction using commercial microfluidics. By encoding sample barcodes into splint oligos, we demonstrate that multiplexed SCITO-seq produces reproducible estimates of cellular composition and surface protein expression comparable to those from mass cytometry. We further demonstrate two modified splint oligo designs that extend SCITO-seq to achieve compatibility with commercial DNA-barcoded antibodies and simultaneous expression profiling of the transcriptome and surface proteins from the same cell. These results demonstrate SCITO-seq as a flexible and ultra-high-throughput platform for sequencing-based single-cell protein and multimodal profiling.


Subject(s)
Flow Cytometry/methods, High-Throughput Nucleotide Sequencing/methods, Microfluidics/methods, RNA Sequence Analysis/methods, Single-Cell Analysis/methods, Transcriptome, Case-Control Studies, Gene Expression Profiling, Humans
9.
PLoS Comput Biol ; 19(4): e1010137, 2023 04.
Article in English | MEDLINE | ID: mdl-37068103

ABSTRACT

Addressing many of the major outstanding questions in the fields of microbial evolution and pathogenesis will require analyses of populations of microbial genomes. Although population genomic studies provide the analytical resolution to investigate evolutionary and mechanistic processes at fine spatial and temporal scales (precisely the scales at which these processes occur), microbial population genomic research is currently hindered by the practicalities of obtaining sufficient quantities of the relatively pure microbial genomic DNA necessary for next-generation sequencing. Here we present swga2.0, an optimized and parallelized pipeline to design selective whole genome amplification (SWGA) primer sets. Unlike previous methods, swga2.0 incorporates active and machine learning methods to evaluate the amplification efficacy of individual primers and primer sets. Additionally, swga2.0 optimizes primer set search and evaluation strategies, including parallelization at each stage of the pipeline, to dramatically decrease program runtime. Here we describe the swga2.0 pipeline, including the empirical data used to identify primer and primer set characteristics that improve amplification performance. Additionally, we evaluate the novel swga2.0 pipeline by designing primer sets that successfully amplify Prevotella melaninogenica, an important component of the lung microbiome in cystic fibrosis patients, from samples dominated by human DNA.


Subject(s)
Genome, Genomics, Humans, DNA Sequence Analysis/methods, DNA
10.
Nature ; 553(7687): 203-207, 2018 01 11.
Article in English | MEDLINE | ID: mdl-29323294

ABSTRACT

Despite broad agreement that the Americas were initially populated via Beringia, the land bridge that connected far northeast Asia with northwestern North America during the Pleistocene epoch, when and how the peopling of the Americas occurred remains unresolved. Analyses of human remains from Late Pleistocene Alaska are important to resolving the timing and dispersal of these populations. The remains of two infants were recovered at Upward Sun River (USR), and have been dated to around 11.5 thousand years ago (ka). Here, by sequencing the USR1 genome to an average coverage of approximately 17 times, we show that USR1 is most closely related to Native Americans, but falls basal to all previously sequenced contemporary and ancient Native Americans. As such, USR1 represents a distinct Ancient Beringian population. Using demographic modelling, we infer that the Ancient Beringian population and ancestors of other Native Americans descended from a single founding population that initially split from East Asians around 36 ± 1.5 ka, with gene flow persisting until around 25 ± 1.1 ka. Gene flow from ancient north Eurasians into all Native Americans took place 25-20 ka, with Ancient Beringians branching off around 22-18.1 ka. Our findings support a long-term genetic structure in ancestral Native Americans, consistent with the Beringian 'standstill model'. We show that the basal northern and southern Native American branches, to which all other Native Americans belong, diverged around 17.5-14.6 ka, and that this probably occurred south of the North American ice sheets. We also show that after 11.5 ka, some of the northern Native American populations received gene flow from a Siberian population most closely related to Koryaks, but not Palaeo-Eskimos, Inuits or Kets, and that Native American gene flow into Inuits was through northern and not southern Native American groups. Our findings further suggest that the far-northern North American presence of northern Native Americans is from a back migration that replaced or absorbed the initial founding population of Ancient Beringians.


Subject(s)
Founder Effect, Human Genome/genetics, North American Indians/genetics, Genetic Models, Phylogeny, Alaska, Eastern Asia/ethnology, Gene Flow, Population Genetics, Ancient History, Human Migration, Humans, Infant, Rivers, Siberia/ethnology, Time Factors
11.
Nature ; 538(7624): 201-206, 2016 Oct 13.
Article in English | MEDLINE | ID: mdl-27654912

ABSTRACT

Here we report the Simons Genome Diversity Project data set: high quality genomes from 300 individuals from 142 diverse populations. These genomes include at least 5.8 million base pairs that are not present in the human reference genome. Our analysis reveals key features of the landscape of human genome variation, including that the rate of accumulation of mutations has accelerated by about 5% in non-Africans compared to Africans since divergence. We show that the ancestors of some pairs of present-day human populations were substantially separated by 100,000 years ago, well before the archaeologically attested onset of behavioural modernity. We also demonstrate that indigenous Australians, New Guineans and Andamanese do not derive substantial ancestry from an early dispersal of modern humans; instead, their modern human ancestry is consistent with coming from the same source as that of other non-Africans.


Subject(s)
Genetic Variation/genetics, Human Genome/genetics, Genomics, Mutation Rate, Phylogeny, Racial Groups/genetics, Animals, Australia, Black People/genetics, Datasets as Topic, Population Genetics, Ancient History, Human Migration/history, Humans, Native Hawaiian or Other Pacific Islander/genetics, Neanderthals/genetics, New Guinea, DNA Sequence Analysis, Species Specificity, Time Factors
12.
Proc Natl Acad Sci U S A ; 116(34): 17115-17120, 2019 08 20.
Article in English | MEDLINE | ID: mdl-31387977

ABSTRACT

There has been much interest in analyzing genome-scale DNA sequence data to infer population histories, but inference methods developed hitherto are limited in model complexity and computational scalability. Here we present an efficient, flexible statistical method, diCal2, that can use whole-genome sequence data from multiple populations to infer complex demographic models involving population size changes, population splits, admixture, and migration. Applying our method to data from Australian, East Asian, European, and Papuan populations, we find that the population ancestral to Australians and Papuans started separating from East Asians and Europeans about 100,000 y ago, and that the separation of East Asians and Europeans started about 50,000 y ago, with pervasive gene flow between all pairs of populations.


Subject(s)
Gene Flow, Genome-Wide Association Study, Human Migration, Genetic Models, Native Hawaiian or Other Pacific Islander/genetics, Whole Genome Sequencing, Australia, Population Genetics, Ancient History, Humans, Native Hawaiian or Other Pacific Islander/history
13.
Biophys J ; 120(8): 1309-1313, 2021 04 20.
Article in English | MEDLINE | ID: mdl-33582139

ABSTRACT

The totally asymmetric simple exclusion process (TASEP), which describes the stochastic dynamics of interacting particles on a lattice, has been actively studied over the past several decades and applied to model important biological transport processes. Here, we present a software package, called EGGTART (Extensive GUI gives TASEP-realization in Real Time), which quantifies and visualizes the dynamics associated with a generalized version of the TASEP with an extended particle size and heterogeneous jump rates. This computational tool is based on analytic formulas obtained from deriving and solving the hydrodynamic limit of the process. It allows an immediate quantification of the particle density, flux, and phase diagram, as a function of a few key parameters associated with the system, which would be difficult to achieve via conventional stochastic simulations. Our software should therefore be of interest to biophysicists studying general transport processes and can in particular be used in the context of gene expression to model and quantify mRNA translation of different coding sequences.


Subject(s)
Protein Biosynthesis, Biological Transport, Biophysics
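
The EGGTART abstract above refers to analytic formulas for particle density and flux in a TASEP with extended particles. As a small worked illustration, the sketch below evaluates the standard mean-field current-density relation for particles of size ell hopping at a single uniform rate, J = rate * rho * (1 - ell * rho) / (1 - (ell - 1) * rho), together with its maximum. This covers only the homogeneous case; EGGTART's treatment of heterogeneous jump rates is not reproduced, and the function names are assumptions.

```python
import math


def flux(rho, ell=1, rate=1.0):
    """Mean-field current of a homogeneous TASEP with particles of size ell
    at particle density rho (particles per site, 0 <= rho <= 1/ell)."""
    if not 0.0 <= rho <= 1.0 / ell:
        raise ValueError("density must lie in [0, 1/ell]")
    return rate * rho * (1.0 - ell * rho) / (1.0 - (ell - 1) * rho)


def max_flux(ell=1, rate=1.0):
    """Density maximizing the current, and the maximal current itself."""
    root = math.sqrt(ell)
    rho_star = 1.0 / (root * (1.0 + root))
    return rho_star, rate / (1.0 + root) ** 2


if __name__ == "__main__":
    print(flux(0.5))        # classic TASEP (ell = 1) at half filling
    print(max_flux(ell=4))  # extended particles covering four sites
```

For ell = 1 the relation reduces to the familiar J = rho * (1 - rho); larger particles lower both the optimal density and the maximal current.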
14.
J Biol Chem ; 295(33): 11435-11454, 2020 08 14.
Article in English | MEDLINE | ID: mdl-32518159

ABSTRACT

mRNA levels are determined by the balance between mRNA synthesis and decay. Protein factors that mediate both processes, including the 5'-3' exonuclease Xrn1, are responsible for a cross-talk between the two processes that buffers steady-state mRNA levels. However, the roles of these proteins in transcription remain elusive and controversial. Applying native elongating transcript sequencing (NET-seq) to yeast cells, we show that Xrn1 functions mainly as a transcriptional activator and that its disruption manifests as a reduction of RNA polymerase II (Pol II) occupancy downstream of transcription start sites. By combining our sequencing data and mathematical modeling of transcription, we found that Xrn1 modulates transcription initiation and elongation of its target genes. Furthermore, Pol II occupancy markedly increased near cleavage and polyadenylation sites in xrn1Δ cells, whereas its activity decreased, a characteristic feature of backtracked Pol II. We also provide indirect evidence that Xrn1 is involved in transcription termination downstream of polyadenylation sites. We noted that two additional decay factors, Dhh1 and Lsm1, seem to function similarly to Xrn1 in transcription, perhaps as a complex, and that the decay factors Ccr4 and Rpb4 also perturb transcription in other ways. Interestingly, the decay factors could differentiate between SAGA- and TFIID-dominated promoters. These two classes of genes responded differently to XRN1 deletion in mRNA synthesis and were differentially regulated by mRNA decay pathways, raising the possibility that one distinction between these two gene classes lies in the mechanisms that balance mRNA synthesis with mRNA decay.


Subject(s)
Exoribonucleases/metabolism, Fungal Gene Expression Regulation, RNA Polymerase II/metabolism, Saccharomyces cerevisiae Proteins/metabolism, Saccharomyces cerevisiae/metabolism, Exoribonucleases/genetics, Gene Deletion, RNA Polymerase II/genetics, RNA Stability, Messenger RNA/genetics, Messenger RNA/metabolism, Saccharomyces cerevisiae/genetics, Saccharomyces cerevisiae Proteins/genetics, Transcription Initiation Site, Transcriptional Activation
15.
Theor Popul Biol ; 141: 34-43, 2021 10.
Article in English | MEDLINE | ID: mdl-34186053

ABSTRACT

The ancestral recombination graph (ARG) contains the full genealogical information of the sample, and many population genetic inference problems can be solved using inferred or sampled ARGs. In particular, the waiting distance between tree changes along the genome can be used to make inference about the distribution and evolution of recombination rates. To this end, we here derive an analytic expression for the distribution of waiting distances between tree changes under the sequentially Markovian coalescent model and obtain an accurate approximation to the distribution of waiting distances for topology changes. We use these results to show that some of the recently proposed methods for inferring sequences of trees along the genome provide strongly biased distributions of waiting distances. In addition, we provide a correction to an undercounting problem facing all available ARG inference methods, thereby facilitating the use of ARG inference methods to estimate temporal changes in the recombination rate.


Subject(s)
Genetic Models, Genetic Recombination, Algorithms, Genome, Markov Chains, Phylogeny
16.
Nucleic Acids Res ; 47(8): 4198-4210, 2019 05 07.
Article in English | MEDLINE | ID: mdl-30805621

ABSTRACT

The ribosome exit tunnel is an important structure involved in the regulation of translation and other essential functions such as protein folding. By comparing 20 recently obtained cryo-EM and X-ray crystallography structures of the ribosome from all three domains of life, we here characterize the key similarities and differences of the tunnel across species. We first show that a hierarchical clustering of tunnel shapes closely reflects the species phylogeny. Then, by analyzing the ribosomal RNAs and proteins, we explain the observed geometric variations and show direct association between the conservations of the geometry, structure and sequence. We find that the tunnel is more conserved in the upper part close to the polypeptide transferase center, while in the lower part, it is substantially narrower in eukaryotes than in bacteria. Furthermore, we provide evidence for the existence of a second constriction site in eukaryotic exit tunnels. Overall, these results have several evolutionary and functional implications, which explain certain differences between eukaryotes and prokaryotes in their translation mechanisms. In particular, they suggest that major co-translational functions of bacterial tunnels were externalized in eukaryotes, while reducing the tunnel size provided some other advantages, such as facilitating the nascent chain elongation and enabling antibiotic resistance.


Subject(s)
Archaea/genetics, Bacteria/genetics, Eukaryota/genetics, Protein Biosynthesis, Ribosomal RNA/chemistry, Ribosomal Proteins/chemistry, Ribosomes/ultrastructure, Amino Acid Sequence, Archaea/classification, Archaea/metabolism, Bacteria/classification, Bacteria/metabolism, Cryoelectron Microscopy, X-Ray Crystallography, Eukaryota/classification, Eukaryota/metabolism, Nucleic Acid Conformation, Phylogeny, Protein Folding, Secondary Protein Structure, Ribosomal RNA/genetics, Ribosomal RNA/metabolism, Ribosomal Proteins/genetics, Ribosomal Proteins/metabolism, Ribosomes/classification, Ribosomes/genetics, Ribosomes/metabolism, Sequence Alignment, Amino Acid Sequence Homology
17.
PLoS Genet ; 14(8): e1007620, 2018 08.
Article in English | MEDLINE | ID: mdl-30142215

ABSTRACT

[This corrects the article DOI: 10.1371/journal.pgen.1007166.].

18.
PLoS Genet ; 14(1): e1007166, 2018 01.
Article in English | MEDLINE | ID: mdl-29337993

ABSTRACT

Previous studies have shown that translation elongation is regulated by multiple factors, but the observed heterogeneity remains only partially explained. To dissect quantitatively the different determinants of elongation speed, we use probabilistic modeling to estimate initiation and local elongation rates from ribosome profiling data. This model-based approach allows us to quantify the extent of interference between ribosomes on the same transcript. We show that neither interference nor the distribution of slow codons is sufficient to explain the observed heterogeneity. Instead, we find that electrostatic interactions between the ribosomal exit tunnel and specific parts of the nascent polypeptide govern the elongation rate variation as the polypeptide makes its initial pass through the tunnel. Once the N-terminus has escaped the tunnel, the hydropathy of the nascent polypeptide within the ribosome plays a major role in modulating the speed. We show that our results are consistent with the biophysical properties of the tunnel.


Subject(s)
Codon/metabolism, Translational Peptide Chain Elongation, Ribosomes/metabolism, Animals, Datasets as Topic, Humans, Peptides/genetics, Peptides/metabolism, Protein Binding, Protein Biosynthesis/physiology, Protein Interaction Domains and Motifs, Ribosomes/chemistry, Ribosomes/physiology, Saccharomyces cerevisiae/genetics, Saccharomyces cerevisiae/metabolism
19.
Proc Natl Acad Sci U S A ; 113(20): E2822-31, 2016 May 17.
Article in English | MEDLINE | ID: mdl-27140647

ABSTRACT

The genetic, epigenetic, and physiological differences among cells in clonal microbial colonies are underexplored opportunities for discovery. A recently developed genetic assay reveals that transient losses of heterochromatic repression, a heritable form of gene silencing, occur throughout the growth of Saccharomyces colonies. This assay requires analyzing two-color fluorescence patterns in yeast colonies, which is qualitatively appealing but quantitatively challenging. In this paper, we developed a suite of automated image processing, visualization, and classification algorithms (MORPHE) that facilitated the analysis of heterochromatin dynamics in the context of colonial growth and that can be broadly adapted to many colony-based assays in Saccharomyces and other microbes. Using the features that were automatically extracted from fluorescence images, our classification method distinguished loss-of-silencing patterns between mutants and wild type with unprecedented precision. Application of MORPHE revealed subtle but significant differences in the stability of heterochromatic repression between various environmental conditions, revealed that haploid cells experienced higher rates of silencing loss than diploids, and uncovered the unexpected contribution of a sirtuin to heterochromatin dynamics.


Subject(s)
Saccharomyces cerevisiae/metabolism, Algorithms, Biological Assay, Fungal Gene Expression Regulation, Gene Silencing, Reporter Genes, Green Fluorescent Proteins/biosynthesis, Green Fluorescent Proteins/genetics, Computer-Assisted Image Processing, Phenotype, Saccharomyces cerevisiae/genetics
20.
Genome Res ; 25(2): 268-79, 2015 Feb.
Article in English | MEDLINE | ID: mdl-25564017

ABSTRACT

With the recent increase in study sample sizes in human genetics, there has been growing interest in inferring historical population demography from genomic variation data. Here, we present an efficient inference method that can scale up to very large samples, with tens or hundreds of thousands of individuals. Specifically, by utilizing analytic results on the expected frequency spectrum under the coalescent and by leveraging the technique of automatic differentiation, which allows us to compute gradients exactly, we develop a very efficient algorithm to infer piecewise-exponential models of the historical effective population size from the distribution of sample allele frequencies. Our method is orders of magnitude faster than previous demographic inference methods based on the frequency spectrum. In addition to inferring demography, our method can also accurately estimate locus-specific mutation rates. We perform extensive validation of our method on simulated data and show that it can accurately infer multiple recent epochs of rapid exponential growth, a signal that is difficult to pick up with small sample sizes. Lastly, we use our method to analyze data from recent sequencing studies, including a large-sample exome-sequencing data set of tens of thousands of individuals assayed at a few hundred genic regions.


Subject(s)
Genetic Loci, Genetic Variation, Population Genetics, Genomics, Mutation Rate, Population Density, Exome, High-Throughput Nucleotide Sequencing, Humans, Genetic Models, Statistical Models, Reproducibility of Results
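
The abstract above relies on analytic results for the expected frequency spectrum under the coalescent. The simplest such result, for a constant-size population with neutral mutations, is that the expected number of sites with i derived copies in a sample of n sequences is theta / i. The sketch below computes that baseline; it does not implement the piecewise-exponential models or the automatic differentiation described in the abstract, and the function names are assumptions.

```python
def expected_sfs(n, theta=1.0):
    """Expected site frequency spectrum for a sample of n sequences under
    the constant-size neutral coalescent: entry i-1 holds theta / i for
    derived-allele counts i = 1, ..., n - 1."""
    if n < 2:
        raise ValueError("need at least two sequences")
    return [theta / i for i in range(1, n)]


def normalize(spectrum):
    """Rescale a spectrum so its entries sum to one."""
    total = sum(spectrum)
    return [x / total for x in spectrum]


if __name__ == "__main__":
    sfs = expected_sfs(5, theta=1.0)
    print(sfs)
    print(normalize(sfs))
```

Departures of an observed spectrum from this 1/i shape, such as an excess of rare variants, are the kind of signal that demographic inference from the frequency spectrum exploits.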