Results 1 - 20 of 100
1.
J Mammary Gland Biol Neoplasia ; 29(1): 3, 2024 Jan 30.
Article in English | MEDLINE | ID: mdl-38289401

ABSTRACT

During female adolescence and pregnancy, rising levels of hormones result in a cyclic source of signals that control the development of mammary tissue. While such alterations are well understood from a whole-gland perspective, the alterations that such hormones bring to organoid cultures derived from mammary glands have yet to be fully mapped. This is of special importance given that organoids are considered suitable systems to understand cross-species breast development. Here we utilized single-cell transcriptional profiling to delineate responses of murine and human normal breast organoid systems to female hormones across evolutionarily distinct species. Collectively, our study represents a molecular atlas of epithelial dynamics in response to estrogen and pregnancy hormones.


Subjects
Breast , Estrogens , Adolescent , Pregnancy , Humans , Animals , Mice , Female , Organoids
2.
bioRxiv ; 2023 Nov 02.
Article in English | MEDLINE | ID: mdl-37961642

ABSTRACT

AlphaMissense is a recently developed method that is designed to classify missense variants into pathogenic, benign, or ambiguous categories across the entire human proteome. Asparagine Synthetase Deficiency (ASNSD) is a developmental disorder associated with severe symptoms, including congenital microcephaly, seizures, and premature death. Diagnosing ASNSD relies on identifying mutations in the asparagine synthetase (ASNS) gene through DNA sequencing and determining whether these variants are pathogenic or benign. Pathogenic ASNS variants are predicted to disrupt the protein's structure and/or function, leading to asparagine depletion within cells and inhibition of cell growth. AlphaMissense offers a promising solution for the rapid classification of ASNS variants established by DNA sequencing and provides a community resource of pathogenicity scores and classifications for newly diagnosed ASNSD patients. Here, we assessed AlphaMissense's utility in ASNSD by benchmarking it against known critical residues in ASNS and evaluating its performance against a list of previously reported ASNSD-associated variants. We also present a pipeline to calculate AlphaMissense scores for any protein in the UniProt database. AlphaMissense accurately attributed a high average pathogenicity score to known critical residues within the two ASNS active sites and the connecting intramolecular tunnel. The program successfully categorized 78.9% of known ASNSD-associated missense variants as pathogenic. The remaining variants were primarily labeled as ambiguous, with a smaller proportion classified as benign. This study underscores the potential role of AlphaMissense in classifying ASNS variants in suspected cases of ASNSD, potentially providing clarity to patients and their families grappling with ongoing diagnostic uncertainty.
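
The three-way classification described in the abstract above can be sketched as a pair of score cutoffs. This is a minimal illustration, not the authors' pipeline: the cutoffs of 0.34 and 0.564 are assumptions that the abstract does not state, so check them against the AlphaMissense release before relying on them. The function names and example scores are hypothetical.

```python
# Assumed class boundaries (not given in the abstract); verify before use.
BENIGN_MAX = 0.34
PATHOGENIC_MIN = 0.564


def classify_score(score):
    """Map an AlphaMissense-style pathogenicity score in [0, 1] to a class label."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    if score < BENIGN_MAX:
        return "benign"
    if score > PATHOGENIC_MIN:
        return "pathogenic"
    return "ambiguous"


def fraction_pathogenic(scores):
    """Fraction of variants whose score falls in the pathogenic class."""
    labels = [classify_score(s) for s in scores]
    return labels.count("pathogenic") / len(labels)


# Hypothetical scores for four variants (illustrative values, not real data).
example_scores = [0.95, 0.71, 0.45, 0.12]
example_labels = [classify_score(s) for s in example_scores]
```

Applying `fraction_pathogenic` to the scores of known disease-associated variants gives the kind of summary the abstract reports as a percentage.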

3.
PLoS Genet ; 19(11): e1011032, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37934781

ABSTRACT

Investigators have recently introduced powerful methods for population genetic inference that rely on supervised machine learning from simulated data. Despite their performance advantages, these methods can fail when the simulated training data does not adequately resemble data from the real world. Here, we show that this "simulation mis-specification" problem can be framed as a "domain adaptation" problem, where a model learned from one data distribution is applied to a dataset drawn from a different distribution. By applying an established domain-adaptation technique based on a gradient reversal layer (GRL), originally introduced for image classification, we show that the effects of simulation mis-specification can be substantially mitigated. We focus our analysis on two state-of-the-art deep-learning population genetic methods: SIA, which infers positive selection from features of the ancestral recombination graph (ARG), and ReLERNN, which infers recombination rates from genotype matrices. In the case of SIA, the domain adaptive framework also compensates for ARG inference error. Using the domain-adaptive SIA (dadaSIA) model, we estimate improved selection coefficients at selected loci in the 1000 Genomes CEU population. We anticipate that domain adaptation will prove to be widely applicable in the growing use of supervised machine learning in population genetics.


Subjects
Machine Learning , Neural Networks, Computer , Supervised Machine Learning , Computer Simulation , Genetics, Population
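
A gradient reversal layer, as named in the abstract above, passes features through unchanged on the forward pass and flips the sign of the gradient on the backward pass. This pushes the feature extractor toward features that a domain classifier cannot use to tell simulated from real data. The sketch below shows only that mechanism in plain Python. The class name, the `lam` scaling factor, and the list-based interface are illustrative assumptions, not the dadaSIA implementation.

```python
class GradientReversal:
    """Identity on the forward pass; scales gradients by -lam on the backward pass."""

    def __init__(self, lam=1.0):
        # lam controls how strongly the domain-classifier gradient is reversed.
        self.lam = lam

    def forward(self, features):
        # Features reach the domain classifier unchanged.
        return list(features)

    def backward(self, grad_from_domain_classifier):
        # The feature extractor receives the negated, scaled gradient, so a
        # gradient-descent step makes the domain classifier's task harder.
        return [-self.lam * g for g in grad_from_domain_classifier]


grl = GradientReversal(lam=2.0)
forward_out = grl.forward([1.0, 2.0])
backward_out = grl.backward([0.5, -1.0])
```

In a deep-learning framework the same behavior would be implemented as a custom differentiable operation placed between the shared feature extractor and the domain classifier.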
4.
Nucleic Acids Res ; 51(21): e106, 2023 Nov 27.
Article in English | MEDLINE | ID: mdl-37889042

ABSTRACT

In metazoans, both transcription initiation and the escape of RNA polymerase (RNAP) from promoter-proximal pausing are key rate-limiting steps in gene expression. These processes play out at physically proximal sites on the DNA template and appear to influence one another through steric interactions. Here, we examine the dynamics of these processes using a combination of statistical modeling, simulation, and analysis of real nascent RNA sequencing data. We develop a simple probabilistic model that jointly describes the kinetics of transcription initiation, pause-escape, and elongation, and the generation of nascent RNA sequencing read counts under steady-state conditions. We then extend this initial model to allow for variability across cells in promoter-proximal pause site locations and steric hindrance of transcription initiation from paused RNAPs. In an extensive series of simulations, we show that this model enables accurate estimation of initiation and pause-escape rates. Furthermore, we show by simulation and analysis of real data that pause-escape is often strongly rate-limiting and that steric hindrance can dramatically reduce initiation rates. Our modeling framework is applicable to a variety of inference problems, and our software for estimation and simulation is freely available.


Subjects
DNA-Directed RNA Polymerases , Transcription, Genetic , Humans , DNA-Directed RNA Polymerases/genetics , DNA-Directed RNA Polymerases/metabolism , Promoter Regions, Genetic , RNA , Base Sequence
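
The steady-state reasoning in the abstract above can be illustrated with a simple flux balance. Suppose polymerases initiate at rate alpha, escape the pause at rate beta, and elongate at rate zeta. At steady state the flux through each step is equal, so expected occupancy is proportional to alpha/beta at the pause site and to alpha/zeta per nucleotide in the gene body. The sketch below uses only these simplified relations and ignores steric hindrance and pause-site variability. The function name and the read counts are hypothetical, and this is not the authors' software.

```python
def pause_escape_over_elongation(pause_reads, body_reads, body_length):
    """Estimate beta/zeta from nascent RNA read counts under a flux-balance assumption.

    pause_reads: reads in the pause peak (assumed proportional to alpha/beta)
    body_reads: reads in the gene body (density assumed proportional to alpha/zeta)
    body_length: gene-body length in nucleotides
    """
    if pause_reads <= 0 or body_length <= 0:
        raise ValueError("pause_reads and body_length must be positive")
    body_density = body_reads / body_length
    # (alpha/zeta) / (alpha/beta) = beta/zeta; alpha cancels out.
    return body_density / pause_reads


# Hypothetical counts for one gene.
ratio = pause_escape_over_elongation(pause_reads=200, body_reads=500, body_length=10_000)

# With an assumed elongation rate of 2000 nt/min, this gives a pause-escape rate.
assumed_zeta = 2000.0
beta_estimate = ratio * assumed_zeta
```

Because alpha cancels in the ratio, read counts alone constrain only relative rates; absolute values need an outside assumption such as the elongation rate used here.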
5.
bioRxiv ; 2023 Sep 29.
Article in English | MEDLINE | ID: mdl-37808747

ABSTRACT

During female adolescence and pregnancy, rising levels of hormones result in a cyclic source of signals that control the development of mammary tissue. While such alterations are well understood from a whole-gland perspective, the alterations that such hormones bring to organoid cultures derived from mammary glands have yet to be fully mapped. This is of special importance given that organoids are considered suitable systems to understand cross-species breast development. Here we utilized single-cell transcriptional profiling to delineate responses of murine and human normal breast organoid systems to female hormones across evolutionarily distinct species. Collectively, our study represents a molecular atlas of epithelial dynamics in response to estrogen and pregnancy hormones.

6.
Genome Biol Evol ; 15(9), 2023 09 04.
Article in English | MEDLINE | ID: mdl-37728212

ABSTRACT

Bats are exceptional among mammals for their powered flight, extended lifespans, and robust immune systems and therefore have been of particular interest in comparative genomics. Using the Oxford Nanopore Technologies long-read platform, we sequenced the genomes of two bat species with key phylogenetic positions, the Jamaican fruit bat (Artibeus jamaicensis) and the Mesoamerican mustached bat (Pteronotus mesoamericanus), and carried out a comprehensive comparative genomic analysis with a diverse collection of bats and other mammals. The high-quality, long-read genome assemblies revealed a contraction of interferon (IFN)-α at the immunity-related type I IFN locus in bats, resulting in a shift in relative IFN-ω and IFN-α copy numbers. Contradicting previous hypotheses of constitutive expression of IFN-α being a feature of the bat immune system, three bat species lost all IFN-α genes. This shift to IFN-ω could contribute to the increased viral tolerance that has made bats a common reservoir for viruses that can be transmitted to humans. Antiviral genes stimulated by type I IFNs also showed evidence of rapid evolution, including a lineage-specific duplication of IFN-induced transmembrane genes and positive selection in IFIT2. In addition, 33 tumor suppressors and 6 DNA-repair genes showed signs of positive selection, perhaps contributing to increased longevity and reduced cancer rates in bats. The robust immune systems of bats rely on both bat-wide and lineage-specific evolution in the immune gene repertoire, suggesting diverse immune strategies. Our study provides new genomic resources for bats and sheds new light on the extraordinary molecular evolution in this critically important group of mammals.


Subjects
Chiroptera , Neoplasms , Humans , Animals , Chiroptera/genetics , Phylogeny , Evolution, Molecular , Genomics , Longevity , Neoplasms/genetics , Neoplasms/veterinary
7.
bioRxiv ; 2023 Nov 10.
Article in English | MEDLINE | ID: mdl-37503269

ABSTRACT

Meiotic drivers subvert Mendelian expectations by manipulating reproductive development to bias their own transmission. Chromosomal drive typically functions in asymmetric female meiosis, while gene drive is normally postmeiotic and typically found in males. Using single molecule and single-pollen genome sequencing, we describe Teosinte Pollen Drive, an instance of gene drive in hybrids between maize (Zea mays ssp. mays) and teosinte mexicana (Zea mays ssp. mexicana), that depends on RNA interference (RNAi). 22nt small RNAs from a non-coding RNA hairpin in mexicana depend on Dicer-Like 2 (Dcl2) and target Teosinte Drive Responder 1 (Tdr1), which encodes a lipase required for pollen viability. Dcl2, Tdr1, and the hairpin are in tight pseudolinkage on chromosome 5, but only when transmitted through the male. Introgression of mexicana into early cultivated maize is thought to have been critical to its geographical dispersal throughout the Americas, and a tightly linked inversion in mexicana spans a major domestication sweep in modern maize. A survey of maize landraces and sympatric populations of teosinte mexicana reveals correlated patterns of admixture among unlinked genes required for RNAi on at least 4 chromosomes that are also subject to gene drive in pollen from synthetic hybrids. Teosinte Pollen Drive likely played a major role in maize domestication and diversification, and offers an explanation for the widespread abundance of "self" small RNAs in the germlines of plants and animals.

8.
bioRxiv ; 2023 Sep 06.
Article in English | MEDLINE | ID: mdl-36909514

ABSTRACT

Investigators have recently introduced powerful methods for population genetic inference that rely on supervised machine learning from simulated data. Despite their performance advantages, these methods can fail when the simulated training data does not adequately resemble data from the real world. Here, we show that this "simulation mis-specification" problem can be framed as a "domain adaptation" problem, where a model learned from one data distribution is applied to a dataset drawn from a different distribution. By applying an established domain-adaptation technique based on a gradient reversal layer (GRL), originally introduced for image classification, we show that the effects of simulation mis-specification can be substantially mitigated. We focus our analysis on two state-of-the-art deep-learning population genetic methods: SIA, which infers positive selection from features of the ancestral recombination graph (ARG), and ReLERNN, which infers recombination rates from genotype matrices. In the case of SIA, the domain adaptive framework also compensates for ARG inference error. Using the domain-adaptive SIA (dadaSIA) model, we estimate improved selection coefficients at selected loci in the 1000 Genomes CEU population. We anticipate that domain adaptation will prove to be widely applicable in the growing use of supervised machine learning in population genetics.

9.
bioRxiv ; 2023 Dec 23.
Article in English | MEDLINE | ID: mdl-38187771

ABSTRACT

Across all branches of life, transcription elongation is a crucial, regulated phase in gene expression. Many recent studies in eukaryotes have focused on the regulation of promoter-proximal pausing of RNA Polymerase II (Pol II), but rates of productive elongation also vary substantially throughout the gene body, both within and across genes. Here, we introduce a probabilistic model for systematically evaluating potential determinants of the local elongation rate based on nascent RNA sequencing (NRS) data. Our model is derived from a unified model for both the kinetics of Pol II movement along the DNA template and the generation of NRS read counts at steady state. It allows for a continuously variable elongation rate along the gene body, with the rate at each nucleotide defined by a generalized linear relationship with nearby genomic and epigenomic features. High-dimensional feature vectors are accommodated through a sparse-regression extension. We show with simulations that the model allows accurate detection of associated features and accurate prediction of local elongation rates. In an analysis of public PRO-seq and epigenomic data, we identify several features that are strongly associated with reductions in the local elongation rate, including DNA methylation, splice sites, RNA stem-loops, CTCF binding sites, and several histone marks, including H3K36me3 and H4K20me1. By contrast, low-complexity sequences and H3K79me2 marks are associated with increases in elongation rate. In an analysis of DNA k-mers, we find that cytosine nucleotides are strongly associated with reductions in local elongation rate, particularly when preceded by guanines and followed by adenines or thymines. Increases in elongation rate are associated with thymines and A+T-rich k-mers. These associations are generally shared across cell types, and by considering them our model is effective at predicting features of held-out PRO-seq data. 
Overall, our analysis is the first to permit genome-wide predictions of relative nucleotide-specific elongation rates based on complex sets of genomic and epigenomic covariates. We have made predictions available for the K562, CD14+, MCF-7, and HeLa-S3 cell types in a UCSC Genome Browser track.
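
The "generalized linear relationship" between features and local elongation rate can be illustrated as follows. The sketch assumes a log link, so the rate at each nucleotide is the exponential of a weighted sum of its features, and it assumes that steady-state read density is inversely proportional to the local rate. Both choices, along with the function names and the toy coefficients, are assumptions made for illustration and are not taken from the authors' software.

```python
import math


def local_elongation_rates(features, coefficients, intercept=0.0):
    """Rate at each nucleotide: exp(intercept + sum_j coefficient_j * feature_ij)."""
    return [
        math.exp(intercept + sum(k * x for k, x in zip(coefficients, row)))
        for row in features
    ]


def expected_read_density(rates, initiation_rate=1.0):
    """At steady state, slower sites accumulate more polymerase: density ~ alpha / rate."""
    return [initiation_rate / z for z in rates]


# Toy example: two binary features per nucleotide. The first halves the rate
# (as a feature such as methylation might), the second doubles it.
toy_coefficients = [-math.log(2.0), math.log(2.0)]
toy_features = [[0, 0], [1, 0], [0, 1]]
toy_rates = local_elongation_rates(toy_features, toy_coefficients)
toy_density = expected_read_density(toy_rates)
```

Under this parameterization a negative coefficient marks a feature associated with slower elongation, which shows up as locally elevated read density.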

10.
PLoS Genet ; 18(11): e1010474, 2022 11.
Article in English | MEDLINE | ID: mdl-36318577

ABSTRACT

Insular organisms often evolve predictable phenotypes, like flightlessness, extreme body sizes, or increased melanin deposition. The evolutionary forces and molecular targets mediating these patterns remain mostly unknown. Here we study the Chestnut-bellied Monarch (Monarcha castaneiventris) from the Solomon Islands, a complex of closely related subspecies in the early stages of speciation. On the large island of Makira, M. c. megarhynchus has a chestnut belly, whereas on the small satellite islands of Ugi, and Santa Ana and Santa Catalina (SA/SC), M. c. ugiensis is entirely iridescent blue-black (i.e., melanic). Melanism has likely evolved twice, as the Ugi and SA/SC populations were established independently. To investigate the genetic basis of melanism on each island, we generated whole genome sequence data from all three populations. Non-synonymous mutations at the MC1R pigmentation gene are associated with melanism on SA/SC, while ASIP, an antagonistic ligand of MC1R, is associated with melanism on Ugi. Both genes show evidence of selective sweeps in traditional summary statistics and statistics derived from the ancestral recombination graph (ARG). Using the ARG in combination with machine learning, we inferred selection strength, timing of onset and allele frequency trajectories. MC1R shows evidence of a recent, strong, soft selective sweep. The region including ASIP shows more complex signatures; however, we find evidence for sweeps in mutations near ASIP, which are comparatively older than those on MC1R and have been under relatively strong selection. Overall, our study shows convergent melanism results from selective sweeps at independent molecular targets, evolving in taxa where coloration likely mediates reproductive isolation with the neighboring chestnut-bellied subspecies.


Subjects
Melanosis , Passeriformes , Animals , Receptor, Melanocortin, Type 1/genetics , Pigmentation/genetics , Melanosis/genetics , Passeriformes/genetics , Gene Frequency
11.
Genes Dev ; 2022 Aug 18.
Article in English | MEDLINE | ID: mdl-35981753

ABSTRACT

Promoter-proximal RNA Pol II pausing is a critical step in transcriptional control. Pol II pausing has been predominantly studied in tissue culture systems. While Pol II pausing has been shown to be required for mammalian development, the phenotypic and mechanistic details of this requirement are unknown. Here, we found that loss of Pol II pausing stalls pluripotent state transitions within the epiblast of the early mouse embryo. Using Nelfb -/- mice and a NELFB degron mouse pluripotent stem cell model, we show that embryonic stem cells (ESCs) representing the naïve state of pluripotency successfully initiate a transition program but fail to balance levels of induced and repressed genes and enhancers in the absence of NELF. We found an increase in chromatin-associated NELF during transition from the naïve to later pluripotent states. Overall, our work defines the acute and long-term molecular consequences of NELF loss and reveals a role for Pol II pausing in the pluripotency continuum as a modulator of cell state transitions.

12.
Nat Commun ; 13(1): 4312, 2022 07 25.
Article in English | MEDLINE | ID: mdl-35879308

ABSTRACT

Large-scale genome sequencing has enabled the measurement of strong purifying selection in protein-coding genes. Here we describe a new method, called ExtRaINSIGHT, for measuring such selection in noncoding as well as coding regions of the human genome. ExtRaINSIGHT estimates the prevalence of "ultraselection" by the fractional depletion of rare single-nucleotide variants, after controlling for variation in mutation rates. Applying ExtRaINSIGHT to 71,702 whole genome sequences from gnomAD v3, we find abundant ultraselection in evolutionarily ancient miRNAs and neuronal protein-coding genes, as well as at splice sites. By contrast, we find much less ultraselection in other noncoding RNAs and transcription factor binding sites, and only modest levels in ultraconserved elements. We estimate that ~0.4-0.7% of the human genome is ultraselected, implying ~0.26-0.51 strongly deleterious mutations per generation. Overall, our study sheds new light on the genome-wide distribution of fitness effects by combining deep sequencing data and classical theory from population genetics.


Subjects
Genome, Human , Point Mutation , Evolution, Molecular , Genetics, Population , Genome, Human/genetics , Humans , Mutation , Selection, Genetic
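
The "fractional depletion of rare variants" in the abstract above can be read as one minus the ratio of observed to expected rare variants, where the expectation comes from a neutral mutation-rate model. The sketch below computes that simplified quantity directly. ExtRaINSIGHT itself is a likelihood-based method, so this is an illustration of the idea only, and the function name and per-site probabilities are hypothetical.

```python
import math


def fractional_depletion(observed_rare_variants, neutral_site_probabilities):
    """Return 1 - observed/expected for a set of target sites.

    neutral_site_probabilities: for each site, the assumed probability of
    carrying a rare variant under neutrality (from a mutation-rate model).
    """
    expected = math.fsum(neutral_site_probabilities)
    if expected <= 0:
        raise ValueError("expected number of rare variants must be positive")
    return 1.0 - observed_rare_variants / expected


# Hypothetical element: 1000 sites, each with a 10% neutral chance of a rare variant,
# but only 60 rare variants observed.
depletion = fractional_depletion(60, [0.1] * 1000)
```

A value near 0 means the element carries about as many rare variants as neutral sequence with the same mutation rates, while a value near 1 means almost all new mutations are removed before being observed.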
13.
Plant Genome ; 15(2): e20204, 2022 06.
Article in English | MEDLINE | ID: mdl-35416423

ABSTRACT

Alignments of multiple genomes are a cornerstone of comparative genomics, but generating these alignments remains technically challenging and often impractical. We developed the msa_pipeline workflow (https://bitbucket.org/bucklerlab/msa_pipeline) to allow practical and sensitive multiple alignment of diverged plant genomes and calculation of conservation scores with minimal user inputs. As high repeat content and genomic divergence are substantial challenges in plant genome alignment, we also explored the effect of different masking approaches and parameters of the LAST aligner using genome assemblies of 33 grass species. Compared with conventional masking with RepeatMasker, a masking approach based on k-mers (nucleotide sequences of k length) increased the alignment rate of coding sequence and noncoding functional regions by 25 and 14%, respectively. We further found that default alignment parameters generally perform well, but parameter tuning can increase the alignment rate for noncoding functional regions by over 52% compared with default LAST settings. Finally, by increasing alignment sensitivity from the default baseline, parameter tuning can increase the number of noncoding sites that can be scored for conservation by over 76%. Overall, tuning of masking and alignment parameters can generate optimized multiple alignments to drive biological discovery in plants.


Subjects
Genome, Plant , Genomics , Base Sequence , Workflow
14.
Mol Biol Evol ; 39(1), 2022 01 07.
Article in English | MEDLINE | ID: mdl-34888675

ABSTRACT

Detecting signals of selection from genomic data is a central problem in population genetics. Coupling the rich information in the ancestral recombination graph (ARG) with a powerful and scalable deep-learning framework, we developed a novel method to detect and quantify positive selection: Selection Inference using the Ancestral recombination graph (SIA). Built on a Long Short-Term Memory (LSTM) architecture, a particular type of Recurrent Neural Network (RNN), SIA can be trained to explicitly infer a full range of selection coefficients, as well as the allele frequency trajectory and time of selection onset. We benchmarked SIA extensively on simulations under a European human demographic model, and found that it performs as well as or better than some of the best available methods, including state-of-the-art machine-learning and ARG-based methods. In addition, we used SIA to estimate selection coefficients at several loci associated with human phenotypes of interest. SIA detected novel signals of selection particular to the European (CEU) population at the MC1R and ABCC11 loci. In addition, it recapitulated signals of selection at the LCT locus and several pigmentation-related genes. Finally, we reanalyzed polymorphism data of a collection of recently radiated southern capuchino seedeater taxa in the genus Sporophila to quantify the strength of selection and improved the power of our previous methods to detect partial soft sweeps. Overall, SIA uses deep learning to leverage the ARG and thereby provides new insight into how selective sweeps shape genomic diversity.


Subjects
Deep Learning , Selection, Genetic , Genetics, Population , Models, Genetic , Recombination, Genetic
15.
Genome Biol ; 22(1): 278, 2021 09 23.
Article in English | MEDLINE | ID: mdl-34556174

ABSTRACT

High-throughput CRISPR-Cas9 knockout screens are widely used to evaluate gene essentiality in cancer research. Here we introduce a probabilistic modeling framework, Analysis of CRISPR-based Essentiality (ACE), that accounts for multiple sources of variation in CRISPR-Cas9 screens and enables new statistical tests for essentiality. We show using simulations that ACE is effective at predicting both absolute and differential essentiality. When applied to publicly available data, ACE identifies known and novel candidates for genotype-specific essentiality, including RNA m6-A methyltransferases that exhibit enhanced essentiality in the presence of inactivating TP53 mutations. ACE provides a robust framework for identifying genes responsive to subtype-specific therapeutic targeting.


Subjects
CRISPR-Cas Systems , Genes, Essential , Models, Statistical , Software , Genes, p53 , Genotype , Mutation
16.
Bioinformatics ; 37(24): 4727-4736, 2021 12 11.
Article in English | MEDLINE | ID: mdl-34382072

ABSTRACT

MOTIVATION: Quantification of isoform abundance has been extensively studied at the mature RNA level using RNA-seq but not at the level of precursor RNAs using nascent RNA sequencing. RESULTS: We address this problem with a new computational method called Deconvolution of Expression for Nascent RNA-sequencing data (DENR), which models nascent RNA-sequencing read-counts as a mixture of user-provided isoforms. The baseline algorithm is enhanced by machine-learning predictions of active transcription start sites and an adjustment for the typical 'shape profile' of read-counts along a transcription unit. We show that DENR outperforms simple read-count-based methods for estimating gene and isoform abundances, and that transcription of multiple pre-RNA isoforms per gene is widespread, with frequent differences between cell types. In addition, we provide evidence that a majority of human isoform diversity derives from primary transcription rather than from post-transcriptional processes. AVAILABILITY AND IMPLEMENTATION: DENR and nascentRNASim are freely available at https://github.com/CshlSiepelLab/DENR (version v1.0.0) and https://github.com/CshlSiepelLab/nascentRNASim (version v0.3.0). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
RNA Isoforms , RNA , Humans , RNA Isoforms/genetics , Software , Protein Isoforms/genetics , Sequence Analysis, RNA/methods , Eukaryotic Initiation Factors/genetics
17.
J Mammary Gland Biol Neoplasia ; 26(1): 43-66, 2021 03.
Article in English | MEDLINE | ID: mdl-33988830

ABSTRACT

The developing mammary gland depends on several transcription-dependent networks to define cellular identities and differentiation trajectories. Recent technological advancements that allow for single-cell profiling of gene expression have provided an initial picture into the epithelial cellular heterogeneity across the diverse stages of gland maturation. Still, a deeper dive into expanded molecular signatures would improve our understanding of the diversity of mammary epithelial and non-epithelial cellular populations across different tissue developmental stages, mouse strains and mammalian species. Here, we combined differential mammary gland fractionation approaches and transcriptional profiles obtained from FACS-isolated mammary cells to improve our definitions of mammary-resident, cellular identities at the single-cell level. Our approach yielded a series of expression signatures that illustrate the heterogeneity of mammary epithelial cells, specifically those of the luminal fate, and uncovered transcriptional changes to their lineage-defined, cellular states that are induced during gland development. Our analysis also provided molecular signatures that identified non-epithelial mammary cells, including adipocytes, fibroblasts and rare immune cells. Lastly, we extended our study to elucidate expression signatures of human, breast-resident cells, a strategy that allowed for the cross-species comparison of mammary epithelial identities. Collectively, our approach improved the existing signatures of normal mammary epithelial cells, as well as elucidated the diversity of non-epithelial cells in murine and human breast tissue. Our study provides a useful resource for future studies that use single-cell molecular profiling strategies to understand normal and malignant breast development.


Subjects
Epithelial Cells/physiology , Gene Expression Profiling/methods , Mammary Glands, Animal/physiology , Mammary Glands, Human/physiology , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Transcriptome , Animals , Cell Lineage/physiology , Epithelial Cells/cytology , Female , Humans , Mammary Glands, Animal/cytology , Mammary Glands, Human/cytology , Mice , Mice, Inbred BALB C , Mice, Inbred C57BL
18.
medRxiv ; 2021 Mar 31.
Article in English | MEDLINE | ID: mdl-33821285

ABSTRACT

The innate and adaptive immune responses are regulated by biological clocks, and circulating lymphocytes are lowest at sunrise. Accordingly, severity of disease in mouse models is highly dependent on the time of day of viral infection. Here, we explore whether circadian immunity contributes significantly to seasonality of respiratory viruses, including influenza and SARS-CoV-2. Susceptibility-Infection-Recovery-Susceptibility (SIRS) models of influenza and SIRS-derived models of COVID-19 suggest that local sunrise time is a better predictor of the basic reproductive number (R0) than climate, even when day length is taken into account. Moreover, these models predict a window of susceptibility when local sunrise time corresponds to the morning commute and contact rate is expected to be high. Counterfactual modeling suggests that retaining daylight saving time in the fall would reduce the length of this window, and substantially reduce seasonal waves of respiratory infections.
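
For readers unfamiliar with the SIRS family of models mentioned above, the sketch below simulates the basic form. Individuals move from susceptible (S) to infected (I) to recovered (R) and back to susceptible as immunity wanes, and in this form the basic reproductive number is the transmission rate divided by the recovery rate. The parameter names and values are illustrative assumptions. The sketch has no sunrise-time or climate forcing and is not the model fitted in the study.

```python
def simulate_sirs(beta, gamma, xi, s0=0.99, i0=0.01, days=365, dt=0.1):
    """Integrate a basic SIRS model with forward Euler steps.

    beta: transmission rate, gamma: recovery rate, xi: rate of immunity loss
    (all per day). Returns a list of (S, I, R) population fractions, one per step.
    """
    s, i, r = s0, i0, 1.0 - s0 - i0
    trajectory = [(s, i, r)]
    for _ in range(round(days / dt)):
        new_infections = beta * s * i * dt
        recoveries = gamma * i * dt
        waned = xi * r * dt
        s += waned - new_infections
        i += new_infections - recoveries
        r += recoveries - waned
        trajectory.append((s, i, r))
    return trajectory


def basic_reproductive_number(beta, gamma):
    """R0 for the basic SIRS model: expected secondary infections per case."""
    return beta / gamma


# Hypothetical parameters: R0 above 1 produces an outbreak, R0 below 1 does not.
outbreak = simulate_sirs(beta=0.5, gamma=0.2, xi=0.01)
fade_out = simulate_sirs(beta=0.1, gamma=0.2, xi=0.01)
```

A seasonal version would make `beta` a function of time, which is where a predictor such as local sunrise time could enter.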

19.
BMC Biol ; 19(1): 30, 2021 02 15.
Article in English | MEDLINE | ID: mdl-33588838

ABSTRACT

BACKGROUND: The concentrations of distinct types of RNA in cells result from a dynamic equilibrium between RNA synthesis and decay. Despite the critical importance of RNA decay rates, current approaches for measuring them are generally labor-intensive, limited in sensitivity, and/or disruptive to normal cellular processes. Here, we introduce a simple method for estimating relative RNA half-lives that is based on two standard and widely available high-throughput assays: Precision Run-On sequencing (PRO-seq) and RNA sequencing (RNA-seq). RESULTS: Our method treats PRO-seq as a measure of transcription rate and RNA-seq as a measure of RNA concentration, and estimates the rate of RNA decay required for a steady-state equilibrium. We show that this approach can be used to assay relative RNA half-lives genome-wide, with good accuracy and sensitivity for both coding and noncoding transcription units. Using a structural equation model (SEM), we test several features of transcription units, nearby DNA sequences, and nearby epigenomic marks for associations with RNA stability after controlling for their effects on transcription. We find that RNA splicing-related features are positively correlated with RNA stability, whereas features related to miRNA binding and DNA methylation are negatively correlated with RNA stability. Furthermore, we find that a measure based on U1 binding and polyadenylation sites distinguishes between unstable noncoding and stable coding transcripts but is not predictive of relative stability within the mRNA or lincRNA classes. We also identify several histone modifications that are associated with RNA stability. CONCLUSION: We introduce an approach for estimating the relative half-lives of individual RNAs. Together, our estimation method and systematic analysis shed light on the pervasive impacts of RNA stability on cellular RNA concentrations.


Subjects
Genomic Instability , High-Throughput Nucleotide Sequencing/methods , RNA Stability , High-Throughput Nucleotide Sequencing/instrumentation , Humans , RNA-Seq/methods
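
The steady-state argument in the abstract above reduces to a ratio. If synthesis balances decay, then transcription rate equals decay rate times RNA concentration, so the half-life is proportional to concentration divided by transcription rate. The sketch below applies this per gene, with the RNA-seq signal standing in for concentration and the PRO-seq signal for transcription rate. The function name, the dictionary inputs, and the example values are hypothetical, and the result is a relative half-life on an arbitrary scale.

```python
def relative_half_lives(pro_seq_signal, rna_seq_signal):
    """Relative RNA half-life per gene as RNA-seq signal / PRO-seq signal.

    Both arguments map gene names to normalized signal values. Genes that lack
    a positive value in either assay are skipped, since the ratio is undefined.
    """
    half_lives = {}
    for gene, transcription in pro_seq_signal.items():
        concentration = rna_seq_signal.get(gene)
        if concentration is None or transcription <= 0 or concentration <= 0:
            continue
        half_lives[gene] = concentration / transcription
    return half_lives


# Hypothetical normalized signals for three transcription units.
pro_seq = {"geneA": 10.0, "geneB": 10.0, "geneC": 0.0}
rna_seq = {"geneA": 40.0, "geneB": 5.0, "geneC": 3.0}
half_life_estimates = relative_half_lives(pro_seq, rna_seq)
```

Here geneA and geneB are transcribed at the same rate, but geneA accumulates eight times more RNA, so its relative half-life is eight times longer.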
20.
Proc Natl Acad Sci U S A ; 117(48): 30554-30565, 2020 12 01.
Article in English | MEDLINE | ID: mdl-33199636

ABSTRACT

Numerous studies of emerging species have identified genomic "islands" of elevated differentiation against a background of relative homogeneity. The causes of these islands remain unclear, however, with some signs pointing toward "speciation genes" that locally restrict gene flow and others suggesting selective sweeps that have occurred within nascent species after speciation. Here, we examine this question through the lens of genome sequence data for five species of southern capuchino seedeaters, finch-like birds from South America that have undergone a species radiation during the last ∼50,000 generations. By applying newly developed statistical methods for ancestral recombination graph inference and machine-learning methods for the prediction of selective sweeps, we show that previously identified islands of differentiation in these birds appear to be generally associated with relatively recent, species-specific selective sweeps, most of which are predicted to be soft sweeps acting on standing genetic variation. Many of these sweeps coincide with genes associated with melanin-based variation in plumage, suggesting a prominent role for sexual selection. At the same time, a few loci also exhibit indications of possible selection against gene flow. These observations shed light on the complex manner in which natural selection shapes genome sequences during speciation.


Subjects
Genomic Islands , Models, Genetic , Animals , Biodiversity , Genetic Variation , Machine Learning