Results 1 - 20 of 32
1.
Genome Biol Evol ; 16(3), 2024 Mar 02.
Article in English | MEDLINE | ID: mdl-38319079

ABSTRACT

Reptiles exhibit a variety of modes of sex determination, including both temperature-dependent and genetic mechanisms. Among those species with genetic sex determination, sex chromosomes of varying heterogamety (XX/XY and ZZ/ZW) have been observed with different degrees of differentiation. Karyotype studies have demonstrated that Gila monsters (Heloderma suspectum) have ZZ/ZW sex determination and this system is likely homologous to the ZZ/ZW system in the Komodo dragon (Varanus komodoensis), but little else is known about their sex chromosomes. Here, we report the assembly and analysis of the Gila monster genome. We generated a de novo draft genome assembly for a male using 10X Genomics technology. We further generated and analyzed short-read whole genome sequencing and whole transcriptome sequencing data for three males and three females. By comparing female and male genomic data, we identified four putative Z chromosome scaffolds. These putative Z chromosome scaffolds are homologous to Z-linked scaffolds identified in the Komodo dragon. Further, by analyzing RNAseq data, we observed evidence of incomplete dosage compensation between the Gila monster Z chromosome and autosomes and a lack of balance in Z-linked expression between the sexes. In particular, we observe lower expression of the Z in females (ZW) than males (ZZ) on a global basis, though we find evidence suggesting local gene-by-gene compensation. This pattern has been observed in most other ZZ/ZW systems studied to date and may represent a general pattern for female heterogamety in vertebrates.


Subjects
Venomous Animals , Heloderma suspectum , Lizards , Animals , Male , Female , Lizards/genetics , Sex Chromosomes/genetics , Karyotype , Genetic Dosage Compensation
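
Note: the abstract above says only that Z scaffolds were identified "by comparing female and male genomic data"; it does not name the statistic used. One common way to make such a comparison is a read-depth ratio: in a ZZ/ZW system, females carry one Z copy and males two, so normalized female depth on a Z scaffold should be roughly half the male depth. The sketch below illustrates that idea only. The scaffold names, depth values, function name, and the -0.5 threshold are all hypothetical and are not taken from the paper.

```python
import math


def putative_z_scaffolds(depths, threshold=-0.5):
    """Return scaffolds whose log2(female/male) depth ratio is at or below threshold.

    depths maps a scaffold name to a (female_depth, male_depth) pair of
    normalized mean read depths. The default threshold is an illustrative
    assumption, not a published cutoff.
    """
    hits = []
    for name, (female, male) in depths.items():
        if female <= 0 or male <= 0:
            continue  # skip scaffolds with no usable coverage
        if math.log2(female / male) <= threshold:
            hits.append(name)
    return sorted(hits)


# Hypothetical normalized depths: autosomes sit near a ratio of 1,
# Z-linked scaffolds near 0.5.
toy_depths = {
    "scaf_A": (1.02, 0.98),
    "scaf_B": (0.51, 1.00),
    "scaf_C": (0.97, 1.01),
    "scaf_D": (0.48, 0.99),
}
print(putative_z_scaffolds(toy_depths))
```

With real data, depths would typically be averaged over several individuals of each sex, and candidate scaffolds would then be checked against other evidence, such as homology to known Z-linked sequence.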
2.
bioRxiv ; 2023 Apr 28.
Article in English | MEDLINE | ID: mdl-37163099

ABSTRACT

Reptiles exhibit a variety of modes of sex determination, including both temperature-dependent and genetic mechanisms. Among those species with genetic sex determination, sex chromosomes of varying heterogamety (XX/XY and ZZ/ZW) have been observed with different degrees of differentiation. Karyotype studies have demonstrated that Gila monsters (Heloderma suspectum) have ZZ/ZW sex determination and this system is likely homologous to the ZZ/ZW system in the Komodo dragon (Varanus komodoensis), but little else is known about their sex chromosomes. Here, we report the assembly and analysis of the Gila monster genome. We generated a de novo draft genome assembly for a male using 10X Genomics technology. We further generated and analyzed short-read whole genome sequencing and whole transcriptome sequencing data for three males and three females. By comparing female and male genomic data, we identified four putative Z-chromosome scaffolds. These putative Z-chromosome scaffolds are homologous to Z-linked scaffolds identified in the Komodo dragon. Further, by analyzing RNAseq data, we observed evidence of incomplete dosage compensation between the Gila monster Z chromosome and autosomes and a lack of balance in Z-linked expression between the sexes. In particular, we observe lower expression of the Z in females (ZW) than males (ZZ) on a global basis, though we find evidence suggesting local gene-by-gene compensation. This pattern has been observed in most other ZZ/ZW systems studied to date and may represent a general pattern for female heterogamety in vertebrates.

3.
Cell Genom ; 2(5), 2022 May.
Article in English | MEDLINE | ID: mdl-36452119

ABSTRACT

Genome in a Bottle benchmarks are widely used to help validate clinical sequencing pipelines and develop variant calling and sequencing methods. Here we use accurate linked and long reads to expand benchmarks in 7 samples to include difficult-to-map regions and segmental duplications that are challenging for short reads. These benchmarks add more than 300,000 SNVs and 50,000 insertions or deletions (indels) and include 16% more exonic variants, many in challenging, clinically relevant genes not covered previously, such as PMS2. For HG002, we include 92% of the autosomal GRCh38 assembly while excluding regions problematic for benchmarking small variants, such as copy number variants, that should not have been in the previous version, which included 85% of GRCh38. It identifies eight times more false negatives in a short read variant call set relative to our previous benchmark. We demonstrate that this benchmark reliably identifies false positives and false negatives across technologies, enabling ongoing methods development.

5.
Science ; 376(6588): 44-53, 2022 04.
Article in English | MEDLINE | ID: mdl-35357919

ABSTRACT

Since its initial release in 2000, the human reference genome has covered only the euchromatic fraction of the genome, leaving important heterochromatic regions unfinished. Addressing the remaining 8% of the genome, the Telomere-to-Telomere (T2T) Consortium presents a complete 3.055 billion-base pair sequence of a human genome, T2T-CHM13, that includes gapless assemblies for all chromosomes except Y, corrects errors in the prior references, and introduces nearly 200 million base pairs of sequence containing 1956 gene predictions, 99 of which are predicted to be protein coding. The completed regions include all centromeric satellite arrays, recent segmental duplications, and the short arms of all five acrocentric chromosomes, unlocking these complex regions of the genome to variational and functional studies.


Subjects
Human Genome , Human Genome Project , DNA Sequence Analysis/standards , Cell Line , Bacterial Artificial Chromosomes/genetics , Human Chromosomes/genetics , Humans , Reference Values
6.
Nature ; 594(7861): 77-81, 2021 06.
Article in English | MEDLINE | ID: mdl-33953399

ABSTRACT

The divergence of chimpanzee and bonobo provides one of the few examples of recent hominid speciation [1,2]. Here we describe a fully annotated, high-quality bonobo genome assembly, which was constructed without guidance from reference genomes by applying a multiplatform genomics approach. We generate a bonobo genome assembly in which more than 98% of genes are completely annotated and 99% of the gaps are closed, including the resolution of about half of the segmental duplications and almost all of the full-length mobile elements. We compare the bonobo genome to those of other great apes [1,3-5] and identify more than 5,569 fixed structural variants that specifically distinguish the bonobo and chimpanzee lineages. We focus on genes that have been lost, changed in structure or expanded in the last few million years of bonobo evolution. We produce a high-resolution map of incomplete lineage sorting and estimate that around 5.1% of the human genome is genetically closer to chimpanzee or bonobo and that more than 36.5% of the genome shows incomplete lineage sorting if we consider a deeper phylogeny including gorilla and orangutan. We also show that 26% of the segments of incomplete lineage sorting between human and chimpanzee or human and bonobo are non-randomly distributed and that genes within these clustered segments show significant excess of amino acid replacement compared to the rest of the genome.


Subjects
Molecular Evolution , Genome/genetics , Genomics , Pan paniscus/genetics , Phylogeny , Animals , Eukaryotic Initiation Factor-4A/genetics , Female , Genes , Gorilla gorilla/genetics , Molecular Sequence Annotation/standards , Pan troglodytes/genetics , Pongo/genetics , Genomic Segmental Duplications , DNA Sequence Analysis
7.
Nucleic Acids Res ; 49(D1): D916-D923, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33270111

ABSTRACT

The GENCODE project annotates human and mouse genes and transcripts supported by experimental data with high accuracy, providing a foundational resource that supports genome biology and clinical genomics. GENCODE annotation processes make use of primary data and bioinformatic tools and analysis generated both within the consortium and externally to support the creation of transcript structures and the determination of their function. Here, we present improvements to our annotation infrastructure, bioinformatics tools, and analysis, and the advances they support in the annotation of the human and mouse genomes including: the completion of first pass manual annotation for the mouse reference genome; targeted improvements to the annotation of genes associated with SARS-CoV-2 infection; collaborative projects to achieve convergence across reference annotation databases for the annotation of human and mouse protein-coding genes; and the first GENCODE manually supervised automated annotation of lncRNAs. Our annotation is accessible via Ensembl, the UCSC Genome Browser and https://www.gencodegenes.org.


Subjects
COVID-19/prevention & control , Computational Biology/methods , Genetic Databases , Genomics/methods , Molecular Sequence Annotation/methods , SARS-CoV-2/genetics , Animals , COVID-19/epidemiology , COVID-19/virology , Epidemics , Humans , Internet , Mice , Pseudogenes/genetics , Long Noncoding RNA/genetics , SARS-CoV-2/metabolism , SARS-CoV-2/physiology , Genetic Transcription/genetics
8.
Science ; 370(6523), 2020 12 18.
Article in English | MEDLINE | ID: mdl-33335035

ABSTRACT

The rhesus macaque (Macaca mulatta) is the most widely studied nonhuman primate (NHP) in biomedical research. We present an updated reference genome assembly (Mmul_10, contig N50 = 46 Mbp) that increases the sequence contiguity 120-fold and annotate it using 6.5 million full-length transcripts, thus improving our understanding of gene content, isoform diversity, and repeat organization. With the improved assembly of segmental duplications, we discovered new lineage-specific genes and expanded gene families that are potentially informative in studies of evolution and disease susceptibility. Whole-genome sequencing (WGS) data from 853 rhesus macaques identified 85.7 million single-nucleotide variants (SNVs) and 10.5 million indel variants, including potentially damaging variants in genes associated with human autism and developmental delay, providing a framework for developing noninvasive NHP models of human disease.


Subjects
Genetic Predisposition to Disease , Genome , Macaca mulatta/genetics , Single Nucleotide Polymorphism , Animals , Genetic Variation , Humans , Molecular Sequence Annotation , Whole Genome Sequencing
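
Note: the abstract above reports assembly contiguity as "contig N50 = 46 Mbp". As a reminder of what that statistic means, N50 is the length of the contig at which the running total of contig lengths, taken from longest to shortest, first reaches half of the total assembly length. The sketch below computes it for a toy list of lengths. The function name and the numbers are illustrative assumptions and have no connection to the Mmul_10 data.

```python
def contig_n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are walked from longest to shortest; the N50 is the length of
    the contig at which the cumulative sum first reaches half the total.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy assembly (arbitrary units): total length 300, so half is 150.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(contig_n50(toy_lengths))
```

Because N50 depends only on the sorted lengths, a few very long contigs can raise it sharply, which is why it is usually reported alongside total assembly size and contig count.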
9.
Nature ; 587(7833): 246-251, 2020 11.
Article in English | MEDLINE | ID: mdl-33177663

ABSTRACT

New genome assemblies have been arriving at a rapidly increasing pace, thanks to decreases in sequencing costs and improvements in third-generation sequencing technologies [1-3]. For example, the number of vertebrate genome assemblies currently in the NCBI (National Center for Biotechnology Information) database [4] increased by more than 50% to 1,485 assemblies in the year from July 2018 to July 2019. In addition to this influx of assemblies from different species, new human de novo assemblies [5] are being produced, which enable the analysis of not only small polymorphisms, but also complex, large-scale structural differences between human individuals and haplotypes. This coming era and its unprecedented amount of data offer the opportunity to uncover many insights into genome evolution but also present challenges in how to adapt current analysis methods to meet the increased scale. Cactus [6], a reference-free multiple genome alignment program, has been shown to be highly accurate, but the existing implementation scales poorly with increasing numbers of genomes, and struggles in regions of highly duplicated sequences. Here we describe progressive extensions to Cactus to create Progressive Cactus, which enables the reference-free alignment of tens to thousands of large vertebrate genomes while maintaining high alignment quality. We describe results from an alignment of more than 600 amniote genomes, which is to our knowledge the largest multiple vertebrate genome alignment created so far.


Subjects
Genome/genetics , Genomics/methods , Sequence Alignment/methods , Software , Vertebrates/genetics , Amnion , Animals , Computer Simulation , Genomics/standards , Haplotypes , Humans , Quality Control , Sequence Alignment/standards , Software/standards
11.
Nat Biotechnol ; 38(11): 1347-1355, 2020 11.
Article in English | MEDLINE | ID: mdl-32541955

ABSTRACT

New technologies and analysis methods are enabling genomic structural variants (SVs) to be detected with ever-increasing accuracy, resolution and comprehensiveness. To help translate these methods to routine research and clinical practice, we developed a sequence-resolved benchmark set for identification of both false-negative and false-positive germline large insertions and deletions. To create this benchmark for a broadly consented son in a Personal Genome Project trio with broadly available cells and DNA, the Genome in a Bottle Consortium integrated 19 sequence-resolved variant calling methods from diverse technologies. The final benchmark set contains 12,745 isolated, sequence-resolved insertion (7,281) and deletion (5,464) calls ≥50 base pairs (bp). The Tier 1 benchmark regions, for which any extra calls are putative false positives, cover 2.51 Gbp and 5,262 insertions and 4,095 deletions supported by ≥1 diploid assembly. We demonstrate that the benchmark set reliably identifies false negatives and false positives in high-quality SV callsets from short-, linked- and long-read sequencing and optical mapping.


Subjects
Germ-Line Mutation/genetics , INDEL Mutation/genetics , Diploidy , Genomic Structural Variation , Humans , Molecular Sequence Annotation , DNA Sequence Analysis
12.
Commun Biol ; 3(1): 318, 2020 06 25.
Article in English | MEDLINE | ID: mdl-32587328

ABSTRACT

We performed shallow single-cell sequencing of genomic DNA across 1475 cells from a cell-line, COLO829, to resolve overall complexity and clonality. This melanoma tumor-line has been previously characterized by multiple technologies and is a benchmark for evaluating somatic alterations. In some of these studies, COLO829 has shown conflicting and/or indeterminate copy number and, thus, single-cell sequencing provides a tool for gaining insight. Following shallow single-cell sequencing, we first identified at least four major sub-clones by discriminant analysis of principal components of single-cell copy number data. Based on clustering, break-point and loss of heterozygosity analysis of aggregated data from sub-clones, we identified distinct hallmark events that were validated within bulk sequencing and spectral karyotyping. In summary, COLO829 exhibits a classical Dutrillaux's monosomic/trisomic pattern of karyotype evolution with endoreduplication, where consistent sub-clones emerge from the loss/gain of abnormal chromosomes. Overall, our results demonstrate how shallow copy number profiling can uncover hidden biological insights.


Subjects
Melanoma/genetics , Melanoma/pathology , Single-Cell Analysis/methods , Tumor Cell Line , DNA Copy Number Variations , Humans , Karyotyping , Loss of Heterozygosity , DNA Sequence Analysis
13.
Bioinformatics ; 36(12): 3905-3906, 2020 06 01.
Article in English | MEDLINE | ID: mdl-32330223

ABSTRACT

SUMMARY: Bulk RNA sequencing studies have demonstrated that human leukocyte antigen (HLA) genes may be expressed in a cell type-specific and allele-specific fashion. Single-cell gene expression assays have the potential to further resolve these expression patterns, but currently available methods do not perform allele-specific quantification at the molecule level. Here, we present scHLAcount, a post-processing workflow for single-cell RNA-seq data that computes allele-specific molecule counts of the HLA genes based on a personalized reference constructed from the sample's HLA genotypes. AVAILABILITY AND IMPLEMENTATION: scHLAcount is available under the MIT license at https://github.com/10XGenomics/scHLAcount. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Single-Cell Analysis , Software , Alleles , Gene Expression , Humans , RNA Sequence Analysis , Workflow
14.
Nat Biotechnol ; 38(8): 954-961, 2020 08.
Article in English | MEDLINE | ID: mdl-32231336

ABSTRACT

Single-cell CRISPR screens enable the exploration of mammalian gene function and genetic regulatory networks. However, use of this technology has been limited by reliance on indirect indexing of single-guide RNAs (sgRNAs). Here we present direct-capture Perturb-seq, a versatile screening approach in which expressed sgRNAs are sequenced alongside single-cell transcriptomes. Direct-capture Perturb-seq enables detection of multiple distinct sgRNA sequences from individual cells and thus allows pooled single-cell CRISPR screens to be easily paired with combinatorial perturbation libraries that contain dual-guide expression vectors. We demonstrate the utility of this approach for high-throughput investigations of genetic interactions and, leveraging this ability, dissect epistatic interactions between cholesterol biogenesis and DNA repair. Using direct capture Perturb-seq, we also show that targeting individual genes with multiple sgRNAs per cell improves efficacy of CRISPR interference and activation, facilitating the use of compact, highly active CRISPR libraries for single-cell screens. Last, we show that hybridization-based target enrichment permits sensitive, specific sequencing of informative transcripts from single-cell RNA-seq experiments.


Subjects
CRISPR-Cas Systems , Nucleic Acid Amplification Techniques/methods , Kinetoplastida Guide RNA/genetics , Gene Expression Regulation , Gene Targeting , HEK293 Cells , High-Throughput Nucleotide Sequencing , Humans , Single-Cell Analysis , Transcriptome
16.
Nat Commun ; 10(1): 3660, 2019 08 14.
Article in English | MEDLINE | ID: mdl-31413257

ABSTRACT

Virtually all tumors are genetically heterogeneous, containing mutationally-defined subclonal cell populations that often have distinct phenotypes. Single-cell RNA-sequencing has revealed that a variety of tumors are also transcriptionally heterogeneous, but the relationship between expression heterogeneity and subclonal architecture is unclear. Here, we address this question in the context of Acute Myeloid Leukemia (AML) by integrating whole genome sequencing with single-cell RNA-sequencing (using the 10x Genomics Chromium Single Cell 5' Gene Expression workflow). Applying this approach to five cryopreserved AML samples, we identify hundreds to thousands of cells containing tumor-specific mutations in each case, and use the results to distinguish AML cells (including normal-karyotype AML cells) from normal cells, identify expression signatures associated with subclonal mutations, and find cell surface markers that could be used to purify subclones for further study. This integrative approach for connecting genotype to phenotype is broadly applicable to any sample that is phenotypically and genetically heterogeneous.


Subjects
Acute Myeloid Leukemia/genetics , RNA/genetics , RNA Sequence Analysis/methods , Single-Cell Analysis/methods , Gene Expression Profiling/methods , Genomics , Genotype , Humans , Mutation , Phenotype , Single Nucleotide Polymorphism , Whole Genome Sequencing
17.
Hum Genet ; 138(7): 715-721, 2019 Jul.
Article in English | MEDLINE | ID: mdl-31087184

ABSTRACT

Sequences encoding Olduvai (DUF1220) protein domains show the largest human-specific increase in copy number of any coding region in the genome and have been linked to human brain evolution. Most human-specific copies of Olduvai (119/165) are encoded by three NBPF genes that are adjacent to three human-specific NOTCH2NL genes that have been shown to promote cortical neurogenesis. Here, employing genomic, phylogenetic, and transcriptomic evidence, we show that these NOTCH2NL/NBPF gene pairs evolved jointly, as two-gene units, very recently in human evolution, and are likely co-regulated. Remarkably, while three NOTCH2NL paralogs were added, adjacent Olduvai sequences hyper-amplified, adding 119 human-specific copies. The data suggest that human-specific Olduvai domains and adjacent NOTCH2NL genes may function in a coordinated, complementary fashion to promote neurogenesis and human brain expansion in a dosage-related manner.


Subjects
Biological Evolution , Brain/growth & development , Brain/metabolism , Carrier Proteins/genetics , Human Genome , Notch2 Receptor/genetics , Genomics , Humans , Phylogeny , Protein Domains
18.
G3 (Bethesda) ; 9(6): 1795-1805, 2019 06 05.
Article in English | MEDLINE | ID: mdl-30996023

ABSTRACT

Isogenic laboratory mouse strains enhance reproducibility because individual animals are genetically identical. For the most widely used isogenic strain, C57BL/6, there exists a wealth of genetic, phenotypic, and genomic data, including a high-quality reference genome (GRCm38.p6). Now 20 years after the first release of the mouse reference genome, C57BL/6J mice are at least 26 inbreeding generations removed from GRCm38 and the strain is now maintained with periodic reintroduction of cryorecovered mice derived from a single breeder pair, aptly named Adam and Eve. To provide an update to the mouse reference genome that more accurately represents the genome of today's C57BL/6J mice, we took advantage of long read, short read, and optical mapping technologies to generate a de novo assembly of the C57BL/6J Eve genome (B6Eve). Using these data, we have addressed recurring variants observed in previous mouse genomic studies. We have also identified structural variations, closed gaps in the mouse reference assembly, and revealed previously unannotated coding sequences. This B6Eve assembly explains discrepant observations that have been associated with GRCm38-based analyses, and will inform a reference genome that is more representative of the C57BL/6J mice that are in use today.


Subjects
Genome , Genomics , Animals , Computational Biology/methods , Female , Genomics/methods , Inbreeding , Male , Mice , Inbred C57BL Mice , Pedigree , Phenotype , Single Nucleotide Polymorphism
19.
Genome Res ; 29(4): 635-645, 2019 04.
Article in English | MEDLINE | ID: mdl-30894395

ABSTRACT

Large-scale population analyses coupled with advances in technology have demonstrated that the human genome is more diverse than originally thought. To date, this diversity has largely been uncovered using short-read whole-genome sequencing. However, these short-read approaches fail to give a complete picture of a genome. They struggle to identify structural events, cannot access repetitive regions, and fail to resolve the human genome into haplotypes. Here, we describe an approach that retains long range information while maintaining the advantages of short reads. Starting from ∼1 ng of high molecular weight DNA, we produce barcoded short-read libraries. Novel informatic approaches allow for the barcoded short reads to be associated with their original long molecules producing a novel data type known as "Linked-Reads". This approach allows for simultaneous detection of small and large variants from a single library. In this manuscript, we show the advantages of Linked-Reads over standard short-read approaches for reference-based analysis. Linked-Reads allow mapping to 38 Mb of sequence not accessible to short reads, adding sequence in 423 difficult-to-sequence genes including disease-relevant genes STRC, SMN1, and SMN2. Both Linked-Read whole-genome and whole-exome sequencing identify complex structural variations, including balanced events and single exon deletions and duplications. Further, Linked-Reads extend the region of high-confidence calls by 68.9 Mb. The data presented here show that Linked-Reads provide a scalable approach for comprehensive genome analysis that is not possible using short reads alone.


Subjects
Genome-Wide Association Study/methods , Genetic Polymorphism , Whole Genome Sequencing/methods , Cell Line , Human Genome , Humans , Intercellular Signaling Peptides and Proteins , Membrane Proteins/genetics , Survival of Motor Neuron 1 Protein/genetics , Survival of Motor Neuron 2 Protein/genetics
20.
Cell ; 176(4): 743-756.e17, 2019 02 07.
Article in English | MEDLINE | ID: mdl-30735633

ABSTRACT

Direct comparisons of human and non-human primate brains can reveal molecular pathways underlying remarkable specializations of the human brain. However, chimpanzee tissue is inaccessible during neocortical neurogenesis when differences in brain size first appear. To identify human-specific features of cortical development, we leveraged recent innovations that permit generating pluripotent stem cell-derived cerebral organoids from chimpanzee. Despite metabolic differences, organoid models preserve gene regulatory networks related to primary cell types and developmental processes. We further identified 261 differentially expressed genes in human compared to both chimpanzee organoids and macaque cortex, enriched for recent gene duplications, and including multiple regulators of PI3K-AKT-mTOR signaling. We observed increased activation of this pathway in human radial glia, dependent on two receptors upregulated specifically in human: INSR and ITGB8. Our findings establish a platform for systematic analysis of molecular changes contributing to human brain development and evolution.


Subjects
Cerebral Cortex/cytology , Organoids/metabolism , Animals , Biological Evolution , Brain/cytology , Cell Culture Techniques/methods , Cell Differentiation/genetics , Cerebral Cortex/metabolism , Gene Regulatory Networks/genetics , Humans , Induced Pluripotent Stem Cells/cytology , Macaca , Neurogenesis/genetics , Organoids/growth & development , Pan troglodytes , Pluripotent Stem Cells/cytology , Single-Cell Analysis , Species Specificity , Transcriptome/genetics