Results 1 - 20 of 37
1.
Article in English | MEDLINE | ID: mdl-38663087

ABSTRACT

The Human Genome Project was an enormous accomplishment, providing a foundation for countless explorations into the genetics and genomics of the human species. Yet for many years, the human genome reference sequence remained incomplete and lacked representation of human genetic diversity. Recently, two major advances have emerged to address these shortcomings: complete gap-free human genome sequences, such as the one developed by the Telomere-to-Telomere Consortium, and high-quality pangenomes, such as the one developed by the Human Pangenome Reference Consortium. Facilitated by advances in long-read DNA sequencing and genome assembly algorithms, complete human genome sequences resolve regions that have been historically difficult to sequence, including centromeres, telomeres, and segmental duplications. In parallel, pangenomes capture the extensive genetic diversity across populations worldwide. Together, these advances usher in a new era of genomics research, enhancing the accuracy of genomic analysis, paving the path for precision medicine, and contributing to deeper insights into human biology.

2.
Genome Res ; 34(3): 454-468, 2024 Apr 25.
Article in English | MEDLINE | ID: mdl-38627094

ABSTRACT

Reference-free genome phasing is vital for understanding allele inheritance and the impact of single-molecule DNA variation on phenotypes. To achieve thorough phasing across homozygous or repetitive regions of the genome, long-read sequencing technologies are often used to perform phased de novo assembly. As a step toward reducing the cost and complexity of this type of analysis, we describe new methods for accurately phasing Oxford Nanopore Technologies (ONT) sequence data with the Shasta genome assembler and a modular tool for extending phasing to the chromosome scale called GFAse. We test using new variants of ONT PromethION sequencing, including those using proximity ligation, and show that newer, higher accuracy ONT reads substantially improve assembly quality.


Subject(s)
Nanopores; Humans; Sequence Analysis, DNA/methods; Nanopore Sequencing/methods; High-Throughput Nucleotide Sequencing/methods; Software; Genomics/methods
3.
Cell Rep ; 43(4): 113988, 2024 Apr 23.
Article in English | MEDLINE | ID: mdl-38517886

ABSTRACT

The basal breast cancer subtype is enriched for triple-negative breast cancer (TNBC) and displays consistent large chromosomal deletions. Here, we characterize evolution and maintenance of chromosome 4p (chr4p) loss in basal breast cancer. Analysis of The Cancer Genome Atlas data shows recurrent deletion of chr4p in basal breast cancer. Phylogenetic analysis of a panel of 23 primary tumor/patient-derived xenograft basal breast cancers reveals early evolution of chr4p deletion. Mechanistically, we show that chr4p loss is associated with enhanced proliferation. Gene function studies identify an unknown gene, C4orf19, within chr4p, which suppresses proliferation when overexpressed and is a member of the PDCD10-GCKIII kinase module; we name it PGCKA1. Genome-wide pooled overexpression screens using a barcoded library of human open reading frames identify chromosomal regions, including chr4p, that suppress proliferation when overexpressed in a context-dependent manner, implicating network interactions. Together, these results shed light on the early emergence of complex aneuploid karyotypes involving chr4p and adaptive landscapes shaping breast cancer genomes.


Subject(s)
Breast Neoplasms; Gene Regulatory Networks; Humans; Female; Breast Neoplasms/genetics; Breast Neoplasms/pathology; Animals; Mice; Chromosomes, Human, Pair 4/genetics; Cell Proliferation/genetics; Chromosome Aberrations; Cell Line, Tumor; Triple Negative Breast Neoplasms/genetics; Triple Negative Breast Neoplasms/pathology
4.
Nat Biotechnol ; 42(4): 663-673, 2024 Apr.
Article in English | MEDLINE | ID: mdl-37165083

ABSTRACT

Pangenome references address biases of reference genomes by storing a representative set of diverse haplotypes and their alignment, usually as a graph. Alternate alleles determined by variant callers can be used to construct pangenome graphs, but advances in long-read sequencing are leading to widely available, high-quality phased assemblies. Constructing a pangenome graph directly from assemblies, as opposed to variant calls, leverages the graph's ability to represent variation at different scales. Here we present the Minigraph-Cactus pangenome pipeline, which creates pangenomes directly from whole-genome alignments, and demonstrate its ability to scale to 90 human haplotypes from the Human Pangenome Reference Consortium. The method builds graphs containing all forms of genetic variation while allowing use of current mapping and genotyping tools. We measure the effect of the quality and completeness of reference genomes used for analysis within the pangenomes and show that using the CHM13 reference from the Telomere-to-Telomere Consortium improves the accuracy of our methods. We also demonstrate construction of a Drosophila melanogaster pangenome.


Subject(s)
Drosophila melanogaster; High-Throughput Nucleotide Sequencing; Humans; Animals; Drosophila melanogaster/genetics; Haplotypes/genetics; High-Throughput Nucleotide Sequencing/methods; Alleles; Sequence Analysis, DNA; Genome, Human/genetics
5.
Nat Methods ; 20(10): 1483-1492, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37710018

ABSTRACT

Long-read sequencing technologies substantially overcome the limitations of short reads but have not been considered a feasible replacement for population-scale projects, being some combination of too expensive, not scalable enough, or too error-prone. Here we develop an efficient and scalable wet lab and computational protocol, Napu, for Oxford Nanopore Technologies long-read sequencing that seeks to address those limitations. We applied our protocol to cell lines and brain tissue samples as part of a pilot project for the National Institutes of Health Center for Alzheimer's and Related Dementias. Using a single PromethION flow cell, we can detect single nucleotide polymorphisms with F1-score comparable to Illumina short-read sequencing. Small indel calling remains difficult within homopolymers and tandem repeats, but achieves good concordance with Illumina indel calls elsewhere. Further, we can discover structural variants with F1-score on par with state-of-the-art de novo assembly methods. Our protocol phases small and structural variants at megabase scales and produces highly accurate, haplotype-specific methylation calls.


Subject(s)
Genome, Human; Nanopore Sequencing; Humans; Sequence Analysis, DNA/methods; Haplotypes; Methylation; Pilot Projects; High-Throughput Nucleotide Sequencing/methods
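
Entry 5 reports variant-calling accuracy as an F1-score. As a minimal sketch of how that metric is conventionally computed when a call set is compared against a truth set, the function below takes counts of true positives, false positives, and false negatives. The function name and the example counts are hypothetical and are not taken from the paper.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall for a variant call set.

    tp: calls that match the truth set
    fp: calls absent from the truth set
    fn: truth-set variants that were not called
    """
    if tp == 0:
        # No correct calls: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only.
example = f1_score(tp=990, fp=10, fn=10)
```

Because F1 is a harmonic mean, a caller cannot score well by trading many false positives for few false negatives (or the reverse), which is why it is a common single-number summary in benchmarks of this kind.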
6.
Nature ; 617(7960): 312-324, 2023 May.
Article in English | MEDLINE | ID: mdl-37165242

ABSTRACT

Here the Human Pangenome Reference Consortium presents a first draft of the human pangenome reference. The pangenome contains 47 phased, diploid assemblies from a cohort of genetically diverse individuals. These assemblies cover more than 99% of the expected sequence in each genome and are more than 99% accurate at the structural and base pair levels. Based on alignments of the assemblies, we generate a draft pangenome that captures known variants and haplotypes and reveals new alleles at structurally complex loci. We also add 119 million base pairs of euchromatic polymorphic sequences and 1,115 gene duplications relative to the existing reference GRCh38. Roughly 90 million of the additional base pairs are derived from structural variation. Using our draft pangenome to analyse short-read data reduced small variant discovery errors by 34% and increased the number of structural variants detected per haplotype by 104% compared with GRCh38-based workflows, which enabled the typing of the vast majority of structural variant alleles per sample.


Subject(s)
Genome, Human; Genomics; Humans; Diploidy; Genome, Human/genetics; Haplotypes/genetics; Sequence Analysis, DNA; Genomics/standards; Reference Standards; Cohort Studies; Alleles; Genetic Variation
7.
bioRxiv ; 2023 Feb 22.
Article in English | MEDLINE | ID: mdl-36865218

ABSTRACT

As a step towards simplifying and reducing the cost of haplotype-resolved de novo assembly, we describe new methods for accurately phasing nanopore data with the Shasta genome assembler and a modular tool for extending phasing to the chromosome scale called GFAse. We test using new variants of Oxford Nanopore Technologies' (ONT) PromethION sequencing, including those using proximity ligation, and show that newer, higher accuracy ONT reads substantially improve assembly quality.

9.
bioRxiv ; 2023 Apr 05.
Article in English | MEDLINE | ID: mdl-36711673

ABSTRACT

Long-read sequencing technologies substantially overcome the limitations of short reads but to date have not been considered a feasible replacement at scale, being some combination of too expensive, not scalable enough, or too error-prone. Here, we develop an efficient and scalable wet lab and computational protocol for Oxford Nanopore Technologies (ONT) long-read sequencing that seeks to provide a genuine alternative to short reads for large-scale genomics projects. We applied our protocol to cell lines and brain tissue samples as part of a pilot project for the NIH Center for Alzheimer's and Related Dementias (CARD). Using a single PromethION flow cell, we can detect SNPs with F1-score better than Illumina short-read sequencing. Small indel calling remains difficult inside homopolymers and tandem repeats, but is comparable to Illumina calls elsewhere. Further, we can discover structural variants with F1-score comparable to state-of-the-art methods involving Pacific Biosciences HiFi sequencing and trio information (but at a lower cost and greater throughput). Using ONT-based phasing, we can then combine and phase small and structural variants at megabase scales. Our protocol also produces highly accurate, haplotype-specific methylation calls. Overall, this makes large-scale long-read sequencing projects feasible; the protocol is currently being used to sequence thousands of brain-based genomes as a part of the NIH CARD initiative. We provide the protocol and software as open-source integrated pipelines for generating phased variant calls and assemblies.

10.
bioRxiv ; 2023 Dec 15.
Article in English | MEDLINE | ID: mdl-38168361

ABSTRACT

Pangenomes, by including genetic diversity, should reduce reference bias by better representing the new samples that are compared against them. Yet when comparing a new sample to a pangenome, variants in the pangenome that are not part of the sample can be misleading, for example, causing false read mappings. These irrelevant variants are generally rarer in terms of allele frequency, and have previously been dealt with using allele frequency filters. However, this is a blunt heuristic that both fails to remove some irrelevant variants and removes many relevant variants. We propose a new approach, inspired by local ancestry inference methods, that imputes a personalized pangenome subgraph based on sampling local haplotypes according to k-mer counts in the reads. Our approach is tailored for the Giraffe short read aligner, as the indexes it needs for read mapping can be built quickly. We compare the accuracy of our approach to state-of-the-art methods using graphs from the Human Pangenome Reference Consortium. The resulting personalized pangenome pipelines provide faster pangenome read mapping than comparable pipelines that use a linear reference, reduce small variant genotyping errors by 4x relative to the Genome Analysis Toolkit (GATK) best-practice pipeline, and for the first time make short-read structural variant genotyping competitive with long-read discovery methods.
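
The idea in entry 10 of choosing haplotypes according to k-mer counts in the reads can be illustrated with a toy sketch. This is not the authors' implementation, which works on local regions of a pangenome graph. Here, candidate haplotypes are plain strings, and each is scored by the fraction of its distinct k-mers that appear in the reads. All function names, sequences, and the scoring rule are assumptions made for illustration.

```python
from collections import Counter


def kmers(seq: str, k: int):
    """Yield every overlapping substring of length k."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def count_read_kmers(reads, k: int) -> Counter:
    """Count how often each k-mer occurs across all reads."""
    counts = Counter()
    for read in reads:
        counts.update(kmers(read, k))
    return counts


def rank_haplotypes(haplotypes: dict, read_counts: Counter, k: int) -> list:
    """Order haplotype names by the fraction of their distinct k-mers seen in the reads."""
    scores = {}
    for name, seq in haplotypes.items():
        hap_kmers = set(kmers(seq, k))
        present = sum(1 for km in hap_kmers if read_counts[km] > 0)
        scores[name] = present / len(hap_kmers) if hap_kmers else 0.0
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: the reads were drawn from hapA.
haplotypes = {"hapA": "ACGTACGTTA", "hapB": "ACGTTCGTTA"}
reads = ["ACGTAC", "TACGTT", "CGTTA"]
ranking = rank_haplotypes(haplotypes, count_read_kmers(reads, 4), 4)
```

Unlike an allele frequency filter, a score of this kind depends on the sample's own reads, so a rare haplotype that the sample actually carries can still rank first.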

11.
Neuro Oncol ; 24(9): 1494-1508, 2022 Sep 01.
Article in English | MEDLINE | ID: mdl-35416251

ABSTRACT

BACKGROUND: Glioblastoma is a treatment-resistant brain cancer. Its hierarchical cellular nature and its tumor microenvironment (TME) before, during, and after treatments remain unresolved. METHODS: Here, we used single-cell RNA sequencing to analyze new and recurrent glioblastoma and the nearby subventricular zone (SVZ). RESULTS: We found that 4 glioblastoma neural lineages are present in new and recurrent glioblastoma, with an enrichment of the cancer mesenchymal lineage, immune cells, and reactive astrocytes in early recurrences. Cancer lineages were hierarchically organized around cycling oligodendrocytic and astrocytic progenitors that are transcriptomically similar to but distinct from SVZ neural stem cells (NSCs). Furthermore, NSCs from the SVZ of patients with glioblastoma harbored glioblastoma chromosomal anomalies. Lastly, mesenchymal cancer cells and TME reactive astrocytes shared similar gene signatures, which were induced by radiotherapy in a myeloid-dependent fashion in vivo. CONCLUSION: These data reveal the dynamic, immune-dependent nature of glioblastoma's response to treatments and identify distant NSCs as likely cells of origin.


Subject(s)
Brain Neoplasms; Glioblastoma; Neural Stem Cells; Brain Neoplasms/pathology; Glioblastoma/pathology; Humans; Lateral Ventricles/pathology; Neural Stem Cells/pathology; Single-Cell Analysis; Tumor Microenvironment
12.
Nat Biotechnol ; 40(7): 1035-1041, 2022 Jul.
Article in English | MEDLINE | ID: mdl-35347328

ABSTRACT

Whole-genome sequencing (WGS) can identify variants that cause genetic disease, but the time required for sequencing and analysis has been a barrier to its use in acutely ill patients. In the present study, we develop an approach for ultra-rapid nanopore WGS that combines an optimized sample preparation protocol, distributing sequencing over 48 flow cells, near real-time base calling and alignment, accelerated variant calling and fast variant filtration for efficient manual review. Application to two example clinical cases identified a candidate variant in <8 h from sample preparation to variant identification. We show that this framework provides accurate variant calls and efficient prioritization, and accelerates diagnostic clinical genome sequencing twofold compared with previous approaches.


Subject(s)
Nanopore Sequencing; Nanopores; Chromosome Mapping; High-Throughput Nucleotide Sequencing/methods; Humans; Whole Genome Sequencing/methods
15.
Science ; 374(6574): abg8871, 2021 Dec 17.
Article in English | MEDLINE | ID: mdl-34914532

ABSTRACT

We introduce Giraffe, a pangenome short-read mapper that can efficiently map to a collection of haplotypes threaded through a sequence graph. Giraffe maps sequencing reads to thousands of human genomes at a speed comparable to that of standard methods mapping to a single reference genome. The increased mapping accuracy enables downstream improvements in genome-wide genotyping pipelines for both small variants and larger structural variants. We used Giraffe to genotype 167,000 structural variants, discovered in long-read studies, in 5202 diverse human genomes that were sequenced using short reads. We conclude that pangenomics facilitates a more comprehensive characterization of variation and, as a result, has the potential to improve many genomic analyses.


Subject(s)
Genetic Variation; Genome, Human; Genomics/methods; Genotyping Techniques; Algorithms; Alleles; Computational Biology; Genome, Fungal; Genotype; Haplotypes; High-Throughput Nucleotide Sequencing; Humans; Polymorphism, Single Nucleotide; Quantitative Trait Loci; Saccharomyces/genetics; Saccharomyces cerevisiae/genetics; Sequence Analysis, DNA
16.
F1000Res ; 10: 246, 2021.
Article in English | MEDLINE | ID: mdl-34621504

ABSTRACT

In October 2020, 62 scientists from nine nations worked together remotely in the Second Baylor College of Medicine & DNAnexus hackathon, focusing on related topics in structural variation, pan-genomes, and SARS-CoV-2 research. The overarching focus was to assess the current status of the field, identify the remaining challenges, and determine how to combine the strengths of the different interests to drive research and method development forward. Over the four days, eight groups each designed and developed new open-source methods to improve the identification and analysis of variations among species, including humans and SARS-CoV-2. These included improvements in SV calling, genotyping, annotation, and filtering, together with advancements in benchmarking existing methods. Furthermore, groups focused on the diversity of SARS-CoV-2. Daily discussion summaries and methods are publicly available at https://github.com/collaborativebioinformatics and provide valuable insights for both participants and the research community.


Subject(s)
COVID-19; SARS-CoV-2; Animals; Genome, Viral; Humans; Vertebrates
17.
Methods Mol Biol ; 2381: 285-303, 2021.
Article in English | MEDLINE | ID: mdl-34590283

ABSTRACT

Cancer can develop from an accumulation of alterations, some of which cause a nonmalignant cell to transform to a malignant state exhibiting an increased rate of cell growth and evasion of growth suppressive mechanisms, eventually leading to tissue invasion and metastatic disease. Triple-negative breast cancers (TNBC) are heterogeneous and are clinically characterized by the lack of expression of hormone receptors and human epidermal growth factor receptor 2 (HER2), which limits their treatment options. Since tumor evolution is driven by diverse cancer cell populations and their microenvironment, it is imperative to map TNBC at single-cell resolution. Here, we describe an experimental procedure for isolating a single-cell suspension from a TNBC patient-derived xenograft, subjecting it to single-cell RNA sequencing using droplet-based technology from 10x Genomics, and analyzing the transcriptomic data at single-cell resolution to obtain inferred copy number aberration profiles, using scCNA. Data obtained using this single-cell RNA sequencing experimental and analytical methodology should enhance our understanding of intratumor heterogeneity, which is key for identifying genetic vulnerabilities and developing effective therapies.


Subject(s)
DNA Copy Number Variations; Triple Negative Breast Neoplasms; Animals; Cell Line, Tumor; Disease Models, Animal; Genomics; Heterografts; Humans; Triple Negative Breast Neoplasms/genetics; Tumor Microenvironment
18.
Neuro Oncol ; 23(9): 1470-1480, 2021 Sep 01.
Article in English | MEDLINE | ID: mdl-33433612

ABSTRACT

BACKGROUND: Sixty percent of surgically resected brain metastases (BrM) recur within 1 year. These recurrences have long been thought to result from the dispersion of cancer cells during surgery. We tested the alternative hypothesis that invasion of cancer cells into the adjacent brain plays a significant role in local recurrence and shortened overall survival. METHODS: We determined the invasion pattern of 164 surgically resected BrM and correlated it with local recurrence and overall survival. We performed single-cell RNA sequencing (scRNAseq) of >15,000 cells from BrM and adjacent brain tissue. Validation of targets was performed with a novel cohort of BrM patient-derived xenografts (PDX) and patient tissues. RESULTS: We demonstrate that invasion of metastatic cancer cells into the adjacent brain is associated with local recurrence and shortened overall survival. scRNAseq of paired tumor and adjacent brain samples confirmed the existence of invasive cancer cells in the tumor-adjacent brain. Analysis of these cells identified cold-inducible RNA-binding protein (CIRBP) overexpression in invasive cancer cells compared to cancer cells located within the metastases. Applying PDX models that recapitulate the invasion pattern observed in patients, we show that CIRBP is overexpressed in highly invasive BrM and is required for efficient invasive growth in the brain. CONCLUSIONS: These data demonstrate peritumoral invasion as a driver of treatment failure in BrM that is functionally mediated by CIRBP. These findings improve our understanding of the biology underlying postoperative treatment failure and lay the groundwork for rational clinical trial development based upon invasion pattern in surgically resected BrM.


Subject(s)
Brain Neoplasms; Radiosurgery; Brain; Brain Neoplasms/genetics; Brain Neoplasms/surgery; Humans; Neoplasm Recurrence, Local/genetics; RNA-Binding Proteins/genetics
20.
Clin Cancer Res ; 26(20): 5462-5476, 2020 Oct 15.
Article in English | MEDLINE | ID: mdl-32816949

ABSTRACT

PURPOSE: Pancreatic ductal adenocarcinoma (PDAC) arising in patients with a germline BRCA1 or BRCA2 (gBRCA) mutation may be sensitive to platinum and PARP inhibitors (PARPi). However, treatment stratification based on gBRCA mutational status alone is associated with heterogeneous responses. EXPERIMENTAL DESIGN: We performed a seven-arm preclinical trial consisting of 471 mice, representing 12 unique PDAC patient-derived xenografts, of which nine were gBRCA mutated. From 179 patients whose PDAC was whole-genome and transcriptome sequenced, we identified 21 cases with homologous recombination deficiency (HRD), and investigated prognostic biomarkers. RESULTS: We found that biallelic inactivation of BRCA1/BRCA2 is associated with genomic hallmarks of HRD and required for cisplatin and talazoparib (PARPi) sensitivity. However, HRD genomic hallmarks persisted in xenografts despite the emergence of therapy resistance, indicating the presence of a genomic scar. We identified tumor polyploidy and a low Ki67 index as predictors of poor cisplatin and talazoparib response. In patients with HRD PDAC, tumor polyploidy and a basal-like transcriptomic subtype were independent predictors of shorter survival. To facilitate clinical assignment of transcriptomic subtype, we developed a novel pragmatic two-marker assay (GATA6:KRT17). CONCLUSIONS: In summary, we propose a predictive and prognostic model of gBRCA-mutated PDAC on the basis of HRD genomic hallmarks, Ki67 index, tumor ploidy, and transcriptomic subtype.


Subject(s)
BRCA1 Protein/genetics; BRCA2 Protein/genetics; Homologous Recombination/drug effects; Pancreatic Neoplasms/drug therapy; Animals; Biomarkers, Tumor/genetics; Cisplatin/administration & dosage; Cisplatin/adverse effects; Disease Models, Animal; Female; Heterografts; Humans; Male; Mice; Mutation; Pancreatic Neoplasms/genetics; Pancreatic Neoplasms/pathology; Phthalazines/administration & dosage; Phthalazines/adverse effects; Poly(ADP-ribose) Polymerase Inhibitors/administration & dosage