Results 1 - 20 of 26
1.
medRxiv ; 2024 Mar 26.
Article in English | MEDLINE | ID: mdl-38585974

ABSTRACT

Most current studies rely on short-read sequencing to detect somatic structural variation (SV) in cancer genomes. Long-read sequencing offers the advantage of better mappability and long-range phasing, which results in substantial improvements in germline SV detection. However, current long-read SV detection methods do not generalize well to the analysis of somatic SVs in tumor genomes with complex rearrangements, heterogeneity, and aneuploidy. Here, we present Severus: a method for the accurate detection of different types of somatic SVs using a phased breakpoint graph approach. To benchmark various short- and long-read SV detection methods, we sequenced five tumor/normal cell line pairs with Illumina, Nanopore, and PacBio sequencing platforms; on this benchmark Severus showed the highest F1 scores (harmonic mean of the precision and recall) as compared to long-read and short-read methods. We then applied Severus to three clinical cases of pediatric cancer, demonstrating concordance with known genetic findings as well as revealing clinically relevant cryptic rearrangements missed by standard genomic panels.
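
The F1 score mentioned above is the harmonic mean of precision and recall. A minimal sketch of that arithmetic follows; the function name and the count-based interface are assumptions made for illustration and are not taken from the Severus benchmark code.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return the harmonic mean of precision and recall (hypothetical helper)."""
    called = true_positives + false_positives   # all calls made by the method
    truth = true_positives + false_negatives    # all variants in the truth set
    if true_positives == 0 or called == 0 or truth == 0:
        return 0.0
    precision = true_positives / called
    recall = true_positives / truth
    return 2 * precision * recall / (precision + recall)
```

Because the harmonic mean is dominated by the smaller of the two values, a caller with high precision but poor recall (or the reverse) still receives a low F1.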

2.
medRxiv ; 2024 Mar 07.
Article in English | MEDLINE | ID: mdl-38496498

ABSTRACT

Less than half of individuals with a suspected Mendelian condition receive a precise molecular diagnosis after comprehensive clinical genetic testing. Improvements in data quality and costs have heightened interest in using long-read sequencing (LRS) to streamline clinical genomic testing, but the absence of control datasets for variant filtering and prioritization has made tertiary analysis of LRS data challenging. To address this, the 1000 Genomes Project ONT Sequencing Consortium aims to generate LRS data from at least 800 of the 1000 Genomes Project samples. Our goal is to use LRS to identify a broader spectrum of variation so we may improve our understanding of normal patterns of human variation. Here, we present data from analysis of the first 100 samples, representing all 5 superpopulations and 19 subpopulations. These samples, sequenced to an average depth of coverage of 37x and sequence read N50 of 54 kbp, have high concordance with previous studies for identifying single nucleotide and indel variants outside of homopolymer regions. Using multiple structural variant (SV) callers, we identify an average of 24,543 high-confidence SVs per genome, including shared and private SVs likely to disrupt gene function as well as pathogenic expansions within disease-associated repeats that were not detected using short reads. Evaluation of methylation signatures revealed expected patterns at known imprinted loci, samples with skewed X-inactivation patterns, and novel differentially methylated regions. All raw sequencing data, processed data, and summary statistics are publicly available, providing a valuable resource for the clinical genetics community to discover pathogenic SVs.
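
The read N50 quoted above (54 kbp) is the read length at which reads of that length or longer contain at least half of all sequenced bases. The sketch below shows one common way to compute it; the function name is hypothetical and this is not the consortium's own code.

```python
def read_n50(lengths):
    """Return the N50 of a collection of read lengths (hypothetical helper)."""
    total = sum(lengths)
    if total <= 0:
        raise ValueError("need at least one read with positive length")
    running = 0
    # Walk from the longest read down until half of all bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
```

For example, reads of 2, 3, 4, 5, and 6 kbp total 20 kbp; the two longest reads already cover 11 kbp, so the N50 is 5 kbp.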

3.
Am J Hum Genet ; 111(3): 544-561, 2024 Mar 07.
Article in English | MEDLINE | ID: mdl-38307027

ABSTRACT

Cervical cancer is caused by human papillomavirus (HPV) infection, has few approved targeted therapeutics, and is the most common cause of cancer death in low-resource countries. We characterized 19 cervical and four head and neck cancer cell lines using long-read DNA and RNA sequencing and identified the HPV types, HPV integration sites, chromosomal alterations, and cancer driver mutations. Structural variation analysis revealed telomeric deletions associated with DNA inversions resulting from breakage-fusion-bridge (BFB) cycles. BFB is a common mechanism of chromosomal alterations in cancer, and our study applies long-read sequencing to this important chromosomal rearrangement type. Analysis of the inversion sites revealed staggered ends consistent with exonuclease digestion of the DNA after breakage. Some BFB events are complex, involving inter- or intra-chromosomal insertions or rearrangements. None of the BFB breakpoints had telomere sequences added to resolve the dicentric chromosomes, and only one BFB breakpoint showed chromothripsis. Five cell lines have a chromosomal region 11q BFB event, with YAP1-BIRC3-BIRC2 amplification. Indeed, YAP1 amplification is associated with a 10-year-earlier age of diagnosis of cervical cancer and is three times more common in African American women. This suggests that individuals with cervical cancer and YAP1-BIRC3-BIRC2 amplification, especially those of African ancestry, might benefit from targeted therapy. In summary, we uncovered valuable insights into the mechanisms and consequences of BFB cycles in cervical cancer using long-read sequencing.


Subjects
Papillomavirus Infections, Uterine Cervical Neoplasms, Female, Humans, Uterine Cervical Neoplasms/genetics, Chromosome Aberrations, Telomere/genetics, DNA
4.
medRxiv ; 2023 Aug 22.
Article in English | MEDLINE | ID: mdl-37662332

ABSTRACT

Cervical cancer is caused by human papillomavirus (HPV) infection, has few approved targeted therapeutics, and is the most common cause of cancer death in low-resource countries. We characterized 19 cervical and four head and neck cell lines using long-read DNA and RNA sequencing and identified the HPV types, HPV integration sites, chromosomal alterations, and cancer driver mutations. Structural variation analysis revealed telomeric deletions associated with DNA inversions resulting from breakage-fusion-bridge (BFB) cycles. BFB is a common mechanism of chromosomal alterations in cancer, and this is one of the first analyses of these events using long-read sequencing. Analysis of the inversion sites revealed staggered ends consistent with exonuclease digestion of the DNA after breakage. Some BFB events are complex, involving inter- or intra-chromosomal insertions or rearrangements. None of the BFB breakpoints had telomere sequences added to resolve the dicentric chromosomes and only one BFB breakpoint showed chromothripsis. Five cell lines have a Chr11q BFB event, with YAP1/BIRC2/BIRC3 gene amplification. Indeed, YAP1 amplification is associated with a 10-year earlier age of diagnosis of cervical cancer and is three times more common in African American women. This suggests that cervical cancer patients with YAP1/BIRC2/BIRC3-amplification, especially those of African American ancestry, might benefit from targeted therapy. In summary, we uncovered new insights into the mechanisms and consequences of BFB cycles in cervical cancer using long-read sequencing.

5.
Nat Methods ; 20(10): 1483-1492, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37710018

ABSTRACT

Long-read sequencing technologies substantially overcome the limitations of short reads but have not been considered a feasible replacement for population-scale projects, being some combination of too expensive, not scalable enough, or too error-prone. Here we develop an efficient and scalable wet lab and computational protocol, Napu, for Oxford Nanopore Technologies long-read sequencing that seeks to address those limitations. We applied our protocol to cell lines and brain tissue samples as part of a pilot project for the National Institutes of Health Center for Alzheimer's and Related Dementias. Using a single PromethION flow cell, we can detect single nucleotide polymorphisms with F1-score comparable to Illumina short-read sequencing. Small indel calling remains difficult within homopolymers and tandem repeats, but achieves good concordance to Illumina indel calls elsewhere. Further, we can discover structural variants with F1-score on par with state-of-the-art de novo assembly methods. Our protocol phases small and structural variants at megabase scales and produces highly accurate, haplotype-specific methylation calls.


Subjects
Human Genome, Nanopore Sequencing, Humans, DNA Sequence Analysis/methods, Haplotypes, Methylation, Pilot Projects, High-Throughput Nucleotide Sequencing/methods
6.
bioRxiv ; 2023 Apr 05.
Article in English | MEDLINE | ID: mdl-36711673

ABSTRACT

Long-read sequencing technologies substantially overcome the limitations of short reads but to date have not been considered a feasible replacement at scale due to a combination of being too expensive, not scalable enough, or too error-prone. Here, we develop an efficient and scalable wet lab and computational protocol for Oxford Nanopore Technologies (ONT) long-read sequencing that seeks to provide a genuine alternative to short reads for large-scale genomics projects. We applied our protocol to cell lines and brain tissue samples as part of a pilot project for the NIH Center for Alzheimer's and Related Dementias (CARD). Using a single PromethION flow cell, we can detect SNPs with F1-score better than Illumina short-read sequencing. Small indel calling remains difficult inside homopolymers and tandem repeats, but is comparable to Illumina calls elsewhere. Further, we can discover structural variants with F1-score comparable to state-of-the-art methods involving Pacific Biosciences HiFi sequencing and trio information (but at a lower cost and greater throughput). Using ONT-based phasing, we can then combine and phase small and structural variants at megabase scales. Our protocol also produces highly accurate, haplotype-specific methylation calls. Overall, this makes large-scale long-read sequencing projects feasible; the protocol is currently being used to sequence thousands of brain-based genomes as a part of the NIH CARD initiative. We provide the protocol and software as open-source integrated pipelines for generating phased variant calls and assemblies.

7.
bioRxiv ; 2023 Dec 15.
Article in English | MEDLINE | ID: mdl-38168361

ABSTRACT

Pangenomes, by including genetic diversity, should reduce reference bias by better representing new samples compared to them. Yet when comparing a new sample to a pangenome, variants in the pangenome that are not part of the sample can be misleading, for example, causing false read mappings. These irrelevant variants are generally rarer in terms of allele frequency, and have previously been dealt with using allele frequency filters. However, this is a blunt heuristic that both fails to remove some irrelevant variants and removes many relevant variants. We propose a new approach, inspired by local ancestry inference methods, that imputes a personalized pangenome subgraph based on sampling local haplotypes according to k-mer counts in the reads. Our approach is tailored for the Giraffe short read aligner, as the indexes it needs for read mapping can be built quickly. We compare the accuracy of our approach to state-of-the-art methods using graphs from the Human Pangenome Reference Consortium. The resulting personalized pangenome pipelines provide faster pangenome read mapping than comparable pipelines that use a linear reference, reduce small variant genotyping errors by 4x relative to the Genome Analysis Toolkit (GATK) best-practice pipeline, and for the first time make short-read structural variant genotyping competitive with long-read discovery methods.
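
To make the idea of choosing haplotypes "according to k-mer counts in the reads" more concrete, here is a toy sketch. It is not the authors' sampling algorithm: the function names are hypothetical, reverse complements are ignored, and a haplotype is scored simply by the fraction of its k-mers that appear in the reads at least once.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every length-k substring across a list of read strings."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def kmer_support(haplotype, read_counts, k):
    """Fraction of the haplotype's k-mers that occur in the reads (toy score)."""
    kmers = [haplotype[i:i + k] for i in range(len(haplotype) - k + 1)]
    if not kmers:
        return 0.0
    supported = sum(1 for kmer in kmers if read_counts[kmer] > 0)
    return supported / len(kmers)
```

Under these assumptions, candidate local haplotypes with higher support would be the ones kept in a personalized subgraph, while haplotypes whose k-mers are absent from the reads would be dropped.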

8.
Genome Res ; 32(11-12): 2119-2133, 2022.
Article in English | MEDLINE | ID: mdl-36418060

ABSTRACT

The advent of long and accurate "HiFi" reads has greatly improved our ability to generate complete metagenome-assembled genomes (MAGs), enabling "complete metagenomics" studies that were nearly impossible to conduct with short reads. In particular, HiFi reads simplify the identification and phasing of mutations in MAGs: It is increasingly feasible to distinguish between positions that are prone to mutations and positions that rarely ever mutate, and to identify co-occurring groups of mutations. However, the problems of identifying rare mutations in MAGs, estimating the false-discovery rate (FDR) of these identifications, and phasing identified mutations remain open in the context of HiFi data. We present strainFlye, a pipeline for the FDR-controlled identification and analysis of rare mutations in MAGs assembled using HiFi reads. We show that deep HiFi sequencing has the potential to reveal and phase tens of thousands of rare mutations in a single MAG, identify hotspots and coldspots of these mutations, and detail MAGs' growth dynamics.


Subjects
Bacteria, Metagenome, Bacteria/genetics, Metagenomics, Mutation
9.
Nature ; 611(7936): 519-531, 2022 Nov.
Article in English | MEDLINE | ID: mdl-36261518

ABSTRACT

The current human reference genome, GRCh38, represents over 20 years of effort to generate a high-quality assembly, which has benefitted society [1,2]. However, it still has many gaps and errors, and does not represent a biological genome as it is a blend of multiple individuals [3,4]. Recently, a high-quality telomere-to-telomere reference, CHM13, was generated with the latest long-read technologies, but it was derived from a hydatidiform mole cell line with a nearly homozygous genome [5]. To address these limitations, the Human Pangenome Reference Consortium formed with the goal of creating high-quality, cost-effective, diploid genome assemblies for a pangenome reference that represents human genetic diversity [6]. Here, in our first scientific report, we determined which combination of current genome sequencing and assembly approaches yields the most complete and accurate diploid genome assembly with minimal manual curation. Approaches that used highly accurate long reads and parent-child data with graph-based haplotype phasing during assembly outperformed those that did not. Developing a combination of the top-performing methods, we generated our first high-quality diploid reference assembly, containing only approximately four gaps per chromosome on average, with most chromosomes within ±1% of the length of CHM13. Nearly 48% of protein-coding genes have non-synonymous amino acid changes between haplotypes, and centromeric regions showed the highest diversity. Our findings serve as a foundation for assembling near-complete diploid human genomes at scale for a pangenome reference to capture global genetic variation from single nucleotides to structural rearrangements.


Subjects
Chromosome Mapping, Diploidy, Human Genome, Genomics, Humans, Chromosome Mapping/standards, Human Genome/genetics, Haplotypes/genetics, High-Throughput Nucleotide Sequencing/methods, High-Throughput Nucleotide Sequencing/standards, DNA Sequence Analysis/methods, DNA Sequence Analysis/standards, Reference Standards, Genomics/methods, Genomics/standards, Human Chromosomes/genetics, Genetic Variation/genetics
10.
Nat Methods ; 19(4): 429-440, 2022 Apr.
Article in English | MEDLINE | ID: mdl-35396482

ABSTRACT

Evaluating metagenomic software is key for optimizing metagenome interpretation and is the focus of the Initiative for the Critical Assessment of Metagenome Interpretation (CAMI). The CAMI II challenge engaged the community to assess methods on realistic and complex datasets with long- and short-read sequences, created computationally from around 1,700 new and known genomes, as well as 600 new plasmids and viruses. Here we analyze 5,002 results by 76 program versions. Substantial improvements were seen in assembly, some due to long-read data. Related strains still were challenging for assembly and genome recovery through binning, as was assembly quality for the latter. Profilers markedly matured, with taxon profilers and binners excelling at higher bacterial ranks, but underperforming for viruses and Archaea. Clinical pathogen detection results revealed a need to improve reproducibility. Runtime and memory usage analyses identified efficient programs, including top performers on other metrics. The results identify challenges and guide researchers in selecting methods for analyses.


Subjects
Metagenome, Metagenomics, Archaea/genetics, Metagenomics/methods, Reproducibility of Results, DNA Sequence Analysis, Software
11.
Science ; 376(6588): 44-53, 2022 Apr.
Article in English | MEDLINE | ID: mdl-35357919

ABSTRACT

Since its initial release in 2000, the human reference genome has covered only the euchromatic fraction of the genome, leaving important heterochromatic regions unfinished. Addressing the remaining 8% of the genome, the Telomere-to-Telomere (T2T) Consortium presents a complete 3.055 billion-base pair sequence of a human genome, T2T-CHM13, that includes gapless assemblies for all chromosomes except Y, corrects errors in the prior references, and introduces nearly 200 million base pairs of sequence containing 1956 gene predictions, 99 of which are predicted to be protein coding. The completed regions include all centromeric satellite arrays, recent segmental duplications, and the short arms of all five acrocentric chromosomes, unlocking these complex regions of the genome to variational and functional studies.


Subjects
Human Genome, Human Genome Project, DNA Sequence Analysis/standards, Cell Line, Bacterial Artificial Chromosomes/genetics, Human Chromosomes/genetics, Humans, Reference Values
12.
Nat Biotechnol ; 40(7): 1075-1081, 2022 Jul.
Article in English | MEDLINE | ID: mdl-35228706

ABSTRACT

Although most existing genome assemblers are based on de Bruijn graphs, the construction of these graphs for large genomes and large k-mer sizes has remained elusive. This algorithmic challenge has become particularly pressing with the emergence of long, high-fidelity (HiFi) reads that have been recently used to generate a semi-manual telomere-to-telomere assembly of the human genome. To enable automated assemblies of long, HiFi reads, we present the La Jolla Assembler (LJA), a fast algorithm using the Bloom filter, sparse de Bruijn graphs and disjointig generation. LJA reduces the error rate in HiFi reads by three orders of magnitude, constructs the de Bruijn graph for large genomes and large k-mer sizes and transforms it into a multiplex de Bruijn graph with varying k-mer sizes. Compared to state-of-the-art assemblers, our algorithm not only achieves five-fold fewer misassemblies but also generates more contiguous assemblies. We demonstrate the utility of LJA via the automated assembly of a human genome that completely assembled six chromosomes.
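
For readers unfamiliar with de Bruijn graphs, the sketch below builds the textbook version: each k-mer in the reads becomes an edge from its (k-1)-base prefix to its (k-1)-base suffix. It is an illustration only. It keeps every k-mer in memory and uses none of the Bloom filter, sparse-graph, or disjointig techniques that LJA relies on to scale to large genomes and large k; the function name is hypothetical.

```python
from collections import defaultdict


def de_bruijn_edges(reads, k):
    """Map each (k-1)-mer node to the sorted list of its successor nodes."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Edge: prefix of the k-mer -> suffix of the k-mer.
            graph[kmer[:-1]].add(kmer[1:])
    return {node: sorted(successors) for node, successors in graph.items()}
```

The memory cost of storing every k-mer explicitly is the reason naive construction becomes impractical for human-sized genomes with large k, which is the algorithmic gap the abstract describes.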


Subjects
Algorithms, Human Genome, Human Genome/genetics, High-Throughput Nucleotide Sequencing, Humans, DNA Sequence Analysis, Software
13.
Genome Biol ; 23(1): 57, 2022 Feb 21.
Article in English | MEDLINE | ID: mdl-35189932

ABSTRACT

Although the use of long-read sequencing improves the contiguity of assembled viral genomes compared to short-read methods, assembling complex viral communities remains an open problem. We describe the viralFlye tool for identification and analysis of metagenome-assembled viruses in long-read assemblies. We show it significantly improves viral assemblies and demonstrate that long reads result in a much larger array of predicted virus-host associations as compared to short-read assemblies. We demonstrate that the identification of novel CRISPR arrays in bacterial genomes from a newly assembled metagenomic sample provides information for predicting novel hosts for novel viruses.


Subjects
Metagenomics, Viruses, Bacterial Genome, Metagenome, Metagenomics/methods, DNA Sequence Analysis/methods, Viruses/genetics
14.
Nat Biotechnol ; 40(5): 711-719, 2022 May.
Article in English | MEDLINE | ID: mdl-34980911

ABSTRACT

Microbial communities might include distinct lineages of closely related organisms that complicate metagenomic assembly and prevent the generation of complete metagenome-assembled genomes (MAGs). Here we show that deep sequencing using long (HiFi) reads combined with Hi-C binning can address this challenge even for complex microbial communities. Using existing methods, we sequenced the sheep fecal metagenome and identified 428 MAGs with more than 90% completeness, including 44 MAGs in single circular contigs. To resolve closely related strains (lineages), we developed MAGPhase, which separates lineages of related organisms by discriminating variant haplotypes across hundreds of kilobases of genomic sequence. MAGPhase identified 220 lineage-resolved MAGs in our dataset. The ability to resolve closely related microbes in complex microbial communities improves the identification of biosynthetic gene clusters and the precision of assigning mobile genetic elements to host genomes. We identified 1,400 complete and 350 partial biosynthetic gene clusters, most of which are novel, as well as 424 (298) potential host-viral (host-plasmid) associations using Hi-C data.


Subjects
Metagenome, Microbiota, Animals, Feces, Metagenome/genetics, Metagenomics, Microbiota/genetics, DNA Sequence Analysis, Sheep
15.
Nat Methods ; 18(11): 1322-1332, 2021 Nov.
Article in English | MEDLINE | ID: mdl-34725481

ABSTRACT

Long-read sequencing has the potential to transform variant detection by reaching currently difficult-to-map regions and routinely linking together adjacent variations to enable read-based phasing. Third-generation nanopore sequence data have demonstrated a long read length, but current interpretation methods for their novel pore-based signal have unique error profiles, making accurate analysis challenging. Here, we introduce a haplotype-aware variant calling pipeline, PEPPER-Margin-DeepVariant, that produces state-of-the-art variant calling results with nanopore data. We show that our nanopore-based method outperforms the short-read-based single-nucleotide-variant identification method at the whole-genome scale and produces high-quality single-nucleotide variants in segmental duplications and low-mappability regions where short-read-based genotyping fails. We show that our pipeline can provide highly contiguous phase blocks across the genome with nanopore reads, contiguously spanning between 85% and 92% of annotated genes across six samples. We also extend PEPPER-Margin-DeepVariant to PacBio HiFi data, providing an efficient solution with superior performance over the current WhatsHap-DeepVariant standard. Finally, we demonstrate de novo assembly polishing methods that use nanopore and PacBio HiFi reads to produce diploid assemblies with high accuracy (Q35+ nanopore-polished and Q40+ PacBio HiFi-polished).
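
The "Q35+" and "Q40+" figures above are most naturally read as Phred-scaled quality values, where Q = -10 * log10(error rate); that reading is an assumption here, since the abstract does not define the scale. Under that assumption, the conversion can be sketched as follows (function names are hypothetical):

```python
import math


def phred_to_error_rate(q: float) -> float:
    """Convert a Phred-scaled quality value to a per-base error probability."""
    return 10 ** (-q / 10)


def error_rate_to_phred(p: float) -> float:
    """Convert a per-base error probability (0 < p <= 1) to a Phred quality."""
    return -10 * math.log10(p)
```

On this scale Q40 corresponds to one error per 10,000 bases and Q35 to roughly one error per 3,200 bases.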


Subjects
Genes, Haplotypes, High-Throughput Nucleotide Sequencing/methods, Nanopores, Single Nucleotide Polymorphism, DNA Sequence Analysis/methods, Software, Human Genome, Humans, Molecular Sequence Annotation
16.
Nat Methods ; 17(11): 1103-1110, 2020 Nov.
Article in English | MEDLINE | ID: mdl-33020656

ABSTRACT

Long-read sequencing technologies have substantially improved the assemblies of many isolate bacterial genomes as compared to fragmented short-read assemblies. However, assembling complex metagenomic datasets remains difficult even for state-of-the-art long-read assemblers. Here we present metaFlye, which addresses important long-read metagenomic assembly challenges, such as uneven bacterial composition and intra-species heterogeneity. First, we benchmarked metaFlye using simulated and mock bacterial communities and show that it consistently produces assemblies with better completeness and contiguity than state-of-the-art long-read assemblers. Second, we performed long-read sequencing of the sheep microbiome and applied metaFlye to reconstruct 63 complete or nearly complete bacterial genomes within single contigs. Finally, we show that long-read assembly of human microbiomes enables the discovery of full-length biosynthetic gene clusters that encode biomedically important natural products.


Subjects
Bacterial Genome/genetics, Human Genome/genetics, Metagenome/genetics, Metagenomics/methods, Microbiota/genetics, Algorithms, Animals, Benchmarking, Gastrointestinal Microbiome/genetics, Humans, DNA Sequence Analysis/methods, Sheep, Software, Species Specificity
17.
Nat Biotechnol ; 37(5): 540-546, 2019 May.
Article in English | MEDLINE | ID: mdl-30936562

ABSTRACT

Accurate genome assembly is hampered by repetitive regions. Although long single molecule sequencing reads are better able to resolve genomic repeats than short-read data, most long-read assembly algorithms do not provide the repeat characterization necessary for producing optimal assemblies. Here, we present Flye, a long-read assembly algorithm that generates arbitrary paths in an unknown repeat graph, called disjointigs, and constructs an accurate repeat graph from these error-riddled disjointigs. We benchmark Flye against five state-of-the-art assemblers and show that it generates better or comparable assemblies, while being an order of magnitude faster. Flye nearly doubled the contiguity of the human genome assembly (as measured by the NGA50 assembly quality metric) compared with existing assemblers.


Subjects
Bacterial Genome/genetics, Human Genome/genetics, Genomics/methods, Repetitive Nucleic Acid Sequences/genetics, Algorithms, High-Throughput Nucleotide Sequencing, Humans, Molecular Sequence Annotation, DNA Sequence Analysis, Software
18.
Bioinformatics ; 35(18): 3476-3478, 2019 Sep 15.
Article in English | MEDLINE | ID: mdl-30715194

ABSTRACT

SUMMARY: Currently, most genome assembly projects focus on contigs and scaffolds rather than assembly graphs that provide a more comprehensive representation of an assembly. Since interactive visualization of large assembly graphs remains an open problem, we developed an Assembly Graph Browser (AGB) tool that visualizes large assembly graphs, extending the functionality of previously developed visualization approaches. Assembly Graph Browser includes a number of novel functions including repeat analysis, construction of the contracted assembly graphs (i.e. the graphs obtained by collapsing a selected set of edges) and a new approach to visualizing large assembly graphs. AVAILABILITY AND IMPLEMENTATION: http://www.github.com/almiheenko/AGB. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
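
The contracted assembly graph described above is obtained by collapsing a selected set of edges, that is, merging the two endpoints of each selected edge into one node. A minimal sketch using a union-find structure follows; it is an illustration under assumed inputs (edges as node-name pairs) and is not the AGB implementation.

```python
def contract_edges(edges, to_collapse):
    """Collapse the edges in `to_collapse` and return the remaining edges,
    rewritten to point at the merged nodes (hypothetical helper)."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for u, v in edges:
        find(u)
        find(v)
    for u, v in to_collapse:
        parent[find(u)] = find(v)  # merge the endpoints of a collapsed edge

    collapsed = set(to_collapse)
    return [(find(u), find(v)) for u, v in edges if (u, v) not in collapsed]
```

For example, collapsing the edge b -> c in the path a -> b -> c -> d leaves a -> c -> d, with the merged node labelled c. Edges that become self-loops after merging are kept, since in an assembly graph they can carry repeat information.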


Subjects
Software
19.
Nat Biotechnol ; 2018 Oct 15.
Article in English | MEDLINE | ID: mdl-30320765

ABSTRACT

Although shotgun metagenomic sequencing of microbiome samples enables partial reconstruction of strain-level community structure, obtaining high-quality microbial genome drafts without isolation and culture remains difficult. Here, we present an application of read clouds, short-read sequences tagged with long-range information, to microbiome samples. We present Athena, a de novo assembler that uses read clouds to improve metagenomic assemblies. We applied this approach to sequence stool samples from two healthy individuals and compared it with existing short-read and synthetic long-read metagenomic sequencing techniques. Read-cloud metagenomic sequencing and Athena assembly produced the most comprehensive individual genome drafts with high contiguity (>200-kb N50, fewer than ten contigs), even for bacteria with relatively low (20×) raw short-read-sequence coverage. We also sequenced a complex marine-sediment sample and generated 24 intermediate-quality genome drafts (>70% complete, <10% contaminated), nine of which were complete (>90% complete, <5% contaminated). Our approach allows for culture-free generation of high-quality microbial genome drafts by using a single shotgun experiment.
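
The quality tiers quoted above can be written as a small decision rule. The thresholds are the ones stated in the abstract; the function name, the tier labels, and the percentage inputs are assumptions for illustration, and how completeness and contamination are estimated is outside this sketch.

```python
def draft_quality(completeness: float, contamination: float) -> str:
    """Classify a genome draft from completeness and contamination percentages."""
    if completeness > 90 and contamination < 5:
        return "complete"
    if completeness > 70 and contamination < 10:
        return "intermediate"
    return "below threshold"
```

Note that in the abstract the nine complete drafts are counted among the 24 intermediate-quality drafts, whereas this function reports only the highest tier a draft reaches.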

20.
Nat Genet ; 50(11): 1574-1583, 2018 Nov.
Article in English | MEDLINE | ID: mdl-30275530

ABSTRACT

We report full-length draft de novo genome assemblies for 16 widely used inbred mouse strains and find extensive strain-specific haplotype variation. We identify and characterize 2,567 regions on the current mouse reference genome exhibiting the greatest sequence diversity. These regions are enriched for genes involved in pathogen defence and immunity and exhibit enrichment of transposable elements and signatures of recent retrotransposition events. Combinations of alleles and genes unique to an individual strain are commonly observed at these loci, reflecting distinct strain phenotypes. We used these genomes to improve the mouse reference genome, resulting in the completion of 10 new gene structures. Also, 62 new coding loci were added to the reference genome annotation. These genomes identified a large, previously unannotated, gene (Efcab3-like) encoding 5,874 amino acids. Mutant Efcab3-like mice display anomalies in multiple brain regions, suggesting a possible role for this gene in the regulation of brain development.


Subjects
Chromosome Mapping, Genetic Loci, Genome, Haplotypes, Inbred Mice/genetics, Animals, Laboratory Animals, Chromosome Mapping/veterinary, Haplotypes/genetics, Mice, Inbred BALB C Mice/genetics, Inbred C3H Mice/genetics, Inbred C57BL Mice/genetics, Inbred CBA Mice/genetics, Inbred DBA Mice/genetics, Inbred NOD Mice/genetics, Inbred Mice/classification, Molecular Sequence Annotation, Phylogeny, Single Nucleotide Polymorphism, Species Specificity