Results 1 - 8 of 8
1.
BMC Biol ; 22(1): 13, 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38273258

ABSTRACT

BACKGROUND: Single-nucleotide polymorphisms (SNPs) are the form of molecular genetic variation most widely used in studies. As reference genomes and resequencing data sets expand exponentially, tools must be in place to call SNPs at a similar pace. The genome analysis toolkit (GATK) is one of the most widely used SNP calling software tools publicly available, but unfortunately, high-performance computing versions of this tool have yet to become widely available and affordable. RESULTS: Here we report an open-source high-performance computing genome variant calling workflow (HPC-GVCW) for GATK that can run on multiple computing platforms from supercomputers to desktop machines. We benchmarked HPC-GVCW on multiple crop species for performance and accuracy and obtained results comparable to previously published reports (using GATK alone). Finally, we used HPC-GVCW in production mode to call SNPs on a "subpopulation aware" 16-genome rice reference panel with ~3000 resequenced rice accessions. The entire process took ~16 weeks and resulted in the identification of an average of 27.3 M SNPs/genome and the discovery of ~2.3 million novel SNPs that were not present in the flagship reference genome for rice (i.e., IRGSP RefSeq). CONCLUSIONS: This study developed an open-source pipeline (HPC-GVCW) to run GATK on HPC platforms, which significantly improved the speed at which SNPs can be called. The workflow is widely applicable, as demonstrated successfully for four major crop species with genomes ranging in size from 400 Mb to 2.4 Gb. Using HPC-GVCW in production mode to call SNPs on a 25 multi-crop-reference genome data set produced over 1.1 billion SNPs that were publicly released for functional and breeding studies. For rice, many novel SNPs were identified and were found to reside within genes and open chromatin regions that are predicted to have functional consequences. Combined, our results demonstrate the usefulness of combining a high-performance SNP calling architecture solution with a subpopulation-aware reference genome panel for rapid SNP discovery and public deployment.


Subjects
Genome, Plant ; Polymorphism, Single Nucleotide ; Workflow ; Plant Breeding ; Software ; High-Throughput Nucleotide Sequencing/methods
2.
Database (Oxford) ; 2023, 2023 Nov 16.
Article in English | MEDLINE | ID: mdl-37971714

ABSTRACT

Diploid A-genome wheat (einkorn wheat) presents a nutrition-rich option as an ancient grain crop and a resource for the improvement of bread wheat against abiotic and biotic stresses. Recognizing the importance of this wheat species, reference-level assemblies of two einkorn wheat accessions (wild and domesticated) were generated. This work reports an einkorn genome database that provides an interface for the cereals research community to perform comparative genomics, applied genetics and breeding research. It features queries for annotated genes, the use of a recent genome browser release, and the ability to search for sequence alignments using a modern BLAST interface. Other features include a comparison of reference einkorn assemblies with other wheat cultivars through genomic synteny visualization and an alignment visualization tool for BLAST results. Altogether, this resource will help wheat research and breeding. Database URL: https://wheat.pw.usda.gov/GG3/pangenome.


Subjects
Genome, Plant ; Triticum ; Triticum/genetics ; Genome, Plant/genetics ; Plant Breeding ; Genomics/methods ; Genetic Association Studies
3.
Nature ; 620(7975): 830-838, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37532937

ABSTRACT

Einkorn (Triticum monococcum) was the first domesticated wheat species, and was central to the birth of agriculture and the Neolithic Revolution in the Fertile Crescent around 10,000 years ago [1,2]. Here we generate and analyse 5.2-Gb genome assemblies for wild and domesticated einkorn, including completely assembled centromeres. Einkorn centromeres are highly dynamic, showing evidence of ancient and recent centromere shifts caused by structural rearrangements. Whole-genome sequencing analysis of a diversity panel uncovered the population structure and evolutionary history of einkorn, revealing complex patterns of hybridizations and introgressions after the dispersal of domesticated einkorn from the Fertile Crescent. We also show that around 1% of the modern bread wheat (Triticum aestivum) A subgenome originates from einkorn. These resources and findings highlight the history of einkorn evolution and provide a basis to accelerate the genomics-assisted improvement of einkorn and bread wheat.


Subjects
Crop Production ; Genome, Plant ; Genomics ; Triticum ; Triticum/classification ; Triticum/genetics ; Crop Production/history ; History, Ancient ; Whole Genome Sequencing ; Genetic Introgression ; Hybridization, Genetic ; Bread/history ; Genome, Plant/genetics ; Centromere/genetics
4.
Nat Commun ; 14(1): 1567, 2023 Mar 21.
Article in English | MEDLINE | ID: mdl-36944612

ABSTRACT

Understanding and exploiting genetic diversity is a key factor for the productive and stable production of rice. Here, we utilize 73 high-quality genomes that encompass the subpopulation structure of Asian rice (Oryza sativa), plus the genomes of two wild relatives (O. rufipogon and O. punctata), to build a pan-genome inversion index of 1769 non-redundant inversions that span an average of ~29% of the O. sativa cv. Nipponbare reference genome sequence. Using this index, we estimate an inversion rate of ~700 inversions per million years in Asian rice, which is 16 to 50 times higher than previously estimated for plants. Detailed analyses of these inversions show evidence of their effects on gene expression, recombination rate, and linkage disequilibrium. Our study uncovers the prevalence and scale of large inversions (≥100 bp) across the pan-genome of Asian rice and hints at their largely unexplored role in functional biology and crop performance.
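
The fold comparison in this abstract can be unpacked with a line of arithmetic. The sketch below assumes the "16 to 50 times higher" statement is taken at face value; the abstract does not state the earlier estimates themselves, so the implied range is a derived illustration, and the function name is hypothetical.

```python
# Figures quoted in the abstract.
RATE_PER_MYR = 700.0              # ~700 inversions per million years
FOLD_LOW, FOLD_HIGH = 16.0, 50.0  # "16 to 50 times higher"


def implied_earlier_estimates(rate, fold_low, fold_high):
    """Return the (lowest, highest) earlier rate implied by a fold range.

    Dividing by the larger fold gives the smaller implied earlier rate.
    """
    return rate / fold_high, rate / fold_low


low, high = implied_earlier_estimates(RATE_PER_MYR, FOLD_LOW, FOLD_HIGH)
print(f"Implied earlier estimates: {low:g} to {high:g} inversions per million years")
```

Under that reading, the earlier plant estimates would sit somewhere between roughly 14 and 44 inversions per million years.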


Subjects
Oryza ; Oryza/genetics ; Sequence Analysis, DNA ; Genome, Plant/genetics ; Biological Evolution ; Phylogeny
5.
Bioinformatics ; 38(6): 1677-1684, 2022 Mar 04.
Article in English | MEDLINE | ID: mdl-34951628

ABSTRACT

MOTIVATION: Structural genomic variants account for much of human variability and are involved in several diseases. Structural variants are complex and may affect coding regions of multiple genes, or affect the functions of genomic regions in different ways from single nucleotide variants. Interpreting the phenotypic consequences of structural variants relies on information about gene functions, haploinsufficiency or triplosensitivity and other genomic features. Phenotype-based methods for identifying variants that are involved in genetic diseases combine molecular features with prior knowledge about the phenotypic consequences of altering gene functions. While phenotype-based methods have been applied successfully to single nucleotide variants as well as short insertions and deletions, the complexity of structural variants makes it more challenging to link them to phenotypes. Furthermore, structural variants can affect a large number of coding regions, and phenotype information may not be available for all of them. RESULTS: We developed DeepSVP, a computational method to prioritize structural variants involved in genetic diseases by combining genomic and gene function information. We incorporate phenotypes linked to genes, functions of gene products, gene expression in individual cell types and anatomical sites of expression, and systematically relate them to their phenotypic consequences through ontologies and machine learning. DeepSVP significantly improves the success rate of finding causative variants in several benchmarks and can identify novel pathogenic structural variants in consanguineous families. AVAILABILITY AND IMPLEMENTATION: https://github.com/bio-ontology-research-group/DeepSVP. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Deep Learning ; Humans ; Genotype ; Phenotype ; Genomics ; Nucleotides
6.
Nat Commun ; 11(1): 4488, 2020 Sep 08.
Article in English | MEDLINE | ID: mdl-32901040

ABSTRACT

Sustainable food production in the context of climate change necessitates diversification of agriculture and a more efficient utilization of plant genetic resources. Fonio millet (Digitaria exilis) is an orphan African cereal crop with great potential for dryland agriculture. Here, we establish high-quality genomic resources to facilitate fonio improvement through molecular breeding. These include a chromosome-scale reference assembly and deep re-sequencing of 183 cultivated and wild Digitaria accessions, enabling insights into genetic diversity, population structure, and domestication. Fonio diversity is shaped by climatic, geographic, and ethnolinguistic factors. Two genes associated with seed size and shattering showed signatures of selection. Most known domestication genes from other cereal models, however, have not experienced strong selection in fonio, providing direct targets to rapidly improve this crop for agriculture in hot and dry environments.


Subjects
Digitaria/genetics ; Edible Grain/genetics ; Africa ; Agriculture/methods ; Climate Change ; Digitaria/classification ; Domestication ; Edible Grain/classification ; Evolution, Molecular ; Genetic Variation ; Genome, Plant ; Molecular Sequence Annotation ; Selection, Genetic ; Species Specificity
7.
Sci Rep ; 7(1): 9058, 2017 Aug 22.
Article in English | MEDLINE | ID: mdl-28831090

ABSTRACT

Next generation sequencing (NGS) data analysis is highly compute intensive. In-memory computing, vectorization, bulk data transfer, and CPU frequency scaling are some of the hardware features of modern computing architectures. To get the best execution time and utilize these hardware features, it is necessary to tune the system-level parameters before running the application. We studied the GATK-HaplotypeCaller, which is part of common NGS workflows and consumes more than 43% of the total execution time. Multiple GATK 3.x versions were benchmarked, and the execution time of HaplotypeCaller was optimized through various system-level parameters: (i) tuning the parallel garbage collection and kernel shared memory to simulate in-memory computing; (ii) architecture-specific tuning in the PairHMM library for vectorization; (iii) including Java 1.8 features through GATK source code compilation and building a runtime environment for parallel sorting and bulk data transfer; and (iv) switching the CPU frequency setting from the default 'on-demand' mode to 'performance-mode' to accelerate the Java multi-threads. As a result, the HaplotypeCaller execution time was reduced by 82.66% in GATK 3.3 and 42.61% in GATK 3.7. Overall, the execution time of the NGS pipeline was reduced to 70.60% and 34.14% for GATK 3.3 and GATK 3.7 respectively.
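
To put the reported HaplotypeCaller reductions in more familiar terms, a percentage reduction in run time can be converted to a speedup factor. This is a minimal sketch using only the two "reduced by" figures quoted in the abstract; the function names are illustrative and are not part of GATK or of the study's code.

```python
def remaining_fraction(percent_reduction: float) -> float:
    """Fraction of the baseline run time left after a percentage reduction."""
    return 1.0 - percent_reduction / 100.0


def speedup(percent_reduction: float) -> float:
    """Speedup factor: baseline time divided by optimized time."""
    return 1.0 / remaining_fraction(percent_reduction)


# HaplotypeCaller execution-time reductions quoted in the abstract.
reported = {"GATK 3.3": 82.66, "GATK 3.7": 42.61}

for version, pct in reported.items():
    print(
        f"{version}: {remaining_fraction(pct):.4f} of baseline time, "
        f"{speedup(pct):.2f}x speedup"
    )
```

Read this way, the GATK 3.3 figure corresponds to a speedup of nearly sixfold and the GATK 3.7 figure to a little under twofold.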


Subjects
Computational Biology/methods ; Computational Biology/standards ; High-Throughput Nucleotide Sequencing ; Sequence Analysis, DNA ; Chromosome Mapping ; Databases, Genetic ; Genomics/methods ; Genomics/standards ; Genotyping Techniques ; High-Throughput Nucleotide Sequencing/standards ; Reproducibility of Results ; Sequence Analysis, DNA/standards ; Software ; Workflow
8.
J Neurol ; 263(11): 2308-2318, 2016 Nov.
Article in English | MEDLINE | ID: mdl-27544505

ABSTRACT

Parkinson's disease (PD) is a progressive neurological disorder and appears to have gender-specific symptoms. Studies have observed a higher frequency of PD development in males than in females. In the current study, we evaluated the gender-based changes in cortical thickness and structural connectivity in PD patients. With informed consent, 64 PD patients (43 males and 21 females) and 46 age-matched controls (12 males and 34 females) underwent clinical assessment including Mini-Mental State Examination (MMSE) and magnetic resonance imaging on a 1.5 Tesla clinical MR scanner. Whole brain high-resolution T1-weighted images were acquired from all subjects and used to measure cortical thickness and structural network connectivity. No significant difference in MMSE score was observed between males and females in either control or PD subjects. Male PD patients showed significantly reduced cortical thickness in multiple brain regions, including frontal, parietal, temporal, and occipital lobes, as compared with female PD patients. The graph theory-based network analysis depicted lower connection strengths, lower clustering coefficients, and altered network hubs in male than in female PD patients. Male-specific cortical thickness changes and altered connectivity in PD patients may derive from behavioral, physiological, environmental, and genetic differences between males and females, and may have significant implications in diagnosing and treating PD across genders.


Subjects
Cerebral Cortex/pathology ; Neural Pathways/pathology ; Parkinson Disease/pathology ; Sex Characteristics ; Aged ; Aged, 80 and over ; Analysis of Variance ; Case-Control Studies ; Cerebral Cortex/diagnostic imaging ; Female ; Humans ; Image Processing, Computer-Assisted ; Magnetic Resonance Imaging ; Male ; Mental Status Schedule ; Middle Aged ; Neural Pathways/diagnostic imaging