Results 1 - 20 of 665
1.
Hum Genet ; 141(2): 273-281, 2022 Feb.
Article in English | MEDLINE | ID: mdl-35048190

ABSTRACT

Recombination is a major force that shapes genetic diversity. Determination of recombination rate is important and can theoretically be improved by increasing the sample size. However, it is nearly impossible to estimate recombination rates using traditional population genetics methods when the sample size is large because these methods are highly computationally demanding. In this study, we used a refined machine learning approach to estimate the recombination rate of the human genome using the UK10K human genomic dataset with 7,562 genomic sequences and its three subsets with 200, 400 and 2,000 genomic sequences. The estimation was performed under the human Out-of-Africa demographic model. We not only obtained an accurate human genetic map, but also found that the fluctuation of estimated recombination rate is reduced along the human genome when the sample size increases. The estimated UK10K recombination rate heterogeneity is less than that estimated from its subsets. Our results demonstrate how the sample size affects the estimated recombination rate, and analyses of a larger number of genomes result in a more precise estimation of recombination rate. The accurate genetic map based on the UK10K data set is also expected to benefit other human biology research.


Subjects
Chromosome Mapping/methods; Genome, Human; Chromosome Mapping/statistics & numerical data; Databases, Genetic/statistics & numerical data; Genetics, Population; Humans; Machine Learning; Models, Genetic; Recombination, Genetic; Sample Size; Software; United Kingdom
2.
Int J Obes (Lond) ; 46(2): 307-315, 2022 02.
Article in English | MEDLINE | ID: mdl-34689180

ABSTRACT

BACKGROUND: The Berlin Fat Mouse Inbred line (BFMI) is a model for obesity and the metabolic syndrome. This study aimed to identify genetic variants associated with impaired glucose metabolism using the obese lines BFMI861-S1 and BFMI861-S2, which are genetically closely related, but differ in several traits. BFMI861-S1 is insulin resistant and stores ectopic fat in the liver, whereas BFMI861-S2 is insulin sensitive. METHODS: In generation 10, 397 males of an advanced intercross line (AIL) BFMI861-S1 × BFMI861-S2 were challenged with a high-fat, high-carbohydrate diet and phenotyped over 25 weeks. QTL analysis was performed after selective genotyping of 200 mice using the GigaMUGA Genotyping Array. An additional 197 males were genotyped for 7 top SNPs in QTL regions. For the prioritization of positional candidate genes, whole genome sequencing and gene expression data of the parental lines were used. RESULTS: Overlapping QTL for gonadal adipose tissue weight and blood glucose concentration were detected on chromosome (Chr) 3 (95.8-100.1 Mb), and for gonadal adipose tissue weight, liver weight, and blood glucose concentration on Chr 17 (9.5-26.1 Mb). For the Chr 3 QTL, causal modeling suggested direct effects on adipose tissue weight, but indirect effects on blood glucose concentration. For the Chr 17 QTL, direct effects on adipose tissue weight, liver weight, and blood glucose concentration were suggested. Prioritized positional candidate genes for the identified QTL were Notch2 and Fmo5 (Chr 3) and Plg and Acat2 (Chr 17). Two additional QTL were detected for gonadal adipose tissue weight on Chr 15 (67.9-74.6 Mb) and for body weight on Chr 16 (3.9-21.4 Mb). CONCLUSIONS: QTL mapping together with a detailed prioritization approach allowed us to identify candidate genes associated with traits of the metabolic syndrome. In addition, we provided evidence for direct and indirect genetic effects on blood glucose concentration in the insulin-resistant mouse line BFMI861-S1.


Subjects
Obesity/diet therapy; Quantitative Trait Loci/genetics; Animals; Carbohydrates/adverse effects; Chromosome Mapping/methods; Chromosome Mapping/statistics & numerical data; Diet, High-Fat/adverse effects; Diet, High-Fat/statistics & numerical data; Disease Models, Animal; Mice; Obesity/metabolism; Obesity/physiopathology; Quantitative Trait Loci/physiology
3.
Pak J Biol Sci ; 24(9): 997-1014, 2021 Jan.
Article in English | MEDLINE | ID: mdl-34585553

ABSTRACT

Background and Objective: Barley is considered one of the most important cereal crops at the local and global levels. It ranks second after wheat in nutritional importance, and its flour contributes significantly to bridging the large nutritional gap in the production of Egyptian bread. This study aimed to characterize the genetic behaviour responsible for salinity stress tolerance in barley, in an effort to improve the crop and increase its resistance to abiotic stress under Egyptian conditions. Materials and Methods: Twenty-one crosses and ten parents of barley with different responses to salinity were evaluated under normal and salinity conditions. Yield and its components, together with some physiological traits related to salt stress tolerance, were the main attributes evaluated under both conditions. Moreover, SSR markers were used to identify markers associated with salinity tolerance in selected hybrids and to compare the ten barley parents. Results: The three testers Giza 123, Giza 126 and Giza 2000, as well as the crosses Line 1 × Tester 1 (Giza 125 × Giza 123), Line 2 × Tester 1 (Giza 133 × Giza 123), Line 1 × Tester 2 (Giza 125 × Giza 126), Line 2 × Tester 2 (Giza 133 × Giza 126) and Line 1 × Tester 3 (Giza 125 × Giza 2000), exhibited high salinity tolerance under saline stress treatment compared with the control experiment. Among 15 analyzed barley entries, the chosen set of 11 markers amplified 20 alleles, an average of 1.81 alleles per marker, with a range of 1-4 alleles. Conclusion: The results of SSR analysis and the data on valued agricultural trait loci determined the genetic distance among parents and their hybrids, which is of considerable value to breeders.


Subjects
Hordeum/microbiology; Salt Stress; Chimera/microbiology; Chimera/physiology; Chromosome Mapping/methods; Chromosome Mapping/statistics & numerical data; Egypt; Hordeum/physiology
4.
PLoS Comput Biol ; 17(9): e1009373, 2021 09.
Article in English | MEDLINE | ID: mdl-34534210

ABSTRACT

Despite the growing constellation of genetic loci linked to common traits, these loci have yet to account for most heritable variation, and most act through poorly understood mechanisms. Recent machine learning (ML) systems have used hierarchical biological knowledge to associate genetic mutations with phenotypic outcomes, yielding substantial predictive power and mechanistic insight. Here, we use an ontology-guided ML system to map single nucleotide variants (SNVs) focusing on 6 classic phenotypic traits in natural yeast populations. The 29 identified loci are largely novel and account for ~17% of the phenotypic variance, versus <3% for standard genetic analysis. Representative results show that sensitivity to hydroxyurea is linked to SNVs in two alternative purine biosynthesis pathways, and that sensitivity to copper arises through failure to detoxify reactive oxygen species in fatty acid metabolism. This work demonstrates a knowledge-based approach to amplifying and interpreting signals in population genetic studies.


Subjects
Machine Learning; Models, Genetic; Multifactorial Inheritance; Benomyl/toxicity; Chromosome Mapping/methods; Chromosome Mapping/statistics & numerical data; Computational Biology; Copper/toxicity; Gene Ontology; Genome-Wide Association Study; Glucose/metabolism; Glycine/metabolism; Hydroxyurea/pharmacology; Knowledge Bases; Metabolic Networks and Pathways/drug effects; Metabolic Networks and Pathways/genetics; Mutation; Neural Networks, Computer; Nucleotidyltransferases/metabolism; Phenotype; Polymorphism, Single Nucleotide; Saccharomyces cerevisiae/drug effects; Saccharomyces cerevisiae/genetics; Saccharomyces cerevisiae/metabolism; Systems Biology
5.
Nat Genet ; 53(8): 1260-1269, 2021 08.
Article in English | MEDLINE | ID: mdl-34226706

ABSTRACT

Exome association studies to date have generally been underpowered to systematically evaluate the phenotypic impact of very rare coding variants. We leveraged extensive haplotype sharing between 49,960 exome-sequenced UK Biobank participants and the remainder of the cohort (total n ≈ 500,000) to impute exome-wide variants with accuracy R² > 0.5 down to minor allele frequency (MAF) ~0.00005. Association and fine-mapping analyses of 54 quantitative traits identified 1,189 significant associations (P < 5 × 10⁻⁸) involving 675 distinct rare protein-altering variants (MAF < 0.01) that passed stringent filters for likely causality. Across all traits, 49% of associations (578/1,189) occurred in genes with two or more hits; follow-up analyses of these genes identified allelic series containing up to 45 distinct 'likely-causal' variants. Our results demonstrate the utility of within-cohort imputation in population-scale genome-wide association studies, provide a catalog of likely-causal, large-effect coding variant associations and foreshadow the insights that will be revealed as genetic biobank studies continue to grow.


Subjects
Biological Specimen Banks; Exome Sequencing/statistics & numerical data; Gene Frequency; Proteins/genetics; Blood Pressure/genetics; Chromosome Mapping/methods; Chromosome Mapping/statistics & numerical data; Genetic Markers; Genome-Wide Association Study/statistics & numerical data; Humans; Linkage Disequilibrium; Membrane Proteins/genetics; Models, Genetic; Phenotype; Polymorphism, Single Nucleotide; Proteins/metabolism; Receptors, Atrial Natriuretic Factor/genetics; United Kingdom; Exome Sequencing/methods
6.
Genome Biol ; 22(1): 188, 2021 06 24.
Article in English | MEDLINE | ID: mdl-34167583

ABSTRACT

BACKGROUND: Single-cell RNA sequencing (scRNA-seq) has enabled the unbiased, high-throughput quantification of gene expression specific to cell types and states. With the cost of scRNA-seq decreasing and techniques for sample multiplexing improving, population-scale scRNA-seq, and thus single-cell expression quantitative trait locus (sc-eQTL) mapping, is increasingly feasible. Mapping of sc-eQTL provides additional resolution to study the regulatory role of common genetic variants on gene expression across a plethora of cell types and states and promises to improve our understanding of genetic regulation across tissues in both health and disease. RESULTS: While previously established methods for bulk eQTL mapping can, in principle, be applied to sc-eQTL mapping, there are a number of open questions about how best to process scRNA-seq data and adapt bulk methods to optimize sc-eQTL mapping. Here, we evaluate the role of different normalization and aggregation strategies, covariate adjustment techniques, and multiple testing correction methods to establish best practice guidelines. We use both real and simulated datasets across single-cell technologies to systematically assess the impact of these different statistical approaches. CONCLUSION: We provide recommendations for future single-cell eQTL studies that can yield up to twice as many eQTL discoveries as default approaches ported from bulk studies.


Subjects
Chromosome Mapping/statistics & numerical data; Genome, Human; Induced Pluripotent Stem Cells/metabolism; Quantitative Trait Loci; Single-Cell Analysis/methods; Alleles; Cell Line; Gene Expression Profiling; Gene Expression Regulation; Humans; Induced Pluripotent Stem Cells/cytology; Sequence Analysis, RNA; Software; Exome Sequencing
7.
PLoS Comput Biol ; 17(4): e1008926, 2021 04.
Article in English | MEDLINE | ID: mdl-33872311

ABSTRACT

Next-generation sequencing (NGS) has transformed molecular biology and contributed to many seminal insights into genomic regulation and function. Apart from whole-genome sequencing, an NGS workflow involves alignment of the sequencing reads to the genome of study, after which the resulting alignments can be used for downstream analyses. However, alignment is complicated by repetitive sequences; many reads align to more than one genomic locus, with 15-30% of the genome not being uniquely mappable by short-read NGS. This problem is typically addressed by discarding reads that do not uniquely map to the genome, but this practice can lead to systematic distortion of the data. Previous studies that developed methods for handling ambiguously mapped reads were often of limited applicability or were computationally intensive, hindering their broader usage. In this work, we present SmartMap: an algorithm that augments industry-standard aligners to enable usage of ambiguously mapped reads by assigning weights to each alignment with Bayesian analysis of the read distribution and alignment quality. SmartMap is computationally efficient, utilizing far fewer weighting iterations than previously thought necessary to process alignments and, as such, analyzing more than a billion alignments of NGS reads in approximately one hour on a desktop PC. By applying SmartMap to peak-type NGS data, including MNase-seq, ChIP-seq, and ATAC-seq in three organisms, we can increase read depth by up to 53% and increase the mapped proportion of the genome by up to 18% compared to analyses utilizing only uniquely mapped reads. We further show that SmartMap enables the analysis of more than 140,000 repetitive elements that could not be analyzed by traditional ChIP-seq workflows, and we utilize this method to gain insight into the epigenetic regulation of different classes of repetitive elements. These data emphasize both the dangers of discarding ambiguously mapped reads and their power for driving biological discovery.


Subjects
Bayes Theorem; Chromosome Mapping/statistics & numerical data; High-Throughput Nucleotide Sequencing/statistics & numerical data; Chromatin Immunoprecipitation; DNA/genetics; Datasets as Topic; Genome, Human; Humans; Repetitive Sequences, Nucleic Acid; Reproducibility of Results; Sequence Alignment
8.
Nucleic Acids Res ; 48(21): 12074-12084, 2020 12 02.
Article in English | MEDLINE | ID: mdl-33219687

ABSTRACT

CRISPR-Cas systems require discriminating self from non-self DNA during adaptation and interference. Yet, multiple cases have been reported of bacteria containing self-targeting spacers (STS), i.e. CRISPR spacers targeting protospacers on the same genome. STS has been suggested to reflect potential auto-immunity as an unwanted side effect of CRISPR-Cas defense, or a regulatory mechanism for gene expression. Here we investigated the incidence, distribution, and evasion of STS in over 100 000 bacterial genomes. We found STS in all CRISPR-Cas types and in one fifth of all CRISPR-carrying bacteria. Notably, up to 40% of I-B and I-F CRISPR-Cas systems contained STS. We observed that STS-containing genomes almost always carry a prophage and that STS map to prophage regions in more than half of the cases. Despite carrying STS, genetic deterioration of CRISPR-Cas systems appears to be rare, suggesting a level of escape from the potentially deleterious effects of STS by other mechanisms such as anti-CRISPR proteins and CRISPR target mutations. We propose a scenario where it is common to acquire an STS against a prophage, and this may trigger more extensive STS buildup by primed spacer acquisition in type I systems, without detrimental autoimmunity effects as mechanisms of auto-immunity evasion create tolerance to STS-targeted prophages.


Subjects
Bacteria/genetics; CRISPR-Associated Proteins/genetics; CRISPR-Cas Systems/immunology; Clustered Regularly Interspaced Short Palindromic Repeats/immunology; Genome, Bacterial; Prophages/genetics; Autoimmunity/genetics; Bacteria/immunology; Bacteria/virology; Base Sequence; CRISPR-Associated Protein 9/genetics; CRISPR-Associated Protein 9/immunology; CRISPR-Associated Proteins/immunology; Chromosome Mapping/statistics & numerical data; Software
9.
Nucleic Acids Res ; 48(21): e123, 2020 12 02.
Article in English | MEDLINE | ID: mdl-33074315

ABSTRACT

The recently developed Hi-C technique has been widely applied to map genome-wide chromatin interactions. However, current methods for analyzing diploid Hi-C data cannot fully distinguish between homologous chromosomes. Consequently, the existing diploid Hi-C analyses are based on sparse and inaccurate allele-specific contact matrices, which might lead to incorrect modeling of diploid genome architecture. Here we present ASHIC, a hierarchical Bayesian framework to model allele-specific chromatin organizations in diploid genomes. We developed two models under the Bayesian framework: the Poisson-multinomial (ASHIC-PM) model and the zero-inflated Poisson-multinomial (ASHIC-ZIPM) model. The proposed ASHIC methods impute allele-specific contact maps from diploid Hi-C data and simultaneously infer allelic 3D structures. Through simulation studies, we demonstrated that ASHIC methods outperformed existing approaches, especially under low coverage and low SNP density conditions. Additionally, in the analyses of diploid Hi-C datasets in mouse and human, our ASHIC-ZIPM method produced fine-resolution diploid chromatin maps and 3D structures and provided insights into the allelic chromatin organizations and functions. To summarize, our work provides a statistically rigorous framework for investigating fine-scale allele-specific chromatin conformations. The ASHIC software is publicly available at https://github.com/wmalab/ASHIC.


Subjects
Chromatin Assembly and Disassembly; Chromatin/ultrastructure; Chromosome Mapping/statistics & numerical data; Software; Alleles; Animals; Bayes Theorem; Chromatin/metabolism; Chromosome Mapping/methods; Computer Simulation; Diploidy; Fibroblasts/metabolism; Fibroblasts/ultrastructure; Genomic Imprinting; Histones/genetics; Histones/metabolism; Humans; Insulin-Like Growth Factor II/genetics; Insulin-Like Growth Factor II/metabolism; Internet; Mice; Polymorphism, Single Nucleotide
10.
Am J Hum Genet ; 107(5): 895-910, 2020 11 05.
Article in English | MEDLINE | ID: mdl-33053335

ABSTRACT

Most methods for fast detection of identity by descent (IBD) segments report identity by state segments without any quantification of the uncertainty in the endpoints and lengths of the IBD segments. We present a method for determining the posterior probability distribution of IBD segment endpoints. Our approach accounts for genotype errors, recent mutations, and gene conversions which disrupt DNA sequence identity within IBD segments, and it can be applied to large cohorts with whole-genome sequence or SNP array data. We find that our method's estimates of uncertainty are well calibrated for homogeneous samples. We quantify endpoint uncertainty for 77.7 billion IBD segments from 408,883 individuals of white British ancestry in the UK Biobank, and we use these IBD segments to find regions showing evidence of recent natural selection. We show that many spurious selection signals are eliminated by the use of unbiased estimates of IBD segment endpoints and a pedigree-based genetic map. Eleven of the twelve regions with the greatest evidence for recent selection in our scan have been identified as selected in previous analyses using different approaches. Our computationally efficient method for quantifying IBD segment endpoint uncertainty is implemented in the open source ibd-ends software package.


Subjects
Biometric Identification/methods; Chromosome Mapping/statistics & numerical data; Genome, Human; Inheritance Patterns; Models, Statistical; Polymorphism, Single Nucleotide; Biological Specimen Banks; Family; Female; Genome-Wide Association Study; Humans; Male; Pedigree; Software; Uncertainty; United Kingdom
11.
JMIR Public Health Surveill ; 6(2): e15917, 2020 04 30.
Article in English | MEDLINE | ID: mdl-32352389

ABSTRACT

BACKGROUND: Many public health departments use record linkage between surveillance data and external data sources to inform public health interventions. However, little guidance is available to inform these activities, and many health departments rely on deterministic algorithms that may miss many true matches. In the context of public health action, these missed matches lead to missed opportunities to deliver interventions and may exacerbate existing health inequities. OBJECTIVE: This study aimed to compare the performance of record linkage algorithms commonly used in public health practice. METHODS: We compared five deterministic (exact, Stenger, Ocampo 1, Ocampo 2, and Bosh) and two probabilistic record linkage algorithms (fastLink and beta record linkage [BRL]) using simulations and a real-world scenario. We simulated pairs of datasets with varying numbers of errors per record and the number of matching records between the two datasets (ie, overlap). We matched the datasets using each algorithm and calculated their recall (ie, sensitivity, the proportion of true matches identified by the algorithm) and precision (ie, positive predictive value, the proportion of matches identified by the algorithm that were true matches). We estimated the average computation time by performing a match with each algorithm 20 times while varying the size of the datasets being matched. In a real-world scenario, HIV and sexually transmitted disease surveillance data from King County, Washington, were matched to identify people living with HIV who had a syphilis diagnosis in 2017. We calculated the recall and precision of each algorithm compared with a composite standard based on the agreement in matching decisions across all the algorithms and manual review. RESULTS: In simulations, BRL and fastLink maintained a high recall at nearly all data quality levels, while being comparable with deterministic algorithms in terms of precision. Deterministic algorithms typically failed to identify matches in scenarios with low data quality. All the deterministic algorithms had a shorter average computation time than the probabilistic algorithms. BRL had the slowest overall computation time (14 min when both datasets contained 2000 records). In the real-world scenario, BRL had the lowest trade-off between recall (309/309, 100.0%) and precision (309/312, 99.0%). CONCLUSIONS: Probabilistic record linkage algorithms maximize the number of true matches identified, reducing gaps in the coverage of interventions and maximizing the reach of public health action.
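
The recall and precision reported for BRL in the real-world scenario follow directly from the definitions given in the Methods. A minimal Python sketch of that arithmetic is shown here; the function and variable names are illustrative and are not taken from the study.

```python
def recall(true_matches_found: int, true_matches_total: int) -> float:
    """Proportion of the true matches that the algorithm identified (sensitivity)."""
    return true_matches_found / true_matches_total


def precision(true_matches_found: int, matches_reported: int) -> float:
    """Proportion of the reported matches that were true matches (positive predictive value)."""
    return true_matches_found / matches_reported


# Counts quoted in the abstract for BRL in the real-world scenario.
brl_recall = recall(309, 309)
brl_precision = precision(309, 312)

print(f"recall={brl_recall:.1%}, precision={brl_precision:.1%}")
```

With these counts, the three reported matches that were not true matches account for the gap between 100.0% recall and 99.0% precision.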


Subjects
Algorithms; COVID-19/diagnosis; Chromosome Mapping/standards; Electronic Health Records/instrumentation; Public Health/instrumentation; COVID-19/physiopathology; Chromosome Mapping/methods; Chromosome Mapping/statistics & numerical data; Electronic Health Records/standards; Electronic Health Records/trends; Humans; Pandemics/prevention & control; Public Health/methods; Public Health/trends; Reproducibility of Results; Validation Studies as Topic
12.
Curr Protein Pept Sci ; 21(11): 1068-1077, 2020.
Article in English | MEDLINE | ID: mdl-32338215

ABSTRACT

Many studies have shown that the spatial distribution of genes within a single chromosome exhibits distinct patterns. However, little is known about the characteristics of inter-chromosomal distribution of genes (including protein-coding genes, processed transcripts and pseudogenes) in different genomes. In this study, we explored these issues using the available genomic data of both human and model organisms. Moreover, we also analyzed the distribution pattern of protein-coding genes that have been associated with 14 common diseases and the insertion/deletion mutations and single nucleotide polymorphisms detected by whole genome sequencing in an acute promyelocytic leukemia patient. We obtained the following novel findings. Firstly, inter-chromosomal distribution of genes displays a nonstochastic pattern and the gene densities in different chromosomes are heterogeneous. This kind of heterogeneity is observed in genomes of both lower and higher species. Secondly, protein-coding genes involved in certain biological processes tend to be enriched in one or a few chromosomes. Our findings add new insights into the spatial distribution of genomic and disease-related genes across chromosomes. These results could be useful in improving the efficiency of disease-associated gene screening studies by targeting specific chromosomes.


Subjects
Coronary Disease/genetics; Epistasis, Genetic; Lupus Erythematosus, Systemic/genetics; Neoplasms/genetics; Neurodegenerative Diseases/genetics; Stroke/genetics; Animals; Base Composition; Caenorhabditis elegans/genetics; Chromosome Mapping/statistics & numerical data; Chromosomes, Human/chemistry; Coronary Disease/diagnosis; Coronary Disease/pathology; Drosophila melanogaster/genetics; Genome, Human; Genome-Wide Association Study; Humans; Lupus Erythematosus, Systemic/diagnosis; Lupus Erythematosus, Systemic/pathology; Mice; Neoplasms/classification; Neoplasms/diagnosis; Neoplasms/pathology; Neurodegenerative Diseases/classification; Neurodegenerative Diseases/diagnosis; Neurodegenerative Diseases/pathology; Open Reading Frames; Stroke/diagnosis; Stroke/pathology; Zebrafish/genetics
13.
PLoS One ; 15(2): e0228951, 2020.
Article in English | MEDLINE | ID: mdl-32074141

ABSTRACT

Segregation distortion is the phenomenon in which genotypes deviate from expected Mendelian ratios in the progeny of a cross between two varieties or species. There is currently no widely accepted consensus on the appropriate statistical test, or more specifically the multiple testing correction procedure, for detecting segregation distortion in high-density single-nucleotide polymorphism (SNP) data. Here we examine the efficacy of various multiple testing procedures, including the chi-square test with no correction for multiple testing, false discovery rate correction and Bonferroni correction, using an in-silico simulation of a biparental mapping population. We find that the false discovery rate correction best approximates the traditional p-value threshold of 0.05 for high-density marker data. We also utilize this simulation to test the effect of segregation distortion on the genetic mapping process, specifically on the formation of linkage groups during marker clustering. Only extreme segregation distortion was found to affect genetic mapping. In addition, we utilize replicate empirical mapping populations of wheat varieties Avalon and Cadenza to assess how often segregation distortion conforms to the same pattern between closely related wheat varieties.
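
To illustrate the comparison described in this abstract, the following Python sketch tests a few markers for segregation distortion and applies the three decision rules it names: no correction, Bonferroni, and false discovery rate. It is a sketch under stated assumptions and not the authors' simulation: the expected 1:1 genotype ratio, the use of the Benjamini-Hochberg procedure for the false discovery rate, the marker counts and all function names are invented for the example.

```python
import math


def chi_square_1to1(count_a: int, count_b: int) -> float:
    """P-value of a chi-square goodness-of-fit test against an assumed 1:1 ratio (1 df)."""
    expected = (count_a + count_b) / 2
    stat = ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected
    # With 1 degree of freedom the chi-square survival function is erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(stat / 2))


def bonferroni(p_values, alpha=0.05):
    """Flag markers whose p-value is at most alpha divided by the number of tests."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Flag markers using the Benjamini-Hochberg step-up false discovery rate procedure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            largest_rank = rank
    flagged = [False] * m
    for rank, idx in enumerate(order, start=1):
        flagged[idx] = rank <= largest_rank
    return flagged


# Hypothetical genotype counts (parent A allele, parent B allele) for five markers.
marker_counts = [(50, 50), (70, 30), (62, 38), (90, 10), (55, 45)]
p_values = [chi_square_1to1(a, b) for a, b in marker_counts]

uncorrected = [p <= 0.05 for p in p_values]
print("uncorrected:", uncorrected)
print("Bonferroni: ", bonferroni(p_values))
print("FDR (BH):   ", benjamini_hochberg(p_values))
```

In this toy input the moderately distorted marker (62:38) is flagged without correction and under the false discovery rate rule but not under Bonferroni, which shows how the stricter correction can miss moderate distortion. Five invented markers say nothing about behaviour on high-density data.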


Subjects
Chromosome Mapping/methods; Chromosome Mapping/statistics & numerical data; Chromosome Segregation/physiology; Chromosomes, Plant/genetics; Computer Simulation; Data Interpretation, Statistical; Genetic Linkage/genetics; Genotype; Polymorphism, Single Nucleotide/genetics; Quantitative Trait Loci/genetics; Triticum/genetics
14.
Genes (Basel) ; 12(1), 2020 12 31.
Article in English | MEDLINE | ID: mdl-33396302

ABSTRACT

The study of fish cytogenetics has been impeded by the inability to produce G-bands that could assign chromosomes to their homologous pairs. Thus, the majority of karyotypes published have been estimated based on morphological similarities of chromosomes. The reason why chromosome G-banding does not work in fish remains elusive. However, the recent increase in the number of fish genomes assembled to the chromosome level provides a way to analyse this issue. We have developed a Python tool to visualize and quantify GC percentage (GC%) of both repeats and unique DNA along chromosomes using a non-overlapping sliding window approach. Our tool profiles GC% and simultaneously plots the proportion of repeats (rep%) in a color scale (or vice versa). Hence, it is possible to assess the contribution of repeats to the total GC%. The main differences are the GC% of repeats homogenizing the overall GC% along fish chromosomes and a greater range of GC% scattered along fish chromosomes. This may explain the inability to produce G-banding in fish. We also show an occasional banding pattern along the chromosomes in some fish that probably cannot be detected with traditional qualitative cytogenetic methods.
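
The non-overlapping sliding window idea mentioned in this abstract can be sketched in a few lines of Python. This is not the authors' tool. The function name, the toy sequence and the choice to ignore ambiguous bases such as N are assumptions made for illustration, and the sketch does not attempt the separate repeat proportion (rep%) that the published tool plots.

```python
def gc_percent_windows(sequence: str, window: int):
    """Return (start, gc_percent) for consecutive, non-overlapping windows.

    Only A, C, G and T are counted; windows without any of these bases
    (for example runs of N) are skipped.
    """
    seq = sequence.upper()
    results = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        counted = sum(chunk.count(base) for base in "ACGT")
        if counted == 0:
            continue
        gc = chunk.count("G") + chunk.count("C")
        results.append((start, 100.0 * gc / counted))
    return results


# Toy sequence: a GC-rich window, an AT-rich window, a mixed window,
# an unresolved window and a short trailing window.
toy_sequence = "GGCC" + "ATAT" + "GCAT" + "NNNN" + "GC"
profile = gc_percent_windows(toy_sequence, window=4)

for start, gc in profile:
    print(f"{start:>3} {gc:6.1f}")
```

Because the windows do not overlap, each base contributes to exactly one value, so the profile can be plotted directly along the chromosome coordinate.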


Subjects
Base Composition; Chromosome Mapping/methods; Fishes/genetics; Genome; Karyotyping/methods; Software; Animals; Cats; Chromosome Banding; Chromosome Mapping/statistics & numerical data; Fishes/classification; Gorilla gorilla/classification; Gorilla gorilla/genetics; Tandem Repeat Sequences
15.
Ann Neurol ; 87(2): 184-193, 2020 02.
Article in English | MEDLINE | ID: mdl-31788832

ABSTRACT

OBJECTIVE: Restless legs syndrome is a frequent neurological disorder with substantial burden on individual well-being and public health. Genetic risk loci have been identified, but the causative genes at these loci are largely unknown, so that functional investigation and clinical translation of molecular research data are still hampered. To identify putatively causative genes, we searched for highly significant mutational burden in candidate genes. METHODS: We analyzed 84 candidate genes in 4,649 patients and 4,982 controls by next generation sequencing using molecular inversion probes that targeted mainly coding regions. The burden of low-frequency and rare variants was assessed, and in addition, an algorithm (binomial performance deviation analysis) was established to estimate independently the sequence variation in the probe binding regions from the variation in sequencing depth. RESULTS: Highly significant results (considering the number of genes in the genome) of the conventional burden test and the binomial performance deviation analysis overlapped significantly. Fourteen genes were highly significant by one method and confirmed with Bonferroni-corrected significance by the other to show a differential burden of low-frequency and rare variants in restless legs syndrome. Nine of them (AAGAB, ATP2C1, CNTN4, COL6A6, CRBN, GLO1, NTNG1, STEAP4, VAV3) resided in the vicinity of known restless legs syndrome loci, whereas 5 (BBS7, CADM1, CREB5, NRG3, SUN1) have not previously been associated with restless legs syndrome. Burden test and binomial performance deviation analysis also converged significantly in fine-mapping potentially causative domains within these genes. INTERPRETATION: Differential burden with intragenic low-frequency variants reveals putatively causative genes in restless legs syndrome.


Subjects
DNA Mutational Analysis; Genetic Predisposition to Disease/genetics; Restless Legs Syndrome/genetics; Case-Control Studies; Chromosome Mapping/statistics & numerical data; Female; Humans; Male; Middle Aged
16.
Sci Rep ; 9(1): 16855, 2019 11 14.
Article in English | MEDLINE | ID: mdl-31728008

ABSTRACT

Ramie is an important natural fiber crop, and fiber yield and its related traits are the most valuable traits in ramie production. However, the genetic basis of these traits is still poorly understood, which has dramatically hindered breeding for high yield in this fiber crop. Herein, a high-density genetic map with 6,433 markers spanning 2,476.5 cM was constructed using a population derived from two parents, cultivated ramie Zhongsizhu 1 (ZSZ1) and its wild progenitor B. nivea var. tenacissima (BNT). Fiber yield (FY) and four related traits, namely stem diameter (SD), stem length (SL), stem bark weight (BW) and stem bark thickness (BT), were subjected to quantitative trait locus (QTL) analysis, which identified a total of 47 QTLs. Forty QTLs were mapped into 12 genomic regions, thus forming 12 QTL clusters. Among the 47 QTLs, there were 14 whose wild allele from BNT was beneficial. Interestingly, all QTLs in Cluster 10 displayed overdominance, indicating that the region of this cluster is likely a heterotic locus. In addition, four fiber yield-related genes that underwent positive selection were found either to fall into the FY-related QTL regions or to lie near the identified QTLs. The dissection of FY and FY-related traits not only improved our understanding of the genetic basis of these traits, but also provided new insights into the domestication of FY in ramie. The identification of many QTLs and the discovery of beneficial alleles from wild species provide a basis for the improvement of yield traits in ramie breeding.


Subjects
Boehmeria/genetics , Chromosome Mapping/statistics & numerical data , Crops, Agricultural , Plant Stems/genetics , Quantitative Trait Loci , Quantitative Trait, Heritable , Boehmeria/anatomy & histology , Boehmeria/chemistry , Boehmeria/growth & development , Crosses, Genetic , Dietary Fiber/analysis , Genetic Linkage , Genome, Plant , Humans , Plant Breeding/methods , Plant Stems/anatomy & histology , Plant Stems/chemistry , Plant Stems/growth & development
17.
Reprod Biomed Online ; 39(1): 40-48, 2019 Jul.
Article in English | MEDLINE | ID: mdl-31097322

ABSTRACT

RESEARCH QUESTION: To analyse why unbalanced viable offspring are derived mainly from the 3:1 segregation mode in t(11;22)(q23;q11.2) reciprocal translocation. DESIGN: Retrospective analysis of 24 pre-implantation genetic testing for chromosomal structural re-arrangements (PGT-SR) cycles was performed on seven male and five female carriers of t(11;22) translocation. Sperm analysis was performed on each male carrier. These patients were referred to the study centre after several years of miscarriages and/or abortions, primary infertility for male carriers or birth of an affected child. RESULTS: Twenty-four PGT-SR cycles were performed to exclude imbalances in both male and female carriers. The unbalanced embryos derived from the adjacent-1 segregation mode were the most represented in both male and female carriers (68.4% and 50%, respectively). These results were positively correlated with meiotic segregation analysis of reciprocal translocation in spermatozoa. A thorough analysis of the unbalanced embryo karyotypes determined that the expected viable +der22 karyotype resulting from 3:1 malsegregation was less represented, at 5.3%. CONCLUSIONS: These findings highlight the divergence that may exist between meiotic segregation and post-zygotic selection. Post-zygotic selection would be responsible for the elimination of unbalanced embryos derived from the adjacent-1 segregation mode. The combined action of several factors occurs at the beginning of post-zygotic selection. Genetic counselling must consider the risk of a birth related to the adjacent-1 segregation mode, irrespective of the sex of the translocation carrier. These results will allow deeper understanding of the PGT results of t(11;22) carriers, which often include a high number of aneuploid embryos.


Subjects
Chromosomes, Human, Pair 11/genetics , Chromosomes, Human, Pair 22/genetics , Inheritance Patterns/genetics , Preimplantation Diagnosis/methods , Translocation, Genetic , Adult , Chromosome Mapping/methods , Chromosome Mapping/statistics & numerical data , Female , Gene Frequency , Genetic Carrier Screening/methods , Humans , In Situ Hybridization, Fluorescence/methods , In Situ Hybridization, Fluorescence/statistics & numerical data , Karyotyping , Male , Pregnancy , Preimplantation Diagnosis/statistics & numerical data , Retrospective Studies , Semen Analysis/methods , Semen Analysis/statistics & numerical data , Translocation, Genetic/genetics
18.
PLoS One ; 14(5): e0216944, 2019.
Article in English | MEDLINE | ID: mdl-31100083

ABSTRACT

Most viruses are known to spontaneously generate defective viral genomes (DVGs) due to errors during replication. These DVGs are subgenomic and contain deletions that render them unable to complete a full replication cycle in the absence of a co-infecting, non-defective helper virus. DVGs, especially of the copyback type, frequently observed with paramyxoviruses, have been recognized to be important triggers of the antiviral innate immune response. DVGs have therefore gained interest for their potential to alter the attenuation and immunogenicity of vaccines. To investigate this potential, accurate identification and quantification of DVGs is essential. Conventional methods, such as RT-PCR, are labor-intensive and will only detect primer sequence-specific species. High-throughput sequencing (HTS) is much better suited for this undertaking. Here, we present an HTS-based algorithm called DVG-profiler to identify and quantify all DVG sequences in an HTS data set generated from a virus preparation. DVG-profiler identifies DVG breakpoints relative to a reference genome and reports the directionality of each segment from within the same read. The specificity and sensitivity of the algorithm were assessed using both in silico data sets and HTS data obtained from parainfluenza virus 5, Sendai virus and mumps virus preparations. HTS data from the latter were also compared with conventional RT-PCR data and with data obtained using an alternative algorithm. The data presented here demonstrate the high specificity, sensitivity, and robustness of DVG-profiler. This algorithm was implemented within an open source cloud-based computing environment for analyzing HTS data. DVG-profiler might prove valuable not only in basic virus research but also in monitoring live attenuated vaccines for DVG content and in assuring vaccine lot-to-lot consistency.


Subjects
Algorithms , Chromosome Mapping/statistics & numerical data , Defective Viruses/genetics , Genome, Viral , Mumps virus/genetics , Parainfluenza Virus 5/genetics , Sendai virus/genetics , Animals , Chromosome Mapping/methods , DNA Primers/chemical synthesis , DNA Primers/metabolism , Datasets as Topic , Defective Viruses/classification , High-Throughput Nucleotide Sequencing/statistics & numerical data , Humans , Molecular Typing , Mumps virus/classification , Parainfluenza Virus 5/classification , Real-Time Polymerase Chain Reaction , Sendai virus/classification , Sensitivity and Specificity
19.
Nat Commun ; 10(1): 1938, 2019 04 26.
Article in English | MEDLINE | ID: mdl-31028255

ABSTRACT

Chromosome conformation capture techniques, such as Hi-C, are fundamental in characterizing genome organization. These methods have revealed several genomic features, such as chromatin loops, whose disruption can have dramatic effects on gene regulation. Unfortunately, their detection is difficult; current methods require users to choose the resolution of interaction maps based on dataset quality and sequencing depth. Here, we introduce Binless, a resolution-agnostic method that adapts to the quality and quantity of available data, to detect both interactions and differences. Binless relies on an alternate representation of Hi-C data, which leads to a more detailed classification of paired-end reads. Using a large-scale benchmark, we demonstrate that Binless is able to call interactions with higher reproducibility than other existing methods. Binless, which is freely available, can thus reliably be used to identify chromatin loops as well as for differential analysis of chromatin interaction maps.


Subjects
Caulobacter crescentus/genetics , Chromatin/chemistry , Chromosome Mapping/methods , Computational Biology/methods , DNA/chemistry , Genome , Benchmarking , Chromosome Mapping/statistics & numerical data , DNA/genetics , Datasets as Topic , High-Throughput Nucleotide Sequencing , Humans , Nucleic Acid Conformation
20.
PLoS Genet ; 15(3): e1007530, 2019 03.
Article in English | MEDLINE | ID: mdl-30875371

ABSTRACT

A common complementary strategy in Genome-Wide Association Studies (GWAS) is to perform Gene Set Analysis (GSA), which tests for the association between one phenotype of interest and an entire set of Single Nucleotide Polymorphisms (SNPs) residing in selected genes. While there exist many tools for performing GSA, popular methods often include a number of ad-hoc steps that are difficult to justify statistically, provide complicated interpretations based on permutation inference, and demonstrate poor operating characteristics. Additionally, the lack of gold standard gene set lists can produce misleading results and create difficulties in comparing analyses even across the same phenotype. We introduce the Generalized Berk-Jones (GBJ) statistic for GSA, a permutation-free parametric framework that offers asymptotic power guarantees in certain set-based testing settings. To adjust for confounding introduced by different gene set lists, we further develop a GBJ step-down inference technique that can discriminate between gene sets driven to significance by single genes and those demonstrating group-level effects. We compare GBJ to popular alternatives through simulation and re-analysis of summary statistics from a large breast cancer GWAS, and we show how GBJ can increase power by incorporating information from multiple signals in the same gene. In addition, we illustrate how breast cancer pathway analysis can be confounded by the frequency of FGFR2 in pathway lists. Our approach is further validated on two other datasets of summary statistics generated from GWAS of height and schizophrenia.


Subjects
Genome-Wide Association Study/statistics & numerical data , Body Height/genetics , Breast Neoplasms/genetics , Chromosome Mapping/statistics & numerical data , Computational Biology/methods , Computer Simulation , Databases, Genetic , Female , Gene Regulatory Networks , Humans , Models, Genetic , Models, Statistical , Polymorphism, Single Nucleotide , Receptor, Fibroblast Growth Factor, Type 2/genetics , Schizophrenia/genetics
...