Results 1 - 20 of 664
1.
Hum Genet ; 141(2): 273-281, 2022 Feb.
Article in English | MEDLINE | ID: mdl-35048190

ABSTRACT

Recombination is a major force that shapes genetic diversity. Determination of recombination rate is important and can theoretically be improved by increasing the sample size. However, it is nearly impossible to estimate recombination rates using traditional population genetics methods when the sample size is large because these methods are highly computationally demanding. In this study, we used a refined machine learning approach to estimate the recombination rate of the human genome using the UK10K human genomic dataset with 7,562 genomic sequences and its three subsets with 200, 400 and 2,000 genomic sequences. The estimation was performed under the human Out-of-Africa demographic model. We not only obtained an accurate human genetic map, but also found that the fluctuation of the estimated recombination rate along the human genome is reduced when the sample size increases. The recombination rate heterogeneity estimated from the full UK10K dataset is less than that estimated from its subsets. Our results demonstrate how the sample size affects the estimated recombination rate, and that analyses of a larger number of genomes result in a more precise estimation of recombination rate. The accurate genetic map based on the UK10K dataset is also expected to benefit other human biology research.


Subject(s)
Chromosome Mapping/methods , Genome, Human , Chromosome Mapping/statistics & numerical data , Databases, Genetic/statistics & numerical data , Genetics, Population , Humans , Machine Learning , Models, Genetic , Recombination, Genetic , Sample Size , Software , United Kingdom
2.
Int J Obes (Lond) ; 46(2): 307-315, 2022 02.
Article in English | MEDLINE | ID: mdl-34689180

ABSTRACT

BACKGROUND: The Berlin Fat Mouse Inbred line (BFMI) is a model for obesity and the metabolic syndrome. This study aimed to identify genetic variants associated with impaired glucose metabolism using the obese lines BFMI861-S1 and BFMI861-S2, which are genetically closely related, but differ in several traits. BFMI861-S1 is insulin resistant and stores ectopic fat in the liver, whereas BFMI861-S2 is insulin sensitive. METHODS: In generation 10, 397 males of an advanced intercross line (AIL) BFMI861-S1 × BFMI861-S2 were challenged with a high-fat, high-carbohydrate diet and phenotyped over 25 weeks. QTL-analysis was performed after selective genotyping of 200 mice using the GigaMUGA Genotyping Array. Additional 197 males were genotyped for 7 top SNPs in QTL regions. For the prioritization of positional candidate genes whole genome sequencing and gene expression data of the parental lines were used. RESULTS: Overlapping QTL for gonadal adipose tissue weight and blood glucose concentration were detected on chromosome (Chr) 3 (95.8-100.1 Mb), and for gonadal adipose tissue weight, liver weight, and blood glucose concentration on Chr 17 (9.5-26.1 Mb). Causal modeling suggested for Chr 3-QTL direct effects on adipose tissue weight, but indirect effects on blood glucose concentration. Direct effects on adipose tissue weight, liver weight, and blood glucose concentration were suggested for Chr 17-QTL. Prioritized positional candidate genes for the identified QTL were Notch2 and Fmo5 (Chr 3) and Plg and Acat2 (Chr 17). Two additional QTL were detected for gonadal adipose tissue weight on Chr 15 (67.9-74.6 Mb) and for body weight on Chr 16 (3.9-21.4 Mb). CONCLUSIONS: QTL mapping together with a detailed prioritization approach allowed us to identify candidate genes associated with traits of the metabolic syndrome. In addition, we provided evidence for direct and indirect genetic effects on blood glucose concentration in the insulin-resistant mouse line BFMI861-S1.


Subject(s)
Obesity/diet therapy , Quantitative Trait Loci/genetics , Animals , Carbohydrates/adverse effects , Chromosome Mapping/methods , Chromosome Mapping/statistics & numerical data , Diet, High-Fat/adverse effects , Diet, High-Fat/statistics & numerical data , Disease Models, Animal , Mice , Obesity/metabolism , Obesity/physiopathology , Quantitative Trait Loci/physiology
3.
Pak J Biol Sci ; 24(9): 997-1014, 2021 Jan.
Article in English | MEDLINE | ID: mdl-34585553

ABSTRACT

Background and Objective: Barley is considered one of the most important cereal crops at the local and global levels. It ranks second in nutritional importance after wheat, and its flour contributes significantly to bridging the large nutritional gap in the production of Egyptian bread. This study aimed to characterize and test the genetic behaviour responsible for salinity stress tolerance in barley, in an effort to improve the barley crop and increase its resistance to abiotic stress under Egyptian conditions. Materials and Methods: Twenty-one crosses and ten parents of barley with different responses to salinity were evaluated under normal and salinity conditions. Yield and its components, along with some physiological traits related to salt stress tolerance, were the main attributes evaluated under both conditions. Moreover, SSR markers were used to evaluate and identify markers associated with salinity tolerance in selected hybrids and to compare the ten barley parents. Results: The final results confirmed that the three testers (Giza 123, Giza 126 and Giza 2000) and the crosses Line 1 × Tester 1 (Giza 125 × Giza 123), Line 2 × Tester 1 (Giza 133 × Giza 123), Line 1 × Tester 2 (Giza 125 × Giza 126), Line 2 × Tester 2 (Giza 133 × Giza 126) and Line 1 × Tester 3 (Giza 125 × Giza 2000) exhibited high salinity tolerance under saline stress treatment compared with the control experiment. Among 15 analyzed barley entries, the chosen set of 11 markers amplified 20 alleles, with an average of 1.81 and a range of 1-4 alleles per marker. Conclusion: The results of the SSR analysis and the data on valuable agricultural trait loci determined the genetic distance among parents and their hybrids, which is of great value to breeders.


Subject(s)
Hordeum/microbiology , Salt Stress , Chimera/microbiology , Chimera/physiology , Chromosome Mapping/methods , Chromosome Mapping/statistics & numerical data , Egypt , Hordeum/physiology
4.
PLoS Comput Biol ; 17(9): e1009373, 2021 09.
Article in English | MEDLINE | ID: mdl-34534210

ABSTRACT

Despite the growing constellation of genetic loci linked to common traits, these loci have yet to account for most heritable variation, and most act through poorly understood mechanisms. Recent machine learning (ML) systems have used hierarchical biological knowledge to associate genetic mutations with phenotypic outcomes, yielding substantial predictive power and mechanistic insight. Here, we use an ontology-guided ML system to map single nucleotide variants (SNVs) focusing on 6 classic phenotypic traits in natural yeast populations. The 29 identified loci are largely novel and account for ~17% of the phenotypic variance, versus <3% for standard genetic analysis. Representative results show that sensitivity to hydroxyurea is linked to SNVs in two alternative purine biosynthesis pathways, and that sensitivity to copper arises through failure to detoxify reactive oxygen species in fatty acid metabolism. This work demonstrates a knowledge-based approach to amplifying and interpreting signals in population genetic studies.


Subject(s)
Machine Learning , Models, Genetic , Multifactorial Inheritance , Benomyl/toxicity , Chromosome Mapping/methods , Chromosome Mapping/statistics & numerical data , Computational Biology , Copper/toxicity , Gene Ontology , Genome-Wide Association Study , Glucose/metabolism , Glycine/metabolism , Hydroxyurea/pharmacology , Knowledge Bases , Metabolic Networks and Pathways/drug effects , Metabolic Networks and Pathways/genetics , Mutation , Neural Networks, Computer , Nucleotidyltransferases/metabolism , Phenotype , Polymorphism, Single Nucleotide , Saccharomyces cerevisiae/drug effects , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Systems Biology
5.
Nat Genet ; 53(8): 1260-1269, 2021 08.
Article in English | MEDLINE | ID: mdl-34226706

ABSTRACT

Exome association studies to date have generally been underpowered to systematically evaluate the phenotypic impact of very rare coding variants. We leveraged extensive haplotype sharing between 49,960 exome-sequenced UK Biobank participants and the remainder of the cohort (total n ≈ 500,000) to impute exome-wide variants with accuracy R² > 0.5 down to minor allele frequency (MAF) ~0.00005. Association and fine-mapping analyses of 54 quantitative traits identified 1,189 significant associations (P < 5 × 10⁻⁸) involving 675 distinct rare protein-altering variants (MAF < 0.01) that passed stringent filters for likely causality. Across all traits, 49% of associations (578/1,189) occurred in genes with two or more hits; follow-up analyses of these genes identified allelic series containing up to 45 distinct 'likely-causal' variants. Our results demonstrate the utility of within-cohort imputation in population-scale genome-wide association studies, provide a catalog of likely-causal, large-effect coding variant associations and foreshadow the insights that will be revealed as genetic biobank studies continue to grow.


Subject(s)
Biological Specimen Banks , Exome Sequencing/statistics & numerical data , Gene Frequency , Proteins/genetics , Blood Pressure/genetics , Chromosome Mapping/methods , Chromosome Mapping/statistics & numerical data , Genetic Markers , Genome-Wide Association Study/statistics & numerical data , Humans , Linkage Disequilibrium , Membrane Proteins/genetics , Models, Genetic , Phenotype , Polymorphism, Single Nucleotide , Proteins/metabolism , Receptors, Atrial Natriuretic Factor/genetics , United Kingdom , Exome Sequencing/methods
6.
Genome Biol ; 22(1): 188, 2021 06 24.
Article in English | MEDLINE | ID: mdl-34167583

ABSTRACT

BACKGROUND: Single-cell RNA sequencing (scRNA-seq) has enabled the unbiased, high-throughput quantification of gene expression specific to cell types and states. With the cost of scRNA-seq decreasing and techniques for sample multiplexing improving, population-scale scRNA-seq, and thus single-cell expression quantitative trait locus (sc-eQTL) mapping, is increasingly feasible. Mapping of sc-eQTL provides additional resolution to study the regulatory role of common genetic variants on gene expression across a plethora of cell types and states and promises to improve our understanding of genetic regulation across tissues in both health and disease. RESULTS: While previously established methods for bulk eQTL mapping can, in principle, be applied to sc-eQTL mapping, there are a number of open questions about how best to process scRNA-seq data and adapt bulk methods to optimize sc-eQTL mapping. Here, we evaluate the role of different normalization and aggregation strategies, covariate adjustment techniques, and multiple testing correction methods to establish best practice guidelines. We use both real and simulated datasets across single-cell technologies to systematically assess the impact of these different statistical approaches. CONCLUSION: We provide recommendations for future single-cell eQTL studies that can yield up to twice as many eQTL discoveries as default approaches ported from bulk studies.


Subject(s)
Chromosome Mapping/statistics & numerical data , Genome, Human , Induced Pluripotent Stem Cells/metabolism , Quantitative Trait Loci , Single-Cell Analysis/methods , Alleles , Cell Line , Gene Expression Profiling , Gene Expression Regulation , Humans , Induced Pluripotent Stem Cells/cytology , Sequence Analysis, RNA , Software , Exome Sequencing
7.
PLoS Comput Biol ; 17(4): e1008926, 2021 04.
Article in English | MEDLINE | ID: mdl-33872311

ABSTRACT

Next-generation sequencing (NGS) has transformed molecular biology and contributed to many seminal insights into genomic regulation and function. Apart from whole-genome sequencing, an NGS workflow involves alignment of the sequencing reads to the genome of study, after which the resulting alignments can be used for downstream analyses. However, alignment is complicated by the repetitive sequences; many reads align to more than one genomic locus, with 15-30% of the genome not being uniquely mappable by short-read NGS. This problem is typically addressed by discarding reads that do not uniquely map to the genome, but this practice can lead to systematic distortion of the data. Previous studies that developed methods for handling ambiguously mapped reads were often of limited applicability or were computationally intensive, hindering their broader usage. In this work, we present SmartMap: an algorithm that augments industry-standard aligners to enable usage of ambiguously mapped reads by assigning weights to each alignment with Bayesian analysis of the read distribution and alignment quality. SmartMap is computationally efficient, utilizing far fewer weighting iterations than previously thought necessary to process alignments and, as such, analyzing more than a billion alignments of NGS reads in approximately one hour on a desktop PC. By applying SmartMap to peak-type NGS data, including MNase-seq, ChIP-seq, and ATAC-seq in three organisms, we can increase read depth by up to 53% and increase the mapped proportion of the genome by up to 18% compared to analyses utilizing only uniquely mapped reads. We further show that SmartMap enables the analysis of more than 140,000 repetitive elements that could not be analyzed by traditional ChIP-seq workflows, and we utilize this method to gain insight into the epigenetic regulation of different classes of repetitive elements. These data emphasize both the dangers of discarding ambiguously mapped reads and their power for driving biological discovery.


Subject(s)
Bayes Theorem , Chromosome Mapping/statistics & numerical data , High-Throughput Nucleotide Sequencing/statistics & numerical data , Chromatin Immunoprecipitation , DNA/genetics , Datasets as Topic , Genome, Human , Humans , Repetitive Sequences, Nucleic Acid , Reproducibility of Results , Sequence Alignment
8.
Nucleic Acids Res ; 48(21): 12074-12084, 2020 12 02.
Article in English | MEDLINE | ID: mdl-33219687

ABSTRACT

CRISPR-Cas systems require discriminating self from non-self DNA during adaptation and interference. Yet, multiple cases have been reported of bacteria containing self-targeting spacers (STS), i.e. CRISPR spacers targeting protospacers on the same genome. STS has been suggested to reflect potential auto-immunity as an unwanted side effect of CRISPR-Cas defense, or a regulatory mechanism for gene expression. Here we investigated the incidence, distribution, and evasion of STS in over 100 000 bacterial genomes. We found STS in all CRISPR-Cas types and in one fifth of all CRISPR-carrying bacteria. Notably, up to 40% of I-B and I-F CRISPR-Cas systems contained STS. We observed that STS-containing genomes almost always carry a prophage and that STS map to prophage regions in more than half of the cases. Despite carrying STS, genetic deterioration of CRISPR-Cas systems appears to be rare, suggesting a level of escape from the potentially deleterious effects of STS by other mechanisms such as anti-CRISPR proteins and CRISPR target mutations. We propose a scenario where it is common to acquire an STS against a prophage, and this may trigger more extensive STS buildup by primed spacer acquisition in type I systems, without detrimental autoimmunity effects as mechanisms of auto-immunity evasion create tolerance to STS-targeted prophages.


Subject(s)
Bacteria/genetics , CRISPR-Associated Proteins/genetics , CRISPR-Cas Systems/immunology , Clustered Regularly Interspaced Short Palindromic Repeats/immunology , Genome, Bacterial , Prophages/genetics , Autoimmunity/genetics , Bacteria/immunology , Bacteria/virology , Base Sequence , CRISPR-Associated Protein 9/genetics , CRISPR-Associated Protein 9/immunology , CRISPR-Associated Proteins/immunology , Chromosome Mapping/statistics & numerical data , Software
9.
Nucleic Acids Res ; 48(21): e123, 2020 12 02.
Article in English | MEDLINE | ID: mdl-33074315

ABSTRACT

The recently developed Hi-C technique has been widely applied to map genome-wide chromatin interactions. However, current methods for analyzing diploid Hi-C data cannot fully distinguish between homologous chromosomes. Consequently, the existing diploid Hi-C analyses are based on sparse and inaccurate allele-specific contact matrices, which might lead to incorrect modeling of diploid genome architecture. Here we present ASHIC, a hierarchical Bayesian framework to model allele-specific chromatin organizations in diploid genomes. We developed two models under the Bayesian framework: the Poisson-multinomial (ASHIC-PM) model and the zero-inflated Poisson-multinomial (ASHIC-ZIPM) model. The proposed ASHIC methods impute allele-specific contact maps from diploid Hi-C data and simultaneously infer allelic 3D structures. Through simulation studies, we demonstrated that ASHIC methods outperformed existing approaches, especially under low coverage and low SNP density conditions. Additionally, in the analyses of diploid Hi-C datasets in mouse and human, our ASHIC-ZIPM method produced fine-resolution diploid chromatin maps and 3D structures and provided insights into the allelic chromatin organizations and functions. To summarize, our work provides a statistically rigorous framework for investigating fine-scale allele-specific chromatin conformations. The ASHIC software is publicly available at https://github.com/wmalab/ASHIC.


Subject(s)
Chromatin Assembly and Disassembly , Chromatin/ultrastructure , Chromosome Mapping/statistics & numerical data , Software , Alleles , Animals , Bayes Theorem , Chromatin/metabolism , Chromosome Mapping/methods , Computer Simulation , Diploidy , Fibroblasts/metabolism , Fibroblasts/ultrastructure , Genomic Imprinting , Histones/genetics , Histones/metabolism , Humans , Insulin-Like Growth Factor II/genetics , Insulin-Like Growth Factor II/metabolism , Internet , Mice , Polymorphism, Single Nucleotide
10.
Am J Hum Genet ; 107(5): 895-910, 2020 11 05.
Article in English | MEDLINE | ID: mdl-33053335

ABSTRACT

Most methods for fast detection of identity by descent (IBD) segments report identity by state segments without any quantification of the uncertainty in the endpoints and lengths of the IBD segments. We present a method for determining the posterior probability distribution of IBD segment endpoints. Our approach accounts for genotype errors, recent mutations, and gene conversions which disrupt DNA sequence identity within IBD segments, and it can be applied to large cohorts with whole-genome sequence or SNP array data. We find that our method's estimates of uncertainty are well calibrated for homogeneous samples. We quantify endpoint uncertainty for 77.7 billion IBD segments from 408,883 individuals of white British ancestry in the UK Biobank, and we use these IBD segments to find regions showing evidence of recent natural selection. We show that many spurious selection signals are eliminated by the use of unbiased estimates of IBD segment endpoints and a pedigree-based genetic map. Eleven of the twelve regions with the greatest evidence for recent selection in our scan have been identified as selected in previous analyses using different approaches. Our computationally efficient method for quantifying IBD segment endpoint uncertainty is implemented in the open source ibd-ends software package.


Subject(s)
Biometric Identification/methods , Chromosome Mapping/statistics & numerical data , Genome, Human , Inheritance Patterns , Models, Statistical , Polymorphism, Single Nucleotide , Biological Specimen Banks , Family , Female , Genome-Wide Association Study , Humans , Male , Pedigree , Software , Uncertainty , United Kingdom
11.
JMIR Public Health Surveill ; 6(2): e15917, 2020 04 30.
Article in English | MEDLINE | ID: mdl-32352389

ABSTRACT

BACKGROUND: Many public health departments use record linkage between surveillance data and external data sources to inform public health interventions. However, little guidance is available to inform these activities, and many health departments rely on deterministic algorithms that may miss many true matches. In the context of public health action, these missed matches lead to missed opportunities to deliver interventions and may exacerbate existing health inequities. OBJECTIVE: This study aimed to compare the performance of record linkage algorithms commonly used in public health practice. METHODS: We compared five deterministic (exact, Stenger, Ocampo 1, Ocampo 2, and Bosh) and two probabilistic record linkage algorithms (fastLink and beta record linkage [BRL]) using simulations and a real-world scenario. We simulated pairs of datasets with varying numbers of errors per record and the number of matching records between the two datasets (ie, overlap). We matched the datasets using each algorithm and calculated their recall (ie, sensitivity, the proportion of true matches identified by the algorithm) and precision (ie, positive predictive value, the proportion of matches identified by the algorithm that were true matches). We estimated the average computation time by performing a match with each algorithm 20 times while varying the size of the datasets being matched. In a real-world scenario, HIV and sexually transmitted disease surveillance data from King County, Washington, were matched to identify people living with HIV who had a syphilis diagnosis in 2017. We calculated the recall and precision of each algorithm compared with a composite standard based on the agreement in matching decisions across all the algorithms and manual review. RESULTS: In simulations, BRL and fastLink maintained a high recall at nearly all data quality levels, while being comparable with deterministic algorithms in terms of precision. Deterministic algorithms typically failed to identify matches in scenarios with low data quality. All the deterministic algorithms had a shorter average computation time than the probabilistic algorithms. BRL had the slowest overall computation time (14 min when both datasets contained 2000 records). In the real-world scenario, BRL had the lowest trade-off between recall (309/309, 100.0%) and precision (309/312, 99.0%). CONCLUSIONS: Probabilistic record linkage algorithms maximize the number of true matches identified, reducing gaps in the coverage of interventions and maximizing the reach of public health action.
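
Illustrative note: the recall and precision definitions given in this abstract can be checked against the reported BRL counts with a minimal Python sketch. This is not the study's code; the function names are hypothetical, and the false-negative and false-positive counts are derived here from the reported fractions (309/309 and 309/312).

```python
def recall(true_positives, false_negatives):
    """Proportion of true matches that the algorithm identified (sensitivity)."""
    return true_positives / (true_positives + false_negatives)


def precision(true_positives, false_positives):
    """Proportion of identified matches that were true matches (PPV)."""
    return true_positives / (true_positives + false_positives)


# Counts derived from the fractions reported for BRL in the real-world scenario.
true_matches_found = 309            # numerator of both reported fractions
false_negatives = 309 - 309         # true matches in the standard minus those found
false_positives = 312 - 309         # matches declared minus those that were true

brl_recall = recall(true_matches_found, false_negatives)
brl_precision = precision(true_matches_found, false_positives)

print(f"recall    = {brl_recall:.1%}")
print(f"precision = {brl_precision:.1%}")
```

The three declared matches that were not true matches are what separates the reported precision from 100%.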


Subject(s)
Algorithms , COVID-19/diagnosis , Chromosome Mapping/standards , Electronic Health Records/instrumentation , Public Health/instrumentation , COVID-19/physiopathology , Chromosome Mapping/methods , Chromosome Mapping/statistics & numerical data , Electronic Health Records/standards , Electronic Health Records/trends , Humans , Pandemics/prevention & control , Public Health/methods , Public Health/trends , Reproducibility of Results , Validation Studies as Topic
12.
Curr Protein Pept Sci ; 21(11): 1068-1077, 2020.
Article in English | MEDLINE | ID: mdl-32338215

ABSTRACT

Many studies have shown that the spatial distribution of genes within a single chromosome exhibits distinct patterns. However, little is known about the characteristics of inter-chromosomal distribution of genes (including protein-coding genes, processed transcripts and pseudogenes) in different genomes. In this study, we explored these issues using the available genomic data of both human and model organisms. Moreover, we also analyzed the distribution pattern of protein-coding genes that have been associated with 14 common diseases and the insertion/deletion mutations and single nucleotide polymorphisms detected by whole genome sequencing in an acute promyelocytic leukemia patient. We obtained the following novel findings. Firstly, inter-chromosomal distribution of genes displays a nonstochastic pattern and the gene densities in different chromosomes are heterogeneous. This kind of heterogeneity is observed in genomes of both lower and higher species. Secondly, protein-coding genes involved in certain biological processes tend to be enriched in one or a few chromosomes. Our findings add new insights into the spatial distribution of genome and disease-related genes across chromosomes. These results could be useful in improving the efficiency of disease-associated gene screening studies by targeting specific chromosomes.


Subject(s)
Coronary Disease/genetics , Epistasis, Genetic , Lupus Erythematosus, Systemic/genetics , Neoplasms/genetics , Neurodegenerative Diseases/genetics , Stroke/genetics , Animals , Base Composition , Caenorhabditis elegans/genetics , Chromosome Mapping/statistics & numerical data , Chromosomes, Human/chemistry , Coronary Disease/diagnosis , Coronary Disease/pathology , Drosophila melanogaster/genetics , Genome, Human , Genome-Wide Association Study , Humans , Lupus Erythematosus, Systemic/diagnosis , Lupus Erythematosus, Systemic/pathology , Mice , Neoplasms/classification , Neoplasms/diagnosis , Neoplasms/pathology , Neurodegenerative Diseases/classification , Neurodegenerative Diseases/diagnosis , Neurodegenerative Diseases/pathology , Open Reading Frames , Stroke/diagnosis , Stroke/pathology , Zebrafish/genetics
13.
PLoS One ; 15(2): e0228951, 2020.
Article in English | MEDLINE | ID: mdl-32074141

ABSTRACT

Segregation distortion is the phenomenon in which genotypes deviate from expected Mendelian ratios in the progeny of a cross between two varieties or species. There is currently no widely used consensus on the appropriate statistical test, or more specifically the multiple testing correction procedure, used to detect segregation distortion for high-density single-nucleotide polymorphism (SNP) data. Here we examine the efficacy of various multiple testing procedures, including chi-square test with no correction for multiple testing, false-discovery rate correction and Bonferroni correction using an in-silico simulation of a biparental mapping population. We find that the false discovery rate correction best approximates the traditional p-value threshold of 0.05 for high-density marker data. We also utilize this simulation to test the effect of segregation distortion on the genetic mapping process, specifically on the formation of linkage groups during marker clustering. Only extreme segregation distortion was found to affect genetic mapping. In addition, we utilize replicate empirical mapping populations of wheat varieties Avalon and Cadenza to assess how often segregation distortion conforms to the same pattern between closely related wheat varieties.
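
Illustrative note: the three procedures compared in this abstract (uncorrected chi-square test, Bonferroni correction, false discovery rate correction) can be sketched in a few lines of Python. This is a minimal sketch and not the authors' simulation: it assumes a 1:1 expected ratio of the two parental genotypes, uses the Benjamini-Hochberg procedure as the false discovery rate method, and the marker counts and function names are hypothetical.

```python
import math


def chi_square_1to1(count_a, count_b):
    """Chi-square test (1 df) against a 1:1 ratio; returns (statistic, p-value)."""
    expected = (count_a + count_b) / 2
    stat = ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected
    # For 1 degree of freedom the chi-square survival function is erfc(sqrt(x / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))


def bonferroni(p_values, alpha=0.05):
    """Flag markers whose p-value is at most alpha divided by the number of tests."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Flag markers with the Benjamini-Hochberg step-up procedure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank  # keep the largest rank that passes its threshold
    flagged = [False] * m
    for rank, idx in enumerate(order, start=1):
        flagged[idx] = rank <= cutoff_rank
    return flagged


# Hypothetical genotype counts (parent A allele, parent B allele) at five markers.
marker_counts = [(50, 50), (62, 38), (70, 30), (55, 45), (80, 20)]
p_values = [chi_square_1to1(a, b)[1] for a, b in marker_counts]

uncorrected = [p <= 0.05 for p in p_values]
bonf = bonferroni(p_values)
fdr = benjamini_hochberg(p_values)

for counts, p, u, b, f in zip(marker_counts, p_values, uncorrected, bonf, fdr):
    print(counts, f"p={p:.3g}", "uncorrected:", u, "Bonferroni:", b, "FDR:", f)
```

In this toy data set the 62:38 marker is flagged without correction and under the false discovery rate procedure but not under Bonferroni, which shows how the stricter correction can miss moderate distortion.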


Subject(s)
Chromosome Mapping/methods , Chromosome Mapping/statistics & numerical data , Chromosome Segregation/physiology , Chromosomes, Plant/genetics , Computer Simulation , Data Interpretation, Statistical , Genetic Linkage/genetics , Genotype , Polymorphism, Single Nucleotide/genetics , Quantitative Trait Loci/genetics , Triticum/genetics
14.
Genes (Basel) ; 12(1)2020 12 31.
Article in English | MEDLINE | ID: mdl-33396302

ABSTRACT

The study of fish cytogenetics has been impeded by the inability to produce G-bands that could assign chromosomes to their homologous pairs. Thus, the majority of karyotypes published have been estimated based on morphological similarities of chromosomes. The reason why chromosome G-banding does not work in fish remains elusive. However, the recent increase in the number of fish genomes assembled to the chromosome level provides a way to analyse this issue. We have developed a Python tool to visualize and quantify GC percentage (GC%) of both repeats and unique DNA along chromosomes using a non-overlapping sliding window approach. Our tool profiles GC% and simultaneously plots the proportion of repeats (rep%) in a color scale (or vice versa). Hence, it is possible to assess the contribution of repeats to the total GC%. The main differences are the GC% of repeats homogenizing the overall GC% along fish chromosomes and a greater range of GC% scattered along fish chromosomes. This may explain the inability to produce G-banding in fish. We also show an occasional banding pattern along the chromosomes in some fish that probably cannot be detected with traditional qualitative cytogenetic methods.
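
Illustrative note: the non-overlapping sliding window profiling described in this abstract can be sketched as follows. This is a minimal sketch and not the published tool: it assumes repeats are soft-masked (lowercase) in the input sequence, and the function name and toy sequence are hypothetical.

```python
def profile_windows(sequence, window_size):
    """Return (start, GC%, rep%) for each full non-overlapping window.

    GC% is computed over called bases (A, C, G, T) only, so windows made
    entirely of N yield None. rep% is the share of lowercase (assumed
    soft-masked repeat) characters in the window.
    """
    rows = []
    for start in range(0, len(sequence) - window_size + 1, window_size):
        window = sequence[start:start + window_size]
        upper = window.upper()
        called = sum(upper.count(base) for base in "ACGT")
        gc = upper.count("G") + upper.count("C")
        gc_pct = 100.0 * gc / called if called else None
        rep_pct = 100.0 * sum(ch.islower() for ch in window) / window_size
        rows.append((start, gc_pct, rep_pct))
    return rows


# Toy sequence: GC-rich, AT-rich, an assembly gap, then a soft-masked repeat.
toy = "GGCCGGCCAT" + "ATATATATGC" + "NNNNNNNNNN" + "acgtacgtac"
profile = profile_windows(toy, 10)

for start, gc_pct, rep_pct in profile:
    print(start, gc_pct, rep_pct)
```

Real chromosome-level assemblies would use windows of many kilobases; the 10-base window here only keeps the example readable.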


Subject(s)
Base Composition , Chromosome Mapping/methods , Fishes/genetics , Genome , Karyotyping/methods , Software , Animals , Cats , Chromosome Banding , Chromosome Mapping/statistics & numerical data , Fishes/classification , Gorilla gorilla/classification , Gorilla gorilla/genetics , Tandem Repeat Sequences
15.
Ann Neurol ; 87(2): 184-193, 2020 02.
Article in English | MEDLINE | ID: mdl-31788832

ABSTRACT

OBJECTIVE: Restless legs syndrome is a frequent neurological disorder with substantial burden on individual well-being and public health. Genetic risk loci have been identified, but the causative genes at these loci are largely unknown, so that functional investigation and clinical translation of molecular research data are still inhibited. To identify putatively causative genes, we searched for highly significant mutational burden in candidate genes. METHODS: We analyzed 84 candidate genes in 4,649 patients and 4,982 controls by next generation sequencing using molecular inversion probes that targeted mainly coding regions. The burden of low-frequency and rare variants was assessed, and in addition, an algorithm (binomial performance deviation analysis) was established to estimate independently the sequence variation in the probe binding regions from the variation in sequencing depth. RESULTS: Highly significant results (considering the number of genes in the genome) of the conventional burden test and the binomial performance deviation analysis overlapped significantly. Fourteen genes were highly significant by one method and confirmed with Bonferroni-corrected significance by the other to show a differential burden of low-frequency and rare variants in restless legs syndrome. Nine of them (AAGAB, ATP2C1, CNTN4, COL6A6, CRBN, GLO1, NTNG1, STEAP4, VAV3) resided in the vicinity of known restless legs syndrome loci, whereas 5 (BBS7, CADM1, CREB5, NRG3, SUN1) have not previously been associated with restless legs syndrome. Burden test and binomial performance deviation analysis also converged significantly in fine-mapping potentially causative domains within these genes. INTERPRETATION: Differential burden with intragenic low-frequency variants reveals putatively causative genes in restless legs syndrome. ANN NEUROL 2020;87:184-193.


Subject(s)
DNA Mutational Analysis , Genetic Predisposition to Disease/genetics , Restless Legs Syndrome/genetics , Case-Control Studies , Chromosome Mapping/statistics & numerical data , Female , Humans , Male , Middle Aged
16.
Sci Rep ; 9(1): 16855, 2019 11 14.
Article in English | MEDLINE | ID: mdl-31728008

ABSTRACT

Ramie is an important natural fiber crop, and fiber yield and its related traits are the most valuable traits in ramie production. However, the genetic basis for these traits is still poorly understood, which has dramatically hindered breeding for high yield in this fiber crop. Herein, a high-density genetic map with 6,433 markers spanning 2476.5 cM was constructed using a population derived from two parents, cultivated ramie Zhongsizhu 1 (ZSZ1) and its wild progenitor B. nivea var. tenacissima (BNT). Fiber yield (FY) and four related traits, namely stem diameter (SD), stem length (SL), stem bark weight (BW), and stem bark thickness (BT), were subjected to quantitative trait locus (QTL) analysis, which identified a total of 47 QTLs. Forty QTLs were mapped to 12 genomic regions, thus forming 12 QTL clusters. Among the 47 QTLs, there were 14 whose wild allele from BNT was beneficial. Interestingly, all QTLs in Cluster 10 displayed overdominance, indicating that the region of this cluster likely contains heterotic loci. In addition, four fiber yield-related genes that underwent positive selection were found either to fall into the FY-related QTL regions or to lie near the identified QTLs. The dissection of FY and FY-related traits not only improved our understanding of the genetic basis of these traits, but also provided new insights into the domestication of FY in ramie. The identification of many QTLs and the discovery of beneficial alleles from wild species provide a basis for the improvement of yield traits in ramie breeding.


Subject(s)
Boehmeria/genetics , Chromosome Mapping/statistics & numerical data , Crops, Agricultural , Plant Stems/genetics , Quantitative Trait Loci , Quantitative Trait, Heritable , Boehmeria/anatomy & histology , Boehmeria/chemistry , Boehmeria/growth & development , Crosses, Genetic , Dietary Fiber/analysis , Genetic Linkage , Genome, Plant , Humans , Plant Breeding/methods , Plant Stems/anatomy & histology , Plant Stems/chemistry , Plant Stems/growth & development
17.
PLoS One ; 14(5): e0216944, 2019.
Article in English | MEDLINE | ID: mdl-31100083

ABSTRACT

Most viruses are known to spontaneously generate defective viral genomes (DVGs) due to errors during replication. These DVGs are subgenomic and contain deletions that render them unable to complete a full replication cycle in the absence of a co-infecting, non-defective helper virus. DVGs, especially of the copyback type, frequently observed with paramyxoviruses, have been recognized to be important triggers of the antiviral innate immune response. DVGs have therefore gained interest for their potential to alter the attenuation and immunogenicity of vaccines. To investigate this potential, accurate identification and quantification of DVGs is essential. Conventional methods, such as RT-PCR, are labor intensive and will only detect primer sequence-specific species. High throughput sequencing (HTS) is much better suited for this undertaking. Here, we present an HTS-based algorithm called DVG-profiler to identify and quantify all DVG sequences in an HTS data set generated from a virus preparation. DVG-profiler identifies DVG breakpoints relative to a reference genome and reports the directionality of each segment from within the same read. The specificity and sensitivity of the algorithm were assessed using both in silico data sets and HTS data obtained from parainfluenza virus 5, Sendai virus and mumps virus preparations. HTS data from the latter were also compared with conventional RT-PCR data and with data obtained using an alternative algorithm. The data presented here demonstrate the high specificity, sensitivity, and robustness of DVG-profiler. This algorithm was implemented within an open source cloud-based computing environment for analyzing HTS data. DVG-profiler might prove valuable not only in basic virus research but also in monitoring live attenuated vaccines for DVG content and in assuring vaccine lot-to-lot consistency.
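The idea of locating a breakpoint and reporting the directionality of each segment within one read can be sketched as follows. This is a simplified illustration, not the DVG-profiler algorithm, whose details the abstract does not give. It assumes a read has already been split into two aligned segments with 1-based reference coordinates and a strand. The `Segment` type, the `classify_junction` function, and the category labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One aligned piece of a read (hypothetical representation)."""
    ref_start: int  # 1-based, inclusive; always <= ref_end
    ref_end: int    # 1-based, inclusive
    strand: str     # "+" (forward) or "-" (reverse complement)


def classify_junction(first: Segment, second: Segment) -> str:
    """Classify the junction between two consecutive segments of one read.

    Segments are given in read order (5' to 3'). Returns "copyback" when the
    strand flips, "deletion" when both segments share a strand but skip
    reference bases, and "other" for anything else (e.g. contiguous pieces).
    """
    for seg in (first, second):
        if seg.strand not in ("+", "-"):
            raise ValueError(f"unknown strand: {seg.strand!r}")
        if seg.ref_start > seg.ref_end:
            raise ValueError("ref_start must not exceed ref_end")

    # A strand switch inside a single read is the signature of a copyback genome.
    if first.strand != second.strand:
        return "copyback"

    # Same strand: look for a gap in reference coordinates along the read.
    if first.strand == "+" and second.ref_start > first.ref_end + 1:
        return "deletion"
    if first.strand == "-" and second.ref_end < first.ref_start - 1:
        return "deletion"
    return "other"


# Hypothetical split reads against an arbitrary reference.
print(classify_junction(Segment(1, 100, "+"), Segment(501, 600, "+")))  # deletion
print(classify_junction(Segment(1, 100, "+"), Segment(50, 150, "-")))   # copyback
```

A real pipeline would also need to handle alignment ambiguity around the breakpoint, reads with more than two segments, and counting of reads that support each junction; all of that is omitted here.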


Subject(s)
Algorithms , Chromosome Mapping/statistics & numerical data , Defective Viruses/genetics , Genome, Viral , Mumps virus/genetics , Parainfluenza Virus 5/genetics , Sendai virus/genetics , Animals , Chromosome Mapping/methods , DNA Primers/chemical synthesis , DNA Primers/metabolism , Datasets as Topic , Defective Viruses/classification , High-Throughput Nucleotide Sequencing/statistics & numerical data , Humans , Molecular Typing , Mumps virus/classification , Parainfluenza Virus 5/classification , Real-Time Polymerase Chain Reaction , Sendai virus/classification , Sensitivity and Specificity
18.
Reprod Biomed Online ; 39(1): 40-48, 2019 Jul.
Article in English | MEDLINE | ID: mdl-31097322

ABSTRACT

RESEARCH QUESTION: To analyse why unbalanced viable offspring are derived mainly from the 3:1 segregation mode in t(11;22)(q23;q11.2) reciprocal translocation. DESIGN: Retrospective analysis of 24 pre-implantation genetic testing for chromosomal structural re-arrangements (PGT-SR) cycles was performed on seven male and five female carriers of t(11;22) translocation. Sperm analysis was performed on each male carrier. These patients were directed to the study centre after several years of miscarriages and/or abortions, primary infertility for male carriers or birth of an affected child. RESULTS: Twenty-four PGT-SR cycles were performed to exclude imbalances in both male and female carriers. The unbalanced embryos derived from the adjacent-1 segregation mode were the most represented in both male and female carriers (68.4% and 50%, respectively). These results were positively related with meiotic segregation analysis of reciprocal translocation in spermatozoa. A thorough analysis of the unbalanced embryo karyotypes determined that the expected viable +der22 karyotype resulting from 3:1 malsegregation was less represented at 5.3%. CONCLUSIONS: These findings highlight the divergence that may exist between meiotic segregation and post-zygotic selection. Post-zygotic selection would be responsible for the elimination of unbalanced embryos derived from the adjacent-1 segregation mode. The combined action of several factors occurs at the beginning of post-zygotic selection. Genetic counselling must consider the risk of a birth related to the adjacent-1 segregation mode, irrespective of the sex of the translocation carrier. These results will allow deeper understanding of the PGT results of t(11;22) carriers, which often include a high number of aneuploid embryos.


Subject(s)
Chromosomes, Human, Pair 11/genetics , Chromosomes, Human, Pair 22/genetics , Inheritance Patterns/genetics , Preimplantation Diagnosis/methods , Translocation, Genetic , Adult , Chromosome Mapping/methods , Chromosome Mapping/statistics & numerical data , Female , Gene Frequency , Genetic Carrier Screening/methods , Humans , In Situ Hybridization, Fluorescence/methods , In Situ Hybridization, Fluorescence/statistics & numerical data , Karyotyping , Male , Pregnancy , Preimplantation Diagnosis/statistics & numerical data , Retrospective Studies , Semen Analysis/methods , Semen Analysis/statistics & numerical data , Translocation, Genetic/genetics
19.
Nat Commun ; 10(1): 1938, 2019 04 26.
Article in English | MEDLINE | ID: mdl-31028255

ABSTRACT

Chromosome conformation capture techniques, such as Hi-C, are fundamental in characterizing genome organization. These methods have revealed several genomic features, such as chromatin loops, whose disruption can have dramatic effects on gene regulation. Unfortunately, their detection is difficult; current methods require that users choose the resolution of interaction maps based on dataset quality and sequencing depth. Here, we introduce Binless, a resolution-agnostic method that adapts to the quality and quantity of available data, to detect both interactions and differences. Binless relies on an alternate representation of Hi-C data, which leads to a more detailed classification of paired-end reads. Using a large-scale benchmark, we demonstrate that Binless is able to call interactions with higher reproducibility than other existing methods. Binless, which is freely available, can thus reliably be used to identify chromatin loops as well as for differential analysis of chromatin interaction maps.


Subject(s)
Caulobacter crescentus/genetics , Chromatin/chemistry , Chromosome Mapping/methods , Computational Biology/methods , DNA/chemistry , Genome , Benchmarking , Chromosome Mapping/statistics & numerical data , DNA/genetics , Datasets as Topic , High-Throughput Nucleotide Sequencing , Humans , Nucleic Acid Conformation
20.
PLoS Genet ; 15(3): e1007530, 2019 03.
Article in English | MEDLINE | ID: mdl-30875371

ABSTRACT

A common complementary strategy in Genome-Wide Association Studies (GWAS) is to perform Gene Set Analysis (GSA), which tests for the association between one phenotype of interest and an entire set of Single Nucleotide Polymorphisms (SNPs) residing in selected genes. While there exist many tools for performing GSA, popular methods often include a number of ad-hoc steps that are difficult to justify statistically, provide complicated interpretations based on permutation inference, and demonstrate poor operating characteristics. Additionally, the lack of gold standard gene set lists can produce misleading results and create difficulties in comparing analyses even across the same phenotype. We introduce the Generalized Berk-Jones (GBJ) statistic for GSA, a permutation-free parametric framework that offers asymptotic power guarantees in certain set-based testing settings. To adjust for confounding introduced by different gene set lists, we further develop a GBJ step-down inference technique that can discriminate between gene sets driven to significance by single genes and those demonstrating group-level effects. We compare GBJ to popular alternatives through simulation and re-analysis of summary statistics from a large breast cancer GWAS, and we show how GBJ can increase power by incorporating information from multiple signals in the same gene. In addition, we illustrate how breast cancer pathway analysis can be confounded by the frequency of FGFR2 in pathway lists. Our approach is further validated on two other datasets of summary statistics generated from GWAS of height and schizophrenia.


Subject(s)
Genome-Wide Association Study/statistics & numerical data , Body Height/genetics , Breast Neoplasms/genetics , Chromosome Mapping/statistics & numerical data , Computational Biology/methods , Computer Simulation , Databases, Genetic , Female , Gene Regulatory Networks , Humans , Models, Genetic , Models, Statistical , Polymorphism, Single Nucleotide , Receptor, Fibroblast Growth Factor, Type 2/genetics , Schizophrenia/genetics