Results 1 - 9 of 9
1.
Am J Hum Genet; 105(4): 822-835, 2019 Oct 03.
Article in English | MEDLINE | ID: mdl-31585107

ABSTRACT

To analyze family-based whole-genome sequence (WGS) data for complex traits, we developed a rare variant (RV) non-parametric linkage (NPL) analysis method, which has advantages over association methods. The RV-NPL differs from the NPL in that RVs are analyzed, and allele sharing among affected relative-pairs is estimated only for minor alleles. Analyzing families can increase power because causal variants with familial aggregation usually have larger effect sizes than those underlying sporadic diseases. Differing from association analysis, for NPL only affected individuals are analyzed, which can increase power, since unaffected family members can be susceptibility variant carriers. RV-NPL is robust to population substructure and admixture, inclusion of nonpathogenic variants, as well as allelic and locus heterogeneity and can readily be applied outside of coding regions. In contrast to analyzing common variants using NPL, where loci localize to large genomic regions (e.g., >50 Mb), mapped regions are well defined for RV-NPL. Using simulation studies, we demonstrate that RV-NPL is substantially more powerful than applying traditional NPL methods to analyze RVs. The RV-NPL was applied to analyze 107 late-onset Alzheimer disease (LOAD) pedigrees of Caribbean Hispanic and European ancestry with WGS data, and statistically significant linkage (LOD ≥ 3.8) was found with RVs in PSMF1 and PTPN21 which have been shown to be involved in LOAD etiology. Additionally, nominally significant linkage was observed with RVs in ABCA7, ACE, EPHA1, and SORL1, genes that were previously reported to be associated with LOAD. RV-NPL is an ideal method to elucidate the genetic etiology of complex familial diseases.


Subjects
Alzheimer Disease/diagnosis , Alzheimer Disease/genetics , Genetic Linkage , Whole Genome Sequencing , Female , Humans , Male , Pedigree
2.
Am J Hum Genet; 101(1): 115-122, 2017 Jul 06.
Article in English | MEDLINE | ID: mdl-28669402

ABSTRACT

Massively parallel sequencing technologies provide great opportunities for discovering rare susceptibility variants involved in complex disease etiology via large-scale imputation and exome and whole-genome sequence-based association studies. Due to modest effect sizes, large sample sizes of tens to hundreds of thousands of individuals are required for adequately powered studies. Current analytical tools are obsolete when it comes to handling these large datasets. To facilitate the analysis of large-scale sequence-based studies, we developed SEQSpark which implements parallel processing based on Spark to increase the speed and efficiency of performing data quality control, annotation, and association analysis. To demonstrate the versatility and speed of SEQSpark, we analyzed whole-genome sequence data from the UK10K, testing for associations with waist-to-hip ratios. The analysis, which was completed in 1.5 hr, included loading data, annotation, principal component analysis, and single variant and rare variant aggregate association analysis of >9 million variants. For rare variant aggregate analysis, an exome-wide significant association (p < 2.5 × 10⁻⁶) was observed with CCDC62 (SKAT-O [p = 6.89 × 10⁻⁷], combined multivariate collapsing [p = 1.48 × 10⁻⁶], and burden of rare variants [p = 1.48 × 10⁻⁶]). SEQSpark was also used to analyze 50,000 simulated exomes and it required 1.75 hr for the analysis of a quantitative trait using several rare variant aggregate association methods. Additionally, the performance of SEQSpark was compared to Variant Association Tools and PLINK/SEQ. SEQSpark was always faster and in some situations computation was reduced to a hundredth of the time. SEQSpark will empower large sequence-based epidemiological studies to quickly elucidate genetic variation involved in the etiology of complex traits.
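
The exome-wide threshold quoted in this abstract matches a Bonferroni correction of a 0.05 significance level over about 20,000 genes. The abstract does not state how the threshold was derived, so the gene count below is an assumption used only for illustration; the sketch is plain Python and does not use SEQSpark.

```python
# Assumption: a family-wise alpha of 0.05 spread over roughly 20,000 gene-based tests.
# Neither number is given in the abstract; they are the conventional choice that
# reproduces the quoted threshold of 2.5e-6.
ALPHA = 0.05
N_GENES = 20_000  # hypothetical number of genes tested


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


threshold = bonferroni_threshold(ALPHA, N_GENES)

# p values reported in the abstract for CCDC62.
reported = {
    "SKAT-O": 6.89e-7,
    "combined multivariate collapsing": 1.48e-6,
    "burden of rare variants": 1.48e-6,
}

# True where the reported p value falls below the threshold.
significant = {name: p < threshold for name, p in reported.items()}
```

Under these assumptions all three reported tests fall below the threshold, which is consistent with the abstract describing the CCDC62 association as exome-wide significant.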


Subjects
Databases, Nucleic Acid , Exome/genetics , Genetic Variation , Genome-Wide Association Study/methods , Sequence Analysis, DNA/methods , Software , Humans , Principal Component Analysis , Waist-Hip Ratio
3.
Am J Hum Genet; 100(2): 193-204, 2017 Feb 02.
Article in English | MEDLINE | ID: mdl-28065470

ABSTRACT

Whole-genome and exome sequence data can be cost-effectively generated for the detection of rare-variant (RV) associations in families. Causal variants that aggregate in families usually have larger effect sizes than those found in sporadic cases, so family-based designs can be a more powerful approach than population-based designs. Moreover, some family-based designs are robust to confounding due to population admixture or substructure. We developed a RV extension of the generalized disequilibrium test (GDT) to analyze sequence data obtained from nuclear and extended families. The GDT utilizes genotype differences of all discordant relative pairs to assess associations within a family, and the RV extension combines the single-variant GDT statistic over a genomic region of interest. The RV-GDT has increased power by efficiently incorporating information beyond first-degree relatives and allows for the inclusion of covariates. Using simulated genetic data, we demonstrated that the RV-GDT method has well-controlled type I error rates, even when applied to admixed populations and populations with substructure. It is more powerful than existing family-based RV association methods, particularly for the analysis of extended pedigrees and pedigrees with missing data. We analyzed whole-genome sequence data from families affected by Alzheimer disease to illustrate the application of the RV-GDT. Given the capability of the RV-GDT to adequately control for population admixture or substructure and analyze pedigrees with missing genotype data and its superior power over other family-based methods, it is an effective tool for elucidating the involvement of RVs in the etiology of complex traits.


Subjects
Alzheimer Disease/genetics , Genetic Variation , Linkage Disequilibrium , Sequence Analysis, DNA/methods , Aged , Aged, 80 and over , Alzheimer Disease/diagnosis , Axin Protein/genetics , Axin Protein/metabolism , Computer Simulation , Databases, Genetic , Female , Genotype , Haplotypes , Humans , Male , Models, Genetic , Pedigree , Phenotype
4.
Bioinformatics; 35(3): 529-531, 2019 Feb 01.
Article in English | MEDLINE | ID: mdl-30032240

ABSTRACT

Motivation: For the design of genetic studies, it is necessary to perform power calculations. Although for Mendelian traits the power of detecting linkage for pedigree(s) can be determined, it is also of great interest to determine the probability of identifying multiple pedigrees or unrelated cases with variants in the same gene. For many diseases, due to extreme locus heterogeneity this probability can be small. If only one family is observed segregating a variant classified as likely pathogenic or of unknown significance, the gene cannot be implicated in disease etiology. The probability of identifying several disease families or cases is dependent on the gene-specific disease prevalence and the sample size. The observation of multiple disease families or cases with variants in the same gene as well as evidence of pathogenicity from other sources, e.g. in silico prediction, expression and functional studies, can aid in implicating a gene in disease etiology. MendelProb can determine the probability of detecting a minimum number of families or cases with variants in the same gene. It can also calculate the probability of detecting genes with variants in different data types, e.g. identifying a variant in at least one family that can establish linkage and more than two additional families regardless of their size. Additionally, for a specified probability MendelProb can determine the number of probands which need to be screened to detect a minimum number of individuals with variants within the same gene. Results: A single Mendelian disease family is not sufficient to implicate a gene in disease etiology. It is necessary to observe multiple families or cases with potentially pathogenic variants in the same gene. MendelProb, an R library, was developed to determine the probability of observing multiple families and cases with variants within a gene and to also establish the numbers of probands to screen to detect multiple observations of variants within a gene.
Availability and implementation: https://github.com/statgenetics/mendelprob.
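
The two calculations this abstract describes can be illustrated with a simple binomial model. This is a minimal sketch, assuming each screened proband independently carries a variant in the gene with a fixed probability `p`. It is not the MendelProb API (MendelProb is an R library), and the function names `prob_at_least` and `probands_needed` are hypothetical.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p).

    n: number of probands screened
    p: assumed probability that a proband carries a variant in the gene
    k: minimum number of carriers we want to observe
    """
    # Complement of observing fewer than k carriers.
    return 1.0 - sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))


def probands_needed(k: int, p: float, target: float, n_max: int = 1_000_000) -> int:
    """Smallest n with P(X >= k) >= target, found by linear search up to n_max."""
    for n in range(k, n_max + 1):
        if prob_at_least(k, n, p) >= target:
            return n
    raise ValueError("target probability not reached within n_max probands")


# Example with made-up inputs: a gene explaining 1% of cases, 200 probands screened.
p_two_or_more = prob_at_least(2, 200, 0.01)
n_for_80_percent = probands_needed(2, 0.01, 0.80)
```

The linear search is adequate for illustration; for very small `p` a bisection over `n` would be faster, because the probability is monotone in `n`.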


Subjects
Exome , Genetic Linkage , Genomics , Software , Humans , Pedigree , Probability , Sample Size
5.
Nature; 511(7508): 241-5, 2014 Jul 10.
Article in English | MEDLINE | ID: mdl-24896186

ABSTRACT

Intracranial germ cell tumours (IGCTs) are a group of rare heterogeneous brain tumours that are clinically and histologically similar to the more common gonadal GCTs. IGCTs show great variation in their geographical and gender distribution, histological composition and treatment outcomes. The incidence of IGCTs is historically five- to eightfold greater in Japan and other East Asian countries than in Western countries, with peak incidence near the time of puberty. About half of the tumours are located in the pineal region. The male-to-female incidence ratio is approximately 3-4:1 overall, but is even higher for tumours located in the pineal region. Owing to the scarcity of tumour specimens available for research, little is currently known about this rare disease. Here we report the analysis of 62 cases by next-generation sequencing, single nucleotide polymorphism array and expression array. We find the KIT/RAS signalling pathway frequently mutated in more than 50% of IGCTs, including novel recurrent somatic mutations in KIT, its downstream mediators KRAS and NRAS, and its negative regulator CBL. Novel somatic alterations in the AKT/mTOR pathway included copy number gains of the AKT1 locus at 14q32.33 in 19% of patients, with corresponding upregulation of AKT1 expression. We identified loss-of-function mutations in BCORL1, a transcriptional co-repressor and tumour suppressor. We report significant enrichment of novel and rare germline variants in JMJD1C, which codes for a histone demethylase and is a coactivator of the androgen receptor, among Japanese IGCT patients. This study establishes a molecular foundation for understanding the biology of IGCTs and suggests potentially promising therapeutic strategies focusing on the inhibition of KIT/RAS activation and the AKT1/mTOR pathway.


Subjects
Brain Neoplasms/genetics , Germ-Line Mutation/genetics , Mutation/genetics , Neoplasms, Germ Cell and Embryonal/genetics , Adult , Brain Neoplasms/pathology , Child , Female , Humans , Japan , Male , Neoplasms, Germ Cell and Embryonal/pathology , Oncogene Protein v-akt/genetics , Proto-Oncogene Proteins c-kit/genetics , Reproducibility of Results , Signal Transduction/genetics , TOR Serine-Threonine Kinases/genetics , Young Adult , ras Proteins/genetics
6.
Am J Hum Genet; 94(1): 33-46, 2014 Jan 02.
Article in English | MEDLINE | ID: mdl-24360806

ABSTRACT

Many population-based rare-variant (RV) association tests, which aggregate variants across a region, have been developed to analyze sequence data. A drawback of analyzing population-based data is that it is difficult to adequately control for population substructure and admixture, and spurious associations can occur. For RVs, this problem can be substantial, because the spectrum of rare variation can differ greatly between populations. A solution is to analyze parent-child trio data, by using the transmission disequilibrium test (TDT), which is robust to population substructure and admixture. We extended the TDT to test for RV associations using four commonly used methods. We demonstrate that for all RV-TDT methods, using proper analysis strategies, type I error is well-controlled even when there are high levels of population substructure or admixture. For trio data, unlike for population-based data, RV allele-counting association methods will lead to inflated type I errors. However type I errors can be properly controlled by obtaining p values empirically through haplotype permutation. The power of the RV-TDT methods was evaluated and compared to the analysis of case-control data with a number of genetic and disease models. The RV-TDT was also used to analyze exome data from 199 Simons Simplex Collection autism trios and an association was observed with variants in ABCA7. Given the problem of adequately controlling for population substructure and admixture in RV association studies and the growing number of sequence-based trio studies, the RV-TDT is extremely beneficial to elucidate the involvement of RVs in the etiology of complex traits.
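
For orientation, the classic single-variant TDT that this abstract builds on can be sketched as a McNemar-type statistic on transmissions from heterozygous parents. This is only the textbook single-variant test, not the RV-TDT extension itself. The function name `tdt` and the example counts are hypothetical, and the asymptotic p value shown here is not the haplotype-permutation p value the abstract recommends for rare-variant allele-counting methods.

```python
from math import erfc, sqrt


def tdt(transmitted: int, untransmitted: int) -> tuple[float, float]:
    """Classic transmission disequilibrium test for one biallelic variant.

    transmitted:   times heterozygous parents passed the minor allele to an affected child
    untransmitted: times heterozygous parents passed the other allele instead
    Returns (chi-square statistic with 1 df, asymptotic p value).
    """
    informative = transmitted + untransmitted
    if informative == 0:
        raise ValueError("no informative (heterozygous) parental transmissions")
    chi2 = (transmitted - untransmitted) ** 2 / informative
    # Survival function of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(chi2 / 2.0))
    return chi2, p_value


# Example with made-up counts: 30 transmissions versus 10 non-transmissions.
chi2, p_value = tdt(30, 10)
```

Because only transmissions within families are compared, the statistic does not depend on population allele frequencies, which is why the TDT is robust to population substructure and admixture.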


Subjects
Autistic Disorder/genetics , Exome , Genetic Association Studies/methods , Genetic Variation , Linkage Disequilibrium , Alleles , Computer Simulation , Gene Frequency , Genetic Predisposition to Disease , Haplotypes , Humans , Models, Genetic , Phenotype , Sequence Analysis, DNA
8.
Nucleic Acids Res; 39(Web Server issue): W139-44, 2011 Jul.
Article in English | MEDLINE | ID: mdl-21596785

ABSTRACT

MicroRNAs (miRNAs) are critical regulators in the complex cellular networks. The mirAct web server (http://sysbio.ustc.edu.cn/software/mirAct) is a tool designed to investigate miRNA activity based on gene-expression data by using the negative regulation relationship between miRNAs and their target genes. mirAct supports multiple-class data and enables clustering analysis based on computationally determined miRNA activity. Here, we describe the framework of mirAct, demonstrate its performance by comparing with other similar programs and exemplify its applications using case studies.


Subjects
MicroRNAs/metabolism , Software , Breast Neoplasms/genetics , Cluster Analysis , Female , Gene Expression , Humans , Internet
9.
Nat Genet; 47(6): 582-8, 2015 Jun.
Article in English | MEDLINE | ID: mdl-25961944

ABSTRACT

To assess the relative impact of inherited and de novo variants on autism risk, we generated a comprehensive set of exonic single-nucleotide variants (SNVs) and copy number variants (CNVs) from 2,377 families with autism. We find that private, inherited truncating SNVs in conserved genes are enriched in probands (odds ratio = 1.14, P = 0.0002) in comparison to unaffected siblings, an effect involving significant maternal transmission bias to sons. We also observe a bias for inherited CNVs, specifically for small (<100 kb), maternally inherited events (P = 0.01) that are enriched in CHD8 target genes (P = 7.4 × 10⁻³). Using a logistic regression model, we show that private truncating SNVs and rare, inherited CNVs are statistically independent risk factors for autism, with odds ratios of 1.11 (P = 0.0002) and 1.23 (P = 0.01), respectively. This analysis identifies a second class of candidate genes (for example, RIMS1, CUL7 and LZTR1) where transmitted mutations may create a sensitized background but are unlikely to be completely penetrant.


Subjects
Autistic Disorder/genetics , Codon, Nonsense , DNA Copy Number Variations , Exome , Female , Genetic Association Studies , Genetic Predisposition to Disease , Humans , Linkage Disequilibrium , Male , Polymorphism, Single Nucleotide , Risk