1.
Genet Epidemiol ; 44(4): 339-351, 2020 06.
Article in English | MEDLINE | ID: mdl-32100375

ABSTRACT

Testing millions of single nucleotide polymorphisms (SNPs) in genetic association studies has become a standard routine for disease gene discovery. In light of recent re-evaluation of statistical practice, it has been suggested that p-values are unfit as summaries of statistical evidence. Despite this criticism, p-values contain information that can be utilized to address the concerns about their flaws. We present a new method for utilizing evidence summarized by p-values for estimating odds ratio (OR) based on its approximate posterior distribution. In our method, only p-values, sample size, and standard deviation for ln(OR) are needed as summaries of data, accompanied by a suitable prior distribution for ln(OR) that can assume any shape. The parameter of interest, ln(OR), is the only parameter with a specified prior distribution, hence our model is a mix of classical and Bayesian approaches. We show that our method retains the main advantages of the Bayesian approach: it yields direct probability statements about hypotheses for OR and is resistant to biases caused by selection of top-scoring SNPs. Our method enjoys greater flexibility than similarly inspired methods in the assumed distribution for the summary statistic and in the form of the prior for the parameter of interest. We illustrate our method by presenting interval estimates of effect size for reported genetic associations with lung cancer. Although we focus on OR, the method is not limited to this particular measure of effect size and can be used broadly for assessing reliability of findings in studies testing multiple predictors.


Subject(s)
Disease Susceptibility; Models, Genetic; Bayes Theorem; Genetic Loci; Humans; Polymorphism, Single Nucleotide
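
As a rough illustration of the idea in entry 1, the sketch below turns a two-sided p-value, a sample size, and a standard deviation for ln(OR) into an approximate posterior for ln(OR). It assumes a zero-centered normal prior purely to obtain a closed form (the abstract allows a prior of any shape), takes the standard error as sd / sqrt(n), and needs the effect direction supplied separately. The function names and the numbers in the usage note are hypothetical, and this is not the authors' implementation.

```python
from math import exp, sqrt
from statistics import NormalDist


def posterior_log_or(p_value, n, sd, prior_sd, direction=1):
    """Approximate normal posterior (mean, sd) for ln(OR).

    Assumptions (not from the abstract): normal prior N(0, prior_sd^2),
    standard error sd / sqrt(n), and a known effect direction (+1 or -1).
    """
    # Recover |z| from the two-sided p-value; using the lower tail keeps
    # precision for small p-values.
    z = direction * -NormalDist().inv_cdf(p_value / 2.0)
    se = sd / sqrt(n)
    estimate = z * se  # implied point estimate of ln(OR)

    # Conjugate normal update: shrink the estimate toward the prior mean 0.
    shrink = prior_sd ** 2 / (prior_sd ** 2 + se ** 2)
    post_mean = shrink * estimate
    post_sd = sqrt(shrink) * se
    return post_mean, post_sd


def credible_interval_or(post_mean, post_sd, level=0.95):
    """Equal-tailed credible interval, reported on the OR scale."""
    q = NormalDist().inv_cdf(0.5 + level / 2.0)
    return exp(post_mean - q * post_sd), exp(post_mean + q * post_sd)
```

For example, `posterior_log_or(0.05, 1000, 2.0, 0.2)` shrinks the implied estimate toward zero, and `credible_interval_or` converts the result into an interval on the OR scale. A non-normal prior would need numerical integration instead of the closed-form update used here.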
2.
PLoS Comput Biol ; 16(4): e1007819, 2020 04.
Article in English | MEDLINE | ID: mdl-32287273

ABSTRACT

Historically, the majority of statistical association methods have been designed assuming availability of SNP-level information. However, modern genetic and sequencing data present new challenges to access and sharing of genotype-phenotype datasets, including cost of management, difficulties in consolidation of records across research groups, etc. These issues make methods based on SNP-level summary statistics particularly appealing. The most common form of combining statistics is a sum of SNP-level squared scores, possibly weighted, as in burden tests for rare variants. The overall significance of the resulting statistic is evaluated using its distribution under the null hypothesis. Here, we demonstrate that this basic approach can be substantially improved by decorrelating scores prior to their addition, resulting in remarkable power gains in situations that are most commonly encountered in practice; namely, under heterogeneity of effect sizes and diversity between pairwise LD. In these situations, the power of the traditional test, based on the added squared scores, quickly reaches a ceiling, as the number of variants increases. Thus, the traditional approach does not benefit from information potentially contained in any additional SNPs, while our decorrelation by orthogonal transformation (DOT) method yields steady gain in power. We present theoretical and computational analyses of both approaches, and reveal causes behind sometimes dramatic difference in their respective powers. We showcase DOT by analyzing breast cancer and cleft lip data, in which our method strengthened levels of previously reported associations and implied the possibility of multiple new alleles that jointly confer disease risk.


Subject(s)
Computational Biology/methods; Genome-Wide Association Study/methods; Linkage Disequilibrium/genetics; Polymorphism, Single Nucleotide/genetics; Breast Neoplasms/genetics; Cleft Lip/genetics; Female; Genetic Markers/genetics; Genetic Predisposition to Disease/genetics; Humans; Models, Statistical
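
The contrast drawn in entry 2, between summing squared SNP-level scores directly and decorrelating them first, can be sketched as follows. This is a minimal sketch under stated assumptions: the scores are z-statistics whose correlation matrix equals a known LD matrix, and the decorrelation here uses a Cholesky factor, which is a swapped-in whitening step rather than the orthogonal transformation the abstract names. The individual decorrelated values differ between the two choices, but the sum of squares is the same quadratic form z' S^-1 z. All function names and inputs are hypothetical.

```python
from math import sqrt


def cholesky(matrix):
    """Lower-triangular L with L L' = matrix (symmetric positive definite)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def decorrelate(z, ld):
    """Solve L x = z by forward substitution, so x has identity covariance."""
    lower = cholesky(ld)
    x = []
    for i in range(len(z)):
        s = sum(lower[i][k] * x[k] for k in range(i))
        x.append((z[i] - s) / lower[i][i])
    return x


def sum_of_squares(values):
    return sum(v * v for v in values)


# Two hypothetical SNPs in strong LD with effects in opposite directions.
z_scores = [2.0, -1.0]
ld_matrix = [[1.0, 0.8], [0.8, 1.0]]

traditional = sum_of_squares(z_scores)                               # 5.0
decorrelated = sum_of_squares(decorrelate(z_scores, ld_matrix))      # about 22.78
```

Under the null and the assumptions above, the decorrelated sum would be compared against a chi-square distribution with as many degrees of freedom as there are SNPs, whereas the null distribution of the traditional sum depends on the LD structure.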
3.
Genet Epidemiol ; 41(8): 726-743, 2017 12.
Article in English | MEDLINE | ID: mdl-28913944

ABSTRACT

The increasing accessibility of data to researchers makes it possible to conduct massive amounts of statistical testing. Rather than follow specific scientific hypotheses with statistical analysis, researchers can now test many possible relationships and let statistics generate hypotheses for them. The field of genetic epidemiology is an illustrative case, where testing of candidate genetic variants for association with an outcome has been replaced by agnostic screening of the entire genome. Poor replication rates of candidate gene studies have improved dramatically with the increase in genomic coverage, due to factors such as adoption of better statistical practices and availability of larger sample sizes. Here, we suggest that another important factor behind the improved replicability of genome-wide scans is an increase in the amount of statistical testing itself. We show that an increase in the number of tested hypotheses increases the proportion of true associations among the variants with the smallest P-values. We develop statistical theory to quantify how the expected proportion of genuine signals (EPGS) among top hits depends on the number of tests. This enrichment of top hits by real findings holds regardless of whether genome-wide statistical significance has been reached in a study. Moreover, if we consider only those "failed" studies that produce no statistically significant results, the same enrichment phenomenon takes place: the proportion of true associations among top hits grows with the number of tests. The enrichment occurs even if true signals are encountered at a logarithmically decreasing rate as additional tests are performed.


Subject(s)
Models, Genetic; Bayes Theorem; Genome-Wide Association Study; Humans; Models, Statistical
4.
Genet Epidemiol ; 40(3): 210-221, 2016 Apr.
Article in English | MEDLINE | ID: mdl-27027515

ABSTRACT

Recent technological advances equipped researchers with capabilities that go beyond traditional genotyping of loci known to be polymorphic in a general population. Genetic sequences of study participants can now be assessed directly. This capability removed technology-driven bias toward scoring predominantly common polymorphisms and let researchers reveal a wealth of rare and sample-specific variants. Although the relative contributions of rare and common polymorphisms to trait variation are being debated, researchers are faced with the need for new statistical tools for simultaneous evaluation of all variants within a region. Several research groups demonstrated flexibility and good statistical power of the functional linear model approach. In this work we extend previous developments to allow inclusion of multiple traits and adjustment for additional covariates. Our functional approach is unique in that it provides a nuanced depiction of effects and interactions for the variables in the model by representing them as curves varying over a genetic region. We demonstrate flexibility and competitive power of our approach by contrasting its performance with commonly used statistical tools and illustrate its potential for discovery and characterization of genetic architecture of complex traits using sequencing data from the Dallas Heart Study.


Subject(s)
Genetic Association Studies; Linear Models; Phenotype; Black or African American/genetics; Angiopoietin-Like Protein 4; Angiopoietins/genetics; Female; Genotype; Heart; Hispanic or Latino/genetics; Humans; Male; Models, Genetic; Polymorphism, Genetic/genetics; Surveys and Questionnaires; Texas; Triglycerides/blood; White People/genetics
5.
Breast Cancer Res Treat ; 161(2): 333-344, 2017 01.
Article in English | MEDLINE | ID: mdl-27848153

ABSTRACT

PURPOSE: Genome-wide association studies (GWAS) have identified dozens of single-nucleotide polymorphisms (SNPs) associated with breast cancer. Few studies focused on young-onset breast cancer, which exhibits etiologic and tumor-type differences from older-onset disease. Possible confounding by prenatal effects of the maternal genome has also not been considered. METHODS: Using a family-based design for breast cancer before age 50, we assessed the relationship between breast cancer and 77 GWAS-identified breast cancer risk SNPs. We estimated relative risks (RR) for inherited and maternally mediated genetic effects. We also used published RR estimates to calculate genetic risk scores and model joint effects. RESULTS: Seventeen of the candidate SNPs were nominally associated with young-onset breast cancer in our 1296 non-Hispanic white affected families (uncorrected p value <0.05). Top-ranked SNPs included rs3803662-A (TOX3, RR = 1.39; p = 7.0 × 10^-6), rs12662670-G (ESR1, RR = 1.56; p = 5.7 × 10^-4), rs2981579-A (FGFR2, RR = 1.24; p = 0.002), and rs999737-G (RAD51B, RR = 1.37; p = 0.003). No maternally mediated effects were found. A risk score based on all 77 SNPs indicated that their overall relationship to young-onset breast cancer risk was more than additive (additive-fit p = 2.2 × 10^-7) and consistent with a multiplicative joint effect (multiplicative-fit p = 0.27). With the multiplicative formulation, the case sister's genetic risk score exceeded that of her unaffected sister in 59% of families. CONCLUSIONS: The results of this family-based study indicate that no effects of previously identified risk SNPs were explained by prenatal effects of maternal variants. Many of the known breast cancer risk variants were associated with young-onset breast cancer, with evidence that TOX3, ESR1, FGFR2, and RAD51B are important for young-onset disease.


Subject(s)
Breast Neoplasms/epidemiology; Breast Neoplasms/genetics; Genetic Predisposition to Disease; Genome-Wide Association Study; Adult; Age of Onset; Alleles; Ethnicity; Female; Genotype; Humans; Middle Aged; Odds Ratio; Polymorphism, Single Nucleotide; Risk Assessment; Risk Factors; Young Adult
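
The multiplicative genetic risk score mentioned in entry 5 can be illustrated with a short sketch. It assumes a per-allele model in which each copy of a risk allele multiplies risk by that SNP's RR, and it uses only the four top-ranked RR values quoted in the abstract rather than all 77 SNPs. The genotype counts for the two sisters are hypothetical, and the exact scoring used in the study may differ.

```python
from math import exp, log

# RR values quoted in the abstract for the four top-ranked SNPs.
RELATIVE_RISKS = {
    "rs3803662": 1.39,   # TOX3, risk allele A
    "rs12662670": 1.56,  # ESR1, risk allele G
    "rs2981579": 1.24,   # FGFR2, risk allele A
    "rs999737": 1.37,    # RAD51B, risk allele G
}


def multiplicative_risk_score(allele_counts, relative_risks=RELATIVE_RISKS):
    """Product of RR ** (risk-allele count), computed on the log scale.

    Assumes a per-allele multiplicative model (an assumption of this sketch).
    """
    log_score = sum(
        count * log(relative_risks[snp]) for snp, count in allele_counts.items()
    )
    return exp(log_score)


# Hypothetical risk-allele counts (0, 1, or 2 copies) for a sister pair.
case_sister = {"rs3803662": 2, "rs12662670": 1, "rs2981579": 1, "rs999737": 0}
unaffected_sister = {"rs3803662": 1, "rs12662670": 0, "rs2981579": 0, "rs999737": 0}

case_score = multiplicative_risk_score(case_sister)
sister_score = multiplicative_risk_score(unaffected_sister)
case_exceeds = case_score > sister_score
```

Repeating the comparison in `case_exceeds` across families would give the kind of proportion the abstract reports (59% of families), although the figures here are made up for illustration.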
6.
Proc Natl Acad Sci U S A ; 111(16): E1581-90, 2014 Apr 22.
Article in English | MEDLINE | ID: mdl-24711389

ABSTRACT

Identification of genes associated with specific biological phenotypes is a fundamental step toward understanding the molecular basis underlying development and pathogenesis. Although RNAi-based high-throughput screens are routinely used for this task, false discovery and sensitivity remain a challenge. Here we describe a computational framework for systematic integration of published gene expression data to identify genes defining a phenotype of interest. We applied our approach to rank-order all genes based on their likelihood of determining ES cell (ESC) identity. RNAi-mediated loss-of-function experiments on top-ranked genes unearthed many novel determinants of ESC identity, thus validating the derived gene ranks to serve as a rich and valuable resource for those working to uncover novel ESC regulators. Underscoring the value of our gene ranks, functional studies of our top-hit Nucleolin (Ncl), abundant in stem and cancer cells, revealed Ncl's essential role in the maintenance of ESC homeostasis by shielding against differentiation-inducing redox imbalance-induced oxidative stress. Notably, we report a conceptually novel mechanism involving a Nucleolin-dependent Nanog-p53 bistable switch regulating the homeostatic balance between self-renewal and differentiation in ESCs. Our findings connect the dots on a previously unknown regulatory circuitry involving genes associated with traits in both ESCs and cancer and might have profound implications for understanding cell fate decisions in cancer stem cells. The proposed computational framework, by helping to prioritize and preselect candidate genes for tests using complex and expensive genetic screens, provides a powerful yet inexpensive means for identification of key cell identity genes.


Subject(s)
Embryonic Stem Cells/cytology; Embryonic Stem Cells/metabolism; Homeostasis/genetics; Animals; Cell Differentiation/genetics; Cell Proliferation; Gene Expression Regulation; Homeodomain Proteins/metabolism; Mice; Nanog Homeobox Protein; Oxidative Stress/genetics; Phosphoproteins/genetics; Phosphoproteins/metabolism; Pluripotent Stem Cells/cytology; Pluripotent Stem Cells/metabolism; RNA Interference; RNA-Binding Proteins/genetics; RNA-Binding Proteins/metabolism; Reactive Oxygen Species/metabolism; Reproducibility of Results; Transcription, Genetic; Tumor Suppressor Protein p53/metabolism; Nucleolin
7.
Arthritis Rheum ; 64(2): 584-93, 2012 Feb.
Article in English | MEDLINE | ID: mdl-21905019

ABSTRACT

OBJECTIVE: Fibromyalgia (FM) represents a complex disorder that is characterized by widespread pain and tenderness and is frequently accompanied by additional somatic and cognitive/affective symptoms. Genetic risk factors are known to contribute to the etiology of the syndrome. The aim of this study was to examine >350 genes for association with FM, using a large-scale candidate gene approach. METHODS: The study group comprised 496 patients with FM (cases) and 348 individuals with no chronic pain (controls). Genotyping was performed using a dedicated gene array chip, the Pain Research Panel, which assays variants characterizing >350 genes known to be involved in the biologic pathways relevant to nociception, inflammation, and mood. Association testing was performed using logistic regression. RESULTS: Significant differences in allele frequencies between cases and controls were observed for 3 genes: GABRB3 (rs4906902; P = 3.65 × 10^-6), TAAR1 (rs8192619; P = 1.11 × 10^-5), and GBP1 (rs7911; P = 1.06 × 10^-4). These 3 genes and 7 other genes with suggestive evidence for association were examined in a second, independent cohort of patients with FM and control subjects who were genotyped using the Perlegen 600K platform. Evidence of association in the replication cohort was observed for TAAR1, RGS4, CNR1, and GRIA4. CONCLUSION: Variation in these 4 replicated genes may serve as a basis for development of new diagnostic approaches, and the products of these genes may contribute to the pathophysiology of FM and represent potential targets for therapeutic action.


Subject(s)
Fibromyalgia/genetics; Genetic Predisposition to Disease; Polymorphism, Single Nucleotide; Adult; Aged; Alleles; Case-Control Studies; Female; GTP-Binding Proteins/genetics; Gene Frequency; Genetic Association Studies; Genotype; Humans; Middle Aged; Receptors, G-Protein-Coupled/genetics; Receptors, GABA-B/genetics
8.
Nucleic Acids Res ; 39(Database issue): D730-5, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21113022

ABSTRACT

DOMINE is a comprehensive collection of known and predicted domain-domain interactions (DDIs) compiled from 15 different sources. The updated DOMINE includes 2285 new DDIs inferred from experimentally characterized high-resolution three-dimensional structures, and about 3500 novel predictions by five computational approaches published over the last 3 years. These additions bring the total number of unique DDIs in the updated version to 26,219 among 5140 unique Pfam domains, a 23% increase compared to 20,513 unique DDIs among 4346 unique domains in the previous version. The updated version now contains 6634 known DDIs, and features a new classification scheme to assign confidence levels to predicted DDIs. DOMINE will serve as a valuable resource to those studying protein and domain interactions. Most importantly, DOMINE will serve as an excellent reference not only to bench scientists testing for new interactions but also to bioinformaticians seeking to predict novel protein-protein interactions based on the DDIs. The contents of DOMINE are available at http://domine.utdallas.edu.


Subject(s)
Databases, Protein; Protein Interaction Domains and Motifs; Protein Interaction Mapping
9.
Proc Natl Acad Sci U S A ; 107(11): 5148-53, 2010 Mar 16.
Article in English | MEDLINE | ID: mdl-20212137

ABSTRACT

The gene SCN9A is responsible for three human pain disorders. Nonsense mutations cause a complete absence of pain, whereas activating mutations cause severe episodic pain in paroxysmal extreme pain disorder and primary erythermalgia. This led us to investigate whether single nucleotide polymorphisms (SNPs) in SCN9A were associated with differing pain perception in the general population. We first genotyped 27 SCN9A SNPs in 578 individuals with a radiographic diagnosis of osteoarthritis and a pain score assessment. A significant association was found between pain score and SNP rs6746030; the rarer A allele was associated with increased pain scores compared to the commoner G allele (P = 0.016). This SNP was then further genotyped in 195 pain-assessed people with sciatica, 100 amputees with phantom pain, 179 individuals after lumbar discectomy, and 205 individuals with pancreatitis. The combined P value for increased A allele pain was 0.0001 in the five cohorts tested (1277 people in total). The two alleles of the SNP rs6746030 alter the coding sequence of the sodium channel Nav1.7. Each was separately transfected into HEK293 cells and electrophysiologically assessed by patch-clamping. The two alleles showed a difference in the voltage-dependent slow inactivation (P = 0.042) where the A allele would be predicted to increase Nav1.7 activity. Finally, we genotyped 186 healthy females characterized by their responses to a diverse set of noxious stimuli. The A allele of rs6746030 was associated with an altered pain threshold and the effect mediated through C-fiber activation. We conclude that individuals experience differing amounts of pain, per nociceptive stimulus, on the basis of their SCN9A rs6746030 genotype.


Subject(s)
Pain/genetics; Perception; Polymorphism, Single Nucleotide/genetics; Sodium Channels/genetics; Adult; Alleles; Biophysical Phenomena/genetics; Cohort Studies; Female; Genetic Predisposition to Disease; Humans; Mutant Proteins/genetics; NAV1.7 Voltage-Gated Sodium Channel; Pain/physiopathology; Pain Threshold; Regression Analysis
10.
Genet Epidemiol ; 34(7): 725-38, 2010 Nov.
Article in English | MEDLINE | ID: mdl-20976797

ABSTRACT

An appealing genome-wide association study design compares one large control group against several disease samples. A pioneering study by the Wellcome Trust Case Control Consortium that employed such a design has identified multiple susceptibility regions, many of which have been independently replicated. While reusing a control sample provides effective utilization of data, it also creates correlation between association statistics across diseases. An observation of a large association statistic for one of the diseases may greatly increase the chances of observing a spuriously large association for a different disease. Accounting for the correlation is also particularly important when screening for SNPs that might be involved in a set of diseases with overlapping etiology. We describe methods that correct association statistics for dependency due to shared controls, and we describe ways to obtain a measure of overall evidence and to combine association signals across multiple diseases. The methods we describe require no access to individual subject data; instead, they efficiently utilize information contained in P-values for association reported for individual diseases. P-value based combined tests for association are flexible and essentially as powerful as the approach based on aggregating the individual subject data.


Subject(s)
Genome-Wide Association Study/methods; Case-Control Studies; Chi-Square Distribution; Computer Simulation; Databases, Genetic; Genome-Wide Association Study/statistics & numerical data; Humans; Models, Genetic; Molecular Epidemiology; Monte Carlo Method; Polymorphism, Single Nucleotide
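
The dependency described in entry 10 can be made concrete with a small sketch. It assumes two case-control z-statistics built from differences in allele proportions against the same control group, for which the null correlation is 1 / sqrt((1 + n0/n1)(1 + n0/n2)), with n0 shared controls and n1, n2 cases. It then combines the two z-scores with a correlation-adjusted sum. Whether these formulas match the exact corrections in the paper is an assumption; the function names and sample sizes are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def shared_control_correlation(n_cases_a, n_cases_b, n_controls):
    """Null correlation of two z-statistics that reuse one control group."""
    return 1.0 / sqrt(
        (1.0 + n_controls / n_cases_a) * (1.0 + n_controls / n_cases_b)
    )


def p_to_z(p_value, direction=1):
    """Signed z-score from a two-sided P-value and an effect direction."""
    return direction * -NormalDist().inv_cdf(p_value / 2.0)


def combined_z(z_a, z_b, rho):
    """Sum of two z-scores, rescaled by its standard deviation sqrt(2 + 2*rho)."""
    return (z_a + z_b) / sqrt(2.0 + 2.0 * rho)


def two_sided_p(z):
    return 2.0 * NormalDist().cdf(-abs(z))


# Illustrative sizes: 2000 cases per disease, 3000 shared controls.
rho = shared_control_correlation(2000, 2000, 3000)   # 0.4
adjusted = combined_z(2.0, 2.0, rho)                 # accounts for sharing
naive = combined_z(2.0, 2.0, 0.0)                    # treats studies as independent
```

Ignoring the shared controls (`naive`) overstates the combined evidence relative to `adjusted`, which is the kind of inflation the abstract warns about. Only reported P-values, effect directions, and sample sizes are needed as inputs.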