Results 1 - 20 of 26
1.
Blood ; 128(8): 1121-8, 2016 Aug 25.
Article in English | MEDLINE | ID: mdl-27365426

ABSTRACT

We conducted a genome-wide association study (GWAS) to identify novel predisposition alleles associated with Philadelphia chromosome-negative myeloproliferative neoplasms (MPNs) and JAK2 V617F clonal hematopoiesis in the general population. We recruited a web-based cohort of 726 individuals with polycythemia vera, essential thrombocythemia, and myelofibrosis and 252 637 population controls unselected for hematologic phenotypes. Using a single-nucleotide polymorphism (SNP) array platform with custom probes for the JAK2 V617F mutation (V617F), we identified 497 individuals (0.2%) among the population controls who were V617F carriers. We performed a combined GWAS of the MPN cases plus V617F carriers in the control population (n = 1223) vs the remaining controls who were noncarriers for V617F (n = 252 140). For these MPN cases plus V617F carriers, we replicated the germ line JAK2 46/1 haplotype (rs59384377: odds ratio [OR] = 2.4, P = 6.6 × 10^-89), previously associated with V617F-positive MPN. We also identified genome-wide significant associations in the TERT gene (rs7705526: OR = 1.8, P = 1.1 × 10^-32), in SH2B3 (rs7310615: OR = 1.4, P = 3.1 × 10^-14), and upstream of TET2 (rs1548483: OR = 2.0, P = 2.0 × 10^-9). These associations were confirmed in a separate replication cohort of 446 V617F carriers vs 169 021 noncarriers. In a joint analysis of the combined GWAS and replication results, we identified additional genome-wide significant predisposition alleles associated with CHEK2, ATM, PINT, and GFI1B. All SNP ORs were similar for MPN patients and controls who were V617F carriers. These data indicate that the same germ line variants endow individuals with a predisposition not only to MPN, but also to JAK2 V617F clonal hematopoiesis, a more common phenomenon that may foreshadow the development of an overt neoplasm.


Subjects
Genetic Predisposition to Disease; Germ Cells/metabolism; Hematopoiesis/genetics; Janus Kinase 2/genetics; Mutation/genetics; Myeloproliferative Disorders/genetics; Adolescent; Adult; Aged; Aged, 80 and over; Alleles; Case-Control Studies; Child; Child, Preschool; Cohort Studies; Demography; Female; Genome-Wide Association Study; Humans; Infant; Male; Middle Aged; Polymorphism, Single Nucleotide/genetics; Reproducibility of Results; Young Adult
2.
Hum Mol Genet ; 24(9): 2700-8, 2015 May 01.
Article in English | MEDLINE | ID: mdl-25628336

ABSTRACT

Roughly one in three individuals is highly susceptible to motion sickness and yet the underlying causes of this condition are not well understood. Despite high heritability, no associated genetic factors have been discovered. Here, we conducted the first genome-wide association study on motion sickness in 80 494 individuals from the 23andMe database who were surveyed about car sickness. Thirty-five single-nucleotide polymorphisms (SNPs) were associated with motion sickness at a genome-wide-significant level (P < 5 × 10^-8). Many of these SNPs are near genes involved in balance, and eye, ear and cranial development (e.g. PVRL3, TSHZ1, MUTED, HOXB3, HOXD3). Other SNPs may affect motion sickness through nearby genes with roles in the nervous system, glucose homeostasis or hypoxia. We show that several of these SNPs display sex-specific effects, with up to three times stronger effects in women. We searched for comorbid phenotypes with motion sickness, confirming associations with known comorbidities including migraines, postoperative nausea and vomiting (PONV), vertigo and morning sickness and observing new associations with altitude sickness and many gastrointestinal conditions. We also show that two of these related phenotypes (PONV and migraines) share underlying genetic factors with motion sickness. These results point to the importance of the nervous system in motion sickness and suggest a role for glucose levels in motion-induced nausea and vomiting, a finding that may provide insight into other nausea-related phenotypes like PONV. They also highlight personal characteristics (e.g. being a poor sleeper) that correlate with motion sickness, findings that could help identify risk factors or treatments.


Subjects
Ear, Inner/embryology; Ear, Inner/physiopathology; Genetic Variation; Glucose/metabolism; Homeostasis; Motion Sickness/etiology; Motion Sickness/physiopathology; Adult; Aged; Alleles; Female; Genetic Association Studies; Genome-Wide Association Study; Genotype; Humans; Male; Middle Aged; Phenotype; Polymorphism, Single Nucleotide; Sex Factors; Young Adult
3.
Nat Genet ; 46(9): 989-93, 2014 Sep.
Article in English | MEDLINE | ID: mdl-25064009

ABSTRACT

We conducted a meta-analysis of Parkinson's disease genome-wide association studies using a common set of 7,893,274 variants across 13,708 cases and 95,282 controls. Twenty-six loci were identified as having genome-wide significant association; these and 6 additional previously reported loci were then tested in an independent set of 5,353 cases and 5,551 controls. Of the 32 tested SNPs, 24 replicated, including 6 newly identified loci. Conditional analyses within loci showed that four loci, including GBA, GAK-DGKQ, SNCA and the HLA region, contain a secondary independent risk variant. In total, we identified and replicated 28 independent risk variants for Parkinson's disease across 24 loci. Although the effect of each individual locus was small, risk profile analysis showed substantial cumulative risk in a comparison of the highest and lowest quintiles of genetic risk (odds ratio (OR) = 3.31, 95% confidence interval (CI) = 2.55-4.30; P = 2 × 10^-16). We also show six risk loci associated with proximal gene expression or DNA methylation.


Subjects
Genetic Loci; Parkinson Disease/genetics; Case-Control Studies; Genetic Predisposition to Disease; Genome-Wide Association Study/methods; Genotype; Humans; Polymorphism, Single Nucleotide; Risk Factors
4.
Nat Genet ; 45(8): 907-11, 2013 Aug.
Article in English | MEDLINE | ID: mdl-23817569

ABSTRACT

Allergic disease is very common and carries substantial public-health burdens. We conducted a meta-analysis of genome-wide associations with self-reported cat, dust-mite and pollen allergies in 53,862 individuals. We used generalized estimating equations to model shared and allergy-specific genetic effects. We identified 16 shared susceptibility loci with association P < 5 × 10^-8, including 8 loci previously associated with asthma, as well as 4p14 near TLR1, TLR6 and TLR10 (rs2101521, P = 5.3 × 10^-21); 6p21.33 near HLA-C and MICA (rs9266772, P = 3.2 × 10^-12); 5p13.1 near PTGER4 (rs7720838, P = 8.2 × 10^-11); 2q33.1 in PLCL1 (rs10497813, P = 6.1 × 10^-10); 3q28 in LPP (rs9860547, P = 1.2 × 10^-9); 20q13.2 in NFATC2 (rs6021270, P = 6.9 × 10^-9); 4q27 in ADAD1 (rs17388568, P = 3.9 × 10^-8); and 14q21.1 near FOXA1 and TTC6 (rs1998359, P = 4.8 × 10^-8). We identified one locus with substantial evidence of differences in effects across allergies at 6p21.32 in the class II human leukocyte antigen (HLA) region (rs17533090, P = 1.7 × 10^-12), which was strongly associated with cat allergy. Our study sheds new light on the shared etiology of immune and autoimmune disease.


Subjects
Genetic Loci; Genetic Predisposition to Disease; Genome-Wide Association Study; Hypersensitivity/genetics; Adult; Aged; Allergens/immunology; Animals; Cats; Cohort Studies; Female; Humans; Hypersensitivity/immunology; Male; Middle Aged; Polymorphism, Single Nucleotide; Pyroglyphidae/immunology; Self Report; Young Adult
5.
Bioinformatics ; 29(13): i18-26, 2013 Jul 01.
Article in English | MEDLINE | ID: mdl-23812982

ABSTRACT

MOTIVATION: Advances in high-resolution microscopy have recently made possible the analysis of gene expression at the level of individual cells. The fixed lineage of cells in the adult worm Caenorhabditis elegans makes this organism an ideal model for studying complex biological processes like development and aging. However, annotating individual cells in images of adult C. elegans typically requires expertise and significant manual effort. Automation of this task is therefore critical to enabling high-resolution studies of a large number of genes. RESULTS: In this article, we describe an automated method for annotating a subset of 154 cells (including various muscle, intestinal and hypodermal cells) in high-resolution images of adult C. elegans. We formulate the task of labeling cells within an image as a combinatorial optimization problem, where the goal is to minimize a scoring function that compares cells in a test input image with cells from a training atlas of manually annotated worms according to various spatial and morphological characteristics. We propose an approach for solving this problem based on reduction to minimum-cost maximum-flow and apply a cross-entropy-based learning algorithm to tune the weights of our scoring function. We achieve 84% median accuracy across a set of 154 cell labels in this highly variable system. These results demonstrate the feasibility of the automatic annotation of microscopy-based images in adult C. elegans.


Subjects
Caenorhabditis elegans/cytology; Gene Expression Profiling; Imaging, Three-Dimensional/methods; Algorithms; Animals; Caenorhabditis elegans/genetics; Caenorhabditis elegans/metabolism; Cell Division; Cell Lineage; Microscopy, Confocal
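
Illustrative note on entry 5: the abstract describes labeling cells by reduction to minimum-cost maximum-flow. The sketch below shows that general technique on a toy problem, not the authors' implementation. It assumes a purely hypothetical cost matrix in which `cost[i][j]` is the dissimilarity between observed cell `i` and atlas label `j`; the function name and the numbers are invented for illustration, and the published scoring function is richer than a single pairwise cost.

```python
from collections import deque


def min_cost_assignment(cost):
    """Match each row (observed cell) to a distinct column (atlas label)
    at minimum total cost, using successive shortest augmenting paths
    on a unit-capacity flow network."""
    n, m = len(cost), len(cost[0])
    source, sink = n + m, n + m + 1
    graph = [[] for _ in range(n + m + 2)]

    def add_edge(u, v, cap, w):
        # Each edge is [target, remaining capacity, cost, index of reverse edge].
        graph[u].append([v, cap, w, len(graph[v])])
        graph[v].append([u, 0, -w, len(graph[u]) - 1])

    for i in range(n):
        add_edge(source, i, 1, 0.0)
        for j in range(m):
            add_edge(i, n + j, 1, cost[i][j])
    for j in range(m):
        add_edge(n + j, sink, 1, 0.0)

    total = 0.0
    while True:
        # Queue-based Bellman-Ford copes with negative-cost residual edges.
        dist = [float("inf")] * len(graph)
        prev = [None] * len(graph)
        in_queue = [False] * len(graph)
        dist[source] = 0.0
        queue = deque([source])
        while queue:
            u = queue.popleft()
            in_queue[u] = False
            for k, (v, cap, w, _) in enumerate(graph[u]):
                if cap > 0 and dist[u] + w < dist[v] - 1e-12:
                    dist[v] = dist[u] + w
                    prev[v] = (u, k)
                    if not in_queue[v]:
                        in_queue[v] = True
                        queue.append(v)
        if prev[sink] is None:
            break  # no augmenting path left: the flow is maximal
        # Push one unit of flow back along the cheapest path.
        v = sink
        while v != source:
            u, k = prev[v]
            graph[u][k][1] -= 1
            graph[v][graph[u][k][3]][1] += 1
            v = u
        total += dist[sink]

    # A saturated row-to-column edge means that pairing was chosen.
    assignment = {}
    for i in range(n):
        for v, cap, _, _ in graph[i]:
            if n <= v < n + m and cap == 0:
                assignment[i] = v - n
    return assignment, total


# Hypothetical 3x3 dissimilarity matrix (rows: cells, columns: labels).
toy_cost = [[4.0, 1.0, 3.0],
            [2.0, 0.0, 5.0],
            [3.0, 2.0, 2.0]]
labels, total_cost = min_cost_assignment(toy_cost)
print(labels, total_cost)
```

Unit capacities force every cell and every label to be used at most once, so the cheapest maximum flow corresponds to a one-to-one labeling.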
6.
PLoS Med ; 10(6): e1001462, 2013.
Article in English | MEDLINE | ID: mdl-23750121

ABSTRACT

BACKGROUND: Although levels of iron are known to be increased in the brains of patients with Parkinson disease (PD), epidemiological evidence on a possible effect of iron blood levels on PD risk is inconclusive, with effects reported in opposite directions. Epidemiological studies suffer from problems of confounding and reverse causation, and Mendelian randomization (MR) represents an alternative approach to provide unconfounded estimates of the effects of biomarkers on disease. We performed an MR study where genes known to modify iron levels were used as instruments to estimate the effect of iron on PD risk, based on estimates of the genetic effects on both iron and PD obtained from the largest sample meta-analyzed to date. METHODS AND FINDINGS: We used as instrumental variables three genetic variants influencing iron levels, HFE rs1800562, HFE rs1799945, and TMPRSS6 rs855791. Estimates of their effect on serum iron were based on a recent genome-wide meta-analysis of 21,567 individuals, while estimates of their effect on PD risk were obtained through meta-analysis of genome-wide and candidate gene studies with 20,809 PD cases and 88,892 controls. Separate MR estimates of the effect of iron on PD were obtained for each variant and pooled by meta-analysis. We investigated heterogeneity across the three estimates as an indication of possible pleiotropy and found no evidence of it. The combined MR estimate showed a statistically significant protective effect of iron, with a relative risk reduction for PD of 3% (95% CI 1%-6%; p = 0.001) per 10 µg/dl increase in serum iron. CONCLUSIONS: Our study suggests that increased iron levels are causally associated with a decreased risk of developing PD. Further studies are needed to understand the pathophysiological mechanism of action of serum iron on PD risk before recommendations can be made.


Subjects
Genetic Predisposition to Disease; Iron/blood; Mendelian Randomization Analysis; Parkinson Disease/blood; Parkinson Disease/genetics; Genetic Association Studies; Humans; Risk Factors
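
Illustrative note on entry 6: the abstract describes obtaining one MR estimate per variant, pooling them by meta-analysis, and checking heterogeneity. A common way to do this, assumed here rather than taken from the paper, is a Wald ratio per variant followed by fixed-effect inverse-variance weighting and Cochran's Q. All function names and input numbers below are hypothetical placeholders, not values from the study.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Per-variant causal estimate (outcome effect per unit of exposure)
    with a first-order standard error that ignores exposure uncertainty."""
    estimate = beta_outcome / beta_exposure
    se = se_outcome / abs(beta_exposure)
    return estimate, se


def ivw_pool(estimates):
    """Fixed-effect inverse-variance-weighted pooling of (estimate, se) pairs."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    pooled = sum(w * est for w, (est, _) in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se


def cochran_q(estimates, pooled):
    """Heterogeneity statistic; large values relative to a chi-square with
    (k - 1) degrees of freedom hint at pleiotropy among k instruments."""
    return sum(((est - pooled) / se) ** 2 for est, se in estimates)


# Hypothetical inputs: (effect on exposure, log-odds effect on outcome, SE of the latter).
variants = {
    "variant_A": (2.0, -0.10, 0.04),
    "variant_B": (1.0, -0.03, 0.03),
    "variant_C": (0.5, -0.02, 0.02),
}
per_variant = [wald_ratio(*v) for v in variants.values()]
pooled, pooled_se = ivw_pool(per_variant)
q_stat = cochran_q(per_variant, pooled)

# Exponentiating the pooled log-odds gives an odds ratio per unit of exposure.
print(math.exp(pooled), pooled_se, q_stat)
```

Weighting by inverse variance lets the most precisely estimated instrument dominate the pooled estimate, which is why a heterogeneity check across instruments is a useful companion to the pooled number.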
7.
PLoS Genet ; 9(2): e1003299, 2013.
Article in English | MEDLINE | ID: mdl-23468642

ABSTRACT

Myopia, or nearsightedness, is the most common eye disorder, resulting primarily from excess elongation of the eye. The etiology of myopia, although known to be complex, is poorly understood. Here we report the largest ever genome-wide association study (45,771 participants) on myopia in Europeans. We performed a survival analysis on age of myopia onset and identified 22 significant associations ([Formula: see text]), two of which are replications of earlier associations with refractive error. Ten of the 20 novel associations identified replicate in a separate cohort of 8,323 participants who reported if they had developed myopia before age 10. These 22 associations in total explain 2.9% of the variance in myopia age of onset and point toward a number of different mechanisms behind the development of myopia. One association is in the gene PRSS56, which has previously been linked to abnormally small eyes; one is in a gene that forms part of the extracellular matrix (LAMA2); two are in or near genes involved in the regeneration of 11-cis-retinal (RGR and RDH5); two are near genes known to be involved in the growth and guidance of retinal ganglion cells (ZIC2, SFRP1); and five are in or near genes involved in neuronal signaling or development. These novel findings point toward multiple genetic factors involved in the development of myopia and suggest that complex interactions between extracellular matrix remodeling, neuronal development, and visual signals from the retina may underlie the development of myopia in humans.


Subjects
Extracellular Matrix; Eye; Genome-Wide Association Study; Myopia; Extracellular Matrix/genetics; Extracellular Matrix/metabolism; Extracellular Matrix/pathology; Eye/metabolism; Eye/physiopathology; Genetic Predisposition to Disease; Humans; Myopia/genetics; Myopia/pathology; Neurons/metabolism; Neurons/pathology; Refractive Errors/genetics; Refractive Errors/metabolism; Refractive Errors/pathology; Retina/metabolism; Retina/pathology; Retinal Ganglion Cells/metabolism; Serine Proteases/genetics
8.
PLoS Genet ; 8(10): e1002973, 2012.
Article in English | MEDLINE | ID: mdl-23071447

ABSTRACT

The clinical utility of family history and genetic tests is generally well understood for simple Mendelian disorders and rare subforms of complex diseases that are directly attributable to highly penetrant genetic variants. However, little is presently known regarding the performance of these methods in situations where disease susceptibility depends on the cumulative contribution of multiple genetic factors of moderate or low penetrance. Using quantitative genetic theory, we develop a model for studying the predictive ability of family history and single nucleotide polymorphism (SNP)-based methods for assessing risk of polygenic disorders. We show that family history is most useful for highly common, heritable conditions (e.g., coronary artery disease), where it explains roughly 20%-30% of disease heritability, on par with the most successful SNP models based on associations discovered to date. In contrast, we find that for diseases of moderate or low frequency (e.g., Crohn disease) family history accounts for less than 4% of disease heritability, substantially lagging behind SNPs in almost all cases. These results indicate that, for a broad range of diseases, already identified SNP associations may be better predictors of risk than their family history-based counterparts, despite the large fraction of missing heritability that remains to be explained. Our model illustrates the difficulty of using either family history or SNPs for standalone disease prediction. On the other hand, we show that, unlike family history, SNP-based tests can reveal extreme likelihood ratios for a relatively large percentage of individuals, thus providing potentially valuable adjunctive evidence in a differential diagnosis.


Subjects
Genetic Predisposition to Disease; Multifactorial Inheritance; Polymorphism, Single Nucleotide; Family; Humans; Likelihood Functions; Models, Genetic; Pedigree; ROC Curve; Risk
9.
BMC Med Genet ; 13: 53, 2012 Jun 30.
Article in English | MEDLINE | ID: mdl-22747683

ABSTRACT

BACKGROUND: While some factors of breast morphology, such as density, are directly implicated in breast cancer, the relationship between breast size and cancer is less clear. Breast size is moderately heritable, yet the genetic variants leading to differences in breast size have not been identified. METHODS: To investigate the genetic factors underlying breast size, we conducted a genome-wide association study (GWAS) of self-reported bra cup size, controlling for age, genetic ancestry, breast surgeries, pregnancy history and bra band size, in a cohort of 16,175 women of European ancestry. RESULTS: We identified seven single-nucleotide polymorphisms (SNPs) significantly associated with breast size (p < 5 × 10^-8): rs7816345 near ZNF703, rs4849887 and (independently) rs17625845 flanking INHBB, rs12173570 near ESR1, rs7089814 in ZNF365, rs12371778 near PTHLH, and rs62314947 near AREG. Two of these seven SNPs are in linkage disequilibrium (LD) with SNPs associated with breast cancer (those near ESR1 and PTHLH), and a third (ZNF365) is near, but not in LD with, a breast cancer SNP. The other three loci (ZNF703, INHBB, and AREG) have strong links to breast cancer, estrogen regulation, and breast development. CONCLUSIONS: These results provide insight into the genetic factors underlying normal breast development and show that some of these factors are shared with breast cancer. While these results do not directly support any possible epidemiological relationships between breast size and cancer, this study may contribute to a better understanding of the subtle interactions between breast morphology and breast cancer risk.


Subjects
Breast Neoplasms/genetics; Breast/anatomy & histology; Breast/metabolism; Genetic Predisposition to Disease/genetics; Genome-Wide Association Study; Polymorphism, Single Nucleotide; Breast/growth & development; Breast/pathology; Female; Humans; Middle Aged; Organ Size/genetics; Pregnancy
10.
PLoS One ; 7(4): e34442, 2012.
Article in English | MEDLINE | ID: mdl-22493691

ABSTRACT

Hypothyroidism is the most common thyroid disorder, affecting about 5% of the general population. Here we present the current largest genome-wide association study of hypothyroidism, in 3,736 cases and 35,546 controls. Hypothyroidism was assessed via web-based questionnaires. We identify five genome-wide significant associations, three of which are well known to be involved in a large spectrum of autoimmune diseases: rs6679677 near PTPN22, rs3184504 in SH2B3, and rs2517532 in the HLA class I region (p-values 2.8 × 10^-13, 2.6 × 10^-12, and 1.3 × 10^-8, respectively). We also report associations with rs4915077 near VAV3 (p-value 7.5 × 10^-10) and rs925489 near FOXE1 (p-value 2.4 × 10^-19). VAV3 is involved in immune function, and FOXE1 and PTPN22 have previously been associated with hypothyroidism. Although the HLA class I region and SH2B3 have previously been linked with a number of autoimmune diseases, this is the first report of their association with thyroid disease. The VAV3 association is also novel. We also show suggestive evidence of association for hypothyroidism with a SNP in the HLA class II region (independent of the other HLA association) as well as SNPs in CAPZB, PDE8B, and CTLA4. CAPZB and PDE8B have been linked to TSH levels and CTLA4 to a variety of autoimmune diseases. These results suggest heterogeneity in the genetic etiology of hypothyroidism, implicating genes involved in both autoimmune disorders and thyroid function. Using a genetic risk profile score based on the top association from each of the five genome-wide significant regions in our study, the relative risk between the highest and lowest deciles of genetic risk is 2.0.


Subjects
Autoimmune Diseases/genetics; Forkhead Transcription Factors/genetics; Histocompatibility Antigens Class I/genetics; Hypothyroidism/genetics; Protein Tyrosine Phosphatase, Non-Receptor Type 22/genetics; Proteins/genetics; Proto-Oncogene Proteins c-vav/genetics; Adaptor Proteins, Signal Transducing; Adult; Aged; Autoimmune Diseases/complications; California; Case-Control Studies; Female; Genetic Loci; Genome, Human; Genome-Wide Association Study; Humans; Hypothyroidism/complications; Intracellular Signaling Peptides and Proteins; Male; Middle Aged; Polymorphism, Single Nucleotide; Risk; Surveys and Questionnaires
11.
PLoS Genet ; 8(3): e1002548, 2012.
Article in English | MEDLINE | ID: mdl-22438815

ABSTRACT

More than 800 published genetic association studies have implicated dozens of potential risk loci in Parkinson's disease (PD). To facilitate the interpretation of these findings, we have created a dedicated online resource, PDGene, that comprehensively collects and meta-analyzes all published studies in the field. A systematic literature screen of ~27,000 articles yielded 828 eligible articles from which relevant data were extracted. In addition, individual-level data from three publicly available genome-wide association studies (GWAS) were obtained and subjected to genotype imputation and analysis. Overall, we performed meta-analyses on more than seven million polymorphisms originating either from GWAS datasets and/or from smaller scale PD association studies. Meta-analyses on 147 SNPs were supplemented by unpublished GWAS data from up to 16,452 PD cases and 48,810 controls. Eleven loci showed genome-wide significant (P < 5 × 10^-8) association with disease risk: BST1, CCDC62/HIP1R, DGKQ/GAK, GBA, LRRK2, MAPT, MCCC1/LAMP3, PARK16, SNCA, STK39, and SYT11/RAB25. In addition, we identified novel evidence for genome-wide significant association with a polymorphism in ITGA8 (rs7077361, OR 0.88, P = 1.3 × 10^-8). All meta-analysis results are freely available on a dedicated online database (www.pdgene.org), which is cross-linked with a customized track on the UCSC Genome Browser. Our study provides an exhaustive and up-to-date summary of the status of PD genetics research that can be readily scaled to include the results of future large-scale genetics projects, including next-generation sequencing studies.


Subjects
Databases, Genetic; Genome-Wide Association Study; Parkinson Disease/genetics; Genome, Human; Humans; Internet; Polymorphism, Single Nucleotide
12.
PLoS One ; 6(8): e23473, 2011.
Article in English | MEDLINE | ID: mdl-21858135

ABSTRACT

While the cost and speed of generating genomic data have come down dramatically in recent years, the slow pace of collecting medical data for large cohorts continues to hamper genetic research. Here we evaluate a novel online framework for obtaining large amounts of medical information from a recontactable cohort by assessing our ability to replicate genetic associations using these data. Using web-based questionnaires, we gathered self-reported data on 50 medical phenotypes from a generally unselected cohort of over 20,000 genotyped individuals. Of a list of genetic associations curated by NHGRI, we successfully replicated about 75% of the associations that we expected to (based on the number of cases in our cohort and reported odds ratios, and excluding a set of associations with contradictory published evidence). Altogether we replicated over 180 previously reported associations, including many for type 2 diabetes, prostate cancer, cholesterol levels, and multiple sclerosis. We found significant variation across categories of conditions in the percentage of expected associations that we were able to replicate, which may reflect systematic inflation of the effects in some initial reports, or differences across diseases in the likelihood of misdiagnosis or misreport. We also demonstrated that we could improve replication success by taking advantage of our recontactable cohort, offering more in-depth questions to refine self-reported diagnoses. Our data suggest that online collection of self-reported data from a recontactable cohort may be a viable method for both broad and deep phenotyping in large populations.


Subjects
Genetic Association Studies/methods; Genome, Human/genetics; Genome-Wide Association Study/methods; Surveys and Questionnaires; Adult; Aged; Cohort Studies; Female; Genotype; Humans; Logistic Models; Male; Middle Aged; Odds Ratio; Polymorphism, Single Nucleotide/genetics; Young Adult
13.
PLoS Genet ; 7(6): e1002141, 2011 Jun.
Article in English | MEDLINE | ID: mdl-21738487

ABSTRACT

Although the causes of Parkinson's disease (PD) are thought to be primarily environmental, recent studies suggest that a number of genes influence susceptibility. Using targeted case recruitment and online survey instruments, we conducted the largest case-control genome-wide association study (GWAS) of PD based on a single collection of individuals to date (3,426 cases and 29,624 controls). We discovered two novel, genome-wide significant associations with PD: rs6812193 near SCARB2 (p = 7.6 × 10^-10, OR = 0.84) and rs11868035 near SREBF1/RAI1 (p = 5.6 × 10^-8, OR = 0.85), both replicated in an independent cohort. We also replicated 20 previously discovered genetic associations (including LRRK2, GBA, SNCA, MAPT, GAK, and the HLA region), providing support for our novel study design. Relying on a recently proposed method based on genome-wide sharing estimates between distantly related individuals, we estimated the heritability of PD to be at least 0.27. Finally, using sparse regression techniques, we constructed predictive models that account for 6%-7% of the total variance in liability and that suggest the presence of true associations just beyond genome-wide significance, as confirmed through both internal and external cross-validation. These results indicate a substantial, but by no means total, contribution of genetics underlying susceptibility to both early-onset and late-onset PD, suggesting that, despite the novel associations discovered here and elsewhere, the majority of the genetic component for Parkinson's disease remains to be discovered.


Subjects
Genetic Loci/genetics; Genome-Wide Association Study; Internet; Parkinson Disease/genetics; Databases, Factual; Genetic Predisposition to Disease; Heredity/genetics; Humans; Polymorphism, Single Nucleotide/genetics; Risk Assessment
14.
J Comput Biol ; 16(8): 1001-22, 2009 Aug.
Article in English | MEDLINE | ID: mdl-19645599

ABSTRACT

We developed Graemlin 2.0, a new multiple network aligner with (1) a new multi-stage approach to local network alignment; (2) a novel scoring function that can use arbitrary features of a multiple network alignment, such as protein deletions, protein duplications, protein mutations, and interaction losses; (3) a parameter learning algorithm that uses a training set of known network alignments to learn parameters for our scoring function and thereby adapt it to any set of networks; and (4) an algorithm that uses our scoring function to find approximate multiple network alignments in linear time. We tested Graemlin 2.0's accuracy on protein interaction networks from IntAct, DIP, and the Stanford Network Database. We show that, on each of these datasets, Graemlin 2.0 has higher sensitivity and specificity than existing network aligners. Graemlin 2.0 is available under the GNU public license at http://graemlin.stanford.edu.


Subjects
Algorithms; Protein Interaction Mapping/methods; Proteomics/methods; Artificial Intelligence; Databases, Protein
15.
Bioinformatics ; 25(12): i21-9, 2009 Jun 15.
Article in English | MEDLINE | ID: mdl-19477990

ABSTRACT

MOTIVATION: Genome-wide association studies are commonly used to identify possible associations between genetic variations and diseases. These studies mainly focus on identifying individual single nucleotide polymorphisms (SNPs) potentially linked with one disease of interest. In this work, we introduce a novel methodology that identifies similarities between diseases using information from a large number of SNPs. We separate the diseases for which we have individual genotype data into one reference disease and several query diseases. We train a classifier that distinguishes between individuals that have the reference disease and a set of control individuals. This classifier is then used to classify the individuals that have the query diseases. We can then rank query diseases according to the average classification of the individuals in each disease set, and identify which of the query diseases are more similar to the reference disease. We repeat these classification and comparison steps so that each disease is used once as reference disease. RESULTS: We apply this approach using a decision tree classifier to the genotype data of seven common diseases and two shared control sets provided by the Wellcome Trust Case Control Consortium. We show that this approach identifies the known genetic similarity between type 1 diabetes and rheumatoid arthritis, and identifies a new putative similarity between bipolar disease and hypertension.


Subjects
Arthritis, Rheumatoid/genetics; Computational Biology/methods; Diabetes Mellitus, Type 1/genetics; Genetic Predisposition to Disease/genetics; Arthritis, Rheumatoid/classification; Diabetes Mellitus, Type 1/classification; Gene Expression Profiling; Genome, Human; Genome-Wide Association Study; Genotype; Humans; Polymorphism, Single Nucleotide
17.
Bioinformatics ; 24(13): i68-76, 2008 Jul 01.
Article in English | MEDLINE | ID: mdl-18586747

ABSTRACT

MOTIVATION: The need for accurate and efficient tools for computational RNA structure analysis has become increasingly apparent over the last several years: RNA folding algorithms underlie numerous applications in bioinformatics, ranging from microarray probe selection to de novo non-coding RNA gene prediction. In this work, we present RAF (RNA Alignment and Folding), an efficient algorithm for simultaneous alignment and consensus folding of unaligned RNA sequences. Algorithmically, RAF exploits sparsity in the set of likely pairing and alignment candidates for each nucleotide (as identified by the CONTRAfold or CONTRAlign programs) to achieve an effectively quadratic running time for simultaneous pairwise alignment and folding. RAF's fast sparse dynamic programming, in turn, serves as the inference engine within a discriminative machine learning algorithm for parameter estimation. RESULTS: In cross-validated benchmark tests, RAF achieves accuracies equaling or surpassing the current best approaches for RNA multiple sequence secondary structure prediction. However, RAF requires nearly an order of magnitude less time than other simultaneous folding and alignment methods, thus making it especially appropriate for high-throughput studies. AVAILABILITY: Source code for RAF is available at: http://contra.stanford.edu/contrafold/.


Subjects
Algorithms; Consensus Sequence/genetics; RNA/genetics; RNA/ultrastructure; Sequence Alignment/methods; Sequence Analysis, RNA/methods; Base Sequence; Computer Simulation; Models, Chemical; Models, Molecular; Molecular Sequence Data; Nucleic Acid Conformation
18.
Methods Mol Biol ; 484: 379-413, 2008.
Article in English | MEDLINE | ID: mdl-18592193

ABSTRACT

Protein sequence alignment is the task of identifying evolutionarily or structurally related positions in a collection of amino acid sequences. Although the protein alignment problem has been studied for several decades, many recent studies have demonstrated considerable progress in improving the accuracy or scalability of multiple and pairwise alignment tools, or in expanding the scope of tasks handled by an alignment program. In this chapter, we review state-of-the-art protein sequence alignment and provide practical advice for users of alignment tools.


Subjects
Databases, Protein; Proteins/genetics; Sequence Alignment/methods; Software; Algorithms; Amino Acid Sequence; Molecular Sequence Data
19.
Genome Res ; 18(4): 676-82, 2008 Apr.
Article in English | MEDLINE | ID: mdl-18353807

ABSTRACT

The genome of an admixed individual with ancestors from isolated populations is a mosaic of chromosomal blocks, each following the statistical properties of variation seen in those populations. By analyzing polymorphisms in the admixed individual against those seen in representatives from the populations, we can infer the ancestral source of the individual's haploblocks. In this paper we describe a novel approach for ancestry inference, HAPAA (HMM-based analysis of polymorphisms in admixed ancestries), that models the allelic and haplotypic variation in the populations and captures the signal of correlation due to linkage disequilibrium, resulting in greatly improved accuracy. We also introduce a methodology for evaluating the effect of genetic divergence between ancestral populations and time-to-admixture on inference accuracy. Using HAPAA, we explore the limits of ancestry inference in closely related populations.


Subjects
Ethnicity/genetics; Genetics, Population/methods; Polymorphism, Genetic; Genome, Human; Humans; Linkage Disequilibrium; Markov Chains
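
Illustrative note on entry 19: the abstract describes an HMM in which an admixed genome is a mosaic of blocks from ancestral populations. The sketch below is a deliberately simplified stand-in, not HAPAA itself: hidden states are populations, each SNP emits an allele according to that population's allele frequency, and a single hypothetical `switch_prob` governs ancestry changes between adjacent SNPs. It ignores the haplotype structure and linkage disequilibrium that the abstract says HAPAA models, and all names and numbers are invented.

```python
import math


def viterbi_ancestry(alleles, allele_freqs, switch_prob):
    """Most likely ancestry label per SNP for one haplotype.

    alleles      -- list of 0/1 alleles observed along the haplotype
    allele_freqs -- dict: population -> list of allele-1 frequencies per SNP
                    (each strictly between 0 and 1; at least two populations)
    switch_prob  -- probability that ancestry changes between adjacent SNPs
    """
    pops = list(allele_freqs)
    stay = math.log(1.0 - switch_prob)
    move = math.log(switch_prob / (len(pops) - 1))

    def emit(pop, snp):
        freq = allele_freqs[pop][snp]
        return math.log(freq if alleles[snp] == 1 else 1.0 - freq)

    def step(prev_pop, pop):
        return stay if prev_pop == pop else move

    # Uniform prior over populations at the first SNP.
    score = {p: -math.log(len(pops)) + emit(p, 0) for p in pops}
    back = []
    for snp in range(1, len(alleles)):
        new_score, pointers = {}, {}
        for p in pops:
            best_prev = max(pops, key=lambda q: score[q] + step(q, p))
            new_score[p] = score[best_prev] + step(best_prev, p) + emit(p, snp)
            pointers[p] = best_prev
        score = new_score
        back.append(pointers)

    # Trace the best path backwards from the best final state.
    state = max(pops, key=lambda p: score[p])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Hypothetical data: two populations with sharply different allele frequencies.
freqs = {"popA": [0.9] * 6, "popB": [0.1] * 6}
haplotype = [1, 1, 1, 0, 0, 0]
ancestry_path = viterbi_ancestry(haplotype, freqs, switch_prob=0.05)
print(ancestry_path)
```

With closely related populations the per-SNP frequencies differ only slightly, so a single-SNP emission model like this one carries little signal, which is consistent with the abstract's motivation for modeling haplotypic variation.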
20.
Genome Biol ; 8(12): R269, 2007.
Article in English | MEDLINE | ID: mdl-18096039

ABSTRACT

We describe CONTRAST, a gene predictor which directly incorporates information from multiple alignments rather than employing phylogenetic models. This is accomplished through the use of discriminative machine learning techniques, including a novel training algorithm. We use a two-stage approach, in which a set of binary classifiers designed to recognize coding region boundaries is combined with a global model of gene structure. CONTRAST predicts exact coding region structures for 65% more human genes than the previous state-of-the-art method, misses 46% fewer exons and displays comparable gains in specificity.


Subjects
Genomics; Proteins/genetics; Sequence Alignment/methods; Software; Algorithms; Animals; Artificial Intelligence; Base Sequence; Exons; Expressed Sequence Tags; Genome, Human; Humans