Results 1 - 20 of 25
1.
PLoS Comput Biol ; 15(6): e1007128, 2019 06.
Article in English | MEDLINE | ID: mdl-31233491

ABSTRACT

Open, collaborative research is a powerful paradigm that can immensely strengthen the scientific process by integrating broad and diverse expertise. However, traditional research and multi-author writing processes break down at scale. We present new software named Manubot, available at https://manubot.org, to address the challenges of open scholarly writing. Manubot adopts the contribution workflow used by many large-scale open source software projects to enable collaborative authoring of scholarly manuscripts. With Manubot, manuscripts are written in Markdown and stored in a Git repository to precisely track changes over time. By hosting manuscript repositories publicly, such as on GitHub, multiple authors can simultaneously propose and review changes. A cloud service automatically evaluates proposed changes to catch errors. Publication with Manubot is continuous: When a manuscript's source changes, the rendered outputs are rebuilt and republished to a web page. Manubot automates bibliographic tasks by implementing citation by identifier, where users cite persistent identifiers (e.g. DOIs, PubMed IDs, ISBNs, URLs), whose metadata is then retrieved and converted to a user-specified style. Manubot modernizes publishing to align with the ideals of open science by making it transparent, reproducible, immediate, versioned, collaborative, and free of charge.
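As a rough illustration of citation by identifier, the sketch below pulls identifier-style citation keys out of Markdown text and splits each into a prefix and an identifier. The `@prefix:identifier` key syntax, the prefix list, and the helper name `extract_citation_keys` are assumptions made for this example, not Manubot's actual parser, and the metadata retrieval step is left out.

```python
import re

# Hypothetical pattern: "@", a known prefix, a colon, then the identifier.
CITATION_PATTERN = re.compile(r"@(doi|pmid|isbn|url):([^\s\];,]+)")

def extract_citation_keys(markdown_text):
    """Return unique (prefix, identifier) pairs in order of first appearance."""
    seen = []
    for match in CITATION_PATTERN.finditer(markdown_text):
        key = (match.group(1), match.group(2).rstrip("."))
        if key not in seen:
            seen.append(key)
    return seen

# Example manuscript text (the citations reuse identifiers from this record).
text = (
    "Manubot is described in [@doi:10.1371/journal.pcbi.1007128]. "
    "See also [@pmid:31233491; @doi:10.1371/journal.pcbi.1007128]."
)
keys = extract_citation_keys(text)
for prefix, identifier in keys:
    print(prefix, identifier)
```

A real pipeline would then look up each identifier with the relevant metadata service and format the result in the chosen citation style.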


Subject(s)
Publishing; Software; Writing; Humans; Manuscripts, Medical as Topic
2.
PLoS Comput Biol ; 11(7): e1004259, 2015 Jul.
Article in English | MEDLINE | ID: mdl-26158728

ABSTRACT

The first decade of genome-wide association studies (GWAS) has uncovered a wealth of disease-associated variants. Two important derivations will be the translation of this information into a multiscale understanding of pathogenic variants and leveraging existing data to increase the power of existing and future studies through prioritization. We explore edge prediction on heterogeneous networks (graphs with multiple node and edge types) for accomplishing both tasks. First, we constructed a network with 18 node types (genes, diseases, tissues, pathophysiologies, and 14 MSigDB [Molecular Signatures Database] collections) and 19 edge types from high-throughput publicly available resources. From this network composed of 40,343 nodes and 1,608,168 edges, we extracted features that describe the topology between specific genes and diseases. Next, we trained a model from GWAS associations and predicted the probability of association between each protein-coding gene and each of 29 well-studied complex diseases. The model, which achieved 132-fold enrichment in precision at 10% recall, outperformed any individual domain, highlighting the benefit of integrative approaches. We identified pleiotropy, transcriptional signatures of perturbations, pathways, and protein interactions as influential mechanisms explaining pathogenesis. Our method successfully predicted the results (with AUROC = 0.79) from a withheld multiple sclerosis (MS) GWAS despite starting with only 13 previously associated genes. Finally, we combined our network predictions with statistical evidence of association to propose four novel MS genes, three of which (JAK2, REL, RUNX3) validated on the masked GWAS. Furthermore, our predictions provide biological support highlighting REL as the causal gene within its gene-rich locus. Users can browse all predictions online (http://het.io).
Heterogeneous network edge prediction effectively prioritized genetic associations and provides a powerful new approach for data integration across multiple domains.
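To make "features that describe the topology between specific genes and diseases" more concrete, here is a minimal sketch that counts typed walks between a gene and a disease on a toy heterogeneous network. The abstract does not say which features were used, so the plain walk count, the edge type names, and every node below are assumptions chosen for illustration.

```python
from collections import defaultdict

# Toy heterogeneous network: each edge type maps source nodes to target sets.
# All node names and edges are made up for this example.
edges = {
    ("gene", "expressed_in", "tissue"): {
        "GENE_A": {"brain", "blood"},
        "GENE_B": {"blood"},
    },
    ("tissue", "affected_by", "disease"): {
        "brain": {"DISEASE_X"},
        "blood": {"DISEASE_X", "DISEASE_Y"},
    },
}

def walk_count(source, target, edge_types):
    """Count walks from source to target that follow edge_types in order."""
    frontier = {source: 1}  # node -> number of walks reaching it so far
    for edge_type in edge_types:
        next_frontier = defaultdict(int)
        for node, count in frontier.items():
            for neighbor in edges[edge_type].get(node, ()):
                next_frontier[neighbor] += count
        frontier = next_frontier
    return frontier.get(target, 0)

gene_tissue_disease = [
    ("gene", "expressed_in", "tissue"),
    ("tissue", "affected_by", "disease"),
]
print(walk_count("GENE_A", "DISEASE_X", gene_tissue_disease))
```

One such count per gene, disease, and sequence of edge types would give a feature matrix on which a classifier could be trained.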


Subject(s)
Chromosome Mapping/methods; Data Mining/methods; Databases, Genetic; Genetic Predisposition to Disease/genetics; Genome-Wide Association Study/methods; Proteome/genetics; Algorithms; Animals; Humans; Protein Interaction Mapping/methods; Signal Transduction/genetics; Systems Integration
3.
Gigascience ; 13, 2024 Jan 02.
Article in English | MEDLINE | ID: mdl-38323677

ABSTRACT

Important tasks in biomedical discovery such as predicting gene functions, gene-disease associations, and drug repurposing opportunities are often framed as network edge prediction. The number of edges connecting to a node, termed degree, can vary greatly across nodes in real biomedical networks, and the distribution of degrees varies between networks. If degree strongly influences edge prediction, then imbalance or bias in the distribution of degrees could lead to nonspecific or misleading predictions. We introduce a network permutation framework to quantify the effects of node degree on edge prediction. Our framework decomposes performance into the proportions attributable to degree and the network's specific connections using network permutation to generate features that depend only on degree. We discover that performance attributable to factors other than degree is often only a small portion of overall performance. Researchers seeking to predict new or missing edges in biological networks should use our permutation approach to obtain a baseline for performance that may be nonspecific because of degree. We released our methods as an open-source Python package (https://github.com/hetio/xswap/).


Subject(s)
Algorithms; Probability
4.
bioRxiv ; 2023 Jan 06.
Article in English | MEDLINE | ID: mdl-36711569

ABSTRACT

Important tasks in biomedical discovery such as predicting gene functions, gene-disease associations, and drug repurposing opportunities are often framed as network edge prediction. The number of edges connecting to a node, termed degree, can vary greatly across nodes in real biomedical networks, and the distribution of degrees varies between networks. If degree strongly influences edge prediction, then imbalance or bias in the distribution of degrees could lead to nonspecific or misleading predictions. We introduce a network permutation framework to quantify the effects of node degree on edge prediction. Our framework decomposes performance into the proportions attributable to degree and the network's specific connections. We discover that performance attributable to factors other than degree is often only a small portion of overall performance. Degree's predictive performance diminishes when the networks used for training and testing, despite measuring the same biological relationships, were generated using distinct techniques and hence have large differences in degree distribution. We introduce the permutation-derived edge prior as the probability that an edge exists based only on degree. The edge prior shows excellent discrimination and calibration for 20 biomedical networks (16 bipartite, 3 undirected, 1 directed), with AUROCs frequently exceeding 0.85. Researchers seeking to predict new or missing edges in biological networks should use the edge prior as a baseline to identify the fraction of performance that is nonspecific because of degree. We released our methods as an open-source Python package (https://github.com/hetio/xswap/).
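The idea of a permutation-derived edge prior can be sketched as follows: shuffle edges in a way that keeps every node's degree, repeat many times, and record how often each node pair ends up connected. This is a simplified stand-in written for this listing, not the xswap package's API; the function names `permute_edges` and `edge_prior`, the parameter defaults, and the toy network are all assumptions.

```python
import random
from collections import Counter

def permute_edges(edges, n_swaps, rng):
    """Degree-preserving shuffle of a bipartite edge list.

    An accepted swap turns (a, b), (c, d) into (a, d), (c, b), so every
    node keeps its degree. Swaps that would duplicate an edge are skipped.
    """
    edges = list(edges)
    edge_set = set(edges)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        new_1, new_2 = (a, d), (c, b)
        if new_1 in edge_set or new_2 in edge_set:
            continue
        edge_set -= {edges[i], edges[j]}
        edge_set |= {new_1, new_2}
        edges[i], edges[j] = new_1, new_2
    return edges

def edge_prior(edges, n_permutations=200, swaps_per_edge=10, seed=0):
    """Estimate, per node pair, the fraction of permuted networks containing it."""
    rng = random.Random(seed)
    counts = Counter()
    current = list(edges)
    for _ in range(n_permutations):
        current = permute_edges(current, swaps_per_edge * len(current), rng)
        counts.update(current)
    return {pair: n / n_permutations for pair, n in counts.items()}

# Made-up bipartite network: g1 touches every target, so degree alone
# forces its three edges to exist in every permutation.
network = [("g1", "d1"), ("g1", "d2"), ("g1", "d3"),
           ("g2", "d1"), ("g3", "d2"), ("g4", "d1")]
priors = edge_prior(network)
print(priors[("g1", "d3")])
```

In this toy case the prior for any `g1` edge is 1.0 because no degree-preserving rearrangement can remove it, which is the kind of nonspecific, degree-driven signal the abstract warns about.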

5.
bioRxiv ; 2023 Jan 07.
Article in English | MEDLINE | ID: mdl-36711546

ABSTRACT

Hetnets, short for "heterogeneous networks", contain multiple node and relationship types and offer a way to encode biomedical knowledge. One such example, Hetionet, connects 11 types of nodes (including genes, diseases, drugs, pathways, and anatomical structures) with over 2 million edges of 24 types. Previous work has demonstrated that supervised machine learning methods applied to such networks can identify drug repurposing opportunities. However, a training set of known relationships does not exist for many types of node pairs, even when it would be useful to examine how nodes of those types are meaningfully connected. For example, users may be curious not only how metformin is related to breast cancer, but also how the GJA1 gene might be involved in insomnia. We developed a new procedure, termed hetnet connectivity search, that proposes important paths between any two nodes without requiring a supervised gold standard. The algorithm behind connectivity search identifies types of paths that occur more frequently than would be expected by chance (based on node degree alone). We find that predictions are broadly similar to those from previously described supervised approaches for certain node type pairs. Scoring of individual paths is based on the most specific paths of a given type. Several optimizations were required to precompute significant instances of node connectivity at the scale of large knowledge graphs. We implemented the method on Hetionet and provide an online interface at https://het.io/search. We provide an open source implementation of these methods in our new Python package named hetmatpy.

6.
BioData Min ; 15(1): 26, 2022 Oct 18.
Article in English | MEDLINE | ID: mdl-36258252

ABSTRACT

BACKGROUND: Knowledge graphs support biomedical research efforts by providing contextual information for biomedical entities, constructing networks, and supporting the interpretation of high-throughput analyses. These databases are populated via manual curation, which is challenging to scale with an exponentially rising publication rate. Data programming is a paradigm that circumvents this arduous manual process by combining databases with simple rules and heuristics written as label functions, which are programs designed to annotate textual data automatically. Unfortunately, writing a useful label function requires substantial error analysis and is a nontrivial task that takes multiple days per function. This bottleneck makes populating a knowledge graph with multiple nodes and edge types practically infeasible. Thus, we sought to accelerate the label function creation process by evaluating how label functions can be re-used across multiple edge types. RESULTS: We obtained entity-tagged abstracts and subsetted these entities to only contain compounds, genes, and disease mentions. We extracted sentences containing co-mentions of certain biomedical entities contained in a previously described knowledge graph, Hetionet v1. We trained a baseline model that used database-only label functions and then used a sampling approach to measure how well adding edge-specific or edge-mismatch label function combinations improved over our baseline. Next, we trained a discriminator model to detect sentences that indicated a biomedical relationship and then estimated the number of edge types that could be recalled and added to Hetionet v1. We found that adding edge-mismatch label functions rarely improved relationship extraction, while control edge-specific label functions did. There were two exceptions to this trend, Compound-binds-Gene and Gene-interacts-Gene, which both indicated physical relationships and showed signs of transferability. 
Across the scenarios tested, discriminative model performance strongly depends on generated annotations. Using the best discriminative model for each edge type, we recalled close to 30% of established edges within Hetionet v1. CONCLUSIONS: Our results show that this framework can incorporate novel edges into our source knowledge graph. However, results with label function transfer were mixed. Only label functions describing very similar edge types supported improved performance when transferred. We expect that the continued development of this strategy may provide essential building blocks to populating biomedical knowledge graphs with discoveries, ensuring that these resources include cutting-edge results.

7.
Gigascience ; 12, 2022 12 28.
Article in English | MEDLINE | ID: mdl-37503959

ABSTRACT

BACKGROUND: Hetnets, short for "heterogeneous networks," contain multiple node and relationship types and offer a way to encode biomedical knowledge. One such example, Hetionet, connects 11 types of nodes (including genes, diseases, drugs, pathways, and anatomical structures) with over 2 million edges of 24 types. Previous work has demonstrated that supervised machine learning methods applied to such networks can identify drug repurposing opportunities. However, a training set of known relationships does not exist for many types of node pairs, even when it would be useful to examine how nodes of those types are meaningfully connected. For example, users may be curious about not only how metformin is related to breast cancer but also how a given gene might be involved in insomnia. FINDINGS: We developed a new procedure, termed hetnet connectivity search, that proposes important paths between any 2 nodes without requiring a supervised gold standard. The algorithm behind connectivity search identifies types of paths that occur more frequently than would be expected by chance (based on node degree alone). Several optimizations were required to precompute significant instances of node connectivity at the scale of large knowledge graphs. CONCLUSION: We implemented the method on Hetionet and provide an online interface at https://het.io/search. We provide an open-source implementation of these methods in our new Python package named hetmatpy.
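One simple way to express "more frequently than would be expected by chance" is an empirical p-value: compare the observed path count for a node pair with counts from degree-preserving permuted networks. The sketch below assumes that framing; the abstract does not describe the statistical model actually used, and the function name and the null counts are invented for illustration.

```python
def empirical_p_value(observed, null_values):
    """Share of null values at least as large as observed, with add-one smoothing."""
    exceed = sum(1 for value in null_values if value >= observed)
    return (exceed + 1) / (len(null_values) + 1)

# Made-up path counts for one node pair across ten permuted networks.
null_path_counts = [0, 1, 0, 2, 1, 0, 0, 1, 3, 0]

p_strong = empirical_p_value(4, null_path_counts)  # observed count beats every null
p_weak = empirical_p_value(1, null_path_counts)    # observed count is unremarkable
print(p_strong, p_weak)
```

Add-one smoothing keeps the estimate above zero, which matters when only a handful of permutations are available.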


Subject(s)
Algorithms; Probability
8.
Sci Data ; 9(1): 714, 2022 11 19.
Article in English | MEDLINE | ID: mdl-36402838

ABSTRACT

The standardized identification of biomedical entities is a cornerstone of interoperability, reuse, and data integration in the life sciences. Several registries have been developed to catalog resources maintaining identifiers for biomedical entities such as small molecules, proteins, cell lines, and clinical trials. However, existing registries have struggled to provide sufficient coverage and metadata standards that meet the evolving needs of modern life sciences researchers. Here, we introduce the Bioregistry, an integrative, open, community-driven metaregistry that synthesizes and substantially expands upon 23 existing registries. The Bioregistry addresses the need for a sustainable registry by leveraging public infrastructure and automation, and employing a progressive governance model centered around open code and open data to foster community contribution. The Bioregistry can be used to support the standardized annotation of data, models, ontologies, and scientific literature, thereby promoting their interoperability and reuse. The Bioregistry can be accessed through https://bioregistry.io and its source code and data are available under the MIT and CC0 Licenses at https://github.com/biopragmatics/bioregistry.

9.
Bioinformatics ; 26(5): 694-5, 2010 Mar 01.
Article in English | MEDLINE | ID: mdl-20081222

ABSTRACT

MOTIVATION: Epistasis, the presence of gene-gene interactions, has been hypothesized to be at the root of many common human diseases, but current genome-wide association studies largely ignore its role. Multifactor dimensionality reduction (MDR) is a powerful model-free method for detecting epistatic relationships between genes, but computational costs have made its application to genome-wide data difficult. Graphics processing units (GPUs), the hardware responsible for rendering computer games, are powerful parallel processors. Using GPUs to run MDR on a genome-wide dataset allows for statistically rigorous testing of epistasis. RESULTS: The implementation of MDR for GPUs (MDRGPU) includes core features of the widely used Java software package, MDR. This GPU implementation allows for large-scale analysis of epistasis at a dramatically lower cost than the standard CPU-based implementations. As a proof-of-concept, we applied this software to a genome-wide study of sporadic amyotrophic lateral sclerosis (ALS). We discovered a statistically significant two-SNP classifier and subsequently replicated the significance of these two SNPs in an independent study of ALS. MDRGPU makes the large-scale analysis of epistasis tractable and opens the door to statistically rigorous testing of interactions in genome-wide datasets. AVAILABILITY: MDRGPU is open source and available free of charge from http://www.sourceforge.net/projects/mdr.
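The core MDR step for a pair of SNPs can be sketched in a few lines: pool each two-locus genotype combination into a high-risk or low-risk group according to its case:control ratio, then score how well that grouping separates cases from controls. This is a simplified CPU-only illustration, not MDRGPU or the Java MDR package; it omits cross-validation and permutation testing, and the function names and toy data are assumptions.

```python
from collections import Counter
from itertools import combinations

def mdr_pair_accuracy(genotypes, labels, snp_i, snp_j):
    """Balanced accuracy of the high/low-risk rule built from one SNP pair."""
    cases, controls = Counter(), Counter()
    for row, label in zip(genotypes, labels):
        cell = (row[snp_i], row[snp_j])
        (cases if label == 1 else controls)[cell] += 1
    n_cases, n_controls = sum(cases.values()), sum(controls.values())
    threshold = n_cases / n_controls  # overall case:control ratio
    # A genotype cell is high risk when its case:control ratio exceeds the threshold.
    high_risk = {
        cell for cell in set(cases) | set(controls)
        if cases[cell] > threshold * controls[cell]
    }
    true_pos = sum(n for cell, n in cases.items() if cell in high_risk)
    true_neg = sum(n for cell, n in controls.items() if cell not in high_risk)
    return 0.5 * (true_pos / n_cases + true_neg / n_controls)

def best_pair(genotypes, labels):
    """Exhaustively score every SNP pair and return the best one."""
    n_snps = len(genotypes[0])
    return max(combinations(range(n_snps), 2),
               key=lambda pair: mdr_pair_accuracy(genotypes, labels, *pair))

# Toy data: disease status is 1 exactly when SNP 0 and SNP 1 differ
# (a purely epistatic effect); SNP 2 is noise.
genotypes = [[0, 0, 0], [0, 1, 0], [1, 0, 1], [1, 1, 1],
             [0, 0, 1], [0, 1, 1], [1, 0, 0], [1, 1, 0]]
labels = [0, 1, 1, 0, 0, 1, 1, 0]
print(best_pair(genotypes, labels))
```

The exhaustive loop over pairs is what becomes expensive at genome-wide scale, since the number of pairs grows quadratically with the number of SNPs; each pair is scored independently, which is why the workload suits parallel hardware.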


Subject(s)
Amyotrophic Lateral Sclerosis/genetics; Epistasis, Genetic; Genome-Wide Association Study/methods; Databases, Genetic; Genome, Human; Genomics/methods; Humans; Polymorphism, Single Nucleotide
10.
Cell Syst ; 12(9): 900-906.e5, 2021 09 22.
Article in English | MEDLINE | ID: mdl-34555325

ABSTRACT

Delivering a keynote talk at a conference organized by a scientific society or being named as a fellow by such a society indicates that a scientist is held in high regard by their colleagues. To explore if the distribution of such indicators of esteem in the field of bioinformatics reflects the composition of this field, we compared the gender, name origin, and country of affiliation of 412 honorees from the "International Society for Computational Biology" (75 fellows and 337 keynote speakers) with over 170,000 last authorships on computational biology papers between 1993 and 2019. The proportion of honors bestowed on women was similar to that of the field's overall last authorship rate. However, names of East Asian origin have been persistently underrepresented among honorees. Moreover, there were roughly twice as many honors bestowed on scientists with an affiliation in the United States as expected based on literature authorship. A record of this paper's transparent peer review process is included in the supplemental information.


Subject(s)
Computational Biology; Societies, Scientific; Female; Humans; United States
11.
CEUR Workshop Proc ; 2976: 29-38, 2021 Sep.
Article in English | MEDLINE | ID: mdl-35558551

ABSTRACT

The COVID-19 pandemic catalyzed the rapid dissemination of papers and preprints investigating the disease and its associated virus, SARS-CoV-2. The multifaceted nature of COVID-19 demands a multidisciplinary approach, but the urgency of the crisis combined with the need for social distancing measures present unique challenges to collaborative science. We applied a massive online open publishing approach to this problem using Manubot. Through GitHub, collaborators summarized and critiqued COVID-19 literature, creating a review manuscript. Manubot automatically compiled citation information for referenced preprints, journal publications, websites, and clinical trials. Continuous integration workflows retrieved up-to-date data from online sources nightly, regenerating some of the manuscript's figures and statistics. Manubot rendered the manuscript into PDF, HTML, LaTeX, and DOCX outputs, immediately updating the version available online upon the integration of new content. Through this effort, we organized over 50 scientists from a range of backgrounds who evaluated over 1,500 sources and developed seven literature reviews. While many efforts from the computational community have focused on mining COVID-19 literature, our project illustrates the power of open publishing to organize both technical and non-technical scientists to aggregate and disseminate information in response to an evolving crisis.

12.
ArXiv ; 2021 Sep 17.
Article in English | MEDLINE | ID: mdl-34545336

ABSTRACT

The COVID-19 pandemic catalyzed the rapid dissemination of papers and preprints investigating the disease and its associated virus, SARS-CoV-2. The multifaceted nature of COVID-19 demands a multidisciplinary approach, but the urgency of the crisis combined with the need for social distancing measures present unique challenges to collaborative science. We applied a massive online open publishing approach to this problem using Manubot. Through GitHub, collaborators summarized and critiqued COVID-19 literature, creating a review manuscript. Manubot automatically compiled citation information for referenced preprints, journal publications, websites, and clinical trials. Continuous integration workflows retrieved up-to-date data from online sources nightly, regenerating some of the manuscript's figures and statistics. Manubot rendered the manuscript into PDF, HTML, LaTeX, and DOCX outputs, immediately updating the version available online upon the integration of new content. Through this effort, we organized over 50 scientists from a range of backgrounds who evaluated over 1,500 sources and developed seven literature reviews. While many efforts from the computational community have focused on mining COVID-19 literature, our project illustrates the power of open publishing to organize both technical and non-technical scientists to aggregate and disseminate information in response to an evolving crisis.

13.
Account Res ; 28(1): 23-43, 2021 01.
Article in English | MEDLINE | ID: mdl-32602379

ABSTRACT

Assigning authorship and recognizing contributions to scholarly works is challenging on many levels. Here we discuss ethical, social, and technical challenges to the concept of authorship that may impede the recognition of contributions to a scholarly work. Recent work in the field of authorship shows that shifting to a more inclusive contributorship approach may address these challenges. Recent efforts to enable better recognition of contributions to scholarship include the development of the Contributor Role Ontology (CRO), which extends the CRediT taxonomy and can be used in information systems for structuring contributions. We also introduce the Contributor Attribution Model (CAM), which provides a simple data model that relates the contributor to research objects via the role that they played, as well as the provenance of the information. Finally, requirements for the adoption of a contributorship-based approach are discussed.


Subject(s)
Authorship; Humans
14.
Genome Biol ; 21(1): 109, 2020 05 11.
Article in English | MEDLINE | ID: mdl-32393369

ABSTRACT

BACKGROUND: Unsupervised compression algorithms applied to gene expression data extract latent or hidden signals representing technical and biological sources of variation. However, these algorithms require a user to select a biologically appropriate latent space dimensionality. In practice, most researchers fit a single algorithm and latent dimensionality. We sought to determine the extent to which selecting only one fit limits the biological features captured in the latent representations and, consequently, limits what can be discovered with subsequent analyses. RESULTS: We compress gene expression data from three large datasets consisting of adult normal tissue, adult cancer tissue, and pediatric cancer tissue. We train many different models across a large range of latent space dimensionalities and observe various performance differences. We identify more curated pathway gene sets significantly associated with individual dimensions in denoising autoencoder and variational autoencoder models trained using an intermediate number of latent dimensionalities. Combining compressed features across algorithms and dimensionalities captures the most pathway-associated representations. When trained with different latent dimensionalities, models learn strongly associated and generalizable biological representations including sex, neuroblastoma MYCN amplification, and cell types. Stronger signals, such as tumor type, are best captured in models trained at lower dimensionalities, while more subtle signals such as pathway activity are best identified in models trained with more latent dimensionalities. CONCLUSIONS: There is no single best latent dimensionality or compression algorithm for analyzing gene expression data. Instead, using features derived from different compression models across multiple latent space dimensionalities enhances biological representations.


Subject(s)
Data Compression/methods; Gene Expression; Models, Biological; Adult; Child; Humans; Neoplasms/metabolism; Supervised Machine Learning
15.
Elife ; 7, 2018 03 01.
Article in English | MEDLINE | ID: mdl-29424689

ABSTRACT

The website Sci-Hub enables users to download PDF versions of scholarly articles, including many articles that are paywalled at their journal's site. Sci-Hub has grown rapidly since its creation in 2011, but the extent of its coverage has been unclear. Here we report that, as of March 2017, Sci-Hub's database contains 68.9% of the 81.6 million scholarly articles registered with Crossref and 85.1% of articles published in toll access journals. We find that coverage varies by discipline and publisher, and that Sci-Hub preferentially covers popular, paywalled content. For toll access articles, we find that Sci-Hub provides greater coverage than the University of Pennsylvania, a major research university in the United States. Green open access to toll access articles via licit services, on the other hand, remains quite limited. Our interactive browser at https://greenelab.github.io/scihub allows users to explore these findings in more detail. For the first time, nearly all scholarly literature is available gratis to anyone with an Internet connection, suggesting the toll access business model may become unsustainable.
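The headline coverage figure implies an approximate article count that the abstract does not state directly. The short calculation below derives it from the two rounded numbers given above, so the result is only approximate; the variable names are illustrative.

```python
# Figures quoted in the abstract (both rounded).
crossref_articles = 81.6e6   # scholarly articles registered with Crossref
overall_coverage = 0.689     # share of those articles in Sci-Hub's database

covered_articles = crossref_articles * overall_coverage
print(f"Approximately {covered_articles / 1e6:.1f} million articles covered")
```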


Subject(s)
Access to Information; Databases, Bibliographic; Scholarly Communication; Bibliometrics; Internet; Pennsylvania
16.
J R Soc Interface ; 15(141), 2018 04.
Article in English | MEDLINE | ID: mdl-29618526

ABSTRACT

Deep learning describes a class of machine learning algorithms that are capable of combining raw inputs into layers of intermediate features. These algorithms have recently shown impressive results across a variety of domains. Biology and medicine are data-rich disciplines, but the data are complex and often ill-understood. Hence, deep learning techniques may be particularly well suited to solve problems of these fields. We examine applications of deep learning to a variety of biomedical problems (patient classification, fundamental biological processes and treatment of patients) and discuss whether deep learning will be able to transform these tasks or if the biomedical sphere poses unique challenges. Following from an extensive literature review, we find that deep learning has yet to revolutionize biomedicine or definitively resolve any of the most pressing challenges in the field, but promising advances have been made on the prior state of the art. Even though improvements over previous baselines have been modest in general, the recent progress indicates that deep learning methods will provide valuable means for speeding up or aiding human investigation. Though progress has been made linking a specific neural network's prediction to input features, understanding how users should interpret these models to make testable hypotheses about the system under study remains an open challenge. Furthermore, the limited amount of labelled data for training presents problems in some domains, as do legal and privacy constraints on work with sensitive health records. Nonetheless, we foresee deep learning enabling changes at both bench and bedside with the potential to transform several areas of biology and medicine.


Subject(s)
Biomedical Research/trends; Biomedical Technology/trends; Deep Learning/trends; Algorithms; Biomedical Research/methods; Decision Making; Delivery of Health Care/methods; Delivery of Health Care/trends; Disease/genetics; Drug Design; Electronic Health Records/trends; Humans; Terminology as Topic
17.
BioData Min ; 9: 9, 2016.
Article in English | MEDLINE | ID: mdl-26848312

ABSTRACT

[This corrects the article DOI: 10.1186/1756-0381-4-21.].

18.
JAMA Neurol ; 73(7): 795-802, 2016 07 01.
Article in English | MEDLINE | ID: mdl-27244296

ABSTRACT

IMPORTANCE: Although multiple HLA alleles associated with multiple sclerosis (MS) risk have been identified, genotype-phenotype studies in the HLA region remain scarce and inconclusive. OBJECTIVES: To investigate whether MS risk-associated HLA alleles also affect disease phenotypes. DESIGN, SETTING, AND PARTICIPANTS: A cross-sectional, case-control study comprising 652 patients with MS who had comprehensive phenotypic information and 455 individuals of European origin serving as controls was conducted at a single academic research site. Patients evaluated at the Multiple Sclerosis Center at University of California, San Francisco between July 2004 and September 2005 were invited to participate. Spinal cord imaging in the data set was acquired between July 2013 and March 2014; analysis was performed between December 2014 and December 2015. MAIN OUTCOMES AND MEASURES: Cumulative HLA genetic burden (HLAGB) calculated using the most updated MS-associated HLA alleles vs clinical and magnetic resonance imaging outcomes, including age at onset, disease severity, conversion time from clinically isolated syndrome to clinically definite MS, fractions of cortical and subcortical gray matter and cerebral white matter, brain lesion volume, spinal cord gray and white matter areas, upper cervical cord area, and the ratio of gray matter to the upper cervical cord area. Multivariate modeling was applied separately for each sex data set. RESULTS: Of the 652 patients with MS, 586 had no missing genetic data and were included in the HLAGB analysis. In these 586 patients (404 women [68.9%]; mean [SD] age at disease onset, 33.6 [9.4] years), HLAGB was higher than in controls (median [IQR], 0.7 [0-1.4] and 0 [-0.3 to 0.5], respectively; P = 1.8 × 10^-27). A total of 619 (95.8%) had relapsing-onset MS and 27 (4.2%) had progressive-onset MS. No significant difference was observed between relapsing-onset MS and primary progressive MS.
A higher HLAGB was associated with younger age at onset and the atrophy of subcortical gray matter fraction in women with relapsing-onset MS (standard β = -1.20 × 10^-1; P = 1.7 × 10^-2 and standard β = -1.67 × 10^-1; P = 2.3 × 10^-4, respectively), which were driven mainly by the HLA-DRB1*15:01 haplotype. In addition, we observed the distinct role of the HLA-A*24:02-B*07:02-DRB1*15:01 haplotype among the other common DRB1*15:01 haplotypes and a nominally protective effect of HLA-B*44:02 to the subcortical gray atrophy (standard β = -1.28 × 10^-1; P = 5.1 × 10^-3 and standard β = 9.52 × 10^-2; P = 3.6 × 10^-2, respectively). CONCLUSIONS AND RELEVANCE: We confirm and extend previous observations linking HLA MS susceptibility alleles with disease progression and specific clinical and magnetic resonance imaging phenotypic traits.


Subject(s)
Genetic Predisposition to Disease/genetics; Histocompatibility Antigens Class I/genetics; Multiple Sclerosis/genetics; Polymorphism, Single Nucleotide/genetics; Adult; Age of Onset; Alleles; Brain/diagnostic imaging; Brain/pathology; Case-Control Studies; Cross-Sectional Studies; Female; Genetic Association Studies; Humans; Imaging, Three-Dimensional; Male; Middle Aged; Multiple Sclerosis/diagnostic imaging; Multiple Sclerosis/physiopathology; Retrospective Studies; Spinal Cord/diagnostic imaging; Spinal Cord/pathology; White People; Young Adult
19.
Int J Epidemiol ; 45(3): 728-40, 2016 06.
Article in English | MEDLINE | ID: mdl-26971321

ABSTRACT

BACKGROUND: Based on epidemiological commonalities, multiple sclerosis (MS) and Hodgkin lymphoma (HL), two clinically distinct conditions, have long been suspected to be aetiologically related. MS and HL occur in roughly the same age groups, both are associated with Epstein-Barr virus infection and ultraviolet (UV) light exposure, and they cluster mutually in families (though not in individuals). We speculated whether, in addition to sharing environmental risk factors, MS and HL were also genetically related. Using data from genome-wide association studies (GWAS) of 1816 HL patients, 9772 MS patients and 25 255 controls, we therefore investigated the genetic overlap between the two diseases. METHODS: From among a common denominator of 404 K single nucleotide polymorphisms (SNPs) studied, we identified SNPs and human leukocyte antigen (HLA) alleles independently associated with both diseases. Next, we assessed the cumulative genome-wide effect of MS-associated SNPs on HL and of HL-associated SNPs on MS. To provide an interpretational frame of reference, we used data from published GWAS to create a genetic network of diseases within which we analysed proximity of HL and MS to autoimmune diseases and haematological and non-haematological malignancies. RESULTS: SNP analyses revealed genome-wide overlap between HL and MS, most prominently in the HLA region. Polygenic HL risk scores explained 4.44% of HL risk (Nagelkerke R^2), but also 2.36% of MS risk. Conversely, polygenic MS risk scores explained 8.08% of MS risk and 1.94% of HL risk. In the genetic disease network, HL was closer to autoimmune diseases than to solid cancers. CONCLUSIONS: HL displays considerable genetic overlap with MS and other autoimmune diseases.
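For readers unfamiliar with the statistic, the standard definition of the Nagelkerke pseudo-R-squared is given below. The abstract does not say which models were compared, so the symbols are generic: L_0 is the likelihood of the intercept-only model, L_1 the likelihood of the model that includes the polygenic risk score, and n the sample size.

```latex
R^2_{\mathrm{N}} = \frac{1 - \left( L_0 / L_1 \right)^{2/n}}{1 - L_0^{2/n}}
```

The denominator rescales the Cox-Snell statistic in the numerator so that the maximum attainable value is 1.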


Subject(s)
Genome-Wide Association Study; Hodgkin Disease/genetics; Multiple Sclerosis/genetics; Polymorphism, Single Nucleotide; Female; Gene Regulatory Networks; Genetic Predisposition to Disease; Humans; Linear Models; Male
20.
PeerJ ; 3: e705, 2015.
Article in English | MEDLINE | ID: mdl-25648772

ABSTRACT

The level of atmospheric oxygen, a driver of free radical damage and tumorigenesis, decreases sharply with rising elevation. To understand whether ambient oxygen plays a role in human carcinogenesis, we characterized age-adjusted cancer incidence (compiled by the National Cancer Institute from 2005 to 2009) across counties of the elevation-varying Western United States and compared trends displayed by respiratory cancer (lung) and non-respiratory cancers (breast, colorectal, and prostate). To adjust for important demographic and cancer-risk factors, 8-12 covariates were considered for each cancer. We produced regression models that captured known risks. Models demonstrated that elevation is strongly, negatively associated with lung cancer incidence (p < 10^-16), but not with the incidence of non-respiratory cancers. For every 1,000 m rise in elevation, lung cancer incidence decreased by 7.23 (99% CI [5.18-9.29]) cases per 100,000 individuals, equivalent to 12.7% of the mean incidence, 56.8. As a predictor of lung cancer incidence, elevation was second only to smoking prevalence in terms of significance and effect size. Furthermore, no evidence of ecological fallacy or of confounding arising from evaluated factors was detected: the lung cancer association was robust to varying regression models, county stratification, and population subgrouping; additionally, seven environmental correlates of elevation, such as exposure to sunlight and fine particulate matter, could not capture the association. Overall, our findings suggest the presence of an inhaled carcinogen inherently and inversely tied to elevation, offering epidemiological support for oxygen-driven tumorigenesis. Finally, highlighting the need to consider elevation in studies of lung cancer, we demonstrated that previously reported inverse lung cancer associations with radon and UVB became insignificant after accounting for elevation.
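The relative effect quoted in the abstract follows directly from the absolute effect and the mean incidence. The short check below reproduces it from the reported figures; the variable names are illustrative only.

```python
# Figures quoted in the abstract (cases per 100,000 individuals).
decrease_per_1000_m = 7.23   # drop in lung cancer incidence per 1,000 m of elevation
mean_incidence = 56.8        # mean lung cancer incidence across counties

relative_decrease = 100 * decrease_per_1000_m / mean_incidence
print(f"{relative_decrease:.1f}% of the mean incidence per 1,000 m")
```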
