Results 1 - 20 of 41
1.
Genet Med ; 25(4): 100012, 2023 04.
Article in English | MEDLINE | ID: mdl-36637017

ABSTRACT

PURPOSE: TTN truncating variants (TTNtvs) represent the largest known genetic cause of dilated cardiomyopathies (DCMs); however, their penetrance for DCM in general populations is low. More broadly, patients with cardiomyopathies (CMs) often exhibit other cardiac conditions, such as atrial fibrillation (Afib), which has also been linked to TTNtvs. This retrospective analysis aims to characterize the relationship between different cardiac conditions in those with TTNtvs and identify individuals with the highest risk of DCM. METHODS: In this work we leverage longitudinal electronic health record and exome sequencing data from approximately 450,000 individuals in 2 health systems to statistically confirm and pinpoint the genetic footprint of TTNtv-related diagnoses aside from CM, such as Afib, and determine whether vetting additional significantly associated phenotypes better stratifies CM risk across those with TTNtvs. We focused on TTNtvs in exons with a percentage spliced in >90% (hiPSI TTNtvs), a representation of constitutive cardiac expression. RESULTS: When controlling for CM and Afib, other cardiac conditions retained only nominal association with TTNtvs. A sliding window analysis of TTNtvs across the locus confirms that the association is specific to hiPSI exons for both CM and Afib, with no meaningful associations in percentage spliced in ≤90% exons (loPSI TTNtvs). The combination of hiPSI TTNtv status and early Afib diagnosis (before age 60) identified a subset of TTNtv individuals at high risk for CM. The prevalence of CM in this subset was 33%, a rate that was 3.5-fold higher than that in individuals with hiPSI TTNtvs (9% prevalence), 5-fold higher than that in individuals without TTNtvs with early Afib (6% prevalence), and 80-fold higher than that in the general population. 
CONCLUSION: Our retrospective analyses revealed that those with hiPSI TTNtvs and early Afib (∼1/2900) have a high prevalence of CM (33%), far exceeding that in other individuals with TTNtvs and in those without TTNtvs with an early Afib diagnosis. These results show that combining phenotypic information along with genomic population screening can identify patients at higher risk for progressing to symptomatic heart failure.


Subject(s)
Atrial Fibrillation , Cardiomyopathies , Cardiomyopathy, Dilated , Heart Diseases , Humans , Atrial Fibrillation/epidemiology , Atrial Fibrillation/genetics , Retrospective Studies , Prevalence , Cardiomyopathies/epidemiology , Cardiomyopathies/genetics , Connectin/genetics , Connectin/metabolism , Cardiomyopathy, Dilated/epidemiology , Cardiomyopathy, Dilated/genetics
2.
BMC Med Inform Decis Mak ; 23(Suppl 1): 40, 2023 02 24.
Article in English | MEDLINE | ID: mdl-36829139

ABSTRACT

BACKGROUND: Two years into the COVID-19 pandemic and with more than five million deaths worldwide, the healthcare establishment continues to struggle with every new wave of the pandemic resulting from a new coronavirus variant. Research has demonstrated that there are variations in the symptoms, and even in the order of symptom presentations, in COVID-19 patients infected by different SARS-CoV-2 variants (e.g., Alpha and Omicron). Textual data in the form of admission notes and physician notes in the Electronic Health Records (EHRs) is rich in information regarding the symptoms and their orders of presentation. Unstructured EHR data is often underutilized in research due to the lack of annotations that enable automatic extraction of useful information from the available extensive volumes of textual data. METHODS: We present the design of a COVID Interface Terminology (CIT), not just a generic COVID-19 terminology, but one serving a specific purpose of enabling automatic annotation of EHRs of COVID-19 patients. CIT was constructed by integrating existing COVID-related ontologies and mining additional fine granularity concepts from clinical notes. The iterative mining approach utilized the techniques of 'anchoring' and 'concatenation' to identify potential fine granularity concepts to be added to the CIT. We also tested the generalizability of our approach on a hold-out dataset and compared the annotation coverage to the coverage obtained for the dataset used to build the CIT. RESULTS: Our experiments demonstrate that this approach results in higher annotation coverage compared to existing ontologies such as SNOMED CT and Coronavirus Infectious Disease Ontology (CIDO). The final version of CIT achieved about 20% more coverage than SNOMED CT and 50% more coverage than CIDO. 
In the future, the concepts mined and added into CIT could be used as training data for machine learning models for mining even more concepts into CIT and further increasing the annotation coverage. CONCLUSION: In this paper, we demonstrated the construction of a COVID interface terminology that can be utilized for automatically annotating EHRs of COVID-19 patients. The techniques presented can identify frequently documented fine granularity concepts that are missing in other ontologies thereby increasing the annotation coverage.


Subject(s)
COVID-19 , Electronic Health Records , Humans , Pandemics , SARS-CoV-2
3.
Environ Health ; 19(1): 92, 2020 08 27.
Article in English | MEDLINE | ID: mdl-32854703

ABSTRACT

BACKGROUND: Health risks due to particulate matter (PM) from wildfires may differ from risk due to PM from other sources. In places frequently subjected to wildfire smoke, such as Reno, Nevada, it is critical to determine whether wildfire PM poses unique risks. Our goal was to quantify the difference in the association of adverse asthma events with PM on days when wildfire smoke was present versus days when wildfire smoke was not present. METHODS: We obtained counts of visits for asthma at emergency departments and urgent care centers from a large regional healthcare system in Reno for the years 2013-2018. We also obtained dates when wildfire smoke was present from the Washoe County Health District Air Quality Management Division. We then examined whether the presence of wildfire smoke modified the association of PM2.5, PM10-2.5, and PM10 with asthma visits using generalized additive models. We improved on previous studies by excluding wildfire-smoke days where the PM concentration exceeded the maximum PM concentration on other days, thus accounting for possible nonlinearity in the association between PM concentration and asthma visits. RESULTS: Air quality was affected by wildfire smoke on 188 days between 2013 and 2018. We found that the presence of wildfire smoke increased the association of a 5 µg/m3 increase in daily and three-day averages of PM2.5 with asthma visits by 6.1% (95% confidence interval (CI): 2.1-10.3%) and 6.8% (CI: 1.2-12.7%), respectively. Similarly, the presence of wildfire smoke increased the association of a 5 µg/m3 increase in daily and three-day averages of PM10 with asthma visits by 5.5% (CI: 2.5-8.6%) and 7.2% (CI: 2.6-12.0%), respectively. We did not observe any significant increases in association for PM10-2.5 or for seven-day averages of PM2.5 and PM10. 
CONCLUSIONS: Since we found significantly stronger associations of PM2.5 and PM10 with asthma visits when wildfire smoke was present, our results suggest that wildfire PM is more hazardous than non-wildfire PM for patients with asthma.


Subject(s)
Asthma/epidemiology , Emergency Service, Hospital/statistics & numerical data , Environmental Exposure/adverse effects , Hospitalization/statistics & numerical data , Particulate Matter/adverse effects , Smoke/adverse effects , Wildfires , Asthma/chemically induced , Cities , Nevada/epidemiology , Particulate Matter/analysis
4.
J Biomed Inform ; 94: 103193, 2019 06.
Article in English | MEDLINE | ID: mdl-31048072

ABSTRACT

In previous research, we have studied concepts that occur in pairs of medical terminologies and are known to be identical, because they have the same ID number in the Unified Medical Language System (UMLS). We observed that such concepts rarely have exactly the same sets of children (=subconcepts) in the two terminologies. The number of common children was found to vary widely. A special situation was identified where the children in one terminology relate to the common parent in a very different way than the children in the other terminology. For example, children in one terminology might subdivide a parent concept by anatomical location in one terminology and by disease kind in the other terminology. We coined the term "alternative classification" (of the same parent concept) for such situations. In previous work, only human experts could recognize alternative classifications. In this paper, we present a mathematically expressed criterion for likely cases of alternative classifications. We compare the recommendations of this criterion, expressed by a mathematical quantity called "EFI" becoming zero, with the decisions of a human expert. It is found that the human expert agreed with the criterion in 72% of all cases, which is a big improvement over having no computable criterion at all. Besides alternative classifications, common parent concepts in a pair of terminologies might also indicate a possible import of a child concept missing in one terminology, different granularities, or errors in either one of the two terminologies. In this paper, we further investigate different kinds of alternative classifications.


Subject(s)
Parent-Child Relations , Terminology as Topic , Adult , Child , Humans , Semantics , Unified Medical Language System
5.
J Biomed Inform ; 83: 135-149, 2018 07.
Article in English | MEDLINE | ID: mdl-29852316

ABSTRACT

In previous research, we have demonstrated for a number of ontologies that structurally complex concepts (for different definitions of "complex") in an ontology are more likely to exhibit errors than other concepts. Thus, such complex concepts often become fertile ground for quality assurance (QA) in ontologies. They should be audited first. One example of complex concepts is given by "overlapping concepts" (to be defined below.) Historically, a different auditing methodology had to be developed for every single ontology. For better scalability and efficiency, it is desirable to identify family-wide QA methodologies. Each such methodology would be applicable to a whole family of similar ontologies. In past research, we had divided the 685 ontologies of BioPortal into families of structurally similar ontologies. We showed for four ontologies of the same large family in BioPortal that "overlapping concepts" are indeed statistically significantly more likely to exhibit errors. In order to make an authoritative statement concerning the success of "overlapping concepts" as a methodology for a whole family of similar ontologies (or of large subhierarchies of ontologies), it is necessary to show that "overlapping concepts" have a higher likelihood of errors for six out of six ontologies of the family. In this paper, we are demonstrating for two more ontologies that "overlapping concepts" can successfully predict groups of concepts with a higher error rate than concepts from a control group. The fifth ontology is the Neoplasm subhierarchy of the National Cancer Institute thesaurus (NCIt). The sixth ontology is the Infectious Disease subhierarchy of SNOMED CT. We demonstrate quality assurance results for both of them. Furthermore, in this paper we observe two novel, important, and useful phenomena during quality assurance of "overlapping concepts." First, an erroneous "overlapping concept" can help with discovering other erroneous "non-overlapping concepts" in its vicinity. 
Secondly, correcting erroneous "overlapping concepts" may turn them into "non-overlapping concepts." We demonstrate that this may reduce the complexity of parts of the ontology, which in turn makes the ontology more comprehensible, simplifying maintenance and use of the ontology.


Subject(s)
Biological Ontologies , Electronic Data Processing/methods , National Cancer Institute (U.S.) , Systematized Nomenclature of Medicine , United States , Vocabulary, Controlled
6.
J Biomed Inform ; 57: 278-87, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26260003

ABSTRACT

The Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) is an extensive reference terminology with an attendant amount of complexity. It has been updated continuously and revisions have been released semi-annually to meet users' needs and to reflect the results of quality assurance (QA) activities. Two measures based on structural features are proposed to track the effects of both natural terminology growth and QA activities based on aspects of the complexity of SNOMED CT. These two measures, called the structural density measure and accumulated structural measure, are derived based on two abstraction networks, the area taxonomy and the partial-area taxonomy. The measures derive from attribute relationship distributions and various concept groupings that are associated with the abstraction networks. They are used to track the trends in the complexity of structures as SNOMED CT changes over time. The measures were calculated for consecutive releases of five SNOMED CT hierarchies, including the Specimen hierarchy. The structural density measure shows that natural growth tends to move a hierarchy's structure toward a more complex state, whereas the accumulated structural measure shows that QA processes tend to move a hierarchy's structure toward a less complex state. It is also observed that both the structural density and accumulated structural measures are useful tools to track the evolution of an entire SNOMED CT hierarchy and reveal internal concept migration within it.


Subject(s)
Data Accuracy , Systematized Nomenclature of Medicine
7.
J Biomed Inform ; 47: 192-8, 2014 Feb.
Article in English | MEDLINE | ID: mdl-24239752

ABSTRACT

OBJECTIVE: To quantify the presence of and evaluate an approach for detection of inconsistencies in the formal definitions of SNOMED CT (SCT) concepts utilizing a lexical method. MATERIAL AND METHOD: Utilizing SCT's Procedure hierarchy, we algorithmically formulated similarity sets: groups of concepts with similar lexical structure of their fully specified name. We formulated five random samples, each with 50 similarity sets, based on the same parameter: number of parents, attributes, groups, or all of the former, as well as a randomly selected control sample. All samples' sets were reviewed for types of formal definition inconsistencies: hierarchical, attribute assignment, attribute target values, groups, and definitional. RESULTS: For the Procedure hierarchy, 2111 similarity sets were formulated, covering 18.1% of eligible concepts. The evaluation revealed that 38% (Control) to 70% (Different relationships) of similarity sets within the samples exhibited significant inconsistencies. The rate of inconsistencies for the sample with different relationships was highly significant compared to Control, as well as the number of attribute assignment and hierarchical inconsistencies within their respective samples. DISCUSSION AND CONCLUSION: While, at this time of the HITECH initiative, the formal definitions of SCT are only a minor consideration, in the grand scheme of sophisticated, meaningful use of captured clinical data, they are essential. However, a significant portion of the concepts in the most semantically complex hierarchy of SCT, the Procedure hierarchy, are modeled inconsistently in a manner that affects their computability. Lexical methods can efficiently identify such inconsistencies and possibly allow for their algorithmic resolution.


Subject(s)
Algorithms , Semantics , Systematized Nomenclature of Medicine , Humans , Meaningful Use , Myocardial Infarction/therapy , Myocardial Ischemia/therapy , Quality Assurance, Health Care , United States
8.
J Biomed Inform ; 45(6): 1042-8, 2012 Dec.
Article in English | MEDLINE | ID: mdl-22687822

ABSTRACT

Auditing healthcare terminologies for errors requires human experts. In this paper, we present a study of the performance of auditors looking for errors in the semantic type assignments of complex UMLS concepts. In this study, concepts are considered complex whenever they are assigned combinations of semantic types. Past research has shown that complex concepts have a higher likelihood of errors. The results of this study indicate that individual auditors are not reliable when auditing such concepts and their performance is low, according to various metrics. These results confirm the outcomes of an earlier pilot study. They imply that to achieve an acceptable level of reliability and performance, when auditing such concepts of the UMLS, several auditors need to be assigned the same task. A mechanism is then needed to combine the possibly differing opinions of the different auditors into a final determination. In the current study, in contrast to our previous work, we used a majority mechanism for this purpose. For a sample of 232 complex UMLS concepts, the majority opinion was found reliable and its performance for accuracy, recall, precision and the F-measure was found statistically significantly higher than the average performance of individual auditors.


Subject(s)
Semantics , Unified Medical Language System/standards , Humans , Reproducibility of Results , Terminology as Topic
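
The majority mechanism described in the abstract of entry 8 can be illustrated with a minimal sketch. The auditor votes and the reference judgments below are invented for illustration and are not data from the study; the function names are likewise hypothetical. The sketch combines per-concept votes by simple majority and then computes accuracy, precision, recall, and the F-measure against a reference standard.

```python
def majority_vote(votes):
    """Return True when more than half of the auditors flag the concept as erroneous."""
    return sum(votes) > len(votes) / 2


def score(predicted, reference):
    """Compute accuracy, precision, recall, and F-measure for boolean judgments."""
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p and r)
    fp = sum(1 for p, r in pairs if p and not r)
    fn = sum(1 for p, r in pairs if not p and r)
    tn = sum(1 for p, r in pairs if not p and not r)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Hypothetical data: one row per concept, one vote per auditor (True = "error").
auditor_votes = [
    [True, True, False],
    [False, False, True],
    [True, False, True],
    [False, False, False],
    [True, True, True],
    [False, True, False],
]
# Hypothetical reference standard for the same six concepts.
reference = [True, False, True, False, False, False]

majority = [majority_vote(votes) for votes in auditor_votes]
result = score(majority, reference)
print(majority)
print(result)
```

An odd number of auditors avoids ties; with an even number, the strict "more than half" rule used here treats a tie as "no error", which is one possible convention among several.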
9.
J Biomed Inform ; 45(1): 1-14, 2012 Feb.
Article in English | MEDLINE | ID: mdl-21907827

ABSTRACT

Auditors of a large terminology, such as SNOMED CT, face a daunting challenge. To aid them in their efforts, it is essential to devise techniques that can automatically identify concepts warranting special attention. "Complex" concepts, which by their very nature are more difficult to model, fall neatly into this category. A special kind of grouping, called a partial-area, is utilized in the characterization of complex concepts. In particular, the complex concepts that are the focus of this work are those appearing in intersections of multiple partial-areas and are thus referred to as overlapping concepts. In a companion paper, an automatic methodology for identifying and partitioning the entire collection of overlapping concepts into disjoint, singly-rooted groups, that are more manageable to work with and comprehend, has been presented. The partitioning methodology formed the foundation for the development of an abstraction network for the overlapping concepts called a disjoint partial-area taxonomy. This new disjoint partial-area taxonomy offers a collection of semantically uniform partial-areas and is exploited herein as the basis for a novel auditing methodology. The review of the overlapping concepts is done in a top-down order within semantically uniform groups. These groups are themselves reviewed in a top-down order, which proceeds from the less complex to the more complex overlapping concepts. The results of applying the methodology to SNOMED's Specimen hierarchy are presented. Hypotheses regarding error ratios for overlapping concepts and between different kinds of overlapping concepts are formulated. Two phases of auditing the Specimen hierarchy for two releases of SNOMED are reported on. With the use of the double bootstrap and Fisher's exact test (two-tailed), the auditing of concepts and especially roots of overlapping partial-areas is shown to yield a statistically significant higher proportion of errors.


Subject(s)
Systematized Nomenclature of Medicine , Models, Theoretical , Terminology as Topic
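
The abstract of entry 9 reports a two-tailed Fisher's exact test on error proportions. As a rough illustration of that test only (the double bootstrap is not covered), the sketch below computes the two-tailed p-value for a 2x2 table using the standard library. The counts in the usage example are invented and do not come from the paper.

```python
from math import comb


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins whose probability does not exceed that of the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Probability of seeing x in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    low = max(0, col1 - row2)
    high = min(row1, col1)
    # The small relative tolerance guards against floating-point ties.
    return sum(
        prob(x) for x in range(low, high + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts: erroneous vs. correct concepts in two audited groups.
overlapping = (12, 28)   # (errors, no errors) among overlapping concepts
control = (4, 36)        # (errors, no errors) among control concepts
p_value = fisher_exact_two_tailed(*overlapping, *control)
print(f"two-tailed p = {p_value:.4f}")
```

For routine work, an established statistics library would be the safer choice; this version exists only to make the computation visible.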
10.
Front Genet ; 13: 866169, 2022.
Article in English | MEDLINE | ID: mdl-35571025

ABSTRACT

The clinical value of population-based genetic screening projects depends on the actions taken on the findings. The Healthy Nevada Project (HNP) is an all-comer genetic screening and research project based in northern Nevada. HNP participants with CDC Tier 1 findings of hereditary breast and ovarian cancer syndrome (HBOC), Lynch syndrome (LS), or familial hypercholesterolemia (FH) are notified and provided with genetic counseling. However, the HNP subsequently takes a "hands-off" approach: it is the responsibility of notified participants to share their findings with their healthcare providers, and providers are expected to implement the recommended action plans. Thus, the HNP presents an opportunity to evaluate the efficiency of participant and provider responses to notification of important genetic findings, using electronic health records (EHRs) at Renown Health (a large regional hospital in northern Nevada). Out of 520 HNP participants with findings, we identified 250 participants who were notified of their findings and who had an EHR. 107 of these participants responded to a survey, with 76 (71%) indicating that they had shared their findings with their healthcare providers. However, a sufficiently specific genetic diagnosis appeared in the EHRs and problem lists of only 22 and 10%, respectively, of participants without prior knowledge. Furthermore, review of participant EHRs provided evidence of possible relevant changes in clinical care for only a handful of participants. Up to 19% of participants would have benefited from earlier screening due to prior presentation of their condition. These results suggest that continuous support for both participants and their providers is necessary to maximize the benefit of population-based genetic screening. We recommend that genetic screening projects require participants' consent to directly document their genetic findings in their EHRs. 
Additionally, we recommend that they provide healthcare providers with ongoing training regarding documentation of findings and with clinical decision support regarding subsequent care.

11.
J Expo Sci Environ Epidemiol ; 31(5): 797-803, 2021 09.
Article in English | MEDLINE | ID: mdl-34257389

ABSTRACT

BACKGROUND: Air pollution has been linked to increased susceptibility to SARS-CoV-2. Thus, it has been suggested that wildfire smoke events may exacerbate the COVID-19 pandemic. OBJECTIVES: Our goal was to examine whether wildfire smoke from the 2020 wildfires in the western United States was associated with an increased rate of SARS-CoV-2 infections in Reno, Nevada. METHODS: We conducted a time-series analysis using generalized additive models to examine the relationship between the SARS-CoV-2 test positivity rate at a large regional hospital in Reno and ambient PM2.5 from 15 May to 20 Oct 2020. RESULTS: We found that a 10 µg/m3 increase in the 7-day average PM2.5 concentration was associated with a 6.3% relative increase in the SARS-CoV-2 test positivity rate, with a 95% confidence interval (CI) of 2.5 to 10.3%. This corresponded to an estimated 17.7% (CI: 14.4-20.1%) increase in the number of cases during the time period most affected by wildfire smoke, from 16 Aug to 10 Oct. SIGNIFICANCE: Wildfire smoke may have greatly increased the number of COVID-19 cases in Reno. Thus, our results substantiate the role of air pollution in exacerbating the pandemic and can help guide the development of public preparedness policies in areas affected by wildfire smoke, as wildfires are likely to coincide with the COVID-19 pandemic in 2021.


Subject(s)
Air Pollutants , COVID-19 , Wildfires , Air Pollutants/adverse effects , Air Pollutants/analysis , Humans , Nevada , Pandemics , Particulate Matter/adverse effects , Particulate Matter/analysis , SARS-CoV-2 , Smoke/adverse effects , United States/epidemiology
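
To make the effect size in the abstract of entry 11 concrete, the sketch below rescales the reported 6.3% relative increase per 10 µg/m3 of PM2.5 to other concentration changes. It assumes a log-linear (multiplicative) effect, which is a common reading of such estimates but is an assumption here; the abstract does not state the link function of the generalized additive model, and the function name is hypothetical.

```python
import math

REPORTED_INCREASE = 0.063   # 6.3% relative increase, as reported in the abstract
REPORTED_STEP = 10.0        # per 10 µg/m3 of 7-day average PM2.5


def relative_increase(delta_pm25, per_step=REPORTED_INCREASE, step=REPORTED_STEP):
    """Relative change in test positivity for a PM2.5 change of delta_pm25 µg/m3.

    Assumes the effect is multiplicative on the rate (log-linear), so the
    per-unit coefficient is log(1 + per_step) / step.
    """
    beta = math.log(1.0 + per_step) / step
    return math.exp(beta * delta_pm25) - 1.0


for delta in (5, 10, 20):
    print(f"+{delta} µg/m3 -> {relative_increase(delta):.1%}")
```

Under this assumption the effect compounds rather than adds, so doubling the concentration change slightly more than doubles the relative increase.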
12.
Cell Death Dis ; 12(4): 310, 2021 03 24.
Article in English | MEDLINE | ID: mdl-33762578

ABSTRACT

SARS-CoV-2 is responsible for the ongoing world-wide pandemic which has already taken more than two million lives. Effective treatments are urgently needed. The enzymatic activity of the HECT-E3 ligase family members has been implicated in the cell egression phase of deadly RNA viruses such as Ebola through direct interaction of its VP40 Protein. Here we report that HECT-E3 ligase family members such as NEDD4 and WWP1 interact with and ubiquitylate the SARS-CoV-2 Spike protein. Furthermore, we find that HECT family members are overexpressed in primary samples derived from COVID-19 infected patients and COVID-19 mouse models. Importantly, rare germline activating variants in the NEDD4 and WWP1 genes are associated with severe COVID-19 cases. Critically, I3C, a natural NEDD4 and WWP1 inhibitor from Brassicaceae, displays potent antiviral effects and inhibits viral egression. In conclusion, we identify the HECT family members of E3 ligases as likely novel biomarkers for COVID-19, as well as new potential targets of therapeutic strategy easily testable in clinical trials in view of the established well-tolerated nature of the Brassicaceae natural compounds.


Subject(s)
COVID-19 Drug Treatment , COVID-19/enzymology , Ubiquitin-Protein Ligases/antagonists & inhibitors , Ubiquitin-Protein Ligases/metabolism , Adult , Aged , Animals , Antiviral Agents/pharmacology , COVID-19/genetics , COVID-19/metabolism , Chlorocebus aethiops , Endosomal Sorting Complexes Required for Transport/metabolism , Female , Humans , Indoles/pharmacology , Male , Mice , Mice, Inbred BALB C , Middle Aged , Nedd4 Ubiquitin Protein Ligases/genetics , Nedd4 Ubiquitin Protein Ligases/metabolism , SARS-CoV-2/isolation & purification , SARS-CoV-2/metabolism , Spike Glycoprotein, Coronavirus/metabolism , Ubiquitin-Protein Ligases/genetics , Ubiquitination , Vero Cells
13.
J Biomed Inform ; 43(6): 988-97, 2010 Dec.
Article in English | MEDLINE | ID: mdl-20692366

ABSTRACT

As the UMLS integrates multiple source vocabularies, the integration process requires that certain adaptation be applied to the source. Our interest is in examining the relationship between the UMLS representation of a source vocabulary and the source vocabulary itself. We investigated the integration of the Minimal Standard Terminology (MST) into the UMLS in order to examine how close its UMLS representation is to the source MST. The MST was conceived as a "minimal" list of terms and structure intended for use within computer systems to facilitate standardized reporting of gastrointestinal endoscopic examinations. Although the MST has an overall schema and implied relationship structure, many of the UMLS integrated MST terms were found to be hierarchically orphaned, and with lateral relationships that do not closely adhere to the source MST. Thus, the MST representation within the UMLS significantly differs from that of the source MST. These representation discrepancies may affect the usability of the MST representation in the UMLS for knowledge acquisition. Furthermore, they pose a problem from the perspective of application developers. While these findings may not necessarily apply to other source terminologies, they highlight the conflict between preservation of authentic concept orientation and the UMLS overall desire to provide fully specified names for all source terms.


Subject(s)
Computational Biology/methods , Terminology as Topic , Unified Medical Language System , Databases, Factual , Reference Standards , Vocabulary, Controlled
14.
G3 (Bethesda) ; 10(2): 645-664, 2020 02 06.
Article in English | MEDLINE | ID: mdl-31888951

ABSTRACT

The aggregation of Electronic Health Records (EHR) and personalized genetics leads to powerful discoveries relevant to population health. Here we perform genome-wide association studies (GWAS) and accompanying phenome-wide association studies (PheWAS) to validate phenotype-genotype associations of BMI, and to a greater extent, severe Class 2 obesity, using comprehensive diagnostic and clinical data from the EHR database of our cohort. Three GWASs of 500,000 variants on the Illumina platform of 6,645 Healthy Nevada participants identified several published and novel variants that affect BMI and obesity. Each GWAS was followed with two independent PheWASs to examine associations between extensive phenotypes (incidence of diagnoses, condition, or disease), significant SNPs, BMI, and incidence of extreme obesity. The first GWAS examines associations with BMI in a cohort with no type 2 diabetics, focusing exclusively on BMI. The second GWAS examines associations with BMI in a cohort that includes type 2 diabetics. In the second GWAS, type 2 diabetes is a comorbidity, and thus becomes a covariate in the statistical model. The intersection of significant variants of these two studies is surprising. The third GWAS is a case vs. control study, with cases defined as extremely obese (Class 2 or 3 obesity), and controls defined as participants with BMI between 18.5 and 25. This last GWAS identifies strong associations with extreme obesity, including established variants in the FTO and NEGR1 genes, as well as loci not yet linked to obesity. The PheWASs validate published associations between BMI and extreme obesity and incidence of specific diagnoses and conditions, yet also highlight novel links. This study emphasizes the importance of our extensive longitudinal EHR database to validate known associations and identify putative novel links with BMI and obesity.


Subject(s)
Body Mass Index , Genetic Predisposition to Disease , Genome-Wide Association Study , Obesity/etiology , Adult , Aged , Comorbidity , Databases, Genetic , Electronic Health Records , Female , Genetic Association Studies/methods , Genotype , Humans , Male , Middle Aged , Nevada/epidemiology , Obesity/diagnosis , Obesity/epidemiology , Phenotype , Polymorphism, Single Nucleotide
15.
Nat Commun ; 11(1): 542, 2020 Jan 28.
Article in English | MEDLINE | ID: mdl-31992710

ABSTRACT

Understanding the impact of rare variants is essential to understanding human health. We analyze rare (MAF < 0.1%) variants against 4264 phenotypes in 49,960 exome-sequenced individuals from the UK Biobank and 1934 phenotypes (1821 overlapping with UK Biobank) in 21,866 members of the Healthy Nevada Project (HNP) cohort who underwent Exome + sequencing at Helix. After using our rare-variant-tailored methodology to reduce test statistic inflation, we identify 64 statistically significant gene-based associations in our meta-analysis of the two cohorts and 37 for phenotypes available in only one cohort. Singletons make significant contributions to our results, and the vast majority of the associations could not have been identified with a genotyping chip. Our results are available for interactive browsing in a webapp (https://ukb.research.helix.com). This comprehensive analysis illustrates the biological value of large, deeply phenotyped cohorts of unselected populations coupled with NGS data.


Subject(s)
Exome/genetics , Genetic Variation , Genome, Human , Genome-Wide Association Study , Phenotype , Adolescent , Adult , Aged , Aged, 80 and over , Cohort Studies , Databases, Genetic , Europe , Female , Genetics, Population/statistics & numerical data , High-Throughput Nucleotide Sequencing , Humans , Male , Meta-Analysis as Topic , Middle Aged , Software , Exome Sequencing , Young Adult
16.
Nat Metab ; 2(10): 1126-1134, 2020 10.
Article in English | MEDLINE | ID: mdl-33046911

ABSTRACT

Genome-wide association studies have identified 240 independent loci associated with type 2 diabetes (T2D) risk, but this knowledge has not advanced precision medicine. In contrast, the genetic diagnosis of monogenic forms of diabetes (including maturity-onset diabetes of the young (MODY)) are textbook cases of genomic medicine. Recent studies trying to bridge the gap between monogenic diabetes and T2D have been inconclusive. Here, we show a significant burden of pathogenic variants in genes linked with monogenic diabetes among people with common T2D, particularly in actionable MODY genes, thus implying that there should be a substantial change in care for carriers with T2D. We show that, among 74,629 individuals, this burden is probably driven by the pathogenic variants found in GCK, and to a lesser extent in HNF4A, KCNJ11, HNF1B and ABCC8. The carriers with T2D are leaner, which evidences a functional metabolic effect of these mutations. Pathogenic variants in actionable MODY genes are more frequent than was previously expected in common T2D. These results open avenues for future interventions assessing the clinical interest of these pathogenic mutations in precision medicine.


Subject(s)
Diabetes Mellitus, Type 2/genetics , Computational Biology , Female , Genetic Variation , Genome-Wide Association Study , Germinal Center Kinases/genetics , Heterozygote , Humans , Male , Middle Aged , Mutation
17.
J Biomed Inform ; 42(3): 550-7, 2009 Jun.
Article in English | MEDLINE | ID: mdl-19475727

ABSTRACT

The Foundational Model of Anatomy (FMA) ontology is a domain reference ontology based on a disciplined modeling approach. Due to its large size, semantic complexity and manual data entry process, errors and inconsistencies are unavoidable and might remain within the FMA structure without detection. In this paper, we present computable methods to highlight candidate concepts for various relationship assignment errors. The process starts with locating structures formed by transitive structural relationships (part_of, tributary_of, branch_of) and examining their assignments in the context of the IS-A hierarchy. The algorithms were designed to detect five major categories of possible incorrect relationship assignments: circular, mutually exclusive, redundant, inconsistent, and missed entries. A domain expert reviewed samples of these presumptive errors to confirm the findings. Seven thousand and fifty-two presumptive errors were detected, the largest proportion related to part_of relationship assignments. The results highlight the fact that errors are unavoidable in complex ontologies and that well-designed algorithms can help domain experts to focus on concepts with a high likelihood of errors and maximize their effort to ensure consistency and reliability. In the future, similar methods might be integrated with data entry processes to offer real-time error detection.
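
Two of the error categories named in this abstract, circular and redundant assignments, can be illustrated with a small graph check. The following Python sketch is not the authors' algorithm: the function name, the edge representation, and the toy part_of pairs are all assumptions made for illustration. It treats one transitive relationship as directed (child, parent) edges, flags a node as circular if it can reach itself, and flags an edge as redundant if its parent is still reachable after that edge is ignored.

```python
from collections import defaultdict


def find_circular_and_redundant(edges):
    """Hypothetical helper: edges are (child, parent) pairs of one transitive relation."""
    edges = list(edges)
    graph = defaultdict(set)
    for child, parent in edges:
        graph[child].add(parent)

    def reachable(start, skip_edge=None):
        # Depth-first search over the relation, optionally ignoring one edge.
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for nxt in graph.get(node, ()):
                if (node, nxt) == skip_edge:
                    continue
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    # Circular: a concept that is (transitively) related to itself.
    circular = sorted(n for n in list(graph) if n in reachable(n))
    # Redundant: a direct edge already implied by a longer path.
    redundant = sorted(
        (c, p) for c, p in set(edges) if p in reachable(c, skip_edge=(c, p))
    )
    return circular, redundant


# Toy part_of assignments (invented for this example, not taken from the FMA).
toy_edges = [
    ("finger", "hand"),
    ("hand", "arm"),
    ("finger", "arm"),  # implied by finger -> hand -> arm
    ("a", "b"),
    ("b", "a"),         # forms a cycle with the previous edge
]
circular, redundant = find_circular_and_redundant(toy_edges)
```

Candidates found this way would still need review by a domain expert, as the abstract describes; the sketch only narrows down where to look.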


Subject(s)
Management Audit , Terminology as Topic , Algorithms
18.
PLoS One ; 14(6): e0218078, 2019.
Article in English | MEDLINE | ID: mdl-31194788

ABSTRACT

In this study, we perform a full genome-wide association study (GWAS) to identify single nucleotide polymorphisms (SNPs) that are statistically significantly associated with three red blood cell (RBC) components and follow it with two independent PheWASs to examine associations between phenotypic data (case-control status of diagnoses or disease), significant SNPs, and RBC component levels. We first identified associations between the three RBC components, mean platelet volume (MPV), mean corpuscular volume (MCV), and platelet counts (PC), and the genotypes of approximately 500,000 SNPs on the Illumina Infinium DNA Human OmniExpress-24 BeadChip using a single cohort of 4,673 Northern Nevadans. Twenty-one SNPs in five major genomic regions were found to be statistically significantly associated with MPV, two regions with MCV, and one region with PC, with p < 5×10⁻⁸. Twenty-nine SNPs and nine chromosomal regions were identified in 30 previous GWASs, with effect sizes of similar magnitude and direction as found in our cohort. The two strongest associations were SNP rs1354034 with MPV (p = 2.4×10⁻¹³) and rs855791 with MCV (p = 5.2×10⁻¹²). We then examined possible associations between these significant SNPs and incidence of 1,488 phenotype groups mapped from International Classification of Diseases versions 9 and 10 (ICD9 and ICD10) codes collected in the extensive electronic health record (EHR) database associated with Healthy Nevada Project consented participants. Further leveraging data collected in the EHR, we performed an additional PheWAS to identify associations between continuous RBC component measures and incidence of specific diagnoses. The first PheWAS illuminated whether SNPs associated with RBC components in our cohort were linked with other hematologic phenotypic diagnoses or diagnoses of other nature.
Although no SNPs from our GWAS were identified as strongly associated with other phenotypic components, a number of associations were identified with p-values ranging between 1×10⁻³ and 1×10⁻⁴ with traits such as respiratory failure, sleep disorders, hypoglycemia, hyperglyceridemia, GERD and IBS. The second PheWAS examined possible phenotypic predictors of abnormal RBC component measures: a number of hematologic phenotypes such as thrombocytopenia, anemias, hemoglobinopathies and pancytopenia were found to be strongly associated with RBC component measures; additional phenotypes such as (morbid) obesity, malaise and fatigue, alcoholism, and cirrhosis were also identified as possible predictors of RBC component measures.


Subject(s)
Erythrocytes/cytology , Genome-Wide Association Study , Phenotype , Adult , Chromosome Mapping , Cohort Studies , Female , Genotype , Humans , Male , Middle Aged , Nevada , Polymorphism, Single Nucleotide
19.
Methods Inf Med ; 57(1): 43-53, 2018 02.
Article in English | MEDLINE | ID: mdl-29621830

ABSTRACT

BACKGROUND: The UMLS assigns semantic types to all its integrated concepts. The semantic types are widely used in various natural language processing tasks in the biomedical domain, such as named entity recognition, semantic disambiguation, and semantic annotation. Due to the size of the UMLS, erroneous semantic type assignments are hard to detect. It is imperative to devise automated techniques to identify errors and inconsistencies in semantic type assignments. OBJECTIVES: Designing a methodology to perform programmatic checks to detect semantic type assignment errors for UMLS concepts with one or more SNOMED CT terms and evaluating concepts in a selected set of SNOMED CT hierarchies to verify our hypothesis that UMLS semantic type assignment errors may exist in concepts residing in semantically inconsistent groups. METHODS: Our methodology is a four-stage process: 1) partitioning concepts in a SNOMED CT hierarchy into semantically uniform groups based on their assigned semantic tags; 2) partitioning concepts in each group from 1) into disjoint sub-groups based on their semantic type assignments; 3) mapping all SNOMED CT semantic tags into one or more semantic types in the UMLS; 4) identifying semantically inconsistent groups that have inconsistent assignments between semantic tags and semantic types according to the mapping from 3) and providing concepts in such groups to the domain experts for review. RESULTS: We applied our method to the UMLS 2013AA release. Concepts of the semantically inconsistent groups in the PHYSICAL FORCE and RECORD ARTIFACT hierarchies have error rates of 33% and 62.5%, respectively, which are far higher than the error rates of 0.6% and 1% in semantically consistent groups of the two hierarchies. CONCLUSION: Concepts in semantically inconsistent groups are more likely to contain semantic type assignment errors.
Our methodology can make auditing more efficient by focusing auditing resources on concepts of semantically inconsistent groups.
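
The four stages listed under METHODS can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name, the tuple layout of a concept, the toy concepts, and the tag-to-type mapping are all invented, and "inconsistent" is assumed here to mean that a sub-group carries a semantic type outside the set mapped from its semantic tag.

```python
from collections import defaultdict


def find_inconsistent_groups(concepts, tag_to_types):
    """Hypothetical helper.

    concepts: iterable of (name, semantic_tag, semantic_types) tuples.
    tag_to_types: stage 3 mapping from a semantic tag to allowed semantic types.
    """
    # Stages 1 and 2: group by semantic tag, then by the exact set of semantic types.
    groups = defaultdict(list)
    for name, tag, types in concepts:
        groups[(tag, frozenset(types))].append(name)

    # Stage 4: flag sub-groups whose types fall outside the mapping for their tag.
    flagged = {}
    for (tag, types), names in groups.items():
        allowed = set(tag_to_types.get(tag, ()))
        if not types <= allowed:
            flagged[(tag, types)] = sorted(names)
    return flagged


# Toy data, invented for illustration only.
toy_concepts = [
    ("Gravity", "physical force", {"Natural Phenomenon or Process"}),
    ("Friction", "physical force", {"Natural Phenomenon or Process"}),
    ("Pressure", "physical force", {"Finding"}),
]
toy_mapping = {"physical force": {"Natural Phenomenon or Process"}}
flagged = find_inconsistent_groups(toy_concepts, toy_mapping)
```

The flagged sub-groups correspond to the sets of concepts that would be handed to domain experts for review in the final stage.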


Subject(s)
Semantics , Systematized Nomenclature of Medicine , Unified Medical Language System , Artifacts , Reproducibility of Results
20.
Article in English | MEDLINE | ID: mdl-29375930

ABSTRACT

The Unified Medical Language System (UMLS) is an important terminological system. By the policy of its curators, each concept of the UMLS should be assigned the most specific Semantic Types (STs) in the UMLS Semantic Network (SN). Hence, the Semantic Types of most UMLS concepts are assigned at or near the bottom (leaves) of the UMLS Semantic Network. While most ST assignments are correct, some errors do occur. Therefore, Quality Assurance efforts of UMLS curators for ST assignments should concentrate on automatically detected sets of UMLS concepts with higher error rates than random sets. In this paper, we investigate the assignments of top-level semantic types in the UMLS semantic network to concepts, identify potentially erroneous assignments, define four categories of errors, and thus provide assistance to curators of the UMLS to avoid these assignment errors. Human experts analyzed samples of concepts assigned 10 of the top-level semantic types and categorized the erroneous ST assignments into these four logical categories. Two thirds of the concepts assigned these 10 top-level semantic types were assigned erroneously. Our results demonstrate that reviewing top-level semantic type assignments to concepts provides an effective way to perform UMLS quality assurance, compared with reviewing a random selection of semantic type assignments.
