1.
Genet Med ; 25(4): 100012, 2023 04.
Article in English | MEDLINE | ID: mdl-36637017

ABSTRACT

PURPOSE: TTN truncating variants (TTNtvs) represent the largest known genetic cause of dilated cardiomyopathies (DCMs); however, their penetrance for DCM in general populations is low. More broadly, patients with cardiomyopathies (CMs) often exhibit other cardiac conditions, such as atrial fibrillation (Afib), which has also been linked to TTNtvs. This retrospective analysis aims to characterize the relationship between different cardiac conditions in those with TTNtvs and identify individuals with the highest risk of DCM. METHODS: In this work we leverage longitudinal electronic health record and exome sequencing data from approximately 450,000 individuals in 2 health systems to statistically confirm and pinpoint the genetic footprint of TTNtv-related diagnoses aside from CM, such as Afib, and determine whether vetting additional significantly associated phenotypes better stratifies CM risk across those with TTNtvs. We focused on TTNtvs in exons with a percentage spliced in >90% (hiPSI TTNtvs), a representation of constitutive cardiac expression. RESULTS: When controlling for CM and Afib, other cardiac conditions retained only nominal association with TTNtvs. A sliding window analysis of TTNtvs across the locus confirms that the association is specific to hiPSI exons for both CM and Afib, with no meaningful associations in exons with a percentage spliced in ≤90% (loPSI TTNtvs). The combination of hiPSI TTNtv status and early Afib diagnosis (before age 60) identified a subset of TTNtv individuals at high risk for CM. The prevalence of CM in this subset was 33%, a rate that was 3.5-fold higher than that in individuals with hiPSI TTNtvs (9% prevalence), 5-fold higher than that in individuals without TTNtvs with early Afib (6% prevalence), and 80-fold higher than that in the general population.
CONCLUSION: Our retrospective analyses revealed that those with hiPSI TTNtvs and early Afib (∼1/2900) have a high prevalence of CM (33%), far exceeding that in other individuals with TTNtvs and in those without TTNtvs with an early Afib diagnosis. These results show that combining phenotypic information along with genomic population screening can identify patients at higher risk for progressing to symptomatic heart failure.


Subject(s)
Atrial Fibrillation , Cardiomyopathies , Cardiomyopathy, Dilated , Heart Diseases , Humans , Atrial Fibrillation/epidemiology , Atrial Fibrillation/genetics , Retrospective Studies , Prevalence , Cardiomyopathies/epidemiology , Cardiomyopathies/genetics , Connectin/genetics , Connectin/metabolism , Cardiomyopathy, Dilated/epidemiology , Cardiomyopathy, Dilated/genetics
2.
BMC Med Inform Decis Mak ; 23(Suppl 1): 40, 2023 02 24.
Article in English | MEDLINE | ID: mdl-36829139

ABSTRACT

BACKGROUND: Two years into the COVID-19 pandemic and with more than five million deaths worldwide, the healthcare establishment continues to struggle with every new wave of the pandemic resulting from a new coronavirus variant. Research has demonstrated that there are variations in the symptoms, and even in the order of symptom presentations, in COVID-19 patients infected by different SARS-CoV-2 variants (e.g., Alpha and Omicron). Textual data in the form of admission notes and physician notes in the Electronic Health Records (EHRs) is rich in information regarding the symptoms and their orders of presentation. Unstructured EHR data is often underutilized in research due to the lack of annotations that enable automatic extraction of useful information from the available extensive volumes of textual data. METHODS: We present the design of a COVID Interface Terminology (CIT), not just a generic COVID-19 terminology, but one serving a specific purpose of enabling automatic annotation of EHRs of COVID-19 patients. CIT was constructed by integrating existing COVID-related ontologies and mining additional fine granularity concepts from clinical notes. The iterative mining approach utilized the techniques of 'anchoring' and 'concatenation' to identify potential fine granularity concepts to be added to the CIT. We also tested the generalizability of our approach on a hold-out dataset and compared the annotation coverage to the coverage obtained for the dataset used to build the CIT. RESULTS: Our experiments demonstrate that this approach results in higher annotation coverage compared to existing ontologies such as SNOMED CT and Coronavirus Infectious Disease Ontology (CIDO). The final version of CIT achieved about 20% more coverage than SNOMED CT and 50% more coverage than CIDO. 
In the future, the concepts mined and added into CIT could be used as training data for machine learning models for mining even more concepts into CIT and further increasing the annotation coverage. CONCLUSION: In this paper, we demonstrated the construction of a COVID interface terminology that can be utilized for automatically annotating EHRs of COVID-19 patients. The techniques presented can identify frequently documented fine granularity concepts that are missing in other ontologies thereby increasing the annotation coverage.


Subject(s)
COVID-19 , Electronic Health Records , Humans , Pandemics , SARS-CoV-2
3.
Environ Health ; 19(1): 92, 2020 08 27.
Article in English | MEDLINE | ID: mdl-32854703

ABSTRACT

BACKGROUND: Health risks due to particulate matter (PM) from wildfires may differ from risk due to PM from other sources. In places frequently subjected to wildfire smoke, such as Reno, Nevada, it is critical to determine whether wildfire PM poses unique risks. Our goal was to quantify the difference in the association of adverse asthma events with PM on days when wildfire smoke was present versus days when wildfire smoke was not present. METHODS: We obtained counts of visits for asthma at emergency departments and urgent care centers from a large regional healthcare system in Reno for the years 2013-2018. We also obtained dates when wildfire smoke was present from the Washoe County Health District Air Quality Management Division. We then examined whether the presence of wildfire smoke modified the association of PM2.5, PM10-2.5, and PM10 with asthma visits using generalized additive models. We improved on previous studies by excluding wildfire-smoke days where the PM concentration exceeded the maximum PM concentration on other days, thus accounting for possible nonlinearity in the association between PM concentration and asthma visits. RESULTS: Air quality was affected by wildfire smoke on 188 days between 2013 and 2018. We found that the presence of wildfire smoke increased the association of a 5 µg/m3 increase in daily and three-day averages of PM2.5 with asthma visits by 6.1% (95% confidence interval (CI): 2.1-10.3%) and 6.8% (CI: 1.2-12.7%), respectively. Similarly, the presence of wildfire smoke increased the association of a 5 µg/m3 increase in daily and three-day averages of PM10 with asthma visits by 5.5% (CI: 2.5-8.6%) and 7.2% (CI: 2.6-12.0%), respectively. We did not observe any significant increases in association for PM10-2.5 or for seven-day averages of PM2.5 and PM10. 
CONCLUSIONS: Since we found significantly stronger associations of PM2.5 and PM10 with asthma visits when wildfire smoke was present, our results suggest that wildfire PM is more hazardous than non-wildfire PM for patients with asthma.


Subject(s)
Asthma/epidemiology , Emergency Service, Hospital/statistics & numerical data , Environmental Exposure/adverse effects , Hospitalization/statistics & numerical data , Particulate Matter/adverse effects , Smoke/adverse effects , Wildfires , Asthma/chemically induced , Cities , Nevada/epidemiology , Particulate Matter/analysis
4.
J Biomed Inform ; 94: 103193, 2019 06.
Article in English | MEDLINE | ID: mdl-31048072

ABSTRACT

In previous research, we have studied concepts that occur in pairs of medical terminologies and are known to be identical, because they have the same ID number in the Unified Medical Language System (UMLS). We observed that such concepts rarely have exactly the same sets of children (=subconcepts) in the two terminologies. The number of common children was found to vary widely. A special situation was identified where the children in one terminology relate to the common parent in a very different way than the children in the other terminology. For example, children might subdivide a parent concept by anatomical location in one terminology and by disease kind in the other terminology. We coined the term "alternative classification" (of the same parent concept) for such situations. In previous work, only human experts could recognize alternative classifications. In this paper, we present a mathematically expressed criterion for likely cases of alternative classifications. We compare the recommendations of this criterion, expressed by a mathematical quantity called "EFI" becoming zero, with the decisions of a human expert. It is found that the human expert agreed with the criterion in 72% of all cases, which is a big improvement over having no computable criterion at all. Besides alternative classifications, common parent concepts in a pair of terminologies might also indicate a possible import of a child concept missing in one terminology, different granularities, or errors in either one of the two terminologies. In this paper, we further investigate different kinds of alternative classifications.


Subject(s)
Parent-Child Relations , Terminology as Topic , Adult , Child , Humans , Semantics , Unified Medical Language System
5.
J Biomed Inform ; 83: 135-149, 2018 07.
Article in English | MEDLINE | ID: mdl-29852316

ABSTRACT

In previous research, we have demonstrated for a number of ontologies that structurally complex concepts (for different definitions of "complex") in an ontology are more likely to exhibit errors than other concepts. Thus, such complex concepts often become fertile ground for quality assurance (QA) in ontologies. They should be audited first. One example of complex concepts is given by "overlapping concepts" (to be defined below). Historically, a different auditing methodology had to be developed for every single ontology. For better scalability and efficiency, it is desirable to identify family-wide QA methodologies. Each such methodology would be applicable to a whole family of similar ontologies. In past research, we had divided the 685 ontologies of BioPortal into families of structurally similar ontologies. We showed for four ontologies of the same large family in BioPortal that "overlapping concepts" are indeed statistically significantly more likely to exhibit errors. In order to make an authoritative statement concerning the success of "overlapping concepts" as a methodology for a whole family of similar ontologies (or of large subhierarchies of ontologies), it is necessary to show that "overlapping concepts" have a higher likelihood of errors for six out of six ontologies of the family. In this paper, we demonstrate for two more ontologies that "overlapping concepts" can successfully predict groups of concepts with a higher error rate than concepts from a control group. The fifth ontology is the Neoplasm subhierarchy of the National Cancer Institute thesaurus (NCIt). The sixth ontology is the Infectious Disease subhierarchy of SNOMED CT. We demonstrate quality assurance results for both of them. Furthermore, in this paper we observe two novel, important, and useful phenomena during quality assurance of "overlapping concepts." First, an erroneous "overlapping concept" can help with discovering other erroneous "non-overlapping concepts" in its vicinity.
Secondly, correcting erroneous "overlapping concepts" may turn them into "non-overlapping concepts." We demonstrate that this may reduce the complexity of parts of the ontology, which in turn makes the ontology more comprehensible, simplifying maintenance and use of the ontology.


Subject(s)
Biological Ontologies , Electronic Data Processing/methods , National Cancer Institute (U.S.) , Systematized Nomenclature of Medicine , United States , Vocabulary, Controlled
6.
J Biomed Inform ; 57: 278-87, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26260003

ABSTRACT

The Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) is an extensive reference terminology with an attendant amount of complexity. It has been updated continuously and revisions have been released semi-annually to meet users' needs and to reflect the results of quality assurance (QA) activities. Two measures based on structural features are proposed to track the effects of both natural terminology growth and QA activities based on aspects of the complexity of SNOMED CT. These two measures, called the structural density measure and accumulated structural measure, are derived based on two abstraction networks, the area taxonomy and the partial-area taxonomy. The measures derive from attribute relationship distributions and various concept groupings that are associated with the abstraction networks. They are used to track the trends in the complexity of structures as SNOMED CT changes over time. The measures were calculated for consecutive releases of five SNOMED CT hierarchies, including the Specimen hierarchy. The structural density measure shows that natural growth tends to move a hierarchy's structure toward a more complex state, whereas the accumulated structural measure shows that QA processes tend to move a hierarchy's structure toward a less complex state. It is also observed that both the structural density and accumulated structural measures are useful tools to track the evolution of an entire SNOMED CT hierarchy and reveal internal concept migration within it.


Subject(s)
Data Accuracy , Systematized Nomenclature of Medicine
7.
J Biomed Inform ; 47: 192-8, 2014 Feb.
Article in English | MEDLINE | ID: mdl-24239752

ABSTRACT

OBJECTIVE: To quantify the presence of and evaluate an approach for detection of inconsistencies in the formal definitions of SNOMED CT (SCT) concepts utilizing a lexical method. MATERIAL AND METHOD: Utilizing SCT's Procedure hierarchy, we algorithmically formulated similarity sets: groups of concepts with similar lexical structure of their fully specified name. We formulated five random samples, each with 50 similarity sets, based on the same parameter (number of parents, attributes, groups, or all of the former), as well as a randomly selected control sample. All samples' sets were reviewed for types of formal definition inconsistencies: hierarchical, attribute assignment, attribute target values, groups, and definitional. RESULTS: For the Procedure hierarchy, 2111 similarity sets were formulated, covering 18.1% of eligible concepts. The evaluation revealed that 38% (Control) to 70% (Different relationships) of similarity sets within the samples exhibited significant inconsistencies. The rate of inconsistencies for the sample with different relationships was highly significant compared to Control, as well as the number of attribute assignment and hierarchical inconsistencies within their respective samples. DISCUSSION AND CONCLUSION: While, at this time of the HITECH initiative, the formal definitions of SCT are only a minor consideration, in the grand scheme of sophisticated, meaningful use of captured clinical data, they are essential. However, a significant portion of the concepts in the most semantically complex hierarchy of SCT, the Procedure hierarchy, are modeled inconsistently in a manner that affects their computability. Lexical methods can efficiently identify such inconsistencies and possibly allow for their algorithmic resolution.


Subject(s)
Algorithms , Semantics , Systematized Nomenclature of Medicine , Humans , Meaningful Use , Myocardial Infarction/therapy , Myocardial Ischemia/therapy , Quality Assurance, Health Care , United States
8.
JAMA Netw Open ; 7(9): e2435901, 2024 Sep 03.
Article in English | MEDLINE | ID: mdl-39320887

ABSTRACT

Importance: Most patients with pathogenic or likely pathogenic (P/LP) variants for breast cancer have not undergone genetic testing. Objective: To identify patients meeting family history criteria for genetic testing in the electronic health record (EHR). Design, Setting, and Participants: This study included both cross-sectional (observation date, February 1, 2024) and retrospective cohort (observation period, January 1, 2018, to February 1, 2024) analyses. Participants included patients aged 18 to 79 years enrolled in Renown Health, a large health system in Northern Nevada. Genotype was known for 38 003 patients enrolled in Healthy Nevada Project (HNP), a population genomics study. Exposure: An EHR indicating that a patient is positive for criteria according to the Seven-Question Family History Questionnaire (hereafter, FHS7 positive) assessing familial risk for hereditary breast and ovarian cancer (HBOC). Main Outcomes and Measures: The primary outcomes were the presence of P/LP variants in the ATM, BRCA1, BRCA2, CHEK2, or PALB2 genes (cross-sectional analysis) or a diagnosis of cancer (cohort analysis). Age-adjusted cancer incidence rates per 100 000 patients per year were calculated using the 2020 US population as the standard. Hazard ratios (HRs) for cancer attributable to FHS7-positive status were estimated using cause-specific hazard models. Results: Among 835 727 patients, 423 393 (50.7%) were female and 29 913 (3.6%) were FHS7 positive. Among those who were FHS7 positive, 24 535 (82.0%) had no evidence of prior genetic testing for HBOC in their EHR. Being FHS7 positive was associated with increased prevalence of P/LP variants in BRCA1/BRCA2 (odds ratio [OR], 3.34; 95% CI, 2.48-4.47), CHEK2 (OR, 1.62; 95% CI, 1.05-2.43), and PALB2 (OR, 2.84; 95% CI, 1.23-6.16) among HNP female individuals, and in BRCA1/BRCA2 (OR, 3.35; 95% CI, 1.93-5.56) among HNP male individuals. 
Being FHS7 positive was also associated with significantly increased risk of cancer among 131 622 non-HNP female individuals (HR, 1.44; 95% CI, 1.22-1.70) but not among 114 982 non-HNP male individuals (HR, 1.11; 95% CI, 0.87-1.42). Among 1527 HNP survey respondents, 352 of 383 EHR-FHS7 positive patients (91.9%) were survey-FHS7 positive, but only 352 of 883 survey-FHS7 positive patients (39.9%) were EHR-FHS7 positive. Of the 29 913 FHS7-positive patients, 19 764 (66.1%) were identified only after parsing free-text family history comments. Socioeconomic differences were also observed between EHR-FHS7-negative and EHR-FHS7-positive patients, suggesting disparities in recording family history. Conclusions and Relevance: In this cross-sectional study, EHR-derived FHS7 identified thousands of patients with familial risk for breast cancer, indicating a substantial gap in genetic testing. However, limitations in EHR family history data suggested that other identification methods, such as direct-to-patient questionnaires, are required to fully address this gap.
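
The proportions reported in this abstract can be recomputed directly from the quoted counts. The short Python sketch below does so; the helper `pct` and all variable names are illustrative choices, not taken from the study, and reading the survey as the reference against which the EHR flag is compared is an assumption made here only to label the two overlap figures.

```python
def pct(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


# Counts quoted in the abstract (variable names are illustrative).
both_positive = 352          # FHS7 positive in both EHR and survey
ehr_positive = 383           # EHR-FHS7 positive among survey respondents
survey_positive = 883        # survey-FHS7 positive among survey respondents
fhs7_positive_total = 29_913  # all EHR-FHS7-positive patients
free_text_only = 19_764      # found only by parsing free-text comments
no_prior_testing = 24_535    # no evidence of prior HBOC genetic testing

# Share of EHR-positive respondents who were also survey positive.
ehr_confirmed_by_survey = pct(both_positive, ehr_positive)
# Share of survey-positive respondents whom the EHR also flagged.
survey_captured_by_ehr = pct(both_positive, survey_positive)

free_text_share = pct(free_text_only, fhs7_positive_total)
untested_share = pct(no_prior_testing, fhs7_positive_total)

print(ehr_confirmed_by_survey, survey_captured_by_ehr)
print(free_text_share, untested_share)
```

The asymmetry between the two overlap percentages is what underlies the abstract's point that EHR family history data alone miss many patients whom a direct questionnaire would identify.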


Subject(s)
Genetic Predisposition to Disease , Genetic Testing , Ovarian Neoplasms , Humans , Female , Middle Aged , Adult , Cross-Sectional Studies , Retrospective Studies , Aged , Genetic Testing/methods , Genetic Testing/statistics & numerical data , Ovarian Neoplasms/genetics , Ovarian Neoplasms/epidemiology , Breast Neoplasms/genetics , Breast Neoplasms/epidemiology , Breast Neoplasms/diagnosis , Nevada/epidemiology , Young Adult , Hereditary Breast and Ovarian Cancer Syndrome/genetics , Hereditary Breast and Ovarian Cancer Syndrome/epidemiology , Hereditary Breast and Ovarian Cancer Syndrome/diagnosis , Adolescent , Male , Fanconi Anemia Complementation Group N Protein
9.
J Biomed Inform ; 45(6): 1042-8, 2012 Dec.
Article in English | MEDLINE | ID: mdl-22687822

ABSTRACT

Auditing healthcare terminologies for errors requires human experts. In this paper, we present a study of the performance of auditors looking for errors in the semantic type assignments of complex UMLS concepts. In this study, concepts are considered complex whenever they are assigned combinations of semantic types. Past research has shown that complex concepts have a higher likelihood of errors. The results of this study indicate that individual auditors are not reliable when auditing such concepts and their performance is low, according to various metrics. These results confirm the outcomes of an earlier pilot study. They imply that to achieve an acceptable level of reliability and performance, when auditing such concepts of the UMLS, several auditors need to be assigned the same task. A mechanism is then needed to combine the possibly differing opinions of the different auditors into a final determination. In the current study, in contrast to our previous work, we used a majority mechanism for this purpose. For a sample of 232 complex UMLS concepts, the majority opinion was found reliable and its performance for accuracy, recall, precision and the F-measure was found statistically significantly higher than the average performance of individual auditors.
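
A majority mechanism of the kind described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the abstract does not state the number of auditors or how ties are handled, so the strict-majority rule, the function names, and the toy votes and reference labels below are all hypothetical.

```python
def majority_vote(votes: list[bool]) -> bool:
    """True if strictly more than half of the auditors flagged an error.

    Assumption: ties count as 'no error'; the abstract does not specify this.
    """
    return 2 * sum(votes) > len(votes)


def evaluate(predicted: list[bool], reference: list[bool]) -> dict[str, float]:
    """Accuracy, precision, recall and F-measure against a reference standard."""
    pairs = list(zip(predicted, reference))
    tp = sum(p and r for p, r in pairs)          # flagged, truly erroneous
    fp = sum(p and not r for p, r in pairs)      # flagged, actually correct
    fn = sum(not p and r for p, r in pairs)      # missed errors
    tn = sum(not p and not r for p, r in pairs)  # correctly left unflagged
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Hypothetical toy data: three auditors judging four concepts.
votes_per_concept = [
    [True, True, False],
    [False, False, True],
    [True, False, True],
    [False, False, False],
]
reference = [True, False, False, False]  # hypothetical reference standard

consensus = [majority_vote(v) for v in votes_per_concept]
scores = evaluate(consensus, reference)
print(consensus, scores)
```

Applying `evaluate` to each auditor's individual judgments and averaging the results would give the per-auditor baseline against which the abstract compares the majority opinion.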


Subject(s)
Semantics , Unified Medical Language System/standards , Humans , Reproducibility of Results , Terminology as Topic
10.
J Biomed Inform ; 45(1): 1-14, 2012 Feb.
Article in English | MEDLINE | ID: mdl-21907827

ABSTRACT

Auditors of a large terminology, such as SNOMED CT, face a daunting challenge. To aid them in their efforts, it is essential to devise techniques that can automatically identify concepts warranting special attention. "Complex" concepts, which by their very nature are more difficult to model, fall neatly into this category. A special kind of grouping, called a partial-area, is utilized in the characterization of complex concepts. In particular, the complex concepts that are the focus of this work are those appearing in intersections of multiple partial-areas and are thus referred to as overlapping concepts. In a companion paper, an automatic methodology for identifying and partitioning the entire collection of overlapping concepts into disjoint, singly-rooted groups, that are more manageable to work with and comprehend, has been presented. The partitioning methodology formed the foundation for the development of an abstraction network for the overlapping concepts called a disjoint partial-area taxonomy. This new disjoint partial-area taxonomy offers a collection of semantically uniform partial-areas and is exploited herein as the basis for a novel auditing methodology. The review of the overlapping concepts is done in a top-down order within semantically uniform groups. These groups are themselves reviewed in a top-down order, which proceeds from the less complex to the more complex overlapping concepts. The results of applying the methodology to SNOMED's Specimen hierarchy are presented. Hypotheses regarding error ratios for overlapping concepts and between different kinds of overlapping concepts are formulated. Two phases of auditing the Specimen hierarchy for two releases of SNOMED are reported on. With the use of the double bootstrap and Fisher's exact test (two-tailed), the auditing of concepts and especially roots of overlapping partial-areas is shown to yield a statistically significant higher proportion of errors.


Subject(s)
Systematized Nomenclature of Medicine , Models, Theoretical , Terminology as Topic
11.
Front Genet ; 13: 866169, 2022.
Article in English | MEDLINE | ID: mdl-35571025

ABSTRACT

The clinical value of population-based genetic screening projects depends on the actions taken on the findings. The Healthy Nevada Project (HNP) is an all-comer genetic screening and research project based in northern Nevada. HNP participants with CDC Tier 1 findings of hereditary breast and ovarian cancer syndrome (HBOC), Lynch syndrome (LS), or familial hypercholesterolemia (FH) are notified and provided with genetic counseling. However, the HNP subsequently takes a "hands-off" approach: it is the responsibility of notified participants to share their findings with their healthcare providers, and providers are expected to implement the recommended action plans. Thus, the HNP presents an opportunity to evaluate the efficiency of participant and provider responses to notification of important genetic findings, using electronic health records (EHRs) at Renown Health (a large regional hospital in northern Nevada). Out of 520 HNP participants with findings, we identified 250 participants who were notified of their findings and who had an EHR. 107 of these participants responded to a survey, with 76 (71%) indicating that they had shared their findings with their healthcare providers. However, a sufficiently specific genetic diagnosis appeared in the EHRs and problem lists of only 22 and 10%, respectively, of participants without prior knowledge. Furthermore, review of participant EHRs provided evidence of possible relevant changes in clinical care for only a handful of participants. Up to 19% of participants would have benefited from earlier screening due to prior presentation of their condition. These results suggest that continuous support for both participants and their providers is necessary to maximize the benefit of population-based genetic screening. We recommend that genetic screening projects require participants' consent to directly document their genetic findings in their EHRs. 
Additionally, we recommend that they provide healthcare providers with ongoing training regarding documentation of findings and with clinical decision support regarding subsequent care.

12.
J Expo Sci Environ Epidemiol ; 31(5): 797-803, 2021 09.
Article in English | MEDLINE | ID: mdl-34257389

ABSTRACT

BACKGROUND: Air pollution has been linked to increased susceptibility to SARS-CoV-2. Thus, it has been suggested that wildfire smoke events may exacerbate the COVID-19 pandemic. OBJECTIVES: Our goal was to examine whether wildfire smoke from the 2020 wildfires in the western United States was associated with an increased rate of SARS-CoV-2 infections in Reno, Nevada. METHODS: We conducted a time-series analysis using generalized additive models to examine the relationship between the SARS-CoV-2 test positivity rate at a large regional hospital in Reno and ambient PM2.5 from 15 May to 20 Oct 2020. RESULTS: We found that a 10 µg/m3 increase in the 7-day average PM2.5 concentration was associated with a 6.3% relative increase in the SARS-CoV-2 test positivity rate, with a 95% confidence interval (CI) of 2.5 to 10.3%. This corresponded to an estimated 17.7% (CI: 14.4-20.1%) increase in the number of cases during the time period most affected by wildfire smoke, from 16 Aug to 10 Oct. SIGNIFICANCE: Wildfire smoke may have greatly increased the number of COVID-19 cases in Reno. Thus, our results substantiate the role of air pollution in exacerbating the pandemic and can help guide the development of public preparedness policies in areas affected by wildfire smoke, as wildfires are likely to coincide with the COVID-19 pandemic in 2021.


Subject(s)
Air Pollutants , COVID-19 , Wildfires , Air Pollutants/adverse effects , Air Pollutants/analysis , Humans , Nevada , Pandemics , Particulate Matter/adverse effects , Particulate Matter/analysis , SARS-CoV-2 , Smoke/adverse effects , United States/epidemiology
13.
Cell Death Dis ; 12(4): 310, 2021 03 24.
Article in English | MEDLINE | ID: mdl-33762578

ABSTRACT

SARS-CoV-2 is responsible for the ongoing world-wide pandemic which has already taken more than two million lives. Effective treatments are urgently needed. The enzymatic activity of the HECT-E3 ligase family members has been implicated in the cell egression phase of deadly RNA viruses such as Ebola through direct interaction of its VP40 Protein. Here we report that HECT-E3 ligase family members such as NEDD4 and WWP1 interact with and ubiquitylate the SARS-CoV-2 Spike protein. Furthermore, we find that HECT family members are overexpressed in primary samples derived from COVID-19 infected patients and COVID-19 mouse models. Importantly, rare germline activating variants in the NEDD4 and WWP1 genes are associated with severe COVID-19 cases. Critically, I3C, a natural NEDD4 and WWP1 inhibitor from Brassicaceae, displays potent antiviral effects and inhibits viral egression. In conclusion, we identify the HECT family members of E3 ligases as likely novel biomarkers for COVID-19, as well as new potential targets of therapeutic strategy easily testable in clinical trials in view of the established well-tolerated nature of the Brassicaceae natural compounds.


Subject(s)
COVID-19 Drug Treatment , COVID-19/enzymology , Ubiquitin-Protein Ligases/antagonists & inhibitors , Ubiquitin-Protein Ligases/metabolism , Adult , Aged , Animals , Antiviral Agents/pharmacology , COVID-19/genetics , COVID-19/metabolism , Chlorocebus aethiops , Endosomal Sorting Complexes Required for Transport/metabolism , Female , Humans , Indoles/pharmacology , Male , Mice , Mice, Inbred BALB C , Middle Aged , Nedd4 Ubiquitin Protein Ligases/genetics , Nedd4 Ubiquitin Protein Ligases/metabolism , SARS-CoV-2/isolation & purification , SARS-CoV-2/metabolism , Spike Glycoprotein, Coronavirus/metabolism , Ubiquitin-Protein Ligases/genetics , Ubiquitination , Vero Cells
14.
J Biomed Inform ; 43(6): 988-97, 2010 Dec.
Article in English | MEDLINE | ID: mdl-20692366

ABSTRACT

As the UMLS integrates multiple source vocabularies, the integration process requires that certain adaptation be applied to the source. Our interest is in examining the relationship between the UMLS representation of a source vocabulary and the source vocabulary itself. We investigated the integration of the Minimal Standard Terminology (MST) into the UMLS in order to examine how close its UMLS representation is to the source MST. The MST was conceived as a "minimal" list of terms and structure intended for use within computer systems to facilitate standardized reporting of gastrointestinal endoscopic examinations. Although the MST has an overall schema and implied relationship structure, many of the UMLS integrated MST terms were found to be hierarchically orphaned, and with lateral relationships that do not closely adhere to the source MST. Thus, the MST representation within the UMLS significantly differs from that of the source MST. These representation discrepancies may affect the usability of the MST representation in the UMLS for knowledge acquisition. Furthermore, they pose a problem from the perspective of application developers. While these findings may not necessarily apply to other source terminologies, they highlight the conflict between preservation of authentic concept orientation and the UMLS overall desire to provide fully specified names for all source terms.


Subject(s)
Computational Biology/methods , Terminology as Topic , Unified Medical Language System , Databases, Factual , Reference Standards , Vocabulary, Controlled
15.
G3 (Bethesda) ; 10(2): 645-664, 2020 02 06.
Article in English | MEDLINE | ID: mdl-31888951

ABSTRACT

The aggregation of Electronic Health Records (EHR) and personalized genetics leads to powerful discoveries relevant to population health. Here we perform genome-wide association studies (GWAS) and accompanying phenome-wide association studies (PheWAS) to validate phenotype-genotype associations of BMI, and to a greater extent, severe Class 2 obesity, using comprehensive diagnostic and clinical data from the EHR database of our cohort. Three GWASs of 500,000 variants on the Illumina platform of 6,645 Healthy Nevada participants identified several published and novel variants that affect BMI and obesity. Each GWAS was followed with two independent PheWASs to examine associations between extensive phenotypes (incidence of diagnoses, condition, or disease), significant SNPs, BMI, and incidence of extreme obesity. The first GWAS examines associations with BMI in a cohort with no type 2 diabetics, focusing exclusively on BMI. The second GWAS examines associations with BMI in a cohort that includes type 2 diabetics. In the second GWAS, type 2 diabetes is a comorbidity, and thus becomes a covariate in the statistical model. The intersection of significant variants of these two studies is surprising. The third GWAS is a case vs. control study, with cases defined as extremely obese (Class 2 or 3 obesity), and controls defined as participants with BMI between 18.5 and 25. This last GWAS identifies strong associations with extreme obesity, including established variants in the FTO and NEGR1 genes, as well as loci not yet linked to obesity. The PheWASs validate published associations between BMI and extreme obesity and incidence of specific diagnoses and conditions, yet also highlight novel links. This study emphasizes the importance of our extensive longitudinal EHR database to validate known associations and identify putative novel links with BMI and obesity.


Subject(s)
Body Mass Index , Genetic Predisposition to Disease , Genome-Wide Association Study , Obesity/etiology , Adult , Aged , Comorbidity , Databases, Genetic , Electronic Health Records , Female , Genetic Association Studies/methods , Genotype , Humans , Male , Middle Aged , Nevada/epidemiology , Obesity/diagnosis , Obesity/epidemiology , Phenotype , Polymorphism, Single Nucleotide
16.
Nat Commun ; 11(1): 542, 2020 Jan 28.
Article in English | MEDLINE | ID: mdl-31992710

ABSTRACT

Understanding the impact of rare variants is essential to understanding human health. We analyze rare (MAF < 0.1%) variants against 4264 phenotypes in 49,960 exome-sequenced individuals from the UK Biobank and 1934 phenotypes (1821 overlapping with UK Biobank) in 21,866 members of the Healthy Nevada Project (HNP) cohort who underwent Exome+ sequencing at Helix. After using our rare-variant-tailored methodology to reduce test statistic inflation, we identify 64 statistically significant gene-based associations in our meta-analysis of the two cohorts and 37 for phenotypes available in only one cohort. Singletons make significant contributions to our results, and the vast majority of the associations could not have been identified with a genotyping chip. Our results are available for interactive browsing in a webapp (https://ukb.research.helix.com). This comprehensive analysis illustrates the biological value of large, deeply phenotyped cohorts of unselected populations coupled with NGS data.


Subject(s)
Exome/genetics , Genetic Variation , Genome, Human , Genome-Wide Association Study , Phenotype , Adolescent , Adult , Aged , Aged, 80 and over , Cohort Studies , Databases, Genetic , Europe , Female , Genetics, Population/statistics & numerical data , High-Throughput Nucleotide Sequencing , Humans , Male , Meta-Analysis as Topic , Middle Aged , Software , Exome Sequencing , Young Adult
17.
Nat Metab ; 2(10): 1126-1134, 2020 10.
Article in English | MEDLINE | ID: mdl-33046911

ABSTRACT

Genome-wide association studies have identified 240 independent loci associated with type 2 diabetes (T2D) risk, but this knowledge has not advanced precision medicine. In contrast, the genetic diagnosis of monogenic forms of diabetes (including maturity-onset diabetes of the young (MODY)) is a textbook case of genomic medicine. Recent studies trying to bridge the gap between monogenic diabetes and T2D have been inconclusive. Here, we show a significant burden of pathogenic variants in genes linked with monogenic diabetes among people with common T2D, particularly in actionable MODY genes, thus implying that there should be a substantial change in care for carriers with T2D. We show that, among 74,629 individuals, this burden is probably driven by the pathogenic variants found in GCK, and to a lesser extent in HNF4A, KCNJ11, HNF1B and ABCC8. The carriers with T2D are leaner, which evidences a functional metabolic effect of these mutations. Pathogenic variants in actionable MODY genes are more frequent than was previously expected in common T2D. These results open avenues for future interventions assessing the clinical interest of these pathogenic mutations in precision medicine.


Subject(s)
Diabetes Mellitus, Type 2/genetics , Computational Biology , Female , Genetic Variation , Genome-Wide Association Study , Germinal Center Kinases/genetics , Heterozygote , Humans , Male , Middle Aged , Mutation
18.
J Biomed Inform ; 42(3): 550-7, 2009 Jun.
Article in English | MEDLINE | ID: mdl-19475727

ABSTRACT

The Foundational Model of Anatomy (FMA) ontology is a domain reference ontology based on a disciplined modeling approach. Due to its large size, semantic complexity and manual data entry process, errors and inconsistencies are unavoidable and might remain within the FMA structure without detection. In this paper, we present computable methods to highlight candidate concepts for various relationship assignment errors. The process starts with locating structures formed by transitive structural relationships (part_of, tributary_of, branch_of) and examining their assignments in the context of the IS-A hierarchy. The algorithms were designed to detect five major categories of possible incorrect relationship assignments: circular, mutually exclusive, redundant, inconsistent, and missed entries. A domain expert reviewed samples of these presumptive errors to confirm the findings. Seven thousand and fifty-two presumptive errors were detected, the largest proportion related to part_of relationship assignments. The results highlight the fact that errors are unavoidable in complex ontologies and that well-designed algorithms can help domain experts to focus on concepts with high likelihood of errors and maximize their effort to ensure consistency and reliability. In the future, similar methods might be integrated with data entry processes to offer real-time error detection.
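
As an illustration of the "circular" category only, the following is a minimal sketch of how a cycle in a transitive relationship such as part_of could be found with a depth-first search. It is not the authors' algorithm; the function name `find_cycle`, the dictionary representation of part_of assignments, and the concept names are hypothetical.

```python
from typing import Dict, List


def find_cycle(part_of: Dict[str, List[str]]) -> List[str]:
    """Return one circular chain (first item == last item), or [] if none.

    `part_of` maps a concept to the concepts it is declared to be part of.
    This representation is an assumption made for the example.
    """
    UNSEEN, ACTIVE, DONE = 0, 1, 2
    state: Dict[str, int] = {}
    path: List[str] = []  # concepts on the current search path

    def visit(node: str) -> List[str]:
        state[node] = ACTIVE
        path.append(node)
        for parent in part_of.get(node, []):
            parent_state = state.get(parent, UNSEEN)
            if parent_state == ACTIVE:
                # parent is already on the current path: circular assignment
                return path[path.index(parent):] + [parent]
            if parent_state == UNSEEN:
                found = visit(parent)
                if found:
                    return found
        path.pop()
        state[node] = DONE
        return []

    for concept in list(part_of):
        if state.get(concept, UNSEEN) == UNSEEN:
            found = visit(concept)
            if found:
                return found
    return []


# Hypothetical toy data: A part_of B, B part_of C, C part_of A.
print(find_cycle({"A": ["B"], "B": ["C"], "C": ["A"]}))
```

A chain returned by such a check would be a candidate for expert review, in the same spirit as the presumptive errors described in the abstract.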


Subject(s)
Management Audit , Terminology as Topic , Algorithms
19.
PLoS One ; 14(6): e0218078, 2019.
Article in English | MEDLINE | ID: mdl-31194788

ABSTRACT

In this study, we perform a full genome-wide association study (GWAS) to identify statistically significantly associated single nucleotide polymorphisms (SNPs) with three red blood cell (RBC) components and follow it with two independent PheWASs to examine associations between phenotypic data (case-control status of diagnoses or disease), significant SNPs, and RBC component levels. We first identified associations between the three RBC components: mean platelet volume (MPV), mean corpuscular volume (MCV), and platelet counts (PC), and the genotypes of approximately 500,000 SNPs on the Illumina Infinium DNA Human OmniExpress-24 BeadChip using a single cohort of 4,673 Northern Nevadans. Twenty-one SNPs in five major genomic regions were found to be statistically significantly associated with MPV, two regions with MCV, and one region with PC, with p < 5×10⁻⁸. Twenty-nine SNPs and nine chromosomal regions were identified in 30 previous GWASs, with effect sizes of similar magnitude and direction as found in our cohort. The two strongest associations were SNP rs1354034 with MPV (p = 2.4×10⁻¹³) and rs855791 with MCV (p = 5.2×10⁻¹²). We then examined possible associations between these significant SNPs and incidence of 1,488 phenotype groups mapped from International Classification of Disease version 9 and 10 (ICD9 and ICD10) codes collected in the extensive electronic health record (EHR) database associated with Healthy Nevada Project consented participants. Further leveraging data collected in the EHR, we performed an additional PheWAS to identify associations between continuous red blood cell (RBC) component measures and incidence of specific diagnoses. The first PheWAS illuminated whether SNPs associated with RBC components in our cohort were linked with other hematologic phenotypic diagnoses or diagnoses of other nature. Although no SNPs from our GWAS were identified as strongly associated to other phenotypic components, a number of associations were identified with p-values ranging between 1×10⁻³ and 1×10⁻⁴ with traits such as respiratory failure, sleep disorders, hypoglycemia, hyperglyceridemia, GERD and IBS. The second PheWAS examined possible phenotypic predictors of abnormal RBC component measures: a number of hematologic phenotypes such as thrombocytopenia, anemias, hemoglobinopathies and pancytopenia were found to be strongly associated to RBC component measures; additional phenotypes such as (morbid) obesity, malaise and fatigue, alcoholism, and cirrhosis were also identified to be possible predictors of RBC component measures.


Subject(s)
Erythrocytes/cytology , Genome-Wide Association Study , Phenotype , Adult , Chromosome Mapping , Cohort Studies , Female , Genotype , Humans , Male , Middle Aged , Nevada , Polymorphism, Single Nucleotide
20.
Methods Inf Med ; 57(1): 43-53, 2018 02.
Article in English | MEDLINE | ID: mdl-29621830

ABSTRACT

BACKGROUND: The UMLS assigns semantic types to all its integrated concepts. The semantic types are widely used in various natural language processing tasks in the biomedical domain, such as named entity recognition, semantic disambiguation, and semantic annotation. Due to the size of the UMLS, erroneous semantic type assignments are hard to detect. It is imperative to devise automated techniques to identify errors and inconsistencies in semantic type assignments. OBJECTIVES: Designing a methodology to perform programmatic checks to detect semantic type assignment errors for UMLS concepts with one or more SNOMED CT terms and evaluating concepts in a selected set of SNOMED CT hierarchies to verify our hypothesis that UMLS semantic type assignment errors may exist in concepts residing in semantically inconsistent groups. METHODS: Our methodology is a four-stage process: 1) partitioning concepts in a SNOMED CT hierarchy into semantically uniform groups based on their assigned semantic tags; 2) partitioning concepts in each group from 1) into disjoint sub-groups based on their semantic type assignments; 3) mapping all SNOMED CT semantic tags into one or more semantic types in the UMLS; 4) identifying semantically inconsistent groups that have inconsistent assignments between semantic tags and semantic types according to the mapping from 3) and providing concepts in such groups to the domain experts for review. RESULTS: We applied our method to the UMLS 2013AA release. Concepts of the semantically inconsistent groups in the PHYSICAL FORCE and RECORD ARTIFACT hierarchies have error rates of 33% and 62.5%, respectively, which are much higher than the error rates of 0.6% and 1% in semantically consistent groups of the two hierarchies. CONCLUSION: Concepts in semantically inconsistent groups are more likely to contain semantic type assignment errors. Our methodology can make auditing more efficient by focusing auditing resources on concepts of semantically inconsistent groups.
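
The four stages in METHODS can be sketched roughly as follows. This is an illustrative approximation, not the authors' implementation: the function name `inconsistent_groups`, the tuple layout of a concept, and the consistency rule (a sub-group is flagged when any of its semantic types falls outside the types mapped from its semantic tag) are all assumptions, and the tag and type names are placeholders rather than real SNOMED CT or UMLS values.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple

Concept = Tuple[str, str, Set[str]]  # (concept name, semantic tag, semantic types)
GroupKey = Tuple[str, FrozenSet[str]]


def inconsistent_groups(
    concepts: Iterable[Concept],
    tag_to_types: Dict[str, Set[str]],
) -> Dict[GroupKey, List[str]]:
    """Return the sub-groups whose semantic types disagree with their tag."""
    # Stages 1 and 2: group by semantic tag, then by the exact set of
    # semantic types, which yields disjoint sub-groups.
    groups: Dict[GroupKey, List[str]] = defaultdict(list)
    for name, tag, types in concepts:
        groups[(tag, frozenset(types))].append(name)

    # Stage 3 is the supplied tag_to_types mapping (hypothetical here).
    # Stage 4: flag sub-groups with a type outside the mapped set
    # (assumed consistency rule).
    flagged: Dict[GroupKey, List[str]] = {}
    for (tag, types), names in groups.items():
        allowed = tag_to_types.get(tag, set())
        if not types <= allowed:
            flagged[(tag, types)] = sorted(names)
    return flagged


# Placeholder data: three concepts share one tag, one has an unmapped type.
toy_concepts = [
    ("c1", "tag_a", {"type_x"}),
    ("c2", "tag_a", {"type_x"}),
    ("c3", "tag_a", {"type_y"}),
]
print(inconsistent_groups(toy_concepts, {"tag_a": {"type_x"}}))
```

Only the flagged sub-groups would then be passed to domain experts, which is how the approach narrows the auditing effort.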


Subject(s)
Semantics , Systematized Nomenclature of Medicine , Unified Medical Language System , Artifacts , Reproducibility of Results