1.
Proc Natl Acad Sci U S A ; 114(32): E6652-E6659, 2017 08 08.
Article in English | MEDLINE | ID: mdl-28739897

ABSTRACT

Gram-positive bacteria cause the majority of skin and soft tissue infections (SSTIs), resulting in the most common reason for clinic visits in the United States. Recently, it was discovered that Gram-positive pathogens use a unique heme biosynthesis pathway, which implicates this pathway as a target for development of antibacterial therapies. We report here the identification of a small-molecule activator of coproporphyrinogen oxidase (CgoX) from Gram-positive bacteria, an enzyme essential for heme biosynthesis. Activation of CgoX induces accumulation of coproporphyrin III and leads to photosensitization of Gram-positive pathogens. In combination with light, CgoX activation reduces bacterial burden in murine models of SSTI. Thus, small-molecule activation of CgoX represents an effective strategy for the development of light-based antimicrobial therapies.


Subject(s)
Bacterial Proteins/metabolism , Coproporphyrinogen Oxidase/metabolism , Coproporphyrins/biosynthesis , Photosensitizing Agents/metabolism , Phototherapy , Staphylococcal Skin Infections/enzymology , Staphylococcal Skin Infections/therapy , Staphylococcus aureus/metabolism , Animals , Bacterial Proteins/genetics , Coproporphyrinogen Oxidase/genetics , Coproporphyrins/genetics , Disease Models, Animal , Mice , Staphylococcus aureus/genetics
2.
Circulation ; 135(14): 1311-1320, 2017 Apr 04.
Article in English | MEDLINE | ID: mdl-27793994

ABSTRACT

BACKGROUND: Atrial fibrillation (AF) has a substantial genetic basis. Identification of individuals at greatest AF risk could minimize the incidence of cardioembolic stroke. METHODS: To determine whether genetic data can stratify risk for development of AF, we examined associations between AF genetic risk scores and incident AF in 5 prospective studies comprising 18 919 individuals of European ancestry. We examined associations between AF genetic risk scores and ischemic stroke in a separate study of 509 ischemic stroke cases (202 cardioembolic [40%]) and 3028 referents. Scores were based on 11 to 719 common variants (≥5%) associated with AF at P values ranging from <1×10⁻³ to <1×10⁻⁸ in a prior independent genetic association study. RESULTS: Incident AF occurred in 1032 individuals (5.5%). AF genetic risk scores were associated with new-onset AF after adjustment for clinical risk factors. The pooled hazard ratio for incident AF for the highest versus lowest quartile of genetic risk scores ranged from 1.28 (719 variants; 95% confidence interval, 1.13-1.46; P=1.5×10⁻⁴) to 1.67 (25 variants; 95% confidence interval, 1.47-1.90; P=9.3×10⁻¹⁵). Discrimination of combined clinical and genetic risk scores varied across studies and scores (maximum C statistic, 0.629-0.811; maximum ΔC statistic from clinical score alone, 0.009-0.017). AF genetic risk was associated with stroke in age- and sex-adjusted models. For example, individuals in the highest versus lowest quartile of a 127-variant score had a 2.49-fold increased odds of cardioembolic stroke (95% confidence interval, 1.39-4.58; P=2.7×10⁻³). The effect persisted after the exclusion of individuals (n=70) with known AF (odds ratio, 2.25; 95% confidence interval, 1.20-4.40; P=0.01). CONCLUSIONS: Comprehensive AF genetic risk scores were associated with incident AF beyond associations for clinical AF risk factors but offered small improvements in discrimination. AF genetic risk was also associated with cardioembolic stroke in age- and sex-adjusted analyses. Efforts are warranted to determine whether AF genetic risk may improve identification of subclinical AF or help distinguish between stroke mechanisms.
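
A genetic risk score of the kind described in this abstract is commonly computed as a weighted sum of risk-allele counts, and individuals are then compared across score quartiles. The sketch below illustrates that general idea only. The function names, weights, and genotypes are hypothetical and are not taken from the study, whose exact scoring procedure is not given in the abstract.

```python
from typing import List, Sequence


def genetic_risk_score(allele_counts: Sequence[int], weights: Sequence[float]) -> float:
    """Weighted sum of risk-allele counts (0, 1, or 2 per variant).

    Assumption: one weight per variant, e.g. a log odds ratio taken from
    a prior association study.
    """
    if len(allele_counts) != len(weights):
        raise ValueError("allele_counts and weights must have the same length")
    return sum(count * weight for count, weight in zip(allele_counts, weights))


def quartile_labels(scores: Sequence[float]) -> List[int]:
    """Assign each score to quartile 1 (lowest) through 4 (highest) by rank."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    labels = [0] * n
    for rank, index in enumerate(order):
        labels[index] = rank * 4 // n + 1
    return labels


# Hypothetical example: three variants with illustrative weights.
example_score = genetic_risk_score([0, 1, 2], [0.1, 0.2, 0.3])
example_quartiles = quartile_labels([0.5, 0.1, 0.9, 0.3])
```

A hazard ratio for the highest versus lowest quartile would then come from a survival model fitted on these labels, which is outside the scope of this sketch.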


Subject(s)
Atrial Fibrillation/genetics , Aged , Female , Humans , Incidence , Male , Middle Aged , Risk Factors
3.
Molecules ; 18(1): 735-56, 2013 Jan 08.
Article in English | MEDLINE | ID: mdl-23299552

ABSTRACT

With the rapidly increasing availability of High-Throughput Screening (HTS) data in the public domain, such as the PubChem database, methods for ligand-based computer-aided drug discovery (LB-CADD) have the potential to accelerate and reduce the cost of probe development and drug discovery efforts in academia. We assemble nine data sets from realistic HTS campaigns representing major families of drug target proteins for benchmarking LB-CADD methods. Each data set is public domain through PubChem and carefully collated through confirmation screens validating active compounds. These data sets provide the foundation for benchmarking a new cheminformatics framework BCL::ChemInfo, which is freely available for non-commercial use. Quantitative structure activity relationship (QSAR) models are built using Artificial Neural Networks (ANNs), Support Vector Machines (SVMs), Decision Trees (DTs), and Kohonen networks (KNs). Problem-specific descriptor optimization protocols are assessed including Sequential Feature Forward Selection (SFFS) and various information content measures. Measures of predictive power and confidence are evaluated through cross-validation, and a consensus prediction scheme is tested that combines orthogonal machine learning algorithms into a single predictor. Enrichments ranging from 15 to 101 for a TPR cutoff of 25% are observed.
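
Enrichment at a true-positive-rate (TPR) cutoff can be read as the precision among the top-ranked compounds, divided by the fraction of actives in the whole set, measured at the point where the given share of actives has been recovered. The sketch below assumes that definition; the abstract does not spell out the formula, and the function name and toy data are hypothetical.

```python
import math
from typing import Sequence


def enrichment_at_tpr(labels: Sequence[int], scores: Sequence[float],
                      tpr_cutoff: float = 0.25) -> float:
    """Enrichment at the rank where `tpr_cutoff` of the actives are recovered.

    labels: 1 for active, 0 for inactive. scores: higher means more likely active.
    Assumed definition: precision at that rank divided by the base active rate.
    """
    n_total = len(labels)
    n_active = sum(labels)
    if n_active == 0:
        raise ValueError("at least one active compound is required")
    needed = math.ceil(tpr_cutoff * n_active)
    # Rank compounds from highest to lowest predicted score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    found = 0
    for selected, (_, label) in enumerate(ranked, start=1):
        found += label
        if found >= needed:
            precision = found / selected
            return precision / (n_active / n_total)
    raise ValueError("tpr_cutoff must be between 0 and 1")


# Toy data: one active among four compounds, ranked second by the model.
toy_enrichment = enrichment_at_tpr([0, 1, 0, 0], [0.9, 0.8, 0.1, 0.2])
```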


Subject(s)
Databases, Chemical/standards , High-Throughput Screening Assays/standards , Quantitative Structure-Activity Relationship , Algorithms , Animals , Area Under Curve , Computer Simulation , Decision Trees , Drug Discovery/standards , Humans , Inhibitory Concentration 50 , Ligands , Models, Chemical , Neural Networks, Computer , Quality Improvement , ROC Curve , Support Vector Machine
4.
PM R ; 12(11): 1099-1105, 2020 11.
Article in English | MEDLINE | ID: mdl-32198840

ABSTRACT

BACKGROUND: A lack of studies with large sample sizes of patients with rotator cuff tears is a barrier to performing clinical and genomic research. OBJECTIVE: To develop and validate an electronic medical record (EMR)-based algorithm to identify individuals with and without rotator cuff tear. DESIGN: We used a deidentified version of the EMR of more than 2 million subjects. A screening algorithm was applied to classify subjects into likely rotator cuff tear and likely normal rotator cuff groups. From these subjects, 500 likely rotator cuff tear and 500 likely normal rotator cuff were randomly chosen for algorithm development. Chart review of all 1000 subjects confirmed the true phenotype of rotator cuff tear or normal rotator cuff based on magnetic resonance imaging and operative report. An algorithm was then developed based on logistic regression and validation of the algorithm was performed. RESULTS: The variables significantly predicting rotator cuff tear included the number of times a Current Procedural Terminology code related to rotator cuff procedures was used (odds ratio [OR] = 3.3; 95% confidence interval [CI]: 1.6-6.8 for ≥3 vs 0), the number of times a term related to rotator cuff lesions occurred in radiology reports (OR = 2.2; 95% CI: 1.2-4.1 for ≥1 vs 0), and the number of times a term related to rotator cuff lesions occurred in physician notes (OR = 4.5; 95% CI: 2.2-9.1 for 1 or 2 times vs 0). This phenotyping algorithm had a specificity of 0.89 (95% CI: 0.79-0.95) for rotator cuff tear, area under the curve (AUC) of 0.842, and diagnostic likelihood ratios (DLRs), DLR+ and DLR- of 5.94 (95% CI: 3.07-11.48) and 0.363 (95% CI: 0.291-0.453). CONCLUSION: Our informatics algorithm enables identification of cohorts of individuals with and without rotator cuff tear from an EMR-based data set with moderate accuracy.
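
The diagnostic likelihood ratios reported here follow the standard definitions DLR+ = sensitivity / (1 - specificity) and DLR- = (1 - sensitivity) / specificity. The sketch below shows those formulas with hypothetical input values; the abstract does not report the sensitivity, so the sketch does not attempt to reproduce the published figures.

```python
from typing import Tuple


def diagnostic_likelihood_ratios(sensitivity: float, specificity: float) -> Tuple[float, float]:
    """Return (DLR+, DLR-) from sensitivity and specificity.

    DLR+ is how much more likely a positive result is in cases than in non-cases.
    DLR- is how much more likely a negative result is in cases than in non-cases.
    """
    if not (0.0 <= sensitivity <= 1.0):
        raise ValueError("sensitivity must be in [0, 1]")
    if not (0.0 < specificity < 1.0):
        raise ValueError("specificity must be strictly between 0 and 1")
    dlr_positive = sensitivity / (1.0 - specificity)
    dlr_negative = (1.0 - sensitivity) / specificity
    return dlr_positive, dlr_negative


# Hypothetical values, chosen only to illustrate the arithmetic.
dlr_pos, dlr_neg = diagnostic_likelihood_ratios(sensitivity=0.8, specificity=0.9)
```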


Subject(s)
Rotator Cuff Injuries , Rotator Cuff , Algorithms , Electronic Health Records , Female , Humans , Magnetic Resonance Imaging , Male , Phenotype , Rotator Cuff Injuries/diagnosis
5.
PLoS One ; 12(5): e0177866, 2017.
Article in English | MEDLINE | ID: mdl-28542325

ABSTRACT

De novo membrane protein structure prediction is limited to small proteins due to the conformational search space quickly expanding with length. Long-range contacts (24+ amino acid separation), that is, residue positions distant in sequence but in close proximity in the structure, are arguably the most effective way to restrict this conformational space. Inverse methods for co-evolutionary analysis predict a global set of position-pair couplings that best explain the observed amino acid co-occurrences, thus distinguishing between evolutionarily explained co-variances and those arising from spurious transitive effects. Here, we show that applying machine learning approaches and custom descriptors improves evolutionary contact prediction accuracy, resulting in improvement of average precision by 6 percentage points for the top 1L non-local contacts. Further, we demonstrate that predicted contacts improve protein folding with BCL::Fold. The mean RMSD100 metric for the top 10 models folded was reduced by an average of 2 Å for a benchmark of 25 membrane proteins.
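
Precision for the top 1L long-range contacts can be illustrated as follows: keep predicted residue pairs separated by at least 24 positions, take the L highest-scoring ones (L being the sequence length), and count how many are true contacts. The sketch below is a minimal version under those assumptions. The function name and toy data are hypothetical, and the study's exact evaluation protocol may differ.

```python
from typing import Iterable, List, Tuple


def top_l_long_range_precision(predicted: List[Tuple[int, int, float]],
                               true_contacts: Iterable[Tuple[int, int]],
                               seq_len: int,
                               min_separation: int = 24) -> float:
    """Fraction of the top-`seq_len` long-range predictions that are true contacts.

    predicted: (i, j, score) triples, higher score meaning higher confidence.
    true_contacts: residue index pairs in contact in the reference structure.
    """
    # Keep only pairs far apart in sequence.
    long_range = [p for p in predicted if abs(p[0] - p[1]) >= min_separation]
    long_range.sort(key=lambda p: p[2], reverse=True)
    top = long_range[:seq_len]
    if not top:
        return 0.0
    # Store pairs order-independently so (i, j) and (j, i) match.
    true_set = {tuple(sorted(pair)) for pair in true_contacts}
    hits = sum(1 for i, j, _ in top if tuple(sorted((i, j))) in true_set)
    return hits / len(top)


# Toy data: (1, 3) is dropped as short-range despite its high score.
toy_predictions = [(0, 30, 0.9), (5, 40, 0.8), (1, 3, 0.95), (10, 50, 0.1)]
toy_precision = top_l_long_range_precision(toy_predictions, [(0, 30), (10, 50)], seq_len=2)
```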


Subject(s)
Machine Learning , Membrane Proteins/metabolism , Models, Molecular , Protein Folding , Protein Structure, Secondary/physiology , Algorithms , Amino Acid Sequence , Humans
6.
J Am Med Inform Assoc ; 24(1): 162-171, 2017 01.
Article in English | MEDLINE | ID: mdl-27497800

ABSTRACT

OBJECTIVE: Phenotyping algorithms applied to electronic health record (EHR) data enable investigators to identify large cohorts for clinical and genomic research. Algorithm development is often iterative, depends on fallible investigator intuition, and is time- and labor-intensive. We developed and evaluated 4 types of phenotyping algorithms and categories of EHR information to identify hypertensive individuals and controls and provide a portable module for implementation at other sites. MATERIALS AND METHODS: We reviewed the EHRs of 631 individuals followed at Vanderbilt for hypertension status. We developed features and phenotyping algorithms of increasing complexity. Input categories included International Classification of Diseases, Ninth Revision (ICD9) codes, medications, vital signs, narrative-text search results, and Unified Medical Language System (UMLS) concepts extracted using natural language processing (NLP). We developed a module and tested portability by replicating 10 of the best-performing algorithms at the Marshfield Clinic. RESULTS: Random forests using billing codes, medications, vitals, and concepts had the best performance with a median area under the receiver operator characteristic curve (AUC) of 0.976. Normalized sums of all 4 categories also performed well (0.959 AUC). The best non-NLP algorithm combined normalized ICD9 codes, medications, and blood pressure readings with a median AUC of 0.948. Blood pressure cutoffs or ICD9 code counts alone had AUCs of 0.854 and 0.908, respectively. Marshfield Clinic results were similar. CONCLUSION: This work shows that billing codes or blood pressure readings alone yield good hypertension classification performance. However, even simple combinations of input categories improve performance. The most complex algorithms classified hypertension with excellent recall and precision.
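
The "normalized sums" approach and the AUC metric mentioned above can be sketched in a few lines. The normalization used here (min-max scaling per input category) is an assumption, since the abstract does not specify it, and all names and values are hypothetical. The AUC is computed as the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half.

```python
from typing import List, Sequence


def normalized_sum(feature_columns: Sequence[Sequence[float]]) -> List[float]:
    """Sum of min-max normalized features per patient.

    feature_columns: one sequence per input category (e.g. code counts),
    each holding one value per patient. Min-max scaling is an assumed choice.
    """
    n_patients = len(feature_columns[0])
    totals = [0.0] * n_patients
    for column in feature_columns:
        low, high = min(column), max(column)
        span = high - low
        for k, value in enumerate(column):
            totals[k] += (value - low) / span if span else 0.0
    return totals


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of cases and controls."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical counts for three patients in two input categories.
combined = normalized_sum([[0, 5, 10], [1, 1, 3]])
toy_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

The pairwise loop is quadratic in cohort size; for large cohorts a rank-based implementation from a statistics library would be the more practical choice.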


Subject(s)
Algorithms , Electronic Health Records , Hypertension/diagnosis , Machine Learning , Aged , Blood Pressure Determination , Clinical Coding , Female , Humans , Information Storage and Retrieval/methods , Male , Middle Aged , Natural Language Processing , Phenotype , ROC Curve
7.
Sci Rep ; 7(1): 11303, 2017 09 12.
Article in English | MEDLINE | ID: mdl-28900195

ABSTRACT

It is unclear whether genetic markers interact with risk factors to influence atrial fibrillation (AF) risk. We performed genome-wide interaction analyses between genetic variants and age, sex, hypertension, and body mass index in the AFGen Consortium. Study-specific results were combined using meta-analysis (88,383 individuals of European descent, including 7,292 with AF). Variants with nominal interaction associations in the discovery analysis were tested for association in four independent studies (131,441 individuals, including 5,722 with AF). In the discovery analysis, the AF risk associated with the minor rs6817105 allele (at the PITX2 locus) was greater among subjects ≤ 65 years of age than among those > 65 years (interaction p-value = 4.0 × 10⁻⁵). The interaction p-value exceeded genome-wide significance in combined discovery and replication analyses (interaction p-value = 1.7 × 10⁻⁸). We observed one genome-wide significant interaction with body mass index and several suggestive interactions with age, sex, and body mass index in the discovery analysis. However, none was replicated in the independent sample. Our findings suggest that the pathogenesis of AF may differ according to age in individuals of European descent, but we did not observe evidence of statistically significant genetic interactions with sex, body mass index, or hypertension on AF risk.


Subject(s)
Atrial Fibrillation/genetics , Body Mass Index , Epistasis, Genetic , Genetic Predisposition to Disease , Hypertension/genetics , Sex Characteristics , Age Factors , Aged , Chromosomes, Human, Pair 4/genetics , Female , Genetic Loci , Genome-Wide Association Study , Humans , Male , Middle Aged , Odds Ratio , Polymorphism, Single Nucleotide/genetics , Reproducibility of Results , Risk Factors
8.
J Am Med Inform Assoc ; 23(e1): e20-7, 2016 Apr.
Article in English | MEDLINE | ID: mdl-26338219

ABSTRACT

OBJECTIVE: To evaluate the phenotyping performance of three major electronic health record (EHR) components: International Classification of Diseases (ICD) diagnosis codes, primary notes, and specific medications. MATERIALS AND METHODS: We conducted the evaluation using de-identified Vanderbilt EHR data. We preselected ten diseases: atrial fibrillation, Alzheimer's disease, breast cancer, gout, human immunodeficiency virus infection, multiple sclerosis, Parkinson's disease, rheumatoid arthritis, and types 1 and 2 diabetes mellitus. For each disease, patients were classified into seven categories based on the presence of evidence in diagnosis codes, primary notes, and specific medications. Twenty-five patients per disease category (a total number of 175 patients for each disease, 1750 patients for all ten diseases) were randomly selected for manual chart review. Review results were used to estimate the positive predictive value (PPV), sensitivity, and F-score for each EHR component alone and in combination. RESULTS: The PPVs of single components were inconsistent and inadequate for accurate phenotyping (0.06-0.71). Using two or more ICD codes improved the average PPV to 0.84. We observed a more stable and higher accuracy when using at least two components (mean ± standard deviation: 0.91 ± 0.08). Primary notes offered the best sensitivity (0.77). The sensitivity of ICD codes was 0.67. Again, two or more components provided a reasonably high and stable sensitivity (0.59 ± 0.16). Overall, the best performance (F-score: 0.70 ± 0.12) was achieved by using two or more components. Although the overall performance of using ICD codes (0.67 ± 0.14) was only slightly lower than using two or more components, its PPV (0.71 ± 0.13) was substantially worse than that of two or more components (0.91 ± 0.08). CONCLUSION: Multiple EHR components provide a more consistent and higher performance than a single one for the selected phenotypes. We suggest considering multiple EHR components for future phenotyping design in order to obtain an ideal result.
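
The three metrics used in this evaluation are related by standard formulas: PPV = TP / (TP + FP), sensitivity = TP / (TP + FN), and the F-score is the harmonic mean of the two. The sketch below shows these definitions with hypothetical counts; the function names are illustrative and the numbers are not taken from the study.

```python
from typing import Tuple


def ppv_and_sensitivity(true_pos: int, false_pos: int, false_neg: int) -> Tuple[float, float]:
    """Positive predictive value and sensitivity from chart-review counts."""
    if true_pos + false_pos == 0 or true_pos + false_neg == 0:
        raise ValueError("counts must allow both ratios to be defined")
    ppv = true_pos / (true_pos + false_pos)
    sensitivity = true_pos / (true_pos + false_neg)
    return ppv, sensitivity


def f_score(ppv: float, sensitivity: float) -> float:
    """Harmonic mean of PPV and sensitivity (the F1 score)."""
    if ppv + sensitivity == 0:
        return 0.0
    return 2.0 * ppv * sensitivity / (ppv + sensitivity)


# Hypothetical review outcome: 8 true positives, 2 false positives, 2 missed cases.
example_ppv, example_sensitivity = ppv_and_sensitivity(8, 2, 2)
example_f = f_score(example_ppv, example_sensitivity)
```

Because the abstract reports means across diseases, applying `f_score` to the mean PPV and mean sensitivity will not necessarily reproduce the reported mean F-score.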


Subject(s)
Algorithms , Electronic Health Records , International Classification of Diseases , Phenotype , Diagnosis , Humans , Medical Records, Problem-Oriented , Predictive Value of Tests