Results 1 - 20 of 71
1.
Am J Hum Genet ; 108(12): 2301-2318, 2021 12 02.
Article in English | MEDLINE | ID: mdl-34762822

ABSTRACT

Identifying whether a given genetic mutation results in a gene product with increased (gain-of-function; GOF) or diminished (loss-of-function; LOF) activity is an important step toward understanding disease mechanisms because they may result in markedly different clinical phenotypes. Here, we generated an extensive database of documented germline GOF and LOF pathogenic variants by employing natural language processing (NLP) on the available abstracts in the Human Gene Mutation Database. We then investigated various gene- and protein-level features of GOF and LOF variants and applied machine learning and statistical analyses to identify discriminative features. We found that GOF variants were enriched in essential genes, for autosomal-dominant inheritance, and in protein binding and interaction domains, whereas LOF variants were enriched in singleton genes, for protein-truncating variants, and in protein core regions. We developed a user-friendly web-based interface that enables the extraction of selected subsets from the GOF/LOF database by a broad set of annotated features and downloading of up-to-date versions. These results improve our understanding of how variants affect gene/protein function and may ultimately guide future treatment options.


Subject(s)
Databases, Genetic , Gain of Function Mutation , Loss of Function Mutation , Proteins/genetics , Cloud Computing , Genetic Predisposition to Disease , Genome, Human , Germ-Line Mutation , Humans , Internet-Based Intervention , Machine Learning
2.
Indian J Med Res ; 159(2): 223-231, 2024 Feb 01.
Article in English | MEDLINE | ID: mdl-38517215

ABSTRACT

BACKGROUND & OBJECTIVES: The Omicron sub-lineages are known to have higher infectivity, immune escape and lower virulence. During December 2022 - January 2023 and March - April 2023, India witnessed increased SARS-CoV-2 infections, mostly due to newer Omicron sub-lineages. With this unprecedented rise in cases, we assessed the neutralization potential of individuals vaccinated with ChAdOx1 nCoV (Covishield) and BBV152 (Covaxin) against emerging Omicron sub-lineages. METHODS: Neutralizing antibody responses were measured in sera collected six months after two doses (n=88) of Covishield (n=44) or Covaxin (n=44) and after three doses (n=102) of Covishield (n=46) or Covaxin (n=56), against the prototype B.1 strain and the Omicron lineages XBB.1, BQ.1, BA.5.2 and BF.7. RESULTS: Sera collected six months after the two-dose and three-dose regimens demonstrated neutralizing activity against all variants. The neutralizing antibody (NAb) level was highest against the prototype B.1 strain, followed by BA.5.2 (5-6 fold lower), BF.7 (11-12 fold lower), BQ.1 (12 fold lower) and XBB.1 (18-22 fold lower). INTERPRETATION & CONCLUSIONS: Persistence of NAb responses six months after vaccination was comparable between the two- and three-dose groups. Among the Omicron sub-variants, XBB.1 showed marked neutralization escape, pointing towards an eventual immune escape that may cause more infections. Further, correlating the study data with the complete clinical profile of the participants, along with observations on cell-mediated immunity, may provide a clearer picture of the sustained protection conferred by three-dose vaccination and by hybrid immunity against the newer variants.


Subject(s)
COVID-19 Vaccines , COVID-19 , ChAdOx1 nCoV-19 , Vaccines, Inactivated , Humans , COVID-19/prevention & control , SARS-CoV-2 , Antibodies, Neutralizing , Vaccination , Antibodies, Viral
3.
Hum Mol Genet ; 30(10): 952-960, 2021 05 29.
Article in English | MEDLINE | ID: mdl-33704450

ABSTRACT

Diabetic retinopathy (DR) is a common consequence in type 2 diabetes (T2D) and a leading cause of blindness in working-age adults. Yet, its genetic predisposition is largely unknown. Here, we examined the polygenic architecture underlying DR by deriving and assessing a genome-wide polygenic risk score (PRS) for DR. We evaluated the PRS in 6079 individuals with T2D of European, Hispanic, African and other ancestries from a large-scale multi-ethnic biobank. Main outcomes were PRS association with DR diagnosis, symptoms and complications, and time to diagnosis, and transferability to non-European ancestries. We observed that PRS was significantly associated with DR. A standard deviation increase in PRS was accompanied by an adjusted odds ratio (OR) of 1.12 [95% confidence interval (CI) 1.04-1.20; P = 0.001] for DR diagnosis. When stratified by ancestry, PRS was associated with the highest OR in European ancestry (OR = 1.22, 95% CI 1.02-1.41; P = 0.049), followed by African (OR = 1.15, 95% CI 1.03-1.28; P = 0.028) and Hispanic ancestries (OR = 1.10, 95% CI 1.00-1.10; P = 0.050). Individuals in the top PRS decile had a 1.8-fold elevated risk for DR versus the bottom decile (P = 0.002). Among individuals without DR diagnosis, the top PRS decile had more DR symptoms than the bottom decile (P = 0.008). The PRS was associated with retinal hemorrhage (OR = 1.44, 95% CI 1.03-2.02; P = 0.03) and earlier DR presentation (10% probability of DR by 4 years in the top PRS decile versus 8 years in the bottom decile). These results establish the significant polygenic underpinnings of DR and indicate the need for more diverse ancestries in biobanks to develop multi-ancestral PRS.


Subject(s)
Diabetes Mellitus, Type 2/epidemiology , Diabetic Retinopathy/epidemiology , Genetic Predisposition to Disease , Genome-Wide Association Study , Adult , Aged , Black People/genetics , Diabetes Mellitus, Type 2/complications , Diabetes Mellitus, Type 2/genetics , Diabetes Mellitus, Type 2/pathology , Diabetic Retinopathy/complications , Diabetic Retinopathy/genetics , Diabetic Retinopathy/pathology , Hispanic or Latino/genetics , Humans , Middle Aged , Multifactorial Inheritance/genetics , Risk Assessment , Risk Factors , White People/genetics
4.
Am Heart J ; 250: 29-33, 2022 08.
Article in English | MEDLINE | ID: mdl-35526571

ABSTRACT

Genetic risk for coronary artery disease (CAD) is commonly measured with polygenic risk scores (PRS); yet, the relationship of atherosclerotic burden with PRS in healthy individuals not at high clinical risk for CAD (ie, without a high pooled cohort equations [PCE] score) is unknown. Here, we implemented a novel recall-by-PRS strategy to measure coronary artery calcium (CAC) scores prospectively in 53 healthy individuals with extreme high PRS (median [IQR] PRS = 94% [83-98]) and low PRS (median [IQR] PRS = 3.6% [1.2-10]). The high PRS group was associated with a 2.8-fold greater CAC than the low PRS group, adjusted for age, sex, BMI, smoking, and statin use, and had a 6.7-fold greater proportion of individuals with CAC exceeding 300 HU. These findings reveal that extreme PRS tracks with CAD risk even in those without high clinical risk and demonstrate proof of principle for recall-by-PRS approaches that should be assessed prospectively in larger trials.


Subject(s)
Calcium , Coronary Artery Disease , Calcium, Dietary , Cohort Studies , Coronary Artery Disease/genetics , Humans , Risk Assessment , Risk Factors
5.
J Am Soc Nephrol ; 32(1): 151-160, 2021 01.
Article in English | MEDLINE | ID: mdl-32883700

ABSTRACT

BACKGROUND: Early reports indicate that AKI is common among patients with coronavirus disease 2019 (COVID-19) and associated with worse outcomes. However, AKI among hospitalized patients with COVID-19 in the United States is not well described. METHODS: This retrospective, observational study involved a review of data from electronic health records of patients aged ≥18 years with laboratory-confirmed COVID-19 admitted to the Mount Sinai Health System from February 27 to May 30, 2020. We describe the frequency of AKI and dialysis requirement, AKI recovery, and adjusted odds ratios (aORs) with mortality. RESULTS: Of 3993 hospitalized patients with COVID-19, AKI occurred in 1835 (46%) patients; 347 (19%) of the patients with AKI required dialysis. The proportions with stages 1, 2, or 3 AKI were 39%, 19%, and 42%, respectively. A total of 976 (24%) patients were admitted to intensive care, and 745 (76%) experienced AKI. Of the 435 patients with AKI and urine studies, 84% had proteinuria, 81% had hematuria, and 60% had leukocyturia. Independent predictors of severe AKI were CKD, men, and higher serum potassium at admission. In-hospital mortality was 50% among patients with AKI versus 8% among those without AKI (aOR, 9.2; 95% confidence interval, 7.5 to 11.3). Of survivors with AKI who were discharged, 35% had not recovered to baseline kidney function by the time of discharge. An additional 28 of 77 (36%) patients who had not recovered kidney function at discharge did so on posthospital follow-up. CONCLUSIONS: AKI is common among patients hospitalized with COVID-19 and is associated with high mortality. Of all patients with AKI, only 30% survived with recovery of kidney function by the time of discharge.


Subject(s)
Acute Kidney Injury/etiology , COVID-19/complications , SARS-CoV-2 , Acute Kidney Injury/epidemiology , Acute Kidney Injury/therapy , Acute Kidney Injury/urine , Aged , Aged, 80 and over , COVID-19/mortality , Female , Hematuria/etiology , Hospital Mortality , Hospitals, Private/statistics & numerical data , Hospitals, Urban/statistics & numerical data , Humans , Incidence , Inpatients , Leukocytes , Male , Middle Aged , New York City/epidemiology , Proteinuria/etiology , Renal Dialysis , Retrospective Studies , Treatment Outcome , Urine/cytology
6.
JAMA ; 327(4): 350-359, 2022 01 25.
Article in English | MEDLINE | ID: mdl-35076666

ABSTRACT

Importance: Population-based assessment of disease risk associated with gene variants informs clinical decisions and risk stratification approaches. Objective: To evaluate the population-based disease risk of clinical variants in known disease predisposition genes. Design, Setting, and Participants: This cohort study included 72 434 individuals with 37 780 clinical variants who were enrolled in the BioMe Biobank from 2007 onwards with follow-up until December 2020 and the UK Biobank from 2006 to 2010 with follow-up until June 2020. Participants had linked exome and electronic health record data, were older than 20 years, and were of diverse ancestral backgrounds. Exposures: Variants previously reported as pathogenic or predicted to cause a loss of protein function by bioinformatic algorithms (pathogenic/loss-of-function variants). Main Outcomes and Measures: The primary outcome was the disease risk associated with clinical variants. The risk difference (RD) between the prevalence of disease in individuals with a variant allele (penetrance) vs in individuals with a normal allele was measured. Results: Among 72 434 study participants, 43 395 were from the UK Biobank (mean [SD] age, 57 [8.0] years; 24 065 [55%] women; 2948 [7%] non-European) and 29 039 were from the BioMe Biobank (mean [SD] age, 56 [16] years; 17 355 [60%] women; 19 663 [68%] non-European). Of 5360 pathogenic/loss-of-function variants, 4795 (89%) were associated with an RD less than or equal to 0.05. Mean penetrance was 6.9% (95% CI, 6.0%-7.8%) for pathogenic variants and 0.85% (95% CI, 0.76%-0.95%) for benign variants reported in ClinVar (difference, 6.0 [95% CI, 5.6-6.4] percentage points), with a median of 0% for both groups due to large numbers of nonpenetrant variants. 
Penetrance of pathogenic/loss-of-function variants for late-onset diseases was modified by age: mean penetrance was 10.3% (95% CI, 9.0%-11.6%) in individuals 70 years or older and 8.5% (95% CI, 7.9%-9.1%) in individuals 20 years or older (difference, 1.8 [95% CI, 0.40-3.3] percentage points). Penetrance of pathogenic/loss-of-function variants was heterogeneous even in known disease predisposition genes, including BRCA1 (mean [range], 38% [0%-100%]), BRCA2 (mean [range], 38% [0%-100%]), and PALB2 (mean [range], 26% [0%-100%]). Conclusions and Relevance: In 2 large biobank cohorts, the estimated penetrance of pathogenic/loss-of-function variants was variable but generally low. Further research of population-based penetrance is needed to refine variant interpretation and clinical evaluation of individuals with these variant alleles.


Subject(s)
Genetic Predisposition to Disease , Genetic Variation , Loss of Function Mutation , Penetrance , Aged , Biological Specimen Banks , Cohort Studies , Female , Humans , Male , Mutation , United Kingdom
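Record 6 defines penetrance as the prevalence of disease among carriers of a variant allele and reports a risk difference (RD) against individuals with a normal allele. The sketch below only illustrates that arithmetic. The function names and the counts are hypothetical and are not taken from the study, which worked from linked exome and health record data.

```python
def penetrance(affected: int, total: int) -> float:
    """Proportion of a group that has the disease."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= affected <= total:
        raise ValueError("affected must be between 0 and total")
    return affected / total


def risk_difference(affected_carriers: int, total_carriers: int,
                    affected_noncarriers: int, total_noncarriers: int) -> float:
    """Disease prevalence in carriers minus prevalence in non-carriers."""
    return (penetrance(affected_carriers, total_carriers)
            - penetrance(affected_noncarriers, total_noncarriers))


# Hypothetical counts, chosen only to illustrate the calculation.
rd = risk_difference(7, 100, 200, 10_000)
print(f"risk difference = {rd:.3f}")
```

With these made-up counts the RD sits at the 0.05 threshold that the abstract uses to summarize most variants.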
7.
Hum Mutat ; 42(8): 969-977, 2021 08.
Article in English | MEDLINE | ID: mdl-34005834

ABSTRACT

Biobanks with exomes linked to electronic health records (EHRs) enable the study of genetic pleiotropy between rare variants and seemingly disparate diseases. We performed robust clinical phenotyping of rare, putatively deleterious variants (loss-of-function [LoF] and deleterious missense variants) in ERCC6, a gene implicated in inherited retinal disease. We analyzed 213,084 exomes, along with a targeted set of retinal, cardiac, and immune phenotypes from two large-scale EHR-linked biobanks. In the primary analysis, a burden of deleterious variants in ERCC6 was strongly associated with (1) retinal disorders; (2) cardiac and electrocardiogram perturbations; and (3) immunodeficiency and decreased immunoglobulin levels. Meta-analysis of results from the BioMe Biobank and UK Biobank showed a significant association of deleterious ERCC6 burden with retinal dystrophy (odds ratio [OR] = 2.6, 95% confidence interval [CI]: 1.5-4.6; p = 8.7 × 10⁻⁴), atypical atrial flutter (OR = 3.5, 95% CI: 1.9-6.5; p = 6.2 × 10⁻⁵), arrhythmia (OR = 1.5, 95% CI: 1.2-2.0; p = 2.7 × 10⁻³), and lymphocyte immunodeficiency (OR = 3.8, 95% CI: 2.1-6.8; p = 5.0 × 10⁻⁶). Carriers of ERCC6 LoF variants who lacked a diagnosis of these conditions exhibited increased symptoms, indicating underdiagnosis. These results reveal a unique genetic link among retinal, cardiac, and immune disorders and underscore the value of EHR-linked biobanks in assessing the full clinical profile of carriers of rare variants.


Subject(s)
Genetic Pleiotropy , Retinal Dystrophies , Arrhythmias, Cardiac , DNA Helicases , DNA Repair Enzymes , Exome , Humans , Poly-ADP-Ribose Binding Proteins , Retinal Dystrophies/genetics , Exome Sequencing/methods
8.
Blood Purif ; 50(4-5): 621-627, 2021.
Article in English | MEDLINE | ID: mdl-33631752

ABSTRACT

BACKGROUND/AIMS: Acute kidney injury (AKI) in critically ill patients is common, and continuous renal replacement therapy (CRRT) is a preferred mode of renal replacement therapy (RRT) in hemodynamically unstable patients. Prediction of clinical outcomes in patients on CRRT is challenging. We utilized several approaches to predict RRT-free survival (RRTFS) in critically ill patients with AKI requiring CRRT. METHODS: We used the Medical Information Mart for Intensive Care (MIMIC-III) database to identify patients ≥18 years old with AKI on CRRT, after excluding patients who had ESRD on chronic dialysis or a kidney transplant. We defined RRTFS as being discharged alive without requiring RRT for ≥7 days prior to hospital discharge. We utilized all available biomedical data up to CRRT initiation. We evaluated 7 approaches, including logistic regression (LR), random forest (RF), support vector machine (SVM), adaptive boosting (AdaBoost), extreme gradient boosting (XGBoost), multilayer perceptron (MLP), and MLP with long short-term memory (MLP + LSTM). We evaluated model performance by using area under the receiver operating characteristic (AUROC) curves. RESULTS: Out of 684 patients with AKI on CRRT, 205 (30%) patients had RRTFS. The median age of patients was 63 years and their median Simplified Acute Physiology Score (SAPS) II was 67 (interquartile range 52-84). The MLP + LSTM showed the highest AUROC (95% CI) of 0.70 (0.67-0.73), followed by MLP 0.59 (0.54-0.64), LR 0.57 (0.52-0.62), SVM 0.51 (0.46-0.56), AdaBoost 0.51 (0.46-0.55), RF 0.44 (0.39-0.48), and XGBoost 0.43 (0.38-0.47). CONCLUSIONS: An MLP + LSTM model outperformed other approaches for predicting RRTFS. Performance could be further improved by incorporating other data types.


Subject(s)
Acute Kidney Injury/therapy , Renal Replacement Therapy , Acute Kidney Injury/diagnosis , Age Factors , Aged , Critical Care , Female , Humans , Logistic Models , Machine Learning , Male , Middle Aged , Prognosis
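Record 8 ranks its seven models by AUROC. As a reminder of what that number measures, the sketch below computes AUROC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. It is a generic pairwise calculation with made-up labels and scores, not the evaluation code used in the study.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 outcomes; scores: predicted risk for each case.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = event) and model scores.
labels = [1, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.2]
print(f"AUROC = {auroc(labels, scores):.3f}")
```

On this reading, a value of 0.5 corresponds to random ranking, so the values below 0.5 reported for RF and XGBoost indicate ranking worse than chance on that evaluation.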
9.
Kidney Int ; 98(5): 1323-1330, 2020 11.
Article in English | MEDLINE | ID: mdl-32540406

ABSTRACT

Urinary tract stones have high heritability, indicating a strong genetic component. However, genome-wide association studies (GWAS) have uncovered only a few genome-wide significant single nucleotide polymorphisms (SNPs). Polygenic risk scores (PRS) sum the cumulative effect of many SNPs and shed light on the underlying genetic architecture. Using GWAS summary statistics from 361,141 participants in the United Kingdom Biobank, we generated a PRS and determined its association with stone diagnosis in 28,877 participants in the Mount Sinai BioMe Biobank. In BioMe (1,071 cases and 27,806 controls), every standard deviation increase in PRS was associated with a significant adjusted odds ratio of 1.2 (95% confidence interval 1.13-1.26). In comparison, a risk score composed of GWAS-significant SNPs was not significantly associated with diagnosis. After stratifying individuals into low- and high-risk categories on clinical risk factors, every standard deviation increment in PRS was associated with a significant adjusted odds ratio of 1.3 (1.12-1.6) in the low-risk group and 1.2 (1.1-1.2) in the high-risk group. In a 14,348-participant validation cohort (Penn Medicine Biobank), every standard deviation increment was associated with a significant adjusted odds ratio of 1.1 (1.03-1.2). Thus, a genome-wide PRS is associated with urinary tract stones overall and in the absence of known clinical risk factors, and illustrates their complex polygenic architecture.


Subject(s)
Genome-Wide Association Study , Urinary Calculi , Genetic Predisposition to Disease , Humans , Multifactorial Inheritance , Polymorphism, Single Nucleotide , United Kingdom/epidemiology
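Record 9 describes a PRS as a sum of the effects of many SNPs and reports odds ratios per standard deviation of the score. A minimal sketch of that idea follows, assuming the common additive form (effect-allele dosage times GWAS weight, summed over SNPs) and a log-linear relationship between the standardized score and the odds. The weights, dosages, and function names are hypothetical, and the actual scoring pipeline is not described in the abstract.

```python
from statistics import mean, pstdev


def polygenic_risk_score(dosages, weights):
    """Weighted sum of effect-allele dosages (0, 1 or 2 per SNP)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def standardize(scores):
    """Convert raw scores to z-scores within a cohort."""
    mu, sd = mean(scores), pstdev(scores)
    if sd == 0:
        raise ValueError("scores have zero variance")
    return [(s - mu) / sd for s in scores]


def relative_odds(z, or_per_sd):
    """Odds relative to the cohort mean, assuming a log-linear effect."""
    return or_per_sd ** z


# Hypothetical per-SNP weights (log odds ratios) and three individuals.
weights = [0.10, -0.05, 0.20]
cohort = [[0, 1, 2], [1, 1, 0], [2, 0, 1]]
raw = [polygenic_risk_score(person, weights) for person in cohort]
z_scores = standardize(raw)
print([round(z, 2) for z in z_scores])
print(round(relative_odds(2.0, 1.2), 2))  # 2 SD above the mean at OR 1.2 per SD
```

Under these assumptions, an odds ratio of 1.2 per standard deviation compounds multiplicatively, so someone two standard deviations above the mean has about 1.44 times the odds of someone at the mean.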
10.
Kidney Int ; 97(2): 383-392, 2020 02.
Article in English | MEDLINE | ID: mdl-31883805

ABSTRACT

Symptoms are common in patients on maintenance hemodialysis, but identification is challenging. New informatics approaches including natural language processing (NLP) can be utilized to identify symptoms from narrative clinical documentation. Here we utilized NLP to identify seven patient symptoms from notes of maintenance hemodialysis patients of the BioMe Biobank and validated our findings using a separate cohort and the MIMIC-III database. NLP performance for symptom detection was compared with that of International Classification of Diseases (ICD)-9/10 codes, and the performance of both methods was validated against manual chart review. From 1034 and 519 hemodialysis patients within the BioMe and MIMIC-III databases, respectively, the symptoms most frequently identified by NLP were fatigue, pain, and nausea/vomiting. In BioMe, sensitivity for NLP (0.85-0.99) was higher than for ICD codes (0.09-0.59) for all symptoms, with similar results in the BioMe validation cohort and MIMIC-III. ICD codes were significantly more specific for nausea/vomiting in BioMe and more specific for fatigue, depression, and pain in the MIMIC-III database. A majority of patients in both cohorts had four or more symptoms. Patients with more symptoms identified by NLP, ICD, and chart review had more clinical encounters. NLP had higher specificity in inpatient notes but higher sensitivity in outpatient notes and performed similarly across pain severity subgroups. Thus, NLP had higher sensitivity compared to ICD codes for identification of seven common hemodialysis-related symptoms, with comparable specificity between the two methods. Hence, NLP may be useful for the high-throughput identification of patient-centered outcomes when using electronic health records.


Subject(s)
Electronic Health Records , Natural Language Processing , Algorithms , Databases, Factual , Humans , Renal Dialysis/adverse effects
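Record 10 compares NLP and ICD codes by sensitivity and specificity against manual chart review. The sketch below shows how those two figures fall out of paired per-patient labels. The data are invented for illustration, with chart review treated as the reference standard, and the function name is hypothetical.

```python
def sensitivity_specificity(reference, predicted):
    """Return (sensitivity, specificity) of predicted vs. reference 0/1 labels."""
    if len(reference) != len(predicted):
        raise ValueError("label lists must have the same length")
    tp = fp = tn = fn = 0
    for ref, pred in zip(reference, predicted):
        if ref == 1 and pred == 1:
            tp += 1
        elif ref == 1 and pred == 0:
            fn += 1
        elif ref == 0 and pred == 1:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("reference must contain both positives and negatives")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical symptom labels for ten patients (1 = symptom present).
chart_review = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
nlp_flags    = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
icd_flags    = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

for name, flags in (("NLP", nlp_flags), ("ICD", icd_flags)):
    sens, spec = sensitivity_specificity(chart_review, flags)
    print(f"{name}: sensitivity={sens:.2f} specificity={spec:.2f}")
```

The invented labels mimic the pattern the abstract reports: a method that flags few patients can be highly specific while missing most true cases.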
11.
JAMA ; 322(22): 2191-2202, 2019 12 10.
Article in English | MEDLINE | ID: mdl-31821430

ABSTRACT

Importance: Hereditary transthyretin (TTR) amyloid cardiomyopathy (hATTR-CM) due to the TTR V122I variant is an autosomal-dominant disorder that causes heart failure in elderly individuals of African ancestry. The clinical associations of carrying the variant, its effect in other African ancestry populations including Hispanic/Latino individuals, and the rates of achieving a clinical diagnosis in carriers are unknown. Objective: To assess the association between the TTR V122I variant and heart failure and identify rates of hATTR-CM diagnosis among carriers with heart failure. Design, Setting, and Participants: Cross-sectional analysis of carriers and noncarriers of TTR V122I of African ancestry aged 50 years or older enrolled in the Penn Medicine Biobank between 2008 and 2017 using electronic health record data from 1996 to 2017. Case-control study in participants of African and Hispanic/Latino ancestry with and without heart failure in the Mount Sinai BioMe Biobank enrolled between 2007 and 2015 using electronic health record data from 2007 to 2018. Exposures: TTR V122I carrier status. Main Outcomes and Measures: The primary outcome was prevalent heart failure. The rate of diagnosis with hATTR-CM among TTR V122I carriers with heart failure was measured. Results: The cross-sectional cohort included 3724 individuals of African ancestry with a median age of 64 years (interquartile range, 57-71); 1755 (47%) were male, 2896 (78%) had a diagnosis of hypertension, and 753 (20%) had a history of myocardial infarction or coronary revascularization. There were 116 TTR V122I carriers (3.1%); 1121 participants (30%) had heart failure. The case-control study consisted of 2307 individuals of African ancestry and 3663 Hispanic/Latino individuals; the median age was 73 years (interquartile range, 68-80), 2271 (38%) were male, 4709 (79%) had a diagnosis of hypertension, and 1008 (17%) had a history of myocardial infarction or coronary revascularization. 
There were 1376 cases of heart failure. TTR V122I was associated with higher rates of heart failure (cross-sectional cohort: n = 51/116 TTR V122I carriers [44%], n = 1070/3608 noncarriers [30%], adjusted odds ratio, 1.7 [95% CI, 1.2-2.4], P = .006; case-control study: n = 36/1376 heart failure cases [2.6%], n = 82/4594 controls [1.8%], adjusted odds ratio, 1.8 [95% CI, 1.2-2.7], P = .008). Ten of 92 TTR V122I carriers with heart failure (11%) were diagnosed as having hATTR-CM; the median time from onset of symptoms to clinical diagnosis was 3 years. Conclusions and Relevance: Among individuals of African or Hispanic/Latino ancestry enrolled in 2 academic medical center-based biobanks, the TTR V122I genetic variant was significantly associated with heart failure.


Subject(s)
Amyloid Neuropathies, Familial/genetics , Black or African American/genetics , Heart Failure/genetics , Hispanic or Latino/genetics , Prealbumin/genetics , Academic Medical Centers , Aged , Amyloid Neuropathies, Familial/complications , Amyloid Neuropathies, Familial/ethnology , Biological Specimen Banks , Case-Control Studies , Cross-Sectional Studies , Female , Genetic Variation , Heart Failure/ethnology , Humans , Male , Middle Aged
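Record 11 reports both raw counts and adjusted odds ratios for TTR V122I and heart failure. As a worked check, the sketch below computes the crude (unadjusted) odds ratio from the counts quoted in the abstract. The crude values are not expected to match the published figures of 1.7 and 1.8, because those are adjusted for covariates that the abstract does not list. The function name is hypothetical.

```python
def odds_ratio(exposed_with, exposed_without, unexposed_with, unexposed_without):
    """Crude odds ratio from a 2x2 table: (a * d) / (b * c)."""
    if min(exposed_without, unexposed_with) <= 0:
        raise ValueError("cells in the denominator must be positive")
    return (exposed_with * unexposed_without) / (exposed_without * unexposed_with)


# Cross-sectional cohort: 51 of 116 carriers and 1070 of 3608 noncarriers
# had heart failure (counts from the abstract).
cross_sectional = odds_ratio(51, 116 - 51, 1070, 3608 - 1070)

# Case-control study: 36 of 1376 cases and 82 of 4594 controls were carriers.
# Rows are carriers and noncarriers; columns are cases and controls.
case_control = odds_ratio(36, 82, 1376 - 36, 4594 - 82)

print(f"crude OR, cross-sectional: {cross_sectional:.2f}")
print(f"crude OR, case-control:    {case_control:.2f}")
```

The gap between crude and adjusted estimates is a reminder that the published odds ratios depend on the adjustment model and cannot be reproduced from the 2x2 counts alone.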
12.
J Proteome Res ; 17(1): 337-347, 2018 01 05.
Article in English | MEDLINE | ID: mdl-29110491

ABSTRACT

Metabolomics holds promise as a new technology to diagnose highly heterogeneous diseases. Conventionally, metabolomics data analysis for diagnosis is done using various statistical and machine learning based classification methods. However, it remains unknown whether deep neural networks, a class of increasingly popular machine learning methods, are suitable for classifying metabolomics data. Here we use a cohort of 271 breast cancer tissues, 204 estrogen receptor positive (ER+) and 67 estrogen receptor negative (ER-), to test the accuracies of feed-forward networks, a deep learning (DL) framework, as well as six widely used machine learning models, namely random forest (RF), support vector machines (SVM), recursive partitioning and regression trees (RPART), linear discriminant analysis (LDA), prediction analysis for microarrays (PAM), and generalized boosted models (GBM). The DL framework has the highest area under the curve (AUC), 0.93, in classifying ER+/ER- patients, compared to the other six machine learning algorithms. Furthermore, the biological interpretation of the first hidden layer reveals eight commonly enriched significant metabolomics pathways (adjusted P-value <0.05) that cannot be discovered by other machine learning methods. Among them, the protein digestion and absorption and ATP-binding cassette (ABC) transporters pathways are also confirmed in an integrated analysis of metabolomics and gene expression data in these samples. In summary, the deep learning method shows advantages for metabolomics-based breast cancer ER status classification, with both the highest prediction accuracy (AUC = 0.93) and better revelation of disease biology. We encourage the adoption of feed-forward network-based deep learning methods in the metabolomics research community for classification.


Subject(s)
Breast Neoplasms/classification , Machine Learning/standards , Metabolomics/methods , Receptors, Estrogen/analysis , Area Under Curve , Female , Humans
13.
J Transl Med ; 16(1): 181, 2018 07 03.
Article in English | MEDLINE | ID: mdl-29970096

ABSTRACT

BACKGROUND: Evidence in the literature strongly supports the potential of immunomodulatory peptides for use as vaccine adjuvants. All the mechanisms by which vaccine adjuvants produce immunostimulatory effects directly or indirectly stimulate antigen-presenting cells (APCs). While numerous methods have been developed in the past for predicting B-cell and T-cell epitopes, no method is available for predicting the peptides that can modulate APCs. METHODS: We named the peptides that can activate APCs A-cell epitopes and developed methods for their prediction in this study. A dataset of experimentally validated A-cell epitopes was collected and compiled from various resources. To predict A-cell epitopes, we developed support vector machine-based machine learning models using different sequence-based features. RESULTS: A hybrid model developed on a combination of sequence-based features (dipeptide composition and motif occurrence) achieved the highest accuracy of 95.71% with a Matthews correlation coefficient (MCC) value of 0.91 on the training dataset. We also evaluated the hybrid models on an independent dataset and achieved a comparable accuracy of 95.00% with an MCC of 0.90. CONCLUSION: The models developed in this study were implemented in a web-based platform, VaxinPAD, to predict and design immunomodulatory peptides or A-cell epitopes. The web server, available at http://webs.iiitd.edu.in/raghava/vaxinpad/, will help researchers design peptide-based vaccine adjuvants.


Subject(s)
Adjuvants, Immunologic/pharmacology , Antigen-Presenting Cells/drug effects , Computer Simulation , Drug Design , Vaccines, Subunit/pharmacology , Amino Acid Motifs , Amino Acid Sequence , Databases, Protein , Epitopes/metabolism , Humans , Immunologic Factors/pharmacology , Internet , Models, Theoretical , Support Vector Machine , User-Computer Interface , Vaccines, Subunit/chemistry
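Record 13 names dipeptide composition as one of the sequence-based features fed to its support vector machine models. The abstract does not give the exact feature definition, so the sketch below assumes the commonly used one: the percentage of each of the 400 ordered amino acid pairs among all adjacent pairs in the peptide. The function name and example peptide are hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def dipeptide_composition(sequence):
    """Percentage of each of the 400 dipeptides among adjacent residue pairs.

    Pairs containing a non-standard residue are skipped (an assumption).
    """
    seq = sequence.upper()
    counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        raise ValueError("sequence contains no standard dipeptides")
    return {pair: 100.0 * n / total for pair, n in counts.items()}


# Hypothetical peptide: adjacent pairs are AC, CA, AC, CD.
features = dipeptide_composition("ACACD")
print(len(features), features["AC"], features["CA"], features["CD"])
```

Each peptide becomes a fixed-length vector of 400 values regardless of its length, which is what makes the feature convenient for classifiers that need equal-sized inputs.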
14.
Nucleic Acids Res ; 44(D1): D1098-103, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26586798

ABSTRACT

CPPsite 2.0 (http://crdd.osdd.net/raghava/cppsite/) is an updated version of the manually curated database (CPPsite) of cell-penetrating peptides (CPPs). The current version holds around 1850 peptide entries, nearly twice as many as the previous version. The updated data were curated from research papers and patents published in the last three years. It was observed that most of the CPPs discovered or tested in the last three years have diverse chemical modifications (e.g. non-natural residues, linkers, lipid moieties, etc.). We have compiled this information on chemical modifications systematically in the updated version of the database. In order to understand the structure-function relationship of these peptides, we predicted the tertiary structure of CPPs possessing both modified and natural residues, using state-of-the-art techniques. CPPsite 2.0 also maintains information about the model systems (in vitro/in vivo) used for CPP evaluation and the different types of cargo (e.g. nucleic acid, protein, nanoparticles, etc.) delivered by these peptides. In order to assist a wide range of users, we developed a user-friendly responsive website, with various tools, suitable for smartphone, tablet and desktop users. In conclusion, CPPsite 2.0 provides significant improvements over the previous version in terms of data content.


Subject(s)
Cell-Penetrating Peptides/chemistry , Databases, Protein , Drug Carriers/chemistry , Protein Conformation , Structure-Activity Relationship
15.
Nucleic Acids Res ; 44(D1): D1119-26, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26527728

ABSTRACT

SATPdb (http://crdd.osdd.net/raghava/satpdb/) is a database of structurally annotated therapeutic peptides, curated from 22 public domain peptide databases/datasets including 9 of our own. The current version holds 19192 unique experimentally validated therapeutic peptide sequences having length between 2 and 50 amino acids. It covers peptides having natural, non-natural and modified residues. These peptides were systematically grouped into 10 categories based on their major function or therapeutic property like 1099 anticancer, 10585 antimicrobial, 1642 drug delivery and 1698 antihypertensive peptides. We assigned or annotated structure of these therapeutic peptides using structural databases (Protein Data Bank) and state-of-the-art structure prediction methods like I-TASSER, HHsearch and PEPstrMOD. In addition, SATPdb facilitates users in performing various tasks that include: (i) structure and sequence similarity search, (ii) peptide browsing based on their function and properties, (iii) identification of moonlighting peptides and (iv) searching of peptides having desired structure and therapeutic activities. We hope this database will be useful for researchers working in the field of peptide-based therapeutics.


Subject(s)
Databases, Pharmaceutical , Peptides/chemistry , Peptides/therapeutic use , Antihypertensive Agents/pharmacology , Antineoplastic Agents/pharmacology , Molecular Sequence Annotation , Peptides/pharmacology
16.
Nucleic Acids Res ; 43(Database issue): D956-62, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25392419

ABSTRACT

AHTPDB (http://crdd.osdd.net/raghava/ahtpdb/) is a manually curated database of experimentally validated antihypertensive peptides. Information pertaining to peptides with antihypertensive activity was collected from research articles and from various peptide repositories. These peptides were derived from 35 major sources that include milk, egg, fish, pork, chicken, soybean, etc. In AHTPDB, most of the peptides belong to a family of angiotensin-I converting enzyme inhibiting peptides. The current release of AHTPDB contains 5978 peptide entries among which 1694 are unique peptides. Each entry provides detailed information about a peptide like sequence, inhibitory concentration (IC50), toxicity/bitterness value, source, length, molecular mass and information related to purification of peptides. In addition, the database provides structural information of these peptides that includes predicted tertiary and secondary structures. A user-friendly web interface with various tools has been developed to retrieve and analyse the data. It is anticipated that AHTPDB will be a useful and unique resource for the researchers working in the field of antihypertensive peptides.


Subject(s)
Antihypertensive Agents/chemistry , Databases, Chemical , Peptides/chemistry , Peptides/pharmacology , Antihypertensive Agents/pharmacology , Antihypertensive Agents/toxicity , Internet , Peptides/toxicity , Software
17.
BMC Cancer ; 16: 77, 2016 Feb 09.
Article in English | MEDLINE | ID: mdl-26860193

ABSTRACT

BACKGROUND: In the past, numerous quantitative structure-activity relationship (QSAR) based models have been developed for predicting the anticancer activity of a specific class of molecules against different cancer drug targets. In contrast, limited attempts have been made to predict the anticancer activity of a diverse class of chemicals against a wide variety of cancer cell lines. In this study, we describe a hybrid method developed on thousands of anticancer and non-anticancer molecules tested against the National Cancer Institute (NCI) 60 cancer cell lines. RESULTS: Our analysis revealed that the majority of anticancer molecules contain 18-24 carbon atoms and are dominated by functional groups like R2NH, R3N, ROH, RCOR, and ROR. It was also observed that certain substructures (e.g., 1-methoxy-4-methylbenzene, 1-methoxy benzene, nitrobenzene, indole, propenyl benzene) are more abundant in anticancer molecules. Next, we developed anticancer molecule prediction models using various machine-learning techniques and achieved a maximum Matthews correlation coefficient (MCC) of 0.81 with 90.40% accuracy using support vector machine (SVM) based models. In another approach, a novel similarity or potency score based method was developed using selected fragments/fingerprints and achieved a maximum MCC of 0.82 with 90.65% accuracy. Finally, we combined the strengths of the above methods and developed a hybrid method with a maximum MCC of 0.85 and 92.47% accuracy. CONCLUSIONS: We developed a hybrid method utilizing the best of the machine learning and potency score based methods. The highly accurate hybrid method can be used for classification of anticancer and non-anticancer molecules. To support the scientific community working in the field of anticancer drug discovery, we integrated the hybrid and potency methods in a web server, CancerIN. This server provides various facilities, including virtual screening of anticancer molecules, analog-based drug design, and similarity with known anticancer molecules (http://crdd.osdd.net/oscadd/cancerin).
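
The abstract above reports performance as accuracy together with the Matthews correlation coefficient (MCC). As a reading aid, the following minimal sketch shows how both metrics are computed from a binary confusion matrix. The function names and the toy labels are illustrative assumptions, not taken from the paper or from the CancerIN server.

```python
import math

def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = anticancer, 0 = non-anticancer)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Toy labels, purely hypothetical (not data from the study).
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
counts = confusion_counts(y_true, y_pred)
print(accuracy(*counts), mcc(*counts))
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is why it is commonly reported alongside accuracy when the two classes may be imbalanced.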


Subject(s)
Anticarcinogenic Agents/chemistry , Cell Line, Tumor/drug effects , Drug Evaluation, Preclinical , Neoplasms/drug therapy , Anticarcinogenic Agents/pharmacology , Carbon/chemistry , Computational Biology , Humans , Models, Molecular , Neoplasms/pathology , Software
18.
Nucleic Acids Res ; 42(Database issue): D444-9, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24174543

ABSTRACT

Hemolytik (http://crdd.osdd.net/raghava/hemolytik/) is a manually curated database of experimentally determined hemolytic and non-hemolytic peptides. Data were compiled from a large number of published research articles and from various databases such as the Antimicrobial Peptide Database, Collection of Anti-microbial Peptides, Dragon Antimicrobial Peptide Database and Swiss-Prot. The current release of the Hemolytik database contains ∼3000 entries covering ∼2000 unique peptides whose hemolytic activities were evaluated on erythrocytes isolated from as many as 17 different sources. Each entry in Hemolytik provides comprehensive information about a peptide, such as its name, sequence, origin, reported function, properties such as chirality, type (linear or cyclic) and end modifications, as well as details of its hemolytic activity. In addition, the tertiary structure of each peptide has been predicted, and secondary structure states have been assigned. To support the scientific community, a user-friendly interface has been developed with various tools for data searching and analysis. We hope Hemolytik will be useful for researchers working on the design of therapeutic peptides.


Subject(s)
Databases, Protein , Hemolytic Agents/toxicity , Peptides/toxicity , Hemolysis , Hemolytic Agents/chemistry , Internet , Peptides/chemistry , Software
20.
J Transl Med ; 11: 74, 2013 Mar 22.
Article in English | MEDLINE | ID: mdl-23517638

ABSTRACT

BACKGROUND: Cell penetrating peptides have gained much recognition as versatile transport vehicles for the intracellular delivery of a wide range of cargoes (e.g. oligonucleotides, small molecules, proteins) that otherwise lack bioavailability, thus offering great potential as future therapeutics. Keeping in mind the therapeutic importance of these peptides, we have developed in silico methods for the prediction of cell penetrating peptides, which can be used for rapid screening of such peptides prior to their synthesis. METHODS: In the present study, support vector machine (SVM)-based models have been developed for predicting and designing highly effective cell penetrating peptides. Amino acid composition, dipeptide composition, binary profile of patterns, and physicochemical properties have been used as input features. The main dataset used in this study consists of 708 peptides. In addition, we have identified various motifs in cell penetrating peptides and used these motifs for developing a hybrid prediction model. The performance of our method was evaluated on an independent dataset and compared with that of existing methods. RESULTS: In cell penetrating peptides, certain residues (e.g. Arg, Lys, Pro, Trp, Leu, and Ala) are preferred at specific locations. Thus, it was possible to discriminate cell penetrating peptides from non-cell penetrating peptides based on amino acid composition. All models were evaluated using the five-fold cross-validation technique. We achieved a maximum accuracy of 97.40% using the hybrid model that combines motif information and the binary profile of the peptides. On the independent dataset, we achieved a maximum accuracy of 81.31% with an MCC of 0.63. CONCLUSION: The present study demonstrates that features like amino acid composition, binary profile of patterns, and motifs can be used to train an SVM classifier that predicts cell penetrating peptides with higher accuracy. The hybrid model described in this study achieved higher accuracy than previous methods and thus may complement the existing methods. Based on the above study, a user-friendly web server, CellPPD, has been developed to help biologists predict and design CPPs with ease. The CellPPD web server is freely accessible at http://crdd.osdd.net/raghava/cellppd/.
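
Two of the input features named in the abstract above, amino acid composition and dipeptide composition, can be illustrated with a short sketch. This is a generic formulation under stated assumptions (the standard 20-letter amino acid alphabet, fractions rather than percentages, and a hypothetical example sequence). The exact encoding used by CellPPD is not given in the abstract.

```python
from itertools import product

# Assumption: the 20 standard amino acids, in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def amino_acid_composition(seq):
    """Return a 20-element vector: fraction of each residue in the peptide."""
    seq = seq.upper()
    n = len(seq)
    return [seq.count(aa) / n if n else 0.0 for aa in AMINO_ACIDS]

def dipeptide_composition(seq):
    """Return a 400-element vector: fraction of each overlapping residue pair."""
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    total = len(pairs)
    counts = {}
    for pair in pairs:
        counts[pair] = counts.get(pair, 0) + 1
    return [
        counts.get(a + b, 0) / total if total else 0.0
        for a, b in product(AMINO_ACIDS, repeat=2)
    ]

# Hypothetical arginine/lysine-rich peptide, used only for illustration.
peptide = "RRKK"
aac = amino_acid_composition(peptide)
dpc = dipeptide_composition(peptide)
print(len(aac), len(dpc))
```

Fixed-length vectors like these let peptides of different lengths be passed to the same classifier, such as the SVM models the abstract describes.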


Subject(s)
Cell-Penetrating Peptides/pharmacology , Protein Engineering/methods , Amino Acid Motifs , Cell-Penetrating Peptides/chemical synthesis , Cell-Penetrating Peptides/chemistry , Computer Simulation , Databases, Protein , Drug Delivery Systems , Oligonucleotides/genetics , Protein Structure, Tertiary , ROC Curve , Reproducibility of Results , Sequence Analysis, Protein , Support Vector Machine