Results 1 - 20 of 289
1.
J Am Soc Nephrol ; 34(2): 309-321, 2023 02 01.
Article in English | MEDLINE | ID: mdl-36368777

ABSTRACT

BACKGROUND: The National Kidney Foundation and American Society of Nephrology Task Force on Reassessing the Inclusion of Race in Diagnosing Kidney Disease recently recommended a new race-free creatinine-based equation for eGFR. The effect on recommended clinical care across race and ethnicity groups is unknown. METHODS: We analyzed nationally representative cross-sectional questionnaires and medical examinations from 44,360 participants collected between 2001 and 2018 by the National Health and Nutrition Examination Survey. We quantified the number and proportion of Black, White, Hispanic, and Asian/Other adults with guideline-recommended changes in care. RESULTS: The new equation, if applied nationally, could assign new CKD diagnoses to 434,000 (95% confidence interval [CI], 350,000 to 517,000) Black adults, reclassify 584,000 (95% CI, 508,000 to 667,000) to more advanced stages of CKD, restrict kidney donation eligibility for 246,000 (95% CI, 189,000 to 303,000), expand nephrologist referrals for 41,800 (95% CI, 19,800 to 63,800), and reduce medication dosing for 222,000 (95% CI, 169,000 to 275,000). Among non-Black adults, these changes may undo CKD diagnoses for 5.51 million (95% CI, 4.86 million to 6.16 million), reclassify 4.59 million (95% CI, 4.28 million to 4.92 million) to less advanced stages of CKD, expand kidney donation eligibility for 3.96 million (95% CI, 3.46 million to 4.46 million), reverse nephrologist referral for 75,800 (95% CI, 35,400 to 116,000), and reverse medication dose reductions for 1.47 million (95% CI, 1.22 million to 1.73 million). The racial and ethnic mix of the populations used to develop eGFR equations has a substantial effect on potential care changes. CONCLUSION: The newly recommended 2021 CKD-EPI creatinine-based eGFR equation may result in substantial changes to recommended care for US patients of all racial and ethnic groups.
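
For reference, the 2021 CKD-EPI creatinine equation named in the conclusion is commonly written as eGFR = 142 × min(Scr/κ, 1)^α × max(Scr/κ, 1)^−1.200 × 0.9938^age × 1.012 (if female), with κ = 0.7 and α = −0.241 for women and κ = 0.9 and α = −0.302 for men. The sketch below is only an illustration of that formula and is not the authors' analysis code. The function names are hypothetical, and the staging helper assumes the usual GFR categories (G1 to G5) while ignoring chronicity and albuminuria, which a real CKD diagnosis would also require.

```python
def egfr_ckd_epi_2021(scr_mg_dl: float, age_years: float, female: bool) -> float:
    """Estimated GFR in mL/min/1.73 m^2 from serum creatinine (mg/dL), age, and sex."""
    kappa = 0.7 if female else 0.9
    alpha = -0.241 if female else -0.302
    ratio = scr_mg_dl / kappa
    egfr = (
        142.0
        * min(ratio, 1.0) ** alpha
        * max(ratio, 1.0) ** -1.200
        * 0.9938 ** age_years
    )
    return egfr * 1.012 if female else egfr


def gfr_category(egfr: float) -> str:
    """Map an eGFR value to a GFR category (assumed thresholds: 90, 60, 45, 30, 15)."""
    if egfr >= 90:
        return "G1"
    if egfr >= 60:
        return "G2"
    if egfr >= 45:
        return "G3a"
    if egfr >= 30:
        return "G3b"
    if egfr >= 15:
        return "G4"
    return "G5"


if __name__ == "__main__":
    value = egfr_ckd_epi_2021(scr_mg_dl=1.4, age_years=60, female=False)
    print(round(value, 1), gfr_category(value))
```

Because the equation has no race term, the same creatinine, age, and sex always give the same estimate, which is why switching equations moves some people across the category thresholds in one direction and others in the opposite direction.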


Subjects
Chronic Renal Insufficiency , Adult , Humans , Creatinine , Glomerular Filtration Rate , Nutrition Surveys , Cross-Sectional Studies , Chronic Renal Insufficiency/diagnosis
2.
Lancet ; 398(10316): 2093-2100, 2021 12 04.
Article in English | MEDLINE | ID: mdl-34756184

ABSTRACT

BACKGROUND: Many countries are experiencing a resurgence of COVID-19, driven predominantly by the delta (B.1.617.2) variant of SARS-CoV-2. In response, these countries are considering the administration of a third dose of mRNA COVID-19 vaccine as a booster dose to address potential waning immunity over time and reduced effectiveness against the delta variant. We aimed to use the data repositories of Israel's largest health-care organisation to evaluate the effectiveness of a third dose of the BNT162b2 mRNA vaccine for preventing severe COVID-19 outcomes. METHODS: Using data from Clalit Health Services, which provides mandatory health-care coverage for over half of the Israeli population, individuals receiving a third vaccine dose between July 30, 2021, and Sept 23, 2021, were matched (1:1) to demographically and clinically similar controls who did not receive a third dose. Eligible participants had received the second vaccine dose at least 5 months before the recruitment date, had no previous documented SARS-CoV-2 infection, and had no contact with the health-care system in the 3 days before recruitment. Individuals who are health-care workers, live in long-term care facilities, or are medically confined to their homes were excluded. Primary outcomes were COVID-19-related admission to hospital, severe disease, and COVID-19-related death. The third dose effectiveness for each outcome was estimated as 1 - risk ratio using the Kaplan-Meier estimator. FINDINGS: 1 158 269 individuals were eligible to be included in the third dose group. Following matching, the third dose and control groups each included 728 321 individuals. Participants had a median age of 52 years (IQR 37-68) and 51% were female. The median follow-up time was 13 days (IQR 6-21) in both groups.
Vaccine effectiveness evaluated at least 7 days after receipt of the third dose, compared with receiving only two doses at least 5 months ago, was estimated to be 93% (231 events for two doses vs 29 events for three doses; 95% CI 88-97) for admission to hospital, 92% (157 vs 17 events; 82-97) for severe disease, and 81% (44 vs seven events; 59-97) for COVID-19-related death. INTERPRETATION: Our findings suggest that a third dose of the BNT162b2 mRNA vaccine is effective in protecting individuals against severe COVID-19-related outcomes, compared with receiving only two doses at least 5 months ago. FUNDING: The Ivan and Francesca Berkowitz Family Living Laboratory Collaboration at Harvard Medical School and Clalit Research Institute.
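
The estimand described in the methods, effectiveness = 1 - risk ratio with each group's risk taken from a Kaplan-Meier curve, can be sketched as follows. This is a minimal illustration under stated assumptions and not the study's code: the function names and the toy follow-up data are hypothetical, events at a given time are processed before censorings at that time, and no confidence interval is computed.

```python
from collections import Counter


def km_risk(times, events, horizon):
    """Cumulative risk 1 - S(horizon) from the Kaplan-Meier product-limit estimator.

    times  : follow-up time of each individual (event or censoring)
    events : 1 if the outcome occurred at that time, 0 if censored
    """
    event_counts = Counter(t for t, e in zip(times, events) if e)
    exit_counts = Counter(times)  # everyone leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    for t in sorted(exit_counts):
        if t > horizon:
            break
        d = event_counts.get(t, 0)
        if d:
            survival *= 1.0 - d / at_risk
        at_risk -= exit_counts[t]
    return 1.0 - survival


def effectiveness(risk_exposed, risk_unexposed):
    """Effectiveness expressed as 1 - risk ratio."""
    return 1.0 - risk_exposed / risk_unexposed


if __name__ == "__main__":
    # Hypothetical toy data: days of follow-up and outcome indicators.
    two_dose = km_risk([1, 2, 3, 4, 5], [1, 1, 0, 1, 0], horizon=5)
    three_dose = km_risk([1, 2, 3, 4, 5], [0, 0, 0, 1, 0], horizon=5)
    print(round(effectiveness(three_dose, two_dose), 3))
```

Using Kaplan-Meier risks rather than raw event proportions lets individuals with short or censored follow-up contribute only for the time they were actually observed.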


Subjects
BNT162 Vaccine , COVID-19/prevention & control , Secondary Immunization , Vaccine Efficacy , Adult , Aged , COVID-19/epidemiology , COVID-19/virology , Female , Humans , Israel/epidemiology , Male , Mass Vaccination , Middle Aged , Pandemics/prevention & control , Prognosis , SARS-CoV-2
5.
N Engl J Med ; 379(22): 2131-2139, 2018 11 29.
Article in English | MEDLINE | ID: mdl-30304647

ABSTRACT

BACKGROUND: Many patients remain without a diagnosis despite extensive medical evaluation. The Undiagnosed Diseases Network (UDN) was established to apply a multidisciplinary model in the evaluation of the most challenging cases and to identify the biologic characteristics of newly discovered diseases. The UDN, which is funded by the National Institutes of Health, was formed in 2014 as a network of seven clinical sites, two sequencing cores, and a coordinating center. Later, a central biorepository, a metabolomics core, and a model organisms screening center were added. METHODS: We evaluated patients who were referred to the UDN over a period of 20 months. The patients were required to have an undiagnosed condition despite thorough evaluation by a health care provider. We determined the rate of diagnosis among patients who subsequently had a complete evaluation, and we observed the effect of diagnosis on medical care. RESULTS: A total of 1519 patients (53% female) were referred to the UDN, of whom 601 (40%) were accepted for evaluation. Of the accepted patients, 192 (32%) had previously undergone exome sequencing. Symptoms were neurologic in 40% of the applicants, musculoskeletal in 10%, immunologic in 7%, gastrointestinal in 7%, and rheumatologic in 6%. Of the 382 patients who had a complete evaluation, 132 received a diagnosis, yielding a rate of diagnosis of 35%. A total of 15 diagnoses (11%) were made by clinical review alone, and 98 (74%) were made by exome or genome sequencing. Of the diagnoses, 21% led to recommendations regarding changes in therapy, 37% led to changes in diagnostic testing, and 36% led to variant-specific genetic counseling. We defined 31 new syndromes. CONCLUSIONS: The UDN established a diagnosis in 132 of the 382 patients who had a complete evaluation, yielding a rate of diagnosis of 35%. (Funded by the National Institutes of Health Common Fund.).


Subjects
Genetic Testing , Rare Diseases/genetics , DNA Sequence Analysis , Adult , Animals , Child , Differential Diagnosis , Drosophila , Exome , Female , Genetic Testing/economics , Health Care Costs/statistics & numerical data , Humans , Male , Animal Models , National Institutes of Health (U.S.) , Rare Diseases/diagnosis , Syndrome , United States
6.
Bioinformatics ; 36(13): 4047-4057, 2020 07 01.
Article in English | MEDLINE | ID: mdl-31860066

ABSTRACT

MOTIVATION: The advent of in vivo automated techniques for single-cell lineaging, sequencing and analysis of gene expression has begun to dramatically increase our understanding of organismal development. We applied novel meta-analysis and visualization techniques to the EPIC single-cell-resolution developmental gene expression dataset for Caenorhabditis elegans from Bao, Murray, Waterston et al. to gain insights into regulatory mechanisms governing the timing of development. RESULTS: Our meta-analysis of the EPIC dataset revealed that a simple linear combination of the expression levels of the developmental genes is strongly correlated with the developmental age of the organism, irrespective of the cell division rate of different cell lineages. We uncovered a pattern of collective sinusoidal oscillation in gene activation, in multiple dominant frequencies and in multiple orthogonal axes of gene expression, pointing to the existence of a coordinated, multi-frequency global timing mechanism. We developed a novel method based on Fisher's Discriminant Analysis to identify gene expression weightings that maximally separate traits of interest, and found that remarkably, simple linear gene expression weightings are capable of producing sinusoidal oscillations of any frequency and phase, adding to the growing body of evidence that oscillatory mechanisms likely play an important role in the timing of development. We cross-linked EPIC with gene ontology and anatomy ontology terms, employing Fisher's Discriminant Analysis methods to identify previously unknown positive and negative genetic contributions to developmental processes and cell phenotypes. This meta-analysis demonstrates new evidence for direct linear and/or sinusoidal mechanisms regulating the timing of development. We uncovered a number of previously unknown positive and negative correlations between developmental genes and developmental processes or cell phenotypes. 
Our results highlight both the continued relevance of the EPIC technique, and the value of meta-analysis of previously published results. The presented analysis and visualization techniques are broadly applicable across developmental and systems biology. AVAILABILITY AND IMPLEMENTATION: Analysis software available upon request. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
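
The abstract's weighting method builds on Fisher's Discriminant Analysis. As a reminder of the underlying idea only, the classic two-class Fisher direction is w = S_w^{-1}(m_1 - m_0), where m_0 and m_1 are the class means and S_w is the pooled within-class scatter matrix. The sketch below implements that textbook formula for two features in plain Python. It is not the authors' method or software (which is available on request); the function names and the toy "expression" values are hypothetical.

```python
def mean_vec(rows):
    """Column-wise mean of a list of equal-length feature rows."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def scatter_2d(rows, mu):
    """Within-class scatter matrix (2 x 2) around the class mean mu."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mu[0], r[1] - mu[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fisher_direction(class0, class1):
    """Weights w = Sw^{-1} (m1 - m0) that best separate two classes (two features)."""
    m0, m1 = mean_vec(class0), mean_vec(class1)
    s0, s1 = scatter_2d(class0, m0), scatter_2d(class1, m1)
    a, b = s0[0][0] + s1[0][0], s0[0][1] + s1[0][1]
    c, d = s0[1][0] + s1[1][0], s0[1][1] + s1[1][1]
    det = a * d - b * c  # assumes Sw is invertible
    dm = [m1[0] - m0[0], m1[1] - m0[1]]
    return [(d * dm[0] - b * dm[1]) / det, (-c * dm[0] + a * dm[1]) / det]


def project(row, w):
    """Linear combination of feature values using weights w."""
    return sum(x * wi for x, wi in zip(row, w))


if __name__ == "__main__":
    # Hypothetical expression levels of two genes in cells with / without a trait.
    without_trait = [[0, 0], [1, 0], [0, 1], [1, 1]]
    with_trait = [[3, 3], [4, 3], [3, 4], [4, 4]]
    w = fisher_direction(without_trait, with_trait)
    print(w, [project(r, w) for r in without_trait + with_trait])
```

The resulting weights define a single linear combination of expression levels along which the two groups are maximally separated relative to their within-group spread.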


Subjects
Caenorhabditis elegans Proteins , Caenorhabditis elegans , Animals , Caenorhabditis elegans/genetics , Caenorhabditis elegans/metabolism , Caenorhabditis elegans Proteins/genetics , Caenorhabditis elegans Proteins/metabolism , Cell Lineage , Developmental Gene Expression Regulation , Transcriptional Activation
7.
Genet Med ; 23(6): 1075-1085, 2021 06.
Article in English | MEDLINE | ID: mdl-33580225

ABSTRACT

PURPOSE: Genomic sequencing has become an increasingly powerful and relevant tool to be leveraged for the discovery of genetic aberrations underlying rare, Mendelian conditions. Although the computational tools incorporated into diagnostic workflows for this task are continually evolving and improving, we nevertheless sought to investigate commonalities across sequencing processing workflows to reveal consensus and standard practice tools and highlight exploratory analyses where technical and theoretical method improvements would be most impactful. METHODS: We collected details regarding the computational approaches used by a genetic testing laboratory and 11 clinical research sites in the United States participating in the Undiagnosed Diseases Network via meetings with bioinformaticians, online survey forms, and analyses of internal protocols. RESULTS: We found that tools for processing genomic sequencing data can be grouped into four distinct categories. Whereas well-established practices exist for initial variant calling and quality control steps, there is substantial divergence across sites in later stages for variant prioritization and multimodal data integration, demonstrating a diversity of approaches for solving the most mysterious undiagnosed cases. CONCLUSION: The largest differences across diagnostic workflows suggest that advances in structural variant detection, noncoding variant interpretation, and integration of additional biomedical data may be especially promising for solving chronically undiagnosed cases.


Subjects
Genomics , Undiagnosed Diseases , Computational Biology , Genetic Testing , Genome , Humans , Software , Workflow
8.
J Med Internet Res ; 23(3): e22219, 2021 03 02.
Article in English | MEDLINE | ID: mdl-33600347

ABSTRACT

Coincident with the tsunami of COVID-19-related publications, there has been a surge of studies using real-world data, including those obtained from the electronic health record (EHR). Unfortunately, several of these high-profile publications were retracted because of concerns regarding the soundness and quality of the studies and the EHR data they purported to analyze. These retractions highlight that although a small community of EHR informatics experts can readily identify strengths and flaws in EHR-derived studies, many medical editorial teams and otherwise sophisticated medical readers lack the framework to fully critically appraise these studies. In addition, conventional statistical analyses cannot overcome the need for an understanding of the opportunities and limitations of EHR-derived studies. We distill here from the broader informatics literature six key considerations that are crucial for appraising studies utilizing EHR data: data completeness, data collection and handling (eg, transformation), data type (ie, codified, textual), robustness of methods against EHR variability (within and across institutions, countries, and time), transparency of data and analytic code, and the multidisciplinary approach. These considerations will inform researchers, clinicians, and other stakeholders as to the recommended best practices in reviewing manuscripts, grants, and other outputs from EHR-data derived studies, and thereby promote and foster rigor, quality, and reliability of this rapidly growing field.


Subjects
COVID-19/epidemiology , Data Collection/methods , Electronic Health Records , Data Collection/standards , Humans , Research Peer Review/standards , Publishing/standards , Reproducibility of Results , SARS-CoV-2/isolation & purification
9.
Hum Mol Genet ; 27(R1): R29-R34, 2018 05 01.
Article in English | MEDLINE | ID: mdl-29566172

ABSTRACT

While tens of thousands of pathogenic variants are used to inform the many clinical applications of genomics, there remains limited information on quantitative disease risk for the majority of variants used in clinical practice. At the same time, rising demand for genetic counselling has prompted a growing need for computational approaches that can help interpret genetic variation. Such tasks include predicting variant pathogenicity and identifying variants that are too common to be penetrant. To address these challenges, researchers are increasingly turning to integrative informatics approaches. These approaches often leverage vast sources of data, including electronic health records and population-level allele frequency databases (e.g. gnomAD), as well as machine learning techniques such as support vector machines and deep learning. In this review, we highlight recent informatics and machine learning approaches that are improving our understanding of pathogenic variation and discuss obstacles that may limit their emerging role in clinical genomics.


Subjects
Computational Biology/trends , Human Genome/genetics , Genomics/trends , Machine Learning/trends , Genetic Databases , Humans
10.
BMC Med ; 18(1): 236, 2020 08 18.
Article in English | MEDLINE | ID: mdl-32807164

ABSTRACT

BACKGROUND: Ovarian cancer causes 151,900 deaths per year worldwide. Treatment and prognosis are primarily determined by the histopathologic interpretation in combination with molecular diagnosis. However, the relationship between histopathology patterns and molecular alterations is not fully understood, and it is difficult to predict patients' chemotherapy response using the known clinical and histological variables. METHODS: We analyzed the whole-slide histopathology images, RNA-Seq, and proteomics data from 587 primary serous ovarian adenocarcinoma patients and developed a systematic algorithm to integrate histopathology and functional omics findings and to predict patients' response to platinum-based chemotherapy. RESULTS: Our convolutional neural networks identified the cancerous regions with areas under the receiver operating characteristic curve (AUCs) > 0.95 and classified tumor grade with AUCs > 0.80. Functional omics analysis revealed that expression levels of proteins that participate in innate immune responses and catabolic pathways are associated with tumor grade. Quantitative histopathology analysis successfully stratified patients with different responses to platinum-based chemotherapy (P = 0.003). CONCLUSIONS: These results indicated the potential clinical utility of quantitative histopathology evaluation in tumor cell detection and chemotherapy response prediction. The developed algorithm is easily extensible to other tumor types and treatment modalities.
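
The AUC figures quoted in the results can be read as the probability that a randomly chosen positive example (for instance, a cancerous region) receives a higher classifier score than a randomly chosen negative one. A minimal sketch of that pairwise definition follows. The function name and the toy labels and scores are hypothetical and unrelated to the study's data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positive and negative scores.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical tile labels (1 = tumor) and model scores.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so the reported values above 0.95 indicate that tumor regions were almost always scored above non-tumor regions.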


Subjects
Ovarian Neoplasms/drug therapy , Ovarian Neoplasms/pathology , Platinum/therapeutic use , Female , Humans , Middle Aged , Prognosis
11.
J Med Internet Res ; 22(8): e16709, 2020 08 05.
Article in English | MEDLINE | ID: mdl-32755895

ABSTRACT

BACKGROUND: Chest computed tomography (CT) is crucial for the detection of lung cancer, and many automated CT evaluation methods have been proposed. Due to the divergent software dependencies of the reported approaches, the developed methods are rarely compared or reproduced. OBJECTIVE: The goal of the research was to generate reproducible machine learning modules for lung cancer detection and compare the approaches and performances of the award-winning algorithms developed in the Kaggle Data Science Bowl. METHODS: We obtained the source codes of all award-winning solutions of the Kaggle Data Science Bowl Challenge, where participants developed automated CT evaluation methods to detect lung cancer (training set n=1397, public test set n=198, final test set n=506). The performance of the algorithms was evaluated by the log-loss function, and the Spearman correlation coefficient of the performance in the public and final test sets was computed. RESULTS: Most solutions implemented distinct image preprocessing, segmentation, and classification modules. Variants of U-Net, VGGNet, and residual net were commonly used in nodule segmentation, and transfer learning was used in most of the classification algorithms. Substantial performance variations in the public and final test sets were observed (Spearman correlation coefficient = .39 among the top 10 teams). To ensure the reproducibility of results, we generated a Docker container for each of the top solutions. CONCLUSIONS: We compared the award-winning algorithms for lung cancer detection and generated reproducible Docker images for the top solutions. Although convolutional neural networks achieved decent accuracy, there is plenty of room for improvement regarding model generalizability.
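
The two evaluation measures named in the methods, the log-loss of predicted probabilities and the Spearman correlation between rankings on two test sets, can be sketched in plain Python as below. This is an illustrative sketch and not the challenge's scoring code: the function names, the clipping constant `eps`, and the toy numbers are assumptions.

```python
import math


def log_loss(y_true, p_pred, eps=1e-15):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


def ranks(values):
    """1-based ranks in ascending order; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed values."""
    return pearson(ranks(x), ranks(y))


if __name__ == "__main__":
    # Hypothetical log-loss scores of four teams on the public and final test sets.
    public = [0.40, 0.42, 0.45, 0.47]
    final = [0.44, 0.43, 0.50, 0.46]
    print(round(spearman(public, final), 2))
```

A Spearman coefficient near 1 would mean the public leaderboard ordering carried over to the final test set, so a value as low as the reported .39 points to limited generalizability of the rankings.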


Subjects
Lung Neoplasms/diagnostic imaging , Lung Neoplasms/diagnosis , Machine Learning/standards , X-Ray Computed Tomography/methods , Algorithms , Humans , Reproducibility of Results
12.
BMC Bioinformatics ; 20(1): 268, 2019 May 28.
Article in English | MEDLINE | ID: mdl-31138121

ABSTRACT

BACKGROUND: Correcting a heterogeneous dataset that presents artefacts from several confounders is often an essential bioinformatics task. Attempting to remove these batch effects will result in some biologically meaningful signals being lost. Thus, a central challenge is assessing if the removal of unwanted technical variation harms the biological signal that is of interest to the researcher. RESULTS: We describe a novel framework, B-CeF, to evaluate the effectiveness of batch correction methods and their tendency toward over or under correction. The approach is based on comparing co-expression of adjusted gene-gene pairs to a-priori knowledge of highly confident gene-gene associations based on thousands of unrelated experiments derived from an external reference. Our framework includes three steps: (1) data adjustment with the desired methods (2) calculating gene-gene co-expression measurements for adjusted datasets (3) evaluating the performance of the co-expression measurements against a gold standard. Using the framework, we evaluated five batch correction methods applied to RNA-seq data of six representative tissue datasets derived from the GTEx project. CONCLUSIONS: Our framework enables the evaluation of batch correction methods to better preserve the original biological signal. We show that using a multiple linear regression model to correct for known confounders outperforms factor analysis-based methods that estimate hidden confounders. The code is publicly available as an R package.
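
The three steps listed in the results (adjust the data, compute gene-gene co-expression, score the co-expression against a gold standard) can be illustrated with a toy pipeline. The sketch below is not B-CeF itself, which is distributed as an R package. It assumes a single known confounder removed by simple least squares, absolute Pearson correlation as the co-expression measure, and AUC as the performance score. All names and data are hypothetical.

```python
import math
from itertools import combinations


def residualize(y, covariate):
    """Step 1: remove a known confounder by least-squares regression on one covariate."""
    n = len(y)
    mx, my = sum(covariate) / n, sum(y) / n
    sxx = sum((x - mx) ** 2 for x in covariate)
    slope = sum((x - mx) * (v - my) for x, v in zip(covariate, y)) / sxx
    return [v - my - slope * (x - mx) for x, v in zip(covariate, y)]


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def coexpression(expr):
    """Step 2: absolute Pearson correlation for every gene pair."""
    return {
        (g, h): abs(pearson(expr[g], expr[h]))
        for g, h in combinations(sorted(expr), 2)
    }


def auc_against_reference(scores, reference_pairs):
    """Step 3: AUC of co-expression scores, treating reference pairs as positives."""
    pos = [s for pair, s in scores.items() if pair in reference_pairs]
    neg = [s for pair, s in scores.items() if pair not in reference_pairs]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    batch = [0, 0, 0, 1, 1, 1]
    raw = {
        "A": [1, 2, 3, 6, 7, 8],     # shared signal plus a positive batch shift
        "B": [1, 2, 3, -4, -3, -2],  # same signal with a negative batch shift
        "C": [2, 3, 1, 8, 6, 7],     # unrelated gene that follows the batch shift
    }
    reference = {("A", "B")}  # hypothetical gold-standard association
    adjusted = {g: residualize(v, batch) for g, v in raw.items()}
    print(auc_against_reference(coexpression(raw), reference))
    print(auc_against_reference(coexpression(adjusted), reference))
```

In this toy case the batch shift hides the true A-B association before adjustment and regressing it out restores it. The same scoring would also expose over-correction, because removing real signal lowers the AUC against the reference.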


Subjects
Algorithms , Computational Biology/methods , Genetic Databases , Genetic Epistasis , Genes , Area Under Curve , Gene Expression Regulation , Humans , ROC Curve , Subcutaneous Fat/metabolism
14.
N Engl J Med ; 375(7): 655-65, 2016 Aug 18.
Article in English | MEDLINE | ID: mdl-27532831

ABSTRACT

BACKGROUND: For more than a decade, risk stratification for hypertrophic cardiomyopathy has been enhanced by targeted genetic testing. Using sequencing results, clinicians routinely assess the risk of hypertrophic cardiomyopathy in a patient's relatives and diagnose the condition in patients who have ambiguous clinical presentations. However, the benefits of genetic testing come with the risk that variants may be misclassified. METHODS: Using publicly accessible exome data, we identified variants that have previously been considered causal in hypertrophic cardiomyopathy and that are overrepresented in the general population. We studied these variants in diverse populations and reevaluated their initial ascertainments in the medical literature. We reviewed patient records at a leading genetic-testing laboratory for occurrences of these variants during the near-decade-long history of the laboratory. RESULTS: Multiple patients, all of whom were of African or unspecified ancestry, received positive reports, with variants misclassified as pathogenic on the basis of the understanding at the time of testing. Subsequently, all reported variants were recategorized as benign. The mutations that were most common in the general population were significantly more common among black Americans than among white Americans (P<0.001). Simulations showed that the inclusion of even small numbers of black Americans in control cohorts probably would have prevented these misclassifications. We identified methodologic shortcomings that contributed to these errors in the medical literature. CONCLUSIONS: The misclassification of benign variants as pathogenic that we found in our study shows the need for sequencing the genomes of diverse populations, both in asymptomatic controls and the tested patient population. These results expand on current guidelines, which recommend the use of ancestry-matched controls to interpret variants. 
As additional populations of different ancestry backgrounds are sequenced, we expect variant reclassifications to increase, particularly for ancestry groups that have historically been less well studied. (Funded by the National Institutes of Health.).


Subjects
Black or African American/genetics , Hypertrophic Cardiomyopathy/genetics , False Positive Reactions , Genetic Predisposition to Disease , Genetic Variation , Adolescent , Adult , Aged , Asian/genetics , Child , Exome , Genetic Testing , Genotype , Health Status Disparities , Hispanic or Latino/genetics , Humans , Middle Aged , Mutation , DNA Sequence Analysis , United States , White People/genetics , Young Adult
15.
Cancer Immunol Immunother ; 68(6): 917-926, 2019 Jun.
Article in English | MEDLINE | ID: mdl-30877325

ABSTRACT

INTRODUCTION: Patients with pre-existing autoimmune diseases have been excluded from clinical trials of immune checkpoint inhibitors (ICIs) for cancer. Real-world evidence is necessary to understand ICI safety in this population. METHODS: Patients treated with ICIs from 2011 to 2017 were identified using data from a large health insurer. Outcomes included time to (1) any hospitalization; (2) any hospitalization with an irAE diagnosis; and (3) outpatient corticosteroid treatment. The key exposure was pre-existing autoimmune disease, ascertained within 12 months before starting ICI treatment, and defined either by strict criteria (one inpatient or two outpatient claims at least 30 days apart) or relaxed criteria only (any claim, without meeting strict criteria). RESULTS: Of 4438 ICI-treated patients, pre-existing autoimmune disease was present among 179 (4%) by strict criteria, and another 283 (6%) by relaxed criteria only. In multivariable models, pre-existing autoimmune disease by strict criteria was not associated with all-cause hospitalization (HR 1.27, 95% CI 0.998-1.62), but it was associated with hospitalization with an irAE diagnosis (HR 1.81, 95% CI 1.21-2.71) and with corticosteroid treatment (HR 1.93, 95% CI 1.35-2.76). Similarly, pre-existing autoimmune disease by relaxed criteria only was not associated with all-cause hospitalization (HR 1.11, 95% CI 0.91-1.34), but was associated with hospitalization with an irAE diagnosis (HR 1.46, 95% CI 1.06-2.01) and corticosteroid treatment (HR 1.46, 95% CI 1.13-1.88). CONCLUSION: Pre-existing autoimmune disease was not associated with time to any hospitalization after initiating ICI therapy, but it was associated with a modest increase in hospitalizations with irAE diagnoses and with corticosteroid treatment.


Subjects
Monoclonal Antibodies/immunology , Autoimmune Diseases/immunology , B7-H1 Antigen/immunology , CTLA-4 Antigen/immunology , Neoplasms/immunology , Programmed Cell Death 1 Receptor/immunology , Adrenal Cortex Hormones/therapeutic use , Adult , Aged , Aged 80 and over , Monoclonal Antibodies/adverse effects , Monoclonal Antibodies/therapeutic use , Autoimmune Diseases/complications , Autoimmune Diseases/drug therapy , B7-H1 Antigen/antagonists & inhibitors , CTLA-4 Antigen/antagonists & inhibitors , Female , Hospitalization/statistics & numerical data , Humans , Immunotherapy/adverse effects , Immunotherapy/methods , Health Insurance/statistics & numerical data , Male , Middle Aged , Multivariate Analysis , Neoplasms/complications , Neoplasms/therapy , Programmed Cell Death 1 Receptor/antagonists & inhibitors
16.
Bioinformatics ; 34(8): 1431-1432, 2018 04 15.
Article in English | MEDLINE | ID: mdl-29267850

ABSTRACT

Motivation: In the era of big data and precision medicine, the number of databases containing clinical, environmental, self-reported and biochemical variables is increasing exponentially. Enabling the experts to focus on their research questions rather than on computational data management, access and analysis is one of the most significant challenges nowadays. Results: We present Rcupcake, an R package that contains a variety of functions for leveraging different databases through the BD2K PIC-SURE RESTful API and facilitating its query, analysis and interpretation. The package offers a variety of analysis and visualization tools, including the study of the phenotype co-occurrence and prevalence, according to multiple layers of data, such as phenome, exposome or genome. Availability and implementation: The package is implemented in R and is available under Mozilla v2 license from GitHub (https://github.com/hms-dbmi/Rcupcake). Two reproducible case studies are also available (https://github.com/hms-dbmi/Rcupcake-case-studies/blob/master/SSCcaseStudy_v01.ipynb, https://github.com/hms-dbmi/Rcupcake-case-studies/blob/master/NHANEScaseStudy_v01.ipynb). Contact: paul_avillach@hms.harvard.edu. Supplementary information: Supplementary data are available at Bioinformatics online.


Subjects
Computational Biology/methods , Human Genome , Phenotype , Precision Medicine , Software , Factual Databases , Humans
17.
J Proteome Res ; 17(4): 1383-1396, 2018 04 06.
Article in English | MEDLINE | ID: mdl-29505266

ABSTRACT

There are more than 3.7 million published articles on the biological functions or disease implications of proteins, constituting an important resource of proteomics knowledge. However, it is difficult to summarize the millions of proteomics findings in the literature manually and quantify their relevance to the biology and diseases of interest. We developed a fully automated bioinformatics framework to identify and prioritize proteins associated with any biological entity. We used the 22 targeted areas of the Biology/Disease-driven (B/D)-Human Proteome Project (HPP) as examples, prioritized the relevant proteins through their Protein Universal Reference Publication-Originated Search Engine (PURPOSE) scores, validated the relevance of the score by comparing the protein prioritization results with a curated database, computed the scores of proteins across the topics of B/D-HPP, and characterized the top proteins in the common model organisms. We further extended the bioinformatics workflow to identify the relevant proteins in all organ systems and human diseases and deployed a cloud-based tool to prioritize proteins related to any custom search terms in real time. Our tool can facilitate the prioritization of proteins for any organ system or disease of interest and can contribute to the development of targeted proteomic studies for precision medicine.


Subjects
Computational Biology/methods , Proteomics/methods , Animals , Human Genome Project , Humans , Precision Medicine/methods , Research , Search Engine
18.
J Proteome Res ; 17(12): 4345-4357, 2018 12 07.
Article in English | MEDLINE | ID: mdl-30094994

ABSTRACT

Targeted metabolomics and biochemical studies complement the ongoing investigations led by the Human Proteome Organization (HUPO) Biology/Disease-Driven Human Proteome Project (B/D-HPP). However, it is challenging to identify and prioritize metabolite and chemical targets. Literature-mining-based approaches have been proposed for target proteomics studies, but text mining methods for metabolite and chemical prioritization are hindered by a large number of synonyms and nonstandardized names of each entity. In this study, we developed a cloud-based literature mining and summarization platform that maps metabolites and chemicals in the literature to unique identifiers and summarizes the copublication trends of metabolites/chemicals and B/D-HPP topics using Protein Universal Reference Publication-Originated Search Engine (PURPOSE) scores. We successfully prioritized metabolites and chemicals associated with the B/D-HPP targeted fields and validated the results by checking against expert-curated associations and enrichment analyses. Compared with existing algorithms, our system achieved better precision and recall in retrieving chemicals related to B/D-HPP focused areas. Our cloud-based platform enables queries on all biological terms in multiple species, which will contribute to B/D-HPP and targeted metabolomics/chemical studies.


Subjects
Cloud Computing , Metabolomics , Proteome , Algorithms , Data Mining/methods , Humans , Search Engine
20.
Nat Rev Genet ; 12(6): 417-28, 2011 Jun.
Article in English | MEDLINE | ID: mdl-21587298

ABSTRACT

If genomic studies are to be a clinically relevant and timely reflection of the relationship between genetics and health status--whether for common or rare variants--cost-effective ways must be found to measure both the genetic variation and the phenotypic characteristics of large populations, including the comprehensive and up-to-date record of their medical treatment. The adoption of electronic health records, used by clinicians to document clinical care, is becoming widespread and recent studies demonstrate that they can be effectively employed for genetic studies using the informational and biological 'by-products' of health-care delivery while maintaining patient privacy.


Subjects
Electronic Health Records , Medical Genetics/methods , Genomics , Alleles , Data Collection , Ethnicity , Inborn Genetic Diseases/diagnosis , Inborn Genetic Diseases/genetics , Genetic Variation , Genome , Humans , Informed Consent , Computerized Medical Records Systems , Genetic Models , Phenotype , Research Design