Results 1 - 20 of 40
1.
Oncologist ; 23(2): 179-185, 2018 02.
Article in English | MEDLINE | ID: mdl-29158372

ABSTRACT

BACKGROUND: Using next-generation sequencing (NGS) to guide cancer therapy has created challenges in analyzing and reporting large volumes of genomic data to patients and caregivers. Specifically, providing current, accurate information on newly approved therapies and open clinical trials requires considerable manual curation performed mainly by human "molecular tumor boards" (MTBs). The purpose of this study was to determine the utility of cognitive computing as performed by Watson for Genomics (WfG) compared with a human MTB. MATERIALS AND METHODS: One thousand eighteen patient cases that previously underwent targeted exon sequencing at the University of North Carolina (UNC) and subsequent analysis by the UNCseq informatics pipeline and the UNC MTB between November 7, 2011, and May 12, 2015, were analyzed with WfG, a cognitive computing technology for genomic analysis. RESULTS: Using a WfG-curated actionable gene list, we identified additional genomic events of potential significance (not discovered by traditional MTB curation) in 323 (32%) patients. The majority of these additional genomic events were considered actionable based upon their ability to qualify patients for biomarker-selected clinical trials. Indeed, the opening of a relevant clinical trial within 1 month prior to WfG analysis provided the rationale for identification of a new actionable event in nearly a quarter of the 323 patients. This automated analysis took <3 minutes per case. CONCLUSION: These results demonstrate that the interpretation and actionability of somatic NGS results are evolving too rapidly to rely solely on human curation. Molecular tumor boards empowered by cognitive computing could potentially improve patient care by providing a rapid, comprehensive approach for data analysis and consideration of up-to-date availability of clinical trials. 
IMPLICATIONS FOR PRACTICE: The results of this study demonstrate that the interpretation and actionability of somatic next-generation sequencing results are evolving too rapidly to rely solely on human curation. Molecular tumor boards empowered by cognitive computing can significantly improve patient care by providing a fast, cost-effective, and comprehensive approach for data analysis in the delivery of precision medicine. Patients and physicians who are considering enrollment in clinical trials may benefit from the support of such tools applied to genomic data.


Subjects
Antineoplastic Combined Chemotherapy Protocols/therapeutic use , Neoplasms/drug therapy , Biomarkers, Tumor , Case-Control Studies , Combined Modality Therapy , Follow-Up Studies , Gene Expression Regulation, Neoplastic , High-Throughput Nucleotide Sequencing , Humans , Lymphatic Metastasis , Neoplasm Invasiveness , Neoplasm Recurrence, Local/drug therapy , Neoplasm Recurrence, Local/pathology , Neoplasms/pathology , Prognosis , Retrospective Studies , Survival Rate
2.
PLoS Comput Biol ; 12(6): e1004890, 2016 06.
Article in English | MEDLINE | ID: mdl-27351836

ABSTRACT

Acute Myeloid Leukemia (AML) is a fatal hematological cancer. The genetic abnormalities underlying AML are extremely heterogeneous among patients, making prognosis and treatment selection very difficult. While clinical proteomics data have the potential to improve prognosis accuracy, thus far, the quantitative means to do so have yet to be developed. Here we report the results and insights gained from the DREAM 9 Acute Myeloid Leukemia Outcome Prediction Challenge (AML-OPC), a crowdsourcing effort designed to promote the development of quantitative methods for AML prognosis prediction. We identify the most accurate and robust models in predicting patient response to therapy, remission duration, and overall survival. We further investigate patient response to therapy, a clinically actionable prediction, and find that patients who are classified as resistant to therapy are harder to predict than responsive patients across the 31 models submitted to the challenge. The top two performing models, which held a high sensitivity to these patients, substantially utilized the proteomics data to make predictions. Using these models, we also identify which signaling proteins were useful in predicting patient therapeutic response.


Subjects
Algorithms , Amyotrophic Lateral Sclerosis/diagnosis , Amyotrophic Lateral Sclerosis/therapy , Crowdsourcing/methods , Outcome and Process Assessment, Health Care/methods , Proteome/metabolism , Amyotrophic Lateral Sclerosis/metabolism , Biomarkers/metabolism , Humans , Reproducibility of Results , Risk Assessment , Sensitivity and Specificity , Treatment Outcome
3.
BMC Bioinformatics ; 17: 155, 2016 Apr 08.
Article in English | MEDLINE | ID: mdl-27059896

ABSTRACT

BACKGROUND: Understanding the interactions between antibodies and the linear epitopes that they recognize is an important task in the study of immunological diseases. We present a novel computational method for the design of linear epitopes of specified binding affinity to Intravenous Immunoglobulin (IVIg). RESULTS: We show that the method, called Pythia-design, can accurately design peptides with both high binding affinity and low binding affinity to IVIg. To show this, we experimentally constructed and tested the computationally constructed designs. We further show experimentally that these designed peptides are more accurate than those produced by a recent method for the same task. Pythia-design is based on combining random walks with an ensemble of probabilistic support vector machine (SVM) classifiers, and we show that it produces a diverse set of designed peptides, an important property to develop robust sets of candidates for construction. We show that by combining Pythia-design and the method of (PloS ONE 6(8):23616, 2011), we are able to produce an even more accurate collection of designed peptides. Analysis of the experimental validation of Pythia-design peptides indicates that binding of IVIg is favored by epitopes that contain tryptophan and cysteine. CONCLUSIONS: Our method, Pythia-design, is able to generate a diverse set of binding and non-binding peptides, and its designs have been experimentally shown to be accurate.


Subjects
Computational Biology/methods , Epitopes/chemistry , Immunoglobulins, Intravenous/chemistry , Peptides, Cyclic/chemistry , Citrulline/chemistry , Cysteine/chemistry , Humans , Models, Molecular , Reproducibility of Results , Support Vector Machine , Tryptophan/chemistry
4.
Genome Res ; 23(11): 1928-37, 2013 Nov.
Article in English | MEDLINE | ID: mdl-23950146

ABSTRACT

The Gene Promoter Expression Prediction challenge consisted of predicting gene expression from promoter sequences in a previously unknown experimentally generated data set. The challenge was presented to the community in the framework of the sixth Dialogue for Reverse Engineering Assessments and Methods (DREAM6), a community effort to evaluate the status of systems biology modeling methodologies. Nucleotide-specific promoter activity was obtained by measuring fluorescence from promoter sequences fused upstream of a gene for yellow fluorescent protein and inserted in the same genomic site of the yeast Saccharomyces cerevisiae. Twenty-one teams submitted results predicting the expression levels of 53 different promoters from yeast ribosomal protein genes. Analysis of participant predictions shows that accurate values for low-expressed and mutated promoters were difficult to obtain, although in the latter case, only when the mutation induced a large change in promoter activity compared to the wild-type sequence. As in previous DREAM challenges, we found that aggregation of participant predictions provided robust results, but did not fare better than the three best algorithms. Finally, this study not only provides a benchmark for the assessment of methods predicting activity of a specific set of promoters from their sequence, but it also shows that the top performing algorithm, which used machine-learning approaches, can be improved by the addition of biological features such as transcription factor binding sites.


Subjects
Crowdsourcing , Gene Expression , Promoter Regions, Genetic , Ribosomal Proteins/genetics , Ribosomes/genetics , Saccharomyces cerevisiae/genetics , Algorithms , Binding Sites/genetics , Gene Expression Profiling , Gene Expression Regulation, Fungal , Gene Regulatory Networks , Genes, Fungal , Models, Genetic , Mutation , Regulatory Elements, Transcriptional , Ribosomes/metabolism , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae Proteins/metabolism , Systems Biology
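Note on the record above: the aggregation of participant predictions described in the abstract can be illustrated with a minimal Python sketch. It assumes a mean-rank consensus, which is one common aggregation scheme and not necessarily the one used in DREAM6. The function names and the three-team data are hypothetical.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks (1 = smallest value); ties are broken by position."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def aggregate_by_mean_rank(team_predictions):
    """Average each promoter's rank over all teams (assumed consensus rule)."""
    per_team_ranks = [ranks(p) for p in team_predictions]
    return [mean(column) for column in zip(*per_team_ranks)]


# Hypothetical predicted activities for four promoters from three teams.
team_predictions = [
    [0.2, 0.9, 0.5, 0.7],
    [0.1, 0.8, 0.6, 0.4],
    [0.3, 0.7, 0.2, 0.9],
]
consensus = aggregate_by_mean_rank(team_predictions)
print(consensus)
```

Averaging ranks instead of raw values makes the consensus insensitive to differences in scale between teams, which is one plausible reason an aggregate can be robust even when individual submissions vary widely.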
5.
Bioinformatics ; 31(4): 501-8, 2015 Feb 15.
Article in English | MEDLINE | ID: mdl-25150249

ABSTRACT

MOTIVATION: Experiments in animal models are often conducted to infer how humans will respond to stimuli by assuming that the same biological pathways will be affected in both organisms. The limitations of this assumption were tested in the IMPROVER Species Translation Challenge, where 52 stimuli were applied to both human and rat cells and perturbed pathways were identified. In the Inter-species Pathway Perturbation Prediction sub-challenge, multiple teams proposed methods to use rat transcription data from 26 stimuli to predict human gene set and pathway activity under the same perturbations. Submissions were evaluated using three performance metrics on data from the remaining 26 stimuli. RESULTS: We present two approaches, ranked second in this challenge, that do not rely on sequence-based orthology between rat and human genes to translate pathway perturbation state but instead identify transcriptional response orthologs across a set of training conditions. The translation from rat to human accomplished by these so-called direct methods is not dependent on the particular analysis method used to identify perturbed gene sets. In contrast, machine learning-based methods require performing a pathway analysis initially and then mapping the pathway activity between organisms. Unlike most machine learning approaches, direct methods can be used to predict the activation of a human pathway for a new (test) stimulus, even when that pathway was never activated by a training stimulus. AVAILABILITY: Gene expression data are available from ArrayExpress (accession E-MTAB-2091), while software implementations are available from http://bioinformaticsprb.med.wayne.edu?p=50 and http://goo.gl/hJny3h. CONTACT: christoph.hafemeister@nyu.edu or atarca@med.wayne.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms , Artificial Intelligence , Cytokines/metabolism , Gene Expression Profiling/methods , Phosphoproteins/metabolism , Software , Systems Biology/methods , Animals , Bronchi/cytology , Bronchi/metabolism , Cells, Cultured , Databases, Factual , Epithelial Cells/cytology , Epithelial Cells/metabolism , Gene Expression Regulation , Humans , Models, Animal , Oligonucleotide Array Sequence Analysis , Phosphorylation , Rats , Signal Transduction , Species Specificity , Translational Research, Biomedical
6.
Bioinformatics ; 31(4): 492-500, 2015 Feb 15.
Article in English | MEDLINE | ID: mdl-25152231

ABSTRACT

MOTIVATION: Translating findings in rodent models to human models has been a cornerstone of modern biology and drug development. However, in many cases, a naive 'extrapolation' between the two species has not succeeded. As a result, clinical trials of new drugs sometimes fail even after considerable success in the mouse or rat stage of development. In addition to in vitro studies, inter-species translation requires analytical tools that can predict the enriched gene sets in human cells under various stimuli from corresponding measurements in animals. Such tools can improve our understanding of the underlying biology and optimize the allocation of resources for drug development. RESULTS: We developed an algorithm to predict differential gene set enrichment as part of the sbv IMPROVER (systems biology verification in Industrial Methodology for Process Verification in Research) Species Translation Challenge, which focused on phosphoproteomic and transcriptomic measurements of normal human bronchial epithelial (NHBE) primary cells under various stimuli and corresponding measurements in rat (NRBE) primary cells. We find that gene sets exhibit a higher inter-species correlation compared with individual genes, and are potentially more suited for direct prediction. Furthermore, in contrast to a similar cross-species response in protein phosphorylation states 5 and 25 min after exposure to stimuli, gene set enrichment 6 h after exposure is significantly different in NHBE cells compared with NRBE cells. In spite of this difference, we were able to develop a robust algorithm to predict gene set activation in NHBE with high accuracy using simple analytical methods. AVAILABILITY AND IMPLEMENTATION: Implementation of all algorithms is available as source code (in Matlab) at http://bhanot.biomaps.rutgers.edu/wiki/codes_SC3_Predicting_GeneSets.zip, along with the relevant data used in the analysis. 
Gene sets, gene expression and protein phosphorylation data are available on request. CONTACT: hormoz@kitp.ucsb.edu.


Subjects
Algorithms , Gene Expression Profiling/methods , Gene Regulatory Networks , Proteomics/methods , Systems Biology/methods , Animals , Bronchi/cytology , Bronchi/metabolism , Cells, Cultured , Cytokines/metabolism , Data Interpretation, Statistical , Databases, Factual , Epithelial Cells/cytology , Epithelial Cells/metabolism , Gene Expression Regulation , Humans , Mice , Phosphoproteins/metabolism , Phosphorylation , Rats , Signal Transduction , Species Specificity
7.
Bioinformatics ; 31(4): 462-70, 2015 Feb 15.
Article in English | MEDLINE | ID: mdl-25061067

ABSTRACT

MOTIVATION: Using gene expression to infer changes in protein phosphorylation levels induced in cells by various stimuli is an outstanding problem. The intra-species protein phosphorylation challenge organized by the IMPROVER consortium provided the framework to identify the best approaches to address this issue. RESULTS: Rat lung epithelial cells were treated with 52 stimuli, and gene expression and phosphorylation levels were measured. Competing teams used gene expression data from 26 stimuli to develop protein phosphorylation prediction models and were ranked based on prediction performance for the remaining 26 stimuli. Three teams were tied in first place in this challenge, achieving a balanced accuracy of about 70%, indicating that gene expression is only moderately predictive of protein phosphorylation. In spite of the similar performance, the approaches used by these three teams, described in detail in this article, were different, with the average number of predictor genes per phosphoprotein used by the teams ranging from 3 to 124. However, a significant overlap of gene signatures between teams was observed for the majority of the proteins considered, while Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways were enriched in the union of the predictor genes of the three teams for multiple proteins. AVAILABILITY AND IMPLEMENTATION: Gene expression and protein phosphorylation data are available from ArrayExpress (E-MTAB-2091). Software implementations of the approaches of Teams 49 and 75 are available at http://bioinformaticsprb.med.wayne.edu and http://people.cs.clemson.edu/∼luofeng/sbv.rar, respectively. CONTACT: gyanbhanot@gmail.com or luofeng@clemson.edu or atarca@med.wayne.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Epithelial Cells/metabolism , Gene Expression Profiling , Lung/metabolism , Phosphoproteins/metabolism , Software , Systems Biology/methods , Algorithms , Animals , Cells, Cultured , Databases, Factual , Epithelial Cells/cytology , Gene Expression Regulation , Gene Regulatory Networks , Humans , Lung/cytology , Oligonucleotide Array Sequence Analysis , Phosphorylation , Rats , Species Specificity , Translational Research, Biomedical
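Note on the record above: the abstract reports a balanced accuracy of about 70% without defining the metric. The Python sketch below assumes the standard definition, the mean of sensitivity and specificity, for a binary "phosphorylated / not phosphorylated" call. The labels and predictions are hypothetical and are not taken from the challenge data.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)  # fraction of true positives recovered
    specificity = tn / (tn + fp)  # fraction of true negatives recovered
    return (sensitivity + specificity) / 2


# Hypothetical calls for one phosphoprotein across ten test stimuli.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 1, 1]
score = balanced_accuracy(y_true, y_pred)
print(round(score, 3))
```

Unlike plain accuracy, this metric weights the two classes equally, so a model cannot score well simply by predicting the majority class when few stimuli change the phosphorylation state.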
8.
Bioinformatics ; 31(4): 453-61, 2015 Feb 15.
Article in English | MEDLINE | ID: mdl-24994890

ABSTRACT

MOTIVATION: Animal models are widely used in biomedical research for reasons ranging from practical to ethical. An important issue is whether rodent models are predictive of human biology. This has been addressed recently in the framework of a series of challenges designed by the systems biology verification for Industrial Methodology for Process Verification in Research (sbv IMPROVER) initiative. In particular, one of the sub-challenges was devoted to the prediction of protein phosphorylation responses in human bronchial epithelial cells, exposed to a number of different chemical stimuli, given the responses in rat bronchial epithelial cells. Participating teams were asked to make inter-species predictions on the basis of available training examples, comprising transcriptomics and phosphoproteomics data. RESULTS: Here, the two best performing teams present their data-driven approaches and computational methods. In addition, post hoc analyses of the datasets and challenge results were performed by the participants and challenge organizers. The challenge outcome indicates that successful prediction of protein phosphorylation status in human based on rat phosphorylation levels is feasible. However, within the limitations of the computational tools used, the inclusion of gene expression data does not improve the prediction quality. The post hoc analysis of time-specific measurements sheds light on the signaling pathways in both species. AVAILABILITY AND IMPLEMENTATION: A detailed description of the dataset, challenge design and outcome is available at www.sbvimprover.com. The code used by team IGB is provided under http://github.com/uci-igb/improver2013. Implementations of the algorithms applied by team AMG are available at http://bhanot.biomaps.rutgers.edu/wiki/AMG-sc2-code.zip. CONTACT: meikelbiehl@gmail.com.


Subjects
Bronchi/metabolism , Epithelial Cells/metabolism , Gene Expression Profiling , Phosphoproteins/metabolism , Software , Systems Biology/methods , Algorithms , Animals , Bronchi/cytology , Cells, Cultured , Databases, Factual , Epithelial Cells/cytology , Gene Expression Regulation , Humans , Oligonucleotide Array Sequence Analysis , Phosphorylation , Rats , Species Specificity , Translational Research, Biomedical
9.
Bioinformatics ; 31(4): 484-91, 2015 Feb 15.
Article in English | MEDLINE | ID: mdl-25294919

ABSTRACT

MOTIVATION: Animal models are important tools in drug discovery and for understanding human biology in general. However, many drugs that initially show promising results in rodents fail in later stages of clinical trials. Understanding the commonalities and differences between human and rat cell signaling networks can lead to better experimental designs, improved allocation of resources and ultimately better drugs. RESULTS: The sbv IMPROVER Species-Specific Network Inference challenge was designed to use the power of the crowds to build two species-specific cell signaling networks given phosphoproteomics, transcriptomics and cytokine data generated from NHBE and NRBE cells exposed to various stimuli. A common literature-inspired reference network with 220 nodes and 501 edges was also provided as prior knowledge from which challenge participants could add or remove edges but not nodes. Such a large network inference challenge not based on synthetic simulations but on real data presented unique difficulties in scoring and interpreting the results. Because any prior knowledge about the networks was already provided to the participants for reference, novel ways for scoring and aggregating the results were developed. Two human and rat consensus networks were obtained by combining all the inferred networks. Further analysis showed that major signaling pathways were conserved between the two species with only isolated components diverging, as in the case of ribosomal S6 kinase RPS6KA1. Overall, the consensus between inferred edges was relatively high with the exception of the downstream targets of transcription factors, which seemed more difficult to predict. CONTACT: ebilal@us.ibm.com or gustavo@us.ibm.com. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms , Crowdsourcing , Cytokines/metabolism , Gene Expression Profiling , Gene Regulatory Networks , Phosphoproteins/metabolism , Software , Systems Biology/methods , Animals , Bronchi/cytology , Bronchi/metabolism , Cell Communication , Cells, Cultured , Databases, Factual , Epithelial Cells/cytology , Epithelial Cells/metabolism , Gene Expression Regulation , Humans , Models, Animal , Oligonucleotide Array Sequence Analysis , Phosphorylation , Rats , Signal Transduction , Species Specificity
10.
Bioinformatics ; 31(4): 471-83, 2015 Feb 15.
Article in English | MEDLINE | ID: mdl-25236459

ABSTRACT

MOTIVATION: Inferring how humans respond to external cues such as drugs, chemicals, viruses or hormones is an essential question in biomedicine. Very often, however, this question cannot be addressed because it is not possible to perform experiments in humans. A reasonable alternative consists of generating responses in animal models and 'translating' those results to humans. The limitations of such translation, however, are far from clear, and systematic assessments of its actual potential are urgently needed. sbv IMPROVER (systems biology verification for Industrial Methodology for PROcess VErification in Research) was designed as a series of challenges to address translatability between humans and rodents. This collaborative crowd-sourcing initiative invited scientists from around the world to apply their own computational methodologies on a multilayer systems biology dataset composed of phosphoproteomics, transcriptomics and cytokine data derived from normal human and rat bronchial epithelial cells exposed in parallel to 52 different stimuli under identical conditions. Our aim was to understand the limits of species-to-species translatability at different levels of biological organization: signaling, transcriptional and release of secreted factors (such as cytokines). Participating teams submitted 49 different solutions across the sub-challenges, two-thirds of which were statistically significantly better than random. Additionally, similar computational methods were found to range widely in their performance within the same challenge, and no single method emerged as a clear winner across all sub-challenges. Finally, computational methods were able to effectively translate some specific stimuli and biological processes in the lung epithelial system, such as DNA synthesis, cytoskeleton and extracellular matrix, translation, immune/inflammation and growth factor/proliferation pathways, better than the expected response similarity between species. 
CONTACT: pmeyerr@us.ibm.com or Julia.Hoeng@pmi.com SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms , Cytokines/metabolism , Gene Expression Profiling , Models, Animal , Phosphoproteins/metabolism , Software , Systems Biology/methods , Animals , Bronchi/cytology , Bronchi/metabolism , Cells, Cultured , Databases, Factual , Epithelial Cells/cytology , Epithelial Cells/metabolism , Gene Expression Regulation , Gene Regulatory Networks , Humans , Oligonucleotide Array Sequence Analysis , Phosphorylation , Rats , Species Specificity , Translational Research, Biomedical
11.
Bioinformatics ; 29(22): 2892-9, 2013 Nov 15.
Article in English | MEDLINE | ID: mdl-23966112

ABSTRACT

MOTIVATION: More than a decade after microarrays were first used to predict the phenotype of biological samples, real-life applications for disease screening and identification of patients who would best benefit from treatment are still emerging. The interest of the scientific community in identifying best approaches to develop such prediction models was reaffirmed in a competition-style international collaboration called the IMPROVER Diagnostic Signature Challenge, whose results we describe herein. RESULTS: Fifty-four teams used public data to develop prediction models in four disease areas including multiple sclerosis, lung cancer, psoriasis and chronic obstructive pulmonary disease, and made predictions on blinded new data that we generated. Teams were scored using three metrics that captured various aspects of the quality of predictions, and best performers were awarded. This article presents the challenge results and introduces to the community the approaches of the best overall three performers, as well as an R package that implements the approach of the best overall team. The analyses of model performance data submitted in the challenge as well as additional simulations that we have performed revealed that (i) the quality of predictions depends more on the disease endpoint than on the particular approaches used in the challenge; (ii) the most important modeling factor (e.g. data preprocessing, feature selection and classifier type) is problem dependent; and (iii) for optimal results datasets and methods have to be carefully matched. Biomedical factors such as the disease severity and confidence in diagnosis were found to be associated with the misclassification rates across the different teams. AVAILABILITY: The lung cancer dataset is available from Gene Expression Omnibus (accession, GSE43580). The maPredictDSC R package implementing the approach of the best overall team is available at www.bioconductor.org or http://bioinformaticsprb.med.wayne.edu/.


Subjects
Gene Expression Profiling/methods , Molecular Diagnostic Techniques , Oligonucleotide Array Sequence Analysis/methods , Phenotype , Disease/genetics , Humans , Lung Neoplasms/diagnosis , Lung Neoplasms/genetics , Multiple Sclerosis/diagnosis , Multiple Sclerosis/genetics , Psoriasis/diagnosis , Psoriasis/genetics , Pulmonary Disease, Chronic Obstructive/diagnosis , Pulmonary Disease, Chronic Obstructive/genetics
12.
JMIR Ment Health ; 11: e57234, 2024 May 16.
Article in English | MEDLINE | ID: mdl-38771256

ABSTRACT

Background: Rates of suicide have increased by over 35% since 1999. Despite concerted efforts, our ability to predict, explain, or treat suicide risk has not significantly improved over the past 50 years. Objective: The aim of this study was to use large language models to understand natural language use during public web-based discussions (on Reddit) around topics related to suicidality. Methods: We used large language model-based sentence embedding to extract the latent linguistic dimensions of user postings derived from several mental health-related subreddits, with a focus on suicidality. We then applied dimensionality reduction to these sentence embeddings, allowing them to be summarized and visualized in a lower-dimensional Euclidean space for further downstream analyses. We analyzed 2.9 million posts extracted from 30 subreddits, including r/SuicideWatch, between October 1 and December 31, 2022, and the same period in 2010. Results: Our results showed that, in line with existing theories of suicide, posters in the suicidality community (r/SuicideWatch) predominantly wrote about feelings of disconnection, burdensomeness, hopelessness, desperation, resignation, and trauma. Further, we identified distinct latent linguistic dimensions (well-being, seeking support, and severity of distress) among all mental health subreddits, and many of the resulting subreddit clusters were in line with a statistically driven diagnostic classification system, namely the Hierarchical Taxonomy of Psychopathology (HiTOP), by mapping onto the proposed superspectra. Conclusions: Overall, our findings provide data-driven support for several language-based theories of suicide, as well as dimensional classification systems for mental health disorders. 
Ultimately, this novel combination of natural language processing techniques can assist researchers in gaining deeper insights about emotions and experiences shared on the web and may aid in the validation and refutation of different mental health theories.


Subjects
Linguistics , Mental Disorders , Social Media , Suicide , Humans , Social Media/statistics & numerical data , Suicide/psychology , Mental Disorders/psychology , Mental Disorders/epidemiology , Mental Disorders/classification , Natural Language Processing
13.
medRxiv ; 2024 Jun 01.
Article in English | MEDLINE | ID: mdl-38853969

ABSTRACT

Amyotrophic lateral sclerosis (ALS) is a neurodegenerative motor neuron disease that causes progressive muscle weakness. Progressive bulbar dysfunction causes dysarthria and thus social isolation, reducing quality of life. The Everything ALS Speech Study obtained longitudinal clinical information and speech recordings from 292 participants. In a subset of 120 participants, we measured speaking rate (SR) and listener effort (LE), a measure of dysarthria severity rated by speech pathologists from recordings. LE intra- and inter-rater reliability was very high (ICC 0.88 to 0.92). LE correlated with other measures of dysarthria at baseline. LE changed over time in participants with ALS (slope 0.77 pts/month; p<0.001) but not controls (slope 0.005 pts/month; p=0.807). The slope of LE progression was similar in all participants with ALS who had bulbar dysfunction at baseline, regardless of ALS site of onset. LE could be a remotely collected clinically meaningful clinical outcome assessment for ALS clinical trials.
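
Note on the record above: the abstract reports listener effort (LE) slopes in points per month. As a rough illustration of what such a slope measures, the Python sketch below fits an ordinary least-squares line to one participant's scores. This is an assumption made for illustration; the abstract does not state the statistical model used, and the visit times and scores are hypothetical.

```python
def ols_slope(x, y):
    """Least-squares slope of y on x (change in y per unit of x)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical monthly listener-effort ratings for a single participant.
months = [0, 1, 2, 3, 4]
listener_effort = [10.0, 10.5, 11.2, 11.4, 12.0]
slope = ols_slope(months, listener_effort)
print(f"{slope:.2f} pts/month")
```

A slope near zero, as the abstract reports for controls, would indicate no systematic change in listener effort over the observation period.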

14.
Bioinformatics ; 28(9): 1193-201, 2012 May 01.
Article in English | MEDLINE | ID: mdl-22423044

ABSTRACT

MOTIVATION: Analyses and algorithmic predictions based on high-throughput data are essential for the success of systems biology in academic and industrial settings. Organizations, such as companies and academic consortia, conduct large multi-year scientific studies that entail the collection and analysis of thousands of individual experiments, often over many physical sites and with internal and outsourced components. To extract maximum value, the interested parties need to verify the accuracy and reproducibility of data and methods before the initiation of such large multi-year studies. However, systematic and well-established verification procedures do not exist for automated collection and analysis workflows in systems biology, which could lead to inaccurate conclusions. RESULTS: We present here a review of the current state of systems biology verification and a detailed methodology to address its shortcomings. This methodology, named 'Industrial Methodology for Process Verification in Research' or IMPROVER, consists of evaluating a research program by dividing a workflow into smaller building blocks that are individually verified. The verification of each building block can be done internally by members of the research program or externally by 'crowd-sourcing' to an interested community. www.sbvimprover.com IMPLEMENTATION: This methodology could become the preferred choice to verify systems biology research workflows that are becoming increasingly complex and sophisticated in industrial and academic settings.


Subjects
Systems Biology/methods , Workflow , Peer Review , Periodicals as Topic , Reproducibility of Results
15.
Proc Natl Acad Sci U S A ; 107(24): 10896-901, 2010 Jun 15.
Article in English | MEDLINE | ID: mdl-20534496

ABSTRACT

With the advent of Systems Biology, the prediction of whether two proteins form a complex has become a problem of increased importance. A variety of experimental techniques have been applied to the problem, but three-dimensional structural information has not been widely exploited. Here we explore the range of applicability of such information by analyzing the extent to which the location of binding sites on protein surfaces is conserved among structural neighbors. We find, as expected, that interface conservation is most significant among proteins that have a clear evolutionary relationship, but that there is a significant level of conservation even among remote structural neighbors. This finding is consistent with recent evidence that information available from structural neighbors, independent of classification, should be exploited in the search for functional insights. The value of such structural information is highlighted through the development of a new protein interface prediction method, PredUs, that identifies what residues on protein surfaces are likely to participate in complexes with other proteins. The performance of PredUs, as measured through comparisons with other methods, suggests that relationships across protein structure space can be successfully exploited in the prediction of protein-protein interactions.


Subjects
Protein Interaction Domains and Motifs , Protein Interaction Mapping , Proteins/chemistry , Binding Sites , Conserved Sequence , Databases, Protein , Models, Molecular , Multiprotein Complexes/chemistry , Protein Interaction Mapping/statistics & numerical data , Proteins/genetics , Sequence Alignment , Structural Homology, Protein , Systems Biology
16.
Schizophr Bull ; 49(2): 444-453, 2023 03 15.
Article in English | MEDLINE | ID: mdl-36184074

ABSTRACT

BACKGROUND AND HYPOTHESIS: Disturbances in self-experience are a central feature of schizophrenia, and their study can enhance phenomenological understanding and inform mechanisms underlying clinical symptoms. Self-experience involves the sense of self-presence, of being the subject of one's own experiences and agent of one's own actions, and of being distinct from others. Self-experience is traditionally assessed by manual rating of interviews; however, natural language processing (NLP) offers an automated approach that can augment manual ratings by rapid and reliable analysis of text. STUDY DESIGN: We elicited autobiographical narratives from 167 patients with schizophrenia or schizoaffective disorder (SZ) and 90 healthy controls (HC), amounting to 490 000 words and 26 000 sentences. We used NLP techniques to examine transcripts for language related to self-experience, machine learning to validate group differences in language, and canonical correlation analysis to examine the relationship between language and symptoms. STUDY RESULTS: Topics related to self-experience and agency emerged as significantly more expressed in SZ than HC (P < 10^-13) and were decoupled from similarly emerging features such as emotional tone, semantic coherence, and concepts related to burden. Further validation on hold-out data showed that a classifier trained on these features achieved patient-control discrimination with AUC = 0.80 (P < 10^-5). Canonical correlation analysis revealed significant relationships between self-experience and agency language features and clinical symptoms. CONCLUSIONS: Notably, the self-experience and agency topics emerged without any explicit probing by the interviewer and can be algorithmically detected even though they involve higher-order metacognitive processes. These findings illustrate the utility of NLP methods to examine phenomenological aspects of schizophrenia.


Subjects
Metacognition , Psychotic Disorders , Schizophrenia , Humans , Semantics , Natural Language Processing
17.
Commun Med (Lond) ; 3(1): 104, 2023 Jul 27.
Article in English | MEDLINE | ID: mdl-37500763

ABSTRACT

BACKGROUND: There is a prevailing view that humans' capacity to use language to characterize sensations like odors or tastes is poor, providing an unreliable source of information. METHODS: Here, we developed a machine learning method based on Natural Language Processing (NLP) using Large Language Models (LLMs) to predict COVID-19 diagnosis solely based on text descriptions of acute changes in chemosensation, i.e., smell, taste and chemesthesis, caused by the disease. The dataset of more than 1500 subjects was obtained from survey responses early in the COVID-19 pandemic, in Spring 2020. RESULTS: When predicting COVID-19 diagnosis, our NLP model performs comparably (AUC ROC ~ 0.65) to models based on self-reported changes in function collected via quantitative rating scales. Further, our NLP model could attribute importance to words when performing the prediction; sentiment and descriptive words such as "smell", "taste", and "sense" had strong contributions to the predictions. In addition, adjectives describing specific tastes or smells such as "salty", "sweet", "spicy", and "sour" also contributed considerably to predictions. CONCLUSIONS: Our results show that the description of perceptual symptoms caused by a viral infection can be used to fine-tune an LLM to correctly predict and interpret the diagnostic status of a subject. In the future, similar models may have utility for patient verbatims from online health portals or electronic health records.


Early in the COVID-19 pandemic, people who were infected with SARS-CoV-2 reported changes in smell and taste. To better study these symptoms of SARS-CoV-2 infections and potentially use them to identify infected patients, a survey was undertaken in various countries asking people about their COVID-19 symptoms. One part of the questionnaire asked people to describe the changes in smell and taste they were experiencing. We developed a computational program that could use these responses to correctly distinguish people that had tested positive for SARS-CoV-2 infection from people without SARS-CoV-2 infection. This approach could allow rapid identification of people infected with SARS-CoV-2 from descriptions of their sensory symptoms and be adapted to identify people infected with other viruses in the future.

18.
Nucleic Acids Res ; 38(Web Server issue): W550-4, 2010 Jul.
Article in English | MEDLINE | ID: mdl-20525783

ABSTRACT

The construction of a homology model for a protein can involve a number of decisions requiring the integration of different sources of information and the application of different modeling tools depending on the particular problem. Functional information can be especially important in guiding the modeling process, but such information is not generally integrated into modeling pipelines. Pudge is a flexible, interactive protein structure prediction server, which is designed with these issues in mind. By dividing the modeling into five stages (template selection, alignment, model building, model refinement and model evaluation) and providing various tools to visualize, analyze and compare the results at each stage, we enable a flexible modeling strategy that can be tailored to the needs of a given problem. Pudge is freely available at http://wiki.c2b2.columbia.edu/honiglab_public/index.php/Software:PUDGE.


Subjects
Software , Structural Homology, Protein , Bacterial Proteins/chemistry , Internet , Models, Molecular , User-Computer Interface
19.
Comput Psychiatr ; 6(1): 1-7, 2022.
Article in English | MEDLINE | ID: mdl-38774775

ABSTRACT

We conducted a feasibility analysis to determine the quality of data that could be collected ambiently during routine clinical conversations. We used inexpensive, consumer-grade hardware to record unstructured dialogue and open-source software tools to quantify and model face, voice (acoustic and language) and movement features. We used an external validation set to perform proof-of-concept predictive analyses and show that clinically relevant measures can be produced without a restrictive protocol.

20.
Hepatology ; 50(2): 575-84, 2009 Aug.
Article in English | MEDLINE | ID: mdl-19582816

ABSTRACT

Transforming growth factor-beta / bone morphogenetic protein (TGFbeta/BMP) signaling has a gradient of effects on cell fate choice in the fetal mouse liver. The molecular mechanism to understand why adjacent cells develop into bile ducts or grow actively as hepatocytes in the ubiquitous presence of both TGFbeta ligands and receptors has been unknown. We hypothesized that microRNAs (miRNAs) might play a role in cell fate decisions in the liver. miRNA profiling during late fetal development in the mouse identified miR-23b cluster miRNAs comprising miR-23b, miR-27b, and miR-24-1 and miR-10a, miR-26a, and miR-30a as up-regulated. In situ hybridization of fetal liver at embryonic day 17.5 of gestation revealed miR-23b cluster expression only in fetal hepatocytes. A complementary (c)DNA microarray approach was used to identify genes with a reciprocal expression pattern to that of miR-23b cluster miRNAs. This approach identified Smads (mothers against decapentaplegic homolog), the key TGFbeta signaling molecules, as putative miR-23b cluster targets. Bioinformatic analysis identified multiple candidate target sites in the 3' UTRs (untranslated regions) of Smads 3, 4, and 5. Dual luciferase reporter assays confirmed down-regulation of constructs containing Smad 3, 4, or 5, 3' UTRs by a mixture of miR-23b cluster mimics. Knockdown of miR-23b miRNAs during hepatocytic differentiation of a fetal liver stem cell line, HBC-3, promoted expression of bile duct genes, in addition to Smads, in these cells. In contrast, ectopic expression of miR-23b mimics during bile duct differentiation of HBC-3 cells blocked the process. CONCLUSION: Our data provide a model in which miR-23b miRNAs repress bile duct gene expression in fetal hepatocytes while promoting their growth by down-regulating Smads and consequently TGFbeta signaling. Concomitantly, low levels of the miR-23b miRNAs are needed in cholangiocytes to allow TGFbeta signaling and bile duct formation.


Subjects
Cell Differentiation , Hepatocytes/metabolism , Liver/metabolism , MicroRNAs/metabolism , Transforming Growth Factor beta/metabolism , Animals , Bile Ducts/cytology , Bone Morphogenetic Proteins/metabolism , Cell Line , Gene Expression Profiling , Hepatocytes/cytology , Liver/cytology , Liver/embryology , Mice , Oligonucleotide Array Sequence Analysis , RNA, Small Interfering/metabolism , Smad Proteins/metabolism