Results 1 - 20 of 55
1.
J Biomed Inform ; 148: 104554, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38000767

ABSTRACT

OBJECTIVE: Treatment pathways are step-by-step plans outlining the recommended medical care for specific diseases; they get revised when different treatments are found to improve patient outcomes. Examining health records is an important part of this revision process, but inferring patients' actual treatments from health data is challenging due to complex event-coding schemes and the absence of pathway-related annotations. The objective of this study is to develop a method for inferring actual treatment steps for a particular patient group from administrative health records - a common form of tabular healthcare data - and address several technique- and methodology-based gaps in treatment pathway-inference research. METHODS: We introduce Defrag, a method for examining health records to infer the real-world treatment steps for a particular patient group. Defrag learns the semantic and temporal meaning of healthcare event sequences, allowing it to reliably infer treatment steps from complex healthcare data. To our knowledge, Defrag is the first pathway-inference method to utilise a neural network (NN), an approach made possible by a novel, self-supervised learning objective. We also developed a testing and validation framework for pathway inference, which we use to characterise and evaluate Defrag's pathway inference ability, establish benchmarks, and compare against baselines. RESULTS: We demonstrate Defrag's effectiveness by identifying best-practice pathway fragments for breast cancer, lung cancer, and melanoma in public healthcare records. Additionally, we use synthetic data experiments to demonstrate the characteristics of the Defrag inference method, and to compare Defrag to several baselines, where it significantly outperforms non-NN-based methods. CONCLUSIONS: Defrag offers an innovative and effective approach for inferring treatment pathways from complex health data. 
Defrag significantly outperforms several existing pathway-inference methods, but computationally-derived treatment pathways are still difficult to compare against clinical guidelines. Furthermore, the open-source code for Defrag and the testing framework are provided to encourage further research in this area.


Subjects
Breast Neoplasms , Electronic Health Records , Humans , Female
2.
BMC Med Inform Decis Mak ; 23(1): 295, 2023 12 20.
Article in English | MEDLINE | ID: mdl-38124044

ABSTRACT

BACKGROUND: Visualising patient genomic data in a cohort with embedding data analytics models can provide relevant and sensible patient comparisons to assist a clinician with treatment decisions. As immersive technology is actively used around the medical world, there is a rising demand for an efficient environment that can effectively display genomic data visualisations on immersive devices such as a Virtual Reality (VR) environment. VR technology will allow clinicians, biologists, and computer scientists to explore a cohort of individual patients within the 3D environment. However, demonstrating the feasibility of the VR prototype requires feedback from domain users to inform future user-centred design and a better cognitive model of human-computer interactions. There is limited research on collecting and integrating domain knowledge into the prototype design. OBJECTIVE: A usability study was conducted for the VR prototype, Virtual Reality to Observe Oncology data Models (VROOM). VROOM was designed based on a preliminary study among medical users. The goals of this usability study included establishing a baseline of user experience, validating user performance measures, and identifying potential design improvements to be addressed to improve efficiency, functionality, and end-user satisfaction. METHODS: The study was conducted with a group of domain users (10 males, 10 females) with portable VR devices and camera equipment. These domain users included medical users such as clinicians and genetic scientists and computing domain users such as bioinformaticians and data analysts. Users were asked to complete routine tasks based on a clinical scenario. Sessions were recorded and analysed to identify potential areas for improvement to the data visual analytics projects in the VR environment. The one-hour usability study included learning VR interaction gestures, running the visual analytics tool, and collecting before and after feedback. 
The feedback was analysed with different methods to measure effectiveness. The Mann-Whitney U test was used to analyse task performance among the different participant groups, and multiple data visualisations were created to find insights from questionnaire answers. RESULTS: The usability study investigated the feasibility of using VR for genomic data analysis in domain users' daily work. From the feedback, 65% of the participants, especially clinicians (75% of them), indicated that the VR prototype is potentially helpful for domain users' daily work but needed more flexibility, such as allowing them to define their own features for the machine learning part, add new patient data, and import their datasets in a better way. We calculated the engaged time for each task and compared it among different user groups. Computing domain users spent 50% more time exploring the algorithms and datasets than medical domain users. Additionally, the medical domain users engaged with the data visual analytics parts approximately 20% longer than the computing domain users.
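
The Mann-Whitney U test mentioned in this abstract compares two independent groups by rank rather than by mean. The following is a minimal sketch of how the U statistic is computed, assuming average ranks for ties; the function name and the example task times are hypothetical and are not taken from the study.

```python
from itertools import chain


def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) for two independent samples, using average ranks for ties."""
    pooled = sorted(chain(sample_a, sample_b))
    # Assign each distinct value the mean of the 1-based ranks it occupies.
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    n_a, n_b = len(sample_a), len(sample_b)
    rank_sum_a = sum(rank_of[x] for x in sample_a)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Hypothetical task completion times in minutes (illustrative values only).
medical_users = [4.0, 5.5, 6.0, 7.5]
computing_users = [6.5, 8.0, 9.0, 9.5]
u_medical, u_computing = mann_whitney_u(medical_users, computing_users)
```

The smaller of the two U values is usually compared against a critical value or converted to a p-value; that step is omitted here and would normally be left to a statistics library.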


Subjects
Neoplasms , Physicians , Virtual Reality , Male , Female , Humans , Computers , Health Personnel , Neoplasms/genetics , Neoplasms/therapy
3.
BMC Bioinformatics ; 22(1): 588, 2021 Dec 11.
Article in English | MEDLINE | ID: mdl-34895138

ABSTRACT

BACKGROUND: Copy number variants (CNVs) are the gain or loss of DNA segments in the genome. Studies have shown that CNVs are linked to various disorders, including autism, intellectual disability, and schizophrenia. Consequently, the interest in studying a possible association of CNVs with specific disease traits is growing. However, due to the specific multi-dimensional characteristics of CNVs, methods for testing the association between CNVs and disease-related traits are still underdeveloped. We propose a novel multi-dimensional CNV kernel association test (MCKAT) in this paper. We aim to find significant associations between CNVs and disease-related traits using kernel-based methods. RESULTS: We address the multi-dimensionality in CNV characteristics. We first design a single-pair CNV kernel, which contains three sub-kernels to summarize the similarity between two CNVs considering all CNV characteristics. We then aggregate the single-pair CNV kernels into a whole-chromosome CNV kernel, which summarizes the similarity between CNVs in two or more chromosomes. Finally, the association between the CNVs and disease-related traits is evaluated by comparing the similarity in the trait with kernel-based similarity using a score test in a random effect model. We apply MCKAT to genome-wide CNV datasets to examine the association between CNVs and disease-related traits, which demonstrates the potential usefulness of the proposed method for CNV association tests. We compare the performance of MCKAT with CKAT, a uni-dimensional kernel method. Based on the results, MCKAT provides stronger evidence (smaller p-values) in detecting significant associations between CNVs and disease-related traits in both rare and common CNV datasets. CONCLUSION: A multi-dimensional copy number variant kernel association test can detect CNV regions that are statistically significantly associated with a disease-related trait. 
MCKAT can provide biologists with CNV hot spots at the cytogenetic band level, where CNVs may have a significant association with disease-related traits. Using MCKAT, biologists can narrow their investigation from the whole genome, including many genes and CNVs, to the more specific cytogenetic bands that MCKAT identifies. Furthermore, MCKAT can help biologists detect CNVs significantly associated with disease-related traits across a patient group instead of examining each subject's CNVs case by case.


Subjects
DNA Copy Number Variations , Genome , Genome-Wide Association Study , Humans , Phenotype
4.
J Cell Mol Med ; 25(16): 8047-8061, 2021 08.
Article in English | MEDLINE | ID: mdl-34165249

ABSTRACT

Irritable bowel syndrome (IBS) is a gut-brain disorder in which symptoms are shaped by serotonin acting centrally and peripherally. The serotonin transporter gene SLC6A4 has been implicated in IBS pathophysiology, but the underlying genetic mechanisms remain unclear. We sequenced the alternative P2 promoter driving intestinal SLC6A4 expression and identified single nucleotide polymorphisms (SNPs) that were associated with IBS in a discovery sample. Identified SNPs built different haplotypes, and the tagging SNP rs2020938 seems to associate with constipation-predominant IBS (IBS-C) in females. rs2020938 validation was performed in 1,978 additional IBS patients and 6,038 controls from eight countries. Meta-analysis on data from 2,175 IBS patients and 6,128 controls confirmed the association with female IBS-C. Expression analyses revealed that the P2 promoter drives SLC6A4 expression primarily in the small intestine. Gene reporter assays showed a functional impact of SNPs in the P2 region. In silico analysis of the polymorphic promoter indicated differential expression regulation. Further follow-up revealed that the major allele of the tagging SNP rs2020938 correlates with differential SLC6A4 expression in the jejunum and with stool consistency, indicating functional relevance. Our data consolidate rs2020938 as a functional SNP associated with IBS-C risk in females, underlining the relevance of SLC6A4 in IBS pathogenesis.


Subjects
Biomarkers/metabolism , Irritable Bowel Syndrome/pathology , Phenotype , Polymorphism, Single Nucleotide , Promoter Regions, Genetic , Serotonin Plasma Membrane Transport Proteins/genetics , Serotonin/metabolism , Female , Haplotypes , Humans , Intestinal Mucosa/metabolism , Intestinal Mucosa/pathology , Irritable Bowel Syndrome/etiology , Irritable Bowel Syndrome/metabolism
5.
Behav Pharmacol ; 31(1): 73-80, 2020 02.
Article in English | MEDLINE | ID: mdl-31625973

ABSTRACT

Linalool is an enantiomeric monoterpene compound identified as the pharmacologically active constituent in a number of essential oils. It has been reported to display anxiolytic properties in humans and in animal models and to exert both GABAergic and glutamatergic effects. In Experiment 1 linalool (100, 200, and 300, i.p.) had no significant effects compared with saline in an activity tracker with C57BL/6j mice. Experiment 2 assessed the effects on operant extinction with mice of chlordiazepoxide at a dose (15 mg/kg, i.p.) previously shown to facilitate extinction, and the same doses of linalool, compared with saline. Linalool had a dose-related facilitatory effect on extinction. While the effects of the highest dose of linalool most closely resembled the effects of chlordiazepoxide, the pattern of results suggested that linalool may affect both the acquisition of extinction learning, which is influenced by glutamatergic processes, and the expression of extinction, known to be affected by GABAergic agents such as chlordiazepoxide.


Subjects
Acyclic Monoterpenes/pharmacology , Conditioning, Operant/drug effects , Extinction, Psychological/drug effects , Acyclic Monoterpenes/metabolism , Animals , Anti-Anxiety Agents/pharmacology , Behavior, Animal/drug effects , Chlordiazepoxide/pharmacology , Male , Mice , Mice, Inbred C57BL
6.
BMC Bioinformatics ; 18(1): 527, 2017 Nov 29.
Article in English | MEDLINE | ID: mdl-29187149

ABSTRACT

BACKGROUND: Data from patients with rare diseases is often produced using different platforms and probe sets because patients are widely distributed in space and time. Aggregating such data requires a method of normalization that makes patient records comparable. RESULTS: This paper proposes DBNorm, an algorithm implemented as an R package that normalizes arbitrarily distributed data to a common, comparable form. Specifically, DBNorm merges data distributions by fitting functions to each of them, and using the probability of each element drawn from the fitted distribution to merge it into a global distribution. DBNorm contains state-of-the-art fitting functions including Polynomial, Fourier and Gaussian distributions, and also allows users to define their own fitting functions if required. CONCLUSIONS: The performance of DBNorm is compared with z-score, average difference, quantile normalization and ComBat on a set of datasets, including several that are publicly available. The performance of these normalization methods is compared using statistics, visualization, and classification when class labels are known, based on a number of self-generated and public microarray datasets. The experimental results show that DBNorm achieves better normalization results than conventional methods. Finally, the approach has the potential to be applicable outside bioinformatics analysis.
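
The idea of mapping each value through the probability given by a fitted distribution into a shared global distribution can be illustrated with a short sketch. This is not the DBNorm R package or its API; it assumes a Gaussian fit only (the abstract also lists Polynomial and Fourier fits), and the function name, the clamping constant, and the sample values are hypothetical.

```python
from statistics import NormalDist


def normalize_to_global(values, global_dist):
    """Map values onto global_dist via the CDF of a Gaussian fitted to them.

    Assumption: a Gaussian describes the source data well enough. Each value
    is converted to its cumulative probability under the fitted distribution,
    then to the value with the same probability under the global distribution.
    """
    fitted = NormalDist.from_samples(values)  # needs at least two distinct values
    normalized = []
    for value in values:
        probability = fitted.cdf(value)
        # Keep the probability strictly inside (0, 1) so inv_cdf is defined.
        probability = min(max(probability, 1e-9), 1 - 1e-9)
        normalized.append(global_dist.inv_cdf(probability))
    return normalized


# Two hypothetical platforms measuring on different scales.
platform_a = [10.0, 20.0, 30.0]
platform_b = [1000.0, 1500.0, 2000.0]
standard = NormalDist(mu=0.0, sigma=1.0)
merged = normalize_to_global(platform_a, standard) + normalize_to_global(platform_b, standard)
```

With a Gaussian fit and a Gaussian target the mapping reduces to a z-score, so the benefit of this style of normalization only appears when a non-Gaussian fitting function is used for skewed or multi-modal data.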


Subjects
Oligonucleotide Array Sequence Analysis/methods , Software , Area Under Curve , Gene Expression Regulation, Neoplastic , Humans , Normal Distribution , Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics , Precursor Cell Lymphoblastic Leukemia-Lymphoma/pathology , Principal Component Analysis , ROC Curve , User-Computer Interface
7.
Brain Behav Immun ; 61: 50-59, 2017 Mar.
Article in English | MEDLINE | ID: mdl-27865949

ABSTRACT

BACKGROUND: Preclinical studies have identified certain probiotics as psychobiotics - live microorganisms with a potential mental health benefit. Lactobacillus rhamnosus (JB-1) has been shown to reduce stress-related behaviour and corticosterone release and to alter central expression of GABA receptors in an anxious mouse strain. However, it is unclear if this single putative psychobiotic strain has psychotropic activity in humans. Consequently, we aimed to examine if these promising preclinical findings could be translated to healthy human volunteers. OBJECTIVES: To determine the impact of L. rhamnosus on stress-related behaviours, physiology, inflammatory response, cognitive performance and brain activity patterns in healthy male participants. METHODS: An 8-week, randomized, placebo-controlled, cross-over design was employed. Twenty-nine healthy male volunteers participated. Participants completed self-report stress measures, cognitive assessments and resting electroencephalography (EEG). Plasma IL10, IL1β, IL6, IL8 and TNFα levels and whole blood Toll-like receptor 4 (TLR-4) agonist-induced cytokine release were determined by multiplex ELISA. Salivary cortisol was determined by ELISA and subjective stress measures were assessed before, during and after a socially evaluated cold pressor test (SECPT). RESULTS: There was no overall effect of probiotic treatment on measures of mood, anxiety, stress or sleep quality and no significant effect of probiotic over placebo on subjective stress measures, or the HPA response to the SECPT. Visuospatial memory performance, attention switching, rapid visual information processing, emotion recognition and associated EEG measures did not show improvement over placebo. No significant anti-inflammatory effects were seen as assessed by basal and stimulated cytokine levels. CONCLUSIONS: L. rhamnosus was not superior to placebo in modifying stress-related measures, HPA response, inflammation or cognitive performance in healthy male participants. 
These findings highlight the challenges associated with moving promising preclinical studies, conducted in an anxious mouse strain, to healthy human participants. Future interventional studies investigating the effect of this psychobiotic in populations with stress-related disorders are required.


Subjects
Attention/drug effects , Cognition/drug effects , Lacticaseibacillus rhamnosus , Probiotics/administration & dosage , Stress, Psychological/drug therapy , Adult , Brain/drug effects , Cognition/physiology , Cross-Over Studies , Cytokines/blood , Double-Blind Method , Electroencephalography , Healthy Volunteers , Humans , Hydrocortisone/analysis , Male , Neuropsychological Tests , Probiotics/therapeutic use , Saliva/chemistry , Stress, Psychological/psychology , Young Adult
8.
Brief Bioinform ; 14(6): 753-74, 2013 Nov.
Article in English | MEDLINE | ID: mdl-23097412

ABSTRACT

In this article, a framework for an in silico pipeline is presented as a guide to high-throughput vaccine candidate discovery for eukaryotic pathogens, such as helminths and protozoa. Eukaryotic pathogens are mostly parasitic and cause some of the most damaging and difficult to treat diseases in humans and livestock. Consequently, these parasitic pathogens have a significant impact on economy and human health. The pipeline is based on the principle of reverse vaccinology and is constructed from freely available bioinformatics programs. There are several successful applications of reverse vaccinology to the discovery of subunit vaccines against prokaryotic pathogens but not yet against eukaryotic pathogens. The overriding aim of the pipeline, which focuses on eukaryotic pathogens, is to generate through computational processes of elimination and evidence gathering a ranked list of proteins based on a scoring system. These proteins are either surface components of the target pathogen or are secreted by the pathogen and are of a type known to be antigenic. No perfect predictive method is yet available; therefore, the highest-scoring proteins from the list require laboratory validation.


Subjects
Eukaryotic Cells/immunology , Vaccines , Computer Simulation
9.
Bioinformatics ; 30(16): 2381-3, 2014 Aug 15.
Article in English | MEDLINE | ID: mdl-24790156

ABSTRACT

UNLABELLED: We present Vacceed, a highly configurable and scalable framework designed to automate the process of high-throughput in silico vaccine candidate discovery for eukaryotic pathogens. Given thousands of protein sequences from the target pathogen as input, the main output is a ranked list of protein candidates determined by a set of machine learning algorithms. Vacceed has the potential to save time and money by reducing the number of false candidates allocated for laboratory validation. Vacceed, if required, can also predict protein sequences from the pathogen's genome. AVAILABILITY AND IMPLEMENTATION: Vacceed is tested on Linux and can be freely downloaded from https://github.com/sgoodswe/vacceed/releases (includes a worked example with sample data). Vacceed User Guide can be obtained from https://github.com/sgoodswe/vacceed.


Subjects
Software , Vaccines/chemistry , Algorithms , Artificial Intelligence , Computer Simulation , Sequence Analysis, Protein , Vaccines/genetics
10.
J Theor Biol ; 380: 271-9, 2015 Sep 07.
Article in English | MEDLINE | ID: mdl-26026830

ABSTRACT

Co-regulations of miRNAs have been studied much less than regulations between miRNAs and their target genes, although these two problems are equally important for understanding the entire mechanisms of complex post-transcriptional regulations. The difficulty in constructing a miRNA-miRNA co-regulation network lies in determining reliable miRNA pairs from various data resources related to the same disease, such as expression levels, gene ontology (GO) databases, and protein-protein interactions. Here we take a novel integrative approach to the discovery of miRNA-miRNA co-regulation networks. This approach can progressively refine the various types of data and the computational analysis results. Applied to three lung cancer miRNA expression data sets of different subtypes, our method has identified a miRNA-miRNA co-regulation network and co-regulating functional modules common to lung cancer. An example of these functional modules consists of genes SMAD2, ACVR1B, ACVR2A and ACVR2B. This module is synergistically regulated by let-7a/b/c/f, is enriched in the same GO category, and has a close proximity in the protein interaction network. We also find that the co-regulation network is scale free and that lung cancer related miRNAs have more synergism in the network. According to our literature survey and database validation, many of these results are biologically meaningful for understanding the mechanism of the complex post-transcriptional regulations in lung cancer.


Subjects
Gene Regulatory Networks , Lung Neoplasms/genetics , MicroRNAs/genetics , Computational Biology , Datasets as Topic , Humans
11.
BMC Bioinformatics ; 15: 272, 2014 Aug 11.
Article in English | MEDLINE | ID: mdl-25109603

ABSTRACT

BACKGROUND: Neuroblastoma Tumor (NT) is one of the most aggressive types of infant cancer. Essential to accurate diagnosis and prognosis is cellular quantitative analysis of the tumor. Counting enormous numbers of cells under an optical microscope is error-prone. There is therefore an urgent demand from pathologists for robust and automated cell counting systems. However, the main challenge in developing these systems is their inability to distinguish between overlapping cells and single cells, and to split the overlapping cells. We address this challenge in two stages by: 1) distinguishing overlapping cells from single cells using the morphological differences between them such as area, uniformity of diameters and cell concavity; and 2) splitting overlapping cells into single cells. We propose a novel approach that uses the dominant concave regions of cells as markers to identify the overlap region. We then find the initial splitting points at the critical points of the concave regions by decomposing the concave regions into their components such as arcs, chords and edges, and the distance between the components is analyzed using the developed seed growing technique. Lastly, a shortest path determination approach is developed to determine the optimum splitting route between two candidate initial splitting points. RESULTS: We compare the cell counting results of our system with those of a pathologist as the ground-truth. We also compare the system with three state-of-the-art methods, and the results of statistical tests show a significant improvement in the performance of our system compared to state-of-the-art methods. The F-measure obtained by our system is 88.70%. To evaluate the generalizability of our algorithm, we apply it to images of follicular lymphoma, which has similar histological regions to NT. Of the algorithms tested, our algorithm obtains the highest F-measure of 92.79%. 
CONCLUSION: We develop a novel overlapping cell splitting algorithm to enhance the cellular quantitative analysis of infant neuroblastoma. The performance of the proposed algorithm promises a reliable automated cell counting system for pathology laboratories. Moreover, the high performance obtained by our algorithm for images of follicular lymphoma demonstrates the generalization of the proposed algorithm for cancers with similar histological regions and histological structures.
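
The F-measure reported in this abstract is conventionally the harmonic mean of precision and recall. The sketch below shows that calculation for a cell-counting setting, assuming detections have already been matched against a pathologist's ground truth; the function name and the counts are hypothetical and do not come from the paper.

```python
def f_measure(true_positives, false_positives, false_negatives):
    """Harmonic mean of precision and recall (F1); returns 0.0 when undefined."""
    detected = true_positives + false_positives
    actual = true_positives + false_negatives
    if detected == 0 or actual == 0 or true_positives == 0:
        return 0.0
    precision = true_positives / detected  # share of detected cells that are real
    recall = true_positives / actual       # share of real cells that were detected
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 80 cells found correctly, 20 spurious, 20 missed.
score = f_measure(true_positives=80, false_positives=20, false_negatives=20)
```

Because it penalises both missed cells and spurious splits, the F-measure suits overlapping-cell problems where over-splitting and under-splitting are both possible errors.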


Subjects
Cell Count/methods , Neuroblastoma/pathology , Algorithms , Humans , Lymphoma, Follicular/pathology , Single-Cell Analysis
12.
BMC Bioinformatics ; 14: 315, 2013 Nov 02.
Article in English | MEDLINE | ID: mdl-24180526

ABSTRACT

BACKGROUND: An in silico vaccine discovery pipeline for eukaryotic pathogens typically consists of several computational tools to predict protein characteristics. The aim of the in silico approach to discovering subunit vaccines is to use predicted characteristics to identify proteins which are worthy of laboratory investigation. A major challenge is that these predictions inherently contain hidden inaccuracies and contradictions. This study focuses on how to reduce the number of false candidates using machine learning algorithms rather than relying on expensive laboratory validation. Proteins from Toxoplasma gondii, Plasmodium sp., and Caenorhabditis elegans were used as training and test datasets. RESULTS: The results show that machine learning algorithms can effectively distinguish expected true from expected false vaccine candidates (with an average sensitivity and specificity of 0.97 and 0.98 respectively), for proteins observed to induce immune responses experimentally. CONCLUSIONS: Vaccine candidates from an in silico approach can only be truly validated in a laboratory. Given any in silico output and appropriate training data, the number of false candidates allocated for validation can be dramatically reduced using a pool of machine learning algorithms. This will ultimately save time and money in the laboratory.
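
Sensitivity and specificity, as quoted in this abstract, are the fractions of true candidates and false candidates that a classifier labels correctly. A minimal sketch of the calculation follows; the function name and the label lists are hypothetical and are not data from the study.

```python
def sensitivity_specificity(actual, predicted):
    """Return (sensitivity, specificity) from parallel lists of booleans.

    actual[i] is True when protein i is an expected true vaccine candidate;
    predicted[i] is True when the classifier labels it as a candidate.
    """
    tp = fn = tn = fp = 0
    for is_candidate, labelled_candidate in zip(actual, predicted):
        if is_candidate and labelled_candidate:
            tp += 1
        elif is_candidate:
            fn += 1
        elif labelled_candidate:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Hypothetical labels for six proteins (illustrative values only).
expected = [True, True, True, False, False, False]
classified = [True, True, False, False, False, True]
sens, spec = sensitivity_specificity(expected, classified)
```

High specificity is what reduces the number of false candidates sent for laboratory validation, while high sensitivity guards against discarding genuine candidates.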


Subjects
Antigens/immunology , Caenorhabditis elegans Proteins/immunology , Computational Biology/methods , Computer Simulation , Protozoan Proteins/immunology , Vaccines/immunology , Algorithms , Animals , Antigens/chemistry , Artificial Intelligence , Caenorhabditis elegans Proteins/chemistry , Drug Discovery , Protozoan Proteins/chemistry , Sensitivity and Specificity , Vaccines/chemistry
13.
BMC Bioinformatics ; 14: 261, 2013 Aug 27.
Article in English | MEDLINE | ID: mdl-23981907

ABSTRACT

BACKGROUND: The wealth of gene expression values being generated by high throughput microarray technologies leads to complex high dimensional datasets. Moreover, many cohorts have the problem of imbalanced classes where the number of patients belonging to each class is not the same. With this kind of dataset, biologists need to identify a small number of informative genes that can be used as biomarkers for a disease. RESULTS: This paper introduces a Balanced Iterative Random Forest (BIRF) algorithm to select the most relevant genes for a disease from imbalanced high-throughput gene expression microarray data. Balanced iterative random forest is applied on four cancer microarray datasets: a childhood leukaemia dataset, which represents the main target of this paper, collected from The Children's Hospital at Westmead, NCI 60, a Colon dataset and a Lung cancer dataset. The results obtained by BIRF are compared to those of Support Vector Machine-Recursive Feature Elimination (SVM-RFE), Multi-class SVM-RFE (MSVM-RFE), Random Forest (RF) and Naive Bayes (NB) classifiers. The BIRF approach outperforms these state-of-the-art methods, especially in the case of imbalanced datasets. Experiments on the childhood leukaemia dataset show that BIRF achieves 7% to 12% better accuracy than MSVM-RFE, with the ability to predict patients in the minor class. The informative biomarkers selected by the BIRF algorithm were validated by repeating training experiments three times to see whether they are globally informative, or just selected by chance. The results show that 64% of the top genes consistently appear in the three lists, and the top 20 genes remain near the top in the other three lists. CONCLUSION: The designed BIRF algorithm is an appropriate choice to select genes from imbalanced high-throughput gene expression microarray data. BIRF outperforms the state-of-the-art methods, especially in its ability to handle class-imbalanced data. 
Moreover, the analysis of the selected genes also provides a way to distinguish between the predictive genes and those that only appear to be predictive.
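
The abstract does not state how BIRF balances classes. One common way to train tree ensembles on imbalanced cohorts is to draw a balanced bootstrap sample for each tree, and the sketch below shows only that generic step, under that assumption. It is not the BIRF algorithm itself, and the function name and class labels are hypothetical.

```python
import random
from collections import defaultdict


def balanced_bootstrap(labels, rng):
    """Return sample indices with an equal number of draws from every class.

    Each class is sampled with replacement, and the number of draws per class
    equals the size of the smallest class, so no class dominates the sample.
    """
    indices_by_class = defaultdict(list)
    for index, label in enumerate(labels):
        indices_by_class[label].append(index)
    draws_per_class = min(len(indices) for indices in indices_by_class.values())
    sample = []
    for indices in indices_by_class.values():
        sample.extend(rng.choices(indices, k=draws_per_class))
    return sample


# Hypothetical cohort: nine patients in the major class, three in the minor class.
cohort_labels = ["major"] * 9 + ["minor"] * 3
tree_sample = balanced_bootstrap(cohort_labels, random.Random(0))
```

Each tree in a forest would receive its own balanced sample, so minority-class patients influence every tree even though they are rare in the full cohort.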


Subjects
Computational Biology/methods , Gene Expression Profiling/methods , Genetic Markers/genetics , Oligonucleotide Array Sequence Analysis/methods , Bayes Theorem , Child , Female , Humans , Models, Genetic , Neoplasms/genetics , Neoplasms/metabolism , Reproducibility of Results , Support Vector Machine
14.
FEMS Microbiol Rev ; 47(2)2023 03 10.
Article in English | MEDLINE | ID: mdl-36806618

ABSTRACT

Reverse vaccinology (RV) was described at its inception in 2000 as an in silico process that starts from the genomic sequence of the pathogen and ends with a list of potential protein and/or peptide candidates to be experimentally validated for vaccine development. Twenty-two years later, this process has evolved from a few steps entailing a handful of bioinformatics tools to a multitude of steps with a plethora of tools. Other in silico related processes with overlapping workflow steps have also emerged with terms such as subtractive proteomics, computational vaccinology, and immunoinformatics. From the perspective of a new RV practitioner, determining the appropriate workflow steps and bioinformatics tools can be a time consuming and overwhelming task, given the number of choices. This review presents the current understanding of RV and its usage in the research community as determined by a comprehensive survey of scientific papers published in the last seven years. We believe the current mainstream workflow steps and tools presented here will be a valuable guideline for all researchers wanting to apply an up-to-date in silico vaccine discovery process.


Subjects
Vaccines , Vaccinology , Vaccinology/methods , Genomics/methods , Computational Biology/methods , Proteomics/methods
15.
Sci Rep ; 13(1): 8243, 2023 05 22.
Article in English | MEDLINE | ID: mdl-37217589

ABSTRACT

Vaccine discovery against eukaryotic parasites is not trivial, as highlighted by the limited number of known vaccines compared to the number of protozoal diseases that need one. Only three of 17 priority diseases have commercial vaccines. Live and attenuated vaccines have proved to be more effective than subunit vaccines but pose more unacceptable risks. One promising approach for subunit vaccines is in silico vaccine discovery, which predicts protein vaccine candidates given thousands of target organism protein sequences. This approach, nonetheless, is an overarching concept with no standardised guidebook on implementation. No known subunit vaccines against protozoan parasites exist as a result of this approach, and consequently there are none to emulate. The study goal was to combine current in silico discovery knowledge specific to protozoan parasites and develop a workflow representing a state-of-the-art approach. This approach reflectively integrates a parasite's biology, a host's immune system defences, and importantly, bioinformatics programs needed to predict vaccine candidates. To demonstrate the workflow effectiveness, every Toxoplasma gondii protein was ranked in its capacity to provide long-term protective immunity. Although testing in animal models is required to validate these predictions, most of the top ranked candidates are supported by publications, reinforcing our confidence in the approach.


Subjects
Parasites , Protozoan Vaccines , Toxoplasma , Vaccines, DNA , Animals , Mice , Proteins , Vaccines, Subunit , Protozoan Proteins/genetics , Antibodies, Protozoan , Antigens, Protozoan/genetics , Mice, Inbred BALB C
16.
Sci Data ; 10(1): 595, 2023 09 08.
Article in English | MEDLINE | ID: mdl-37684306

ABSTRACT

The increasing rates of breast cancer, particularly in emerging economies, have led to interest in scalable deep learning-based solutions that improve the accuracy and cost-effectiveness of mammographic screening. However, such tools require large volumes of high-quality training data, which can be challenging to obtain. This paper combines the experience of an AI startup with an analysis of the FAIR principles of the eight available datasets. It demonstrates that the datasets vary considerably, particularly in their interoperability, as each dataset is skewed towards a particular clinical use-case. Additionally, the mix of digital captures and scanned film compounds the problem of variability, along with differences in licensing terms, ease of access, labelling reliability, and file formats. Improving interoperability through adherence to standards such as the BIRADS criteria for labelling and annotation, and a consistent file format, could markedly improve access and use of larger amounts of standardized data. This, in turn, could be increased further by GAN-based synthetic data generation, paving the way towards better health outcomes for breast cancer.


Subjects
Data Accuracy , Mammography , Machine Learning , Motion Pictures , Reproducibility of Results , Datasets as Topic
17.
Artif Intell Med ; 144: 102642, 2023 10.
Article in English | MEDLINE | ID: mdl-37783537

ABSTRACT

Machine learning provides many powerful and effective techniques for analysing heterogeneous electronic health records (EHR). Administrative Health Records (AHR) are a subset of EHR collected for administrative purposes, and the use of machine learning on AHRs is a growing subfield of EHR analytics. Existing reviews of EHR analytics emphasise that the data modality of the EHR limits the breadth of suitable machine learning techniques and of the healthcare applications that can be pursued. Despite emphasising the importance of data modality, the literature fails to analyse which techniques and applications are relevant to AHRs. AHRs contain uniquely well-structured, categorically encoded records which are distinct from other data modalities captured by EHRs, and they can provide valuable information pertaining to how patients interact with the healthcare system. This paper systematically reviews AHR-based research, analysing 70 relevant studies spanning multiple databases. We identify and analyse which machine learning techniques are applied to AHRs and which health informatics applications are pursued in AHR-based research. We also analyse how these techniques are applied in pursuit of each application, and identify the limitations of these approaches. We find that while AHR-based studies are disconnected from each other, the use of AHRs in health informatics research is substantial and accelerating. Our synthesis of these studies highlights the utility of AHRs for pursuing increasingly complex and diverse research objectives despite a number of pervasive data- and technique-based limitations. Finally, through our findings, we propose a set of future research directions that can enhance the utility of AHR data and machine learning techniques for health informatics research.


Subjects
Machine Learning , Medical Informatics , Humans , Electronic Health Records , Factual Databases , Delivery of Health Care
18.
Sci Rep ; 12(1): 10349, 2022 06 20.
Article in English | MEDLINE | ID: mdl-35725870

ABSTRACT

The World Health Organisation reported in 2020 that six of the top 10 causes of death in low-income countries are parasites. Parasites are microorganisms in a relationship with a larger organism, the host. They acquire all benefits at the host's expense. A disease develops if the parasitic infection disrupts normal functioning of the host. This disruption can range from mild to severe, including death. Humans and livestock continue to be challenged by established and emerging infectious disease threats. Vaccination is the most efficient tool for preventing current and future threats. Immunogenic proteins sourced from the disease-causing parasite are worthwhile vaccine components (subunits) due to reliable safety and manufacturing capacity. Publications with 'subunit vaccine' in their title have accumulated into the thousands over the last three decades. However, there are possibly thousands more reporting immunogenicity results without mentioning 'subunit' and/or 'vaccine'. The exact number is unclear given the non-standardised keywords in publications. The study aim is to identify parasite proteins that induce a protective response in an animal model, as reported in the scientific literature within the last 30 years, using machine learning and natural language processing. The source code to fulfil this aim and the vaccine candidate list obtained are made available.


Subjects
Parasites , Parasitic Diseases , Vaccines , Animals , Machine Learning , Natural Language Processing
19.
NAR Genom Bioinform ; 4(1): lqab124, 2022 Mar.
Article in English | MEDLINE | ID: mdl-35047816

ABSTRACT

There is increasing evidence that changes in the variability or overall distribution of gene expression are important both in normal biology and in diseases, particularly cancer. Genes whose expression differs in variability or distribution without a difference in mean are ignored by traditional differential expression-based analyses. Using a Bayesian hierarchical model that provides tests for both differential variability and differential distribution for bulk RNA-seq data, we report here an investigation into differential variability and distribution in cancer. Analysis of eight paired tumour-normal datasets from The Cancer Genome Atlas confirms that differential variability and distribution analyses are able to identify cancer-related genes. We further demonstrate that differential variability identifies cancer-related genes that are missed by differential expression analysis, and that differential expression and differential variability identify functionally distinct sets of potentially cancer-related genes. These results suggest that differential variability analysis may provide insights into genetic aspects of cancer that would not be revealed by differential expression, and that differential distribution analysis may allow for more comprehensive identification of cancer-related genes than analyses based on changes in mean or variability alone.

20.
Sci Rep ; 12(1): 11337, 2022 07 05.
Article in English | MEDLINE | ID: mdl-35790803

ABSTRACT

The significant advancement of inexpensive and portable virtual reality (VR) and augmented reality devices has re-energised research in the immersive analytics field. The immersive environment is different from a traditional 2D display used to analyse 3D data, as it provides a unified environment that supports immersion in a 3D scene, gestural interaction, haptic feedback and spatial audio. Genomic data analysis has been used in oncology to better understand the relationship between genetic profile, cancer type, and treatment option. This paper proposes a novel immersive analytics tool for cancer patient cohorts in a virtual reality environment, called Virtual Reality to Observe Oncology data Models. We utilise immersive technologies to analyse the gene expression and clinical data of a cohort of cancer patients. Various machine learning algorithms and visualisation methods have also been deployed in VR to enhance the data interrogation process. We also integrate established 2D visual analytics and graphical methods from bioinformatics, such as scatter plots, descriptive statistical information, linear regression, box plots and heatmaps, into our visualisation. Our approach allows clinicians to interrogate information that is familiar and meaningful to them while providing them with immersive analytics capabilities to make new discoveries toward personalised medicine.


Subjects
Augmented Reality , Neoplasms , Virtual Reality , Feedback , Humans , Neoplasms/genetics , Research Design