Results 1 - 20 of 119
1.
Proc Natl Acad Sci U S A ; 121(28): e2320870121, 2024 Jul 09.
Article in English | MEDLINE | ID: mdl-38959033

ABSTRACT

Efficient storage and sharing of massive biomedical data would open up their wide accessibility to different institutions and disciplines. However, compressors tailored for natural photos/videos quickly reach their limits on biomedical data, while emerging deep learning-based methods demand huge training data and are difficult to generalize. Here, we propose to conduct Biomedical data compRession with Implicit nEural Function (BRIEF) by representing the target data with compact neural networks, which are data specific and thus have no generalization issues. Benefiting from the strong representation capability of implicit neural functions, BRIEF achieves 2 to 3 orders of magnitude compression on diverse biomedical data at significantly higher fidelity than existing techniques. BRIEF also delivers consistent performance across the whole data volume, and supports customized spatially varying fidelity. BRIEF's multifold advantageous features also serve reliable downstream tasks at low bandwidth. Our approach will facilitate low-bandwidth data sharing and promote collaboration and progress in the biomedical field.
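The compression factors quoted in this abstract are easiest to grasp with a little arithmetic. The sketch below shows how representing a raw volume by a compact network's weights yields a ratio in the claimed range; the volume and network sizes are hypothetical and not taken from the paper.

```python
def compression_ratio(n_voxels: int, bytes_per_voxel: int,
                      n_params: int, bytes_per_param: int) -> float:
    """Ratio of the raw data size to the size of the network encoding it."""
    raw_bytes = n_voxels * bytes_per_voxel
    model_bytes = n_params * bytes_per_param
    return raw_bytes / model_bytes

# Hypothetical example: a 1024^3 voxel, 16-bit volume represented by a
# 2-million-parameter network with 16-bit weights.
ratio = compression_ratio(1024 ** 3, 2, 2_000_000, 2)
print(f"{ratio:.0f}x")  # 537x, i.e. between 2 and 3 orders of magnitude
```

Spatially varying fidelity would shift this trade-off locally (more parameters spent on regions of interest), but the overall ratio is governed by the same arithmetic.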


Subjects
Information Dissemination, Neural Networks, Computer, Humans, Information Dissemination/methods, Data Compression/methods, Deep Learning, Biomedical Research/methods
2.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-38836701

ABSTRACT

Biomedical data are generated and collected from various sources, including medical imaging, laboratory tests and genome sequencing. Sharing these data for research can help address unmet health needs, contribute to scientific breakthroughs, accelerate the development of more effective treatments and inform public health policy. Due to the potential sensitivity of such data, however, privacy concerns have led to policies that restrict data sharing. In addition, sharing sensitive data requires a secure and robust infrastructure with appropriate storage solutions. Here, we examine and compare the centralized and federated data sharing models through the prism of five large-scale and real-world use cases of strategic significance within the European data sharing landscape: the French Health Data Hub, the BBMRI-ERIC Colorectal Cancer Cohort, the federated European Genome-phenome Archive, the Observational Medical Outcomes Partnership/OHDSI network and the EBRAINS Medical Informatics Platform. Our analysis indicates that centralized models facilitate data linkage, harmonization and interoperability, while federated models facilitate scaling up and legal compliance, as the data typically reside on the data generator's premises, allowing for better control of how data are shared. This comparative study thus offers guidance on the selection of the most appropriate sharing strategy for sensitive datasets and provides key insights for informed decision-making in data sharing efforts.


Subjects
Biological Science Disciplines, Information Dissemination, Humans, Medical Informatics/methods
3.
Brief Bioinform ; 25(1)2023 11 22.
Article in English | MEDLINE | ID: mdl-38113073

ABSTRACT

Researchers increasingly turn to explainable artificial intelligence (XAI) to analyze omics data and gain insights into the underlying biological processes. Yet, given the interdisciplinary nature of the field, many findings have only been shared in their respective research community. An overview of XAI for omics data is needed to highlight promising approaches and help detect common issues. Toward this end, we conducted a systematic mapping study. To identify relevant literature, we queried Scopus, PubMed, Web of Science, BioRxiv, MedRxiv and arXiv. Based on keywording, we developed a coding scheme with 10 facets regarding the studies' AI methods, explainability methods and omics data. Our mapping study resulted in 405 included papers published between 2010 and 2023. The inspected papers analyze DNA-based (mostly genomic), transcriptomic, proteomic or metabolomic data by means of neural networks, tree-based methods, statistical methods and further AI methods. The preferred post-hoc explainability methods are feature relevance (n = 166) and visual explanation (n = 52), while papers using interpretable approaches often resort to the use of transparent models (n = 83) or architecture modifications (n = 72). With many research gaps still apparent for XAI for omics data, we deduced eight research directions and discuss their potential for the field. We also provide exemplary research questions for each direction. Many problems with the adoption of XAI for omics data in clinical practice are yet to be resolved. This systematic mapping study outlines extant research on the topic and provides research directions for researchers and practitioners.


Subjects
Artificial Intelligence, Proteomics, Gene Expression Profiling, Genomics, Neural Networks, Computer
4.
BMC Bioinformatics ; 25(1): 1, 2024 Jan 02.
Article in English | MEDLINE | ID: mdl-38166530

ABSTRACT

Graph embedding techniques use deep learning algorithms in data analysis to solve problems such as node classification, link prediction, community detection, and visualization. Although typically used in the context of predicting friendships in social media, several applications of graph embedding techniques in biomedical data analysis have emerged. While these approaches remain computationally demanding, several developments over the last years facilitate their application to biomedical data and thus may help advance biological discoveries. Therefore, in this review, we discuss the principles of graph embedding techniques and explore their usefulness for understanding biological network data derived from mass spectrometry and sequencing experiments, the current workhorses of systems biology studies. In particular, we focus on recent examples for characterizing protein-protein interaction networks and predicting novel drug functions.
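As a concrete illustration of the first step in many graph embedding pipelines, the sketch below samples truncated random walks from a toy interaction graph, the corpus that DeepWalk-style methods subsequently feed into a skip-gram model. The graph and gene names are hypothetical and chosen purely for illustration.

```python
import random

def random_walks(adj, walks_per_node=2, walk_len=4, seed=42):
    """Sample truncated random walks from an adjacency-list graph.

    These walks are the 'sentences' that DeepWalk-style embedding
    methods pass to a skip-gram model to learn node vectors.
    """
    rng = random.Random(seed)
    walks = []
    for start in adj:
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_len:
                neighbours = adj[walk[-1]]
                if not neighbours:
                    break
                walk.append(rng.choice(neighbours))
            walks.append(walk)
    return walks

# Toy protein-protein interaction graph (hypothetical edges).
ppi = {"TP53": ["MDM2", "BRCA1"], "MDM2": ["TP53"],
       "BRCA1": ["TP53", "RAD51"], "RAD51": ["BRCA1"]}
corpus = random_walks(ppi)
print(len(corpus))  # 8 walks: 2 per node
```

The embedding step itself (skip-gram training) is what makes the approach computationally demanding; the walk sampling above is cheap and embarrassingly parallel.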


Subjects
Algorithms, Social Media, Humans, Mass Spectrometry, Data Analysis, Protein Interaction Maps
5.
Brief Bioinform ; 23(2)2022 03 10.
Article in English | MEDLINE | ID: mdl-35136949

ABSTRACT

In recent decades, exploring potential relationships between diseases has been an active research field. With the rapid accumulation of disease-related biomedical data, many computational methods and tools/platforms have been developed to reveal intrinsic relationships between diseases, which can provide useful insights into the study of complex diseases, e.g. understanding molecular mechanisms of diseases and discovering new treatments for diseases. Human complex diseases involve both external phenotypic abnormalities and complex internal molecular mechanisms in organisms. Computational methods with different types of biomedical data from phenotype to genotype can evaluate disease-disease associations at different levels, providing a comprehensive perspective for understanding diseases. In this review, available biomedical data and databases for evaluating disease-disease associations are first summarized. Then, existing computational methods for disease-disease associations are reviewed and classified into five groups in terms of the biomedical data they use: disease semantic-based, phenotype-based, function-based, representation learning-based and text mining-based methods. Further, we summarize software tools/platforms for the computation and analysis of disease-disease associations. Finally, we give a discussion and summary on the research of disease-disease associations. This review provides a systematic overview of current disease association research, which could promote the development and application of computational methods and tools/platforms for disease-disease associations.
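A minimal example of a function-based association score of the kind surveyed here is the Jaccard index between two disease-associated gene sets; the gene sets below are hypothetical and for illustration only.

```python
def disease_similarity(genes_a, genes_b):
    """Jaccard index of two disease-associated gene sets: a simple
    function-based disease-disease association score."""
    a, b = set(genes_a), set(genes_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical gene sets for two diseases.
t2d = {"TCF7L2", "PPARG", "KCNJ11"}
obesity = {"FTO", "PPARG", "MC4R"}
print(disease_similarity(t2d, obesity))  # 0.2 (1 shared gene of 5 total)
```

Phenotype-based and semantic-based methods follow the same pattern but score overlap between phenotype term sets or ontology annotations instead of gene sets.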


Subjects
Computational Biology, Data Mining, Computational Biology/methods, Data Mining/methods, Databases, Factual, Phenotype, Software
6.
BMC Med Res Methodol ; 24(1): 150, 2024 Jul 16.
Article in English | MEDLINE | ID: mdl-39014322

ABSTRACT

Effectiveness in health care is a specific characteristic of each intervention and outcome evaluated. Especially with regard to surgical interventions, organization, structure and processes play a key role in determining this parameter. In addition, health care services by definition operate in a context of limited resources, so rationalization of service organization becomes the primary goal for health care management. This aspect becomes even more relevant for surgical services with high volumes. Therefore, data analysis could play a significant role in supporting and optimizing the management of patients undergoing surgical procedures. To this end, this study used different classification algorithms to characterize the process of patients undergoing surgery for a femoral neck fracture. The models showed significant accuracy with values of 81%, and parameters such as anaemia and gender proved to be determining risk factors for the patient's length of stay. The predictive power of the implemented model is assessed and discussed in view of its capability to support the management and optimisation of the hospitalisation process for femoral neck fracture, and is compared with different models in order to identify the most promising algorithms. Ultimately, the support of artificial intelligence algorithms lays the basis for building more accurate decision-support tools for healthcare practitioners.


Subjects
Algorithms, Femoral Neck Fractures, Humans, Female, Male, Femoral Neck Fractures/surgery, Femoral Neck Fractures/therapy, Femoral Neck Fractures/classification, Aged, Femoral Fractures/surgery, Femoral Fractures/classification, Femoral Fractures/therapy, Length of Stay/statistics & numerical data, Artificial Intelligence, Middle Aged, Aged, 80 and over, Risk Factors
7.
J Med Internet Res ; 26: e46160, 2024 May 28.
Article in English | MEDLINE | ID: mdl-38805706

ABSTRACT

CryptoKitties, a trendy game on Ethereum, an open-source public blockchain platform with a smart contract function, brought nonfungible tokens (NFTs) into the public eye in 2017. NFTs are popular because of their nonfungible properties and their unique and irreplaceable nature in the real world. The embryonic form of NFTs can be traced back to a P2P network protocol, derived from Bitcoin in 2012, that could realize decentralized digital asset transactions. NFTs have recently gained much attention and have shown an unprecedented explosive growth trend. Herein, the concept of digital asset NFTs is introduced into the medical and health field to conduct a subversive discussion on biobank operations. By converting biomedical data into NFTs, the collection and circulation of samples can be accelerated, and the transformation of resources can be promoted. In conclusion, the biobank can achieve sustainable development through "decentralization."


Subjects
Internet, Humans, Blockchain, Biological Specimen Banks
8.
BMC Bioinformatics ; 24(1): 490, 2023 Dec 21.
Article in English | MEDLINE | ID: mdl-38129803

ABSTRACT

BACKGROUND: Clustering analysis is widely used to interpret biomedical data and uncover new knowledge and patterns. However, conventional clustering methods are not effective when dealing with sparse biomedical data. To overcome this limitation, we propose a hierarchical clustering method called polynomial weight-adjusted sparse clustering (PWSC). RESULTS: The PWSC algorithm adjusts feature weights using a polynomial function, redefines the distances between samples, and performs hierarchical clustering analysis based on these adjusted distances. Additionally, we incorporate a consensus clustering approach to determine the optimal number of classifications. This consensus approach utilizes the relative change in the cumulative distribution function to identify the best number of clusters, resulting in more stable clustering results. Leveraging the PWSC algorithm, we successfully classified a cohort of gastric cancer patients, enabling categorization of patients carrying different types of altered genes. Further evaluation using entropy showed a significant improvement (p = 2.905e-05), while the Calinski-Harabasz index (CHI) demonstrates a remarkable 100% improvement in the quality of the best classification compared to conventional algorithms. Similarly, significantly increased entropy (p = 0.0336) and comparable CHI were observed when classifying another colorectal cancer cohort with microbial abundance. The above attempts at cancer subtyping demonstrate that PWSC is highly applicable to different types of biomedical data. To facilitate its application, we have developed a user-friendly tool that implements the PWSC algorithm, which can be accessed at http://pwsc.aiyimed.com/ . CONCLUSIONS: PWSC addresses the limitations of conventional approaches when clustering sparse biomedical data. By adjusting feature weights and employing consensus clustering, we achieve improved clustering results compared to conventional methods. The PWSC algorithm provides a valuable tool for researchers in the field, enabling more accurate and stable clustering analysis. Its application can enhance our understanding of complex biological systems and contribute to advancements in various biomedical disciplines.
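To make the weight-adjustment idea concrete, here is a minimal sketch: per-feature dispersions are passed through a polynomial and normalized, and the resulting weights redefine the distances fed to hierarchical clustering. The specific polynomial and weighting scheme below are illustrative assumptions, not the authors' exact formulation.

```python
import math

def polynomial_weights(dispersions, degree=2):
    """Map raw feature dispersions through a polynomial and normalize.

    Loosely follows the weight-adjustment idea in PWSC; the exact
    polynomial used by the authors is not reproduced here.
    """
    raw = [d ** degree for d in dispersions]
    total = sum(raw)
    return [r / total for r in raw]

def weighted_distance(x, y, w):
    """Euclidean distance with per-feature weights, the redefined
    distance a hierarchical clustering step would then consume."""
    return math.sqrt(sum(wi * (xi - yi) ** 2
                         for xi, yi, wi in zip(x, y, w)))

# Sparse, low-dispersion features end up with near-zero weights.
w = polynomial_weights([0.1, 0.5, 2.0])
d = weighted_distance([1, 0, 3], [1, 4, 3], w)
```

The consensus step then repeats the clustering over resampled data and picks the cluster count where the cumulative distribution of co-clustering frequencies stabilizes.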


Subjects
Algorithms, Stomach Neoplasms, Humans, Cluster Analysis
9.
J Biomed Inform ; 137: 104272, 2023 01.
Article in English | MEDLINE | ID: mdl-36563828

ABSTRACT

BACKGROUND: Secondary use of health data is a valuable source of knowledge that boosts observational studies, leading to important discoveries in the medical and biomedical sciences. The fundamental guiding principle for performing a successful observational study is to define the research question and approach in advance of executing the study. However, in multi-centre studies, finding suitable datasets to support the study is challenging, time-consuming, and sometimes impossible without a deep understanding of each dataset. METHODS: We propose a strategy for retrieving semantically annotated biomedical datasets of interest, using an interface built by applying a methodology for transforming natural language questions into formal language queries. The advantages of creating biomedical semantic data are enhanced by using natural language interfaces to issue complex queries without manipulating a logical query language. RESULTS: Our methodology was validated using Alzheimer's disease datasets published in a European platform for sharing and reusing biomedical data. We converted data to semantic information format using biomedical ontologies in everyday use in the biomedical community and published it as a FAIR endpoint. We have considered natural language questions of three types: single-concept questions, questions with exclusion criteria, and multi-concept questions. Finally, we analysed the performance of the question-answering module we used and its limitations. The source code is publicly available at https://bioinformatics-ua.github.io/BioKBQA/. CONCLUSION: We propose a strategy for using information extracted from biomedical data and transformed into a semantic format using open biomedical ontologies. Our method uses natural language to formulate questions to be answered by this semantic data without the direct use of formal query languages.


Subjects
Natural Language Processing, Semantics, Software, Language, Databases, Factual
10.
BMC Bioinformatics ; 23(1): 245, 2022 Jun 21.
Article in English | MEDLINE | ID: mdl-35729494

ABSTRACT

BACKGROUND: Drug-target interactions (DTIs) are critical for drug repurposing and elucidation of drug mechanisms, and are manually curated by large databases, such as ChEMBL, BindingDB, DrugBank and DrugTargetCommons. However, the number of curated articles likely constitutes only a fraction of all the articles that contain experimentally determined DTIs. Finding such articles and extracting the experimental information is a challenging task, and there is a pressing need for systematic approaches to assist the curation of DTIs. To this end, we applied Bidirectional Encoder Representations from Transformers (BERT) to identify such articles. Because DTI data intimately depends on the type of assays used to generate it, we also aimed to incorporate functions to predict the assay format. RESULTS: Our novel method identified 0.6 million articles (along with drug and protein information) that were not previously included in public DTI databases. Using 10-fold cross-validation, we obtained ~ 99% accuracy for identifying articles containing quantitative drug-target profiles. The F1 micro for the prediction of assay format is 88%, which leaves room for improvement in future studies. CONCLUSION: The BERT model in this study is robust and the proposed pipeline can be used to identify previously overlooked articles containing quantitative DTIs. Overall, our method provides a significant advancement in machine-assisted DTI extraction and curation. We expect it to be a useful addition to drug mechanism discovery and repurposing.


Subjects
Drug Repositioning, Proteins, Databases, Factual, Drug Interactions, Proteins/metabolism, PubMed
11.
Brief Bioinform ; 21(3): 1047-1057, 2020 05 21.
Article in English | MEDLINE | ID: mdl-31067315

ABSTRACT

With the explosive growth of biological sequences generated in the post-genomic era, one of the most challenging problems in bioinformatics and computational biology is to computationally characterize sequences, structures and functions in an efficient, accurate and high-throughput manner. A number of online web servers and stand-alone tools have been developed to address this to date; however, all these tools have their limitations and drawbacks in terms of their effectiveness, user-friendliness and capacity. Here, we present iLearn, a comprehensive and versatile Python-based toolkit, integrating the functionality of feature extraction, clustering, normalization, selection, dimensionality reduction, predictor construction, best descriptor/model selection, ensemble learning and results visualization for DNA, RNA and protein sequences. iLearn was designed for users who only want to upload their data set and select the functions they need calculated from it, while all necessary procedures and optimal settings are completed automatically by the software. iLearn includes a variety of descriptors for DNA, RNA and proteins, and four feature output formats are supported so as to facilitate direct output usage or communication with other computational tools. In total, iLearn encompasses 16 different types of feature clustering, selection, normalization and dimensionality reduction algorithms, and five commonly used machine-learning algorithms, thereby greatly facilitating feature analysis and predictor construction. iLearn is made freely available via an online web server and a stand-alone toolkit.


Subjects
DNA/chemistry, Machine Learning, Proteins/chemistry, RNA/chemistry, Sequence Analysis/methods, Algorithms, Internet
12.
Stat Med ; 41(21): 4266-4283, 2022 09 20.
Article in English | MEDLINE | ID: mdl-35796389

ABSTRACT

In biomedical research, the outcome of longitudinal studies has been traditionally analyzed using the repeated measures analysis of variance (rm-ANOVA) or, more recently, linear mixed models (LMEMs). Although LMEMs are less restrictive than rm-ANOVA as they can work with unbalanced data and non-constant correlation between observations, both methodologies assume a linear trend in the measured response. It is common in biomedical research that the true trend response is nonlinear and in these cases the linearity assumption of rm-ANOVA and LMEMs can lead to biased estimates and unreliable inference. In contrast, generalized additive models (GAMs) relax the linearity assumption of rm-ANOVA and LMEMs and allow the data to determine the fit of the model while also permitting incomplete observations and different correlation structures. Therefore, GAMs present an excellent choice to analyze longitudinal data with non-linear trends in the context of biomedical research. This paper summarizes the limitations of rm-ANOVA and LMEMs and uses simulated data to visually show how both methods produce biased estimates when used on data with non-linear trends. We present the basic theory of GAMs and, using reported trends of oxygen saturation in tumors, we simulate example longitudinal data (2 treatment groups, 10 subjects per group, 5 repeated measures for each group) to demonstrate their implementation in R. We also show that GAMs are able to produce estimates with non-linear trends even when incomplete observations exist (with 40% of the simulated observations missing). To make this work reproducible, the code and data used in this paper are available at: https://github.com/aimundo/GAMs-biomedical-research.


Subjects
Biomedical Research, Research Design, Analysis of Variance, Humans, Linear Models, Longitudinal Studies
13.
Entropy (Basel) ; 24(11)2022 Oct 31.
Article in English | MEDLINE | ID: mdl-36359667

ABSTRACT

In the domain of computer vision, entropy, defined as a measure of irregularity, has been proposed as an effective method for analyzing the texture of images. Several studies have shown that, with specific parameter tuning, entropy-based approaches achieve high accuracy in terms of classification results for texture images, when associated with machine learning classifiers. However, few entropy measures have been extended to studying color images. Moreover, the literature is missing comparative analyses of entropy-based and modern deep learning-based classification methods for RGB color images. In order to address this matter, we first propose a new entropy-based measure for RGB images based on a multivariate approach. This multivariate approach is a bi-dimensional extension of the methods that have been successfully applied to multivariate signals (unidimensional data). Then, we compare the classification results of this new approach with those obtained from several deep learning methods. The entropy-based method for RGB image classification that we propose leads to promising results. In future studies, the measure could be extended to study other color spaces as well.

14.
BMC Med Inform Decis Mak ; 21(1): 242, 2021 08 12.
Article in English | MEDLINE | ID: mdl-34384406

ABSTRACT

BACKGROUND: Data sharing is considered a crucial part of modern medical research. Unfortunately, despite its advantages, it often faces obstacles, especially data privacy challenges. As a result, various approaches and infrastructures have been developed that aim to ensure that patients and research participants remain anonymous when data is shared. However, privacy protection typically comes at a cost, e.g. restrictions regarding the types of analyses that can be performed on shared data. What is lacking is a systematization making the trade-offs taken by different approaches transparent. The aim of the work described in this paper was to develop a systematization for the degree of privacy protection provided and the trade-offs taken by different data sharing methods. Based on this contribution, we categorized popular data sharing approaches and identified research gaps by analyzing combinations of promising properties and features that are not yet supported by existing approaches. METHODS: The systematization consists of different axes. Three axes relate to privacy protection aspects and were adopted from the popular Five Safes Framework: (1) safe data, addressing privacy at the input level, (2) safe settings, addressing privacy during shared processing, and (3) safe outputs, addressing privacy protection of analysis results. Three additional axes address the usefulness of approaches: (4) support for de-duplication, to enable the reconciliation of data belonging to the same individuals, (5) flexibility, to be able to adapt to different data analysis requirements, and (6) scalability, to maintain performance with increasing complexity of shared data or common analysis processes. 
RESULTS: Using the systematization, we identified three different categories of approaches: distributed data analyses, which exchange anonymous aggregated data, secure multi-party computation protocols, which exchange encrypted data, and data enclaves, which store pooled individual-level data in secure environments for access for analysis purposes. We identified important research gaps, including a lack of approaches enabling the de-duplication of horizontally distributed data or providing a high degree of flexibility. CONCLUSIONS: There are fundamental differences between different data sharing approaches and several gaps in their functionality that may be interesting to investigate in future work. Our systematization can make the properties of privacy-preserving data sharing infrastructures more transparent and support decision makers and regulatory authorities with a better understanding of the trade-offs taken.


Subjects
Biomedical Research, Privacy, Computer Security, Humans, Information Dissemination
15.
J Med Syst ; 45(4): 45, 2021 Feb 23.
Article in English | MEDLINE | ID: mdl-33624190

ABSTRACT

We present a protocol for integrating two types of biological data - clinical and molecular - for more effective classification of patients with cancer. The proposed approach is a hybrid between early and late data integration strategy. In this hybrid protocol, the set of informative clinical features is extended by the classification results based on molecular data sets. The results are then treated as new synthetic variables. The hybrid protocol was applied to METABRIC breast cancer samples and TCGA urothelial bladder carcinoma samples. Various data types were used for clinical endpoint prediction: clinical data, gene expression, somatic copy number aberrations, RNA-Seq, methylation, and reverse phase protein array. The performance of the hybrid data integration was evaluated with a repeated cross validation procedure and compared with other methods of data integration: early integration and late integration via super learning. The hybrid method gave similar results to those obtained by the best of the tested variants of super learning. What is more, the hybrid method allowed for further sensitivity analysis and recursive feature elimination, which led to compact predictive models for cancer clinical endpoints. For breast cancer, the final model consists of eight clinical variables and two synthetic features obtained from molecular data. For urothelial bladder carcinoma, only two clinical features and one synthetic variable were necessary to build the best predictive model. We have shown that the inclusion of the synthetic variables based on the RNA expression levels and copy number alterations can lead to improved quality of prognostic tests. Thus, it should be considered for inclusion in wider medical practice.
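A minimal sketch of the hybrid idea: a classifier is trained on molecular features, and its prediction is appended to the clinical feature vector as a synthetic variable for the final clinical model. The nearest-centroid learner and all data below are hypothetical simplifications standing in for the stronger learners used in the protocol.

```python
def centroid_predict(train_X, train_y, x):
    """Toy nearest-centroid classifier standing in for the model
    trained on molecular data."""
    centroids = {}
    for label in set(train_y):
        rows = [r for r, lab in zip(train_X, train_y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    return min(centroids, key=lambda lab: sq_dist(centroids[lab], x))

# Hypothetical molecular training data (e.g. two expression features).
mol_X = [[0.1, 0.2], [0.0, 0.3], [0.9, 0.8], [1.0, 0.7]]
mol_y = [0, 0, 1, 1]

# The clinical features are extended by the molecular model's
# prediction, treated as a new synthetic variable.
clinical = [63, 1]                      # e.g. age, tumour grade
synthetic = centroid_predict(mol_X, mol_y, [0.85, 0.75])
extended = clinical + [synthetic]       # input to the final clinical model
print(extended)  # [63, 1, 1]
```

Because the molecular data enter the final model only through this compact synthetic variable, sensitivity analysis and recursive feature elimination stay tractable, which is how the protocol arrives at its small final models.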


Subjects
Algorithms, Data Management/methods, Datasets as Topic/classification, Databases, Chemical
16.
Entropy (Basel) ; 23(10)2021 Oct 03.
Article in English | MEDLINE | ID: mdl-34682027

ABSTRACT

Two-dimensional fuzzy entropy, dispersion entropy, and their multiscale extensions (MFuzzyEn2D and MDispEn2D, respectively) have shown promising results for image classification. However, these results rely on the selection of key parameters that may largely influence the entropy values obtained. Yet, the optimal choice for these parameters has not been studied thoroughly. We propose a study on the impact of these parameters on image classification. For this purpose, the entropy-based algorithms are applied to a variety of images from different datasets, each containing multiple image classes. Several parameter combinations are used to obtain the entropy values. These entropy values are then applied to a range of machine learning classifiers and the algorithm parameters are analyzed based on the classification results. By using specific parameters, we show that both MFuzzyEn2D and MDispEn2D approach the state of the art in terms of image classification for multiple image types. They lead to an average maximum accuracy of more than 95% for all the datasets tested. Moreover, MFuzzyEn2D yields better classification performance than MDispEn2D in the majority of cases. Furthermore, the choice of classifier does not have a significant impact on the classification of the features extracted by either entropy algorithm. The results open new perspectives for these entropy-based measures in textural analysis.

17.
BMC Med Inform Decis Mak ; 20(1): 29, 2020 02 11.
Article in English | MEDLINE | ID: mdl-32046701

ABSTRACT

BACKGROUND: Modern data driven medical research promises to provide new insights into the development and course of disease and to enable novel methods of clinical decision support. To realize this, machine learning models can be trained to make predictions from clinical, paraclinical and biomolecular data. In this process, privacy protection and regulatory requirements need careful consideration, as the resulting models may leak sensitive personal information. To counter this threat, a wide range of methods for integrating machine learning with formal methods of privacy protection have been proposed. However, there is a significant lack of practical tools to create and evaluate such privacy-preserving models. In this software article, we report on our ongoing efforts to bridge this gap. RESULTS: We have extended the well-known ARX anonymization tool for biomedical data with machine learning techniques to support the creation of privacy-preserving prediction models. Our methods are particularly well suited for applications in biomedicine, as they preserve the truthfulness of data (e.g. no noise is added) and they are intuitive and relatively easy to explain to non-experts. Moreover, our implementation is highly versatile, as it supports binomial and multinomial target variables, different types of prediction models and a wide range of privacy protection techniques. All methods have been integrated into a sound framework that supports the creation, evaluation and refinement of models through intuitive graphical user interfaces. To demonstrate the broad applicability of our solution, we present three case studies in which we created and evaluated different types of privacy-preserving prediction models for breast cancer diagnosis, diagnosis of acute inflammation of the urinary system and prediction of the contraceptive method used by women. 
In this process, we also used a wide range of different privacy models (k-anonymity, differential privacy and a game-theoretic approach) as well as different data transformation techniques. CONCLUSIONS: With the tool presented in this article, accurate prediction models can be created that preserve the privacy of individuals represented in the training set in a variety of threat scenarios. Our implementation is available as open source software.


Subjects
Confidentiality, Data Anonymization, Decision Support Systems, Clinical, Models, Statistical, Software, Biomedical Research, Humans, Machine Learning, ROC Curve, Reproducibility of Results
18.
Future Gener Comput Syst ; 107: 215-228, 2020 Jun.
Article in English | MEDLINE | ID: mdl-32494091

ABSTRACT

Three-dimensional late gadolinium enhanced (LGE) cardiac MR (CMR) of left atrial scar in patients with atrial fibrillation (AF) has recently emerged as a promising technique to stratify patients, to guide ablation therapy and to predict treatment success. This requires a segmentation of the high intensity scar tissue and also a segmentation of the left atrium (LA) anatomy, the latter usually being derived from a separate bright-blood acquisition. Performing both segmentations automatically from a single 3D LGE CMR acquisition would eliminate the need for an additional acquisition and avoid subsequent registration issues. In this paper, we propose a joint segmentation method based on multiview two-task (MVTT) recursive attention model working directly on 3D LGE CMR images to segment the LA (and proximal pulmonary veins) and to delineate the scar on the same dataset. Using our MVTT recursive attention model, both the LA anatomy and scar can be segmented accurately (mean Dice score of 93% for the LA anatomy and 87% for the scar segmentations) and efficiently (~0.27 s to simultaneously segment the LA anatomy and scars directly from the 3D LGE CMR dataset with 60-68 2D slices). Compared to conventional unsupervised learning and other state-of-the-art deep learning based methods, the proposed MVTT model achieved excellent results, leading to an automatic generation of a patient-specific anatomical model combined with scar segmentation for patients in AF.
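The Dice scores reported above are computed directly from the overlap of two binary masks. A minimal sketch, using flattened toy masks rather than real CMR data:

```python
def dice(mask_a, mask_b):
    """Dice coefficient between two binary masks (flattened to lists):
    2 * |A intersect B| / (|A| + |B|)."""
    intersection = sum(a and b for a, b in zip(mask_a, mask_b))
    total = sum(mask_a) + sum(mask_b)
    return 2 * intersection / total if total else 1.0

# Toy predicted and ground-truth masks (1 = scar voxel).
pred  = [1, 1, 0, 1, 0, 0]
truth = [1, 0, 0, 1, 1, 0]
print(dice(pred, truth))  # 2*2 / (3+3) = 0.666...
```

For a 3D volume the same formula applies after flattening all slices; the reported 93% (LA) versus 87% (scar) gap is typical, since thin scar regions contribute few voxels and are penalized more heavily by small boundary errors.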

19.
Med Law Rev ; 28(1): 155-182, 2020 Feb 01.
Article in English | MEDLINE | ID: mdl-31377815

ABSTRACT

Harms arising from digital data use in the big data context are often systemic and cannot always be captured by linear cause and effect. Individual data subjects and third parties can bear the main downstream costs arising from increasingly complex forms of data use, without being able to trace the exact data flows. Because current regulatory frameworks do not adequately address this situation, we propose a move towards harm mitigation tools to complement existing legal remedies. In this article, we make a normative and practical case for why individuals should be offered support in such contexts and how harm mitigation tools can achieve this. We put forward the idea of 'Harm Mitigation Bodies' (HMBs), which people could turn to when they feel they were harmed by data use but do not qualify for legal remedies, or where existing legal remedies do not address their specific circumstances. HMBs would help to obtain a better understanding of the nature, severity, and frequency of harms arising from both lawful and unlawful data use, and they could also provide financial support in some cases. We set out the role and form of these HMBs for the first time in this article.


Subjects
Big Data/economics; Confidentiality/legislation & jurisprudence; Government Regulation; Harm Reduction; Information Dissemination/legislation & jurisprudence; Liability, Legal/economics; Causality; Humans
20.
BMC Bioinformatics ; 20(Suppl 10): 244, 2019 May 29.
Article in English | MEDLINE | ID: mdl-31138159

ABSTRACT

BACKGROUND: With the advent of deep learning, a growing number of studies in the biomedical domain have focused on feature extraction and classification tasks. In this research, we seek the best combination of feature set and hyperparameter settings for deep learning algorithms applied to relation classification. To this end, we incorporate an entity and relation extraction tool, PKDE4J, to extract biomedical features (i.e., biomedical entities and relations) for the relation classification. We compared the proposed Convolutional Neural Network (CNN) based classification model with the most widely used learning algorithms. RESULTS: Our CNN based classification model outperforms the most widely used supervised algorithms. Using pre-extracted relevant feature combinations, we achieved strong performance on binary classification, with a weighted macro-average F1-score of 94.79%; for multi-class classification, the weighted macro-average F1-score is approximately 86.95%. CONCLUSIONS: Our results suggest that coupling the raw sentence text with multiple highlighted features extracted from the biomedical sentences, rather than using the raw text as a single feature, can significantly improve classification performance. We also offer hyperparameter tuning and optimization approaches to obtain the best-performing configuration of the proposed model.
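The weighted macro-average F1-score reported above computes a per-class F1 and averages it with weights proportional to each class's support. A minimal stdlib sketch; the label names and toy predictions are illustrative assumptions, not the study's data:

```python
from collections import Counter

def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights proportional to class support,
    i.e. the 'weighted' average reported by common ML toolkits."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for c in support:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        score += (support[c] / total) * f1
    return score

# Toy binary relation labels: "rel" (a relation holds) vs "none".
y_true = ["rel", "rel", "rel", "none", "none"]
y_pred = ["rel", "rel", "none", "none", "none"]
print(round(weighted_f1(y_true, y_pred), 3))  # → 0.8
```

Weighting by support makes the average robust to class imbalance, which is common in relation classification where most sentence pairs express no relation.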


Subjects
Algorithms; Neural Networks, Computer; Databases as Topic; Models, Theoretical