Results 1 - 6 of 6
1.
J Biomed Inform ; 134: 104187, 2022 Oct.
Article in English | MEDLINE | ID: mdl-36055637

ABSTRACT

Molecular disease subtype discovery from omics data is an important research problem in precision medicine. The biggest challenges are the skewed distribution and variability of omics measurements, which complicate the efficient identification of molecular disease subtypes defined by clinical differences such as survival. Existing approaches adopt kernels to construct patient similarity graphs from each view through pairwise matching; however, the distance functions used in these kernels cannot exploit the potentially critical information carried by extreme values and data variability, which leads to a lack of robustness. In this paper, a novel robust distance metric (ROMDEX) is proposed for constructing similarity graphs for molecular disease subtyping from omics data, addressing the challenges of data variability and extreme values. The proposed approach is validated on multiple TCGA cancer datasets, and the results are compared with multiple baseline disease subtyping methods. Evaluation is based on Kaplan-Meier survival analysis, validated with statistical tests such as the Cox proportional hazards test (Cox p-value); the null hypothesis that the cohorts have the same hazard is rejected for p-values below 0.05. The proposed approach achieved best p-values of 0.00181, 0.00171, and 0.00758 for gene expression, DNA methylation, and microRNA data respectively, indicating a significant difference in survival between the cohorts. It outperformed existing state-of-the-art disease subtyping approaches (MRGC, PINS, SNF, Consensus Clustering and iCluster+) on various individual disease views of multiple TCGA datasets.
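A minimal sketch of the evaluation protocol described above, assuming a precomputed patient-by-patient similarity matrix (the ROMDEX metric itself is not specified in this listing): patients are clustered from the similarity graph with scikit-learn's spectral clustering, and the resulting subtypes are compared by an overall Cox proportional-hazards p-value using the lifelines library. The number of subtypes, the clustering choice and the column names are illustrative assumptions, not the paper's exact pipeline.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import SpectralClustering
from lifelines import CoxPHFitter

def cox_pvalue_for_subtypes(similarity, survival_time, event_observed, n_subtypes=3):
    """Cluster patients from a precomputed similarity matrix, then test whether
    the resulting subtypes differ in survival (overall Cox p-value)."""
    labels = SpectralClustering(n_clusters=n_subtypes,
                                affinity="precomputed",
                                random_state=0).fit_predict(similarity)
    # One-hot encode the subtype labels so the Cox model compares cohorts.
    covariates = pd.get_dummies(labels, prefix="subtype", drop_first=True).astype(float)
    df = pd.concat([pd.DataFrame({"time": survival_time, "event": event_observed}),
                    covariates], axis=1)
    cph = CoxPHFitter().fit(df, duration_col="time", event_col="event")
    # Likelihood-ratio test: do the subtype covariates explain survival differences?
    return cph.log_likelihood_ratio_test().p_value
```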


Subjects
MicroRNAs, Neoplasms, Cluster Analysis, Humans, Kaplan-Meier Estimate, MicroRNAs/genetics, Neoplasms/diagnosis, Neoplasms/genetics, Precision Medicine
2.
Sensors (Basel) ; 21(23), 2021 Nov 23.
Article in English | MEDLINE | ID: mdl-34883778

ABSTRACT

Recent developments in cloud computing and the Internet of Things have enabled smart environments, in terms of both monitoring and actuation. Unfortunately, this often results in unsustainable cloud-based solutions in which, in the interest of simplicity, a wealth of raw (unprocessed) data is pushed from sensor nodes to the cloud. Herein, we advocate the use of machine learning at sensor nodes to perform essential data-cleaning operations and avoid transmitting corrupted (often unusable) data to the cloud. Starting from a public pollution dataset, we investigate how two machine learning techniques (kNN and missForest) may be embedded on a Raspberry Pi to perform data imputation without impacting the data collection process. Our experimental results demonstrate the accuracy and computational efficiency of edge-learning methods for filling in missing values in corrupted data series. We find that kNN and missForest correctly impute up to 40% of randomly distributed missing values, with a density distribution of imputed values that is indistinguishable from the benchmark. We also present a trade-off analysis for the case of bursty missing values, with recoverable blocks of up to 100 samples. Computation times are shorter than the sampling periods, allowing for timely data imputation at the edge.
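The kNN side of the experiment can be illustrated with scikit-learn's KNNImputer. This is a hedged sketch, not the authors' exact setup (missForest is typically run via the R package of that name); the synthetic columns, value ranges and the 40% missing rate are illustrative assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer

rng = np.random.default_rng(0)
# Synthetic pollution-like series; column names and scales are illustrative.
readings = pd.DataFrame({
    "pm10": rng.normal(30, 8, 1000),
    "pm25": rng.normal(18, 5, 1000),
    "no2":  rng.normal(40, 12, 1000),
})

# Knock out 40% of the cells at random to emulate corrupted transmissions.
mask = rng.random(readings.shape) < 0.40
corrupted = readings.mask(mask)

# Fill each missing cell from its 5 nearest complete neighbours.
imputer = KNNImputer(n_neighbors=5)
imputed = pd.DataFrame(imputer.fit_transform(corrupted), columns=readings.columns)

# Error measured only on the cells that were removed.
rmse = np.sqrt(np.mean((imputed.values[mask] - readings.values[mask]) ** 2))
print(f"RMSE on imputed cells: {rmse:.2f}")
```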


Subjects
Cloud Computing, Machine Learning, Benchmarking
3.
Future Gener Comput Syst ; 98: 238-251, 2019 Sep.
Article in English | MEDLINE | ID: mdl-32287562

ABSTRACT

In the biomedical domain, abbreviations appear more and more frequently in various data sets, which creates significant obstacles to biomedical big data analysis. Dictionary-based approaches have been adopted to process abbreviations, but they cannot handle ad hoc abbreviations, and no dictionary can cover all abbreviations. To overcome these drawbacks, this paper proposes an automatic abbreviation expansion method called LMAAE (Language Model-based Automatic Abbreviation Expansion). In this method, the abbreviation is first divided into blocks; then expansion candidates are generated by restoring each block; finally, the candidates are filtered and clustered with a language model and a clustering method to obtain the final expansion result. By restricting abbreviations to prefix abbreviations, the search space for expansion is reduced sharply, and it is reduced further by constraining the validity and length of the partition. Two types of experiments were designed to validate the effectiveness of the method. For standard abbreviations, the expansion results include most of the expansions in the dictionary, so the method achieves high precision. For ad hoc abbreviations, the precision of schema matching and knowledge fusion is increased by using the method to handle abbreviations. Although the recall for standard abbreviations needs to be improved, this does not diminish the method's value as a complement to the dictionary-based approach.
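A hedged sketch of the candidate-generation idea for prefix abbreviations: each block is treated as a word prefix, vocabulary words matching each block are combined, and full expansions are ranked with a simple unigram frequency score standing in for the language model. The block splitting, vocabulary and scoring below are illustrative assumptions, not the exact LMAAE procedure.

```python
from itertools import product
from math import log

VOCAB_FREQ = {            # assumed corpus word frequencies, for illustration only
    "blood": 120, "body": 90, "pressure": 80, "protein": 60,
    "bone": 40, "plasma": 55, "brain": 70, "pulse": 30,
}

def expand_prefix_abbreviation(blocks, top_k=3):
    """Return top_k expansions: one vocabulary word per block, each word
    starting with its block, ranked by summed log unigram frequency."""
    per_block = [[w for w in VOCAB_FREQ if w.startswith(b.lower())] for b in blocks]
    candidates = []
    for words in product(*per_block):
        score = sum(log(VOCAB_FREQ[w]) for w in words)
        candidates.append((score, " ".join(words)))
    return [phrase for _, phrase in sorted(candidates, reverse=True)[:top_k]]

# Example: "BP" split into blocks ["b", "p"] -> "blood pressure", "brain pressure", ...
print(expand_prefix_abbreviation(["b", "p"]))
```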

4.
Int J Med Inform ; 82(9): 882-94, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23763909

ABSTRACT

INTRODUCTION: With the increasingly digital nature of biomedical data and the growing complexity of analyses in medical research, accurate information capture, traceability and accessibility have become crucial to medical researchers in pursuing their research goals. Grid- or Cloud-based technologies, often based on so-called Service Oriented Architectures (SOA), are increasingly seen as viable solutions for managing distributed data and algorithms in the biomedical domain. For neuroscientific analyses, especially those centred on complex image analysis, traceability of processes and datasets is essential, but until now it has not been captured in a manner that facilitates collaborative study. PURPOSE AND METHOD: Few examples exist of deployed Grid-based medical systems that provide the traceability of research data needed to facilitate complex analyses, and none have been evaluated in practice. Over the past decade, we have worked with mammographers, paediatricians and neuroscientists across three generations of projects to provide the data management and provenance services now required for 21st-century medical research. This paper outlines the findings of a requirements study and the resulting system architecture for services to support neuroscientific studies of biomarkers for Alzheimer's disease. RESULTS: The paper proposes a software infrastructure and services that provide the foundation for such support. It introduces the use of the CRISTAL software to provide provenance management as one of a number of services delivered on a SOA, deployed to manage neuroimaging projects studying biomarkers for Alzheimer's disease. CONCLUSIONS: In the neuGRID and N4U projects a Provenance Service has been delivered that captures and reconstructs the workflow information needed to help researchers conduct neuroimaging analyses. The software enables neuroscientists to track the evolution of workflows and datasets, tracks the outcomes of various analyses, and provides provenance traceability throughout the lifecycle of their studies. Because the Provenance Service has been designed to be generic, it can be applied across the medical domain as a reusable tool, providing communities of researchers for the first time with the tools needed to conduct widely distributed, collaborative programmes of medical analysis.
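As an illustration of the kind of information such a Provenance Service must capture, the sketch below defines a minimal provenance record for one workflow step (inputs, outputs, parameters, timestamp) and derives lineage edges from it. The structure and names are assumptions made for illustration, not the CRISTAL or neuGRID/N4U data model.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class ProvenanceRecord:
    workflow_id: str
    step_name: str
    inputs: list[str]            # dataset identifiers consumed by the step
    outputs: list[str]           # dataset identifiers produced by the step
    parameters: dict[str, str]   # algorithm settings used for this execution
    executed_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def lineage_edges(self):
        """Yield (input, output) pairs so a full lineage graph can be rebuilt."""
        for src in self.inputs:
            for dst in self.outputs:
                yield (src, dst)

# Hypothetical step from a neuroimaging pipeline; identifiers are placeholders.
record = ProvenanceRecord(
    workflow_id="alzheimer-biomarkers-001",
    step_name="cortical-thickness-extraction",
    inputs=["t1-mri/subject-042"],
    outputs=["features/subject-042/thickness.csv"],
    parameters={"smoothing_fwhm": "10mm"},
)
print(list(record.lineage_edges()))
```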


Subjects
Brain Mapping/methods, Computer Systems/statistics & numerical data, Medical Informatics Computing, Neuroimaging, Software, Algorithms, Humans, Workflow
5.
Stud Health Technol Inform ; 159: 88-99, 2010.
Article in English | MEDLINE | ID: mdl-20543429

ABSTRACT

We outline the approach being developed in the neuGRID project to use provenance management techniques to capture and preserve the provenance data that emerges during the specification and execution of workflows in biomedical analyses. In the neuGRID project a provenance service has been designed and implemented that is intended to capture, store, retrieve and reconstruct the workflow information needed to facilitate users in conducting their analyses. We describe the architecture of the neuGRID provenance service, discuss how the CRISTAL system from CERN is being adapted to address the requirements of the project, and then consider how a generalised approach to provenance management could emerge for more generic application to the (Health)Grid community.
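A small sketch of the reconstruction step described above, assuming provenance is stored as (step, inputs, outputs) events: the producer of each dataset is indexed, and a downstream-step graph is rebuilt from it. The event schema is an illustrative assumption, not the neuGRID or CRISTAL storage model.

```python
from collections import defaultdict

def reconstruct_workflow(events):
    """Build a step -> downstream-steps adjacency map from (step, inputs, outputs) events."""
    producer_of = {}                       # dataset id -> step that produced it
    for step, _inputs, outputs in events:
        for out in outputs:
            producer_of[out] = step
    downstream = defaultdict(set)
    for step, inputs, _outputs in events:
        for item in inputs:
            if item in producer_of:
                downstream[producer_of[item]].add(step)
    return downstream

# Hypothetical events from a three-step imaging workflow.
events = [
    ("normalise", ["raw-scan"], ["norm-scan"]),
    ("segment", ["norm-scan"], ["gm-mask"]),
    ("measure", ["gm-mask"], ["thickness.csv"]),
]
print(dict(reconstruct_workflow(events)))  # {'normalise': {'segment'}, 'segment': {'measure'}}
```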


Subjects
Biomedical Research, Computer Communication Networks/organization & administration
6.
Stud Health Technol Inform ; 147: 283-8, 2009.
Article in English | MEDLINE | ID: mdl-19593068

ABSTRACT

By abstracting Grid-middleware-specific considerations away from clinical research applications, reusable services should be developed that provide generic functionality aimed specifically at medical applications. In the scope of the neuGRID project, generic services are being designed and developed to satisfy the requirements of neuroscientists. These services will bring sources of data and computing elements together into a single view as far as applications are concerned, making it possible to cope with centralised, distributed or hybrid data and to provide native support for common medical file formats. The services will include querying, provenance, portal, anonymization and pipeline services, together with a 'glueing' service for connection to Grid services. The lower-level services thus hide the peculiarities of any specific Grid technology from the upper layers, provide application independence, and enable the selection of 'fit-for-purpose' infrastructures. This paper outlines the design strategy being followed in neuGRID, using the glueing and pipeline services as examples.
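A brief sketch of the 'glueing' idea, assuming a middleware-neutral interface that upper-level services call while concrete backends hide the Grid or Cloud specifics. The class and method names are illustrative assumptions, not the neuGRID API.

```python
from abc import ABC, abstractmethod

class GlueingService(ABC):
    """Middleware-neutral facade: upper-level services call only these methods."""

    @abstractmethod
    def submit_job(self, pipeline_step: str, inputs: list[str]) -> str:
        """Submit one pipeline step for execution; return a job identifier."""

    @abstractmethod
    def fetch_dataset(self, dataset_id: str) -> bytes:
        """Retrieve a dataset regardless of where it is physically stored."""

class LocalClusterGlue(GlueingService):
    """One possible backend; a Grid- or Cloud-backed class would implement the
    same interface, so applications need not change when the infrastructure does."""

    def submit_job(self, pipeline_step: str, inputs: list[str]) -> str:
        return f"local-job:{pipeline_step}:{len(inputs)}-inputs"

    def fetch_dataset(self, dataset_id: str) -> bytes:
        return f"contents-of-{dataset_id}".encode()
```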


Subjects
Computer Systems, Medical Informatics Computing, Software