1.
iScience ; 27(5): 109736, 2024 May 17.
Article in English | MEDLINE | ID: mdl-38711452

ABSTRACT

Discovering causal effects is at the core of scientific investigation but remains challenging when only observational data are available. In practice, causal networks are difficult to learn and interpret, and limited to relatively small datasets. We report a more reliable and scalable causal discovery method (iMIIC), based on a general mutual information supremum principle, which greatly improves the precision of inferred causal relations while distinguishing genuine causes from putative and latent causal effects. We showcase iMIIC on synthetic and real-world healthcare data from 396,179 breast cancer patients from the US Surveillance, Epidemiology, and End Results program. More than 90% of predicted causal effects appear correct, while the remaining unexpected direct and indirect causal effects can be interpreted in terms of diagnostic procedures, therapeutic timing, patient preference or socio-economic disparity. iMIIC's unique capabilities open up new avenues to discover reliable and interpretable causal networks across a range of research fields.
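The abstract above names mutual information as the quantity underlying iMIIC but does not give the method itself. As a minimal sketch of that basic quantity only (not the iMIIC algorithm or its supremum principle), the empirical mutual information between two discrete variables can be estimated from co-occurrence counts. The function name and the toy sequences are hypothetical and purely illustrative.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information, in bits, between two discrete sequences.

    Illustrative plug-in estimator only; this is NOT the iMIIC method.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n == 0:
        raise ValueError("sequences must be non-empty")

    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y

    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


# Toy data (hypothetical): a variable paired with itself, and an independent pair.
identical = mutual_information([0, 1, 0, 1], [0, 1, 0, 1])
independent = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

With these toy inputs, a balanced binary variable paired with itself carries one bit of shared information, while the independent pair carries none. Mutual information alone measures association and says nothing about causal direction.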

2.
Scientometrics ; 127(3): 1609-1642, 2022.
Article in English | MEDLINE | ID: mdl-35068619

ABSTRACT

The mapping and analysis of scientific knowledge makes it possible to identify the dynamics and/or growth of a particular field of research or to support strategic decisions related to different research entities, based on bibliometric and/or scientometric indicators. However, with the exponential growth of scientific production, a systematic and data-oriented approach to the analysis of this large set of productions becomes increasingly essential. Thus, in this work, a data-oriented methodology is proposed that combines Data Analysis, Machine Learning and Complex Network Analysis techniques with the Data Version Control (DVC) tool to extract implicit knowledge from scientific production databases. In addition, the approach was validated through a case study on a dataset of COVID-19 manuscripts comprising 199,895 articles from the arXiv, bioRxiv, medRxiv, PubMed and Scopus databases. The results suggest the feasibility of the proposed methodology, indicating the most active countries and the most explored themes in each period of the pandemic. Therefore, this study has the potential to inform and broaden strategic decisions by the scientific community, aiming at extracting knowledge that supports the fight against the COVID-19 pandemic.

3.
Cancers (Basel) ; 13(8), 2021 Apr 13.
Article in English | MEDLINE | ID: mdl-33924679

ABSTRACT

Ewing Sarcoma (ES) is a rare malignant tumor occurring most frequently in adolescents and young adults. The ES hallmark is a chromosomal translocation between chromosomes 11 and 22 that results in an aberrant transcription factor (TF) through the fusion of genes from the FET and ETS families, commonly EWSR1 and FLI1. The regulatory mechanisms behind the ES transcriptional alterations remain poorly understood. Here, we reconstruct the ES regulatory network using publicly available transcriptional data. Seven TFs were identified as potential master regulators (MRs) and clustered into two groups: one composed of PAX7 and RUNX3, and another composed of ARNT2, CREB3L1, GLI3, MEF2C, and PBX3. The MRs within each cluster act as reciprocal agonists regarding the regulation of shared genes, regulon activity, and implications in clinical outcome, while the clusters counteract each other. The regulons of all seven MRs were differentially methylated. PAX7 and RUNX3 regulon activities were associated with good prognosis, while those of ARNT2, CREB3L1, GLI3, and PBX3 were associated with bad prognosis. PAX7 and RUNX3 appear highly expressed in ES biopsies and ES cell lines. This work contributes to the understanding of the ES regulome, identifying candidate MRs, analyzing their methylome and pointing to potential prognostic factors.

4.
Data Brief ; 32: 106178, 2020 Oct.
Article in English | MEDLINE | ID: mdl-32837978

ABSTRACT

COVID-19 has been recognized as a global threat, and several studies are being conducted to contribute to the fight against and prevention of this pandemic. This work presents a scholarly production dataset focused on COVID-19, providing an overview of scientific research activities and making it possible to identify the countries, scientists and research groups most active in this task force to combat the coronavirus disease. The dataset is composed of 40,212 records of articles' metadata collected from the Scopus, PubMed, arXiv and bioRxiv databases from January 2019 to July 2020. The data were extracted using Python web scraping techniques and preprocessed with Pandas data wrangling. In addition, the pipeline to preprocess and generate the dataset is versioned with the Data Version Control tool (DVC) and is thus easily reproducible and auditable.

5.
Data Brief ; 31: 105698, 2020 Aug.
Article in English | MEDLINE | ID: mdl-32405515

ABSTRACT

Understanding the COVID-19 pandemic is a multidisciplinary effort that requires a significant number of variables. This dataset comprises (i) sociodemographic characteristics, compiled from 35 datasets obtained at UN Data; (ii) mobility metrics that can assist the analysis of social distancing, from Google Community Mobility Reports; and (iii) daily counts of cases and deaths by COVID-19, from the European Centre for Disease Prevention and Control and the Johns Hopkins University Center for Systems Science and Engineering. This unified dataset ranges from February 15, 2020 to May 7, 2020, a total of 83 days, and is provided as a collection of time series for 131 countries with 192 variables. The pipeline to preprocess and generate the dataset, along with the dataset itself, is versioned with the Data Version Control tool (DVC) and is thus easily reproducible.
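The 83-day span quoted above can be checked with a short date calculation, counting both endpoints. The sketch below also derives the size of a daily index for 131 countries; the one-row-per-country-per-day layout is an assumption made for illustration and is not stated in the abstract.

```python
from datetime import date, timedelta

start = date(2020, 2, 15)
end = date(2020, 5, 7)

# Inclusive day count: both the first and the last day are part of the series.
n_days = (end - start).days + 1

# Daily calendar covering the whole range.
days = [start + timedelta(days=i) for i in range(n_days)]

# ASSUMPTION: a long-format table with one row per (country, day).
n_countries = 131
n_rows_long_format = n_countries * n_days
```

Here `n_days` evaluates to 83, matching the abstract, because 2020 is a leap year and February contributes 15 days from the 15th through the 29th.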
