ABSTRACT
Ewing sarcoma (ES) is a rare malignant tumor occurring most frequently in adolescents and young adults. The hallmark of ES is a chromosomal translocation between chromosomes 11 and 22 that produces an aberrant transcription factor (TF) through the fusion of genes from the FET and ETS families, most commonly EWSR1 and FLI1. The regulatory mechanisms behind the transcriptional alterations in ES remain poorly understood. Here, we reconstruct the ES regulatory network using publicly available transcriptional data. Seven TFs were identified as potential master regulators (MRs) and clustered into two groups: one composed of PAX7 and RUNX3, and another composed of ARNT2, CREB3L1, GLI3, MEF2C, and PBX3. The MRs within each cluster act as reciprocal agonists with respect to the regulation of shared genes, regulon activity, and implications for clinical outcome, while the two clusters counteract each other. The regulons of all seven MRs were differentially methylated. The regulon activities of PAX7 and RUNX3 were associated with good prognosis, while those of ARNT2, CREB3L1, GLI3, and PBX3 were associated with poor prognosis. PAX7 and RUNX3 are highly expressed in ES biopsies and ES cell lines. This work contributes to the understanding of the ES regulome by identifying candidate MRs, analyzing their methylome, and pointing to potential prognostic factors.
ABSTRACT
COVID-19 has been recognized as a global threat, and several studies are being conducted to contribute to the fight against and prevention of this pandemic. This work presents a scholarly production dataset focused on COVID-19. The dataset provides an overview of scientific research activity and makes it possible to identify the countries, scientists, and research groups most active in the effort to combat the coronavirus disease. It is composed of 40,212 records of article metadata collected from the Scopus, PubMed, arXiv, and bioRxiv databases from January 2019 to July 2020. The data were extracted by web scraping in Python and preprocessed with pandas. In addition, the pipeline that preprocesses and generates the dataset is versioned with the Data Version Control (DVC) tool and is thus easily reproducible and auditable.
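The pandas preprocessing step mentioned above can be illustrated with a minimal sketch of merging article metadata from several sources and dropping duplicate records. This is not the authors' pipeline: the column names (`source`, `doi`, `title`), the helper `merge_records`, and the sample rows are all hypothetical, chosen only to show one plausible way such records could be combined.

```python
import pandas as pd


def merge_records(frames):
    """Concatenate per-source metadata frames and drop duplicate articles.

    Assumes (hypothetically) that every frame has 'source', 'doi' and
    'title' columns; the real dataset schema may differ.
    """
    combined = pd.concat(frames, ignore_index=True)

    # Normalize titles so copies that differ only in case or spacing match.
    title_key = (
        combined["title"]
        .str.lower()
        .str.replace(r"\s+", " ", regex=True)
        .str.strip()
    )

    # Use the lowercased DOI as the record identity when present;
    # otherwise fall back to the normalized title.
    combined["record_key"] = combined["doi"].str.lower().fillna(title_key)

    deduplicated = combined.drop_duplicates(subset="record_key", keep="first")
    return deduplicated.drop(columns=["record_key"]).reset_index(drop=True)


# Illustrative rows only; these are not records from the dataset.
scopus = pd.DataFrame(
    {
        "source": ["Scopus", "Scopus"],
        "doi": ["10.1000/A1", None],
        "title": ["Paper one", "Preprint  Two"],
    }
)
pubmed = pd.DataFrame(
    {
        "source": ["PubMed", "PubMed"],
        "doi": ["10.1000/a1", None],
        "title": ["Paper One", "preprint two"],
    }
)

merged = merge_records([scopus, pubmed])
print(merged)
```

Keying on the DOI first and the normalized title second is a design choice of this sketch. A record that carries a DOI in one source and none in another would not be matched, so a real pipeline would likely need a more careful reconciliation rule.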