ABSTRACT
A heterogeneous disease such as cancer is activated through multiple pathways and different perturbations. Depending on the activated pathway(s), patient survival varies significantly, as does the efficacy of various drugs. Cancer subtype detection from genomics-level data is therefore an important research problem. Subtype detection is often complex and in most cases needs multi-omics data fusion to achieve accurate subtyping. Different data fusion and subtyping approaches have been proposed over the years, such as kernel-based fusion, matrix factorization, and deep learning autoencoders. In this paper, we compared the performance of different deep learning autoencoders for cancer subtype detection. We performed cancer subtype detection on four different cancer types from The Cancer Genome Atlas (TCGA) datasets using four autoencoder implementations. We also predicted the optimal number of subtypes in a cancer type using the silhouette score and found that the detected subtypes exhibit significant differences in survival profiles. Furthermore, we compared the effect of feature selection and similarity measures on subtype detection. For further evaluation, we used the Glioblastoma multiforme (GBM) dataset and identified the differentially expressed genes in each of the subtypes. The results obtained are consistent with other genomic studies and can be corroborated with the involved pathways and biological functions. This shows that the results from the autoencoders, obtained through the interaction of different cancer data types, can be used for the prediction and characterization of patient subgroups and survival profiles.
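The abstract states that the optimal number of subtypes was chosen with the silhouette score. The following is a minimal sketch of that selection step, not the authors' pipeline: it assumes each patient is already represented by a low-dimensional vector (for example an autoencoder bottleneck) and that candidate cluster labelings for each k are supplied by some clustering step. The function names, the Euclidean distance, and the toy data are all illustrative assumptions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (range -1 to 1)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # singleton clusters contribute 0 by convention
        # a: mean distance to the other members of the point's own cluster
        a = sum(euclidean(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        total += (b - a) / denom if denom > 0 else 0.0
    return total / len(points)


def best_k(points, labelings):
    """Return the k with the highest silhouette score, plus all scores.

    labelings maps each candidate k to a list of cluster labels.
    """
    scores = {k: silhouette_score(points, labs) for k, labs in labelings.items()}
    return max(scores, key=scores.get), scores


# Hypothetical 2-D latent vectors for six patients (toy data, not TCGA).
latent = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
# Hypothetical labelings that a clustering step might return for k=2 and k=3.
candidates = {
    2: [0, 0, 0, 1, 1, 1],
    3: [0, 0, 1, 2, 2, 2],
}
chosen_k, all_scores = best_k(latent, candidates)
```

On this toy input the two-cluster labeling scores higher than the three-cluster one, so `chosen_k` is 2. In practice the labelings would come from clustering the learned representation for each candidate k.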
ABSTRACT
Corynebacterium pseudotuberculosis is a Gram-positive bacterium that causes caseous lymphadenitis, a disease that predominantly affects sheep, goats, cattle, buffalo, and horses, but has also been recognized in other animals. This bacterium has a severe economic impact on meat-producing countries. RNA-Seq is one of the most commonly used techniques for gene expression studies. Computational analysis of such data through reverse-engineering algorithms leads to a better understanding of the genome-wide complexity of gene interactomes, enabling the identification of the genes with the most significant functions, as inferred from the activated stress response pathways. In this study, we identified the influential or causal genes from four RNA-Seq datasets from different stress conditions (high iron, low iron, acid, osmosis, and pH) in C. pseudotuberculosis. We first built the network using a consensus-based network inference algorithm called miRsig, and then identified the causal genes in the network using the miRinfluence tool, which is based on the influence diffusion model. We found that over 50% of the genes identified as influential had some essential cellular function in the genomes. In the strains analyzed, most of the causal genes had crucial roles or participated in processes associated with the response to extracellular stresses, pathogenicity, membrane components, and essential genes. This research brings new insight into virulence and infection by C. pseudotuberculosis.
Subjects
Corynebacterium Infections/genetics , Corynebacterium pseudotuberculosis/genetics , Lymphadenitis/genetics , RNA-Seq , Animals , Buffaloes/microbiology , Cattle , Corynebacterium Infections/microbiology , Gene Expression Regulation, Bacterial/genetics , Gene Regulatory Networks/genetics , Goats/microbiology , Horses/microbiology , Lymphadenitis/microbiology , Lymphadenitis/veterinary , Sheep/microbiology
ABSTRACT
Bacteria carrying antibiotic resistance genes (ARGs) are naturally prevalent in lotic ecosystems such as rivers. Their ability to spread in anthropogenic waters could lead to the emergence of multidrug-resistant bacteria of clinical importance. For this study, three regions of the Isabela river, an important urban river in the city of Santo Domingo, were evaluated for the presence of ARGs. The Isabela river is surrounded by communities that do not have access to proper sewage systems; furthermore, water from this river is used daily for many activities, including recreation and sanitation. To assess the state of antibiotic resistance dissemination in the Isabela river, nine samples were collected from these three distinct sites in June 2019, and isolates obtained from these sites were selected based on resistance to beta-lactams. Physico-chemical and microbiological parameters were in accordance with Dominican legislation. Matrix-assisted laser desorption ionization-time of flight mass spectrometry analyses of ribosomal protein composition revealed a total of 8 different genera. The most common genera were Acinetobacter (44.6%) and Escherichia (18%). Twenty clinically important bacterial isolates were identified from urban regions of the river; these belonged to the genera Escherichia (n = 9), Acinetobacter (n = 8), Enterobacter (n = 2), and Klebsiella (n = 1). Clinically important multi-resistant isolates were not obtained from rural areas. Fifteen isolates were selected for genome sequencing and analysis. Most isolates were resistant to at least three different families of antibiotics. Among the beta-lactamase genes encountered, we found blaTEM, blaOXA, blaSHV, and blaKPC through both deep sequencing and PCR amplification. Isolates of the genera Klebsiella and Enterobacter carried a broad repertoire of antibiotic resistance genes, including resistance to carbapenems, a family of last-resort antibiotics reserved for dire infections. Some of the alleles found were KPC-3, OXA-1, OXA-72, OXA-132, CTX-M-55, CTX-M-15, and TEM-1.
ABSTRACT
This study developed PhageWeb, a computational tool with a graphical interface and a web service that identifies phage regions through homology search and gene clustering. It uses G+C content variation and predicted tRNA sites as evidence to reinforce the presence of prophages in indeterminate regions. It also performs functional characterization of the prophage regions by integrating data from biological databases. The performance of PhageWeb was compared to that of other available tools (PHASTER, Prophinder, and PhiSpy) using Sensitivity (Sn) and Positive Predictive Value (PPV) tests. More than 80 manually annotated genomes were used as a reference for the tests. In the PhageWeb analysis, the Sn index was 86.1% and the PPV was approximately 87%, while the second-best tool presented Sn and PPV values of 83.3% and 86.5%, respectively. These numbers indicate greater precision in the regions identified by PhageWeb when compared to other prediction tools submitted to the same tests. Additionally, PhageWeb was much faster than the other computational alternatives, decreasing the processing time to approximately one-ninth of the time required by the second-best software. PhageWeb is freely available at http://computationalbiology.ufpa.br/phageweb.
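The Sn and PPV figures above follow the standard definitions Sn = TP / (TP + FN) and PPV = TP / (TP + FP). The sketch below shows one way such values could be computed for predicted regions against a manual annotation. The abstract does not say how hits were counted, so the overlap rule used here (any shared base pair counts as a match) and all names and coordinates are assumptions for illustration only.

```python
def overlaps(a, b):
    """True if closed intervals a=(start, end) and b=(start, end) share a position."""
    return a[0] <= b[1] and b[0] <= a[1]


def evaluate(predicted, annotated):
    """Return (sensitivity, ppv) for predicted regions against annotated ones.

    Assumed counting rule (not taken from the source):
      TP = annotated regions hit by at least one prediction
      FN = annotated regions hit by no prediction
      FP = predictions that hit no annotated region
    """
    tp = sum(1 for ref in annotated if any(overlaps(ref, p) for p in predicted))
    fn = len(annotated) - tp
    fp = sum(1 for p in predicted if not any(overlaps(p, ref) for ref in annotated))
    sn = tp / (tp + fn) if (tp + fn) else 0.0
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    return sn, ppv


# Hypothetical coordinates on a single genome (toy data).
annotated_regions = [(100, 200), (500, 650), (900, 1000)]
predicted_regions = [(120, 210), (480, 600), (2000, 2100)]

sn, ppv = evaluate(predicted_regions, annotated_regions)
print(f"Sn = {sn:.3f}, PPV = {ppv:.3f}")
```

With these toy intervals two of the three annotated regions are recovered and one prediction is spurious, so both values are 2/3. A stricter rule, such as requiring a minimum overlap fraction, would change the counts.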