ABSTRACT
Multiple test corrections are a fundamental step in the analysis of differentially expressed genes, as the number of tests performed would otherwise inflate the false discovery rate (FDR). Recent methods for P-value correction involve a regression model in order to include covariates that are informative of the power of the test. Here, we present Progressive proportions plot (Prog-Plot), a visual tool to identify the functional relationship between the covariate and the proportion of P-values consistent with the null hypothesis. The relationship between the proportion of P-values and the covariate to be included is needed, but there are no available tools to verify it. The approach presented here aims at having an objective way to specify regression models instead of relying on prior knowledge.
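For orientation, the baseline that covariate-aware corrections build on can be sketched in a few lines. The example below is a minimal sketch of the standard Benjamini-Hochberg adjustment, not of Prog-Plot or of any regression-based method described in the abstract; the function name `benjamini_hochberg` and the example P-values are illustrative assumptions.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order.

    Illustrative baseline only: it ignores covariates, unlike the
    regression-based corrections discussed in the abstract.
    """
    m = len(pvalues)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical P-values from five tests.
pvals = [0.001, 0.008, 0.039, 0.041, 0.6]
adjusted = benjamini_hochberg(pvals)
```

Tests whose adjusted value falls below the chosen FDR level (for example 0.05) would be called significant; covariate-informed methods change how the null proportion is estimated, which is the relationship the tool in the abstract is meant to visualize.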
ABSTRACT
The transcriptomic analysis of microarray and RNA-Seq datasets followed our own bioinformatic pipeline to identify a transcriptional regulatory network of lung cancer. Twenty-six transcription factors are dysregulated and co-expressed in most of the lung cancer and pulmonary arterial hypertension datasets, which makes them the most frequently dysregulated transcription factors. Co-expression, gene regulatory, coregulatory, and transcriptional regulatory networks, along with fibration symmetries, were constructed to identify common connection patterns, alignments, main regulators, and target genes in order to analyze transcription factor complex formation, as well as its synchronized co-expression patterns in every type of lung cancer. The regulatory function of the most frequently dysregulated transcription factors over lung cancer deregulated genes was validated with ChEA3 enrichment analysis. A Kaplan-Meier plotter analysis linked the dysregulation of the top transcription factors with lung cancer patients' survival. Our results indicate that lung cancer has unique and common deregulated genes and transcription factors with pulmonary arterial hypertension, co-expressed and regulated in a coordinated and cooperative manner by the transcriptional regulatory network that might be associated with critical biological processes and signaling pathways related to the acquisition of the hallmarks of cancer, making them potentially relevant tumor biomarkers for lung cancer early diagnosis and targets for the development of personalized therapies against lung cancer.
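The survival step mentioned above rests on the Kaplan-Meier product-limit estimator. The following is a minimal sketch of that estimator under stated assumptions: the function name `kaplan_meier`, the follow-up times, and the event flags are all hypothetical, and the code does not reproduce the Kaplan-Meier plotter web tool used in the study.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with at least one event.

    times  : follow-up time per patient
    events : 1 if the event (death) was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Product-limit update: multiply by the fraction surviving at t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set after t.
        at_risk -= removed
    return curve


# Hypothetical follow-up data for five patients (not from the study).
curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
```

In practice, two such curves (for example, high versus low expression of a transcription factor) are compared with a log-rank test, which this sketch does not include.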
ABSTRACT
BACKGROUND: Lung cancer is the leading cause of cancer death worldwide. It has been reported that genetic and epigenetic factors play a crucial role in the onset and evolution of lung cancer. Previous reports have shown that essential transcription factors in embryonic development contribute to this pathology. Runt-related transcription factor (RUNX) proteins belong to a family of master regulators of embryonic developmental programs. Specifically, RUNX2 is the master transcription factor (TF) of osteoblastic differentiation, and it can be involved in pathological conditions such as prostate, thyroid, and lung cancer by regulating apoptosis and mesenchymal-epithelial transition processes. In this paper, we identified TALAM1 (Metastasis Associated Lung Adenocarcinoma Transcript 1) as a genetic target of the RUNX2 TF in lung cancer and then performed functional validation of the main findings. METHODS: We performed ChIP-seq analysis of tumor samples from a patient diagnosed with lung adenocarcinoma to evaluate the target genes of the RUNX2 TF. In addition, we performed shRNA-mediated knockdown of RUNX2 in this lung adenocarcinoma cell line to confirm the regulatory role of RUNX2 in TALAM1 expression. RESULTS: We observed RUNX2 overexpression in cell lines and primary cultured lung cancer cells. Interestingly, we found that lncRNA TALAM1 was a target of RUNX2 and that RUNX2 exerted a negative regulatory effect on TALAM1 transcription.
ABSTRACT
Microbial communities live on macroalgal surfaces. The identity and abundance of the bacteria making up these epiphytic communities depend on the macroalgal host and the environmental conditions. Macroalgae rely on epiphytic bacteria for basic functions (spore settlement, morphogenesis, growth, and protection against pathogens). However, these marine bacterial-macroalgal associations are still poorly understood for macroalgae inhabiting the Colombian Caribbean. This study aimed at characterizing the epiphytic bacterial community from macroalgae of the species Ulva lactuca growing in La Punta de la Loma (Santa Marta, Colombia). We conducted a 16S rRNA gene sequencing-based study of these microbial communities sampled twice a year between 2014 and 2016. Within these communities, the Proteobacteria, Bacteroidetes, Cyanobacteria, Deinococcus-Thermus and Actinobacteria were the most abundant phyla. At low taxonomic levels, we found high variability among epiphytic bacteria from U. lactuca and bacterial communities associated with macroalgae from Germany and Australia. We observed differences in the bacterial community composition across years driven by abundance shifts of Rhodobacteraceae, Hyphomonadaceae, and Flavobacteriaceae, probably caused by an increase of seawater temperature. Our results support the need for functional studies of the microbiota associated with U. lactuca, a common macroalga in the Colombian Caribbean Sea.
Subject(s)
Seaweed , Ulva , Bacteria/genetics , Caribbean Region , Colombia , 16S Ribosomal RNA/genetics , Seawater
ABSTRACT
Noncoding RNAs (ncRNAs) play prominent roles in the regulation of gene expression via their interactions with other biological molecules such as proteins and nucleic acids. Although much of our knowledge about how these ncRNAs operate in different biological processes has been obtained from experimental findings, computational biology can also substantially boost this knowledge by suggesting possible novel interactions of these ncRNAs with other molecules. Computational predictions are thus used as an alternative source of new insights through a process of mutual enrichment, because the information obtained through experiments continuously feeds into computational methods. The results of these predictions in turn shed light on possible interactions that are subsequently validated experimentally. This review describes the latest advances in databases, bioinformatic tools, and new in silico strategies that allow the establishment or prediction of biological interactions of ncRNAs, particularly miRNAs and lncRNAs. This work places special emphasis on ncRNA species found in humans, but information on ncRNAs of other species is also included.
Subject(s)
Computational Biology/methods , Untranslated RNA/genetics , Untranslated RNA/metabolism , Animals , Genetic Databases , Gene Expression , Gene Regulatory Networks , High-Throughput Nucleotide Sequencing/methods , Humans , MicroRNAs/analysis , MicroRNAs/genetics , MicroRNAs/metabolism , Long Noncoding RNA/analysis , Long Noncoding RNA/genetics , Long Noncoding RNA/metabolism , RNA Sequence Analysis/methods
ABSTRACT
Co-expression networks may provide insights into the patterns of molecular interactions that underlie cellular processes. To obtain a better understanding of miRNA expression patterns in gastric adenocarcinoma and to provide markers that can be associated with histopathological findings, we performed weighted gene correlation network analysis (WGCNA) and compared it with a supervised analysis. Integrative analysis of target predictions and miRNA expression profiles in gastric cancer samples was also performed. WGCNA identified a module of co-expressed miRNAs that were associated with histological traits and tumor condition. Hub genes were identified based on statistical analysis and network centrality. The miRNAs 100, let-7c, 125b and 99a stood out for their association with the diffuse histological subtype. The 181 miRNA family and miRNA 21 stood out for their association with the tumoral phenotype. The integrated analysis of miRNA and gene expression profiles showed the let-7 miRNA family playing a central role in the regulatory relationships.
Subject(s)
Adenocarcinoma/genetics , Gene Expression Profiling/methods , Gene Regulatory Networks , MicroRNAs/genetics , Stomach Neoplasms/genetics , Adenocarcinoma/pathology , Tumor Biomarkers/genetics , Neoplastic Gene Expression Regulation , Humans , Stomach Neoplasms/pathology , Supervised Machine Learning
ABSTRACT
BACKGROUND: The clinical course of chronic lymphocytic leukemia (CLL) is highly variable; some patients follow an indolent course, but others progress to a more advanced stage. The mutational status of rearranged immunoglobulin heavy chain variable (IGVH) genes in CLL is a feature that is widely recognized for dividing patients into groups that are related to their prognoses. However, the regulatory programs associated with the IGVH statuses are poorly understood, and markers that can precisely predict survival outcomes have yet to be identified. METHODS: In this study, (i) we reconstructed gene regulatory networks in CLL by applying an information-theoretic approach to the expression profiles of 5 cohorts. (ii) We applied master regulator analysis (MRA) to these networks to identify transcription factors (TFs) that regulate an IGVH mutational status signature. The IGVH mutational status signature was developed by searching for differentially expressed genes between the IGVH mutational statuses in numerous CLL cohorts. (iii) To evaluate the biological implication of the inferred regulators, prognostic values were determined using time to treatment (TTT) and overall survival (OS) in two different cohorts. RESULTS: A robust IGVH expression signature was obtained, and various TFs emerged as regulators of the signature in most of the reconstructed networks. The TF targets expression profiles exhibited significant differences with respect to survival, which allowed the definition of a reduced profile with a high value for OS. TCF7 and its targets stood out for their roles in progression. CONCLUSION: TFs and their targets, which were obtained merely from inferred regulatory associations, have prognostic implications and reflect a regulatory context for prognosis.
Subject(s)
Tumor Biomarkers , Leukemic Gene Expression Regulation , Gene Regulatory Networks , B-Cell Chronic Lymphocytic Leukemia/genetics , B-Cell Chronic Lymphocytic Leukemia/mortality , Computational Biology/methods , Nucleic Acid Databases , Female , Gene Expression Profiling , Humans , Immunoglobulin Heavy Chains/genetics , Immunoglobulin Variable Region/genetics , B-Cell Chronic Lymphocytic Leukemia/metabolism , Male , Meta-Analysis as Topic , Mutation , Prognosis , Transcription Factors/genetics , Transcription Factors/metabolism
ABSTRACT
Relationships between genes are best represented using networks constructed from information of different types, with metabolic information being the most valuable and widely used for genetic network reconstruction. Other types of information are usually also available, and it would be desirable to systematically include them in algorithms for network reconstruction. Here, we present an algorithm to construct a global metabolic network that uses all available enzymatic and metabolic information about the organism. We construct a global enzymatic network (GEN) with a total of 4226 nodes (EC numbers) and 42,723 edges representing all known metabolic reactions. As an example, we use microarray data for Arabidopsis thaliana and combine it with the metabolic network, constructing a final gene interaction network for this organism with 8212 nodes (genes) and 4,606,901 edges. All scripts are available to be used for any organism for which genomic data is available.
ABSTRACT
The main objective of the present study was to reanalyse tomato expression data that was previously submitted to the Tomato Expression Database to dissect the resistance/defence genomic and metabolic responses of tomato to Phytophthora infestans under field conditions. Overrepresented gene sets belonging to chromosome 10 were identified using Gene Set Enrichment Analysis, and we found that these genes tend to be located towards the end of chromosome 10. An analysis of syntenic regions between Arabidopsis thaliana chromosomes and tomato chromosome 10 allowed us to identify conserved regions in the two genomes. In addition to allowing for the identification of tomato candidate genes participating in resistance/defence in the field, this approach allowed us to investigate the relationships of the candidate genes with chromosomal position and participation in metabolic functions, thus offering more insight into the phenomena occurring during the infection process.
Subject(s)
Plant Chromosomes/genetics , Disease Resistance/genetics , Plant Genes , Phytophthora infestans , Solanum lycopersicum/genetics , Arabidopsis/genetics , Conserved Sequence , Genetic Databases , Solanum lycopersicum/metabolism , Solanum lycopersicum/microbiology , Multigene Family , Synteny
ABSTRACT
Introduction: Although B-cell acute lymphoblastic leukemia (B-cell ALL) survival rates have improved in recent years, Hispanic children continue to have poorer survival rates. There are few tools available to identify at the time of diagnosis whether the patient will respond to induction therapy. Our goal was to identify predictive biomarkers of treatment response, which could also serve as prognostic biomarkers of death, by identifying methylated and differentially expressed genes between patients with positive minimal residual disease (MRD+) and negative minimal residual disease (MRD-). Methods: DNA and RNA were extracted from tumor blasts separated by immunomagnetic columns. Illumina MethylationEPIC and mRNA sequencing assays were performed on 13 bone marrows from Hispanic children with B-cell ALL. Partek Flow was used for transcript mapping and quantification, followed by differential expression analysis using DESeq2. DNA methylation analyses were performed with Partek Genomic Suite and Genome Studio. Gene expression and differential methylation were compared between patients with MRD-/- and MRD+/+ at the end of induction chemotherapy. Overexpressed and hypomethylated genes were selected and validated by RT-qPCR in samples of an independent validation cohort. The predictive ability of the genes was assessed by logistic regression. Survival and Cox regression analyses were performed to determine the association of genes with death. Results: DAPK1, BOC, CNKSR3, MIR4435-2HG, CTHRC1, NPDC1, SLC45A3, ITGA6, and ASCL2 were overexpressed and hypomethylated in MRD+/+ patients. Overexpression was also validated by RT-qPCR. DAPK1, BOC, ASCL2, and CNKSR3 can predict refractoriness, but MIR4435-2HG is the best predictor. Additionally, higher expression of MIR4435-2HG increases the probability of non-response, death, and the risk of death. Finally, MIR4435-2HG overexpression, together with MRD+, is associated with poorer survival, and together with overexpression of DAPK1 and ASCL2, it could improve the risk classification of patients with normal karyotype. Conclusion: MIR4435-2HG is a potential predictive biomarker of treatment response and death in children with B-cell ALL.
ABSTRACT
Publicly available genomic data are a great source of biological knowledge that can be extracted when appropriate data analysis is used. Predicting the biological function of genes is of interest to understand molecular mechanisms of virulence and resistance in pathogens and hosts and is important for drug discovery and disease control. This is commonly done by searching for similar gene expression behavior. Here, we used publicly available Streptococcus pyogenes microarray data obtained during primate infection to identify genes that have a potential influence on virulence, and Phytophthora infestans-inoculated tomato microarray data to identify genes potentially implicated in resistance processes. This approach goes beyond co-expression analysis. We employed a quasi-likelihood model separated by primate gender/inoculation condition to model median gene expression of known virulence/resistance factors. Based on this model, an influence analysis considering time course measurement was performed to detect genes with atypical expression. This procedure allowed for the detection of genes potentially implicated in the infection process. Finally, we discuss the biological meaning of these results, showing that influence analysis is an efficient and useful alternative for functional gene prediction.
Subject(s)
Gene Expression Profiling , Solanum lycopersicum/genetics , Streptococcal Infections/genetics , Streptococcus pyogenes/pathogenicity , Algorithms , Animals , Computational Biology/methods , Female , Genomics , Likelihood Functions , Solanum lycopersicum/immunology , Solanum lycopersicum/microbiology , Male , Plant Diseases/genetics , Plant Diseases/immunology , Plant Diseases/microbiology , Primates , Streptococcal Infections/immunology , Streptococcal Infections/microbiology , Streptococcus pyogenes/genetics , Streptococcus pyogenes/immunology , Virulence Factors/genetics
ABSTRACT
Interesting biological information, such as gene expression data (microarrays), can be extracted from publicly available genomic data. As a starting point to narrow down the many possible wet lab experiments, global high-throughput data and available knowledge should be used to infer biological knowledge and generate biological hypotheses. Here, based on microarray data, we propose the use of cluster and classification methods that have become very popular and are implemented in freely available software in order to predict the participation in virulence mechanisms of different proteins coded by genes of the pathogen Streptococcus pyogenes. Confidence of predictions is based on classification errors of known genes and repetitive prediction by more than three methods. Special emphasis is placed on the nonlinear kernel classification methods used. We propose a list of interesting candidates that could be virulence factors or that participate in the virulence process of S. pyogenes. Biological validations should start using this list of candidates as they show similar behavior to known virulence factors.
Subject(s)
Computational Biology/methods , Streptococcus pyogenes/genetics , Streptococcus pyogenes/pathogenicity , Transcriptome , Virulence Factors/genetics , Bacterial Proteins/genetics , Cluster Analysis , Microarray Analysis , Streptococcus pyogenes/classification
ABSTRACT
Exosomes carry molecules of great biological and clinical interest, such as miRNAs. The contents of exosomes vary between healthy controls and cancer patients. Therefore, miRNAs and other molecules transported in exosomes are considered a potential source of diagnostic and prognostic biomarkers in cancer. Many miRNAs have been detected in recent years. Consequently, a substantial amount of miRNA-related data comparing patients and healthy individuals is available, which contributes to a better understanding of the initiation, development, malignancy, and metastasis of cancer using non-invasive sampling procedures. However, a re-analysis of available ncRNA data is rare. This study used available data about miRNAs in exosomes comparing healthy individuals and cancer patients to identify possible global changes related to the presence of cancer. A robust transcriptomic analysis identified two common miRNAs (miR-495-3p and miR-543) deregulated in five cancer datasets. They had already been implicated in different cancers but not reported in exosomes circulating in blood. The study also examined their target genes and the implications of these genes for functional processes.
ABSTRACT
Potato (Solanum tuberosum L.) is the third largest source of antioxidants in the human diet, after maize and tomato. Potato landraces have particularly diverse contents of antioxidant compounds such as anthocyanins. We used this diversity to study the evolutionary and genetic basis of anthocyanin pigmentation. Specifically, we analyzed the transcriptomes and anthocyanin content of tubers from 37 landraces with different colorations. We conducted analyses of differential expression between potatoes with different colorations and used weighted correlation network analysis to identify genes whose expression is correlated with anthocyanin content across landraces. A very significant fraction of the genes identified in these two analyses had annotations related to the flavonoid-anthocyanin biosynthetic pathway, including 18 enzymes and 5 transcription factors. Importantly, the causal genes at the D, P and R loci governing anthocyanin accumulation in potato cultivars also showed correlations with anthocyanin production in the landraces studied here. Furthermore, we found that 60% of the genes identified in our study were located within anthocyanin QTLs. Finally, we identified new candidate enzymes and transcription factors that could have driven the diversification of anthocyanins. Our results indicate that many anthocyanin biosynthetic genes were manipulated in ancestral potato breeding and can be used in future breeding programs.
Subject(s)
Solanum tuberosum , Solanum , Anthocyanins/metabolism , Antioxidants/metabolism , Flavonoids/metabolism , Plant Gene Expression Regulation , Humans , Plant Breeding , RNA-Seq , Solanum/genetics , Solanum tuberosum/genetics , Solanum tuberosum/metabolism , Transcription Factors/metabolism
ABSTRACT
The bioinformatic pipeline previously developed in our research laboratory is used to identify potential general and specific deregulated tumor genes and transcription factors related to the establishment and progression of tumoral diseases, now comparing lung cancer with two other types of cancer. Twenty microarray datasets were selected and analyzed separately to identify hub differentially expressed genes and compared to identify all the deregulated genes and transcription factors in common between the three types of cancer and those unique to lung cancer. The winning DEGs analysis allowed the identification of an important number of TFs deregulated in the majority of microarray datasets, which can become key biomarkers of general tumors and specific to lung cancer. A coexpression network was constructed for every dataset with all deregulated genes associated with lung cancer, according to DAVID's tool enrichment analysis, and transcription factors capable of regulating them, according to oPOSSUM's tool. Several genes and transcription factors are coexpressed in the networks, suggesting that they could be related to the establishment or progression of the tumoral pathology in any tissue and specifically in the lung. The comparison of the coexpression networks of lung cancer and other types of cancer allowed the identification of common connectivity patterns with deregulated genes and transcription factors correlated to important tumoral processes and signaling pathways that have not been studied yet to experimentally validate their role in lung cancer. The Kaplan-Meier estimator determined the association of thirteen deregulated top winning transcription factors with the survival of lung cancer patients. The coregulatory analysis identified two top winning transcription factor networks related to the regulatory control of gene expression in lung and breast cancer. Our transcriptomic analysis suggests that cancer has an important coregulatory network of transcription factors related to the acquisition of the hallmarks of cancer. Moreover, lung cancer has a group of genes and transcription factors unique to pulmonary tissue that are coexpressed during tumorigenesis and must be studied experimentally to fully understand their role in the pathogenesis within its very complex transcriptomic scenario. Therefore, the downstream bioinformatic analysis developed was able to identify a coregulatory metafirm of cancer in general and specific to lung cancer, taking into account the great heterogeneity of the tumoral process at cellular and population levels.
ABSTRACT
The use of a new bioinformatics pipeline allowed the identification of deregulated transcription factors (TFs) coexpressed in lung cancer that could become biomarkers of tumor establishment and progression. A gene regulatory network (GRN) of lung cancer was created with the normalized gene expression levels of differentially expressed genes (DEGs) from the microarray dataset GSE19804. Moreover, coregulatory and transcriptional regulatory network (TRN) analyses were performed for the main regulators identified in the GRN analysis. The gene targets and binding motifs of all potentially implicated regulators were identified in the TRN and with multiple alignments of the TFs' target gene sequences. Six transcription factors (E2F3, FHL2, ETS1, KAT6B, TWIST1, and RUNX2) were identified in the GRN as essential regulators of gene expression in non-small-cell lung cancer (NSCLC) and related to the lung tumoral process. Our findings indicate that RUNX2 could be an important regulator of the lung cancer GRN through the formation of coregulatory complexes with other TFs related to the establishment and progression of lung cancer. Therefore, RUNX2 could become an essential biomarker for developing diagnostic tools and specific treatments against tumoral diseases in the lung after the experimental validation of its regulatory function.
ABSTRACT
AIM: This study aims to investigate similarities and differences between prostate cancer (PCa) patients of African ancestry (AA) and European ancestry (EA) using lncRNA and mRNA coexpression network analysis. METHODS: We performed weighted gene coexpression network analysis of expression data from 49 AA and 49 EA patients to identify coexpressed lncRNAs and mRNAs. RESULTS: 27 lncRNAs and 36 mRNAs were highly expressed in patients of AA. Two mRNAs and their antisense lncRNAs were expressed. Additionally, seven mRNAs were differentially expressed (DE) or coexpressed and had an impact on survival. CONCLUSION: We present a list of lncRNAs and mRNAs that were DE and coexpressed when comparing patients of AA and EA, and these data are a resource for future studies to understand the role of lncRNAs.
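The core idea behind weighted coexpression network analysis can be illustrated compactly: pairwise correlations are raised to a soft-thresholding power to give edge weights, and highly connected nodes are candidate hubs. The code below is a minimal sketch of that idea for an unsigned network, not the WGCNA R package or the authors' pipeline; the transcript names, expression values, and the power `beta=6` are assumptions chosen for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def soft_threshold_adjacency(profiles, beta=6):
    """Unsigned weighted adjacency: |correlation| ** beta for each pair."""
    names = list(profiles)
    adjacency = {a: {} for a in names}
    for a in names:
        for b in names:
            if a != b:
                r = pearson(profiles[a], profiles[b])
                adjacency[a][b] = abs(r) ** beta
    return adjacency


def connectivity(adjacency):
    """Sum of edge weights per node; high values suggest hub transcripts."""
    return {node: sum(nbrs.values()) for node, nbrs in adjacency.items()}


# Hypothetical expression profiles across five samples (not study data).
profiles = {
    "rna_a": [1, 2, 3, 4, 5],
    "rna_b": [2, 4, 6, 8, 10.5],
    "rna_c": [5, 4, 3, 2, 1.2],
    "rna_d": [3, 1, 4, 1, 5],
}
adjacency = soft_threshold_adjacency(profiles)
k = connectivity(adjacency)
least_connected = min(k, key=k.get)
```

Raising correlations to a power suppresses weak associations while keeping the network weighted rather than binary; a full analysis would additionally choose the power from a scale-free fit and group transcripts into modules, neither of which is shown here.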
ABSTRACT
The incidence of early-onset prostate cancer (EO-PCa), diagnosed in patients under 55 years old, has increased during recent years. The molecular biology of PCa in this group of patients remains unclear. Here, we applied weighted gene coexpression network analysis of the expression of miRNAs from 24 EO-PCa patients (38-45 years) and 25 late-onset PCa patients (LO-PCa, 71-74 years) to identify key miRNAs in EO-PCa patients. In total, 69 differentially expressed miRNAs were identified. Specifically, 26 and 14 miRNAs were exclusively deregulated in young and elderly patients, respectively, and 29 miRNAs were shared. We identified 20 hub miRNAs for the network built for EO-PCa. Six of these hub miRNAs exhibited prognostic significance in relapse-free or overall survival. Additionally, two of the hub miRNAs were coexpressed with mRNAs of genes previously identified as deregulated in EO-PCa and in the most aggressive forms of PCa in African-American patients compared with Caucasian patients. These genes are involved in activation of immune response pathways, increased rates of metastasis and poor prognosis in PCa patients. In conclusion, our analysis identified miRNAs that are potentially important in the molecular pathology of EO-PCa. These genes may serve as biomarkers in EO-PCa and as possible therapeutic targets.
Subject(s)
Tumor Biomarkers , Neoplastic Gene Expression Regulation , MicroRNAs , Prostatic Neoplasms , Neoplasm RNA , Adult , Black or African American , Aged , Tumor Biomarkers/biosynthesis , Tumor Biomarkers/genetics , Gene Expression Profiling , Humans , Male , MicroRNAs/biosynthesis , MicroRNAs/genetics , Middle Aged , Prostatic Neoplasms/genetics , Prostatic Neoplasms/metabolism , Prostatic Neoplasms/pathology , Neoplasm RNA/biosynthesis , Neoplasm RNA/genetics , White People
ABSTRACT
BACKGROUND: Analysis of patients with chromosomal abnormalities, including Turner syndrome and Klinefelter syndrome, has highlighted the importance of X-linked gene dosage as a contributing factor for disease susceptibility. Escape from X-inactivation and X-linked imprinting can result in transcriptional differences between normal men and women as well as in patients with sex chromosome abnormalities. OBJECTIVE: To identify differentially expressed genes among patients with Turner (45,X) and Klinefelter (47,XXY) syndrome using bioinformatics analysis. METHODOLOGY: Two gene expression data sets of Turner (45,X) and Klinefelter syndrome (47,XXY) were obtained from the Gene Expression Omnibus (GEO) database of the National Center for Biotechnology Information (NCBI). Statistical analysis was performed using R Bioconductor libraries. Differentially expressed genes (DEGs) were determined using significance analysis of microarrays (SAM). The functional annotation of the DEGs was performed with DAVID v6.8 (The Database for Annotation, Visualization, and Integrated Discovery). RESULTS: There are no genes over-expressed simultaneously in both diseases. However, when crossing the list of under-expressed genes for 45,X cells and the list of over-expressed genes for 47,XXY cells, there are 16 common genes: SLC25A6, AKAP17A, ASMTL, KDM5C, KDM6A, ATRX, CSF2RA, DHRSX, CD99, ZBED1, EIF1AX, MVB12B, SMC1A, P2RY8, DOCK7, DDX3X, eight of which are involved in the regulation of gene expression by epigenetic mechanisms, regulation of splicing processes and protein synthesis. CONCLUSION: Of the 16 genes identified as under-expressed in 45,X cells and over-expressed in 47,XXY cells, 14 are located on the X chromosome and 2 on autosomes; 8 of these genes are involved in the regulation of gene expression: 5 genes are related to epigenetic mechanisms, 2 to the regulation of splicing processes, and 1 to the protein synthesis process. Our results are limited because they derive from a bioinformatic analysis of mRNA isolated from whole blood; further exploration of the relationships between these genes and Turner syndrome and Klinefelter syndrome is therefore needed.
Subject(s)
Klinefelter Syndrome/genetics , Transcriptome , Turner Syndrome/genetics , Chromatin Assembly and Disassembly , DNA Methylation , Genetic Epigenesis , Gene Expression Profiling , Genetic Loci , Humans , Klinefelter Syndrome/metabolism , RNA Splicing , Turner Syndrome/metabolism , Up-Regulation
ABSTRACT
Microarray technology is widely recognized as one of the most important tools when it comes to understanding genetic expression in biological processes. In light of the thousands of gene expression level measurements (including measurements across a number of conditions), identifying differentially expressed genes necessarily implies data mining or large-scale multiple testing procedures. To date, advances with regard to this field have been multivariate-descriptive or inferential-univariate in nature and therefore have important limitations regarding the biological validity of detected genes. In the present article, we present a new multivariate inferential method designed to detect active differentially expressed genes in gene expression data. The proposed method estimates false discovery rates using artificial components. Our method excels when applied to the most common gene expression data structures, providing new insights into differentially expressed genes. The method described herein was programmed in an R-Bioconductor package called acde that has been available since 2015.