Results 1 - 20 of 33
1.
Cell; 148(6): 1293-307, 2012 Mar 16.
Article in English | MEDLINE | ID: mdl-22424236

ABSTRACT

Personalized medicine is expected to benefit from combining genomic information with regular monitoring of physiological states by multiple high-throughput methods. Here, we present an integrative personal omics profile (iPOP), an analysis that combines genomic, transcriptomic, proteomic, metabolomic, and autoantibody profiles from a single individual over a 14-month period. Our iPOP analysis revealed various medical risks, including type 2 diabetes. It also uncovered extensive, dynamic changes in diverse molecular components and biological pathways across healthy and diseased conditions. Extremely high-coverage genomic and transcriptomic data, which provide the basis of our iPOP, revealed extensive heteroallelic changes during healthy and diseased states and an unexpected RNA editing mechanism. This study demonstrates that longitudinal iPOP can be used to interpret healthy and diseased states by connecting genomic information with additional dynamic omics activity.


Subject(s)
Genome, Human; Genomics; Precision Medicine; Diabetes Mellitus, Type 2/genetics; Female; Gene Expression Profiling; Humans; Male; Metabolomics; Middle Aged; Mutation; Proteomics; Respiratory Syncytial Viruses/isolation & purification; Rhinovirus/isolation & purification
2.
Bioinformatics; 36(7): 2306-2307, 2020 Apr 01.
Article in English | MEDLINE | ID: mdl-31778155

ABSTRACT

SUMMARY: PyIOmica is an open-source Python package focusing on integrating longitudinal multiple omics datasets, characterizing and categorizing temporal trends. The package includes multiple bioinformatics tools including data normalization, annotation, categorization, visualization and enrichment analysis for gene ontology terms and pathways. Additionally, the package includes an implementation of visibility graphs to visualize time series as networks. AVAILABILITY AND IMPLEMENTATION: PyIOmica is implemented as a Python package (pyiomica), available for download and installation through the Python Package Index (https://pypi.python.org/pypi/pyiomica), and can be deployed using the Python import function following installation. PyIOmica has been tested on Mac OS X, Unix/Linux and Microsoft Windows. The application is distributed under an MIT license. Source code for each release is also available for download on Zenodo (https://doi.org/10.5281/zenodo.3548040). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics.


Subject(s)
Computational Biology; Software; Gene Ontology
3.
J Allergy Clin Immunol; 132(3): 656-664.e17, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23830146

ABSTRACT

BACKGROUND: Combined immunodeficiency with multiple intestinal atresias (CID-MIA) is a rare hereditary disease characterized by intestinal obstructions and profound immune defects. OBJECTIVE: We sought to determine the underlying genetic causes of CID-MIA by analyzing the exomic sequences of 5 patients and their healthy direct relatives from 5 unrelated families. METHODS: We performed whole-exome sequencing on 5 patients with CID-MIA and 10 healthy direct family members belonging to 5 unrelated families with CID-MIA. We also performed targeted Sanger sequencing for the candidate gene tetratricopeptide repeat domain 7A (TTC7A) on 3 additional patients with CID-MIA. RESULTS: Through analysis and comparison of the exomic sequence of the subjects from these 5 families, we identified biallelic damaging mutations in the TTC7A gene, for a total of 7 distinct mutations. Targeted TTC7A gene sequencing in 3 additional unrelated patients with CID-MIA revealed biallelic deleterious mutations in 2 of them, as well as an aberrant splice product in the third patient. Staining of normal thymus showed that the TTC7A protein is expressed in thymic epithelial cells, as well as in thymocytes. Moreover, severe lymphoid depletion was observed in the thymus and peripheral lymphoid tissues from 2 patients with CID-MIA. CONCLUSIONS: We identified deleterious mutations of the TTC7A gene in 8 unrelated patients with CID-MIA and demonstrated that the TTC7A protein is expressed in the thymus. Our results strongly suggest that TTC7A gene defects cause CID-MIA.


Subject(s)
Immunologic Deficiency Syndromes/genetics; Intestinal Atresia/genetics; Intestines/abnormalities; Proteins/genetics; Animals; Child, Preschool; Exome/genetics; Female; Humans; Infant; Infant, Newborn; Male; Mice; Mutation; Oligonucleotide Array Sequence Analysis; RNA, Messenger/metabolism; Thymus Gland/metabolism; Tissue Array Analysis
4.
Life Sci Alliance; 7(7), 2024 Jul.
Article in English | MEDLINE | ID: mdl-38724194

ABSTRACT

NUT carcinoma (NC) is an aggressive cancer with no effective treatment. About 70% of NUT carcinoma is associated with chromosome translocation events that lead to the formation of a BRD4::NUTM1 fusion gene. Because the BRD4::NUTM1 gene is unequivocally cytotoxic when ectopically expressed in cell lines, questions remain on whether the fusion gene can initiate NC. Here, we report the first genetically engineered mouse model for NUT carcinoma that recapitulates the human t(15;19) chromosome translocation in mice. We demonstrated that the mouse t(2;17) syntenic chromosome translocation, forming the Brd4::Nutm1 fusion gene, could induce aggressive carcinomas in mice. The tumors present histopathological and molecular features similar to human NC, with enrichment of undifferentiated cells. Similar to the reports of human NC incidence, Brd4::Nutm1 can induce NC from a broad range of tissues with a strong phenotypical variability. The consistent induction of poorly differentiated carcinoma demonstrated a strong reprogramming activity of BRD4::NUTM1. The new mouse model provided a critical preclinical model for NC that will lead to better understanding and therapy development for NC.


Subject(s)
Bromodomain Containing Proteins; Neoplasm Proteins; Nuclear Proteins; Oncogene Proteins, Fusion; Transcription Factors; Animals; Mice; Carcinoma/genetics; Carcinoma/metabolism; Cell Cycle Proteins/genetics; Cell Cycle Proteins/metabolism; Disease Models, Animal; Neoplasm Proteins/genetics; Neoplasm Proteins/metabolism; Nuclear Proteins/genetics; Nuclear Proteins/metabolism; Oncogene Proteins, Fusion/genetics; Transcription Factors/genetics; Transcription Factors/metabolism; Translocation, Genetic/genetics
5.
J Proteome Res; 12(1): 45-57, 2013 Jan 04.
Article in English | MEDLINE | ID: mdl-23259914

ABSTRACT

We report progress in assembling the parts list for chromosome 17 and illustrate the various processes that we have developed to integrate available data from diverse genomic and proteomic knowledge bases. As primary resources, we have used GPMDB, neXtProt, PeptideAtlas, Human Protein Atlas (HPA), and GeneCards. All sites share the common resource of Ensembl for the genome modeling information. We have defined the chromosome 17 parts list with the following information: 1169 protein-coding genes, the numbers of proteins confidently identified by various experimental approaches as documented in GPMDB, neXtProt, PeptideAtlas, and HPA, examples of typical data sets obtained by RNASeq and proteomic studies of epithelial-derived tumor cell lines (disease proteome) and a normal proteome (peripheral mononuclear cells), reported evidence of post-translational modifications, and examples of alternative splice variants (ASVs). We have constructed a list of the 59 "missing" proteins as well as 201 proteins that have inconclusive mass spectrometric (MS) identifications. In this report we have defined a process to establish a baseline for the incorporation of new evidence on protein identification and characterization as well as related information from transcriptome analyses. This initial list of "missing" proteins will guide the selection of appropriate samples for discovery studies as well as antibody reagents. We have also illustrated the significant diversity of protein variants (including post-translational modifications, PTMs) using regions on chromosome 17 that contain important oncogenes. We emphasize the need for mandated deposition of proteomics data in public databases, the further development of improved PTM, ASV, and single nucleotide variant (SNV) databases, and the construction of Web sites that can integrate and regularly update such information. In addition, we describe the distribution of both clustered and scattered sets of protein families on the chromosome. Since chromosome 17 is rich in cancer-associated genes, we have focused on the clustering of cancer-associated genes in such genomic regions and have used the ERBB2 amplicon as an example of the value of a proteogenomic approach in which one integrates transcriptomic with proteomic information and captures evidence of coexpression through coordinated regulation.


Subject(s)
Chromosomes, Human, Pair 17; Genome, Human; Proteins; Proteomics; Amino Acid Sequence; Chromosomes, Human, Pair 17/genetics; Chromosomes, Human, Pair 17/metabolism; Databases, Protein; Gene Expression; Human Genome Project; Humans; Proteins/classification; Proteins/genetics; Proteins/metabolism
6.
bioRxiv; 2023 Mar 18.
Article in English | MEDLINE | ID: mdl-36993537

ABSTRACT

From the early days of spaceflight to current missions, astronauts continue to be exposed to multiple hazards that affect human health, including low gravity, high radiation, isolation during long-duration missions, a closed environment and distance from Earth. Their effects can lead to adverse physiological changes and necessitate countermeasure development and/or longitudinal monitoring. A time-resolved analysis of biological signals can detect and better characterize potential adverse events during spaceflight, ideally preventing them and maintaining astronauts' wellness. Here we provide a time-resolved assessment of the impact of spaceflight on multiple astronauts (n=27) by studying multiple biochemical and immune measurements before, during, and after long-duration orbital spaceflight. We reveal space-associated changes of astronauts' physiology on both the individual level and across astronauts, including associations with bone resorption and kidney function, as well as immune-system dysregulation.

7.
Front Physiol; 14: 1219221, 2023.
Article in English | MEDLINE | ID: mdl-37520819

ABSTRACT

From the early days of spaceflight to current missions, astronauts continue to be exposed to multiple hazards that affect human health, including low gravity, high radiation, isolation during long-duration missions, a closed environment and distance from Earth. Their effects can lead to adverse physiological changes and necessitate countermeasure development and/or longitudinal monitoring. A time-resolved analysis of biological signals can detect and better characterize potential adverse events during spaceflight, ideally preventing them and maintaining astronauts' wellness. Here we provide a time-resolved assessment of the impact of spaceflight on multiple astronauts (n = 27) by studying multiple biochemical and immune measurements before, during, and after long-duration orbital spaceflight. We reveal space-associated changes of astronauts' physiology on both the individual level and across astronauts, including associations with bone resorption and kidney function, as well as immune-system dysregulation.

8.
Sci Rep; 12(1): 12098, 2022 Jul 15.
Article in English | MEDLINE | ID: mdl-35840765

ABSTRACT

Longitudinal deep multiomics profiling, which combines biomolecular, physiological, environmental and clinical measures data, shows great promise for precision health. However, integrating and understanding the complexity of such data remains a major challenge. Here we utilize an individual-focused bottom-up approach that first assesses single individuals' multiomics time series, and then uses the individual-level responses to group individuals based directly on the similarity of their longitudinal deep multiomics profiles. We used this individual-focused approach to analyze profiles from a study of longitudinal responses in type 2 diabetes mellitus. After generating periodograms for individual subject omics signals, we constructed within-person omics networks and analyzed personal-level immune changes. The results identified both individual-level responses to immune perturbation and clusters of individuals that behave similarly in immune response and that were associated with measures of their diabetic status.


Subject(s)
Diabetes Mellitus, Type 2; Prediabetic State; Diabetes Mellitus, Type 2/genetics; Humans; Prediabetic State/genetics
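
The periodogram step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a classical Lomb-Scargle periodogram, which tolerates uneven sampling; the abstract does not say which periodogram the authors used, and the function name, the sampling times and the 10-day test signal are all hypothetical.

```python
import math

def lomb_scargle(times, values, freqs):
    """Return normalized Lomb-Scargle power at each frequency (cycles per time unit)."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    variance = sum(c * c for c in centered) / (n - 1)
    powers = []
    for f in freqs:
        omega = 2.0 * math.pi * f
        # The offset tau makes the result independent of where time zero is placed.
        s2 = sum(math.sin(2.0 * omega * t) for t in times)
        c2 = sum(math.cos(2.0 * omega * t) for t in times)
        tau = math.atan2(s2, c2) / (2.0 * omega)
        cos_terms = [math.cos(omega * (t - tau)) for t in times]
        sin_terms = [math.sin(omega * (t - tau)) for t in times]
        cc = sum(c * c for c in cos_terms)
        ss = sum(s * s for s in sin_terms)
        yc = sum(y * c for y, c in zip(centered, cos_terms))
        ys = sum(y * s for y, s in zip(centered, sin_terms))
        power = 0.0
        if cc > 0:
            power += yc * yc / cc
        if ss > 0:
            power += ys * ys / ss
        powers.append(power / (2.0 * variance))
    return powers

# Hypothetical unevenly sampled signal with a 10-day period (not study data).
times = [0.0, 1.3, 2.1, 4.0, 5.2, 7.7, 8.1, 10.4, 12.9, 13.5,
         15.0, 17.2, 18.8, 21.1, 23.6, 24.0, 26.5, 28.3, 29.9]
values = [math.sin(2.0 * math.pi * t / 10.0) for t in times]
freqs = [k / 100.0 for k in range(2, 41)]  # 0.02 to 0.40 cycles per day
powers = lomb_scargle(times, values, freqs)
best_freq = freqs[powers.index(max(powers))]
```

In a per-subject analysis, one such power spectrum would be computed for each omics signal, and signals could then be compared by the frequencies at which their power concentrates.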
9.
Front Genet; 13: 1026487, 2022.
Article in English | MEDLINE | ID: mdl-36324501

ABSTRACT

Differential Network (DN) analysis is a method that has long been used to interpret changes in gene expression data and provide biological insights. The method identifies the rewiring of gene networks in response to external perturbations. Our study applies the DN method to the analysis of RNA-sequencing (RNA-seq) time series datasets. We focus on expression changes: (i) in saliva of a human subject after pneumococcal vaccination (PPSV23) and (ii) in primary B cells treated ex vivo with a monoclonal antibody drug (Rituximab). The DN method enabled us to identify the activation of biological pathways consistent with the mechanisms of action of the PPSV23 vaccine and target pathways of Rituximab. The community detection algorithm on the DN revealed clusters of genes characterized by collective temporal behavior. All saliva and some B cell DN communities showed characteristic time signatures, outlining a chronological order in pathway activation in response to the perturbation. Moreover, we identified early and delayed responses within network modules in the saliva dataset and three temporal patterns in the B cell data.
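
As a rough illustration of the rewiring idea described above, the sketch below builds a differential network from two sets of time series by comparing pairwise Pearson correlations. This is an assumption made for illustration: the abstract does not state how edges are defined, and the gene names, values and threshold are hypothetical.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def differential_network(baseline, perturbed, threshold=1.0):
    """Return {(gene_a, gene_b): delta_r} for pairs whose correlation shifts by more than threshold."""
    edges = {}
    genes = sorted(set(baseline) & set(perturbed))
    for a, b in combinations(genes, 2):
        delta = pearson(perturbed[a], perturbed[b]) - pearson(baseline[a], baseline[b])
        if abs(delta) > threshold:
            edges[(a, b)] = delta
    return edges

# Hypothetical expression time series; not data from the study.
baseline = {
    "G1": [1, 2, 3, 4, 5, 6],
    "G2": [2, 4, 6, 8, 10, 12],   # rises together with G1
    "G3": [5, 3, 6, 2, 7, 4],
}
perturbed = {
    "G1": [1, 2, 3, 4, 5, 6],
    "G2": [12, 10, 8, 6, 4, 2],   # now falls as G1 rises
    "G3": [5, 3, 6, 2, 7, 4],
}
dn = differential_network(baseline, perturbed)
```

Only the G1/G2 pair changes its relationship strongly enough to become an edge here; a community detection step would then run on the resulting graph.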

10.
iScience; 25(2): 103742, 2022 Feb 18.
Article in English | MEDLINE | ID: mdl-35128353

ABSTRACT

Recent clinical studies report that chromosomal 12q24.31 microdeletions are associated with autism spectrum disorder (ASD) and intellectual disability (ID). However, the causality and underlying mechanisms linking 12q24.31 microdeletions to ASD/ID remain undetermined. Here we show Kdm2b, one gene located in chromosomal 12q24.31, plays a critical role in maintaining neural stem cells (NSCs) in the mouse brain. Loss of the CxxC-ZF domain of KDM2B impairs its function in recruiting Polycomb repressive complex 1 (PRC1) to chromatin, resulting in de-repression of genes involved in cell apoptosis, cell-cycle arrest, NSC senescence, and loss of NSC populations in the brain. Of importance, the Kdm2b mutation is sufficient to induce ASD/ID-like behavioral and memory deficits. Thus, our study reveals a critical role of KDM2B in normal brain development, a causality between the Kdm2b mutation and ASD/ID-like phenotypes in mice, and potential molecular mechanisms linking the function of KDM2B-PRC1 in transcriptional regulation to the 12q24.31 microdeletion-associated ASD/ID.

11.
Sci Rep; 11(1): 5623, 2021 Mar 11.
Article in English | MEDLINE | ID: mdl-33707481

ABSTRACT

Temporal behavior is an essential aspect of all biological systems. Time series have previously been represented as networks. Such representations must address two fundamental problems: (1) how to create appropriate networks that reflect the characteristics of biological time series, and (2) how to detect characteristic dynamic patterns or events as network temporal communities. General community detection methods use metrics comparing the connectivity within a community to random models, or are based on the betweenness centrality of edges or nodes. However, such methods were not designed for network representations of time series. We introduce a visibility-graph-based method to build networks from time series and detect temporal communities within these networks. To characterize unevenly sampled time series (typical of biological experiments), and simultaneously capture events associated with peaks and troughs, we introduce the Weighted Dual-Perspective Visibility Graph (WDPVG). To detect temporal communities in individual signals, we first find the shortest path of the network between start and end nodes, identifying high-intensity nodes that form the main stem of our community detection algorithm and act as hubs for each community. Then, we aggregate nodes outside the shortest path to the closest nodes found on the main stem based on the closest path length, thereby assigning every node to a temporal community based on proximity to the stem nodes/hubs. We demonstrate the validity and effectiveness of our method through simulation and biological applications.


Subject(s)
Algorithms; Residence Characteristics; Computer Simulation; Databases as Topic; Humans; Time Factors
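
The procedure outlined in the abstract above (build a visibility graph, find the start-to-end shortest path, attach remaining nodes to the nearest stem node) can be sketched as follows. This is a simplified, unweighted illustration and not the published WDPVG: edge weights are omitted, so the stem is not guaranteed to pass through high-intensity nodes, and all function names and sample values are hypothetical.

```python
from collections import deque

def visibility_edges(times, values):
    """Natural visibility graph: i and j are linked if no sample between them
    reaches the straight line joining (t_i, y_i) and (t_j, y_j)."""
    edges = set()
    n = len(values)
    for i in range(n - 1):
        for j in range(i + 1, n):
            visible = True
            for k in range(i + 1, j):
                slope = (values[j] - values[i]) / (times[j] - times[i])
                line = values[i] + slope * (times[k] - times[i])
                if values[k] >= line:
                    visible = False
                    break
            if visible:
                edges.add((i, j))
    return edges

def dual_perspective_graph(times, values):
    """Union of the visibility graphs of the signal and of its reflection,
    so that both peaks and troughs contribute edges."""
    reflected = [-v for v in values]
    edges = visibility_edges(times, values) | visibility_edges(times, reflected)
    adjacency = {i: set() for i in range(len(values))}
    for i, j in edges:
        adjacency[i].add(j)
        adjacency[j].add(i)
    return adjacency

def shortest_path(adjacency, start, end):
    """Breadth-first shortest path (fewest edges) from start to end."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == end:
            break
        for neighbor in sorted(adjacency[node]):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    path = []
    node = end
    while node is not None:
        path.append(node)
        node = previous[node]
    return path[::-1]

def temporal_communities(adjacency, stem):
    """Assign every node to the stem node (hub) reached first by a multi-source search."""
    owner = {hub: hub for hub in stem}
    queue = deque(stem)
    while queue:
        node = queue.popleft()
        for neighbor in sorted(adjacency[node]):
            if neighbor not in owner:
                owner[neighbor] = owner[node]
                queue.append(neighbor)
    return owner

# Hypothetical unevenly sampled signal; not data from the study.
times = [0.0, 1.0, 2.5, 3.0, 4.5, 6.0, 7.0, 9.0]
values = [1.0, 4.0, 2.0, 1.0, 5.0, 2.0, 3.0, 1.0]
graph = dual_perspective_graph(times, values)
stem = shortest_path(graph, 0, len(values) - 1)
communities = temporal_communities(graph, stem)
```

Because adjacent samples always see each other, the graph is connected and every node receives a community label.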
12.
Front Oncol; 11: 754093, 2021.
Article in English | MEDLINE | ID: mdl-34692539

ABSTRACT

ASH1L and MLL1 are two histone methyltransferases that facilitate transcriptional activation during normal development. However, the roles of ASH1L and its enzymatic activity in the development of MLL-rearranged leukemias are not fully elucidated in Ash1L gene knockout animal models. In this study, we used an Ash1L conditional knockout mouse model to show that loss of ASH1L in hematopoietic progenitor cells impaired the initiation of MLL-AF9-induced leukemic transformation in vitro. Furthermore, genetic deletion of ASH1L in the MLL-AF9-transformed cells impaired the maintenance of leukemic cells in vitro and largely blocked the leukemia progression in vivo. Importantly, the loss of ASH1L function in the Ash1L-deleted cells could be rescued by wild-type but not the catalytic-dead mutant ASH1L, suggesting the enzymatic activity of ASH1L was required for its function in promoting MLL-AF9-induced leukemic transformation. At the molecular level, ASH1L enhanced the MLL-AF9 target gene expression by directly binding to the gene promoters and modifying the local histone H3K36me2 levels. Thus, our study revealed the critical functions of ASH1L in promoting the MLL-AF9-induced leukemogenesis, which provides a molecular basis for targeting ASH1L and its enzymatic activity to treat MLL-AF9-induced leukemias.

13.
Commun Biol; 4(1): 756, 2021 Jun 18.
Article in English | MEDLINE | ID: mdl-34145365

ABSTRACT

Autism spectrum disorder (ASD) is a neurodevelopmental disease associated with various gene mutations. Recent genetic and clinical studies report that mutations of the epigenetic gene ASH1L are highly associated with human ASD and intellectual disability (ID). However, the causality and underlying molecular mechanisms linking ASH1L mutations to genesis of ASD/ID remain undetermined. Here we show loss of ASH1L in the developing mouse brain is sufficient to cause multiple developmental defects, core autistic-like behaviors, and impaired cognitive memory. Gene expression analyses uncover critical roles of ASH1L in regulating gene expression during neural cell development. Thus, our study establishes an ASD/ID mouse model revealing the critical function of an epigenetic factor ASH1L in normal brain development, a causality between Ash1L mutations and ASD/ID-like behaviors in mice, and potential molecular mechanisms linking Ash1L mutations to brain functional abnormalities.


Subject(s)
Autism Spectrum Disorder/genetics; Brain/growth & development; Brain/metabolism; DNA-Binding Proteins/genetics; Histone-Lysine N-Methyltransferase/genetics; Intellectual Disability/genetics; Animals; Autism Spectrum Disorder/metabolism; Disease Models, Animal; Embryonic Development/genetics; Humans; Mice; Mice, Inbred C57BL; Mice, Knockout
14.
Sci Rep; 11(1): 710, 2021 Jan 12.
Article in English | MEDLINE | ID: mdl-33436912

ABSTRACT

Saliva omics has immense potential for non-invasive diagnostics, including monitoring very young or elderly populations, or individuals in remote locations. In this study, multiple saliva omics from an individual were monitored over three periods (100 timepoints) involving: (1) hourly sampling over 24 h without intervention, (2) hourly sampling over 24 h including immune system activation using the standard 23-valent pneumococcal polysaccharide vaccine, (3) daily sampling for 33 days profiling the post-vaccination response. At each timepoint total saliva transcriptome and proteome, and small RNA from salivary extracellular vesicles were profiled, including mRNA, miRNA, piRNA and bacterial RNA. The two 24-h periods were used in a paired analysis to remove daily variation and reveal vaccination responses. Over 18,000 omics longitudinal series had statistically significant temporal trends compared to a healthy baseline. Various immune response and regulation pathways were activated following vaccination, including interferon and cytokine signaling, and MHC antigen presentation. Immune response timeframes were concordant with innate and adaptive immunity development, and coincided with vaccination and reported fever. Overall, mRNA results appeared more specific and sensitive (timewise) to vaccination compared to other omics. The results suggest saliva omics can be consistently assessed for non-invasive personalized monitoring and immune response diagnostics.


Subject(s)
Pneumococcal Infections/immunology; Pneumococcal Vaccines/administration & dosage; Proteome/drug effects; Saliva/metabolism; Sinusitis/immunology; Streptococcus pneumoniae/immunology; Transcriptome/drug effects; Adult; Humans; Immunity; Longitudinal Studies; Male; Pneumococcal Infections/drug therapy; Pneumococcal Infections/microbiology; Saliva/drug effects; Sinusitis/drug therapy; Sinusitis/microbiology; Time Factors; Vaccination
15.
Curr Protoc Bioinformatics; 69(1): e91, 2020 Mar.
Article in English | MEDLINE | ID: mdl-31851777

ABSTRACT

MathIOmica is a package for bioinformatics, written in the Wolfram language, that provides multiple utilities to facilitate the analysis of longitudinal data generated from omics experiments, including transcriptomics, proteomics, and metabolomics data, as well as any generalized time series. MathIOmica uses Mathematica's notebook interface, wherein users can import longitudinal datasets, carry out quality control and normalization, generate time series, and classify temporal trends. MathIOmica provides spectral methods based on periodograms and autocorrelations for automatically detecting classes of temporal behavior and allowing the user to visualize collective temporal behavior, and also assess biological significance through Gene Ontology and pathway enrichment analyses. MathIOmica's time-series classification methods address common issues including missing data and uneven sampling in measurements. As such, the software is ideally suited for the analysis of experimental data from individualized profiling of subjects, can facilitate analysis of data from the emerging field of individualized health monitoring, and can detect temporal trends that may be associated with adverse health events. In this article, we import a transcriptomics (RNA-sequencing) dataset collected over multiple timepoints and generate time series for each transcript represented in the data. We classify the time series to identify classes of significant temporal trends (using autocorrelations). We assess statistical significance cutoffs in the classification by generating null distributions using randomly resampled time series. We then visualize the significant trends in heatmaps and assess biological significance using enrichment analyses. Finally, we visualize pathway results for statistically significant pathways of interest. © 2019 by John Wiley & Sons, Inc. Basic Protocol: Time series analysis of transcriptomics expression dataset.


Subject(s)
Databases, Factual; Genomics/methods; Software; Gene Expression Regulation; Humans; NF-kappa B/metabolism; Necroptosis/genetics; Signal Transduction; Time Factors; Transcriptome/genetics
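
The protocol above classifies time series by autocorrelation and sets significance cutoffs from null distributions built by random resampling. The sketch below shows that idea in Python rather than in the Wolfram language; it is not the MathIOmica API. It assumes evenly spaced samples with no missing values (the package itself handles uneven sampling), and the function names, the number of resamples and the 0.95 quantile are illustrative choices.

```python
import random

def autocorrelation(series, lag=1):
    """Sample autocorrelation at the given lag for an evenly spaced series."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    denom = sum(c * c for c in centered)
    if denom == 0:
        return 0.0
    return sum(centered[i] * centered[i + lag] for i in range(n - lag)) / denom

def null_cutoff(series, lag=1, n_resamples=2000, quantile=0.95, seed=0):
    """Cutoff taken from autocorrelations of randomly shuffled copies of the series."""
    rng = random.Random(seed)
    null = []
    for _ in range(n_resamples):
        shuffled = series[:]
        rng.shuffle(shuffled)
        null.append(autocorrelation(shuffled, lag))
    null.sort()
    return null[int(quantile * (n_resamples - 1))]

# Hypothetical transcript time series with a steady upward trend.
trend = [0.1, 0.4, 0.9, 1.5, 2.2, 2.8, 3.1, 3.9, 4.4, 5.0, 5.3, 6.1]
observed = autocorrelation(trend)
cutoff = null_cutoff(trend)
is_significant = observed > cutoff
```

Shuffling destroys temporal order while keeping the value distribution, so a series whose observed autocorrelation exceeds the cutoff shows structure unlikely to arise by chance ordering.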
16.
Front Genet; 11: 700, 2020.
Article in English | MEDLINE | ID: mdl-32765582

ABSTRACT

Cells release nanometer-scale, lipid bilayer-enclosed biomolecular packages (extracellular vesicles; EVs) into their surrounding environment. EVs are hypothesized to be intercellular communication agents that regulate physiological states by transporting biomolecules between near and distant cells. The research community has consistently advocated for the importance of RNA contents in EVs by demonstrating that: (1) EV-related RNA contents can be detected in a liquid biopsy, (2) disease states significantly alter EV-related RNA contents, and (3) sensitive and specific liquid biopsies can be implemented in precision medicine settings by measuring EV-derived RNA contents. Furthermore, EVs have medical potential beyond diagnostics. Both natural and engineered EVs are being investigated for therapeutic applications such as regenerative medicine and as drug delivery agents. This review focuses specifically on EV characterization, analysis of their RNA content, and their functional implications. The NIH extracellular RNA communication (ERC) program has catapulted human EV research from an RNA profiling standpoint by standardizing the pipeline for working with EV transcriptomics data, and creating a centralized database for the scientific community. There are currently thousands of RNA-sequencing profiles hosted on the Extracellular RNA Atlas alone (Murillo et al., 2019), encompassing a variety of human biofluid types and health conditions. While a number of significant discoveries have been made through these studies individually, integrative analyses of these data have thus far been limited. A primary focus of the ERC program over the next five years is to bring higher resolution tools to the EV research community so that investigators can isolate and analyze EV sub-populations, and ultimately single EVs sourced from discrete cell types, tissues, and complex biofluids. 
Higher resolution techniques will be essential for evaluating the roles of circulating EVs at a level which impacts clinical decision making. We expect that advances in microfluidic technologies will drive near-term innovation and discoveries about the diverse RNA contents of EVs. Long-term translation of EV-based RNA profiling into a mainstay medical diagnostic tool will depend upon identifying robust patterns of circulating genetic material that correlate with a change in health status.

17.
iScience; 23(11): 101646, 2020 Nov 20.
Article in English | MEDLINE | ID: mdl-33103084

ABSTRACT

The recruitment of Polycomb repressive complex 2 (PRC2) to gene promoters is critical for its function in repressing gene expression in murine embryonic stem cells (mESCs). However, previous studies have demonstrated that although the expression of early lineage-specific genes is largely repressed, the genome-wide PRC2 occupancy is unexpectedly reduced in naive mESCs. In this study, we provide evidence that fibroblast growth factor/extracellular signal-regulated kinase signaling determines the global PRC2 occupancy through regulating the expression of PRC2-recruiting factor JARID2 in naive mESCs. At the transcriptional level, the de-repression of bivalent genes is predominantly determined by the presence of cell signaling-associated transcription factors but not the status of PRC2 occupancy at gene promoters. Hence, this study not only reveals a key molecular mechanism by which cell signaling regulates the PRC2 occupancy in mESCs but also elucidates the functional roles of transcription factors and Polycomb-mediated epigenetic mechanisms in transcriptional regulation.

18.
PLoS One; 15(12): e0243251, 2020.
Article in English | MEDLINE | ID: mdl-33315963

ABSTRACT

Modern genomic data sets often involve multiple data-layers (e.g., DNA-sequence, gene expression), each of which itself can be high-dimensional. The biological processes underlying these data-layers can lead to intricate multivariate association patterns. We propose and evaluate two methods to determine the proportion of variance of an output data set that can be explained by an input data set when both data panels are high dimensional. Our approach uses random-effects models to estimate the proportion of variance of vectors in the linear span of the output set that can be explained by regression on the input set. We consider a method based on an orthogonal basis (Eigen-ANOVA) and one that uses random vectors (Monte Carlo ANOVA, MC-ANOVA) in the linear span of the output set. Using simulations, we show that the MC-ANOVA method gave nearly unbiased estimates. Estimates produced by Eigen-ANOVA were also nearly unbiased, except when the shared variance was very high (e.g., >0.9). We demonstrate the potential insight that can be obtained from the use of MC-ANOVA and Eigen-ANOVA by applying these two methods to the study of multi-locus linkage disequilibrium in chicken (Gallus gallus) genomes and to the assessment of inter-dependencies between gene expression, methylation, and copy-number-variants in data from breast cancer tumors from humans (Homo sapiens). Our analyses reveal that in chicken breeding populations ~50,000 evenly-spaced SNPs are enough to fully capture the span of whole-genome-sequencing genomes. In the study of multi-omic breast cancer data, we found that the span of copy-number-variants can be fully explained using either methylation or gene expression data and that roughly 74% of the variance in gene expression can be predicted from methylation data.


Subject(s)
Genomics/methods; Analysis of Variance; Animals; Breast Neoplasms/genetics; Chickens/genetics; DNA Copy Number Variations; DNA Methylation; Female; Gene Expression Regulation, Neoplastic; Humans; Linkage Disequilibrium; Monte Carlo Method; Polymorphism, Single Nucleotide; Whole Genome Sequencing
19.
Sci Rep; 9(1): 12413, 2019 Aug 27.
Article in English | MEDLINE | ID: mdl-31455838

ABSTRACT

In 2019 it is estimated that more than 21,000 new acute myeloid leukemia (AML) patients will be diagnosed in the United States, and nearly 11,000 are expected to die from the disease. AML is primarily diagnosed among the elderly (median 68 years old at diagnosis). Prognoses have significantly improved for younger patients, but as many as 70% of patients over 60 years old will die within a year of diagnosis. In this study, we conducted a reanalysis of 2,213 acute myeloid leukemia patients compared to 548 healthy individuals, using curated publicly available microarray gene expression data. We carried out an analysis of normalized batch-corrected data, using a linear model that included considerations for disease, age, sex, and tissue. We identified 974 differentially expressed probe sets and 4 significant pathways associated with AML. Additionally, we identified 375 age- and 70 sex-related probe set expression signatures relevant to AML. Finally, we trained a k-nearest neighbors model to classify AML and healthy subjects with 90.9% accuracy. Our findings provide a new reanalysis of public datasets, which enabled the identification of new gene sets relevant to AML that can potentially be used in future experiments and possible stratified disease diagnostics.


Subject(s)
Databases, Nucleic Acid; Gene Expression Profiling; Gene Expression Regulation, Leukemic; Leukemia, Myeloid, Acute; Transcriptome; Adult; Aged; Female; Humans; Leukemia, Myeloid, Acute/genetics; Leukemia, Myeloid, Acute/metabolism; Male; Middle Aged; Oligonucleotide Array Sequence Analysis; United States
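
The classification step in the abstract above relies on a k-nearest neighbors model. The following is a minimal sketch of that technique on toy data; the feature values, the choice of k = 3, the Euclidean distance and the leave-one-out evaluation are assumptions for illustration and are not taken from the study.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Label a query by majority vote among its k nearest training samples."""
    neighbors = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]

def leave_one_out_accuracy(samples, labels, k=3):
    """Fraction of samples classified correctly when each is held out in turn."""
    correct = 0
    for i, (x, y) in enumerate(zip(samples, labels)):
        rest_x = samples[:i] + samples[i + 1:]
        rest_y = labels[:i] + labels[i + 1:]
        correct += knn_predict(rest_x, rest_y, x, k) == y
    return correct / len(samples)

# Hypothetical two-feature expression values; not data from the study.
samples = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9), (0.9, 1.3),
           (3.0, 3.2), (3.1, 2.9), (2.9, 3.0), (3.2, 3.1)]
labels = ["healthy"] * 4 + ["AML"] * 4
prediction = knn_predict(samples, labels, (2.8, 3.1))
accuracy = leave_one_out_accuracy(samples, labels)
```

With real expression data, the features would be the selected probe sets, and accuracy would be estimated on samples held out from training.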
20.
Front Neurosci; 13: 392, 2019.
Article in English | MEDLINE | ID: mdl-31068785

ABSTRACT

Alzheimer's disease (AD) has been categorized by the Centers for Disease Control and Prevention (CDC) as the 6th leading cause of death in the United States. AD is a significant health-care burden because of its increased occurrence (specifically in the elderly population), and the lack of effective treatments and preventive methods. With an increase in life expectancy, the CDC expects AD cases to rise to 15 million by 2060. Aging has been previously associated with susceptibility to AD, and there are ongoing efforts to effectively differentiate between normal and AD age-related brain degeneration and memory loss. AD targets neuronal function and can cause neuronal loss due to the buildup of amyloid-beta plaques and intracellular neurofibrillary tangles. Our study aims to identify temporal changes within gene expression profiles of healthy controls and AD subjects. We conducted a meta-analysis using publicly available microarray expression data from AD and healthy cohorts. For our meta-analysis, we selected datasets that reported donor age and gender, and used Affymetrix and Illumina microarray platforms (8 datasets, 2,088 samples). Raw microarray expression data were re-analyzed, and normalized across arrays. We then performed an analysis of variance, using a linear model that incorporated age, tissue type, sex, and disease state as effects, as well as study to account for batch effects, and included binary interactions between factors. Our results identified 3,735 statistically significant (Bonferroni adjusted p < 0.05) gene expression differences between AD and healthy controls, which we filtered for biological effect (10% two-tailed quantiles of mean differences between groups) to obtain 352 genes. 
Pathways identified as enriched comprised neurodegenerative disease pathways (including AD), as well as mitochondrial translation and dysfunction, synaptic vesicle cycle and GABAergic synapse; enriched gene ontology terms included neuronal system, transmission across chemical synapses and mitochondrial translation. Overall, our approach allowed us to effectively combine multiple available microarray datasets and identify gene expression differences between AD and healthy individuals, including full age and tissue type considerations. Our findings provide potential gene and pathway associations that can be targeted to improve AD diagnostics and potentially treatment or prevention.
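
The two-stage gene selection described in this abstract (Bonferroni-adjusted p < 0.05, then a filter on the tails of the mean differences) can be sketched as below. The gene names and numbers are hypothetical, and the reading of "10% two-tailed quantiles" as keeping values at or beyond the lower and upper 10% positions of the sorted differences is an assumption; the abstract does not spell out the exact rule.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capping at 1."""
    m = len(p_values)
    return {gene: min(1.0, p * m) for gene, p in p_values.items()}

def tail_filter(mean_diffs, tail=0.10):
    """Keep genes whose mean difference lies in either tail of the distribution.
    The tail boundaries are picked by position in the sorted values (an assumed rule)."""
    ordered = sorted(mean_diffs.values())
    offset = int(tail * (len(ordered) - 1))
    low = ordered[offset]
    high = ordered[len(ordered) - 1 - offset]
    return {g for g, d in mean_diffs.items() if d <= low or d >= high}

# Hypothetical per-gene results; not data from the study.
p_values = {"G0": 0.0001, "G1": 0.004, "G2": 0.02, "G3": 0.0005, "G4": 0.3,
            "G5": 0.00001, "G6": 0.006, "G7": 0.9, "G8": 0.001, "G9": 0.04}
mean_diffs = {"G0": -1.8, "G1": 0.1, "G2": 0.7, "G3": 0.2, "G4": -0.3,
              "G5": 2.1, "G6": 1.2, "G7": 0.0, "G8": -0.05, "G9": -0.9}

adjusted = bonferroni(p_values)
significant = {g for g, p in adjusted.items() if p < 0.05}
selected = tail_filter({g: mean_diffs[g] for g in significant})
```

The first stage controls false positives across all tests; the second keeps only those significant genes whose group difference is large enough to be of biological interest.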
