Results 1 - 18 of 18
1.
J Biomed Inform ; 143: 104406, 2023 07.
Article in English | MEDLINE | ID: mdl-37257630

ABSTRACT

Multi-view clustering methods are essential for the stratification of patients into sub-groups of similar molecular characteristics. In recent years, a wide range of methods have been developed for this purpose. However, due to the high diversity of cancer-related data, a single method may not perform sufficiently well in all cases. We present Parea, a multi-view hierarchical ensemble clustering approach for disease subtype discovery. We demonstrate its performance on several machine learning benchmark datasets. We apply and validate our methodology on real-world multi-view patient data, comprising seven types of cancer. Parea outperforms the current state-of-the-art on six out of seven analysed cancer types. We have integrated the Parea method into our Python package Pyrea (https://github.com/mdbloice/Pyrea), which enables the effortless and flexible design of ensemble workflows while incorporating a wide range of fusion and clustering algorithms.


Subject(s)
Algorithms , Neoplasms , Humans , Cluster Analysis , Neoplasms/genetics , Machine Learning
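The abstract above does not spell out how Parea combines clusterings, so the following is only a generic sketch of one common ensemble-clustering building block: a co-association matrix that records how often two samples share a cluster across views. The function name `co_association` and the toy labels are hypothetical and are not taken from Pyrea.

```python
def co_association(labelings):
    """Return the fraction of clusterings in which each sample pair co-clusters."""
    n = len(labelings[0])
    counts = [[0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(labelings) for c in row] for row in counts]

# Hypothetical cluster labels for four patients, one list per data view.
views = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
co = co_association(views)
```

A matrix like `co` could then serve as a similarity input to a hierarchical clustering step; label values only need to be consistent within a view, not across views.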
2.
Stud Health Technol Inform ; 294: 137-138, 2022 May 25.
Article in English | MEDLINE | ID: mdl-35612038

ABSTRACT

Feature selection is a fundamental challenge in machine learning. For instance, in bioinformatics, it is essential when one wishes to detect biomarkers. Tree-based methods are predominantly used for this purpose. In this paper, we study the stability of the feature selection methods BORUTA, VITA, and RRF (regularized random forest). In particular, we investigate the feature ranking instability of the associated stochastic algorithms. For stabilization of the feature ranks, we propose to compute consensus values from multiple feature selection runs, applying rank aggregation techniques. Our results show that these consolidated features are more accurate and robust, which helps to make practical machine learning applications more trustworthy.


Subject(s)
Algorithms , Machine Learning , Biomarkers , Computational Biology/methods
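The consensus idea in the abstract above can be illustrated with a minimal sketch. It assumes mean-rank (Borda-style) aggregation, which is one simple rank aggregation technique and not necessarily the one used in the paper. The feature names and the three runs are hypothetical.

```python
from statistics import mean

def aggregate_ranks(runs):
    """Combine several rankings of the same features (best first) by mean rank."""
    features = set(runs[0])
    for run in runs:
        if set(run) != features:
            raise ValueError("every run must rank the same set of features")
    mean_rank = {f: mean(run.index(f) + 1 for run in runs) for f in features}
    # Lower mean rank is better; ties are broken alphabetically for determinism.
    return sorted(features, key=lambda f: (mean_rank[f], f))

# Hypothetical rankings from three stochastic feature-selection runs.
runs = [
    ["geneA", "geneB", "geneC", "geneD"],
    ["geneB", "geneA", "geneC", "geneD"],
    ["geneA", "geneC", "geneB", "geneD"],
]
consensus = aggregate_ranks(runs)
```

Here the individual runs disagree on the top positions, while the aggregated order is stable across them.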
3.
J Biomed Inform ; 113: 103636, 2021 01.
Article in English | MEDLINE | ID: mdl-33271342

ABSTRACT

Recent advances in multi-omics clustering methods enable a more fine-tuned separation of cancer patients into clinically relevant clusters. These advancements have the potential to provide a deeper understanding of cancer progression and may facilitate the treatment of cancer patients. Here, we present a simple hierarchical clustering and data fusion approach, named HC-fused, for the detection of disease subtypes. Unlike other methods, the proposed approach naturally reports on the individual contribution of each single-omic to the data fusion process. We perform multi-view simulations with disjoint and disjunct cluster elements across the views to highlight fundamentally different data integration behavior of various state-of-the-art methods. HC-fused combines the strengths of some recently published methods and shows superior performance on real-world cancer data from the TCGA (The Cancer Genome Atlas) database. An R implementation of our method is available on GitHub (pievos101/HC-fused).


Subject(s)
Algorithms , Neoplasms , Cluster Analysis , Databases, Factual , Humans , Neoplasms/genetics
4.
Mol Ecol Resour ; 20(6): 1597-1609, 2020 Nov.
Article in English | MEDLINE | ID: mdl-32639602

ABSTRACT

In recent years, genome-scan methods have been extensively used to detect local signatures of selection and introgression. Most of these methods are designed for either one case or the other, which may impair the study of combined cases. Here, we introduce a series of versatile genome-scan methods applicable for both cases, the detection of selection and introgression. The proposed approaches are based on nonparametric k-nearest neighbour (kNN) techniques, while incorporating pairwise Fixation Index (FST) and pairwise nucleotide differences (dxy) as features. We benchmark our methods using a wide range of simulation scenarios, with varying parameters, such as recombination rates, population background histories, selection strengths, the proportion of introgression and the time of gene flow. We find that kNN-based methods perform remarkably well compared with the state-of-the-art. Finally, we demonstrate how to perform kNN-based genome scans on real-world genomic data using the population genomics R-package PopGenome.


Subject(s)
Computer Simulation , Genome , Genomics , Models, Genetic , Gene Flow , Genetics, Population , Metagenomics , Polymorphism, Single Nucleotide , Selection, Genetic
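As a rough illustration of the kNN idea in the abstract above, the sketch below scores each genomic window by its mean distance to its k nearest neighbours in (FST, dxy) feature space. This is a generic kNN outlier score written for illustration. It is not the method from the paper and does not use the PopGenome API, and the window values are hypothetical.

```python
import math

def knn_outlier_scores(points, k=2):
    """Mean Euclidean distance from each point to its k nearest neighbours."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores

# Hypothetical (FST, dxy) values for five genomic windows.
# In practice the two features would usually be rescaled to comparable ranges.
windows = [(0.10, 0.010), (0.12, 0.011), (0.11, 0.009), (0.09, 0.010), (0.60, 0.030)]
scores = knn_outlier_scores(windows, k=2)
top_window = max(range(len(windows)), key=scores.__getitem__)
```

Windows with unusually large scores lie far from the genomic background and would be candidates for closer inspection.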
6.
Nat Commun ; 10(1): 4666, 2019 10 11.
Article in English | MEDLINE | ID: mdl-31604930

ABSTRACT

Deregulation of transcription factors (TFs) is an important driver of tumorigenesis, but non-invasive assays for assessing transcription factor activity are lacking. Here we develop and validate a minimally invasive method for assessing TF activity based on cell-free DNA sequencing and nucleosome footprint analysis. We analyze whole genome sequencing data for >1,000 cell-free DNA samples from cancer patients and healthy controls using a bioinformatics pipeline developed by us that infers accessibility of TF binding sites from cell-free DNA fragmentation patterns. We observe patient-specific as well as tumor-specific patterns, including accurate prediction of tumor subtypes in prostate cancer, with important clinical implications for the management of patients. Furthermore, we show that cell-free DNA TF profiling is capable of detection of early-stage colorectal carcinomas. Our approach for mapping tumor-specific transcription factor binding in vivo based on blood samples makes a key part of the noncoding genome amenable to clinical analysis.


Subject(s)
Breast Neoplasms/genetics , Cell-Free Nucleic Acids/chemistry , Colonic Neoplasms/genetics , Prostatic Neoplasms/genetics , Transcription Factors/physiology , Binding Sites , Breast Neoplasms/blood , Breast Neoplasms/diagnosis , Colonic Neoplasms/blood , Colonic Neoplasms/diagnosis , Computational Biology , DNA Fragmentation , Early Detection of Cancer/methods , Female , Humans , Male , Nucleosomes/chemistry , Prostatic Neoplasms/blood , Prostatic Neoplasms/diagnosis
7.
Transl Oncol ; 12(2): 256-268, 2019 Feb.
Article in English | MEDLINE | ID: mdl-30439626

ABSTRACT

BACKGROUND & AIMS: Steatohepatitis (SH) and SH-associated hepatocellular carcinoma (HCC) are of considerable clinical significance. SH is morphologically characterized by steatosis, liver cell ballooning, cytoplasmic aggregates termed Mallory-Denk bodies (MDBs), inflammation, and fibrosis at late stage. Disturbance of the keratin cytoskeleton and aggregation of keratins (KRTs) are essential for MDB formation. METHODS: We analyzed livers of aged Krt18-/- mice, which in the majority of cases spontaneously developed SH-associated HCC independent of sex. Interestingly, the hepatic lipid profile in Krt18-/- mice, which accumulate KRT8, closely resembles human SH lipid profiles and shows that the excess of KRT8 over KRT18 determines the likelihood to develop SH-associated HCC linked with enhanced lipogenesis. RESULTS: We compared the genetic profile of Krt18-/- mice with 26 human hepatoma cell lines and with data sets of >300 patients with HCC; Krt18-/- gene signatures matched human HCC. Interestingly, a high KRT8/18 ratio is associated with an aggressive HCC phenotype. CONCLUSIONS: We can prove that intermediate filaments and their binding partners are tightly linked to hepatic lipid metabolism and to hepatocarcinogenesis. We suggest the KRT8/18 ratio as a novel biomarker for HCC.

8.
Diabetologia ; 61(11): 2398-2411, 2018 11.
Article in English | MEDLINE | ID: mdl-30091044

ABSTRACT

AIMS/HYPOTHESIS: An adverse intrauterine environment can result in permanent changes in the physiology of the offspring and predispose to diseases in adulthood. One such exposure, gestational diabetes mellitus (GDM), has been linked to development of metabolic disorders and cardiovascular disease in offspring. Epigenetic variation, including DNA methylation, is recognised as a leading mechanism underpinning fetal programming and we hypothesised that this plays a key role in fetoplacental endothelial dysfunction following exposure to GDM. Thus, we conducted a pilot epigenetic study to analyse concordant DNA methylation and gene expression changes in GDM-exposed fetoplacental endothelial cells. METHODS: Genome-wide methylation analysis of primary fetoplacental arterial endothelial cells (AEC) and venous endothelial cells (VEC) from healthy pregnancies and GDM-complicated pregnancies in parallel with transcriptome analysis identified methylation and expression changes. Most-affected pathways and functions were identified by Ingenuity Pathway Analysis and validated using functional assays. RESULTS: Transcriptome and methylation analyses identified variation in gene expression linked to GDM-associated DNA methylation in 408 genes in AEC and 159 genes in VEC, implying a direct functional link. Pathway analysis found that genes altered by exposure to GDM clustered to functions associated with 'cell morphology' and 'cellular movement' in healthy AEC and VEC. Further functional analysis demonstrated that GDM-exposed cells had altered actin organisation and barrier function. CONCLUSIONS/INTERPRETATION: Our data indicate that exposure to GDM programs atypical morphology and barrier function in fetoplacental endothelial cells by DNA methylation and gene expression change. The effects differ between AEC and VEC, indicating a stringent cell-specific sensitivity to adverse exposures associated with developmental programming in utero. 
DATA AVAILABILITY: DNA methylation and gene expression datasets generated and analysed during the current study are available at the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database ( http://www.ncbi.nlm.nih.gov/geo ) under accession numbers GSE106099 and GSE103552, respectively.


Subject(s)
Diabetes, Gestational/metabolism , Endothelial Cells/metabolism , Fetus/blood supply , Placenta/blood supply , DNA Methylation/genetics , Diabetes, Gestational/genetics , Epigenesis, Genetic/genetics , Female , Fetal Development/genetics , Humans , Pregnancy
10.
Oncotarget ; 7(45): 73309-73322, 2016 11 08.
Article in English | MEDLINE | ID: mdl-27689336

ABSTRACT

BACKGROUND: Steatohepatitis (SH)-associated liver carcinogenesis is an increasingly important issue in clinical medicine. SH is morphologically characterized by steatosis, hepatocyte injury, ballooning, hepatocytic cytoplasmic inclusions termed Mallory-Denk bodies (MDBs), inflammation and fibrosis. RESULTS: 17-20-month-old Krt18-/- and Krt18+/- mice, in contrast to wt mice, spontaneously developed liver lesions closely resembling the morphological spectrum of human SH as well as liver tumors. The pathologic alterations were more pronounced in Krt18-/- than in Krt18+/- mice. The frequency of liver tumors with male predominance was significantly higher in Krt18-/- compared to age-matched Krt18+/- and wt mice. Krt18-deficient tumors, in contrast to those of wt animals, displayed SH features and often pleomorphic morphology. aCGH analysis of tumors revealed chromosomal aberrations in Krt18-/- liver tumors, affecting loci of oncogenes and tumor suppressor genes. MATERIALS AND METHODS: Livers of 3-, 6-, 12- and 17-20-month-old wild type (wt), Krt18+/- and Krt18-/- (129P2/OlaHsd background) mice were analyzed by light and immunofluorescence microscopy as well as immunohistochemistry. Liver tumors arising in aged mice were analyzed by array comparative genomic hybridization (aCGH). CONCLUSIONS: Our findings show that K18 deficiency of hepatocytes leads to steatosis, increasing with age, and finally to SH. K18 deficiency and age promote liver tumor development in mice, frequently on the basis of chromosomal instability, resembling human HCC with stemness features.


Subject(s)
Fatty Liver/complications , Fatty Liver/genetics , Keratin-18/genetics , Liver Neoplasms/etiology , Animals , Cell Transformation, Neoplastic , Chromosome Aberrations , Comparative Genomic Hybridization , Disease Models, Animal , Genomics/methods , Immunohistochemistry , Keratin-18/deficiency , Liver Neoplasms/pathology , Male , Mice , Mice, Knockout , Phenotype
11.
PLoS One ; 11(9): e0161425, 2016.
Article in English | MEDLINE | ID: mdl-27584017

ABSTRACT

Bariatric surgery is currently one of the most effective treatments for obesity and leads to significant weight reduction, improved cardiovascular risk factors and overall survival in treated patients. To date, most studies focused on short-term effects of bariatric surgery on the metabolic profile and found high variation in the individual responses to surgery. The aim of this study was to identify relevant metabolic changes not only shortly after bariatric surgery (Roux-en-Y gastric bypass) but also up to one year after the intervention by using untargeted metabolomics. 132 serum samples were taken from 44 patients before surgery, after hospital discharge (1-3 weeks after surgery) and at a 1-year follow-up during a prospective study (NCT01271062) performed at two study centers (Austria and Switzerland). The samples included 24 patients with type 2 diabetes at baseline, 9 of whom had diabetes remission after one year. The samples were analyzed by using liquid chromatography coupled to high resolution mass spectrometry (LC-HRMS, HILIC-QExactive). Raw data was processed with XCMS and drift-corrected through quantile regression based on quality controls. 177 relevant metabolic features were selected through Random Forests and univariate testing and 36 metabolites were identified. Identified metabolites included trimethylamine-N-oxide, alanine, phenylalanine and indoxyl-sulfate, which are known markers for cardiovascular risk. In addition, we found a significant decrease in alanine after one year in the group of patients with diabetes remission relative to non-remission. Our analysis highlights the importance of assessing multiple points in time in subjects undergoing bariatric surgery to enable the identification of biomarkers for treatment response, cardiovascular benefit and diabetes remission.
Key findings include different trend patterns over time for various metabolites and demonstrate that short-term changes should not necessarily be used to identify important long-term effects of bariatric surgery.


Subject(s)
Gastric Bypass/methods , Metabolomics , Adult , Austria , Bariatric Surgery , Chromatography, High Pressure Liquid , Female , Humans , Male , Mass Spectrometry , Middle Aged , Switzerland
12.
Stat Appl Genet Mol Biol ; 14(3): 311-6, 2015 Jun.
Article in English | MEDLINE | ID: mdl-25968440

ABSTRACT

High-throughput sequencing techniques are increasingly affordable and produce massive amounts of data. Together with other high-throughput technologies, such as microarrays, there is an enormous amount of resources in databases. The collection of these valuable data has been routine for more than a decade. Despite different technologies, many experiments share the same goal. For instance, the aims of RNA-seq studies often coincide with those of differential gene expression experiments based on microarrays. As such, it would be logical to utilize all available data. However, there is a lack of biostatistical tools for the integration of results obtained from different technologies. Although diverse technological platforms produce different raw data, one commonality for experiments with the same goal is that all the outcomes can be transformed into a platform-independent data format - rankings - for the same set of items. Here we present the R package TopKLists, which allows for statistical inference on the lengths of informative (top-k) partial lists, for stochastic aggregation of full or partial lists, and for graphical exploration of the input and consolidated output. A graphical user interface has also been implemented for providing access to the underlying algorithms. To illustrate the applicability and usefulness of the package, we integrated microRNA data of non-small cell lung cancer across different measurement techniques and drew conclusions. The package can be obtained from CRAN under an LGPL-3 license.


Subject(s)
Genomics/methods , Software , Computational Biology/methods , Gene Expression Profiling/methods , MicroRNAs/genetics , Models, Statistical
13.
PLoS One ; 6(7): e21774, 2011.
Article in English | MEDLINE | ID: mdl-21755000

ABSTRACT

We describe the distribution of indoleamine 2,3-dioxygenase 1 (IDO1) in vascular endothelium of human first-trimester and term placenta. Expression of IDO1 protein on the fetal side of the interface extended from almost exclusively sub-trophoblastic capillaries in first-trimester placenta to a nearly general presence on villous vascular endothelia at term, including also most bigger vessels such as villous arteries and veins of stem villi and vessels of the chorionic plate. Umbilical cord vessels were generally negative for IDO1 protein. In the fetal part of the placenta positivity for IDO1 was restricted to vascular endothelium, which did not co-express HLA-DR. This finding paralleled detectability of IDO1 mRNA in first trimester and term tissue and a high increase in the kynurenine to tryptophan ratio in chorionic villous tissue from first trimester to term placenta. Endothelial cells isolated from the chorionic plate of term placenta expressed IDO1 mRNA in contrast to endothelial cells originating from human umbilical vein, iliac vein or aorta. In first trimester decidua we found endothelium of arteries rather than veins expressing IDO1, which was complementary to expression of HLA-DR. An estimation of IDO activity on the basis of the ratio of kynurenine and tryptophan in blood taken from vessels of the chorionic plate of term placenta indicated far higher values than those found in the peripheral blood of adults. Thus, a gradient of vascular endothelial IDO1 expression is present at both sides of the feto-maternal interface.


Subject(s)
Endothelium, Vascular/enzymology , Indoleamine-Pyrrole 2,3,-Dioxygenase/metabolism , Maternal-Fetal Exchange , Cell Separation , Chorion/cytology , Chorion/enzymology , Decidua/cytology , Decidua/enzymology , Endothelial Cells/cytology , Endothelial Cells/enzymology , Endothelium, Vascular/cytology , Epitopes/immunology , Female , Gene Expression Regulation, Enzymologic , HLA-DR Antigens , Humans , Immunohistochemistry , Indoleamine-Pyrrole 2,3,-Dioxygenase/genetics , Paraffin Embedding , Pregnancy , Pregnancy Trimester, First/metabolism , Protein Transport , RNA, Messenger/genetics , RNA, Messenger/metabolism , Tryptophan/metabolism
14.
PLoS One ; 6(3): e15086, 2011 Mar 08.
Article in English | MEDLINE | ID: mdl-21408197

ABSTRACT

The distribution of cells in stained tissue sections provides information that may be analyzed by means of morphometric computation. We developed an algorithm for automated analysis for the purpose of answering questions pertaining to the relative densities of wandering cells in the vicinity of comparatively immobile tissue structures such as vessels or tumors. As an example, we present the analysis of distribution of CD56-positive cells and of CXCR3-positive cells (relative densities of peri-vascular versus non-vascular cell populations) in relation to the endothelium of capillaries and venules of human parietal decidua tissue of first trimester pregnancy. In addition, the distribution of CD56-positive cells (mostly uterine NK cells) in relation to spiral arteries is analyzed. The image analysis is based on microphotographs of two-color immunohistological stainings. Discrete distances (10-50 µm) from the fixed structures were chosen for the purpose of defining the extent of neighborhood areas. For the sake of better comparison of cell distributions at different overall cell densities a model of random distribution of "cells" in relation to neighborhood areas and rest decidua of a specific sample was built. In the chosen instances, we found increased perivascular density of CD56-positive cells and of CXCR3-positive cells. In contrast, no accumulation of CD56-positive cells was found in the neighborhood of spiral arteries.


Subject(s)
Cell Movement , Decidua/cytology , Immunohistochemistry/methods , Automation , Blood Vessels/metabolism , CD56 Antigen/metabolism , Computer Simulation , Decidua/metabolism , Endothelium/metabolism , Female , Humans , Imaging, Three-Dimensional , Pregnancy , Pregnancy Trimester, First/metabolism , Receptors, CXCR3/metabolism , Tissue Distribution
15.
Bioinformatics ; 25(6): 703-13, 2009 Mar 15.
Article in English | MEDLINE | ID: mdl-19147666

ABSTRACT

MOTIVATION: Genome analysis has become one of the most important tools for understanding the complex process of cancerogenesis. With increasing resolution of CGH arrays, the demand for computationally efficient algorithms arises, which are effective in the detection of aberrations even in very noisy data. RESULTS: We developed a rather simple, non-parametric technique of high computational efficiency for CGH array analysis that adopts a median absolute deviation concept for breakpoint detection, comprising median smoothing for pre-processing. The resulting algorithm has the potential to outperform any single smoothing approach as well as several recently proposed segmentation techniques. We show its performance through the application of simulated and real datasets in comparison to three other methods for array CGH analysis. IMPLEMENTATION: Our approach is implemented in the R-language and environment for statistical computing (version 2.6.1 for Windows, R-project, 2007). The code is available at: http://www.iba.muni.cz/~budinska/msmad.html. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Comparative Genomic Hybridization/methods , Computational Biology/methods , Algorithms , Oligonucleotide Array Sequence Analysis/methods , Programming Languages
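To make the combination of median smoothing and a median absolute deviation (MAD) criterion in the abstract above more concrete, here is a simplified sketch. It is not the published algorithm: the window size, the threshold of 3 and the rule of comparing jumps in the smoothed profile against the MAD of the residuals are all assumptions chosen for illustration, as are the toy log-ratio values.

```python
from statistics import median

def median_smooth(values, half_window=1):
    """Running median; the window is truncated at the ends of the profile."""
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half_window)
        hi = min(len(values), i + half_window + 1)
        smoothed.append(median(values[lo:hi]))
    return smoothed

def mad(values):
    """Median absolute deviation from the median."""
    centre = median(values)
    return median([abs(v - centre) for v in values])

def find_breakpoints(values, half_window=1, threshold=3.0):
    """Indices where the smoothed profile jumps by more than threshold * noise MAD."""
    smoothed = median_smooth(values, half_window)
    noise = mad([v - s for v, s in zip(values, smoothed)])
    return [i + 1 for i in range(len(smoothed) - 1)
            if abs(smoothed[i + 1] - smoothed[i]) > threshold * noise]

# Hypothetical array CGH log-ratios with a copy-number gain starting at index 5.
log_ratios = [0.1, -0.1, 0.0, 0.1, -0.1, 1.0, 1.1, 0.9, 1.0, 1.1]
breakpoints = find_breakpoints(log_ratios)
```

Both the median and the MAD are robust to isolated outliers, which is why they are attractive for noisy profiles; the reported index marks the first probe of the new segment.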
16.
Cancer Res ; 66(7): 3401-8, 2006 Apr 01.
Article in English | MEDLINE | ID: mdl-16585161

ABSTRACT

Mutations leading to activation of the RAF-mitogen-activated protein kinase/extracellular signal-regulated (ERK) kinase (MEK)-ERK pathway are key events in the pathogenesis of human malignancies. In a screen of 82 acute myeloid leukemia (AML) samples, 45 (55%) showed activated ERK and thus were further analyzed for mutations in B-RAF and C-RAF. Two C-RAF germ-line mutations, S427G and I448V, were identified in patients with therapy-related AML in the absence of alterations in RAS and FLT3. Both exchanges were located within the kinase domain of C-RAF. In vitro and in vivo kinase assays revealed significantly increased activity for (S427G)C-RAF but not for (I448V)C-RAF. The involvement of the S427G C-RAF mutation in constitutive activation of ERK was further confirmed through demonstration of activating phosphorylations on C-RAF, MEK, and ERK in neoplastic cells, but not in nonneoplastic cells. Transformation and survival assays showed oncogenic and antiapoptotic properties for both mutations. Screening healthy individuals revealed a <1/400 frequency of these mutations and, in the case of I448V, inheritance was observed over three generations with another mutation carrier suffering from cancer. Taken together, these data are the first to relate C-RAF mutations to human malignancies. As both mutations are of germ-line origin, they might constitute a novel tumor-predisposing factor.


Subject(s)
Cell Transformation, Neoplastic/genetics , Germ-Line Mutation , Leukemia, Myeloid/genetics , Neoplasms, Second Primary/genetics , Proto-Oncogene Proteins c-raf/genetics , Acute Disease , Adult , Aged , Amino Acid Sequence , Animals , Apoptosis/genetics , Base Sequence , COS Cells , Chlorocebus aethiops , Extracellular Signal-Regulated MAP Kinases/metabolism , Gene Expression Regulation, Leukemic/genetics , Genes, ras , HL-60 Cells , Humans , Leukemia, Myeloid/enzymology , Leukemia, Myeloid/pathology , MAP Kinase Signaling System , Mice , Molecular Sequence Data , NIH 3T3 Cells , Neoplasms, Second Primary/enzymology , Neoplasms, Second Primary/pathology , Pedigree , Phosphorylation , Proto-Oncogene Proteins B-raf/genetics , Sequence Alignment , fms-Like Tyrosine Kinase 3/genetics
18.
Methods Inf Med ; 43(5): 439-44, 2004.
Article in English | MEDLINE | ID: mdl-15702197

ABSTRACT

OBJECTIVES: A typical bioinformatics task in microarray analysis is the classification of biological samples into two alternative categories. A procedure is needed which, based on the expression levels measured, allows us to compute the probability that a new sample belongs to a certain class. METHODS: For the purpose of classification the statistical approach of binary regression is considered. High dimensionality and at the same time small sample sizes make it a challenging task. Standard logit or probit regression fails because of condition problems and poor predictive performance. The concepts of frequentist and of Bayesian penalization for binary regression are introduced. A Bayesian interpretation of the penalized log-likelihood is given. Finally the role of cross-validation for regularization and feature selection is discussed. RESULTS: Penalization makes classical binary regression a suitable tool for microarray analysis. We illustrate penalized logit and Bayesian probit regression on a well-known data set and compare the obtained results, also with respect to published results from decision trees. CONCLUSIONS: The frequentist and the Bayesian penalization concept work equally well on the example data; however, some method-specific differences can be made out. Moreover the Bayesian approach yields a quantification (posterior probabilities) of the bias due to the constraining assumptions.


Subject(s)
Gene Expression Profiling , Oligonucleotide Array Sequence Analysis/classification , Bayes Theorem , Computational Biology , Oligonucleotide Array Sequence Analysis/statistics & numerical data
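The frequentist penalization mentioned in the abstract above can be sketched with a ridge (L2) penalty on the logit model, fitted by plain gradient descent. The abstract does not state which penalty or optimizer the authors used, so both are assumptions here, and the tiny data set (more features than samples) is hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_ridge_logit(X, y, penalty=1.0, lr=0.1, steps=500):
    """Minimise log-loss + (penalty / 2) * ||w||^2 by gradient descent."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(steps):
        grad_w = [penalty * wj for wj in w]  # gradient of the ridge term
        grad_b = 0.0                         # the intercept is not penalised
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(p):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(w, b, x):
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))

# Hypothetical expression data: 4 samples, 6 genes; only gene 0 is informative.
X = [[2.0, 0.1, -0.3, 0.5, 0.0, 0.2],
     [1.5, -0.2, 0.4, 0.1, 0.3, -0.1],
     [-1.8, 0.3, 0.1, -0.4, 0.2, 0.0],
     [-2.2, -0.1, -0.2, 0.3, -0.3, 0.1]]
y = [1, 1, 0, 0]
w, b = fit_ridge_logit(X, y)
p_high = predict_proba(w, b, [1.0, 0.0, 0.0, 0.0, 0.0, 0.0])
p_low = predict_proba(w, b, [-1.0, 0.0, 0.0, 0.0, 0.0, 0.0])
```

Without the penalty, these separable data would push the weights towards infinity; the ridge term keeps them finite, which is the conditioning problem the abstract refers to. In practice the penalty strength would be chosen by cross-validation.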
...