1.
J Biomed Inform ; 143: 104406, 2023 07.
Article in English | MEDLINE | ID: mdl-37257630

ABSTRACT

Multi-view clustering methods are essential for the stratification of patients into sub-groups of similar molecular characteristics. In recent years, a wide range of methods have been developed for this purpose. However, due to the high diversity of cancer-related data, a single method may not perform sufficiently well in all cases. We present Parea, a multi-view hierarchical ensemble clustering approach for disease subtype discovery. We demonstrate its performance on several machine learning benchmark datasets. We apply and validate our methodology on real-world multi-view patient data, comprising seven types of cancer. Parea outperforms the current state-of-the-art on six out of seven analysed cancer types. We have integrated the Parea method into our Python package Pyrea (https://github.com/mdbloice/Pyrea), which enables the effortless and flexible design of ensemble workflows while incorporating a wide range of fusion and clustering algorithms.


Subjects
Algorithms , Neoplasms , Humans , Cluster Analysis , Neoplasms/genetics , Machine Learning
2.
Stud Health Technol Inform ; 294: 137-138, 2022 May 25.
Article in English | MEDLINE | ID: mdl-35612038

ABSTRACT

Feature selection is a fundamental challenge in machine learning. In bioinformatics, for instance, it is essential when one wishes to detect biomarkers. Tree-based methods are predominantly used for this purpose. In this paper, we study the stability of the feature selection methods BORUTA, VITA, and RRF (regularized random forest). In particular, we investigate the feature ranking instability of the associated stochastic algorithms. To stabilize the feature ranks, we propose computing consensus values from multiple feature selection runs by applying rank aggregation techniques. Our results show that these consolidated features are more accurate and robust, which helps to make practical machine learning applications more trustworthy.
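
The consensus idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the mean-rank (Borda-style) aggregation, the function names and the simulated noisy runs are assumptions chosen for illustration. The simulated runs stand in for repeated runs of a stochastic selector such as BORUTA, VITA or RRF.

```python
import random
from statistics import mean


def rank_features(scores):
    """Return {feature: rank}, where rank 1 is the highest importance score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {feature: position + 1 for position, feature in enumerate(ordered)}


def aggregate_ranks(runs):
    """Mean-rank (Borda-style) consensus ordering over several runs."""
    features = list(runs[0])
    mean_rank = {f: mean(run[f] for run in runs) for f in features}
    # Ties in mean rank are broken alphabetically to keep the output stable.
    return sorted(features, key=lambda f: (mean_rank[f], f))


def noisy_selection_run(true_importance, rng, noise=0.3):
    """Hypothetical stand-in for one run of a stochastic feature selector."""
    noisy = {f: v + rng.gauss(0.0, noise) for f, v in true_importance.items()}
    return rank_features(noisy)


rng = random.Random(0)
# Hypothetical "true" importances; real values would come from the selector.
true_importance = {"geneA": 1.0, "geneB": 0.8, "geneC": 0.2, "geneD": 0.0}
runs = [noisy_selection_run(true_importance, rng) for _ in range(50)]
consensus = aggregate_ranks(runs)
print(consensus)
```

Single runs may disagree on the order of closely ranked features, while the aggregated ordering depends much less on any one random seed. Other aggregation rules (median rank, Markov-chain methods) could be swapped into `aggregate_ranks`.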


Subjects
Algorithms , Machine Learning , Biomarkers , Computational Biology/methods
3.
J Biomed Inform ; 113: 103636, 2021 01.
Article in English | MEDLINE | ID: mdl-33271342

ABSTRACT

Recent advances in multi-omics clustering methods enable a more fine-tuned separation of cancer patients into clinically relevant clusters. These advancements have the potential to provide a deeper understanding of cancer progression and may facilitate the treatment of cancer patients. Here, we present a simple hierarchical clustering and data fusion approach, named HC-fused, for the detection of disease subtypes. Unlike other methods, the proposed approach naturally reports on the individual contribution of each single-omic to the data fusion process. We perform multi-view simulations with disjoint and disjunct cluster elements across the views to highlight fundamentally different data integration behavior of various state-of-the-art methods. HC-fused combines the strengths of some recently published methods and shows superior performance on real-world cancer data from the TCGA (The Cancer Genome Atlas) database. An R implementation of our method is available on GitHub (pievos101/HC-fused).
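
The abstract does not spell out the HC-fused algorithm, so the following Python sketch shows only a generic stand-in for the idea of fusing views before hierarchical clustering. It assumes each view has already produced cluster labels, fuses the views by averaging binary co-membership matrices, and then runs a plain average-linkage agglomeration on the fused similarities. All names and the toy labels are hypothetical.

```python
def co_membership(labels):
    """Binary matrix: 1.0 where two samples share a cluster within one view."""
    n = len(labels)
    return [[1.0 if labels[i] == labels[j] else 0.0 for j in range(n)]
            for i in range(n)]


def fuse(matrices):
    """Element-wise mean of the per-view co-membership matrices."""
    n = len(matrices[0])
    return [[sum(m[i][j] for m in matrices) / len(matrices) for j in range(n)]
            for i in range(n)]


def average_linkage(similarity, n_clusters):
    """Repeatedly merge the pair of clusters with the highest mean similarity."""
    clusters = [[i] for i in range(len(similarity))]

    def sim(a, b):
        return sum(similarity[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        pairs = [(sim(clusters[a], clusters[b]), a, b)
                 for a in range(len(clusters))
                 for b in range(a + 1, len(clusters))]
        # On ties, max() keeps the first pair encountered.
        _, a, b = max(pairs, key=lambda p: p[0])
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return sorted(sorted(c) for c in clusters)


# Hypothetical cluster labels for six patients in three omics views.
view_labels = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1],
]
fused = fuse([co_membership(v) for v in view_labels])
print(average_linkage(fused, n_clusters=2))
```

In this toy input, patients 2 and 3 are assigned inconsistently across views, and the fused matrix resolves them by majority. Keeping the per-view matrices around is one simple way to inspect how much each view contributed to a given merge.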


Subjects
Algorithms , Neoplasms , Cluster Analysis , Databases, Factual , Humans , Neoplasms/genetics
4.
Mol Ecol Resour ; 20(6): 1597-1609, 2020 Nov.
Article in English | MEDLINE | ID: mdl-32639602

ABSTRACT

In recent years, genome-scan methods have been extensively used to detect local signatures of selection and introgression. Most of these methods are designed for either one case or the other, which may impair the study of combined cases. Here, we introduce a series of versatile genome-scan methods applicable to both cases, the detection of selection and of introgression. The proposed approaches are based on nonparametric k-nearest neighbour (kNN) techniques, while incorporating pairwise Fixation Index (FST) and pairwise nucleotide differences (dxy) as features. We benchmark our methods using a wide range of simulation scenarios, with varying parameters, such as recombination rates, population background histories, selection strengths, the proportion of introgression and the time of gene flow. We find that kNN-based methods perform remarkably well compared with the state-of-the-art. Finally, we demonstrate how to perform kNN-based genome scans on real-world genomic data using the population genomics R package PopGenome.
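
As a rough illustration of a kNN-based scan, the Python sketch below scores each genomic window by its mean distance to its k nearest neighbours in standardized (FST, dxy) space and reports the most isolated window. This is an assumed, simplified outlier score and not the method implemented in the R package; the window values and function names are hypothetical.

```python
import math
from statistics import mean, pstdev


def standardize(points):
    """Z-score each feature so FST and dxy contribute on the same scale."""
    columns = list(zip(*points))
    centres = [mean(c) for c in columns]
    spreads = [pstdev(c) or 1.0 for c in columns]  # guard constant columns
    return [tuple((v - m) / s for v, m, s in zip(p, centres, spreads))
            for p in points]


def knn_outlier_scores(points, k):
    """Mean Euclidean distance from each point to its k nearest neighbours."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


# Hypothetical per-window (FST, dxy) values; the last window is atypical.
windows = [(0.10, 0.010), (0.12, 0.011), (0.11, 0.009), (0.09, 0.010),
           (0.13, 0.012), (0.10, 0.011), (0.85, 0.002)]
scores = knn_outlier_scores(standardize(windows), k=3)
top = max(range(len(windows)), key=scores.__getitem__)
print("most isolated window:", top)
```

Windows with unusually high scores would be candidates for closer inspection. Whether such a window reflects selection or introgression depends on the direction of the FST and dxy shifts, which this sketch does not interpret.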


Subjects
Computer Simulation , Genome , Genomics , Models, Genetic , Gene Flow , Genetics, Population , Metagenomics , Polymorphism, Single Nucleotide , Selection, Genetic
6.
Nat Commun ; 10(1): 4666, 2019 10 11.
Article in English | MEDLINE | ID: mdl-31604930

ABSTRACT

Deregulation of transcription factors (TFs) is an important driver of tumorigenesis, but non-invasive assays for assessing transcription factor activity are lacking. Here we develop and validate a minimally invasive method for assessing TF activity based on cell-free DNA sequencing and nucleosome footprint analysis. We analyze whole genome sequencing data for >1,000 cell-free DNA samples from cancer patients and healthy controls using a bioinformatics pipeline developed by us that infers accessibility of TF binding sites from cell-free DNA fragmentation patterns. We observe patient-specific as well as tumor-specific patterns, including accurate prediction of tumor subtypes in prostate cancer, with important clinical implications for the management of patients. Furthermore, we show that cell-free DNA TF profiling is capable of detection of early-stage colorectal carcinomas. Our approach for mapping tumor-specific transcription factor binding in vivo based on blood samples makes a key part of the noncoding genome amenable to clinical analysis.


Subjects
Breast Neoplasms/genetics , Cell-Free Nucleic Acids/chemistry , Colonic Neoplasms/genetics , Prostatic Neoplasms/genetics , Transcription Factors/physiology , Binding Sites , Breast Neoplasms/blood , Breast Neoplasms/diagnosis , Colonic Neoplasms/blood , Colonic Neoplasms/diagnosis , Computational Biology , DNA Fragmentation , Early Detection of Cancer/methods , Female , Humans , Male , Nucleosomes/chemistry , Prostatic Neoplasms/blood , Prostatic Neoplasms/diagnosis
7.
Transl Oncol ; 12(2): 256-268, 2019 Feb.
Article in English | MEDLINE | ID: mdl-30439626

ABSTRACT

BACKGROUND & AIMS: Steatohepatitis (SH) and SH-associated hepatocellular carcinoma (HCC) are of considerable clinical significance. SH is morphologically characterized by steatosis, liver cell ballooning, cytoplasmic aggregates termed Mallory-Denk bodies (MDBs), inflammation, and fibrosis at late stage. Disturbance of the keratin cytoskeleton and aggregation of keratins (KRTs) are essential for MDB formation. METHODS: We analyzed livers of aged Krt18-/- mice, which in the majority of cases spontaneously developed SH-associated HCC independent of sex. Interestingly, the hepatic lipid profile in Krt18-/- mice, which accumulate KRT8, closely resembles human SH lipid profiles and shows that the excess of KRT8 over KRT18 determines the likelihood of developing SH-associated HCC linked with enhanced lipogenesis. RESULTS: We compared the genetic profile of Krt18-/- mice with 26 human hepatoma cell lines and with data sets of >300 patients with HCC; Krt18-/- gene signatures matched human HCC. Interestingly, a high KRT8/18 ratio is associated with an aggressive HCC phenotype. CONCLUSIONS: We can prove that intermediate filaments and their binding partners are tightly linked to hepatic lipid metabolism and to hepatocarcinogenesis. We suggest the KRT8/18 ratio as a novel biomarker for HCC.

8.
Diabetologia ; 61(11): 2398-2411, 2018 11.
Article in English | MEDLINE | ID: mdl-30091044

ABSTRACT

AIMS/HYPOTHESIS: An adverse intrauterine environment can result in permanent changes in the physiology of the offspring and predispose to diseases in adulthood. One such exposure, gestational diabetes mellitus (GDM), has been linked to development of metabolic disorders and cardiovascular disease in offspring. Epigenetic variation, including DNA methylation, is recognised as a leading mechanism underpinning fetal programming and we hypothesised that this plays a key role in fetoplacental endothelial dysfunction following exposure to GDM. Thus, we conducted a pilot epigenetic study to analyse concordant DNA methylation and gene expression changes in GDM-exposed fetoplacental endothelial cells. METHODS: Genome-wide methylation analysis of primary fetoplacental arterial endothelial cells (AEC) and venous endothelial cells (VEC) from healthy pregnancies and GDM-complicated pregnancies in parallel with transcriptome analysis identified methylation and expression changes. Most-affected pathways and functions were identified by Ingenuity Pathway Analysis and validated using functional assays. RESULTS: Transcriptome and methylation analyses identified variation in gene expression linked to GDM-associated DNA methylation in 408 genes in AEC and 159 genes in VEC, implying a direct functional link. Pathway analysis found that genes altered by exposure to GDM clustered to functions associated with 'cell morphology' and 'cellular movement' in healthy AEC and VEC. Further functional analysis demonstrated that GDM-exposed cells had altered actin organisation and barrier function. CONCLUSIONS/INTERPRETATION: Our data indicate that exposure to GDM programs atypical morphology and barrier function in fetoplacental endothelial cells by DNA methylation and gene expression change. The effects differ between AEC and VEC, indicating a stringent cell-specific sensitivity to adverse exposures associated with developmental programming in utero. 
DATA AVAILABILITY: DNA methylation and gene expression datasets generated and analysed during the current study are available at the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database (http://www.ncbi.nlm.nih.gov/geo) under accession numbers GSE106099 and GSE103552, respectively.


Subjects
Diabetes, Gestational/metabolism , Endothelial Cells/metabolism , Fetus/blood supply , Placenta/blood supply , DNA Methylation/genetics , Diabetes, Gestational/genetics , Epigenesis, Genetic/genetics , Female , Fetal Development/genetics , Humans , Pregnancy
10.
Oncotarget ; 7(45): 73309-73322, 2016 11 08.
Article in English | MEDLINE | ID: mdl-27689336

ABSTRACT

BACKGROUND: Steatohepatitis (SH)-associated liver carcinogenesis is an increasingly important issue in clinical medicine. SH is morphologically characterized by steatosis, hepatocyte injury, ballooning, hepatocytic cytoplasmic inclusions termed Mallory-Denk bodies (MDBs), inflammation and fibrosis. RESULTS: 17-20-month-old Krt18-/- and Krt18+/- mice, in contrast to wt mice, spontaneously developed liver lesions closely resembling the morphological spectrum of human SH as well as liver tumors. The pathologic alterations were more pronounced in Krt18-/- than in Krt18+/- mice. The frequency of liver tumors with male predominance was significantly higher in Krt18-/- compared to age-matched Krt18+/- and wt mice. Krt18-deficient tumors, in contrast to those in wt animals, displayed SH features and often pleomorphic morphology. aCGH analysis of tumors revealed chromosomal aberrations in Krt18-/- liver tumors, affecting loci of oncogenes and tumor suppressor genes. MATERIALS AND METHODS: Livers of 3-, 6-, 12- and 17-20-month-old wild type (wt), Krt18+/- and Krt18-/- (129P2/OlaHsd background) mice were analyzed by light and immunofluorescence microscopy as well as immunohistochemistry. Liver tumors arising in aged mice were analyzed by array comparative genomic hybridization (aCGH). CONCLUSIONS: Our findings show that K18 deficiency of hepatocytes leads to steatosis, increasing with age, and finally to SH. K18 deficiency and age promote liver tumor development in mice, frequently on the basis of chromosomal instability, resembling human HCC with stemness features.


Subjects
Fatty Liver/complications , Fatty Liver/genetics , Keratin-18/genetics , Liver Neoplasms/etiology , Animals , Cell Transformation, Neoplastic , Chromosome Aberrations , Comparative Genomic Hybridization , Disease Models, Animal , Genomics/methods , Immunohistochemistry , Keratin-18/deficiency , Liver Neoplasms/pathology , Male , Mice , Mice, Knockout , Phenotype
11.
PLoS One ; 11(9): e0161425, 2016.
Article in English | MEDLINE | ID: mdl-27584017

ABSTRACT

Bariatric surgery is currently one of the most effective treatments for obesity and leads to significant weight reduction, improved cardiovascular risk factors and overall survival in treated patients. To date, most studies focused on short-term effects of bariatric surgery on the metabolic profile and found high variation in the individual responses to surgery. The aim of this study was to identify relevant metabolic changes not only shortly after bariatric surgery (Roux-en-Y gastric bypass) but also up to one year after the intervention by using untargeted metabolomics. 132 serum samples were taken from 44 patients before surgery, after hospital discharge (1-3 weeks after surgery) and at a 1-year follow-up during a prospective study (NCT01271062) performed at two study centers (Austria and Switzerland). The samples included 24 patients with type 2 diabetes at baseline, of whom 9 showed diabetes remission after one year. The samples were analyzed by using liquid chromatography coupled to high resolution mass spectrometry (LC-HRMS, HILIC-QExactive). Raw data was processed with XCMS and drift-corrected through quantile regression based on quality controls. 177 relevant metabolic features were selected through Random Forests and univariate testing and 36 metabolites were identified. Identified metabolites included trimethylamine-N-oxide, alanine, phenylalanine and indoxyl-sulfate, which are known markers for cardiovascular risk. In addition, we found a significant decrease in alanine after one year in the group of patients with diabetes remission relative to non-remission. Our analysis highlights the importance of assessing multiple points in time in subjects undergoing bariatric surgery to enable the identification of biomarkers for treatment response, cardiovascular benefit and diabetes remission.
Key findings include different trend patterns over time for various metabolites and demonstrate that short-term changes should not necessarily be used to identify important long-term effects of bariatric surgery.


Subjects
Gastric Bypass/methods , Metabolomics , Adult , Austria , Bariatric Surgery , Chromatography, High Pressure Liquid , Female , Humans , Male , Mass Spectrometry , Middle Aged , Switzerland
12.
Stat Appl Genet Mol Biol ; 14(3): 311-6, 2015 Jun.
Article in English | MEDLINE | ID: mdl-25968440

ABSTRACT

High-throughput sequencing techniques are increasingly affordable and produce massive amounts of data. Together with other high-throughput technologies, such as microarrays, there is an enormous amount of resources in databases. The collection of these valuable data has been routine for more than a decade. Despite different technologies, many experiments share the same goal. For instance, the aims of RNA-seq studies often coincide with those of differential gene expression experiments based on microarrays. As such, it would be logical to utilize all available data. However, there is a lack of biostatistical tools for the integration of results obtained from different technologies. Although diverse technological platforms produce different raw data, one commonality for experiments with the same goal is that all the outcomes can be transformed into a platform-independent data format - rankings - for the same set of items. Here we present the R package TopKLists, which allows for statistical inference on the lengths of informative (top-k) partial lists, for stochastic aggregation of full or partial lists, and for graphical exploration of the input and consolidated output. A graphical user interface has also been implemented for providing access to the underlying algorithms. To illustrate the applicability and usefulness of the package, we integrated microRNA data of non-small cell lung cancer across different measurement techniques and drew conclusions. The package can be obtained from CRAN under an LGPL-3 license.


Subjects
Genomics/methods , Software , Computational Biology/methods , Gene Expression Profiling/methods , MicroRNAs/genetics , Models, Statistical
13.
PLoS One ; 6(7): e21774, 2011.
Article in English | MEDLINE | ID: mdl-21755000

ABSTRACT

We describe the distribution of indoleamine 2,3-dioxygenase 1 (IDO1) in vascular endothelium of human first-trimester and term placenta. Expression of IDO1 protein on the fetal side of the interface extended from almost exclusively sub-trophoblastic capillaries in first-trimester placenta to a nearly general presence on villous vascular endothelia at term, including also most larger vessels such as villous arteries and veins of stem villi and vessels of the chorionic plate. Umbilical cord vessels were generally negative for IDO1 protein. In the fetal part of the placenta positivity for IDO1 was restricted to vascular endothelium, which did not co-express HLA-DR. This finding paralleled detectability of IDO1 mRNA in first trimester and term tissue and a high increase in the kynurenine to tryptophan ratio in chorionic villous tissue from first trimester to term placenta. Endothelial cells isolated from the chorionic plate of term placenta expressed IDO1 mRNA in contrast to endothelial cells originating from human umbilical vein, iliac vein or aorta. In first trimester decidua we found endothelium of arteries rather than veins expressing IDO1, which was complementary to expression of HLA-DR. An estimation of IDO activity on the basis of the ratio of kynurenine and tryptophan in blood taken from vessels of the chorionic plate of term placenta indicated far higher values than those found in the peripheral blood of adults. Thus, a gradient of vascular endothelial IDO1 expression is present on both sides of the feto-maternal interface.


Subjects
Endothelium, Vascular/enzymology , Indoleamine-Pyrrole 2,3,-Dioxygenase/metabolism , Maternal-Fetal Exchange , Cell Separation , Chorion/cytology , Chorion/enzymology , Decidua/cytology , Decidua/enzymology , Endothelial Cells/cytology , Endothelial Cells/enzymology , Endothelium, Vascular/cytology , Epitopes/immunology , Female , Gene Expression Regulation, Enzymologic , HLA-DR Antigens , Humans , Immunohistochemistry , Indoleamine-Pyrrole 2,3,-Dioxygenase/genetics , Paraffin Embedding , Pregnancy , Pregnancy Trimester, First/metabolism , Protein Transport , RNA, Messenger/genetics , RNA, Messenger/metabolism , Tryptophan/metabolism
14.
PLoS One ; 6(3): e15086, 2011 Mar 08.
Article in English | MEDLINE | ID: mdl-21408197

ABSTRACT

The distribution of cells in stained tissue sections provides information that may be analyzed by means of morphometric computation. We developed an algorithm for automated analysis for the purpose of answering questions pertaining to the relative densities of wandering cells in the vicinity of comparatively immobile tissue structures such as vessels or tumors. As an example, we present the analysis of distribution of CD56-positive cells and of CXCR3-positive cells (relative densities of peri-vascular versus non-vascular cell populations) in relation to the endothelium of capillaries and venules of human parietal decidua tissue of first trimester pregnancy. In addition, the distribution of CD56-positive cells (mostly uterine NK cells) in relation to spiral arteries is analyzed. The image analysis is based on microphotographs of two-color immunohistological stainings. Discrete distances (10-50 µm) from the fixed structures were chosen for the purpose of defining the extent of neighborhood areas. For the sake of better comparison of cell distributions at different overall cell densities a model of random distribution of "cells" in relation to neighborhood areas and rest decidua of a specific sample was built. In the chosen instances, we found increased perivascular density of CD56-positive cells and of CXCR3-positive cells. In contrast, no accumulation of CD56-positive cells was found in the neighborhood of spiral arteries.


Subjects
Cell Movement , Decidua/cytology , Immunohistochemistry/methods , Automation , Blood Vessels/metabolism , CD56 Antigen/metabolism , Computer Simulation , Decidua/metabolism , Endothelium/metabolism , Female , Humans , Imaging, Three-Dimensional , Pregnancy , Pregnancy Trimester, First/metabolism , Receptors, CXCR3/metabolism , Tissue Distribution
15.
Bioinformatics ; 25(6): 703-13, 2009 Mar 15.
Article in English | MEDLINE | ID: mdl-19147666

ABSTRACT

MOTIVATION: Genome analysis has become one of the most important tools for understanding the complex process of cancerogenesis. With increasing resolution of CGH arrays, the demand for computationally efficient algorithms arises, which are effective in the detection of aberrations even in very noisy data. RESULTS: We developed a rather simple, non-parametric technique of high computational efficiency for CGH array analysis that adopts a median absolute deviation concept for breakpoint detection, comprising median smoothing for pre-processing. The resulting algorithm has the potential to outperform any single smoothing approach as well as several recently proposed segmentation techniques. We show its performance through the application of simulated and real datasets in comparison to three other methods for array CGH analysis. IMPLEMENTATION: Our approach is implemented in the R-language and environment for statistical computing (version 2.6.1 for Windows, R-project, 2007). The code is available at: http://www.iba.muni.cz/~budinska/msmad.html. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
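
To make the combination of median smoothing and a median-absolute-deviation (MAD) criterion concrete, here is a minimal Python sketch. It is an illustrative assumption rather than the published algorithm: the window sizes, the threshold and the rule of comparing medians of adjacent windows are hypothetical choices, and the log-ratio values are invented.

```python
from statistics import median


def running_median(values, half_width):
    """Median-smooth a sequence with a centred window, truncated at the ends."""
    n = len(values)
    return [median(values[max(0, i - half_width):min(n, i + half_width + 1)])
            for i in range(n)]


def mad(values):
    """Median absolute deviation around the median."""
    values = list(values)
    centre = median(values)
    return median(abs(v - centre) for v in values)


def find_breakpoints(values, half_width=2, window=4, threshold=3.0):
    """Flag positions where adjacent windows differ by more than threshold * MAD."""
    smooth = running_median(values, half_width)
    # Noise scale estimated from the residuals of the smoothing step.
    noise = mad(v - s for v, s in zip(values, smooth)) or 1e-9
    jumps = {}
    for i in range(window, len(smooth) - window + 1):
        left = median(smooth[i - window:i])
        right = median(smooth[i:i + window])
        if abs(right - left) > threshold * noise:
            jumps[i] = abs(right - left)
    # Keep only local maxima so each level shift is reported once.
    return [i for i in sorted(jumps)
            if jumps[i] >= jumps.get(i - 1, 0) and jumps[i] > jumps.get(i + 1, 0)]


# Hypothetical array CGH log-ratios with a copy-number gain starting at probe 10.
log_ratios = [0.0, 0.1, -0.1, 0.05, -0.05, 0.0, 0.1, -0.1, 0.05, -0.05,
              1.0, 1.1, 0.9, 1.05, 0.95, 1.0, 1.1, 0.9, 1.05, 0.95]
print(find_breakpoints(log_ratios))
```

Because both the smoothing and the noise estimate rely on medians, isolated outlier probes have little effect on either, which is the practical appeal of a MAD-based rule for noisy arrays.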


Subjects
Comparative Genomic Hybridization/methods , Computational Biology/methods , Algorithms , Oligonucleotide Array Sequence Analysis/methods , Programming Languages
16.
Cancer Res ; 66(7): 3401-8, 2006 Apr 01.
Article in English | MEDLINE | ID: mdl-16585161

ABSTRACT

Mutations leading to activation of the RAF-mitogen-activated protein kinase/extracellular signal-regulated (ERK) kinase (MEK)-ERK pathway are key events in the pathogenesis of human malignancies. In a screen of 82 acute myeloid leukemia (AML) samples, 45 (55%) showed activated ERK and thus were further analyzed for mutations in B-RAF and C-RAF. Two C-RAF germ-line mutations, S427G and I448V, were identified in patients with therapy-related AML in the absence of alterations in RAS and FLT3. Both exchanges were located within the kinase domain of C-RAF. In vitro and in vivo kinase assays revealed significantly increased activity for (S427G)C-RAF but not for (I448V)C-RAF. The involvement of the S427G C-RAF mutation in constitutive activation of ERK was further confirmed through demonstration of activating phosphorylations on C-RAF, MEK, and ERK in neoplastic cells, but not in nonneoplastic cells. Transformation and survival assays showed oncogenic and antiapoptotic properties for both mutations. Screening healthy individuals revealed a <1/400 frequency of these mutations and, in the case of I448V, inheritance was observed over three generations with another mutation carrier suffering from cancer. Taken together, these data are the first to relate C-RAF mutations to human malignancies. As both mutations are of germ-line origin, they might constitute a novel tumor-predisposing factor.


Subjects
Cell Transformation, Neoplastic/genetics , Germ-Line Mutation , Leukemia, Myeloid/genetics , Neoplasms, Second Primary/genetics , Proto-Oncogene Proteins c-raf/genetics , Acute Disease , Adult , Aged , Amino Acid Sequence , Animals , Apoptosis/genetics , Base Sequence , COS Cells , Chlorocebus aethiops , Extracellular Signal-Regulated MAP Kinases/metabolism , Gene Expression Regulation, Leukemic/genetics , Genes, ras , HL-60 Cells , Humans , Leukemia, Myeloid/enzymology , Leukemia, Myeloid/pathology , MAP Kinase Signaling System , Mice , Molecular Sequence Data , NIH 3T3 Cells , Neoplasms, Second Primary/enzymology , Neoplasms, Second Primary/pathology , Pedigree , Phosphorylation , Proto-Oncogene Proteins B-raf/genetics , Sequence Alignment , fms-Like Tyrosine Kinase 3/genetics
18.
Methods Inf Med ; 43(5): 439-44, 2004.
Article in English | MEDLINE | ID: mdl-15702197

ABSTRACT

OBJECTIVES: A typical bioinformatics task in microarray analysis is the classification of biological samples into two alternative categories. A procedure is needed which, based on the expression levels measured, allows us to compute the probability that a new sample belongs to a certain class. METHODS: For the purpose of classification the statistical approach of binary regression is considered. High dimensionality combined with small sample sizes makes it a challenging task. Standard logit or probit regression fails because of condition problems and poor predictive performance. The concepts of frequentist and of Bayesian penalization for binary regression are introduced. A Bayesian interpretation of the penalized log-likelihood is given. Finally, the role of cross-validation for regularization and feature selection is discussed. RESULTS: Penalization makes classical binary regression a suitable tool for microarray analysis. We illustrate penalized logit and Bayesian probit regression on a well-known data set and compare the obtained results, also with respect to published results from decision trees. CONCLUSIONS: The frequentist and the Bayesian penalization concepts work equally well on the example data; however, some method-specific differences can be made out. Moreover, the Bayesian approach yields a quantification (posterior probabilities) of the bias due to the constraining assumptions.
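
The frequentist side of this idea can be sketched as ridge-penalized logistic regression, shown below in plain Python. This is a minimal illustration under assumed settings: the quadratic penalty, the gradient-descent optimizer, the step size and the tiny data set with more features than samples are hypothetical choices and do not reproduce the paper's procedure or data.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_ridge_logit(X, y, penalty=1.0, learning_rate=0.1, iterations=2000):
    """Minimise the negative log-likelihood plus (penalty / 2) * ||w||^2.

    The intercept b is left unpenalised.
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(iterations):
        errors = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) - yi
                  for row, yi in zip(X, y)]
        grad_w = [sum(e * row[j] for e, row in zip(errors, X)) + penalty * w[j]
                  for j in range(p)]
        grad_b = sum(errors)
        w = [wj - learning_rate * gj / n for wj, gj in zip(w, grad_w)]
        b -= learning_rate * grad_b / n
    return w, b


def predict_proba(X, w, b):
    """Estimated probability that each sample belongs to class 1."""
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) for row in X]


# Hypothetical expression data: 6 samples, 8 genes (more features than samples).
X = [
    [1.2, 0.1, -0.3, 0.5, 0.0, 0.2, -0.1, 0.4],
    [0.9, -0.2, 0.4, -0.1, 0.3, -0.4, 0.2, 0.0],
    [1.1, 0.3, 0.0, 0.2, -0.2, 0.1, 0.3, -0.3],
    [-1.0, 0.2, 0.1, -0.4, 0.1, 0.0, -0.2, 0.3],
    [-1.3, -0.1, -0.2, 0.3, 0.2, 0.3, 0.1, -0.1],
    [-0.8, 0.0, 0.3, 0.1, -0.3, -0.2, 0.0, 0.2],
]
y = [1, 1, 1, 0, 0, 0]

w, b = fit_ridge_logit(X, y, penalty=1.0)
print([round(p, 2) for p in predict_proba(X, w, b)])
```

Without the penalty term, these classes are perfectly separable and the unpenalized likelihood has no finite maximizer, which is one form of the conditioning problem the abstract mentions. In practice the penalty strength would be chosen by cross-validation rather than fixed at 1.0.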


Subjects
Gene Expression Profiling , Oligonucleotide Array Sequence Analysis/classification , Bayes Theorem , Computational Biology , Oligonucleotide Array Sequence Analysis/statistics & numerical data