Results 1 - 20 of 124
1.
Stat Med ; 40(19): 4279-4293, 2021 08 30.
Article in English | MEDLINE | ID: mdl-33987868

ABSTRACT

Gaussian graphical models are usually estimated from unreplicated data. The data are, however, likely to comprise signal and noise. These two cannot be deconvoluted from unreplicated data, so the noise is pragmatically ignored in practice. We point out the consequences of this practice for the reconstruction of the conditional independence graph of the signal. Replicated data allow for the deconvolution of signal and noise and the reconstruction of the former's conditional independence graph. To this end, we present a penalized Expectation-Maximization algorithm. The penalty parameter is chosen to maximize the F-fold cross-validated log-likelihood. Sampling schemes of the folds from replicated data are discussed. By simulation we investigate the effect of replicates on the reconstruction of the signal's conditional independence graph. Moreover, we compare the proposed method to several obvious competitors. In an application we use data from oncogenomic studies with replicates to reconstruct the gene-gene interaction networks, operationalized as conditional independence graphs. This yields a realistic portrait of the effect of ignoring sources of variation other than sampling variation. In addition, it has implications for the reproducibility of inferred gene-gene interaction networks reported in the literature.


Subjects
Algorithms, Gene Regulatory Networks, Computer Simulation, Humans, Normal Distribution, Reproducibility of Results
2.
Eur J Clin Invest ; 49(7): e13121, 2019 Jul.
Article in English | MEDLINE | ID: mdl-31013351

ABSTRACT

BACKGROUND: Recently, it was shown that 12 weeks of lipopolysaccharide (LPS) administration to nonatherosclerotic mice induced thickening of the aortic heart valve (AV). Whether such effects may also occur even earlier is unknown. As most patients with AV stenosis also have atherosclerosis, we studied the short-term effect of LPS on the AVs in an atherosclerotic mouse model. METHODS: ApoE*3Leiden mice, on an atherogenic diet, were injected intraperitoneally with either LPS or phosphate buffered saline (PBS), and sacrificed 2 or 15 days later. AVs were assessed for size, fibrosis, glycosaminoglycans (GAGs), lipids, calcium deposits, iron deposits and inflammatory cells. RESULTS: LPS injection caused an increase in maximal leaflet thickness at 2 days (128.4 µm) compared to PBS-injected mice (67.8 µm; P = 0.007), whereas at 15 days this was not significantly different. LPS injection did not significantly affect average AV thickness on day 2 (37.8 µm), but did significantly increase average AV thickness at day 15 (41.6 µm; P = 0.038) compared to PBS-injected mice (31.7 and 32.3 µm respectively). LPS injection did not affect AV fibrosis, GAGs and lipid content. Furthermore, no calcium deposits were found. Iron deposits, indicative for valve haemorrhage, were observed in one AV of the PBS-injected group (a day 2 mouse; 9.1%) and in five AVs of the LPS-injected group (both day 2- and 15 mice; 29.4%). No significant differences in inflammatory cell infiltration were observed upon LPS injection. CONCLUSION: Short-term LPS apparently has the potential to increase AV thickening and haemorrhage. These results suggest that systemic inflammation can acutely compromise AV structure.


Subjects
Aortic Valve/pathology, Apolipoproteins E/metabolism, Endotoxins/toxicity, Lipopolysaccharides/toxicity, Analysis of Variance, Animals, Aortic Valve/drug effects, Atherosclerosis/chemically induced, Atherogenic Diet, Animal Disease Models, Endotoxins/administration & dosage, Female, Fibrosis/chemically induced, Lipid Metabolism/physiology, Lipopolysaccharides/administration & dosage, Mice, Serum Amyloid A Protein/metabolism, Vascular Remodeling/drug effects
3.
Biom J ; 61(2): 391-405, 2019 03.
Article in English | MEDLINE | ID: mdl-30136415

ABSTRACT

Time-course omics experiments enable the reconstruction of the dynamics of the cellular regulatory network. Here, we describe the means for this reconstruction and the downstream exploitation of the inferred network. It is assumed that one of the various vector-autoregressive (VAR) models presented here serves as a reasonably accurate description of the time-course omics data. The models are estimated through ridge penalized likelihood maximization, accompanied by functionality for the determination of optimal penalty parameters. Prior knowledge on the network topology is accommodated by the estimation procedures. Various routes that translate the fitted models into more tangible implications for the medical researcher are described. The network is inferred from the (nonsparse) ridge estimates through empirical Bayes probabilistic thresholding. The influence of a (trait of a) molecular entity at the current time on those at future time points is assessed by mutual information, impulse response analysis, and path decomposition of the covariance. The presented methodology is applied to the omics data from the p53 signaling pathway during HPV-induced cellular transformation. All methodology is implemented in the ragt2ridges package, freely available from the Comprehensive R Archive Network.


Subjects
Computational Biology, Statistical Models, Tumor Cell Line, Female, Humans, Papillomaviridae/physiology, Regression Analysis, Signal Transduction, Tumor Suppressor Protein p53/metabolism, Uterine Cervical Neoplasms/genetics, Uterine Cervical Neoplasms/pathology, Uterine Cervical Neoplasms/virology
4.
BMC Bioinformatics ; 19(1): 301, 2018 Aug 20.
Article in English | MEDLINE | ID: mdl-30126372

ABSTRACT

BACKGROUND: Reproducibility of hits from independent CRISPR or siRNA screens is poor. This is partly due to data normalization primarily addressing technical variability within independent screens, and not the technical differences between them. RESULTS: We present "rscreenorm", a method that standardizes the functional data ranges between screens using assay controls, and subsequently performs a piecewise-linear normalization to make data distributions across all screens comparable. In simulation studies, rscreenorm reduces false positives. Using two multiple-cell lines siRNA screens, rscreenorm increased reproducibility between 27 and 62% for hits, and up to 5-fold for non-hits. Using publicly available CRISPR-Cas screen data, application of commonly used median centering yields merely 34% of overlapping hits, in contrast with rscreenorm yielding 84% of overlapping hits. Furthermore, rscreenorm yielded at most 8% discordant results, whilst median-centering yielded as much as 55%. CONCLUSIONS: Rscreenorm yields more consistent results and keeps false positive rates under control, improving reproducibility of genetic screens data analysis from multiple cell lines.


Subjects
Clustered Regularly Interspaced Short Palindromic Repeats/genetics, Genetic Testing/methods, Genomics/methods, Small Interfering RNA/genetics, Humans, Reproducibility of Results
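
The control-based standardization mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the simplest possible form: each screen is rescaled so that the median of its negative controls maps to 0 and the median of its positive controls maps to 1. The function name `control_standardize` and the example values are hypothetical; this is not the rscreenorm package's code, and the subsequent piecewise-linear normalization step is not shown.

```python
from statistics import median


def control_standardize(values, neg_controls, pos_controls):
    """Rescale values so the negative-control median maps to 0 and the
    positive-control median maps to 1 (assumed form, for illustration only)."""
    lo = median(neg_controls)
    hi = median(pos_controls)
    if hi == lo:
        raise ValueError("control medians coincide; cannot rescale")
    return [(v - lo) / (hi - lo) for v in values]


# Two hypothetical screens measuring the same effects on different raw scales.
screen_a = control_standardize(
    [10.0, 6.0, 2.0], neg_controls=[9.0, 10.0, 11.0], pos_controls=[1.0, 2.0, 3.0]
)
screen_b = control_standardize(
    [100.0, 60.0, 20.0], neg_controls=[90.0, 100.0, 110.0], pos_controls=[10.0, 20.0, 30.0]
)
```

After rescaling, both hypothetical screens share a common functional range, which is the property a between-screen comparison of hits relies on.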
5.
Biom J ; 60(3): 547-563, 2018 05.
Article in English | MEDLINE | ID: mdl-29320604

ABSTRACT

Cross-sectional studies may shed light on the evolution of a disease like cancer through the comparison of patient traits among disease stages. This problem is especially challenging when a gene-gene interaction network needs to be reconstructed from omics data, and, in addition, the patients of each stage need not form a homogeneous group. Here, the problem is operationalized as the estimation of stage-wise mixtures of Gaussian graphical models (GGMs) from high-dimensional data. These mixtures are fitted by a (fused) ridge penalized EM algorithm. The fused ridge penalty shrinks GGMs of contiguous stages. The (fused) ridge penalty parameters are chosen through cross-validation. The proposed estimation procedures are shown to be consistent and their performance in other respects is studied in simulation. The down-stream exploitation of the fitted GGMs is outlined. In a data illustration the methodology is employed to identify gene-gene interaction network changes in the transition from normal to cancer prostate tissue.


Subjects
Computational Biology, Cross-Sectional Studies, Gene Regulatory Networks, Humans, Statistical Models, Normal Distribution
6.
Neuropediatrics ; 48(3): 152-160, 2017 Jun.
Article in English | MEDLINE | ID: mdl-28561206

ABSTRACT

4H (hypomyelination, hypodontia and hypogonadotropic hypogonadism) leukodystrophy (4H) is an autosomal recessive hypomyelinating white matter (WM) disorder with neurologic, dental, and endocrine abnormalities. The aim of this study was to develop and validate a magnetic resonance imaging (MRI) scoring system for 4H. A scoring system (0-54) was developed to quantify hypomyelination and atrophy of different brain regions. Pons diameter and bicaudate ratio were included as measures of cerebral and brainstem atrophy, and reference values were determined using controls. Five independent raters completed the scoring system in 40 brain MRI scans collected from 36 patients with genetically proven 4H. Interrater reliability (IRR) and correlations between MRI scores, age, gross motor function, gender, and mutated gene were assessed. IRR for total MRI severity was found to be excellent (intraclass correlation coefficient: 0.87; 95% confidence interval: 0.80-0.92) but varied between different items with some (e.g., myelination of the cerebellar WM) showing poor IRR. Atrophy increased with age in contrast to hypomyelination scores. MRI scores (global, hypomyelination, and atrophy scores) significantly correlated with clinical handicap (p < 0.01 for all three items) and differed between the different genotypes. Our 4H MRI scoring system reliably quantifies hypomyelination and atrophy in patients with 4H, and MRI scores reflect clinical disease severity.


Subjects
Anodontia/diagnostic imaging, Ataxia/diagnostic imaging, Brain/diagnostic imaging, Hypogonadism/diagnostic imaging, Leukoencephalopathies/diagnostic imaging, Magnetic Resonance Imaging, Severity of Illness Index, Adolescent, Adult, Atrophy, Child, Preschool Child, Disability Evaluation, Female, Follow-Up Studies, Humans, Infant, Newborn Infant, Magnetic Resonance Imaging/methods, Male, Motor Activity, Myelin Sheath, Organ Size, Reproducibility of Results, Retrospective Studies, Young Adult
7.
Biom J ; 59(1): 172-191, 2017 Jan.
Article in English | MEDLINE | ID: mdl-27902843

ABSTRACT

Omics experiments endowed with a time-course design may enable us to uncover the dynamic interplay among genes of cellular processes. Multivariate techniques (like VAR(1) models describing the temporal and contemporaneous relations among variates) that may facilitate this goal are hampered by the high-dimensionality of the resulting data. This is resolved by the presented ridge regularized maximum likelihood estimation procedure for the VAR(1) model. Information on the absence of temporal and contemporaneous relations may be incorporated in this procedure. Its computationally efficient implementation is discussed. The estimation procedure is accompanied by an LOOCV scheme to determine the associated penalty parameters. Downstream exploitation of the estimated VAR(1) model is outlined: an empirical Bayes procedure to identify the interesting temporal and contemporaneous relationships, impulse response analysis, mutual information analysis, and covariance decomposition into the (graphical) relations among variates. In a simulation study the presented ridge estimation procedure outperformed a sparse competitor in terms of Frobenius loss of the estimates, while their selection properties are on par. The proposed machinery is illustrated in the reconstruction of the p53 signaling pathway during HPV-induced cellular transformation. The methodology is implemented in the ragt2ridges R-package available from CRAN.


Subjects
Computational Biology/methods, Statistical Models, Bayes Theorem, Computer Simulation, Humans, Likelihood Functions, Software, Time Factors
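
To give a feel for how a ridge penalty acts on an autoregressive coefficient, the sketch below treats the simplest case: a single variate following y[t] = a * y[t-1] + noise, estimated by penalized least squares. This is a deliberate simplification and an assumption of this note, not the multivariate ridge maximum likelihood procedure of the paper and not the ragt2ridges API. The function name `ridge_ar1` and the example series are hypothetical.

```python
def ridge_ar1(series, penalty):
    """Ridge estimate of a in y[t] = a * y[t-1] + noise:
    a_hat = sum(y[t] * y[t-1]) / (sum(y[t-1]**2) + penalty)."""
    if penalty < 0:
        raise ValueError("penalty must be non-negative")
    prev, curr = series[:-1], series[1:]
    cross = sum(c * p for c, p in zip(curr, prev))
    energy = sum(p * p for p in prev)
    return cross / (energy + penalty)


# Hypothetical noise-free series generated with a = 0.5.
series = [8.0, 4.0, 2.0, 1.0, 0.5]
unpenalized = ridge_ar1(series, penalty=0.0)   # ordinary least squares
shrunken = ridge_ar1(series, penalty=10.0)     # pulled towards zero
```

With a zero penalty the estimate reduces to ordinary least squares; a larger penalty shrinks the coefficient towards zero, which is what keeps the estimate stable when there are few time points relative to the number of parameters.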
8.
Biom J ; 59(5): 932-947, 2017 Sep.
Article in English | MEDLINE | ID: mdl-28393396

ABSTRACT

Reconstruction of a high-dimensional network may benefit substantially from the inclusion of prior knowledge on the network topology. In the case of gene interaction networks such knowledge may come for instance from pathway repositories like KEGG, or be inferred from data of a pilot study. The Bayesian framework provides a natural means of including such prior knowledge. Based on a Bayesian Simultaneous Equation Model, we develop an appealing Empirical Bayes (EB) procedure that automatically assesses the agreement of the used prior knowledge with the data at hand. We use a variational Bayes method to approximate the posterior densities and compare its accuracy with that of a Gibbs sampling strategy. Our method is computationally fast, and can outperform known competitors. In a simulation study, we show that accurate prior data can greatly improve the reconstruction of the network, but need not harm the reconstruction if wrong. We demonstrate the benefits of the method in an analysis of gene expression data from GEO. In particular, the edges of the recovered network have superior reproducibility (compared to that of competitors) over resampled versions of the data.


Subjects
Biometry/methods, Statistical Models, Bayes Theorem, Computer Simulation, Gene Regulatory Networks, Pilot Projects, Reproducibility of Results
9.
Stat Med ; 35(3): 368-81, 2016 Feb 10.
Article in English | MEDLINE | ID: mdl-26365903

ABSTRACT

For many high-dimensional studies, additional information on the variables, like (genomic) annotation or external p-values, is available. In the context of binary and continuous prediction, we develop a method for adaptive group-regularized (logistic) ridge regression, which makes structural use of such 'co-data'. Here, 'groups' refer to a partition of the variables according to the co-data. We derive empirical Bayes estimates of group-specific penalties, which possess several nice properties: (i) They are analytical. (ii) They adapt to the informativeness of the co-data for the data at hand. (iii) Only one global penalty parameter requires tuning by cross-validation. In addition, the method allows use of multiple types of co-data at little extra computational effort. We show that the group-specific penalties may lead to a larger distinction between 'near-zero' and relatively large regression parameters, which facilitates post hoc variable selection. The method, termed GRridge, is implemented in an easy-to-use R-package. It is demonstrated on two cancer genomics studies, which both concern the discrimination of precancerous cervical lesions from normal cervix tissues using methylation microarray data. For both examples, GRridge clearly improves the predictive performances of ordinary logistic ridge regression and the group lasso. In addition, we show that for the second study, the relatively good predictive performance is maintained when selecting only 42 variables.


Subjects
Genetic Testing/statistics & numerical data, Precancerous Conditions/diagnosis, Research Design/statistics & numerical data, Uterine Cervical Neoplasms/diagnosis, Bayes Theorem, Computer Simulation, DNA Methylation/genetics, Female, Genetic Testing/methods, Humans, Logistic Models, Precancerous Conditions/genetics, Predictive Value of Tests, Reproducibility of Results, Research Design/standards, Uterine Cervical Neoplasms/genetics
10.
Stat Appl Genet Mol Biol ; 13(2): 141-58, 2014 Apr 01.
Article in English | MEDLINE | ID: mdl-24552967

ABSTRACT

Through integration of genomic data from multiple sources, we may obtain a more accurate and complete picture of the molecular mechanisms underlying tumorigenesis. We discuss the integration of DNA copy number and mRNA gene expression data from an observational integrative genomics study involving cancer patients. The two molecular levels involved are linked through the central dogma of molecular biology. DNA copy number aberrations abound in the cancer cell. Here we investigate how these aberrations affect gene expression levels within a pathway using observational integrative genomics data of cancer patients. In particular, we aim to identify differential edges between regulatory networks of two groups involving these molecular levels. Motivated by the rate equations, the regulatory mechanism between DNA copy number aberrations and gene expression levels within a pathway is modeled by a simultaneous-equations model, for the one- and two-group case. The latter facilitates the identification of differential interactions between the two groups. Model parameters are estimated by penalized least squares using the lasso (L1) penalty to obtain a sparse pathway topology. Simulations show that the inclusion of DNA copy number data benefits the discovery of gene-gene interactions. In addition, the simulations reveal that cis-effects tend to be over-estimated in a univariate (single gene) analysis. In the application to real data from integrative oncogenomic studies we show that inclusion of prior information on the regulatory network architecture benefits the reproducibility of all edges. Furthermore, analyses of the TP53 and TGFb signaling pathways between ER+ and ER- samples from an integrative genomics breast cancer study identify reproducible differential regulatory patterns that corroborate with existing literature.


Subjects
Breast Neoplasms/genetics, DNA Copy Number Variations/genetics, Neoplastic Gene Expression Regulation/genetics, Breast Neoplasms/pathology, Female, Gene Expression Profiling, Genomics, Humans, Theoretical Models
11.
Brain ; 137(Pt 4): 1019-29, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24566671

ABSTRACT

Leukoencephalopathy with brainstem and spinal cord involvement and lactate elevation is a disorder caused by recessive mutations in the gene DARS2, which encodes mitochondrial aspartyl-tRNA synthetase. Recent observations indicate that the phenotypic range of the disease is much wider than initially thought. Currently, no treatment is available. The aims of our study were (i) to explore a possible genotype-phenotype correlation; and (ii) to identify potential therapeutic agents that modulate the splice site mutations in intron 2 of DARS2, present in almost all patients. A cross-sectional observational study was performed in 78 patients with two DARS2 mutations in the Amsterdam and Helsinki databases up to December 2012. Clinical information was collected via questionnaires. An inventory was made of the DARS2 mutations in these patients and those previously published. An assay was developed to assess mitochondrial aspartyl-tRNA synthetase enzyme activity in cells. Using a fluorescence reporter system we screened for drugs that modulate DARS2 splicing. Clinical information of 66 patients was obtained. The clinical severity varied from infantile onset, rapidly fatal disease to adult onset, slow and mild disease. The most common phenotype was characterized by childhood onset and slow neurological deterioration. Full wheelchair dependency was rare and usually began in adulthood. In total, 60 different DARS2 mutations were identified, 13 of which have not been reported before. Except for 4 of 42 cases published by others, all patients were compound heterozygous. Ninety-four per cent of the patients had a splice site mutation in intron 2. The groups of patients sharing the same two mutations were too small for formal assessment of genotype-phenotype correlation. However, some combinations of mutations were consistently associated with a mild phenotype. The mitochondrial aspartyl-tRNA synthetase activity was strongly reduced in patient cells. Among the compounds screened, cantharidin was identified as the most potent modulator of DARS2 splicing. In conclusion, the phenotypic spectrum of leukoencephalopathy with brainstem and spinal cord involvement and lactate elevation is wide, but most often the disease has a relatively slow and mild course. The available evidence suggests that the genotype influences the phenotype, but because of the high number of private mutations, larger numbers of patients are necessary to confirm this. The activity of mitochondrial aspartyl-tRNA synthetase is significantly reduced in patient cells. A compound screen established a 'proof of principle' that the splice site mutation can be influenced. This finding is promising for future therapeutic strategies.


Subjects
Alternative Splicing/drug effects, Aspartate-tRNA Ligase/deficiency, Leukoencephalopathies/complications, Leukoencephalopathies/genetics, Mitochondrial Diseases/complications, Mitochondrial Diseases/genetics, Adolescent, Adult, Age of Onset, Aspartate-tRNA Ligase/genetics, Aspartate-tRNA Ligase/metabolism, Cantharidin/pharmacology, Child, Preschool Child, Cross-Sectional Studies, DNA Mutational Analysis, Disease Progression, Enzyme Inhibitors/pharmacology, Female, Genetic Association Studies, Humans, Infant, Leukoencephalopathies/drug therapy, Leukoencephalopathies/enzymology, Male, Middle Aged, Mitochondrial Diseases/drug therapy, Mitochondrial Diseases/enzymology, Mutation, Reverse Transcriptase Polymerase Chain Reaction, Young Adult
12.
Philos Trans A Math Phys Eng Sci ; 373(2034)2015 Feb 13.
Article in English | MEDLINE | ID: mdl-25548271

ABSTRACT

Transition patterns between different sleep stages are analysed in terms of probability distributions of symbolic sequences for young and old subjects with and without sleep disorder. Changes of these patterns due to ageing are compared with variations of transition probabilities due to sleep disorder.
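
As a rough illustration of the kind of quantity this abstract refers to, the sketch below estimates first-order transition probabilities from a symbolic sequence of sleep stages. The stage labels (W, N1, N2, N3, R), the function name `transition_probabilities`, and the example hypnogram are assumptions made for illustration; the paper's own symbolic encoding is not described in the abstract.

```python
from collections import Counter, defaultdict


def transition_probabilities(stages):
    """Estimate P(next stage | current stage) from a sequence of stage labels."""
    counts = defaultdict(Counter)
    for current, nxt in zip(stages, stages[1:]):
        counts[current][nxt] += 1
    probs = {}
    for current, nxt_counts in counts.items():
        total = sum(nxt_counts.values())
        probs[current] = {s: c / total for s, c in nxt_counts.items()}
    return probs


# Hypothetical hypnogram, one label per scoring epoch.
hypnogram = ["W", "N1", "N2", "N2", "N3", "N2", "R", "R", "W", "N1", "N2"]
probs = transition_probabilities(hypnogram)
```

Each row of `probs` is an empirical distribution over the next stage, so rows estimated for different subject groups can be compared directly.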

13.
Bull Math Biol ; 77(9): 1768-86, 2015 Sep.
Article in English | MEDLINE | ID: mdl-26376888

ABSTRACT

Many pathways are dysregulated in cancer. Dysregulation of the regulatory network results in less control of transcript levels in the cell. Hence, dysregulation is reflected in the heterogeneity of the transcriptome: the more dysregulated the pathway, the more the transcriptomic heterogeneity. We identify four scenarios for a transcriptomic heterogeneity increase (i.e., pathway dysregulation) in cancer: (1) activation of a molecular switch, (2) a structural change in a regulator, (3) a temporal change in a regulator, and (4) weakening of gene-gene interactions. These mechanisms are statistically motivated, explored in silico, and their plausibility to occur in vivo illustrated by means of oncogenomics data of breast cancer studies.


Subjects
Gene Regulatory Networks, Neoplasms/genetics, Breast Neoplasms/genetics, Computer Simulation, Genetic Epistasis, Female, Neoplastic Gene Expression Regulation, Humans, Mathematical Concepts, Genetic Models, Transcriptome
14.
BMC Bioinformatics ; 15: 236, 2014 Jul 08.
Article in English | MEDLINE | ID: mdl-25004928

ABSTRACT

BACKGROUND: A number of statistical models have been proposed for studying the association between gene expression and copy number data in integrated analysis. The next step is to compare association patterns between different groups of samples. RESULTS: We propose a method, named dSIM, to find differences in association between copy number and gene expression, when comparing two groups of samples. Firstly, we use ridge regression to correct for the baseline associations between copy number and gene expression. Secondly, the global test is applied to the corrected data in order to find differences in association patterns between two groups of samples. We show that dSIM detects differences even in small genomic regions in a simulation study. We also apply dSIM to two publicly available breast cancer datasets and identify chromosome arms where copy number-led gene expression regulation differs between estrogen receptor-positive and estrogen receptor-negative samples. In spite of differing genomic coverage, some selected arms are identified in both datasets. CONCLUSION: We developed a flexible and robust method for studying association differences between two groups of samples while integrating genomic data from different platforms. dSIM can be used with most types of microarray/sequencing data, including methylation and microRNA expression. The method is implemented in R and will be made part of the BioConductor package SIM.


Subjects
Computational Biology/methods, Gene Expression Profiling, Breast Neoplasms/genetics, Breast Neoplasms/metabolism, Female, Gene Dosage/genetics, Humans, Estrogen Receptors/metabolism
15.
BMC Bioinformatics ; 15: 327, 2014 Oct 02.
Article in English | MEDLINE | ID: mdl-25278371

ABSTRACT

BACKGROUND: To determine which changes in the host cell genome are crucial for cervical carcinogenesis, a longitudinal in vitro model system of HPV-transformed keratinocytes was profiled in a genome-wide manner. Four cell lines affected with either HPV16 or HPV18 were assayed at 8 sequential time points for gene expression (mRNA) and gene copy number (DNA) using high-resolution microarrays. Available methods for temporal differential expression analysis are not designed for integrative genomic studies. RESULTS: Here, we present a method that allows for the identification of differential gene expression associated with DNA copy number changes over time. The temporal variation in gene expression is described by a generalized linear mixed model employing low-rank thin-plate splines. Model parameters are estimated with an empirical Bayes procedure, which exploits integrated nested Laplace approximation for fast computation. Iteratively, posteriors of hyperparameters and model parameters are estimated. The empirical Bayes procedure shrinks multiple dispersion-related parameters. Shrinkage leads to more stable estimates of the model parameters, better control of false positives and improvement of reproducibility. In addition, to make estimates of the DNA copy number more stable, model parameters are also estimated in a multivariate way using triplets of features, imposing a spatial prior for the copy number effect. CONCLUSION: With the proposed method for analysis of time-course multilevel molecular data, more profound insight may be gained through the identification of temporal differential expression induced by DNA copy number abnormalities. In particular, in the analysis of an integrative oncogenomics study with a time-course set-up our method finds genes previously reported to be involved in cervical carcinogenesis. Furthermore, the proposed method yields improvements in sensitivity, specificity and reproducibility compared to existing methods. Finally, the proposed method is able to handle count (RNAseq) data from time course experiments as is shown on a real data set.


Subjects
Gene Dosage, Gene Expression Regulation, Genomics/methods, Host-Pathogen Interactions, Human Papillomavirus 16/physiology, Human Papillomavirus 18/physiology, Keratinocytes/virology, Bayes Theorem, Cell Line, Computer Simulation, DNA/genetics, Complementary DNA, Genome, Humans, Keratinocytes/metabolism, Genetic Models, Papillomavirus Infections/genetics
16.
Biostatistics ; 14(1): 113-28, 2013 Jan.
Article in English | MEDLINE | ID: mdl-22988280

ABSTRACT

Next generation sequencing is quickly replacing microarrays as a technique to probe different molecular levels of the cell, such as DNA or RNA. The technology provides higher resolution, while reducing bias. RNA sequencing results in counts of RNA strands. This type of data imposes new statistical challenges. We present a novel, generic approach to model and analyze such data. Our approach aims at large flexibility of the likelihood (count) model and the regression model alike. Hence, a variety of count models is supported, such as the popular NB model, which accounts for overdispersion. In addition, complex, non-balanced designs and random effects are accommodated. Like some other methods, our method provides shrinkage of dispersion-related parameters. However, we extend it by enabling joint shrinkage of parameters, including those for which inference is desired. We argue that this is essential for Bayesian multiplicity correction. Shrinkage is effectuated by empirically estimating priors. We discuss several parametric (mixture) and non-parametric priors and develop procedures to estimate (parameters of) those. Inference is provided by means of local and Bayesian false discovery rates. We illustrate our method on several simulations and two data sets, also to compare it with other methods. Model- and data-based simulations show substantial improvements in the sensitivity at the given specificity. The data motivate the use of the ZI-NB as a powerful alternative to the NB, which results in higher detection rates for low-count data. Finally, compared with other methods, the results on small sample subsets are more reproducible when validated on their large sample complements, illustrating the importance of the type of shrinkage.


Subjects
Bayes Theorem, Statistical Data Interpretation, Statistical Models, RNA/chemistry, RNA Sequence Analysis/methods, Base Sequence, Computer Simulation, Molecular Sequence Data, RNA/genetics
17.
Stat Appl Genet Mol Biol ; 12(2): 143-74, 2013 Apr 19.
Article in English | MEDLINE | ID: mdl-23735435

ABSTRACT

The process of occurrence of genomic aberrations over time in the genetic material of cancer cells reflects the progression of the cancer. Modern technologies like aCGH (array Comparative Genomic Hybridization) and MPS (Massive Parallel Sequencing) provide high-resolution measurements of DNA copy number aberrations, that reveal the full scale of genomic aberrations. A continuous time Markov chain model is proposed to describe the accumulation of aberrations over time. Time however is a latent variable (with the number of aberrations as a proxy). Integrating out time, yields the distribution of the observed DNA copy number data. The model parameters are estimated from high-dimensional DNA copy number data by means of penalized maximum pseudo- and likelihood and method of moments procedures. Having fitted the model, posterior time estimates of the advancement of each sample's cancer are obtained and the most likely locations of a sample's aberrations are predicted. The three estimation methods are compared in a simulation study. The paper closes with an application of the proposed methodology on cancer data.


Subjects
DNA Copy Number Variations, Genomics, Statistical Models, Neoplasms/genetics, Algorithms, Comparative Genomic Hybridization, Computational Biology/methods, Computer Simulation, High-Throughput Nucleotide Sequencing, Humans, Markov Chains, Time Factors
18.
Chaos ; 24(2): 024404, 2014 Jun.
Article in English | MEDLINE | ID: mdl-24985458

ABSTRACT

Many sleep centres try to perform a reduced portable test in order to decrease the number of overnight polysomnographies that are expensive, time-consuming, and disturbing. With some limitations, heart rate variability (HRV) has been useful in this task. The aim of this investigation was to evaluate whether adding symbolic dynamics variables to a logistic regression model integrating clinical and physical variables can improve the detection of subjects for further polysomnographies. To our knowledge, this is the first contribution that innovates in that strategy. A group of 133 patients had been referred to the sleep centre for suspected sleep apnea. Clinical assessment of the patients consisted of a sleep-related questionnaire and a physical examination. The clinical variables related to apnea and selected in the statistical model were age (p < 10^-3), neck circumference (p < 10^-3), score on a questionnaire scale intended to quantify daytime sleepiness (p < 10^-3), and intensity of snoring (p < 10^-3). The validation of this model demonstrated an increase in classification performance when a variable based on non-linear dynamics of HRV (p < 0.01) was used in addition to the other variables. For the diagnostic rule based only on clinical and physical variables, the corresponding area under the receiver operating characteristic (ROC) curve was 0.907 (95% confidence interval (CI) = 0.848, 0.967), (sensitivity 87.10% and specificity 80%). For the model including the average of a symbolic dynamic variable, the area under the ROC curve was increased to 0.941 (95% CI = 0.897, 0.985), (sensitivity 88.71% and specificity 82.86%). In conclusion, symbolic dynamics, coupled with significant clinical and physical variables, can help to prioritize polysomnographies in patients with a high probability of apnea. In addition, the processing of the HRV is a well-established, low-cost and robust technique.


Subjects
Heart Rate/physiology ; Sleep Apnea, Obstructive/diagnosis ; Sleep Apnea, Obstructive/physiopathology ; Adolescent ; Adult ; Aged ; Aged, 80 and over ; Confidence Intervals ; Databases as Topic ; Electrocardiography ; Female ; Humans ; Logistic Models ; Male ; Middle Aged ; Multivariate Analysis ; Nonlinear Dynamics ; ROC Curve ; Sleep Apnea, Obstructive/diagnostic imaging ; Surveys and Questionnaires ; Ultrasonography ; Young Adult
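
The abstract above summarizes classifier performance as sensitivity, specificity, and area under the ROC curve. As a reading aid, the following minimal sketch shows how these three quantities can be computed from predicted scores. The function names, the 0.5 threshold, and the toy labels and scores are all hypothetical illustrations, not data or code from the study.

```python
def sensitivity_specificity(labels, scores, threshold):
    """Return (sensitivity, specificity) when scores >= threshold count as positive."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation:
    the fraction of positive/negative pairs in which the positive case
    receives the higher score, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = apnea, 0 = no apnea; scores are model probabilities.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]

toy_sens, toy_spec = sensitivity_specificity(toy_labels, toy_scores, threshold=0.5)
toy_auc = roc_auc(toy_labels, toy_scores)
```

Sensitivity and specificity depend on the chosen threshold, whereas the area under the ROC curve summarizes performance over all thresholds, which is why the abstract reports both kinds of figure.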
19.
Brief Bioinform ; 12(1): 10-21, 2011 Jan.
Article in English | MEDLINE | ID: mdl-20172948

ABSTRACT

Analysis of DNA copy number profiles requires methods tailored to the specific nature of these data. The number of available data analysis methods has grown enormously in the last 5 years. We discuss the typical characteristics of DNA copy number data, as measured by microarray technology, and review the extensive literature on preprocessing methods such as segmentation and calling. Subsequently, the focus narrows to applications of DNA copy number in cancer, in particular several downstream analyses of multi-sample data sets, such as testing, clustering, and classification. Finally, we look ahead: what should we prepare for, and which methodology-related topics may deserve attention in the near future?


Subjects
Computational Biology/methods ; DNA Copy Number Variations ; Oligonucleotide Array Sequence Analysis/methods ; Comparative Genomic Hybridization ; Gene Expression Profiling/methods
20.
Front Cardiovasc Med ; 10: 1276321, 2023.
Article in English | MEDLINE | ID: mdl-38028437

ABSTRACT

Background: Myocarditis is a condition that can have severe adverse outcomes and lead to sudden cardiac death if it remains undetected. This study tested the capability of cardiac magnetic field mapping to detect patients with clinically suspected myocarditis. This could open the way for rapid, non-invasive, and cost-effective screening of suspected cases before a gold standard assessment via endomyocardial biopsy. Methods: Historical cardiac magnetic field maps (n = 97) and data from a state-of-the-art magnetocardiography device (n = 30) were analyzed using the Kullback-Leibler entropy (KLE) for dimensionality reduction and topological quantification. Linear discriminant analysis (LDA) was used to discriminate between patients with ongoing myocarditis and healthy controls. Results: The STT segment of a magnetocardiogram, i.e., the section between the end of the S wave and the end of the T wave, was best suited to discriminate between the two groups. Using a 250-ms excerpt from the onset of the STT segment, with the KLE quantifying the topology of the cardiac magnetic field map, gave a reliable classification between the myocarditis and control groups for both historical data (sensitivity: 0.83, specificity: 0.85, accuracy: 0.84) and recent data (sensitivity: 0.69, specificity: 0.88, accuracy: 0.80). Conclusion: The implementation based on KLE can reliably distinguish between clinically suspected myocarditis patients and healthy controls. We implemented an automated feature selection based on LDA to replace the observer-dependent manual thresholding used in previous studies.
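
The abstract does not state how the Kullback-Leibler entropy is evaluated on the field maps. As a generic illustration only, the sketch below computes the standard discrete Kullback-Leibler divergence between two non-negative vectors after normalizing each to sum to one. Treating a map as a flattened, normalized intensity vector is an assumption made here for illustration, and the function name and toy values are hypothetical; this is not the authors' pipeline.

```python
import math


def kl_divergence(p, q):
    """Discrete Kullback-Leibler divergence D(P || Q) in nats.

    Both inputs are non-negative sequences of equal length; each is
    normalized to sum to one before the comparison. Terms with p_i == 0
    contribute nothing; q_i == 0 where p_i > 0 makes the divergence
    undefined, so a ValueError is raised in that case.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    sum_p, sum_q = sum(p), sum(q)
    total = 0.0
    for p_i, q_i in zip(p, q):
        p_i, q_i = p_i / sum_p, q_i / sum_q
        if p_i > 0:
            if q_i == 0:
                raise ValueError("q must be positive wherever p is positive")
            total += p_i * math.log(p_i / q_i)
    return total


# Toy vectors (hypothetical): a reference pattern and a pattern to compare with it.
reference = [0.5, 0.5]
observed = [0.25, 0.75]
toy_kl = kl_divergence(reference, observed)
```

The divergence is zero when the two normalized vectors coincide and grows as they differ, which is what makes it usable as a single-number summary of how far one pattern departs from another.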
