Results 1 - 6 of 6
1.
Bioinformatics ; 40(9), 2024 09 02.
Article in English | MEDLINE | ID: mdl-39177091

ABSTRACT

MOTIVATION: Circulating cell-free DNA (cfDNA) is widely explored as a noninvasive biomarker for cancer screening and diagnosis. The ability to decode the cells of origin in cfDNA would provide biological insights into pathophysiological mechanisms, aiding in cancer characterization and directing clinical management and follow-up. RESULTS: We developed a DNA methylation signature-based deconvolution algorithm, MetDecode, for cancer tissue origin identification. We built a reference atlas exploiting de novo and published whole-genome methylation sequencing data for colorectal, breast, ovarian, and cervical cancer, and blood-cell-derived entities. MetDecode models the contributors absent in the atlas with methylation patterns learnt on-the-fly from the input cfDNA methylation profiles. In addition, our model accounts for the coverage of each marker region to alleviate potential sources of noise. In silico experiments showed a limit of detection down to 2.88% of tumor tissue contribution in cfDNA. MetDecode produced Pearson correlation coefficients above 0.95 and outperformed other methods in simulations (P < 0.001; t-test; one-sided). In plasma cfDNA profiles from cancer patients, MetDecode assigned the correct tissue-of-origin in 84.2% of cases. In conclusion, MetDecode can unravel alterations in the cfDNA pool components by accurately estimating the contribution of multiple tissues, even when supplied with an imperfect reference atlas. AVAILABILITY AND IMPLEMENTATION: MetDecode is available at https://github.com/JorisVermeeschLab/MetDecode.
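
The general idea of reference-based deconvolution with coverage weighting can be illustrated with a minimal sketch. This is not MetDecode's implementation: it is a generic coverage-weighted non-negative least-squares fit solved by projected gradient descent, and it does not model contributors missing from the atlas. The function name `deconvolve`, the three tissue labels, the atlas values, and the coverage counts are all made up for illustration.

```python
def deconvolve(atlas, observed, coverage, n_iter=5000):
    """Estimate non-negative tissue proportions by coverage-weighted least squares.

    atlas:    rows are marker regions, columns are tissues (methylation fractions)
    observed: methylation fraction of the cfDNA sample at each marker region
    coverage: read depth at each marker region, used as a confidence weight
    """
    n_markers, n_tissues = len(atlas), len(atlas[0])
    total = float(sum(coverage))
    weights = [c / total for c in coverage]

    # Safe step size: the trace of A^T W A bounds its largest eigenvalue.
    lipschitz = 2.0 * sum(weights[i] * a * a
                          for i in range(n_markers) for a in atlas[i])
    step = 1.0 / lipschitz

    x = [1.0 / n_tissues] * n_tissues  # start from a uniform mixture
    for _ in range(n_iter):
        residual = [sum(atlas[i][k] * x[k] for k in range(n_tissues)) - observed[i]
                    for i in range(n_markers)]
        grad = [2.0 * sum(weights[i] * atlas[i][k] * residual[i]
                          for i in range(n_markers))
                for k in range(n_tissues)]
        # Gradient step followed by projection onto the non-negative orthant.
        x = [max(0.0, x[k] - step * grad[k]) for k in range(n_tissues)]

    total_x = sum(x)
    return [v / total_x for v in x] if total_x > 0 else x


# Hypothetical toy atlas: 5 marker regions x 3 tissues.
TISSUES = ["blood", "colorectal", "breast"]
ATLAS = [
    [0.9, 0.1, 0.1],
    [0.1, 0.9, 0.1],
    [0.1, 0.1, 0.9],
    [0.8, 0.2, 0.1],
    [0.2, 0.1, 0.8],
]
TRUE_PROPORTIONS = [0.7, 0.2, 0.1]
COVERAGE = [30, 10, 50, 20, 40]
# Noise-free mixture of the atlas columns.
OBSERVED = [sum(a * p for a, p in zip(row, TRUE_PROPORTIONS)) for row in ATLAS]

estimate = deconvolve(ATLAS, OBSERVED, COVERAGE)
for tissue, value in zip(TISSUES, estimate):
    print(f"{tissue}: {value:.3f}")
```

In this sketch, markers with higher coverage contribute more to the fit, which is one simple way to down-weight noisy low-depth regions. The abstract does not specify how MetDecode itself incorporates coverage.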


Subjects
Algorithms , Tumor Biomarkers , Cell-Free Nucleic Acids , DNA Methylation , Neoplasms , Humans , Neoplasms/genetics , Cell-Free Nucleic Acids/blood , Tumor Biomarkers/blood
2.
Sci Rep ; 14(1): 13188, 2024 06 08.
Article in English | MEDLINE | ID: mdl-38851759

ABSTRACT

Genome interpretation (GI) encompasses the computational attempts to model the relationship between genotype and phenotype with the goal of understanding how the first leads to the second. While traditional approaches have focused on sub-problems such as predicting the effect of single nucleotide variants or finding genetic associations, recent advances in neural networks (NNs) have made it possible to develop end-to-end GI models that take genomic data as input and predict phenotypes as output. However, technical and modeling issues still need to be fixed for these models to be effective, including the widespread underdetermination of genomic datasets, which makes them unsuitable for training large, overfitting-prone NNs. Here we propose novel GI models to address this issue, exploring the use of two types of transfer learning approaches and proposing a novel Biologically Meaningful Sparse NN layer specifically designed for end-to-end GI. Our models predict the leaf and seed ionome in A. thaliana, obtaining results comparable to our previous over-parameterized model while reducing the number of parameters 8.8-fold. We also investigate how population stratification influences performance evaluation, highlighting how it leads to (1) an instance of Simpson's paradox, and (2) model generalization limitations.
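
How stratification can produce an instance of Simpson's paradox in performance evaluation can be shown with a small worked example. The numbers below are invented and are not from the study: two hypothetical subpopulations have different phenotype baselines, and within each one the predictions are anti-correlated with the observations, yet the pooled correlation looks strongly positive.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical subpopulations with different phenotype baselines.
POP_A = {"predicted": [1.0, 2.0, 3.0], "observed": [3.0, 2.0, 1.0]}
POP_B = {"predicted": [11.0, 12.0, 13.0], "observed": [13.0, 12.0, 11.0]}

r_a = pearson(POP_A["predicted"], POP_A["observed"])
r_b = pearson(POP_B["predicted"], POP_B["observed"])
r_pooled = pearson(POP_A["predicted"] + POP_B["predicted"],
                   POP_A["observed"] + POP_B["observed"])

print(f"within A: {r_a:.3f}, within B: {r_b:.3f}, pooled: {r_pooled:.3f}")
```

In this toy case the pooled score is driven entirely by the difference between subpopulations, so a model that only recognizes population membership would appear to perform well.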


Subjects
Arabidopsis , Plant Genome , Plant Leaves , Seeds , Arabidopsis/genetics , Plant Leaves/genetics , Plant Leaves/metabolism , Seeds/genetics , Seeds/metabolism , Computer Neural Networks , Genomics/methods , Phenotype , Genetic Models , Genotype
3.
Comput Struct Biotechnol J ; 23: 1773-1785, 2024 Dec.
Article in English | MEDLINE | ID: mdl-38689715

ABSTRACT

Magnesium (Mg)-based implants have emerged as a promising alternative for orthopedic applications, owing to their bioactive properties and biodegradability. As the implants degrade, Mg2+ ions are released, influencing all surrounding cell types, especially mesenchymal stem cells (MSCs). MSCs are vital for bone tissue regeneration; it is therefore essential to understand their molecular response to Mg2+ ions in order to maximize the potential of Mg-based biomaterials. In this study, we conducted a gene regulatory network (GRN) analysis to examine the molecular responses of MSCs to Mg2+ ions. We used time-series proteomics data collected at 11 time points across a 21-day period for the GRN construction. We studied the impact of Mg2+ ions on the resulting networks and identified the key proteins and protein interactions affected by the application of Mg2+ ions. Our analysis highlights MYL1, MDH2, GLS, and TRIM28 as the primary targets of Mg2+ ions in the response of MSCs during the 1-21 day phase. Our results also identify MDH2-MYL1, MDH2-RPS26, TRIM28-AK1, TRIM28-SOD2, and GLS-AK1 as the critical protein relationships affected by Mg2+ ions. By offering a comprehensive understanding of the regulatory role of Mg2+ ions on MSCs, our study contributes valuable insights into the molecular response of MSCs to Mg-based materials, thereby facilitating the development of innovative therapeutic strategies for orthopedic applications.

4.
Genome Biol ; 24(1): 224, 2023 10 05.
Article in English | MEDLINE | ID: mdl-37798735

ABSTRACT

BACKGROUND: Despite clear evidence of nonlinear interactions in the molecular architecture of polygenic diseases, linear models have so far appeared optimal in genotype-to-phenotype modeling. A key bottleneck for such modeling is that genetic data intrinsically suffers from underdetermination ([Formula: see text]). Millions of variants are present in each individual while the collection of large, homogeneous cohorts is hindered by phenotype incidence, sequencing cost, and batch effects. RESULTS: We demonstrate that when we provide enough training data and control the complexity of nonlinear models, a neural network outperforms additive approaches in whole exome sequencing-based inflammatory bowel disease case-control prediction. To do so, we propose a biologically meaningful sparsified neural network architecture, providing empirical evidence for positive and negative epistatic effects present in the inflammatory bowel disease pathogenesis. CONCLUSIONS: In this paper, we show that underdetermination is likely a major driver for the apparent optimality of additive modeling in clinical genetics today.
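
One way to control the complexity of a nonlinear model is to restrict which inputs each hidden unit may see. The sketch below illustrates that idea only and is not the architecture from the paper: it assumes, purely for illustration, that each hidden unit stands for a gene and receives only the variants annotated to it. The mask, weights, genotype vector, and function names are all hypothetical.

```python
import math


def masked_layer(inputs, weights, mask, biases):
    """Forward pass of a sparse layer: unit g only sees inputs where mask[g][v] is 1."""
    outputs = []
    for gene_weights, gene_mask, bias in zip(weights, mask, biases):
        z = bias + sum(w * x
                       for w, x, m in zip(gene_weights, inputs, gene_mask) if m)
        outputs.append(math.tanh(z))  # nonlinearity applied per gene unit
    return outputs


def count_parameters(mask):
    """Trainable parameters: one weight per allowed connection plus one bias per unit."""
    return sum(sum(row) for row in mask) + len(mask)


# Hypothetical annotation: variants 0-2 belong to gene 0, variants 3-5 to gene 1.
MASK = [
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
]
WEIGHTS = [
    [0.5, -0.2, 0.1, 0.9, 0.9, 0.9],  # last three entries are masked out
    [0.9, 0.9, 0.9, 0.3, 0.4, -0.6],  # first three entries are masked out
]
BIASES = [0.0, 0.1]
GENOTYPE = [1, 0, 2, 0, 1, 1]  # allele counts per variant

GENE_ACTIVATIONS = masked_layer(GENOTYPE, WEIGHTS, MASK, BIASES)
SPARSE_PARAMS = count_parameters(MASK)
DENSE_PARAMS = len(MASK) * len(MASK[0]) + len(MASK)
print(GENE_ACTIVATIONS, SPARSE_PARAMS, DENSE_PARAMS)
```

With many variants and few samples, cutting the number of free weights in this way reduces the capacity to overfit, which is the kind of complexity control the abstract refers to.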


Subjects
Inflammatory Bowel Diseases , Nonlinear Dynamics , Humans , Sample Size , Inflammatory Bowel Diseases/genetics , Computer Neural Networks , Phenotype
5.
Bioinformatics ; 38(10): 2802-2809, 2022 05 13.
Article in English | MEDLINE | ID: mdl-35561176

ABSTRACT

MOTIVATION: Transcriptional regulation mechanisms allow cells to adapt and respond to external stimuli by altering gene expression. The possible cell transcriptional states are determined by the underlying gene regulatory network (GRN), and reliably inferring such a network would be invaluable for understanding biological processes and disease progression. RESULTS: In this article, we present a novel method for the inference of GRNs, called PORTIA, which is based on robust precision matrix estimation, and we show that it compares favorably with state-of-the-art methods while being orders of magnitude faster. We extensively validated PORTIA using the DREAM and MERLIN+P datasets as benchmarks. In addition, we propose a novel scoring metric that builds on graph-theoretical concepts. AVAILABILITY AND IMPLEMENTATION: The code and instructions for data acquisition and full reproduction of our results are available at https://github.com/AntoinePassemiers/PORTIA-Manuscript. PORTIA is available on PyPI as a Python package (portia-grn). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
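
The intuition behind precision-matrix-based network inference can be sketched in a few lines. This is not PORTIA's algorithm and does not use the portia-grn package: it is a textbook partial-correlation estimate with simple shrinkage toward a scaled identity, and the function names, the shrinkage value, and the simulated three-gene chain are assumptions made for illustration.

```python
import math
import random


def covariance(samples):
    """Sample covariance matrix; rows are observations, columns are genes."""
    n, p = len(samples), len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(p)]
    return [[sum((row[i] - means[i]) * (row[j] - means[j]) for row in samples) / (n - 1)
             for j in range(p)] for i in range(p)]


def shrink(cov, lam):
    """Blend the covariance with a scaled identity to keep it well conditioned."""
    p = len(cov)
    mu = sum(cov[i][i] for i in range(p)) / p
    return [[(1 - lam) * cov[i][j] + (lam * mu if i == j else 0.0)
             for j in range(p)] for i in range(p)]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def partial_correlations(samples, lam=0.01):
    """Partial correlation of each gene pair given all other genes."""
    precision = invert(shrink(covariance(samples), lam))
    p = len(precision)
    return [[1.0 if i == j else
             -precision[i][j] / math.sqrt(precision[i][i] * precision[j][j])
             for j in range(p)] for i in range(p)]


# Simulated regulatory chain: gene 0 -> gene 1 -> gene 2 (no direct 0 -> 2 edge).
rng = random.Random(0)
SAMPLES = []
for _ in range(2000):
    a = rng.gauss(0.0, 1.0)
    b = 0.8 * a + 0.6 * rng.gauss(0.0, 1.0)
    c = 0.8 * b + 0.6 * rng.gauss(0.0, 1.0)
    SAMPLES.append([a, b, c])

SCORES = partial_correlations(SAMPLES)
for row in SCORES:
    print(" ".join(f"{v:+.2f}" for v in row))
```

Genes 0 and 2 are marginally correlated through gene 1, but their partial correlation is close to zero, which is why the precision matrix is better suited than the correlation matrix for separating direct from indirect associations.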


Subjects
Algorithms , Gene Regulatory Networks , Gene Expression Regulation
6.
BMC Biol ; 19(1): 3, 2021 01 13.
Article in English | MEDLINE | ID: mdl-33441128

ABSTRACT

BACKGROUND: Identifying variants that drive tumor progression (driver variants) and distinguishing these from variants that are a byproduct of the uncontrolled cell growth in cancer (passenger variants) is a crucial step for understanding tumorigenesis and precision oncology. Various bioinformatics methods have attempted to solve this complex task. RESULTS: In this study, we investigate the assumptions on which these methods are based, showing that the different definitions of driver and passenger variants influence the difficulty of the prediction task. More importantly, we prove that the data sets have a construction bias which prevents the machine learning (ML) methods from actually learning variant-level functional effects, despite their excellent performance. This effect results from the fact that in these data sets, the driver variants map to a few driver genes, while the passenger variants spread across thousands of genes, and thus just learning to recognize driver genes provides almost perfect predictions. CONCLUSIONS: To mitigate this issue, we propose a novel data set that minimizes this bias by ensuring that all genes covered by the data contain both driver and passenger variants. As a result, we show that the tested predictors experience a significant drop in performance, which should not be considered as poorer modeling, but rather as correcting unwarranted optimism. Finally, we propose a weighting procedure to completely eliminate the gene effects on such predictions, thus precisely evaluating the ability of predictors to model the functional effects of single variants, and we show that indeed this task is still open.
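
The construction bias described above can be reproduced with a toy experiment. The sketch below uses invented gene names and variant counts, not the study's data, and it does not implement the authors' weighting procedure. It fits a "gene lookup" baseline that ignores the variant itself and predicts from the gene label alone.

```python
from collections import defaultdict


def fit_gene_lookup(variants):
    """Label a gene as 'driver' if most of its training variants are drivers."""
    counts = defaultdict(lambda: [0, 0])  # gene -> [passenger count, driver count]
    for gene, is_driver in variants:
        counts[gene][int(is_driver)] += 1
    return {gene: c[1] > c[0] for gene, c in counts.items()}


def accuracy(model, variants):
    """Fraction of variants whose label matches the gene-level prediction."""
    correct = sum(model.get(gene, False) == is_driver for gene, is_driver in variants)
    return correct / len(variants)


# Biased data: drivers sit in two genes, passengers are spread over many genes.
biased_train = ([("DRV1", True)] * 50 + [("DRV2", True)] * 50
                + [(f"PAS{i}", False) for i in range(200)])
biased_test = ([("DRV1", True)] * 10 + [("DRV2", True)] * 10
               + [(f"PAS{i}", False) for i in range(200, 220)])

# Balanced data: every gene carries both driver and passenger variants.
balanced_train = [(f"GEN{i}", label) for i in range(20) for label in (True, True, False, False)]
balanced_test = [(f"GEN{i}", label) for i in range(20) for label in (True, False)]

BIASED_ACCURACY = accuracy(fit_gene_lookup(biased_train), biased_test)
BALANCED_ACCURACY = accuracy(fit_gene_lookup(balanced_train), balanced_test)
print(BIASED_ACCURACY, BALANCED_ACCURACY)
```

On the biased split the gene label alone is enough to classify every variant, while on the balanced split the same baseline is no better than chance, mirroring the drop in performance reported in the abstract.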


Subjects
Carcinogenesis/genetics , Disease Progression , Machine Learning , Medical Oncology/instrumentation , Neoplasms/genetics , Precision Medicine/instrumentation , Neoplasms/pathology