Results 1 - 6 of 6
1.
BMC Bioinformatics ; 24(1): 395, 2023 Oct 20.
Article in English | MEDLINE | ID: mdl-37864168

ABSTRACT

BACKGROUND: Transcription factors (TFs) play a crucial role in the regulation of gene transcription; alterations of their activity and of their binding to DNA areas are strongly involved in the onset and development of cancer and other diseases. For proper biomedical investigation, it is hence essential to correctly trace TF-dense DNA areas, which have multiple bindings of distinct factors, and to select DNA high occupancy target (HOT) zones, which show the highest accumulation of such bindings. Indeed, systematic and replicable analysis of HOT zones in a large variety of cells and tissues would allow further understanding of their characteristics and could clarify their functional role. RESULTS: Here, we propose, thoroughly explain and discuss a full computational procedure to study in depth the dense DNA areas of transcription factor accumulation and to identify HOT zones. This methodology, developed as a computationally efficient parametric algorithm implemented in an R/Bioconductor package, uses a systematic approach with two alternative methods to examine transcription factor bindings and provide comparative and fully reproducible assessments. It offers different resolutions by introducing three distinct types of accumulation, which can analyze DNA from single-base to region-oriented levels, and a moving window, which can estimate the influence of the neighborhood for each DNA base under examination. CONCLUSIONS: We quantitatively assessed the full procedure by using our implemented software package, named TFHAZ, in two example applications of biological interest, proving its full reliability and relevance.
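
The idea of base-level accumulation with a moving window can be illustrated with a minimal Python sketch. This is not the TFHAZ implementation (which is an R/Bioconductor package); the function name `base_accumulation`, the half-open region convention, and the window half-width `w` are assumptions made only for illustration. For every base, the sketch counts how many binding regions overlap the neighborhood of `w` bases on each side.

```python
from typing import List, Tuple

Region = Tuple[int, int]  # half-open interval [start, end) on one chromosome


def base_accumulation(regions: List[Region], length: int, w: int = 0) -> List[int]:
    """For each base i, count the regions overlapping the window [i - w, i + w].

    Hypothetical helper: a region [start, end) overlaps that window exactly
    when start - w <= i < end + w, so each region is widened by w on both
    sides and the counts are obtained with a difference array.
    """
    diff = [0] * (length + 1)
    for start, end in regions:
        lo = max(0, start - w)
        hi = min(length, end + w)
        if lo < hi:
            diff[lo] += 1
            diff[hi] -= 1
    accumulation, running = [], 0
    for i in range(length):
        running += diff[i]
        accumulation.append(running)
    return accumulation


if __name__ == "__main__":
    toy_regions = [(2, 5), (4, 8)]  # made-up binding regions
    print(base_accumulation(toy_regions, length=10, w=0))
    print(base_accumulation(toy_regions, length=10, w=1))
```

Under these assumptions, a larger `w` spreads each binding over its neighborhood, so bases near several regions receive higher counts; dense or HOT zones would then be selected by thresholding the resulting vector.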


Subjects
Gene Expression Regulation , Transcription Factors , Transcription Factors/metabolism , Reproducibility of Results , DNA/genetics , Protein Binding , Binding Sites/genetics
2.
J Biomed Inform ; 144: 104457, 2023 08.
Article in English | MEDLINE | ID: mdl-37488024

ABSTRACT

BACKGROUND AND OBJECTIVE: Many classification tasks in translational bioinformatics and genomics are characterized by the high dimensionality of potential features and unbalanced sample distribution among classes. This can affect classifier robustness and increase the risk of overfitting, curse of dimensionality and generalization leaks; furthermore and most importantly, this can prevent obtaining the adequate patient stratification required for precision medicine in facing complex diseases, like cancer. Setting up a feature selection strategy able to extract only proper predictive features by removing irrelevant, redundant, and noisy ones is crucial to achieving valuable results on the desired task. METHODS: We propose a new feature selection approach, called ReRa, based on supervised Relevance-Redundancy assessments. ReRa consists of a customized step of relevance-based filtering, to identify a reduced subset of meaningful features, followed by a supervised similarity-based procedure to minimize redundancy. This latter step innovatively uses a combination of global and class-specific similarity assessments to remove redundant features while preserving those differentiated across classes, even when these classes are strongly unbalanced. RESULTS: We compared ReRa with several existing feature selection methods to obtain feature spaces on which to perform breast cancer patient subtyping using several classifiers: we considered two use cases based on gene or transcript isoform expression. In the vast majority of the assessed scenarios, when using ReRa-selected feature spaces, performance was significantly higher than with simple feature filtering, LASSO regularization, or even MRmr, another Relevance-Redundancy method. The two use cases represent an insightful example of translational application, taking advantage of ReRa capabilities to investigate and enhance a clinically relevant patient stratification task, which could easily be applied to other cancer types and diseases as well. CONCLUSIONS: The ReRa approach has the potential to improve the performance of machine learning models used in an unbalanced classification scenario. Compared to another Relevance-Redundancy approach like MRmr, ReRa does not require tuning the number of preserved features, ensures efficiency and scalability over huge initial dimensionalities and allows re-evaluation of all previously selected features at each iteration of the redundancy assessment, to ultimately preserve only the most relevant and class-differentiated features.
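
The general relevance-then-redundancy pattern described in this abstract can be sketched in a few lines of Python. This is a generic illustration and not the ReRa algorithm: it omits the class-specific similarity assessment, and the relevance score (absolute Pearson correlation with a binary label), the function name `select_features`, and both thresholds are assumptions chosen only to keep the example small.

```python
from math import sqrt
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 when either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def select_features(
    features: Dict[str, List[float]],
    labels: List[int],
    min_relevance: float,
    max_similarity: float,
) -> List[str]:
    """Keep relevant features, then drop those too similar to a kept one."""
    # Step 1 (relevance filter): assumed score = |corr(feature, label)|.
    relevance = {name: abs(pearson(vals, labels)) for name, vals in features.items()}
    ranked = sorted(
        (name for name in features if relevance[name] >= min_relevance),
        key=lambda name: -relevance[name],
    )
    # Step 2 (redundancy filter): greedy pass from most to least relevant.
    kept: List[str] = []
    for name in ranked:
        if all(abs(pearson(features[name], features[k])) < max_similarity for k in kept):
            kept.append(name)
    return kept


if __name__ == "__main__":
    toy = {  # made-up expression values for four samples
        "a": [1.0, 2.0, 3.0, 4.0],
        "b": [2.0, 4.0, 6.0, 8.0],  # exact multiple of "a", hence redundant
        "c": [1.0, 0.0, 1.0, 0.0],  # uncorrelated with the label
        "d": [0.0, 0.0, 1.0, 0.0],
    }
    print(select_features(toy, [0, 0, 1, 1], min_relevance=0.5, max_similarity=0.9))
```

In this toy input, `c` fails the relevance filter and `b` is discarded as redundant with `a`, so the call returns `['a', 'd']`. Unlike this greedy sketch, the abstract states that ReRa re-evaluates previously selected features at each iteration and does not need a preset number of features.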


Subjects
Algorithms , Breast Neoplasms , Humans , Female , Computational Biology/methods , Genomics , Proteomics , Breast Neoplasms/diagnosis , Breast Neoplasms/genetics
3.
Genome Med ; 15(1): 37, 2023 05 15.
Article in English | MEDLINE | ID: mdl-37189167

ABSTRACT

BACKGROUND: Transcriptional classification has been used to stratify colorectal cancer (CRC) into molecular subtypes with distinct biological and clinical features. However, it is not clear whether such subtypes represent discrete, mutually exclusive entities or molecular/phenotypic states with potential overlap. Therefore, we focused on the CRC Intrinsic Subtype (CRIS) classifier and evaluated whether assigning multiple CRIS subtypes to the same sample provides additional clinically and biologically relevant information. METHODS: A multi-label version of the CRIS classifier (multiCRIS) was applied to newly generated RNA-seq profiles from 606 CRC patient-derived xenografts (PDXs), together with human CRC bulk and single-cell RNA-seq datasets. Biological and clinical associations of single- and multi-label CRIS were compared. Finally, a machine learning-based multi-label CRIS predictor (ML2CRIS) was developed for single-sample classification. RESULTS: Surprisingly, about half of the CRC cases could be significantly assigned to more than one CRIS subtype. Single-cell RNA-seq analysis revealed that multiple CRIS membership can be a consequence of the concomitant presence of cells of different CRIS class or, less frequently, of cells with hybrid phenotype. Multi-label assignments were found to improve prediction of CRC prognosis and response to treatment. Finally, the ML2CRIS classifier was validated for retaining the same biological and clinical associations also in the context of single-sample classification. CONCLUSIONS: These results show that CRIS subtypes retain their biological and clinical features even when concomitantly assigned to the same CRC sample. This approach could be potentially extended to other cancer types and classification systems.
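
The difference between single-label and multi-label subtype calls can be shown with a small Python sketch. It is not the multiCRIS or ML2CRIS code: the per-class p-values, the significance threshold `alpha`, and both function names are assumptions for illustration, on the premise that each sample receives one assignment p-value per CRIS class.

```python
from typing import Dict, List, Optional


def single_label_call(p_values: Dict[str, float], alpha: float = 0.05) -> Optional[str]:
    """Return only the most significant class, or None if none passes alpha."""
    best = min(p_values, key=lambda name: p_values[name])
    return best if p_values[best] < alpha else None


def multi_label_calls(p_values: Dict[str, float], alpha: float = 0.05) -> List[str]:
    """Return every class whose (assumed) assignment p-value is below alpha."""
    return sorted(name for name, p in p_values.items() if p < alpha)


if __name__ == "__main__":
    # Made-up p-values for one sample, for illustration only.
    sample = {"CRIS-A": 0.01, "CRIS-B": 0.50, "CRIS-C": 0.03, "CRIS-D": 0.70, "CRIS-E": 0.20}
    print(single_label_call(sample))
    print(multi_label_calls(sample))
```

For this made-up sample the single-label call is `CRIS-A`, while the multi-label call keeps both `CRIS-A` and `CRIS-C`, which is the kind of additional membership the abstract reports for about half of the cases.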


Subjects
Colorectal Neoplasms , Animals , Humans , Colorectal Neoplasms/pathology , Prognosis , Disease Models, Animal , Biomarkers, Tumor/genetics
4.
Article in English | MEDLINE | ID: mdl-33270566

ABSTRACT

Breast cancer comprises multiple subtypes implicated in prognosis. Existing stratification methods rely on the expression quantification of small gene sets. Next Generation Sequencing promises large amounts of omic data in the coming years. In this scenario, we explore the potential of machine learning and, particularly, deep learning for breast cancer subtyping. Due to the paucity of publicly available data, we leverage pan-cancer and non-cancer data to design semi-supervised settings. We make use of multi-omic data, including microRNA expressions and copy number alterations, and we provide an in-depth investigation of several supervised and semi-supervised architectures. The accuracy results show that simpler models perform at least as well as the deep semi-supervised approaches on our task over gene expression data. When multi-omic data types are combined, the performance of deep models shows little (if any) improvement in accuracy, indicating the need for further analysis on larger datasets of multi-omic data as and when they become available. From a biological perspective, our linear model mostly confirms known gene-subtype annotations. Conversely, deep approaches model non-linear relationships, which is reflected in a more varied and still unexplored set of representative omic features that may prove useful for breast cancer subtyping.


Subjects
Breast Neoplasms , Deep Learning , Breast Neoplasms/genetics , DNA Copy Number Variations , Female , Humans , Machine Learning , Supervised Machine Learning
5.
Sci Rep ; 10(1): 14071, 2020 08 21.
Article in English | MEDLINE | ID: mdl-32826944

ABSTRACT

Stratification of breast cancer (BC) into molecular subtypes by multigene expression assays is of demonstrated clinical utility. In principle, global RNA-sequencing (RNA-seq) should enable reconstructing existing transcriptional classifications of BC samples. Yet, it is not clear whether adapting to RNA-seq the classifiers originally developed using PCR or microarrays, or reconstructing them through machine learning (ML), is preferable. Hence, we focused on the robustness and portability of PAM50, a nearest-centroid classifier developed on microarray data to identify five BC "intrinsic subtypes". We found that standard PAM50 is profoundly affected by the composition of the sample cohort used for reference construction, and we propose a strategy, named AWCA, to mitigate this issue, improving classification robustness, with over 90% concordance, and prognostic ability; we also show that AWCA-based PAM50 can even be applied as a single-sample method. Furthermore, we explored five supervised learners to build robust, single-sample intrinsic subtype callers via RNA-seq. In our ML-based survey, regularized multiclass logistic regression (mLR) displayed the best performance, further increased by ad hoc gene selection on the global transcriptome. On external test sets, mLR classifications reached 90% concordance with PAM50-based calls, without the need for a reference sample; the proven robustness and prognostic ability of mLR make it an equally valuable single-sample method to strengthen BC subtyping.
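
Why a nearest-centroid call can depend on the reference cohort is easy to see in a toy Python sketch. This is neither PAM50 nor AWCA: the two three-gene centroids, the per-gene median centering, and the use of Pearson correlation as the similarity measure are all assumptions made for illustration.

```python
from math import sqrt
from statistics import median
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 when either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def center(sample: List[float], reference: List[List[float]]) -> List[float]:
    """Subtract each gene's median computed over the reference cohort."""
    medians = [median(column) for column in zip(*reference)]
    return [value - m for value, m in zip(sample, medians)]


def nearest_centroid(sample: List[float], centroids: Dict[str, List[float]]) -> str:
    """Assign the subtype whose centroid correlates best with the sample."""
    return max(centroids, key=lambda name: pearson(sample, centroids[name]))


if __name__ == "__main__":
    # Hypothetical centroids over three genes; real PAM50 uses 50 genes.
    centroids = {"subtype_1": [1.0, -1.0, 0.0], "subtype_2": [-1.0, 1.0, 0.0]}
    sample = [5.0, 3.0, 4.0]
    cohort_x = [[4.0, 4.0, 4.0]] * 3  # made-up reference cohorts
    cohort_y = [[6.0, 2.0, 4.0]] * 3
    print(nearest_centroid(center(sample, cohort_x), centroids))
    print(nearest_centroid(center(sample, cohort_y), centroids))
```

The same sample is called `subtype_1` against `cohort_x` and `subtype_2` against `cohort_y`, because the gene-wise medians used for centering differ between the two cohorts. A single-sample classifier avoids this step by construction.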


Subjects
Breast Neoplasms/classification , Carcinoma/classification , Machine Learning , Sequence Analysis, RNA , Biomarkers, Tumor , Breast Neoplasms/chemistry , Breast Neoplasms/genetics , Carcinoma/chemistry , Carcinoma/genetics , Datasets as Topic , Estrogens , Female , Humans , Logistic Models , Neoplasms, Hormone-Dependent/chemistry , Neoplasms, Hormone-Dependent/genetics , Prognosis , Receptors, Estrogen/analysis , Recurrence
6.
Anticancer Res ; 40(6): 3355-3360, 2020 Jun.
Article in English | MEDLINE | ID: mdl-32487631

ABSTRACT

BACKGROUND/AIM: Proliferation biomarkers such as MIB-1 are strong predictors of clinical outcome and response to therapy in patients with non-small-cell lung cancer, but they require histological examination. In this work, we present a classification model to predict MIB-1 expression based on clinical parameters from positron emission tomography. PATIENTS AND METHODS: We retrospectively evaluated 78 patients with histology-proven non-small-cell lung cancer (NSCLC) who underwent 18F-FDG-PET/CT for clinical examination. We stratified the population into low- and high-proliferation groups using MIB-1=25% as the cut-off value. We built a predictive model based on binary classification trees to estimate the group label from the maximum standardized uptake value (SUVmax) and lesion diameter. RESULTS: The proposed model showed the ability to predict the correct proliferation group with overall accuracy >82% (78% and 86% for the low- and high-proliferation group, respectively). CONCLUSION: Our results indicate that radiotracer activity evaluated via SUVmax and lesion diameter are correlated with the tumour proliferation index MIB-1.
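
How a two-feature classification tree is applied, and how overall and per-group accuracy are computed, can be sketched in Python as follows. The tree structure, the function names, and the cut-off parameters are hypothetical: the abstract does not report the fitted splits, so the thresholds are left as arguments and the values in the usage lines are arbitrary placeholders.

```python
from typing import Dict, List


def toy_tree(suv_max: float, diameter_mm: float, suv_cut: float, diameter_cut: float) -> str:
    """Hypothetical depth-2 tree: split on SUVmax first, then on lesion diameter."""
    if suv_max >= suv_cut:
        return "high"
    return "high" if diameter_mm >= diameter_cut else "low"


def group_accuracy(y_true: List[str], y_pred: List[str]) -> Dict[str, float]:
    """Fraction of correct predictions within each true group, plus overall."""
    hits: Dict[str, int] = {}
    totals: Dict[str, int] = {}
    for truth, pred in zip(y_true, y_pred):
        totals[truth] = totals.get(truth, 0) + 1
        hits[truth] = hits.get(truth, 0) + int(truth == pred)
    accuracy = {group: hits[group] / totals[group] for group in totals}
    accuracy["overall"] = sum(hits.values()) / len(y_true)
    return accuracy


if __name__ == "__main__":
    # Placeholder cut-offs and made-up labels, for illustration only.
    print(toy_tree(10.0, 20.0, suv_cut=8.0, diameter_cut=30.0))
    print(group_accuracy(["low", "low", "high", "high"], ["low", "high", "high", "high"]))
```

Reporting accuracy per group, as the abstract does, shows whether a model favors one proliferation group, which a single overall figure would hide.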


Subjects
Carcinoma, Non-Small-Cell Lung/classification , Carcinoma, Non-Small-Cell Lung/diagnostic imaging , Fluorodeoxyglucose F18 , Ki-67 Antigen/biosynthesis , Lung Neoplasms/classification , Lung Neoplasms/diagnostic imaging , Carcinoma, Non-Small-Cell Lung/metabolism , Carcinoma, Non-Small-Cell Lung/pathology , Cell Proliferation/physiology , Female , Humans , Immunohistochemistry , Lung Neoplasms/metabolism , Lung Neoplasms/pathology , Male , Positron Emission Tomography Computed Tomography/methods , Radiopharmaceuticals , Retrospective Studies