Results 1 - 20 of 150
1.
Bioinform Adv ; 4(1): vbae048, 2024.
Article in English | MEDLINE | ID: mdl-38638280

ABSTRACT

Motivation: Cell-type deconvolution methods aim to infer cell composition from bulk transcriptomic data. The proliferation of methods, coupled with the inconsistent results obtained in many cases, highlights the pressing need for guidance in the selection of appropriate methods. Additionally, the growing accessibility of single-cell RNA sequencing datasets, often accompanied by bulk expression data from related samples, enables the benchmarking of existing methods. Results: In this study, we conduct a comprehensive assessment of 31 methods, utilizing single-cell RNA-sequencing data from diverse human and mouse tissues. Employing various simulation scenarios, we reveal the efficacy of regression-based deconvolution methods, highlighting their sensitivity to reference choices. We investigate the impact of bulk-reference differences, incorporating variables such as sample, study and technology. We provide validation using a gold standard dataset from mononuclear cells and suggest a consensus prediction of proportions when ground truth is not available. We validated the consensus method on data from the stomach and studied its spillover effect. Importantly, we propose the use of the critical assessment of transcriptomic deconvolution (CATD) pipeline, which encompasses functionalities for generating references and pseudo-bulks and running implemented deconvolution methods. CATD streamlines simultaneous deconvolution of numerous bulk samples, providing a practical solution for speeding up the evaluation of newly developed methods. Availability and implementation: https://github.com/Papatheodorou-Group/CATD_snakemake.
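
The pseudo-bulk idea mentioned in the abstract above can be illustrated with a minimal sketch: single cells with known labels are sampled and their expression vectors summed, so the resulting bulk profile comes with known ground-truth proportions. This is not the CATD implementation; the function name `make_pseudobulk`, the input layout and the sampling scheme are assumptions made purely for illustration.

```python
import random


def make_pseudobulk(cells, n_cells, seed=0):
    """Build one pseudo-bulk sample from labelled single cells.

    cells   : list of (cell_type, expression_vector) tuples (hypothetical layout)
    n_cells : number of cells to sample, with replacement
    Returns (bulk_vector, true_proportions).
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    n_genes = len(cells[0][1])
    bulk = [0.0] * n_genes
    counts = {}
    for _ in range(n_cells):
        cell_type, expr = rng.choice(cells)
        counts[cell_type] = counts.get(cell_type, 0) + 1
        # Sum expression gene by gene to mimic a bulk measurement.
        for g in range(n_genes):
            bulk[g] += expr[g]
    # Ground truth: fraction of sampled cells belonging to each type.
    proportions = {t: c / n_cells for t, c in counts.items()}
    return bulk, proportions
```

A deconvolution method can then be scored by comparing its predicted proportions for `bulk` against `proportions`.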

2.
Nucleic Acids Res ; 52(D1): D107-D114, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37992296

ABSTRACT

Expression Atlas (www.ebi.ac.uk/gxa) and its newest counterpart, the Single Cell Expression Atlas (www.ebi.ac.uk/gxa/sc), are EMBL-EBI's knowledgebases for gene and protein expression and localisation at bulk and single-cell level. These resources aim to allow users to investigate gene and protein expression in normal tissue (baseline) or in response to perturbations such as disease or changes to genotype (differential) across multiple species. Users are invited to search for genes or metadata terms across species or biological conditions in a standardised, consistent interface. Alongside these data, new features in Single Cell Expression Atlas allow users to query metadata through our new cell type wheel search. At the experiment level, data can be explored through two types of dimensionality reduction plots, t-distributed Stochastic Neighbor Embedding (tSNE) and Uniform Manifold Approximation and Projection (UMAP), overlaid with either clustering or metadata information to assist users' understanding. Data are also visualised as marker gene heatmaps identifying genes that help confer cluster identity. For some data, additional visualisations are available as interactive cell-level anatomograms and cell type gene expression heatmaps.


Subjects
Databases, Genetic; Gene Expression Profiling; Proteomics; Genotype; Metadata; Single-Cell Analysis; Internet; Humans; Animals
3.
Front Cell Dev Biol ; 11: 1297910, 2023.
Article in English | MEDLINE | ID: mdl-38020918

ABSTRACT

Melanoma is the deadliest form of skin cancer and develops from the melanocytes that are responsible for the pigmentation of the skin. The skin is also a highly regenerative organ, harboring a pool of undifferentiated melanocyte stem cells that proliferate and differentiate into mature melanocytes during regenerative processes in the adult. Melanoma and melanocyte regeneration share remarkable cellular features, including activation of cell proliferation and migration. Yet, melanoma differs considerably from regenerating melanocytes with respect to abnormal proliferation, invasive growth, and metastasis. Thus, it is likely that at the cellular level, melanoma resembles early stages of melanocyte regeneration with increased proliferation but separates from the later melanocyte regeneration stages due to reduced proliferation and enhanced differentiation. Here, by exploiting zebrafish melanocytes, which can efficiently regenerate and be induced to undergo malignant melanoma, we unravel the transcriptome profiles of regenerating melanocytes during early and late regeneration and of melanocytic nevi and malignant melanoma. Our global comparison of the gene expression profiles of melanocyte regeneration and nevi/melanoma uncovers the opposite regulation of a substantial number of genes related to Wnt signaling and transforming growth factor beta (TGF-β)/bone morphogenetic protein (BMP) signaling pathways between regeneration and cancer. Functional activation of canonical Wnt or TGF-β/BMP pathways during melanocyte regeneration promoted melanocyte regeneration but potently suppressed the invasiveness, migration, and proliferation of human melanoma cells in vitro and in vivo. Therefore, the opposite regulation of signaling mechanisms between melanocyte regeneration and melanoma can be exploited to stop tumor growth and develop new anti-cancer therapies.

4.
Nat Commun ; 14(1): 6495, 2023 10 14.
Article in English | MEDLINE | ID: mdl-37838716

ABSTRACT

The growing number of available single-cell gene expression datasets from different species creates opportunities to explore evolutionary relationships between cell types across species. Cross-species integration of single-cell RNA-sequencing data has been particularly informative in this context. However, in order to do so robustly it is essential to have rigorous benchmarking and appropriate guidelines to ensure that integration results truly reflect biology. Here, we benchmark 28 combinations of gene homology mapping methods and data integration algorithms in a variety of biological settings. We examine the capability of each strategy to perform species-mixing of known homologous cell types and to preserve biological heterogeneity using 9 established metrics. We also develop a new biology conservation metric to address the maintenance of cell type distinguishability. Overall, scANVI, scVI and SeuratV4 methods achieve a balance between species-mixing and biology conservation. For evolutionarily distant species, including in-paralogs is beneficial. SAMap outperforms when integrating whole-body atlases between species with challenging gene homology annotation. We provide our freely available cross-species integration and assessment pipeline to help analyse new data and develop new algorithms.


Subjects
Algorithms; Benchmarking; Molecular Sequence Annotation; Exome Sequencing; Sequence Analysis, RNA/methods; Single-Cell Analysis/methods
5.
Bioinformatics ; 39(10), 2023 10 03.
Article in English | MEDLINE | ID: mdl-37756700

ABSTRACT

MOTIVATION: The nuclear pore complex (NPC) is the only passageway for macromolecules between nucleus and cytoplasm, and an important reference standard in microscopy: it is massive and stereotypically arranged. The average architecture of NPC proteins has been resolved with pseudoatomic precision; however, observed NPC heterogeneities indicate a high degree of divergence from this average. Single-molecule localization microscopy (SMLM) images NPCs at protein-level resolution, after which image analysis software is used to study NPC variability. However, the true picture of this variability is unknown. In quantitative image analysis experiments, it is thus difficult to distinguish intrinsically high SMLM noise from variability of the underlying structure. RESULTS: We introduce CIR4MICS ('ceramics', Configurable, Irregular Rings FOR MICroscopy Simulations), a pipeline that synthesizes ground truth datasets of structurally variable NPCs based on architectural models of the true NPC. Users can select one or more N- or C-terminally tagged NPC proteins, and simulate a wide range of geometric variations. We also represent the NPC as a spring model such that arbitrary deforming forces, of user-defined magnitudes, simulate irregularly shaped variations. Further, we provide annotated reference datasets of simulated human NPCs, which facilitate a side-by-side comparison with real data. To demonstrate, we synthetically replicate a geometric analysis of real NPC radii and reveal that a range of simulated variability parameters can lead to the observed results. Our simulator is therefore valuable for testing the capabilities of image analysis methods, as well as for informing experimentalists about the requirements of hypothesis-driven imaging studies. AVAILABILITY AND IMPLEMENTATION: Code: https://github.com/uhlmanngroup/cir4mics. Simulated data: BioStudies S-BSST1058.


Subjects
Microscopy; Nuclear Pore; Humans; Nuclear Pore/chemistry; Nuclear Pore/metabolism; Nuclear Pore Complex Proteins/analysis; Nuclear Pore Complex Proteins/metabolism; Single Molecule Imaging/methods; Software
6.
Clin Cancer Res ; 29(7): 1220-1231, 2023 04 03.
Article in English | MEDLINE | ID: mdl-36815791

ABSTRACT

PURPOSE: Patients with resected localized clear-cell renal cell carcinoma (ccRCC) remain at variable risk of recurrence. Incorporation of biomarkers may refine risk prediction and inform adjuvant treatment decisions. We explored the role of tumor genomics in this setting, leveraging the largest cohort to date of localized ccRCC tissues subjected to targeted gene sequencing. EXPERIMENTAL DESIGN: The somatic mutation status of 12 genes was determined in 943 ccRCC cases from a multinational cohort of patients, and associations with outcomes were examined in a Discovery (n = 469) and Validation (n = 474) framework. RESULTS: Tumors containing a von Hippel-Lindau (VHL) mutation alone were associated with significantly improved outcomes in comparison with tumors containing a VHL mutation plus additional mutations. Within the Discovery cohort, those with VHL+0, VHL+1, VHL+2, and VHL+≥3 tumors had disease-free survival (DFS) rates of 90.8%, 80.1%, 68.2%, and 50.7%, respectively, at 5 years. This trend was replicated in the Validation cohort. Notably, these genomically defined groups were independent of tumor mutational burden. Amongst patients eligible for adjuvant therapy, those with a VHL+0 tumor (29%) had a 5-year DFS rate of 79.3% and could, therefore, potentially be spared further treatment. Conversely, patients with VHL+2 and VHL+≥3 tumors (32%) had equivalent DFS rates of 45.6% and 35.3%, respectively, and should be prioritized for adjuvant therapy. CONCLUSIONS: Genomic characterization of ccRCC identified biologically distinct groups of patients with divergent relapse rates. These groups account for the ∼80% of cases with VHL mutations and could be used to personalize adjuvant treatment discussions with patients as well as inform future adjuvant trial design.
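
The VHL+0 / VHL+1 / VHL+2 / VHL+≥3 grouping described in the abstract above amounts to counting the additional mutated panel genes in a VHL-mutated tumor. A minimal sketch of that rule follows; the function name `vhl_group` is hypothetical, and because the abstract does not list the 12 panel genes, the gene names in the usage note are placeholders rather than the study's actual panel.

```python
def vhl_group(mutated_genes):
    """Assign a tumor to a VHL+N group from its mutated panel genes.

    Returns None for tumors without a VHL mutation, which fall outside
    the groups described in the abstract.
    """
    genes = set(mutated_genes)
    if "VHL" not in genes:
        return None
    # Count mutated panel genes other than VHL itself.
    extra = len(genes - {"VHL"})
    if extra >= 3:
        return "VHL+≥3"
    return f"VHL+{extra}"
```

For example, `vhl_group(["VHL", "GENE_A", "GENE_B"])` gives `"VHL+2"`, where `GENE_A` and `GENE_B` stand in for any two other panel genes.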


Subjects
Carcinoma, Renal Cell; Kidney Neoplasms; Humans; Carcinoma, Renal Cell/genetics; Carcinoma, Renal Cell/therapy; Carcinoma, Renal Cell/metabolism; Kidney Neoplasms/genetics; Kidney Neoplasms/therapy; Kidney Neoplasms/metabolism; Von Hippel-Lindau Tumor Suppressor Protein/genetics; Neoplasm Recurrence, Local/genetics; Mutation
10.
J Mol Biol ; 434(11): 167505, 2022 06 15.
Article in English | MEDLINE | ID: mdl-35189131

ABSTRACT

Despite the huge impact of data resources in genomics and structural biology, until now there has been no central archive for biological data for all imaging modalities. The BioImage Archive is a new data resource at the European Bioinformatics Institute (EMBL-EBI) designed to fill this gap. In its initial development, the BioImage Archive accepts bioimaging data associated with publications, in any format, from any imaging modality from the molecular to the organism scale, excluding medical imaging. The BioImage Archive will ensure reproducibility of published studies that derive results from image data and reduce duplication of effort. Most importantly, the BioImage Archive will help scientists to generate new insights through reuse of existing data to answer new biological questions, and through provision of training, testing and benchmarking data for development of tools for image analysis. The archive is available at https://www.ebi.ac.uk/bioimage-archive/.


Subjects
Archives; Internet Use; Microscopy; Databases, Factual; Reproducibility of Results
11.
Front Cell Dev Biol ; 10: 813314, 2022.
Article in English | MEDLINE | ID: mdl-35223842

ABSTRACT

Gliomas are the most frequent type of brain cancer and are characterized by continuous proliferation, inflammation, angiogenesis, invasion and dedifferentiation, which are also among the initiating and sustaining factors of brain regeneration during restoration of tissue integrity and function. Thus, brain regeneration and brain cancer should share more molecular mechanisms at early stages of regeneration, where cell proliferation dominates. However, the mechanisms could diverge later when the regenerative response terminates, while cancer cells sustain proliferation. To test this hypothesis, we exploited the adult zebrafish, which, in contrast to mammals, can efficiently regenerate the brain in response to injury. By comparing transcriptome profiles of the regenerating zebrafish telencephalon at three different stages, i.e., 1 day post-lesion (dpl; early wound healing stage), 3 dpl (early proliferative stage) and 14 dpl (differentiation stage), to those of two brain cancers, i.e., low-grade glioma (LGG) and glioblastoma (GBM), we reveal the common and distinct molecular mechanisms of brain regeneration and brain cancer. While the transcriptomes of 1 dpl and 3 dpl harbor unique gene modules and gene expression profiles that are more divergent from the control, the transcriptome of 14 dpl converges to that of the control. Next, by functional comparison of the transcriptomes of the brain regeneration stages with those of LGG and GBM, we reveal the common and distinct molecular pathways in regeneration and cancer. 1 dpl resembles LGG and GBM with regard to signaling pathways related to metabolism and neurogenesis, while 3 dpl shares with LGG and GBM pathways that control cell proliferation and differentiation. On the other hand, 14 dpl converges with LGG and GBM with respect to developmental and morphogenetic processes. Finally, our global comparison of the gene expression profiles of the three brain regeneration stages, LGG and GBM shows that 1 dpl is the stage most similar to LGG and GBM, while 14 dpl is the stage most distant from both brain cancers. Therefore, early convergence and later divergence of brain regeneration and brain cancer constitute a key starting point for a comparative understanding of cellular and molecular events in the two phenomena and for the development of relevant targeted therapies for brain cancers.

12.
Nucleic Acids Res ; 50(D1): D543-D552, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34723319

ABSTRACT

The PRoteomics IDEntifications (PRIDE) database (https://www.ebi.ac.uk/pride/) is the world's largest data repository of mass spectrometry-based proteomics data. PRIDE is one of the founding members of the global ProteomeXchange (PX) consortium and an ELIXIR core data resource. In this manuscript, we summarize the developments in PRIDE resources and related tools since the previous update manuscript was published in Nucleic Acids Research in 2019. The number of datasets submitted to PRIDE Archive (the archival component of PRIDE) reached on average around 500 datasets per month during 2021. In addition to continuous improvements in PRIDE Archive data pipelines and infrastructure, the PRIDE Spectra Archive has been developed to provide direct access to the submitted mass spectra using Universal Spectrum Identifiers. As a key point, the file format MAGE-TAB for proteomics has been developed to enable the improvement of sample metadata annotation. Additionally, the resource PRIDE Peptidome provides access to aggregated peptide/protein evidence across PRIDE Archive. Furthermore, we describe how PRIDE has increased its efforts to reuse and disseminate high-quality proteomics data into other added-value resources such as UniProt, Ensembl and Expression Atlas.


Subjects
Databases, Protein; Metadata/statistics & numerical data; Molecular Sequence Annotation/statistics & numerical data; Peptides/chemistry; Proteins/chemistry; Software; Amino Acid Sequence; Bibliometrics; Datasets as Topic; Humans; Information Storage and Retrieval; Internet; Mass Spectrometry; Peptides/genetics; Peptides/metabolism; Proteins/genetics; Proteins/metabolism; Proteomics/instrumentation; Proteomics/methods; Sequence Alignment
13.
Nucleic Acids Res ; 50(D1): D129-D140, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34850121

ABSTRACT

The EMBL-EBI Expression Atlas is an added-value knowledge base that enables researchers to answer the question of where (tissue, organism part, developmental stage, cell type) and under which conditions (disease, treatment, gender, etc.) a gene or protein of interest is expressed. Expression Atlas brings together data from >4500 expression studies from >65 different species, across different conditions and tissues. It makes these data freely available in an easy-to-visualise form, after expert curation to accurately represent the intended experimental design and re-analysis via standardised pipelines that rely on open-source, community-developed tools. Each study's metadata are annotated using ontologies. The data are re-analysed with the aim of reproducing the original conclusions of the underlying experiments. Expression Atlas is currently divided into Bulk Expression Atlas and Single Cell Expression Atlas. Expression Atlas contains data from differential studies (microarray and bulk RNA-Seq) and baseline studies (bulk RNA-Seq and proteomics), whereas Single Cell Expression Atlas is currently dedicated to Single Cell RNA-Sequencing (scRNA-Seq) studies. The resource has been in continuous development since 2009 and is available at https://www.ebi.ac.uk/gxa.


Subjects
Databases, Genetic; Proteins/genetics; Proteomics; Software; Computational Biology; Gene Expression Profiling; Humans; Proteins/chemistry; RNA-Seq; Sequence Analysis, RNA; Single-Cell Analysis
14.
Nat Commun ; 12(1): 5854, 2021 10 06.
Article in English | MEDLINE | ID: mdl-34615866

ABSTRACT

The amount of public proteomics data is rapidly increasing but there is no standardized format to describe the sample metadata and their relationship with the dataset files in a way that fully supports their understanding or reanalysis. Here we propose to develop the transcriptomics data format MAGE-TAB into a standard representation for proteomics sample metadata. We implement MAGE-TAB-Proteomics in a crowdsourcing project to manually curate over 200 public datasets. We also describe tools and libraries to validate and submit sample metadata-related information to the PRIDE repository. We expect that these developments will improve the reproducibility and facilitate the reanalysis and integration of public proteomics datasets.


Subjects
Data Analysis; Databases, Protein; Metadata; Proteomics; Big Data; Humans; Reproducibility of Results; Software; Transcriptome
15.
Sci Rep ; 11(1): 20833, 2021 10 21.
Article in English | MEDLINE | ID: mdl-34675242

ABSTRACT

Several single-cell RNA sequencing (scRNA-seq) studies analyzing the immune response to COVID-19 infection have recently been published. Most of these studies have small sample sizes, which limits the conclusions that can be made with high confidence. By re-analyzing these data in a standardized manner, we validated 8 of the 20 published results across multiple datasets. In particular, we found a consistent decrease in T-cells with increasing COVID-19 infection severity, upregulation of type I interferon signaling pathways, and the presence of expanded B-cell clones in COVID-19 patients, but no consistent trend in T-cell clonal expansion. Overall, our results show that the conclusions drawn from scRNA-seq data analysis of small cohorts of COVID-19 patients need to be treated with some caution.


Subjects
Biomarkers/metabolism; COVID-19/immunology; COVID-19/metabolism; RNA, Small Cytoplasmic; Single-Cell Analysis; Bronchoalveolar Lavage Fluid; Computational Biology; Databases, Factual; Gene Expression Profiling/methods; Genome, Human; Genome, Viral; Humans; Immunity; Leukocytes, Mononuclear/cytology; RNA-Seq; Reproducibility of Results; SARS-CoV-2; Sequence Analysis, RNA/methods; Signal Transduction; Up-Regulation
17.
Sci Data ; 8(1): 115, 2021 04 23.
Article in English | MEDLINE | ID: mdl-33893311

ABSTRACT

Using 11 proteomics datasets, mostly available through the PRIDE database, we assembled a reference expression map for 191 cancer cell lines and 246 clinical tumour samples, across 13 lineages. We found unique peptides identified only in tumour samples despite a much higher coverage in cell lines. These were mainly mapped to proteins related to regulation of signalling receptor activity. Correlations between baseline expression in cell lines and tumours were calculated. We found these to be highly similar across all samples, with the most similarity found within a given sample type. Integration of proteomics and transcriptomics data showed the median correlation across cell lines to be 0.58 (range between 0.43 and 0.66). Additionally, in agreement with previous studies, variation in mRNA levels was often a poor predictor of changes in protein abundance. To our knowledge, this work constitutes the first meta-analysis focusing on cancer-related public proteomics datasets. We therefore also highlight shortcomings and limitations of such studies. All data are available through PRIDE dataset identifier PXD013455 and in Expression Atlas.


Subjects
Neoplasm Proteins/biosynthesis; Neoplasms/metabolism; Cell Line, Tumor; Datasets as Topic; Humans; Neoplasm Proteins/genetics; Neoplasms/genetics; Proteomics; RNA, Messenger/biosynthesis; RNA, Messenger/genetics; Transcriptome
19.
Nat Commun ; 12(1): 1137, 2021 02 18.
Article in English | MEDLINE | ID: mdl-33602918

ABSTRACT

Adjuvant systemic therapies are now routinely used following resection of stage III melanoma; however, accurate prognostic information is needed to better stratify patients. We use differential expression analyses of primary tumours from 204 RNA-sequenced melanomas within a large adjuvant trial, identifying a 121-gene metastasis-associated signature. This signature, strongly associated with progression-free (HR = 1.63, p = 5.24 × 10⁻⁵) and overall survival (HR = 1.61, p = 1.67 × 10⁻⁴), was validated in 175 regional lymph node metastases as well as two externally ascertained datasets. The machine learning classification models trained using the signature genes performed significantly better in predicting metastases than models trained with clinical covariates (pAUROC = 7.03 × 10⁻⁴) or published prognostic signatures (pAUROC < 0.05). The signature score negatively correlated with measures of immune cell infiltration (ρ = -0.75, p < 2.2 × 10⁻¹⁶), with a higher score representing reduced lymphocyte infiltration and a higher 5-year risk of death in stage II melanoma. Our expression signature identifies melanoma patients at higher risk of metastases and warrants further evaluation in adjuvant clinical trials.


Subjects
Gene Expression Profiling; Gene Expression Regulation, Neoplastic; Melanoma/genetics; Databases, Genetic; Humans; Machine Learning; Multivariate Analysis; Neoplasm Staging; Prognosis; Progression-Free Survival; Proportional Hazards Models; Reproducibility of Results; Time Factors; Treatment Outcome
20.
Front Immunol ; 12: 781432, 2021.
Article in English | MEDLINE | ID: mdl-35046942

ABSTRACT

Despite many studies on the immune characteristics of coronavirus disease 2019 (COVID-19) patients in the progression stage, a detailed understanding of pertinent immune cells in recovered patients is lacking. We performed single-cell RNA sequencing on samples from recovered COVID-19 patients and healthy controls. We created a comprehensive immune landscape with more than 260,000 peripheral blood mononuclear cells (PBMCs) from 41 samples by integrating our dataset with previously reported datasets, which included samples collected between 27 and 47 days after symptom onset. According to our large-scale single-cell analysis, recovered patients who had had severe symptoms (severe/critical recovered) still exhibited peripheral immune disorders 1-2 months after symptom onset. Specifically, in these severe/critical recovered patients, human leukocyte antigen (HLA) class II and antigen processing pathways were downregulated in both CD14 monocytes and dendritic cells compared to healthy controls, while the proportion of CD14 monocytes increased. These changes may lead to the downregulation of T-cell differentiation pathways in memory T cells. However, in the mild/moderate recovered patients, the proportion of plasmacytoid dendritic cells increased compared to healthy controls, accompanied by the upregulation of HLA-DRA and HLA-DRB1 in both CD14 monocytes and dendritic cells. In addition, the T-cell differentiation regulation and memory T cell-related genes FOS, JUN, CD69, CXCR4, and CD83 were upregulated in the mild/moderate recovered patients. Further, the immunoglobulin heavy chain V3-21 (IGHV3-21) gene segment was preferred in B-cell immune repertoires in severe/critical recovered patients. Collectively, we provide a large-scale single-cell atlas of the peripheral immune response in recovered COVID-19 patients.


Subjects
COVID-19/immunology; Dendritic Cells/immunology; Memory T Cells/immunology; Monocytes/immunology; RNA-Seq; SARS-CoV-2/immunology; Single-Cell Analysis; COVID-19/genetics; Female; Humans; Male