1.
Nucleic Acids Res ; 52(D1): D1639-D1650, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37811889

ABSTRACT

Advanced multi-omics technologies offer much information that can uncover the regulatory mechanisms from genotype to phenotype. In soybean, numerous multi-omics databases have been published. Although they cover multiple omics, there are still limitations when it comes to the types and scales of omics datasets and analysis methods utilized. This study aims to address these limitations by collecting and integrating a comprehensive set of multi-omics datasets. This includes 38 genomes, transcriptomes from 435 tissue samples, 125 phenotypes from 6686 accessions, epigenome data involving histone modification, transcription factor binding, chromosomal accessibility and chromosomal interaction, as well as genetic variation data from 24 501 soybean accessions. Then, common analysis pipelines and statistical methods were applied to mine information from these multi-omics datasets, resulting in the successful establishment of a user-friendly multi-omics database called SoyMD (https://yanglab.hzau.edu.cn/SoyMD/#/). SoyMD provides researchers with efficient query options and analysis tools, allowing them to swiftly access relevant omics information and conduct comprehensive multi-omics data analyses. Another notable feature of SoyMD is its capability to facilitate the analysis of candidate genes, as demonstrated in the case study on seed oil content. This highlights the immense potential of SoyMD in soybean genetic breeding and functional genomics research.


Subject(s)
Databases, Factual , Glycine max , Software , Genomics/methods , Glycine max/genetics , Multiomics , Plant Breeding
2.
Nucleic Acids Res ; 52(D1): D107-D114, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37992296

ABSTRACT

Expression Atlas (www.ebi.ac.uk/gxa) and its newest counterpart the Single Cell Expression Atlas (www.ebi.ac.uk/gxa/sc) are EMBL-EBI's knowledgebases for gene and protein expression and localisation in bulk and at single cell level. These resources aim to allow users to investigate their expression in normal tissue (baseline) or in response to perturbations such as disease or changes to genotype (differential) across multiple species. Users are invited to search for genes or metadata terms across species or biological conditions in a standardised consistent interface. Alongside these data, new features in Single Cell Expression Atlas allow users to query metadata through our new cell type wheel search. At the experiment level data can be explored through two types of dimensionality reduction plots, t-distributed Stochastic Neighbor Embedding (tSNE) and Uniform Manifold Approximation and Projection (UMAP), overlaid with either clustering or metadata information to assist users' understanding. Data are also visualised as marker gene heatmaps identifying genes that help confer cluster identity. For some data, additional visualisations are available as interactive cell level anatomograms and cell type gene expression heatmaps.


Subject(s)
Databases, Genetic , Gene Expression Profiling , Proteomics , Genotype , Metadata , Single-Cell Analysis , Internet , Humans , Animals
3.
Nucleic Acids Res ; 51(D1): D1446-D1456, 2023 Jan 06.
Article in English | MEDLINE | ID: mdl-36215030

ABSTRACT

Cotton is an important economic crop, and many loci for important traits have been identified, but it remains challenging and time-consuming to identify candidate or causal genes/variants and clarify their roles in phenotype formation and regulation. Here, we first collected and integrated the multi-omics datasets including 25 genomes, transcriptomes in 76 tissue samples, epigenome data of five species and metabolome data of 768 metabolites from four tissues, and genetic variation, trait and transcriptome datasets from 4180 cotton accessions. Then, a cotton multi-omics database (CottonMD, http://yanglab.hzau.edu.cn/CottonMD/) was constructed. In CottonMD, multiple statistical methods were applied to identify the associations between variations and phenotypes, and many easy-to-use analysis tools were provided to help researchers quickly acquire the related omics information and perform multi-omics data analysis. Two case studies demonstrated the power of CottonMD for identifying and analyzing the candidate genes, as well as the great potential of integrating multi-omics data for cotton genetic breeding and functional genomics research.


Subject(s)
Databases, Factual , Gossypium , Multiomics , Genome , Genomics/methods , Phenotype , Gossypium/chemistry , Gossypium/genetics
4.
Nucleic Acids Res ; 51(D1): D1539-D1548, 2023 Jan 06.
Article in English | MEDLINE | ID: mdl-36370099

ABSTRACT

Mass spectrometry (MS) is by far the most used experimental approach in high-throughput proteomics. The ProteomeXchange (PX) consortium of proteomics resources (http://www.proteomexchange.org) was originally set up to standardize data submission and dissemination of public MS proteomics data. It is now 10 years since the initial data workflow was implemented. In this manuscript, we describe the main developments in PX since the previous update manuscript in Nucleic Acids Research was published in 2020. The six members of the Consortium are PRIDE, PeptideAtlas (including PASSEL), MassIVE, jPOST, iProX and Panorama Public. We report the current data submission statistics, showcasing that the number of datasets submitted to PX resources has continued to increase every year. As of June 2022, more than 34 233 datasets had been submitted to PX resources, and from those, 20 062 (58.6%) just in the last three years. We also report the development of the Universal Spectrum Identifiers and the improvements in capturing the experimental metadata annotations. In parallel, we highlight that data re-use activities of public datasets continue to increase, enabling connections between PX resources and other popular bioinformatics resources, novel research and also new data resources. Finally, we summarise the current state-of-the-art in data management practices for sensitive human (clinical) proteomics data.


Subject(s)
Proteomics , Software , Humans , Databases, Protein , Mass Spectrometry , Proteomics/methods , Computational Biology/methods
5.
J Proteome Res ; 23(6): 1948-1959, 2024 Jun 07.
Article in English | MEDLINE | ID: mdl-38717300

ABSTRACT

The availability of an increasingly large amount of public proteomics data sets presents an opportunity for performing combined analyses to generate comprehensive organism-wide protein expression maps across different organisms and biological conditions. Sus scrofa, the domestic pig, is a model organism relevant for food production and for human biomedical research. Here, we reanalyzed 14 public proteomics data sets from the PRIDE database coming from pig tissues to assess baseline (without any biological perturbation) protein abundance in 14 organs, encompassing a total of 20 healthy tissues from 128 samples. The analysis involved the quantification of protein abundance in 599 mass spectrometry runs. We compared protein expression patterns among different pig organs and examined the distribution of proteins across these organs. Then, we studied how protein abundances compared across different data sets and studied the tissue specificity of the detected proteins. Of particular interest, we conducted a comparative analysis of protein expression between pig and human tissues, revealing a high degree of correlation in protein expression among orthologs, particularly in brain, kidney, heart, and liver samples. We have integrated the protein expression results into the Expression Atlas resource for easy access and visualization of the protein expression data individually or alongside gene expression data.


Subject(s)
Kidney , Proteomics , Animals , Proteomics/methods , Humans , Swine , Kidney/metabolism , Kidney/chemistry , Organ Specificity , Liver/metabolism , Liver/chemistry , Databases, Protein , Brain/metabolism , Myocardium/metabolism , Myocardium/chemistry , Sus scrofa/metabolism , Sus scrofa/genetics , Proteome/metabolism , Proteome/analysis , Mass Spectrometry
6.
Nucleic Acids Res ; 50(D1): D543-D552, 2022 Jan 07.
Article in English | MEDLINE | ID: mdl-34723319

ABSTRACT

The PRoteomics IDEntifications (PRIDE) database (https://www.ebi.ac.uk/pride/) is the world's largest data repository of mass spectrometry-based proteomics data. PRIDE is one of the founding members of the global ProteomeXchange (PX) consortium and an ELIXIR core data resource. In this manuscript, we summarize the developments in PRIDE resources and related tools since the previous update manuscript was published in Nucleic Acids Research in 2019. The number of submitted datasets to PRIDE Archive (the archival component of PRIDE) has reached on average around 500 datasets per month during 2021. In addition to continuous improvements in PRIDE Archive data pipelines and infrastructure, the PRIDE Spectra Archive has been developed to provide direct access to the submitted mass spectra using Universal Spectrum Identifiers. As a key point, the file format MAGE-TAB for proteomics has been developed to enable the improvement of sample metadata annotation. Additionally, the resource PRIDE Peptidome provides access to aggregated peptide/protein evidence across PRIDE Archive. Furthermore, we describe how PRIDE has increased its efforts to reuse and disseminate high-quality proteomics data into other added-value resources such as UniProt, Ensembl and Expression Atlas.


Subject(s)
Databases, Protein , Metadata/statistics & numerical data , Molecular Sequence Annotation/statistics & numerical data , Peptides/chemistry , Proteins/chemistry , Software , Amino Acid Sequence , Bibliometrics , Datasets as Topic , Humans , Information Storage and Retrieval , Internet , Mass Spectrometry , Peptides/genetics , Peptides/metabolism , Proteins/genetics , Proteins/metabolism , Proteomics/instrumentation , Proteomics/methods , Sequence Alignment
7.
Nucleic Acids Res ; 50(D1): D129-D140, 2022 Jan 07.
Article in English | MEDLINE | ID: mdl-34850121

ABSTRACT

The EMBL-EBI Expression Atlas is an added value knowledge base that enables researchers to answer the question of where (tissue, organism part, developmental stage, cell type) and under which conditions (disease, treatment, gender, etc) a gene or protein of interest is expressed. Expression Atlas brings together data from >4500 expression studies from >65 different species, across different conditions and tissues. It makes these data freely available in an easy to visualise form, after expert curation to accurately represent the intended experimental design, re-analysed via standardised pipelines that rely on open-source community developed tools. Each study's metadata are annotated using ontologies. The data are re-analyzed with the aim of reproducing the original conclusions of the underlying experiments. Expression Atlas is currently divided into Bulk Expression Atlas and Single Cell Expression Atlas. Expression Atlas contains data from differential studies (microarray and bulk RNA-Seq) and baseline studies (bulk RNA-Seq and proteomics), whereas Single Cell Expression Atlas is currently dedicated to Single Cell RNA-Sequencing (scRNA-Seq) studies. The resource has been in continuous development since 2009 and it is available at https://www.ebi.ac.uk/gxa.


Subject(s)
Databases, Genetic , Proteins/genetics , Proteomics , Software , Computational Biology , Gene Expression Profiling , Humans , Proteins/chemistry , RNA-Seq , Sequence Analysis, RNA , Single-Cell Analysis
8.
J Proteome Res ; 22(3): 729-742, 2023 Mar 03.
Article in English | MEDLINE | ID: mdl-36577097

ABSTRACT

The availability of proteomics datasets in the public domain, and in the PRIDE database, in particular, has increased dramatically in recent years. This unprecedented large-scale availability of data provides an opportunity for combined analyses of datasets to get organism-wide protein abundance data in a consistent manner. We have reanalyzed 24 public proteomics datasets from healthy human individuals to assess baseline protein abundance in 31 organs. We defined tissue as a distinct functional or structural region within an organ. Overall, the aggregated dataset contains 67 healthy tissues, corresponding to 3,119 mass spectrometry runs covering 498 samples from 489 individuals. We compared protein abundances between different organs and studied the distribution of proteins across these organs. We also compared the results with data generated in analogous studies. Additionally, we performed gene ontology and pathway-enrichment analyses to identify organ-specific enriched biological processes and pathways. As a key point, we have integrated the protein abundance results into the resource Expression Atlas, where they can be accessed and visualized either individually or together with gene expression data coming from transcriptomics datasets. We believe this is a good mechanism to make proteomics data more accessible for life scientists.


Subject(s)
Proteome , Proteomics , Humans , Proteome/analysis , Proteomics/methods , Gene Expression Profiling , Databases, Factual , Mass Spectrometry/methods , Databases, Protein
9.
Plant Biotechnol J ; 21(8): 1611-1627, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37154465

ABSTRACT

Plant hormones are the intrinsic factors that control plant development. The integration of different phytohormone pathways in a complex network of synergistic, antagonistic and additive interactions has been elucidated in model plants. However, the systemic level of transcriptional responses to hormone crosstalk in Brassica napus is largely unknown. Here, we present an in-depth temporal-resolution study of the transcriptomes of the seven hormones in B. napus seedlings. Differentially expressed gene analysis revealed few common target genes that were co-regulated (up- and down-regulated) by the seven hormones; instead, different hormones appear to regulate distinct members of protein families. We then constructed the regulatory networks between the seven hormones side by side, which allowed us to identify key genes and transcription factors that regulate the hormone crosstalk in B. napus. Using this dataset, we uncovered a novel crosstalk between gibberellin and cytokinin in which cytokinin homeostasis was mediated by RGA-related CKXs expression. Moreover, the modulation of gibberellin metabolism by the identified key transcription factors was confirmed in B. napus. Furthermore, all data are available online from http://yanglab.hzau.edu.cn/BnTIR/hormone. Our study reveals an integrated hormone crosstalk network in Brassica napus, which also provides a versatile resource for future hormone studies in plant species.


Subject(s)
Brassica napus , Plant Growth Regulators , Plant Growth Regulators/metabolism , Brassica napus/metabolism , Gibberellins/metabolism , Gene Expression Profiling , Transcription Factors/genetics , Transcription Factors/metabolism , Hormones/metabolism , Cytokinins/metabolism
10.
PLoS Comput Biol ; 18(6): e1010174, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35714157

ABSTRACT

The increasingly large amount of proteomics data in the public domain enables, among other applications, the combined analyses of datasets to create comparative protein expression maps covering different organisms and different biological conditions. Here we have reanalysed public proteomics datasets from mouse and rat tissues (14 and 9 datasets, respectively), to assess baseline protein abundance. Overall, the aggregated dataset contained 23 individual datasets, including a total of 211 samples coming from 34 different tissues across 14 organs, comprising 9 mouse and 3 rat strains, respectively. In all cases, we studied the distribution of canonical proteins between the different organs. The number of canonical proteins per dataset ranged from 273 (tendon) to 9,715 (liver) in mouse, and from 101 (tendon) to 6,130 (kidney) in rat. Then, we studied how protein abundances compared across different datasets and organs for both species. As a key point, we carried out a comparative analysis of protein expression between mouse, rat and human tissues. We observed a high level of correlation of protein expression among orthologs between all three species in brain, kidney, heart and liver samples, whereas the correlation of protein expression was generally slightly lower between organs within the same species. Protein expression results have been integrated into the resource Expression Atlas for widespread dissemination.


Subject(s)
Proteins , Proteomics , Animals , Brain/metabolism , Mice , Proteins/metabolism , Rats