Search | BVS Regional Portal

Proteogenomic data and resources for pan-cancer analysis.

Li, Yize; Dou, Yongchao; Da Veiga Leprevost, Felipe; Geffen, Yifat; Calinawan, Anna P; Aguet, François; Akiyama, Yo; Anand, Shankara; Birger, Chet; Cao, Song; Chaudhary, Rekha; Chilappagari, Padmini; Cieslik, Marcin; Colaprico, Antonio; Zhou, Daniel Cui; Day, Corbin; Domagalski, Marcin J; Esai Selvan, Myvizhi; Fenyö, David; Foltz, Steven M; Francis, Alicia; Gonzalez-Robles, Tania; Gümüs, Zeynep H; Heiman, David; Holck, Michael; Hong, Runyu; Hu, Yingwei; Jaehnig, Eric J; Ji, Jiayi; Jiang, Wen; Katsnelson, Lizabeth; Ketchum, Karen A; Klein, Robert J; Lei, Jonathan T; Liang, Wen-Wei; Liao, Yuxing; Lindgren, Caleb M; Ma, Weiping; Ma, Lei; MacCoss, Michael J; Martins Rodrigues, Fernanda; McKerrow, Wilson; Nguyen, Ngoc; Oldroyd, Robert; Pilozzi, Alexander; Pugliese, Pietro; Reva, Boris; Rudnick, Paul; Ruggles, Kelly V; Rykunov, Dmitry.

Cancer Cell ; 41(8): 1397-1406, 2023 Aug 14.

Article in English | MEDLINE | ID: mdl-37582339

ABSTRACT

The National Cancer Institute's Clinical Proteomic Tumor Analysis Consortium (CPTAC) investigates tumors from a proteogenomic perspective, creating rich multi-omics datasets connecting genomic aberrations to cancer phenotypes. To facilitate pan-cancer investigations, we have generated harmonized genomic, transcriptomic, proteomic, and clinical data for >1000 tumors in 10 cohorts to create a cohesive and powerful dataset for scientific discovery. We outline efforts by the CPTAC pan-cancer working group in data harmonization, data dissemination, and computational resources for aiding biological discoveries. We also discuss challenges for multi-omics data integration and analysis, specifically the unique challenges of working with both nucleotide sequencing and mass spectrometry proteomics data.

Subjects

Neoplasms; Proteogenomics; Humans; Proteomics; Genomics; Neoplasms/genetics; Gene Expression Profiling

State-of-the-Art Data Management: Improving the Reproducibility, Consistency, and Traceability of Structural Biology and in Vitro Biochemical Experiments.

Cooper, David R; Grabowski, Marek; Zimmerman, Matthew D; Porebski, Przemyslaw J; Shabalin, Ivan G; Woinska, Magdalena; Domagalski, Marcin J; Zheng, Heping; Sroka, Piotr; Cymborowski, Marcin; Czub, Mateusz P; Niedzialkowska, Ewa; Venkataramany, Barat S; Osinski, Tomasz; Fratczak, Zbigniew; Bajor, Jacek; Gonera, Juliusz; MacLean, Elizabeth; Wojciechowska, Kamila; Konina, Krzysztof; Wajerowicz, Wojciech; Chruszcz, Maksymilian; Minor, Wladek.

Methods Mol Biol ; 2199: 209-236, 2021.

Article in English | MEDLINE | ID: mdl-33125653

ABSTRACT

Efficient and comprehensive data management is an indispensable component of modern scientific research and requires effective tools for all but the most trivial experiments. The LabDB system developed and used in our laboratory was originally designed to track the progress of a structure determination pipeline in several large National Institutes of Health (NIH) projects. While initially designed for structural biology experiments, its modular nature makes it easily applied in laboratories of various sizes in many experimental fields. Over many years, LabDB has transformed into a sophisticated system integrating a range of biochemical, biophysical, and crystallographic experimental data, which harvests data both directly from laboratory instruments and through human input via a web interface. The core module of the system handles many types of universal laboratory management data, such as laboratory personnel, chemical inventories, storage locations, and custom stock solutions. LabDB also tracks various biochemical experiments, including spectrophotometric and fluorescent assays, thermal shift assays, isothermal titration calorimetry experiments, and more. LabDB has been used to manage data for experiments that resulted in over 1200 deposits to the Protein Data Bank (PDB); the system is currently used by the Center for Structural Genomics of Infectious Diseases (CSGID) and several large laboratories. This chapter also provides examples of data mining analyses and warnings about incomplete and inconsistent experimental data. These features, together with its capabilities for detailed tracking, analysis, and auditing of experimental data, make the described system uniquely suited to inspect potential sources of irreproducibility in life sciences research.

Subjects

Computational Biology; Database Management Systems; Databases, Protein; Humans; Reproducibility of Results

Correction to: Classification, substrate specificity and structural features of D-2-hydroxyacid dehydrogenases: 2HADH knowledgebase.

Matelska, Dorota; Shabalin, Ivan G; Jablonska, Jagoda; Domagalski, Marcin J; Kutner, Jan; Ginalski, Krzysztof; Minor, Wladek.

BMC Psychiatry ; 19(1): 221, 2019 Jul 16.

Article in English | MEDLINE | ID: mdl-31311510

ABSTRACT

Following publication of the original article [1], we have been notified that some important information was omitted by the authors from the Competing interests section. The declaration should read as below.

Classification, substrate specificity and structural features of D-2-hydroxyacid dehydrogenases: 2HADH knowledgebase.

Matelska, Dorota; Shabalin, Ivan G; Jablonska, Jagoda; Domagalski, Marcin J; Kutner, Jan; Ginalski, Krzysztof; Minor, Wladek.

BMC Evol Biol ; 18(1): 199, 2018 Dec 22.

Article in English | MEDLINE | ID: mdl-30577795

ABSTRACT

BACKGROUND: The family of D-isomer specific 2-hydroxyacid dehydrogenases (2HADHs) contains a wide range of oxidoreductases with various metabolic roles as well as biotechnological applications. Despite a vast amount of biochemical and structural data for various representatives of the family, the long and complex evolution and broad sequence diversity hinder functional annotations for uncharacterized members. RESULTS: We report an in-depth phylogenetic analysis, followed by mapping of available biochemical and structural data on the reconstructed phylogenetic tree. The analysis suggests that some subfamilies comprising enzymes with similar yet broad substrate specificity profiles diverged early in the evolution of 2HADHs. Based on the phylogenetic tree, we present a revised classification of the family that comprises 22 subfamilies, including 13 new subfamilies not studied biochemically. We summarize characteristics of the nine biochemically studied subfamilies by aggregating all available sequence, biochemical, and structural data, providing comprehensive descriptions of the active site, cofactor-binding residues, and potential roles of specific structural regions in substrate recognition. In addition, we concisely present our analysis as an online 2HADH enzymes knowledgebase. CONCLUSIONS: The knowledgebase enables navigation over the 2HADHs classification, search through collected data, and functional predictions of uncharacterized 2HADHs. Future characterization of the new subfamilies may result in discoveries of enzymes with novel metabolic roles and with properties beneficial for biotechnological applications.

Subjects

Alcohol Oxidoreductases/chemistry; Alcohol Oxidoreductases/classification; Knowledge Bases; Alcohol Oxidoreductases/metabolism; Amino Acid Sequence; Catalytic Domain; Coenzymes/metabolism; Likelihood Functions; Phylogeny; Substrate Specificity

Data management in the modern structural biology and biomedical research environment.

Zimmerman, Matthew D; Grabowski, Marek; Domagalski, Marcin J; Maclean, Elizabeth M; Chruszcz, Maksymilian; Minor, Wladek.

Methods Mol Biol ; 1140: 1-25, 2014.

Article in English | MEDLINE | ID: mdl-24590705

ABSTRACT

Modern high-throughput structural biology laboratories produce vast amounts of raw experimental data. The traditional method of data reduction is very simple: results are summarized in peer-reviewed publications, which are hopefully published in high-impact journals. By their nature, publications include only the most important results derived from experiments that may have been performed over the course of many years. The main content of the published paper is a concise compilation of these data, an interpretation of the experimental results, and a comparison of these results with those obtained by other scientists. Due to an avalanche of structural biology manuscripts submitted to scientific journals, in many recent cases descriptions of experimental methodology (and sometimes even experimental results) are pushed to supplementary materials that are only published online and sometimes may not be reviewed as thoroughly as the main body of a manuscript. Trouble may arise when experimental results contradict the results obtained by other scientists, which requires (in the best case) the reexamination of the original raw data or independent repetition of the experiment according to the published description of the experiment. There are reports that a significant fraction of experiments obtained in academic laboratories cannot be repeated in an industrial environment (Begley CG & Ellis LM, Nature 483(7391):531-3, 2012). This is not an indication of scientific fraud but rather reflects the inadequate description of experiments performed on different equipment and on biological samples that were produced with disparate methods.
For that reason the goal of a modern data management system is not only the simple replacement of the laboratory notebook by an electronic one but also the creation of a sophisticated, internally consistent, scalable data management system that will combine data obtained by a variety of experiments performed by various individuals on diverse equipment. All data should be stored in a core database that can be used by custom applications to prepare internal reports, statistics, and perform other functions that are specific to the research that is pursued in a particular laboratory. This chapter presents a general overview of the methods of data management and analysis used by structural genomics (SG) programs. In addition to a review of the existing literature on the subject, also presented is experience in the development of two SG data management systems, UniTrack and LabDB. The description is targeted to a general audience, as some technical details have been (or will be) published elsewhere. The focus is on "data management," meaning the process of gathering, organizing, and storing data, but also briefly discussed is "data mining," the process of analysis ideally leading to an understanding of the data. In other words, data mining is the conversion of data into information. Clearly, effective data management is a precondition for any useful data mining. If done properly, gathering details on millions of experiments on thousands of proteins and making them publicly available for analysis (even after the projects themselves have ended) may turn out to be one of the most important benefits of SG programs.

Subjects

Biomedical Research/methods; High-Throughput Screening Assays/methods; Molecular Biology/methods; Computational Biology; Humans; Knowledge Management; Peer Review, Research

Factors correlating with significant differences between X-ray structures of myoglobin.

Rashin, Alexander A; Domagalski, Marcin J; Zimmermann, Michael T; Minor, Wladek; Chruszcz, Maksymilian; Jernigan, Robert L.

Acta Crystallogr D Biol Crystallogr ; 70(Pt 2): 481-91, 2014 Feb.

Article in English | MEDLINE | ID: mdl-24531482

ABSTRACT

Validation of general ideas about the origins of conformational differences in proteins is critical in order to arrive at meaningful functional insights. Here, principal component analysis (PCA) and distance difference matrices are used to validate some such ideas about the conformational differences between 291 myoglobin structures from sperm whale, horse and pig. Almost all of the horse and pig structures form compact PCA clusters with only minor coordinate differences and outliers that are easily explained. The 222 whale structures form a few dense clusters with multiple outliers. A few whale outliers with a prominent distortion of the GH loop are very similar to the cluster of horse structures, which all have a similar GH-loop distortion apparently owing to intermolecular crystal lattice hydrogen bonds to the GH loop from residues near the distal histidine His64. The variations of the GH-loop coordinates in the whale structures are likely to be owing to the observed alternative intermolecular crystal lattice bond, with the change to the GH loop distorting bonds correlated with the binding of specific 'unusual' ligands. Such an alternative intermolecular bond is not observed in horse myoglobins, obliterating any correlation with the ligands. Intermolecular bonds do not usually cause significant coordinate differences and cannot be validated as their universal cause. Most of the native-like whale myoglobin structure outliers can be correlated with a few specific factors. However, these factors do not always lead to coordinate differences beyond the previously determined uncertainty thresholds. The binding of unusual ligands by myoglobin, leading to crystal-induced distortions, suggests that some of the conformational differences between the apo and holo structures might not be 'functionally important' but rather artifacts caused by the binding of 'unusual' substrate analogs.
The causes of P6 symmetry in myoglobin crystals and the relationship between crystal and solution structures are also discussed.

Subjects

Apoproteins/chemistry; Myoglobin/chemistry; Principal Component Analysis; Spermatozoa/chemistry; Animals; Apoproteins/genetics; Crystallography, X-Ray; Horses; Hydrogen Bonding; Ligands; Male; Mutation; Myoglobin/genetics; Protein Binding; Protein Conformation; Species Specificity; Swine; Whales

The quality and validation of structures from structural genomics.

Domagalski, Marcin J; Zheng, Heping; Zimmerman, Matthew D; Dauter, Zbigniew; Wlodawer, Alexander; Minor, Wladek.

Methods Mol Biol ; 1091: 297-314, 2014.

Article in English | MEDLINE | ID: mdl-24203341

ABSTRACT

Quality control of three-dimensional structures of macromolecules is a critical step to ensure the integrity of structural biology data, especially those produced by structural genomics centers. Whereas the Protein Data Bank (PDB) has proven to be a remarkable success overall, the inconsistent quality of structures reveals a lack of universal standards for structure/deposit validation. Here, we review the state-of-the-art methods used in macromolecular structure validation, focusing on validation of structures determined by X-ray crystallography. We describe some general protocols used in the rebuilding and re-refinement of problematic structural models. We also briefly discuss some frontier areas of structure validation, including refinement of protein-ligand complexes, automation of structure redetermination, and the use of NMR structures and computational models to solve X-ray crystal structures by molecular replacement.

Subjects

Protein Conformation; Proteins/chemistry; Proteomics/methods; Proteomics/standards; Data Mining; Databases, Protein; Ligands; Macromolecular Substances/chemistry; Models, Molecular; Nuclear Magnetic Resonance, Biomolecular; Protein Binding; Proteins/metabolism; Quality Control; Reproducibility of Results

Ultratight crystal packing of a 10 kDa protein.

Trillo-Muyo, Sergio; Jasilionis, Andrius; Domagalski, Marcin J; Chruszcz, Maksymilian; Minor, Wladek; Kuisiene, Nomeda; Arolas, Joan L; Solà, Maria; Gomis-Rüth, F Xavier.

Acta Crystallogr D Biol Crystallogr ; 69(Pt 3): 464-70, 2013 Mar.

Article in English | MEDLINE | ID: mdl-23519421

ABSTRACT

While small organic molecules generally crystallize forming tightly packed lattices with little solvent content, proteins form air-sensitive high-solvent-content crystals. Here, the crystallization and full structure analysis of a novel recombinant 10 kDa protein corresponding to the C-terminal domain of a putative U32 peptidase are reported. The orthorhombic crystal contained only 24.5% solvent and is therefore among the most tightly packed protein lattices ever reported.

Subjects

Geobacillus/enzymology; Peptide Hydrolases/chemistry; Crystallization; Crystallography, X-Ray; Molecular Weight; Peptide Fragments/chemistry; Proteolysis; Selenomethionine/metabolism; Solvents
