Results 1 - 12 of 12
1.
PLoS One ; 18(3): e0282821, 2023.
Article in English | MEDLINE | ID: mdl-36989217

ABSTRACT

Advancements in deep plasma proteomics are enabling high-resolution measurement of plasma proteoforms, which may reveal a rich source of novel biomarkers previously concealed by aggregated protein methods. Here, we analyze 188 plasma proteomes from non-small cell lung cancer (NSCLC) subjects and controls to identify NSCLC-associated protein isoforms by examining differentially abundant peptides as a proxy for isoform-specific exon usage. We find four proteins composed of peptides with opposite patterns of abundance between cancer and control subjects. One of these proteins, BMP1, has known isoforms that can explain this differential pattern, for which the abundance of the NSCLC-associated isoform increases with stage of NSCLC progression. The presence of cancer- and control-associated isoforms suggests differential regulation of BMP1 isoforms. The identified BMP1 isoforms have known functional differences, which may reveal insights into mechanisms impacting NSCLC disease progression.


Subject(s)
Carcinoma, Non-Small-Cell Lung , Lung Neoplasms , Humans , Carcinoma, Non-Small-Cell Lung/metabolism , Lung Neoplasms/metabolism , Biomarkers, Tumor/metabolism , Protein Isoforms/metabolism , Peptides , Bone Morphogenetic Protein 1
2.
bioRxiv ; 2023 Aug 29.
Article in English | MEDLINE | ID: mdl-37693476

ABSTRACT

Background: The wide dynamic range of circulating proteins coupled with the diversity of proteoforms present in plasma has historically impeded comprehensive and quantitative characterization of the plasma proteome at scale. Automated nanoparticle (NP) protein corona-based proteomics workflows can efficiently compress the dynamic range of protein abundances into a mass spectrometry (MS)-accessible detection range. This enhances the depth and scalability of quantitative MS-based methods, which can elucidate the molecular mechanisms of biological processes, discover new protein biomarkers, and improve comprehensiveness of MS-based diagnostics. Methods: Using multi-species spike-in experiments and a cohort, we investigated fold-change accuracy, linearity, precision, and statistical power for the Proteograph™ Product Suite, a deep plasma proteomics workflow, in conjunction with multiple MS instruments. Results: We show that NP-based workflows enable accurate identification (false discovery rate of 1%) of more than 6,000 proteins from plasma (Orbitrap Astral) and, compared to a gold standard neat plasma workflow that is limited to the detection of hundreds of plasma proteins, facilitate quantification of more proteins with accurate fold-changes, high linearity, and precision. Furthermore, we demonstrate high statistical power for the discovery of biomarkers in small- and large-scale cohorts. Conclusions: The automated NP workflow enables high-throughput, deep, and quantitative plasma proteomics investigation with sufficient power to discover new biomarker signatures with peptide-level resolution.

3.
PLoS Comput Biol ; 6(8)2010 Aug 26.
Article in English | MEDLINE | ID: mdl-20865152

ABSTRACT

In order to fully understand protein kinase networks, new methods are needed to identify regulators and substrates of kinases, especially for weakly expressed proteins. Here we have developed a hybrid computational search algorithm that combines machine learning and expert knowledge to identify kinase docking sites, and used this algorithm to search the human genome for novel MAP kinase substrates and regulators focused on the JNK family of MAP kinases. Predictions were tested by peptide array followed by rigorous biochemical verification with in vitro binding and kinase assays on wild-type and mutant proteins. Using this procedure, we found new 'D-site' class docking sites in previously known JNK substrates (hnRNP-K, PPM1J/PP2Czeta), as well as new JNK-interacting proteins (MLL4, NEIL1). Finally, we identified new D-site-dependent MAPK substrates, including the hedgehog-regulated transcription factors Gli1 and Gli3, suggesting that a direct connection between MAP kinase and hedgehog signaling may occur at the level of these key regulators. These results demonstrate that a genome-wide search for MAP kinase docking sites can be used to find new docking sites and substrates.


Subject(s)
Algorithms , Artificial Intelligence , Knowledge Bases , Mitogen-Activated Protein Kinases/chemistry , Binding Sites , Genome, Human , Humans , Kruppel-Like Transcription Factors/chemistry , Nerve Tissue Proteins/chemistry , Protein Binding , Substrate Specificity , Transcription Factors/chemistry , Zinc Finger Protein GLI1 , Zinc Finger Protein Gli3
4.
J Chem Inf Model ; 51(4): 760-76, 2011 Apr 25.
Article in English | MEDLINE | ID: mdl-21417267

ABSTRACT

Accurate prediction of the 3-D structure of small molecules is essential in order to understand their physical, chemical, and biological properties, including how they interact with other molecules. Here, we survey the field of high-throughput methods for 3-D structure prediction and set up new target specifications for the next generation of methods. We then introduce COSMOS, a novel data-driven prediction method that utilizes libraries of fragment and torsion angle parameters. We illustrate COSMOS using parameters extracted from the Cambridge Structural Database (CSD) by analyzing their distribution and then evaluating the system's performance in terms of speed, coverage, and accuracy. Results show that COSMOS represents a significant improvement when compared to state-of-the-art prediction methods, particularly in terms of coverage of complex molecular structures, including metal-organics. COSMOS can predict structures for 96.4% of the molecules in the CSD (99.6% organic, 94.6% metal-organic), whereas the widely used commercial method CORINA predicts structures for 68.5% (98.5% organic, 51.6% metal-organic). On the common subset of molecules predicted by both methods, COSMOS makes predictions with an average speed per molecule of 0.15 s (0.10 s organic, 0.21 s metal-organic) and an average rmsd of 1.57 Å (1.26 Å organic, 1.90 Å metal-organic), and CORINA makes predictions with an average speed per molecule of 0.13 s (0.18 s organic, 0.08 s metal-organic) and an average rmsd of 1.60 Å (1.13 Å organic, 2.11 Å metal-organic). COSMOS is available through the ChemDB chemoinformatics Web portal at http://cdb.ics.uci.edu/ .


Subject(s)
Algorithms , Chemistry/methods , Informatics/methods , Models, Molecular , Molecular Conformation , Databases, Factual , Models, Statistical , Pattern Recognition, Automated/methods
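
The abstract of entry 4 reports accuracy as an average rmsd between predicted and reference structures. As a rough illustration of that metric only, the sketch below computes the root-mean-square deviation between two coordinate sets. It assumes the two structures list the same atoms in the same order and are already superimposed; the abstract does not say how alignment was handled, and the function name `rmsd` is chosen here for illustration.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized lists of (x, y, z) points.

    Assumption: atoms are in matching order and the structures are already
    superimposed; no alignment is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and the same length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

Coordinates given in Å yield an rmsd in Å, the unit used in the abstract.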
5.
Bioinformatics ; 24(13): i357-65, 2008 Jul 01.
Article in English | MEDLINE | ID: mdl-18586735

ABSTRACT

MOTIVATION: Small organic molecules, from nucleotides and amino acids to metabolites and drugs, play a fundamental role in chemistry, biology and medicine. As databases of small molecules continue to grow and become more open, it is important to develop the tools to search them efficiently. In order to develop a BLAST-like tool for small molecules, one must first understand the statistical behavior of molecular similarity scores. RESULTS: We develop a new detailed theory of molecular similarity scores that can be applied to a variety of molecular representations and similarity measures. For concreteness, we focus on the most widely used measure: the Tanimoto measure applied to chemical fingerprints. In both the case of empirical fingerprints and fingerprints generated by several stochastic models, we derive accurate approximations for both the distribution and extreme value distribution of similarity scores. These approximations are derived using a ratio of correlated Gaussians approach. The theory enables the calculation of significance scores, such as Z-scores and P-values, and the estimation of the top hits list size. Empirical results obtained using both the random models and real data from the ChemDB database are given to corroborate the theory and show how it can be applied to mine chemical space. AVAILABILITY: Data and related resources are available through http://cdb.ics.uci.edu.


Subject(s)
Algorithms , Chemistry Techniques, Analytical/methods , Data Interpretation, Statistical , Databases, Factual , Organic Chemicals/chemistry , Pattern Recognition, Automated/methods
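
To make the quantities named in the abstract of entry 5 concrete, here is a minimal sketch of the Tanimoto measure on binary fingerprints, plus an empirical Z-score for one query/target pair against a background sample. This is not the paper's analytic theory (which derives the score distribution from a ratio of correlated Gaussians); the Z-score here is simply estimated from sample statistics, and the names `tanimoto` and `z_score` are illustrative assumptions.

```python
import math

def tanimoto(fp_a, fp_b):
    """Tanimoto similarity of two binary fingerprints given as sets of on-bit indices."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def z_score(query, target, background):
    """Empirical Z-score of the query/target similarity.

    Assumption: `background` is a sample of fingerprints whose similarities
    to the query approximate the null distribution of scores.
    """
    scores = [tanimoto(query, fp) for fp in background]
    mean = sum(scores) / len(scores)
    std = math.sqrt(sum((s - mean) ** 2 for s in scores) / len(scores))
    if std == 0:
        raise ValueError("background scores have zero variance")
    return (tanimoto(query, target) - mean) / std
```

For example, `tanimoto({1, 2, 3}, {2, 3, 4})` shares two of four distinct bits and evaluates to 0.5.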
6.
J Phys Chem B ; 110(47): 24157-64, 2006 Nov 30.
Article in English | MEDLINE | ID: mdl-17125387

ABSTRACT

In the absence of external stress, the surface tension of a lipid membrane vanishes at equilibrium, and the membrane exhibits long wavelength undulations that can be described as elastic (as opposed to tension-dominated) deformations. These long wavelength fluctuations are generally suppressed in molecular dynamics simulations of membranes, which have typically been carried out on membrane patches with areas <100 nm² that are replicated by periodic boundary conditions. As a result, finite system-size effects in molecular dynamics simulations of lipid bilayers have been subject to much discussion in the membrane simulation community for several years, and it has been argued that it is necessary to simulate small membrane patches under tension to properly model the tension-free state of macroscopic membranes. Recent hardware and software advances have made it possible to simulate larger, all-atom systems, allowing us to directly address the question of whether the relatively small size of current membrane simulations affects their physical characteristics compared to real macroscopic bilayer systems. In this work, system-size effects on the structure of a DOPC bilayer at 5.4 H2O/lipid are investigated by performing molecular dynamics simulations at constant temperature and isotropic pressure (i.e., vanishing surface tension) of small and large single bilayer patches (72 and 288 lipids, respectively), as well as an explicitly multilamellar system consisting of a stack of five 72-lipid bilayers, all replicated in three dimensions by using periodic boundary conditions. The simulation results are compared to X-ray and neutron diffraction data by using a model-free, reciprocal space approach developed recently in our laboratories. Our analysis demonstrates that finite-size effects are negligible in simulations of DOPC bilayers at low hydration, and suggests that refinements are needed in the simulation force fields.


Subject(s)
Computer Simulation , Lipid Bilayers/chemistry , Membrane Fluidity , Phosphatidylcholines/chemistry , Crystallography, X-Ray , Models, Biological , Molecular Conformation , Surface Tension , Water/chemistry
7.
J Appl Lab Med ; 1(2): 181-193, 2016 Sep 01.
Article in English | MEDLINE | ID: mdl-33626780

ABSTRACT

BACKGROUND: Well-collected and well-documented sample repositories are necessary for disease biomarker development. The availability of significant numbers of samples with the associated patient information enables biomarker validation to proceed with maximum efficacy and minimum bias. The creation and utilization of such a resource is an important step in the development of blood-based biomarker tests for colorectal cancer. METHODS: We have created a subject data and biological sample resource, Endoscopy II, which is based on 4698 individuals referred for diagnostic colonoscopy in Denmark between May 2010 and November 2012. Of the patients referred based on 1 or more clinical symptoms of colorectal neoplasia, 512 were confirmed by pathology to have colorectal cancer and 399 were confirmed to have advanced adenoma. Using subsets of these sample groups in case-control study designs (300 patients for colorectal cancer, 302 patients for advanced adenoma), 2 panels of plasma-based proteins for colorectal cancer and 1 panel for advanced adenoma were identified and validated based on ELISA data obtained for 28 proteins from the samples. RESULTS: One of the validated colorectal cancer panels was comprised of 8 proteins (CATD, CEA, CO3, CO9, SEPR, AACT, MIF, and PSGL) and had a validation ROC curve area under the curve (AUC) of 0.82 (CI 0.75-0.88). There was no significant difference in the performance between early- and late-stage cancer. The advanced adenoma panel was comprised of 4 proteins (CATD, CLUS, GDF15, SAA1) and had a validation ROC curve AUC of 0.65 (CI 0.56-0.74). CONCLUSIONS: These results suggest that the development of blood-based aids to colorectal cancer detection and diagnosis is feasible.

8.
Clin Colorectal Cancer ; 15(2): 186-194.e13, 2016 06.
Article in English | MEDLINE | ID: mdl-27237338

ABSTRACT

INTRODUCTION: Colorectal cancer (CRC) testing programs reduce mortality; however, approximately 40% of the recommended population who should undergo CRC testing does not. Early colon cancer detection in patient populations ineligible for testing, such as the elderly or those with significant comorbidities, could have clinical benefit. Despite many attempts to identify individual protein markers of this disease, little progress has been made. Targeted mass spectrometry, using multiple reaction monitoring (MRM) technology, enables the simultaneous assessment of groups of candidates for improved detection performance. MATERIALS AND METHODS: A multiplex assay was developed for 187 candidate marker proteins, using 337 peptides monitored through 674 simultaneously measured MRM transitions in a 30-minute liquid chromatography-mass spectrometry analysis of immunodepleted blood plasma. To evaluate the combined candidate marker performance, the present study used 274 individual patient blood plasma samples, 137 with biopsy-confirmed colorectal cancer and 137 age- and gender-matched controls. Using 2 well-matched platforms running 5 days each week, all 274 samples were analyzed in 52 days. RESULTS: Using one half of the data as a discovery set (69 disease cases and 69 control cases), the elastic net feature selection and random forest classifier assembly were used in cross-validation to identify a 15-transition classifier. The mean training receiver operating characteristic area under the curve was 0.82. After final classifier assembly using the entire discovery set, the 136-sample (68 disease cases and 68 control cases) validation set was evaluated. The validation area under the curve was 0.91. At the point of maximum accuracy (84%), the sensitivity was 87% and the specificity was 81%. 
CONCLUSION: These results have demonstrated that simultaneous assessment of candidate marker proteins using high-multiplex, targeted mass spectrometry can identify a subset of CRC markers with significant and meaningful performance.


Subject(s)
Biomarkers, Tumor/blood , Colorectal Neoplasms/diagnosis , Early Detection of Cancer/methods , Mass Spectrometry/methods , Adult , Aged , Area Under Curve , Colorectal Neoplasms/blood , Female , Humans , Male , Middle Aged , ROC Curve , Sensitivity and Specificity
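
The abstract of entry 8 reports an area under the ROC curve together with sensitivity, specificity, and accuracy at one operating point. The sketch below shows how those four quantities can be computed from classifier scores. It is a generic illustration, not the study's pipeline: the function names, the "score at or above threshold means predicted case" convention, and any input scores are assumptions.

```python
def roc_auc(case_scores, control_scores):
    """AUC as the probability that a random case outscores a random control.

    Ties count as one half (the Mann-Whitney formulation of the AUC).
    """
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

def operating_point(case_scores, control_scores, threshold):
    """Return (sensitivity, specificity, accuracy) at a score threshold.

    Assumption: a score >= threshold is called a case.
    """
    tp = sum(1 for s in case_scores if s >= threshold)
    tn = sum(1 for s in control_scores if s < threshold)
    sensitivity = tp / len(case_scores)
    specificity = tn / len(control_scores)
    accuracy = (tp + tn) / (len(case_scores) + len(control_scores))
    return sensitivity, specificity, accuracy
```

With equal numbers of cases and controls, as in the 68/68 validation set, accuracy is the mean of sensitivity and specificity, which matches the reported 87%, 81%, and 84%.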
9.
J Chem Inf Model ; 48(6): 1138-51, 2008 Jun.
Article in English | MEDLINE | ID: mdl-18522387

ABSTRACT

Power-law distributions have been observed in a wide variety of areas. To our knowledge, however, there has been no systematic observation of power-law distributions in chemoinformatics. Here, we present several examples of power-law distributions arising from the features of small, organic molecules. The distributions of rigid segments and ring systems, the distributions of molecular paths and circular substructures, and the sizes of molecular similarity clusters all show linear trends on log-log rank/frequency plots, suggesting underlying power-law distributions. The number of unique features also follows Heaps'-like laws. The characteristic exponents of the power-laws lie in the 1.5-3 range, consistent with the exponents observed in other power-law phenomena. The power-law nature of these distributions leads to several applications including the prediction of the growth of available data through Heaps' law and the optimal allocation of experimental or computational resources via the 80/20 rule. More importantly, we also show how the power-laws can be leveraged to efficiently compress chemical fingerprints in a lossless manner, useful for the improved storage and retrieval of molecules in large chemical databases.


Subject(s)
Models, Statistical , Organic Chemicals/chemistry , Small Molecule Libraries/chemistry , Cluster Analysis , Markov Chains
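
The abstract of entry 9 identifies power laws by a linear trend on log-log rank/frequency plots. A minimal sketch of that check follows: it ranks feature counts and fits a least-squares line in log-log space. The function names are illustrative assumptions, ordinary least squares on logs is only a rough estimator, and the abstract does not state how this slope relates to the 1.5-3 exponents it quotes.

```python
import math
from collections import Counter

def rank_frequency(features):
    """Return [(rank, count), ...] with rank 1 assigned to the most frequent feature."""
    counts = sorted(Counter(features).values(), reverse=True)
    return list(enumerate(counts, start=1))

def loglog_slope(points):
    """Least-squares slope of log(frequency) against log(rank).

    A roughly constant negative slope is the linear log-log trend
    described in the abstract.
    """
    xs = [math.log(rank) for rank, _ in points]
    ys = [math.log(freq) for _, freq in points]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```

On synthetic data where frequency is proportional to rank raised to the power -2, the fitted slope is -2 up to rounding.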
10.
J Chem Inf Model ; 47(6): 2098-109, 2007.
Article in English | MEDLINE | ID: mdl-17967006

ABSTRACT

Many modern chemoinformatics systems for small molecules rely on large fingerprint vector representations, where the components of the vector record the presence or number of occurrences in the molecular graphs of particular combinatorial features, such as labeled paths or labeled trees. These large fingerprint vectors are often compressed to much shorter fingerprint vectors using a lossy compression scheme based on a simple modulo procedure. Here, we combine statistical models of fingerprints with integer entropy codes, such as Golomb and Elias codes, to encode the indices or the run lengths of the fingerprints. After reordering the fingerprint components by decreasing frequency order, the indices are monotone-increasing and the run lengths are quasi-monotone-increasing, and both exhibit power-law distribution trends. We take advantage of these statistical properties to derive new efficient, lossless, compression algorithms for monotone integer sequences: monotone value (MOV) coding and monotone length (MOL) coding. In contrast to lossy systems that use 1024 or more bits of storage per molecule, we can achieve lossless compression of long chemical fingerprints based on circular substructures in slightly over 300 bits per molecule, close to the Shannon entropy limit, using a MOL Elias Gamma code for run lengths. The improvement in storage comes at a modest computational cost. Furthermore, because the compression is lossless, uncompressed similarity (e.g., Tanimoto) between molecules can be computed exactly from their compressed representations, leading to significant improvements in retrieval performance, as shown on six benchmark data sets of druglike molecules.


Subject(s)
Entropy , Models, Chemical , Molecular Structure , Time Factors
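
The abstract of entry 10 contrasts lossy modulo folding with lossless integer codes applied to run lengths. The sketch below illustrates both ideas in their plainest form: folding on-bit indices modulo a fixed length, and encoding the gaps between sorted indices with a standard Elias gamma code. It is not the MOV or MOL coding introduced in the paper; the function names and the bit-string representation are assumptions made for readability.

```python
def fold_modulo(indices, n_bits=1024):
    """Lossy folding: map each on-bit index to index % n_bits (collisions merge)."""
    return sorted({i % n_bits for i in indices})

def elias_gamma_encode(n):
    """Elias gamma code of a positive integer, returned as a string of '0'/'1'."""
    if n < 1:
        raise ValueError("Elias gamma codes are defined for integers >= 1")
    binary = bin(n)[2:]
    return "0" * (len(binary) - 1) + binary

def encode_gaps(indices):
    """Losslessly encode sorted on-bit indices as gamma-coded gaps (each gap >= 1)."""
    bits = []
    previous = -1
    for i in sorted(set(indices)):
        bits.append(elias_gamma_encode(i - previous))
        previous = i
    return "".join(bits)

def decode_gaps(bits):
    """Invert encode_gaps and recover the sorted on-bit indices."""
    indices = []
    pos = 0
    previous = -1
    while pos < len(bits):
        zeros = 0
        while bits[pos] == "0":  # count the unary length prefix
            zeros += 1
            pos += 1
        gap = int(bits[pos:pos + zeros + 1], 2)
        pos += zeros + 1
        previous += gap
        indices.append(previous)
    return indices
```

Folding can map distinct features to the same position, whereas `decode_gaps(encode_gaps(x))` returns the original sorted index set, which is why exact similarity can be computed after decoding.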
11.
Biophys J ; 91(10): 3617-29, 2006 Nov 15.
Article in English | MEDLINE | ID: mdl-16950837

ABSTRACT

We have recently shown that current molecular dynamics (MD) atomic force fields are not yet able to produce lipid bilayer structures that agree with experimentally determined structures within experimental errors. Because of the many advantages offered by experimentally validated simulations, we have developed a novel restraint method for membrane MD simulations that uses experimental diffraction data. The restraints, introduced into the MD force field, act upon specified groups of atoms to restrain their mean positions and widths to values determined experimentally. The method was first tested using a simple liquid argon system, and then applied to a neat dioleoylphosphatidylcholine (DOPC) bilayer at 66% relative humidity and to the same bilayer containing the peptide melittin. Application of experiment-based restraints to the transbilayer double-bond and water distributions of neat DOPC bilayers led to distributions that agreed with the experimental values. Based upon the experimental structure, the restraints improved the simulated structure in some regions while introducing larger differences in others, as might be expected from imperfect force fields. For the DOPC-melittin system, the experimental transbilayer distribution of melittin was used as a restraint. The addition of the peptide caused perturbations of the simulated bilayer structure, but these were larger than observed experimentally. The melittin distribution of the simulation could be fit accurately to a Gaussian with parameters close to the observed ones, indicating that the restraints can be used to produce an ensemble of membrane-bound peptide conformations that are consistent with experiments. Such ensembles pave the way for understanding peptide-bilayer interactions at the atomic level.


Subject(s)
Lipid Bilayers/chemistry , Melitten/chemistry , Membrane Proteins/chemistry , Models, Chemical , Models, Molecular , Phosphatidylcholines/chemistry , Birefringence , Computer Simulation , Peptides/chemistry , Stress, Mechanical
12.
Biophys J ; 88(2): 805-17, 2005 Feb.
Article in English | MEDLINE | ID: mdl-15533925

ABSTRACT

A novel protocol has been developed for comparing the structural properties of lipid bilayers determined by simulation with those determined by diffraction experiments, which makes it possible to test critically the ability of molecular dynamics simulations to reproduce experimental data. This model-independent method consists of analyzing data from molecular dynamics bilayer simulations in the same way as experimental data by determining the structure factors of the system and, via Fourier reconstruction, the overall transbilayer scattering-density profiles. Multi-nanosecond molecular dynamics simulations of a dioleoylphosphatidylcholine bilayer at 66% RH (5.4 waters/lipid) were performed in the constant pressure and temperature ensemble using the united-atom GROMACS and the all-atom CHARMM22/27 force fields with the GROMACS and NAMD software packages, respectively. The quality of the simulated bilayer structures was evaluated by comparing simulation with experimental results for bilayer thickness, area/lipid, individual molecular-component distributions, continuous and discrete structure factors, and overall scattering-density profiles. Neither the GROMACS nor the CHARMM22/27 simulations reproduced experimental data within experimental error. The widths of the simulated terminal methyl distributions showed a particularly strong disagreement with the experimentally observed distributions. A comparison of the older CHARMM22 with the newer CHARMM27 force fields shows that significant progress is being made in the development of atomic force fields for describing lipid bilayer systems empirically.


Subject(s)
Crystallography/methods , Lipid Bilayers/chemistry , Membrane Fluidity , Models, Chemical , Models, Molecular , Phosphatidylcholines/chemistry , Computer Simulation , Elasticity , Molecular Conformation , Software , Stress, Mechanical
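
The abstract of entry 12 describes computing structure factors from a simulated bilayer and rebuilding the transbilayer profile by Fourier reconstruction. The sketch below shows the textbook cosine-series version of that round trip for a centrosymmetric one-dimensional profile with repeat distance `d`. The abstract gives no formulas, so the normalization, the midpoint sampling, and the function names are all assumptions, and the reconstruction returns only the fluctuation about the mean density.

```python
import math

def structure_factors(profile, d, h_max):
    """Cosine-transform structure factors F(1)..F(h_max).

    Assumptions: `profile` holds density samples at evenly spaced midpoints
    across one repeat from -d/2 to d/2, and the profile is symmetric about
    z = 0, so only cosine terms are needed.
    """
    n = len(profile)
    dz = d / n
    factors = []
    for h in range(1, h_max + 1):
        total = 0.0
        for k, rho in enumerate(profile):
            z = -d / 2 + (k + 0.5) * dz
            total += rho * math.cos(2 * math.pi * h * z / d) * dz
        factors.append(total)
    return factors

def reconstruct(factors, d, z):
    """Fourier reconstruction of the density fluctuation at position z."""
    return (2 / d) * sum(
        f * math.cos(2 * math.pi * h * z / d)
        for h, f in enumerate(factors, start=1)
    )
```

Truncating at `h_max` limits the resolution of the reconstructed profile.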