Results 1 - 20 of 75
1.
J Proteome Res ; 2024 May 21.
Article in English | MEDLINE | ID: mdl-38770571

ABSTRACT

Peptide identification is important in bottom-up proteomics. Post-translational modifications (PTMs) are crucial in regulating cellular activities. Many database search methods have been developed to identify peptides with PTMs and characterize the PTM patterns. However, the PTMs on peptides lower the peptide identification rate and the PTM characterization precision, especially for peptides with multiple PTMs. To address this issue, we present a sensitive open search engine, PIPI2, with much better performance on peptides with multiple PTMs than other methods. With a greedy approach, we simplify the PTM characterization problem into a linear one, which enables characterizing multiple PTMs on one peptide. On simulated data sets with up to four PTMs per peptide, PIPI2 identified over 90% of the spectra, at least 56% more than five other competitors. PIPI2 also characterized these PTM patterns with the highest precision of 77%, demonstrating a significant advantage in handling peptides with multiple PTMs. In real applications, PIPI2 identified 30% to 88% more peptides with PTMs than its competitors.

2.
Comput Biol Med ; 175: 108533, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38714050

ABSTRACT

Bone proliferation is an important pathological feature of inflammatory rheumatic diseases. Although recent advances in high-resolution peripheral quantitative computed tomography (HR-pQCT) enable physicians to study microarchitectures, physicians' annotation of proliferation suffers from slice inconsistency and subjective variations. Also, there are only a few effective automatic or semi-automatic tools for proliferation detection. In this study, by integrating pathological knowledge of proliferation formation with the advancement of statistical shape analysis theory, we present an unsupervised method, named Deformation-Controllable Elastic Shape model, for 3D bone Proliferation Analysis (DCES-PA). Unlike previous shape analysis methods that directly regularize the smoothness of the displacement field, DCES-PA regularizes the first- and second-order derivatives of the displacement field and decomposes these vector fields according to different deformations. For the first-order elastic metric, DCES-PA orthogonally decomposes the first-order derivative of the displacement field into shearing, scaling and bending deformations, and then penalizes deformations triggering proliferation formation. For the second-order elastic metric, DCES-PA encodes both intrinsic and extrinsic surface curvatures into the second-order derivative of the displacement field to control the generation of high-curvature regions. By integrating the elastic shape metric with the varifold distances, DCES-PA achieves correspondence-free shape analysis. Extensive experiments on both simulated and real clinical datasets demonstrate that DCES-PA not only achieves higher accuracy than other state-of-the-art shape-based methods applied to proliferation analysis but also produces highly sensitive proliferation annotations to assist physicians in proliferation analysis.


Subjects
Three-Dimensional Imaging, X-Ray Computed Tomography, Humans, X-Ray Computed Tomography/methods, Three-Dimensional Imaging/methods, Bone and Bones/diagnostic imaging, Hand/diagnostic imaging, Female, Male, Cell Proliferation
3.
Diabetologia ; 67(5): 837-849, 2024 May.
Article in English | MEDLINE | ID: mdl-38413437

ABSTRACT

AIMS/HYPOTHESIS: The aim of this study was to describe the metabolome in diabetic kidney disease (DKD) and its association with incident CVD in type 2 diabetes, and identify prognostic biomarkers. METHODS: From a prospective cohort of individuals with type 2 diabetes, baseline sera (N=1991) were quantified for 170 metabolites using NMR spectroscopy with median 5.2 years of follow-up. Associations of chronic kidney disease (CKD, eGFR<60 ml/min per 1.73 m²) or severely increased albuminuria with each metabolite were examined using linear regression, adjusted for confounders and multiplicity. Associations between DKD (CKD or severely increased albuminuria)-related metabolites and incident CVD were examined using Cox regressions. Metabolomic biomarkers were identified and assessed for CVD prediction and replicated in two independent cohorts. RESULTS: At false discovery rate (FDR)<0.05, 156 metabolites were associated with DKD (151 for CKD and 128 for severely increased albuminuria), including apolipoprotein B-containing lipoproteins, HDL, fatty acids, phenylalanine, tyrosine, albumin and glycoprotein acetyls. Over 5.2 years of follow-up, 75 metabolites were associated with incident CVD at FDR<0.05. A model comprising age, sex and three metabolites (albumin, triglycerides in large HDL and phospholipids in small LDL) performed comparably to conventional risk factors (C statistic 0.765 vs 0.762, p=0.893) and adding the three metabolites further improved CVD prediction (C statistic from 0.762 to 0.797, p=0.014) and improved discrimination and reclassification. The 3-metabolite score was validated in independent Chinese and Dutch cohorts. CONCLUSIONS/INTERPRETATION: Altered metabolomic signatures in DKD are associated with incident CVD and improve CVD risk stratification.
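The FDR<0.05 thresholds in this abstract imply a multiple-testing adjustment across the 170 metabolites. The abstract does not name the procedure; the sketch below assumes the widely used Benjamini-Hochberg step-up adjustment, and the function name `benjamini_hochberg` is a hypothetical helper, not something taken from the paper.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: the study's FDR control is BH; the abstract does not say so.
    """
    m = len(pvalues)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is the
    # minimum of p * m / rank over this rank and all higher ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up p-values for four hypothetical metabolites.
toy_p = [0.01, 0.04, 0.03, 0.005]
significant = [q < 0.05 for q in benjamini_hochberg(toy_p)]
```

A metabolite would then be reported as associated when its adjusted value falls below 0.05, which is how a count such as "156 metabolites at FDR<0.05" could be obtained.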


Subjects
Cardiovascular Diseases, Type 2 Diabetes Mellitus, Diabetic Nephropathies, Chronic Renal Insufficiency, Humans, Diabetic Nephropathies/metabolism, Cardiovascular Diseases/complications, Prospective Studies, Hong Kong/epidemiology, Albuminuria, Biological Specimen Banks, Glomerular Filtration Rate, Biomarkers, Albumins
4.
Mol Cell Proteomics ; 23(3): 100738, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38364992

ABSTRACT

Wind is one of the most prevalent environmental forces entraining plants to develop various mechano-responses, collectively called thigmomorphogenesis. Largely unknown is how plants transduce these versatile wind force signals downstream to nuclear events and to the development of thigmomorphogenic phenotype or anemotropic response. To identify molecular components at the early steps of the wind force signaling, two mechanical signaling-related phosphoproteins, identified from our previous phosphoproteomic study of Arabidopsis touch response, mitogen-activated protein kinase kinase 1 (MKK1) and 2 (MKK2), were selected for performing in planta TurboID (ID)-based quantitative proximity-labeling (PL) proteomics. This quantitative biotinylproteomics was performed separately on MKK1-ID and MKK2-ID transgenic plants, using the genetically engineered TurboID biotin ligase expression transgenics as a universal control. This unique PTM proteomics successfully identified 11 and 71 MKK1 and MKK2 putative interactors, respectively. Biotin occupancy ratio (BOR) was found to be an alternative parameter to measure the extent of proximity and specificity between the proximal target proteins and the bait fusion protein. Bioinformatics analysis of these biotinylprotein data also found that TurboID biotin ligase favorably labels the loop region of target proteins. A WInd-Related Kinase 1 (WIRK1), previously known as rapidly accelerated fibrosarcoma (Raf)-like kinase 36 (RAF36), was found to be a putative common interactor for both MKK1 and MKK2 and preferentially interacts with MKK2. Further molecular biology studies of the Arabidopsis RAF36 kinase found that it plays a role in wind regulation of the touch-responsive TCH3 and CML38 gene expression and the phosphorylation of a touch-regulated PATL3 phosphoprotein. Measurement of leaf morphology and shoot gravitropic response of the wirk1 (raf36) mutant revealed that the WIRK1 gene is involved in both wind-triggered rosette thigmomorphogenesis and gravitropism of Arabidopsis stems, suggesting that the WIRK1 (RAF36) protein probably functions upstream of both MKK1 and MKK2 and that it may serve as the crosstalk point among multiple mechano-signal transduction pathways mediating both wind mechano-response and gravitropism.


Subjects
Arabidopsis Proteins, Arabidopsis, Arabidopsis/genetics, Arabidopsis/metabolism, Gravitropism, Biotin/metabolism, Wind, Arabidopsis Proteins/genetics, Arabidopsis Proteins/metabolism, Phosphoproteins/metabolism, Ligases/metabolism, Calmodulin/metabolism
5.
BMC Bioinformatics ; 24(1): 351, 2023 Sep 20.
Article in English | MEDLINE | ID: mdl-37730532

ABSTRACT

BACKGROUND: Cross-linking mass spectrometry (XL-MS) is a powerful technique for detecting protein-protein interactions (PPIs) and modeling protein structures in a high-throughput manner. In XL-MS experiments, proteins are cross-linked by a chemical reagent (the cross-linker), fragmented, and then analyzed by tandem mass spectrometry (MS/MS). Cross-linkers are either cleavable or non-cleavable, and each type requires distinct data analysis tools. However, both types of cross-linkers suffer from imbalanced fragmentation efficiency, resulting in a large number of unidentifiable spectra that hinder the discovery of PPIs and protein conformations. To address this challenge, researchers have sought to improve the sensitivity of XL-MS through invention of novel cross-linking reagents, optimization of sample preparation protocols, and development of data analysis algorithms. One promising approach to developing new data analysis methods is to apply a protein feedback mechanism in the analysis. It has significantly improved the sensitivity of analysis methods for cleavable cross-linking data. The application of the protein feedback mechanism to the analysis of non-cleavable cross-linking data is expected to have an even greater impact because the majority of XL-MS experiments currently employ non-cleavable cross-linkers. RESULTS: In this study, we applied the protein feedback mechanism to the analysis of both non-cleavable and cleavable cross-linking data and observed a substantial improvement in cross-link spectrum matches (CSMs) compared to conventional methods. Furthermore, we developed a new software program, ECL 3.0, that integrates two algorithms and includes a user-friendly graphical interface to facilitate wider applications of this new program. CONCLUSIONS: ECL 3.0 source code is available at https://github.com/yuweichuan/ECL-PF.git . A quick tutorial is available at https://youtu.be/PpZgbi8V2xI .


Subjects
Peptides, Tandem Mass Spectrometry, Algorithms, Cross-Linking Reagents, Data Analysis
6.
Diabetes Care ; 46(6): 1271-1281, 2023 06 01.
Article in English | MEDLINE | ID: mdl-37125963

ABSTRACT

OBJECTIVE: In this study we aim to unravel genetic determinants of coronary heart disease (CHD) in type 2 diabetes (T2D) and explore their applications. RESEARCH DESIGN AND METHODS: We performed a two-stage genome-wide association study for CHD in Chinese patients with T2D (3,596 case and 8,898 control subjects), followed by replications in European patients with T2D (764 case and 4,276 control subjects) and general populations (n = 51,442-547,261). Each identified variant was examined for its association with a wide range of phenotypes and its interactions with glycemic, blood pressure (BP), and lipid controls in incident cardiovascular diseases. RESULTS: We identified a novel variant (rs10171703) for CHD (odds ratio 1.21 [95% CI 1.13-1.30]; P = 2.4 × 10⁻⁸) and BP (β ± SE 0.130 ± 0.017; P = 4.1 × 10⁻¹⁴) at PDE1A in Chinese T2D patients but found only a modest association with CHD in general populations. This variant modulated the effects of BP goal attainment (130/80 mmHg) on CHD (P_interaction = 0.0155) and myocardial infarction (MI) (P_interaction = 5.1 × 10⁻⁴). Patients with CC genotype of rs10171703 had >40% reduction in either cardiovascular event in response to BP control (2.9 × 10⁻⁸ < P < 3.6 × 10⁻⁵), those with CT genotype had no difference (0.0726 < P < 0.2614), and those with TT genotype had a threefold increase in MI risk (P = 6.7 × 10⁻³). CONCLUSIONS: We discovered a novel CHD- and BP-related variant at PDE1A that interacted with BP goal attainment with divergent effects on CHD risk in Chinese patients with T2D. Incorporating this information may facilitate individualized treatment strategies for precision care in diabetes, but only once our findings are validated.


Subjects
Coronary Disease, Type 1 Cyclic Nucleotide Phosphodiesterases, Type 2 Diabetes Mellitus, Myocardial Infarction, Humans, Coronary Disease/genetics, Type 2 Diabetes Mellitus/complications, East Asian People, Genome-Wide Association Study, Goals, Myocardial Infarction/complications, Myocardial Infarction/genetics, Single Nucleotide Polymorphism, Risk Assessment, Risk Factors, Type 1 Cyclic Nucleotide Phosphodiesterases/genetics
7.
Comput Med Imaging Graph ; 106: 102200, 2023 06.
Article in English | MEDLINE | ID: mdl-36857951

ABSTRACT

Rheumatoid arthritis (RA) is a chronic inflammatory disease. It leads to bone erosion in joints and other complications, which severely affect patients' quality of life. To accurately diagnose and monitor the progression of RA, quantitative imaging and analysis tools are desirable. High-resolution peripheral quantitative computed tomography (HR-pQCT) is one such promising tool for monitoring disease progression in RA. However, automatic erosion detection tools using HR-pQCT images are not yet available. Inspired by the consensus among radiologists on the erosions in HR-pQCT images, in this paper we define erosions as the significant concave regions on the cortical layer, and develop a model-based 3D automatic erosion detection method. It mainly consists of two steps: constructing a closed cortical surface, and detecting erosion regions on the surface. In the first step, we propose an initialization-robust region competition method for joint segmentation, and then fill the surface gaps by using joint bone separation and curvature-based surface alignment. In the second step, we analyze the curvature information of each voxel, and then aggregate the candidate voxels into concave surface regions and use the shape information of the regions to detect the erosions. We perform qualitative assessments of the new method using 59 well-annotated joint volumes. Our method has shown satisfactory and consistent performance compared with the annotations provided by medical experts.


Subjects
Rheumatoid Arthritis, Quality of Life, Humans, X-Ray Computed Tomography/methods, Rheumatoid Arthritis/diagnostic imaging, Hand
8.
Mol Plant ; 16(5): 930-961, 2023 05 01.
Article in English | MEDLINE | ID: mdl-36960533

ABSTRACT

Nuclear proteins are major constituents and key regulators of nucleome topological organization and manipulators of nuclear events. To decipher the global connectivity of nuclear proteins and the hierarchically organized modules of their interactions, we conducted two rounds of cross-linking mass spectrometry (XL-MS) analysis, one of which followed a quantitative double chemical cross-linking mass spectrometry (in vivo qXL-MS) workflow, and identified 24,140 unique crosslinks in total from the nuclei of soybean seedlings. This in vivo quantitative interactomics enabled the identification of 5340 crosslinks that can be converted into 1297 nuclear protein-protein interactions (PPIs), 1220 (94%) of which were non-confirmative (or novel) nuclear PPIs compared with those in repositories. There were 250 and 26 novel interactors of histones and the nucleolar box C/D small nucleolar ribonucleoprotein complex, respectively. Modulomic analysis of orthologous Arabidopsis PPIs produced 27 and 24 master nuclear PPI modules (NPIMs) that contain the condensate-forming protein(s) and the intrinsically disordered region-containing proteins, respectively. These NPIMs successfully captured previously reported nuclear protein complexes and nuclear bodies in the nucleus. Surprisingly, these NPIMs were hierarchically assorted into four higher-order communities in a nucleomic graph, including genome and nucleolus communities. This combinatorial pipeline of 4C quantitative interactomics and PPI network modularization revealed 17 ethylene-specific module variants that participate in a broad range of nuclear events. The pipeline was able to capture both nuclear protein complexes and nuclear bodies, construct the topological architectures of PPI modules and module variants in the nucleome, and probably map the protein compositions of biomolecular condensates.


Subjects
Arabidopsis, Cell Nucleus, Arabidopsis/genetics, Arabidopsis/metabolism, Mass Spectrometry, Nuclear Proteins/metabolism
9.
J Proteome Res ; 22(1): 101-113, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36480279

ABSTRACT

Improving the sensitivity of protein-protein interaction detection and protein structure probing is a principal challenge in cross-linking mass spectrometry (XL-MS) data analysis. In this paper, we propose an exhaustive cross-linking search method with protein feedback (ECL-PF) for cleavable XL-MS data analysis. ECL-PF adopts an optimized α/β mass detection scheme and establishes protein-peptide association during the identification of cross-linked peptides. Existing major scoring functions can all benefit from the ECL-PF workflow to a great extent. In comparisons using synthetic data sets and hybrid simulated data sets, ECL-PF achieved 3-fold higher sensitivity over standard techniques. In experiments using real data sets, it also identified 65.6% more cross-link spectrum matches and 48.7% more unique cross-links.


Subjects
Peptides, Proteins, Feedback, Proteins/chemistry, Peptides/analysis, Mass Spectrometry/methods, Cross-Linking Reagents/chemistry
10.
Cardiovasc Diabetol ; 21(1): 293, 2022 12 31.
Article in English | MEDLINE | ID: mdl-36587202

ABSTRACT

OBJECTIVE: High-density lipoproteins (HDL) comprise particles of different size, density and composition and their vasoprotective functions may differ. Diabetes modifies the composition and function of HDL. We assessed associations of HDL size-based subclasses with incident cardiovascular disease (CVD) and mortality and their prognostic utility. RESEARCH DESIGN AND METHODS: HDL subclasses by nuclear magnetic resonance spectroscopy were determined in sera from 1991 fasted adults with type 2 diabetes (T2D) consecutively recruited from March 2014 to February 2015 in Hong Kong. HDL was divided into small, medium, large and very large subclasses. Associations (per SD increment) with outcomes were evaluated using multivariate Cox proportional hazards models. C-statistic, integrated discrimination index (IDI), and categorical and continuous net reclassification improvement (NRI) were used to assess predictive value. RESULTS: Over median (IQR) 5.2 (5.0-5.4) years, 125 participants developed incident CVD and 90 participants died. Small HDL particles (HDL-P) were inversely associated with incident CVD [hazard ratio (HR) 0.65 (95% CI 0.52, 0.81)] and all-cause mortality [0.47 (0.38, 0.59)] (false discovery rate < 0.05). Very large HDL-P were positively associated with all-cause mortality [1.75 (1.19, 2.58)]. Small HDL-P improved prediction of mortality [C-statistic 0.034 (0.013, 0.055), IDI 0.052 (0.014, 0.103), categorical NRI 0.156 (0.006, 0.252), and continuous NRI 0.571 (0.246, 0.851)] and CVD [IDI 0.017 (0.003, 0.038) and continuous NRI 0.282 (0.088, 0.486)] over the RECODe model. CONCLUSION: Small HDL-P were inversely associated with incident CVD and all-cause mortality and improved risk stratification for adverse outcomes in people with T2D. HDL-P may be used as markers for residual risk in people with T2D.


Subjects
Cardiovascular Diseases, Type 2 Diabetes Mellitus, Adult, Humans, Type 2 Diabetes Mellitus/diagnosis, Biological Specimen Banks, Hong Kong/epidemiology, Risk Factors, HDL Lipoproteins, HDL Cholesterol
11.
Front Cell Dev Biol ; 10: 854640, 2022.
Article in English | MEDLINE | ID: mdl-35493102

ABSTRACT

Background: Structural variations (SVs) are common genetic alterations in the human genome that could cause different phenotypes and diseases, including cancer. However, the detection of structural variations using second-generation sequencing was limited by its short read length, which restrained our understanding of structural variations. Methods: In this study, we developed a 28-gene panel for long-read sequencing and applied it on the Oxford Nanopore Technologies and Pacific Biosciences platforms. We analyzed structural variations in the 28 breast cancer-related genes through long-read genomic and transcriptomic sequencing of tumor, para-tumor, and blood samples in 19 breast cancer patients. Results: Our results showed that some somatic SVs were recurring among the selected genes, though the majority of them occurred in non-exonic regions. We found evidence supporting the existence of hotspot regions for SVs, which extended our previous understanding that they exist only for single nucleotide variations. Conclusion: In conclusion, we employed long-read genomic and transcriptomic sequencing to identify SVs from breast cancer patients and showed that this approach holds great potential in clinical application.

12.
Neurosci Bull ; 38(9): 1057-1068, 2022 Sep.
Article in English | MEDLINE | ID: mdl-35639276

ABSTRACT

In animal experiments, ischemic stroke is usually induced through middle cerebral artery occlusion (MCAO), and quality assessment of this procedure is crucial. However, an accurate assessment method based on 18F-fluorodeoxyglucose (FDG) positron emission tomography (PET) is still lacking. The difficulty lies in the inconsistent preprocessing pipeline, biased intensity normalization, or unclear spatiotemporal uptake of FDG. Here, we propose an image feature-based protocol to assess the quality of the procedure using a 3D scale-invariant feature transform and support vector machine. This feature-based protocol provides a convenient, accurate, and reliable tool to assess the quality of the MCAO procedure in FDG PET studies. Compared with existing approaches, the proposed protocol is fully quantitative, objective, automatic, and bypasses the intensity normalization step. An online interface was constructed to check images and obtain assessment results.


Subjects
Fluorodeoxyglucose F18, Middle Cerebral Artery Infarction, Animals, Middle Cerebral Artery Infarction/diagnostic imaging, Positron-Emission Tomography/methods
13.
Biomolecules ; 11(8)2021 08 16.
Article in English | MEDLINE | ID: mdl-34439883

ABSTRACT

Isotopic dimethyl labeling was applied in a quantitative post-translational modification (PTM) proteomic study of phosphoproteomic changes in the drought responses of two contrasting soybean cultivars. A total of 9457 phosphopeptides were identified subsequently, corresponding to 4571 phosphoprotein groups and 3889 leading phosphoproteins, which contained nine kinase families consisting of 279 kinases. These phosphoproteins contained a total of 8087 phosphosites, 6106 of which were newly identified and constituted 54% of the current soybean phosphosite repository. These phosphosites were converted into the highly conserved kinase docking sites by bioinformatics analysis, which predicted six kinase families that matched with those newly found nine kinase families. The overly post-translationally modified proteins (OPPs) make up 2.1% of these leading phosphoproteins. Most of these OPPs are photoreceptors, mRNA-, histone-, and phospholipid-binding proteins, as well as protein kinase/phosphatases. The subgroup population distribution of phosphoproteins over the number of phosphosites of phosphoproteins follows the exponential decay law, Y = 4.13e^(-0.098X) - 0.04. Out of 218 significantly regulated unique phosphopeptide groups, 188 phosphoproteins were regulated by the drought-tolerant cultivar under the water loss condition. These significantly regulated phosphoproteins (SRPs) are mainly enriched in the biological functions of water transport and deprivation, methionine metabolic processes, photosynthesis/light reaction, and response to cadmium ion, osmotic stress, and ABA response. Seventeen and 15 SRPs are protein kinases/phosphatases and transcription factors, respectively. Bioinformatics analysis again revealed that three members of the calcium-dependent protein kinase family (CAMK family), GmSRK2I, GmCIPK25, and GmAKINβ1 kinases, constitute a phosphor-relay-mediated signal transduction network, regulating ion channel activities and many nuclear events in this drought-tolerant cultivar, which presumably contributes to the development of the soybean drought tolerance under water deprivation process.
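The exponential decay law quoted in this abstract can be evaluated directly. The sketch below assumes the flattened formula reads Y = 4.13·exp(-0.098·X) - 0.04, with X the number of phosphosites per phosphoprotein; the abstract does not state the units of Y, and the names `fitted_y` and `halving_interval` are hypothetical.

```python
import math

# Constants as read from the abstract; the exponent placement is an assumption.
A, K, C = 4.13, 0.098, -0.04


def fitted_y(x):
    """Evaluate Y = A * exp(-K * x) + C for x phosphosites per protein."""
    return A * math.exp(-K * x) + C


# Number of additional phosphosites over which the decaying term halves.
halving_interval = math.log(2) / K
```

Under that reading, the exponential term halves roughly every seven additional phosphosites, which is one way to summarize how quickly heavily phosphorylated proteins become rare.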


Subjects
Glycine max/metabolism, Phosphoproteins/metabolism, Proteome/metabolism, Soybean Proteins/metabolism, Droughts, Osmotic Pressure, Phosphorylation
14.
Genome Med ; 13(1): 29, 2021 02 19.
Article in English | MEDLINE | ID: mdl-33608049

ABSTRACT

BACKGROUND: The clinical utility of personal genomic information in identifying individuals at increased risks for dyslipidemia and cardiovascular diseases remains unclear. METHODS: We used data from Biobank Japan (n = 70,657-128,305) and developed novel East Asian-specific genome-wide polygenic risk scores (PRSs) for four lipid traits. We validated (n = 4271) and subsequently tested associations of these scores with 3-year lipid changes in adolescents (n = 620), carotid intima-media thickness (cIMT) in adult women (n = 781), dyslipidemia (n = 7723), and coronary heart disease (CHD) (n = 2374 cases and 6246 controls) in type 2 diabetes (T2D) patients. RESULTS: Our PRSs aggregating 84-549 genetic variants (0.251 < correlation coefficients (r) < 0.272) had comparably stronger association with lipid variations than the typical PRSs derived based on the genome-wide significant variants (0.089 < r < 0.240). Our PRSs were robustly associated with their corresponding lipid levels (7.5 × 10⁻¹⁰³ < P < 1.3 × 10⁻⁷⁵) and 3-year lipid changes (1.4 × 10⁻⁶ < P < 0.0130) which started to emerge in childhood and adolescence. With the adjustments for principal components (PCs), sex, age, and body mass index, there was an elevation of 5.3% in TC (β ± SE = 0.052 ± 0.002), 11.7% in TG (β ± SE = 0.111 ± 0.006), 5.8% in HDL-C (β ± SE = 0.057 ± 0.003), and 8.4% in LDL-C (β ± SE = 0.081 ± 0.004) per one standard deviation increase in the corresponding PRS. However, their predictive power was attenuated in T2D patients (0.183 < r < 0.231). When we included each PRS (for TC, TG, and LDL-C) in addition to the clinical factors and PCs, the AUC for dyslipidemia was significantly increased by 0.032-0.057 in the general population (7.5 × 10⁻³ < P < 0.0400) and 0.029-0.069 in T2D patients (2.1 × 10⁻¹⁰ < P < 0.0428). Moreover, the quintile of TC-related PRS was moderately associated with cIMT in adult women (β ± SE = 0.011 ± 0.005, P_trend = 0.0182). Independent of conventional risk factors, the quintile of PRSs for TC [OR (95% CI) = 1.07 (1.03-1.11)], TG [OR (95% CI) = 1.05 (1.01-1.09)], and LDL-C [OR (95% CI) = 1.05 (1.01-1.09)] were significantly associated with increased risk of CHD in T2D patients (4.8 × 10⁻⁴ < P < 0.0197). Further adjustment for baseline lipid drug use notably attenuated the CHD association. CONCLUSIONS: The PRSs derived and validated here highlight the potential for early genomic screening and personalized risk assessment for cardiovascular disease.
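A polygenic risk score of the kind described here is conventionally a weighted sum of risk-allele dosages, standardized so that effects can be quoted per standard deviation. The sketch below illustrates that general construction only; it is not the authors' pipeline, and the function names are hypothetical. It also shows that the reported percentage elevations for TC, TG, and LDL-C match exp(β) - 1, which suggests (as an inference from the numbers, not a statement in the abstract) that the lipid traits were modeled on a natural-log scale.

```python
import math
from statistics import mean, pstdev


def polygenic_score(dosages, weights):
    """Weighted sum of allele dosages (0, 1, or 2) for one individual."""
    return sum(d * w for d, w in zip(dosages, weights))


def standardize(values):
    """Rescale scores to mean 0 and standard deviation 1 across a cohort."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]


def percent_change(beta):
    """Percent change per 1-SD increase, assuming a log-scale outcome."""
    return (math.exp(beta) - 1.0) * 100.0


# Betas quoted in the abstract for TC, TG, and LDL-C.
reported = {"TC": 0.052, "TG": 0.111, "LDL-C": 0.081}
elevations = {trait: round(percent_change(b), 1) for trait, b in reported.items()}
```

For HDL-C the same conversion of β = 0.057 gives a value slightly above the reported 5.8%, which would be consistent with the published β having been rounded.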


Subjects
Asian People/genetics, Atherosclerosis/genetics, Diabetic Cardiomyopathies/genetics, Dyslipidemias/genetics, Genetic Predisposition to Disease, Genome-Wide Association Study, Lipids/blood, Multifactorial Inheritance/genetics, Adolescent, Adult, Atherosclerosis/blood, Carotid Intima-Media Thickness, Coronary Disease/genetics, Type 2 Diabetes Mellitus/genetics, Diabetic Cardiomyopathies/blood, Dyslipidemias/blood, Female, Humans, Risk Factors
15.
IEEE/ACM Trans Comput Biol Bioinform ; 18(6): 2884-2890, 2021.
Article in English | MEDLINE | ID: mdl-32356758

ABSTRACT

Peptide identification from tandem mass spectrometry data is a fundamental task in computational proteomics. Traditional algorithms perform well when facing unmodified peptides. However, when peptides have post-translational modifications (PTMs), these methods cannot provide satisfactory results. Recently, open search methods have been proposed to identify peptides with PTMs. While the performance of these new methods is promising, the identification results vary greatly with respect to the quality of tandem mass spectra and the number of PTMs in peptides. This motivates us to systematically study the relationship between the performance of open search methods and the quality parameters of tandem mass spectrometry data as well as the number of PTMs in peptides. In this paper, we have proposed an analytical model derived from simulated data to describe the relationship between the probability of obtaining correct results and the spectrum quality as well as the number of PTMs. The proposed model is verified using 1,464,146 real experimental spectra. The consistent trend observed in both simulated data and real data reveals the necessary conditions to effectively apply open search methods. Source code of our study is available at http://bioinformatics.ust.hk/PST.html.


Subjects
Peptides/chemistry, Post-Translational Protein Processing, Proteomics/methods, Algorithms, Computer Simulation, Protein Databases, Tandem Mass Spectrometry
16.
Brief Bioinform ; 21(4): 1448-1454, 2020 07 15.
Article in English | MEDLINE | ID: mdl-31267129

ABSTRACT

For genome-wide CRISPR off-target cleavage sites (OTS) prediction, an important issue is data imbalance: the number of true OTS recognized by whole-genome off-target detection techniques is much smaller than that of all possible nucleotide mismatch loci, making the training of machine learning models very challenging. Therefore, computational models proposed for OTS prediction and scoring should be carefully designed and properly evaluated in order to avoid bias. In our study, two tools are taken as examples to further emphasize the data imbalance issue in CRISPR off-target prediction to achieve better sensitivity and specificity for optimized CRISPR gene editing. We point out that (1) the benchmark of CRISPR off-target prediction should be properly evaluated and not overestimated by considering the data imbalance issue; (2) incorporation of efficient computational techniques (including ensemble learning and data synthesis techniques) can help to address the data imbalance issue and improve the performance of CRISPR off-target prediction. Taken together, we call for more efforts to address the data imbalance issue in CRISPR off-target prediction to facilitate clinical utility of CRISPR-based gene editing techniques.
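The evaluation concern raised in this abstract can be made concrete with a toy calculation. The sketch below uses made-up labels (10 true off-target sites among 1,000 candidate loci, not data from the paper) and a hypothetical helper `evaluate` to show how a predictor that never reports an off-target site still scores 99% accuracy while recovering none of the true sites.

```python
def evaluate(y_true, y_pred):
    """Return accuracy, precision, and recall for binary labels (1 = true OTS)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Report 0.0 when the denominator is empty (a convention, not a rule).
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Made-up toy data: 10 true off-target sites among 1,000 candidate loci.
labels = [1] * 10 + [0] * 990
always_negative = [0] * 1000  # a predictor that never reports an off-target site
report = evaluate(labels, always_negative)
```

This is why imbalance-aware measures such as precision and recall tend to be more informative than raw accuracy for benchmarks of this kind.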


Subjects
Clustered Regularly Interspaced Short Palindromic Repeats, Gene Editing/methods, Machine Learning
17.
Bioinformatics ; 35(2): 251-257, 2019 01 15.
Article in English | MEDLINE | ID: mdl-30649350

ABSTRACT

Motivation: Cross-linking technique coupled with mass spectrometry (MS) is widely used in the analysis of protein structures and protein-protein interactions. In order to identify cross-linked peptides from MS data, we need to consider all pairwise combinations of peptides, which is computationally prohibitive when the sequence database is large. To alleviate this problem, some heuristic screening strategies are used to reduce the number of peptide pairs during the identification. However, heuristic screening strategies may miss some true cross-linked peptides. Results: We directly tackle the combination challenge without using any screening strategies. With the data structure of double-ended queue, the proposed algorithm reduces the quadratic time complexity of exhaustive searching down to the linear time complexity. We implement the algorithm in a tool named Xolik. The running time of Xolik is validated using databases with different numbers of proteins. Experiments using synthetic and empirical datasets show that Xolik outperforms existing tools in terms of running time and statistical power. Availability and implementation: Source code and binaries of Xolik are freely available at http://bioinformatics.ust.hk/Xolik.html. Supplementary information: Supplementary data are available at Bioinformatics online.
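The abstract states only that a double-ended queue brings the exhaustive pair search from quadratic to linear time. One standard way to get that behavior is a sliding-window maximum over mass-sorted candidates, sketched below. This is an illustration under assumptions, not Xolik's actual code: the function `best_pair`, its parameters, and the simplification that `target` already excludes the cross-linker mass are all hypothetical.

```python
from collections import deque


def best_pair(masses, scores, target, tol):
    """Return (score_sum, i, j) for the best-scoring pair with
    |masses[i] + masses[j] - target| <= tol, or None if no pair fits.

    `masses` must be sorted in ascending order; a peptide may pair with itself.
    """
    window = deque()       # candidate partners j, best score at the front
    nxt = len(masses) - 1  # next partner index to admit, scanning downward
    best = None
    for i, mass_i in enumerate(masses):
        lo = target - tol - mass_i  # lightest acceptable partner
        hi = target + tol - mass_i  # heaviest acceptable partner
        # Admit partners that are now heavy enough; drop dominated ones.
        while nxt >= 0 and masses[nxt] >= lo:
            while window and scores[window[-1]] <= scores[nxt]:
                window.pop()
            window.append(nxt)
            nxt -= 1
        # Expire partners that have become too heavy.
        while window and masses[window[0]] > hi:
            window.popleft()
        if window:
            j = window[0]
            candidate = (scores[i] + scores[j], i, j)
            if best is None or candidate[0] > best[0]:
                best = candidate
    return best
```

Because both mass bounds only decrease as `i` advances, each candidate enters and leaves the deque at most once, so the scan after sorting is linear in the number of candidates rather than quadratic.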


Subjects
Databases, Protein , Peptides/chemistry , Protein Interaction Mapping/methods , Proteins/chemistry , Software , Algorithms , Computational Biology , Mass Spectrometry
18.
BMC Med Genomics ; 12(Suppl 7): 133, 2019 12 30.
Article in English | MEDLINE | ID: mdl-31888606

ABSTRACT

BACKGROUND: In genome-wide association studies (GWAS), conventional interaction detection methods such as BOOST are mostly based on SNP-SNP interactions. Although single nucleotides are the building blocks of the human genome, single nucleotide polymorphisms (SNPs) are not necessarily the smallest functional unit for complex phenotypes. Region-based strategies have proven successful in studies aimed at marginal effects. METHODS: We propose a novel region-region interaction detection method named RRIntCC (region-region interaction detection for case-control studies). RRIntCC uses the correlations between individual SNP-SNP interactions based on a linkage disequilibrium (LD) contrast test. RESULTS: Simulation experiments showed that our method can achieve higher power than conventional SNP-based methods with similar type I error rates. When applied to two real datasets, RRIntCC was able to find several significant regions, while BOOST failed to identify any significant results. The source code and the sample data of RRIntCC are available at http://bioinformatics.ust.hk/RRIntCC.html. CONCLUSION: In this paper, a new region-based interaction detection method with better performance than SNP-based interaction detection methods has been proposed.
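
For readers unfamiliar with LD contrast tests, the sketch below shows one common form for a single SNP pair: the genotype correlation is computed separately in cases and in controls, and the two values are compared after Fisher's z-transformation. This is an assumption-laden illustration of the general idea only. It is not RRIntCC's region-level statistic, and the function names and the 0/1/2 genotype coding are hypothetical choices made for the example.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ld_contrast_z(case_a, case_b, ctrl_a, ctrl_b):
    """Z statistic comparing the SNP A / SNP B correlation in cases vs controls.

    Genotypes are assumed to be coded as minor-allele counts (0, 1, 2).
    Uses the standard test for the difference of two independent correlations;
    it is undefined if either correlation is exactly +1 or -1, or if a SNP
    has no variation within a group.
    """
    z_case = math.atanh(pearson(case_a, case_b))  # Fisher z-transform
    z_ctrl = math.atanh(pearson(ctrl_a, ctrl_b))
    se = math.sqrt(1.0 / (len(case_a) - 3) + 1.0 / (len(ctrl_a) - 3))
    return (z_case - z_ctrl) / se


# Toy genotypes: the two SNPs are strongly correlated in cases but not in controls.
case_a = [0, 1, 2, 0, 1, 2, 0, 2]
case_b = [0, 1, 2, 0, 1, 1, 0, 2]
ctrl_a = [0, 1, 2, 0, 1, 2, 0, 2]
ctrl_b = [1, 0, 1, 2, 0, 1, 2, 0]
print(ld_contrast_z(case_a, case_b, ctrl_a, ctrl_b))
```

A large absolute value of the statistic suggests that the association between the two SNPs differs between cases and controls, which is the signal an interaction test of this kind looks for.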


Subjects
Genome-Wide Association Study , Polymorphism, Single Nucleotide/genetics , Case-Control Studies , Databases, Genetic , Genome, Human , Humans , Models, Genetic
19.
J Proteome Res ; 17(9): 3195-3213, 2018 09 07.
Article in English | MEDLINE | ID: mdl-30084631

ABSTRACT

An in planta chemical cross-linking-based quantitative interactomics (IPQCX-MS) workflow has been developed to investigate in vivo protein-protein interactions and alterations in protein structures in a model organism, Arabidopsis thaliana. A chemical cross-linker, azide-tag-modified disuccinimidyl pimelate (AMDSP), was applied directly onto Arabidopsis tissues. Peptides produced from protein fractions of CsCl density gradient centrifugation were dimethyl-labeled, from which the AMDSP cross-linked peptides were fractionated by chromatography, enriched, and analyzed by mass spectrometry. ECL2 and SQUA-D software were used to identify and quantitate these cross-linked peptides, respectively. These computer programs integrate peptide identification with quantitation and statistical evaluation. This workflow eventually identified 354 unique cross-linked peptides, including 61 inter- and 293 intraprotein cross-linked peptides, demonstrating that it can identify hundreds of cross-linked peptides in vivo at an organismal level by overcoming the difficulties caused by multiple cellular structures and complex secondary metabolites of plants. Coimmunoprecipitation and super-resolution microscopy studies have confirmed the PHB3-PHB6 protein interaction found by IPQCX-MS. The quantitative interactomics also found hormone-induced structural changes of SBPase and other proteins. This mass-spectrometry-based interactomics will be useful in the study of in vivo protein-protein interaction networks in agricultural crops and plant-microbe interactions.


Subjects
Arabidopsis/metabolism , Gene Expression Regulation, Plant , Protein Interaction Mapping/methods , Proteome/metabolism , Repressor Proteins/metabolism , Amino Acid Sequence , Arabidopsis/genetics , Arabidopsis Proteins , Chromatography, Liquid , Cross-Linking Reagents/chemistry , Models, Molecular , Peptides/analysis , Peptides/chemistry , Prohibitins , Protein Binding , Protein Isoforms/chemistry , Protein Isoforms/genetics , Protein Isoforms/metabolism , Protein Structure, Secondary , Proteolysis , Proteome/chemistry , Proteome/genetics , Repressor Proteins/chemistry , Repressor Proteins/genetics , Staining and Labeling/methods , Succinimides/chemistry , Tandem Mass Spectrometry
20.
Mol Cell Proteomics ; 17(5): 1010-1027, 2018 05.
Article in English | MEDLINE | ID: mdl-29440448

ABSTRACT

Protein acetylation, one of many types of post-translational modifications (PTMs), is involved in a variety of biological and cellular processes. In the present study, we applied both CsCl density gradient (CDG) centrifugation-based protein fractionation and a dimethyl-labeling-based 4C quantitative PTM proteomics workflow in the study of dynamic acetylproteomic changes in Arabidopsis. This workflow integrates the dimethyl chemical labeling with chromatography-based acetylpeptide separation and enrichment followed by mass spectrometry (MS) analysis, the extracted ion chromatogram (XIC) quantitation-based computational analysis of mass spectrometry data to measure dynamic changes of acetylpeptide level using an in-house software program, named Stable isotope-based Quantitation-Dimethyl labeling (SQUA-D), and finally the confirmation of ethylene hormone-regulated acetylation using immunoblot analysis. Eventually, using this proteomic approach, 7456 unambiguous acetylation sites were found from 2638 different acetylproteins, and 5250 acetylation sites, including 5233 sites on lysine side chain and 17 sites on protein N termini, were identified repetitively. Out of these repetitively discovered acetylation sites, 4228 sites on lysine side chain (i.e. 80.5%) are novel. These acetylproteins are exemplified by the histone superfamily, ribosomal and heat shock proteins, and proteins related to stress/stimulus responses and energy metabolism. The novel acetylproteins enriched by the CDG centrifugation fractionation contain many cellular trafficking proteins, membrane-bound receptors, and receptor-like kinases, which are mostly involved in brassinosteroid, light, gravity, and development signaling. In addition, we identified 12 highly conserved acetylation site motifs within histones, P-glycoproteins, actin depolymerizing factors, ATPases, transcription factors, and receptor-like kinases. 
Using SQUA-D software, we have quantified 33 ethylene hormone-enhanced and 31 hormone-suppressed acetylpeptide groups, also called unique PTM peptide arrays (UPAs), that share an identical unique PTM site pattern (UPSP). This CDG centrifugation protein fractionation, in combination with dimethyl labeling-based quantitative PTM proteomics and SQUA-D, may be applied in the quantitation of any PTM proteins in any model eukaryotes and agricultural crops, as well as in tissue samples of animals and human beings.


Subjects
Arabidopsis Proteins/metabolism , Arabidopsis/metabolism , Proteomics/methods , Staining and Labeling , Acetylation , Amino Acid Sequence , Chromatography, Liquid , Computational Biology , Ethylenes/pharmacology , Histones/metabolism , Methylation , Reproducibility of Results , Tandem Mass Spectrometry