Results 1 - 20 of 52
1.
Soft Matter ; 20(8): 1869-1883, 2024 Feb 21.
Article in English | MEDLINE | ID: mdl-38318759

ABSTRACT

Active nematics are dense systems of rodlike particles that consume energy to drive motion at the level of the individual particles. They exist in natural systems like biological tissues and artificial materials such as suspensions of self-propelled colloidal particles or synthetic microswimmers. Active nematics have attracted significant attention in recent years due to their spectacular nonequilibrium collective spatiotemporal dynamics, which may enable applications in fields such as robotics, drug delivery, and materials science. The director field, which measures the direction and degree of alignment of the local nematic orientation, is a crucial characteristic of active nematics and is essential for studying topological defects. However, determining the director field is a significant challenge in many experimental systems. Although director fields can be derived from images of active nematics using traditional image processing methods, the accuracy of such methods is highly sensitive to the settings of the algorithms. These settings must be tuned from image to image due to experimental noise, intrinsic noise of the imaging technology, and perturbations caused by changes in experimental conditions. This sensitivity currently limits automatic analysis of active nematics. To address this, we developed a machine learning model for extracting reliable director fields from raw experimental images, which enables accurate analysis of topological defects. Application of the algorithm to experimental data demonstrates that the approach is robust and highly generalizable to experimental settings that are different from those in the training data. It could be a promising tool for investigating active nematics and may be generalized to other active matter systems.

2.
J Biomed Inform ; 156: 104677, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38876453

ABSTRACT

OBJECTIVE: Existing approaches to fairness evaluation often overlook systematic differences in the social determinants of health, such as demographics and socioeconomics, among comparison groups, potentially leading to inaccurate or even contradictory conclusions. This study aims to evaluate racial disparities in predicting mortality among patients with chronic diseases using a fairness detection method that considers systematic differences. METHODS: We created five datasets from Mass General Brigham's electronic health records (EHR), each focusing on a different chronic condition: congestive heart failure (CHF), chronic kidney disease (CKD), chronic obstructive pulmonary disease (COPD), chronic liver disease (CLD), and dementia. For each dataset, we developed separate machine learning models to predict 1-year mortality and examined racial disparities by comparing prediction performance between Black and White individuals. We compared the racial fairness evaluation for the overall Black and White populations with that for Black individuals and their matched White counterparts identified by propensity score matching, in which the systematic differences were mitigated. RESULTS: We identified significant differences between Black and White individuals in age, gender, marital status, education level, smoking status, health insurance type, body mass index, and Charlson comorbidity index (p-value < 0.001). In the matched Black and White subpopulations identified through propensity score matching, significant differences remained for only a few covariates, and at weaker significance levels: insurance type in the CHF cohort (p = 0.043), insurance type (p = 0.005) and education level (p = 0.016) in the CKD cohort, and body mass index in the dementia cohort (p = 0.041); no significant differences were observed for the other covariates. When examining mortality prediction models across the five study cohorts, we compared fairness evaluations before and after mitigating systematic differences. We found significant differences in the CHF cohort, with p-values of 0.021 and 0.001 for the F1 measure and sensitivity, respectively, for the AdaBoost model, and p-values of 0.014 and 0.003 for the F1 measure and sensitivity, respectively, for the MLP model. DISCUSSION AND CONCLUSION: This study contributes to research on fairness assessment by focusing on the examination of systematic disparities and underscores the potential for revealing racial bias in machine learning models used in clinical settings.
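The matching step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: it assumes propensity scores have already been estimated (for example by logistic regression on the covariates) and performs greedy 1:1 nearest-neighbor matching within a caliper. The function name `greedy_match`, the caliper value, and the toy scores are all hypothetical.

```python
def greedy_match(treated, control, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching on precomputed propensity scores.

    treated, control: dicts mapping a subject id to its propensity score.
    Returns a list of (treated_id, control_id) pairs; each control is used
    at most once, and pairs farther apart than `caliper` are discarded.
    """
    available = dict(control)
    pairs = []
    # Process subjects in a fixed order so the result is deterministic.
    for t_id in sorted(treated):
        if not available:
            break
        t_score = treated[t_id]
        # Closest still-unmatched control by absolute score difference.
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs


# Hypothetical scores: ids and values are made up for illustration.
black = {"b1": 0.30, "b2": 0.62, "b3": 0.95}
white = {"w1": 0.31, "w2": 0.60, "w3": 0.10, "w4": 0.58}
matched = greedy_match(black, white)
```

In this toy input the subject with score 0.95 has no control within the caliper and is left unmatched, which is the usual price of caliper matching: better covariate balance at the cost of a smaller cohort.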


Subjects
Machine Learning , Humans , Male , Female , Chronic Disease , Aged , Middle Aged , Racism , White People/statistics & numerical data , Electronic Health Records , Chronic Obstructive Pulmonary Disease/mortality , Black or African American/statistics & numerical data , Heart Failure/mortality
3.
BMC Genomics ; 23(1): 660, 2022 Sep 19.
Article in English | MEDLINE | ID: mdl-36117155

ABSTRACT

BACKGROUND: Brown adipose tissue (BAT) is considered a primary site of adaptive thermogenesis, and the thermogenic activity of brown adipocytes is connected to generating heat and counteracting obesity. Recent studies revealed that BAT can secrete batokine-like factors, especially small extracellular vesicles (sEVs), which contribute to the systemic consequences of BAT activity. As a newly emerging class of mediators, some long non-coding RNAs (lncRNAs) have exhibited metabolic regulatory effects in adipocyte development. However, apart from a few well-studied lncRNAs, the lncRNAs carried by sEVs derived from brown adipose tissue (sEV-BAT) have not yet been identified. RESULTS: In this study, we demonstrated that sEV-BAT could induce beige adipocyte differentiation in both ASCs and 3T3-L1 cells, whereas sEVs from white adipose tissue (sEV-WAT) had no corresponding effect. An lncRNA microarray assay on sEV-WAT and sEV-BAT identified a total of 563 known lncRNAs as differentially expressed, of which 232 were upregulated and 331 were downregulated in sEV-BAT. Three novel candidates (AK029592, humanlincRNA1030 and ENSMUST00000152284) were selected for further validation. LncRNA-mRNA network analysis revealed that the candidate lncRNAs were largely embedded in cellular metabolic pathways. During adipogenic and thermogenic phenotype differentiation in ASCs and 3T3-L1 cells, only the expression of AK029592 was upregulated. All three lncRNAs were relatively enriched in brown adipose tissue and brown adipocytes. Across adipocytes, sEVs and adipose tissue, the expression of AK029592 and ENSMUST00000152284 was markedly decreased in obese mice compared with lean mice, whereas obesity did not change the expression of humanlincRNA1030. CONCLUSION: Collectively, our profiling study provides a comprehensive catalog for the study of lncRNAs specifically carried by sEV-BAT and indicates the potential regulatory role of certain sEV-BAT lncRNAs in thermogenesis.


Subjects
Extracellular Vesicles , Long Noncoding RNA , Brown Adipose Tissue/metabolism , Animals , Extracellular Vesicles/genetics , Extracellular Vesicles/metabolism , Mice , Obesity/genetics , Obesity/metabolism , Long Noncoding RNA/genetics , Long Noncoding RNA/metabolism , Messenger RNA/metabolism , Thermogenesis/genetics
4.
J Am Chem Soc ; 144(15): 6709-6713, 2022 Apr 20.
Article in English | MEDLINE | ID: mdl-35404599

ABSTRACT

The Golgi apparatus (GA) is the hub of intracellular trafficking, but selectively targeting GA remains a challenge. We show an unconventional type of peptide thioester, consisting of an aminoethyl thioester and acting as a substrate of thioesterases, for instantly targeting the GA of cells. The peptide thioesters, above or below their critical micelle concentrations, enter cells mainly via caveolin-mediated endocytosis or macropinocytosis, respectively. After being hydrolyzed by GA-associated thioesterases, the resulting thiopeptides form dimers and accumulate in the GA. After saturating the GA, the thiopeptides are enriched in the endoplasmic reticulum (ER). Their buildup in ER and GA disrupts protein trafficking, thus leading to cell death via multiple pathways. The peptide thioesters target the GA of a wide variety of cells, including human, murine, and Drosophila cells. Changing d-diphenylalanine to l-diphenylalanine in the peptide maintains the GA-targeting ability. In addition, targeting GA redirects protein (e.g., NRAS) distribution. This work illustrates a thioesterase-responsive and redox-active molecular platform for targeting the GA and controlling cell fates.


Subjects
Endoplasmic Reticulum , Golgi Apparatus , Animals , Drosophila , Endoplasmic Reticulum/metabolism , Golgi Apparatus/metabolism , Mice , Peptides/metabolism , Phenylalanine/metabolism
5.
Bioconjug Chem ; 33(11): 1983-1988, 2022 Nov 16.
Article in English | MEDLINE | ID: mdl-35312281

ABSTRACT

Despite the enormous progress in genomics and proteomics, it is still challenging to assess the states of organelles in living cells with high spatiotemporal resolution. Based on our recent finding of enzyme-instructed self-assembly of a thiophosphopeptide that targets the Golgi apparatus (GA) instantly, we use the thiophosphopeptide, which is enzymatically responsive and redox active, as an integrative probe for revealing the state of the GA of live cells at the single-cell level. By imaging the probe in the GA of live cells over time, our results show that the accumulation of the probe at the GA depends on cell type. Compared with a conventional Golgi probe, this self-assembling probe accumulates at the GA much faster and is sensitive to the expression of alkaline phosphatases. In addition, subtle changes to the fluorophore result in slightly different GA responses. This work illustrates a novel class of active molecular probes that combine enzyme-instructed self-assembly and redox reaction for high-resolution imaging of the states of subcellular organelles over a large area and extended times.


Subjects
Fluorescent Dyes , Golgi Apparatus , Golgi Apparatus/metabolism , Golgi Apparatus/ultrastructure , Fluorescent Dyes/chemistry , Fluorescence Microscopy , Organelles/metabolism , Alkaline Phosphatase/metabolism
6.
Soft Matter ; 17(3): 738-747, 2021 Jan 21.
Article in English | MEDLINE | ID: mdl-33220675

ABSTRACT

Active nematics are a class of far-from-equilibrium materials characterized by local orientational order of force-generating, anisotropic constituents. Traditional methods for predicting the dynamics of active nematics rely on hydrodynamic models, which accurately describe idealized flows and many of the steady-state properties, but do not capture certain detailed dynamics of experimental active nematics. We have developed a deep learning approach that uses a Convolutional Long-Short-Term-Memory (ConvLSTM) algorithm to automatically learn and forecast the dynamics of active nematics. We demonstrate our purely data-driven approach on experiments of 2D unconfined active nematics of extensile microtubule bundles, as well as on data from numerical simulations of active nematics.

7.
Anal Chem ; 92(1): 782-791, 2020 Jan 07.
Article in English | MEDLINE | ID: mdl-31829560

ABSTRACT

Despite the recent advances in mass spectrometry (MS)-based methods for glycan structural analysis, characterization of glycomes remains a significant analytical challenge, in part due to the widespread presence of isomeric structures and the need to define the many structural variables for each glycan. Interpretation of the complex tandem mass spectra of glycans is often laborious and requires substantial expertise. Broad adoption of MS methods for glycomics, within and outside the glycoscience community, has been hindered by the shortage of bioinformatics tools for rapid and accurate glycan sequencing. Here, we developed an online porous graphitic carbon liquid chromatography (PGC-LC)-electronic excitation dissociation (EED) MS/MS method that takes advantage of the superior isomer resolving power of PGC and the structural details provided by EED MS/MS for characterization of glycan mixtures. We also made improvements to GlycoDeNovo, our de novo glycan sequencing algorithm, so that it can automatically and accurately identify glycan topologies from EED tandem mass spectra acquired online. The majority of linkages can also be determined de novo, although in some cases, biological insight may be needed to fully define the glycan structure. Application of this method to the analysis of N-glycans released from ribonuclease B not only revealed the presence of 18 high-mannose structures, including new isomers not previously reported, but also provided relative quantification for each isomeric structure. With fully automated data acquisition and topology analysis, the approach presented here holds great potential for automated and comprehensive glycan characterization.


Subjects
Polysaccharides/analysis , Tandem Mass Spectrometry/methods , Animals , Cattle , Liquid Chromatography/methods , Glycomics/methods , Graphite/chemistry , Porosity , Ribonucleases/chemistry
8.
Entropy (Basel) ; 22(3), 2020 Mar 02.
Article in English | MEDLINE | ID: mdl-33286064

ABSTRACT

Traditional hypothesis-margin research focuses on obtaining large margins and on feature selection. In this work, we show that the robustness of margins is also critical and can be measured using entropy. In addition, our approach provides clear mathematical formulations and explanations to uncover feature interactions, which are often lacking in large hypothesis-margin based approaches. We design an algorithm, termed IMMIGRATE (Iterative max-min entropy margin-maximization with interaction terms), for training the weights associated with the interaction terms. IMMIGRATE simultaneously utilizes both local and global information and can be used as a base learner in Boosting. We evaluate IMMIGRATE in a wide range of tasks, in which it demonstrates exceptional robustness and achieves state-of-the-art results with high interpretability.

9.
J Cell Physiol ; 234(11): 20925-20934, 2019 Nov.
Article in English | MEDLINE | ID: mdl-31026067

ABSTRACT

The adipogenic differentiation of adipose tissue-derived mesenchymal stem cells (ADSCs) is a critical issue in many obesity-related disorders. Cytidine-cytidine-adenosine-adenosine-thymidine (CCAAT) enhancer binding protein α (CEBP-α) and peroxisome proliferator-activated receptor-γ are two important lipogenic and adipogenic transcription factors and markers of adipogenic differentiation. Noncoding RNAs participate in adipogenic differentiation. The long noncoding RNA (lncRNA) H19 is involved in multiple cellular differentiation processes, including adipogenic differentiation; however, its function and precise molecular mechanism in the adipogenic differentiation of human ADSCs (hADSCs) are unclear. microRNAs that were differentially expressed during adipogenic differentiation and could be targeted by H19 were screened and selected; the regulation and interaction between H19 and miR-30a were verified. The interaction between miR-30a and its predicted downstream target C8orf4 was validated. The dynamic effects of H19 and miR-30a on C8orf4 messenger RNA (mRNA) expression, C8orf4 protein levels, and adipogenic differentiation were evaluated. miR-30a and H19 negatively regulated each other through direct binding. As predicted by TargetScan and verified using luciferase reporter gene assays, miR-30a directly bound to the 3'-untranslated region of C8orf4 to inhibit its expression; H19 knockdown suppressed, while miR-30a inhibition promoted, the mRNA expression and protein levels of C8orf4 and adipogenic differentiation; the effect of H19 knockdown could be partially reversed by miR-30a inhibition. The lncRNA H19 serves as a competing endogenous RNA (ceRNA) for miR-30a to augment the miR-30a downstream target C8orf4, thereby modulating adipogenic differentiation in hADSCs. From the perspective of lncRNA-miRNA-mRNA regulation, we provided a novel regulatory mechanism of hADSC adipogenic differentiation.


Subjects
Adipogenesis/genetics , Mesenchymal Stem Cells/cytology , Mesenchymal Stem Cells/metabolism , MicroRNAs/metabolism , Neoplasm Proteins/metabolism , Long Noncoding RNA/metabolism , 3' Untranslated Regions/genetics , Base Sequence , Cultured Cells , Gene Expression Regulation , Humans , MicroRNAs/genetics , Long Noncoding RNA/genetics
10.
Anal Chem ; 90(6): 3793-3801, 2018 Mar 20.
Article in English | MEDLINE | ID: mdl-29443510

ABSTRACT

Detailed glycan structural characterization is frequently achieved by collisionally activated dissociation (CAD) based sequential tandem mass spectrometry (MSn) analysis of permethylated glycans. However, it is challenging to implement MSn (n > 2) during online glycan separation, and this has limited its application to analysis of complex glycan mixtures from biological samples. Further, permethylation can reduce liquid chromatographic (LC) resolution of isomeric glycans. Here, we studied the electronic excitation dissociation (EED) fragmentation behavior of native glycans with a reducing-end fixed charge tag and identified key spectral features that are useful for topology and linkage determination. We also developed de novo glycan sequencing software that showed remarkable accuracy in glycan topology elucidation based on the EED spectra of fixed charge-derivatized glycans. The ability to obtain glycan structural details at the MS2 level, without permethylation, via a combination of fixed charge derivatization, EED, and de novo spectral interpretation, makes the present approach a promising tool for comprehensive and rapid characterization of glycan mixtures.


Subjects
Oligosaccharides/analysis , Polysaccharides/chemistry , Tandem Mass Spectrometry/methods , Liquid Chromatography/methods , Electrons , Isomerism , Sequence Analysis/methods , Software
11.
BMC Bioinformatics ; 17(1): 541, 2016 Dec 19.
Article in English | MEDLINE | ID: mdl-27993137

ABSTRACT

BACKGROUND: Performing statistical tests is an important step in analyzing genome-wide datasets for detecting genomic features differentially expressed between conditions. Each type of statistical test has its own advantages in characterizing certain aspects of differences between population means and often assumes a relatively simple data distribution (e.g., Gaussian, Poisson, negative binomial, etc.), which may not be well met by the datasets of interest. Making inadequate distributional assumptions can lead to inferior results when dealing with complex differential expression patterns. RESULTS: We propose to capture differential expression information more comprehensively by integrating multiple test statistics, each of which has relatively limited capacity to summarize the observed differential expression information. This work addresses a general application scenario, in which users want to detect as many differentially expressed features (DEFs) as possible while requiring the false discovery rate (FDR) to be lower than a cut-off. We treat each test statistic as a basic attribute, and model the detection of differentially expressed genomic features as learning a discriminant boundary in a multi-dimensional space of basic attributes. We mathematically formulated our goal as a constrained optimization problem aiming to maximize discoveries satisfying a user-defined FDR. An effective algorithm, Discriminant-Cut, has been developed to solve an instantiation of this problem. Extensive comparisons of Discriminant-Cut with 13 existing methods were carried out to demonstrate its robustness and effectiveness. CONCLUSIONS: We have developed a novel machine learning methodology for robust differential expression analysis, which can be a new avenue to significantly advance research on large-scale differential expression analysis.
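The FDR constraint mentioned in this abstract can be illustrated with the standard Benjamini-Hochberg step-up procedure applied to a single list of p-values. This is a swapped-in textbook technique shown for orientation only; it is not the Discriminant-Cut algorithm, which instead learns a boundary over several test statistics. The function name and the p-values below are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return the indices of hypotheses rejected at the given FDR level."""
    m = len(p_values)
    # Rank hypotheses by ascending p-value, remembering original positions.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    return sorted(order[:cutoff_rank])


# Made-up p-values, one per genomic feature.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216]
discoveries = benjamini_hochberg(p_values, fdr=0.05)
```

With one test statistic the decision boundary is a single threshold; the abstract's formulation generalizes this to a boundary in a multi-dimensional space of statistics.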


Subjects
Algorithms , Genome-Wide Association Study , Machine Learning , Discriminant Analysis , Gene Expression Profiling/methods , Humans , Statistical Models , Normal Distribution
12.
Mol Cell ; 32(4): 592-9, 2008 Nov 21.
Article in English | MEDLINE | ID: mdl-19026789

ABSTRACT

The specificity of RNAi pathways is determined by several classes of small RNAs, which include siRNAs, piRNAs, endo-siRNAs, and microRNAs (miRNAs). These small RNAs are invariably incorporated into large Argonaute (Ago)-containing effector complexes known as RNA-induced silencing complexes (RISCs), which they guide to silencing targets. Both genetic and biochemical strategies have yielded conserved molecular components of small RNA biogenesis and effector machineries. However, given the complexity of these pathways, there are likely to be additional components and regulators that remain to be uncovered. We have undertaken a comparative and comprehensive RNAi screen to identify genes that impact three major Ago-dependent small RNA pathways that operate in Drosophila S2 cells. We identify subsets of candidates that act positively or negatively in siRNA, endo-siRNA, and miRNA pathways. Our studies indicate that many components are shared among all three Argonaute-dependent silencing pathways, though each is also impacted by discrete sets of genes.


Subjects
Drosophila/metabolism , MicroRNAs/metabolism , Small Interfering RNA/metabolism , RNA-Induced Silencing Complex/metabolism , Animals , Argonaute Proteins , Cell Line , Drosophila/cytology , Drosophila/genetics , Drosophila Proteins , Drosophila melanogaster/genetics , Drosophila melanogaster/metabolism , Eukaryotic Initiation Factors , Gene Silencing , Insect Genes , MicroRNAs/genetics , Biological Models , RNA Interference , Small Interfering RNA/genetics , Untranslated RNA/genetics , Untranslated RNA/metabolism , RNA-Induced Silencing Complex/genetics
13.
Sci Rep ; 14(1): 12131, 2024 May 27.
Article in English | MEDLINE | ID: mdl-38802415

ABSTRACT

Stereoselective reactions have played a vital role in the emergence of life, evolution, human biology, and medicine. However, for a long time, most industrial and academic efforts followed a trial-and-error approach for asymmetric synthesis in stereoselective reactions. In addition, most previous studies have focused qualitatively on the influence of steric and electronic effects on stereoselective reactions. Therefore, quantitatively understanding the stereoselectivity of a given chemical reaction is extremely difficult. As proof of principle, this paper develops a novel composite machine learning method for quantitatively predicting the enantioselectivity, which represents the degree to which one enantiomer is preferentially produced from the reactions. Specifically, machine learning methods that are widely used in data analytics, including Random Forest, Support Vector Regression, and LASSO, are utilized. In addition, Bayesian optimization and permutation importance tests are provided for an in-depth understanding of reactions and accurate prediction. Finally, the proposed composite method approximates the key features of the available reactions by using Gaussian mixture models, which provide suitable machine learning methods for new reactions. Case studies using real stereoselective reactions show that the proposed method is effective and provides a solid foundation for further application to other chemical reactions.

14.
Proteome Sci ; 11(Suppl 1): S16, 2013 Nov 07.
Article in English | MEDLINE | ID: mdl-24565074

ABSTRACT

BACKGROUND: Tandem affinity purification coupled with mass-spectrometry (TAP/MS) analysis is a popular method for the large-scale identification of novel endogenous protein-protein interactions (PPIs). Computational analysis of TAP/MS data is a critical step, particularly for high-throughput datasets, yet it remains challenging due to the noisy nature of TAP/MS data. RESULTS: We investigated several major TAP/MS data analysis methods for identifying PPIs, and developed an advanced method, which incorporates an improved statistical method to filter out false positives from the negative controls. Our method is named PPIRank, which stands for PPI ranking in TAP/MS data. We compared PPIRank with several other existing methods in analyzing two pathway-specific TAP/MS PPI datasets from Drosophila. CONCLUSION: Experimental results show that PPIRank is more capable than other approaches in terms of identifying known interactions collected in the BioGRID PPI database. Specifically, PPIRank is able to capture more true interactions and simultaneously fewer false positives in both the Insulin and Hippo pathways of Drosophila melanogaster.

15.
J Biomed Inform ; 46 Suppl: S48-S53, 2013 Dec.
Article in English | MEDLINE | ID: mdl-24076508

ABSTRACT

The automatic detection of temporal relations between events in electronic medical records has the potential to greatly augment the value of such records for understanding disease progression and patients' responses to treatments. We present a three-step methodology for labeling temporal relations using machine learning and deterministic rules over an annotated corpus provided by the 2012 i2b2 Shared Challenge. We first create an expanded training network of relations by computing the transitive closure over the annotated data; we then apply hand-written rules and machine learning with a feature set that casts a wide net across potentially relevant lexical and syntactic information; finally, we employ a voting mechanism to resolve global contradictions between the local predictions made by the learned classifier. Results over the testing data illustrate the contributions of initial prediction and conflict resolution.
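The first step of the methodology above, expanding the annotated relations by transitive closure, can be sketched as follows. This is a minimal illustration, not the authors' code: it models only a single BEFORE relation, whereas the annotated corpus contains other relation types as well, and the event names and the function name `transitive_closure` are hypothetical.

```python
def transitive_closure(relations):
    """Expand a set of (earlier, later) BEFORE pairs to its transitive closure."""
    closure = set(relations)
    changed = True
    while changed:
        changed = False
        # If a is before b and b is before d, then a is before d.
        new_pairs = {
            (a, d)
            for (a, b) in closure
            for (c, d) in closure
            if b == c and (a, d) not in closure
        }
        if new_pairs:
            closure |= new_pairs
            changed = True
    return closure


# Hypothetical annotated BEFORE relations from one clinical narrative.
annotated = {
    ("admission", "surgery"),
    ("surgery", "discharge"),
    ("discharge", "follow-up"),
}
expanded = transitive_closure(annotated)
```

The three annotated pairs yield six pairs after closure, which shows how the step enlarges the training network without additional manual annotation.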


Subjects
Electronic Health Records , Narration , Natural Language Processing , Humans , Medical Informatics , Time Factors
16.
Front Bioeng Biotechnol ; 11: 1185251, 2023.
Article in English | MEDLINE | ID: mdl-37425361

ABSTRACT

Background: The regenerative capabilities of derivatives of the fat layer of lipoaspirate have been demonstrated. However, the large volume of lipoaspirate fluid has not attracted extensive attention in clinical applications. In this study, we aimed to isolate the factors and extracellular vesicles from human lipoaspirate fluid and evaluate their potential therapeutic efficacy. Methods: Lipoaspirate fluid derived factors and extracellular vesicles (LF-FVs) were prepared from human lipoaspirate and characterized by nanoparticle tracking analysis, size-exclusion chromatography and adipokine antibody arrays. The therapeutic potential of LF-FVs was evaluated on fibroblasts in vitro and in a rat burn model in vivo. The wound healing process was recorded on days 2, 4, 8, 10, 12 and 16 post-treatment. Scar formation was analyzed by histology, immunofluorescent staining and scar-related gene expression at day 35 post-treatment. Results: The results of nanoparticle tracking analysis and size-exclusion chromatography indicated that LF-FVs were enriched with proteins and extracellular vesicles. Specific adipokines (adiponectin and IGF-1) were detected in LF-FVs. In vitro, LF-FVs augmented the proliferation and migration of fibroblasts in a dose-dependent manner. In vivo, the results showed that LF-FVs significantly accelerated burn wound healing. Moreover, LF-FVs improved the quality of wound healing, including regenerating cutaneous appendages (hair follicles and sebaceous glands) and decreasing scar formation in the healed skin. Conclusion: LF-FVs, which were cell-free and enriched with extracellular vesicles, were successfully prepared from lipoaspirate fluid. Additionally, they were found to improve wound healing in a rat burn model, suggesting that LF-FVs could potentially be used for wound regeneration in clinical settings.

17.
J Am Soc Mass Spectrom ; 34(10): 2127-2135, 2023 Oct 04.
Article in English | MEDLINE | ID: mdl-37621000

ABSTRACT

Glycosidic linkages in oligosaccharides play essential roles in determining their chemical properties and biological activities. MSn has been widely used to infer glycosidic linkages but requires a substantial amount of starting material, which limits its application. In addition, there is a lack of rigorous research on what MSn protocols are proper for characterizing glycosidic linkages. In this work, to deliver high-quality experimental data and analysis results, we propose a machine learning-based framework to establish appropriate MSn protocols and build effective data analysis methods. We demonstrate the proof-of-principle by applying our approach to elucidate sialic acid linkages (α2'-3' and α2'-6') in a set of sialyllactose standards and NIST sialic acid-containing N-glycans as well as identify several protocol configurations for producing high-quality experimental data. Our companion data analysis method achieves nearly 100% accuracy in classifying α2'-3' vs α2'-6' using MS5, MS4, MS3, or even MS2 spectra alone. The ability to determine glycosidic linkages using MS2 or MS3 is significant as it requires substantially less sample, enabling linkage analysis for quantity-limited natural glycans and synthesized materials, as well as shortens the overall experimental time. MS2 is also more amenable than MS3/4/5 to automation when coupled to direct infusion or LC-MS. Additionally, our method can predict the ratio of α2'-3' and α2'-6' in a mixture with 8.6% RMSE (root-mean-square error) across data sets using MS5 spectra. We anticipate that our framework will be generally applicable to analysis of other glycosidic linkages.


Subjects
N-Acetylneuraminic Acid , Polysaccharides , N-Acetylneuraminic Acid/chemistry , Polysaccharides/analysis , Mass Spectrometry/methods , Oligosaccharides/chemistry , Liquid Chromatography
18.
Chem Sci ; 14(24): 6695-6704, 2023 Jun 21.
Article in English | MEDLINE | ID: mdl-37350811

ABSTRACT

Comprehensive de novo glycan sequencing remains an elusive goal due to the structural diversity and complexity of glycans. Present strategies employing collision-induced dissociation (CID) and higher energy collisional dissociation (HCD)-based multi-stage tandem mass spectrometry (MSn) or MS/MS combined with sequential exoglycosidase digestions are inherently low-throughput and difficult to automate. Compared to CID and HCD, electron transfer dissociation (ETD) and electron capture dissociation (ECD) each generate more cross-ring cleavages informative about linkage positions, but electronic excitation dissociation (EED) exceeds the information content of all other methods and is also applicable to analysis of singly charged precursors. Although EED can provide extensive glycan structural information in a single stage of MS/MS, its performance has largely been limited to FTICR MS, and thus it has not been widely adopted by the glycoscience research community. Here, the effective performance of EED MS/MS was demonstrated on a hybrid Orbitrap-Omnitrap QE-HF instrument, with high sensitivity, fragmentation efficiency, and analysis speed. In addition, a novel EED MS2-guided MS3 approach was developed for detailed glycan structural analysis. Automated topology reconstruction from MS2 and MS3 spectra could be achieved with a modified GlycoDeNovo software. We showed that the topology and linkage configurations of the Man9GlcNAc2 glycan can be accurately determined from first principles based on one EED MS2 and two CID-EED MS3 analyses, without reliance on biological knowledge, a structure database or a spectral library. The presented approach holds great promise for autonomous, comprehensive and de novo glycan sequencing.

19.
Article in English | MEDLINE | ID: mdl-36374897

ABSTRACT

Graph learning aims to predict the label for an entire graph. Recently, graph neural network (GNN)-based approaches have become an essential strand of work on learning low-dimensional continuous embeddings of entire graphs for graph label prediction. While GNNs explicitly aggregate the neighborhood information and implicitly capture the topological structure for graph representation, they ignore the relationships among graphs. In this article, we propose a graph-graph (G2G) similarity network to tackle the graph learning problem by constructing a SuperGraph through learning the relationships among graphs. Each node in the SuperGraph represents an input graph, and the weights of edges denote the similarity between graphs. By this means, the graph learning task is transformed into a classical node label propagation problem. Specifically, we use an adversarial autoencoder to align embeddings of all the graphs to a prior data distribution. After the alignment, we design the G2G similarity network to learn the similarity between graphs, which functions as the adjacency matrix of the SuperGraph. By running node label propagation algorithms on the SuperGraph, we can predict the labels of graphs. Experiments on five widely used classification benchmarks and four public regression benchmarks under a fair setting demonstrate the effectiveness of our method.
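The final step described in this abstract, label propagation over the SuperGraph, can be sketched as follows. This is a minimal illustration under stated assumptions: the similarity matrix is written by hand here, whereas in the article it is produced by the learned G2G similarity network, and the propagation rule shown is a generic clamped, similarity-weighted averaging scheme rather than the specific algorithm the authors used. All names and values are hypothetical.

```python
def propagate_labels(similarity, labels, n_classes, n_iter=50):
    """Iterative label propagation with clamped labeled nodes.

    similarity: square list of lists of non-negative edge weights.
    labels: a class index for labeled nodes, None for unlabeled ones.
    Returns a predicted class index for every node.
    """
    n = len(similarity)
    # Start from one-hot scores for labeled nodes and zeros elsewhere.
    scores = [[0.0] * n_classes for _ in range(n)]
    for i, y in enumerate(labels):
        if y is not None:
            scores[i][y] = 1.0
    for _ in range(n_iter):
        updated = []
        for i in range(n):
            if labels[i] is not None:
                updated.append(scores[i])  # keep known labels fixed
                continue
            # Weighted average of the neighbors' current class scores.
            total = sum(similarity[i][j] for j in range(n) if j != i)
            row = [0.0] * n_classes
            if total > 0:
                for j in range(n):
                    if j != i:
                        for c in range(n_classes):
                            row[c] += similarity[i][j] * scores[j][c] / total
            updated.append(row)
        scores = updated
    return [max(range(n_classes), key=lambda c: s[c]) for s in scores]


# Four input graphs become four SuperGraph nodes; two carry known labels.
similarity = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]
predicted = propagate_labels(similarity, [0, None, None, 1], n_classes=2)
```

Each unlabeled graph takes the label of the labeled graph it resembles most, directly or through intermediate nodes, which is what lets relationships among graphs inform the prediction.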

20.
J Am Soc Mass Spectrom ; 33(3): 436-445, 2022 Mar 02.
Article in English | MEDLINE | ID: mdl-35157458

ABSTRACT

Glycan structure identification is essential to understanding the roles of glycans in various biological processes. Previously, we developed GlycoDeNovo, a de novo algorithm for reconstructing glycan topologies from tandem mass spectra (MS/MS). In this work, we introduce GlycoDeNovo2, which contains two major improvements over GlycoDeNovo. First, we use the precursor mass measured for a peak that likely corresponds to a glycan to determine its potential compositions, which are used to constrain the search space, enable parallel computation, and hence speed up topology reconstruction. Second, we developed a procedure to calculate the empirical p-value of a reconstructed topology candidate. Experimental results are provided to demonstrate the effectiveness of GlycoDeNovo2.
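The first improvement, deriving candidate compositions from a precursor mass, can be sketched as a brute-force enumeration. This is an illustration under simplifying assumptions and not GlycoDeNovo2's code: it considers only two monosaccharide classes, assumes a native, underivatized neutral glycan (no adducts, charge states, or reducing-end tags), and uses a hypothetical tolerance and count limit. The function and constant names are made up.

```python
from itertools import product

# Monoisotopic residue masses in daltons for two monosaccharide classes.
RESIDUE_MASS = {"Hex": 162.05282, "HexNAc": 203.07937}
WATER = 18.01056  # one water is added for the free reducing end


def candidate_compositions(neutral_mass, max_count=12, tol=0.01):
    """List residue counts whose summed mass matches a neutral glycan mass."""
    names = list(RESIDUE_MASS)
    hits = []
    for counts in product(range(max_count + 1), repeat=len(names)):
        mass = WATER + sum(n * RESIDUE_MASS[name] for n, name in zip(counts, names))
        if abs(mass - neutral_mass) <= tol:
            hits.append(dict(zip(names, counts)))
    return hits


# 1882.6447 Da is the approximate neutral mass of a native Man9GlcNAc2 glycan.
compositions = candidate_compositions(1882.6447)
```

Knowing that only compositions consistent with the precursor mass are possible lets a topology search discard most candidate structures before any fragment peaks are examined, and each surviving composition can be explored independently.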


Subjects
Algorithms , Glycopeptides , Polysaccharides , Protein Sequence Analysis/methods , Tandem Mass Spectrometry/methods , Glycopeptides/analysis , Glycopeptides/chemistry , Polysaccharides/analysis , Polysaccharides/chemistry