Results 1 - 13 of 13
1.
Mol Psychiatry ; 28(2): 822-833, 2023 02.
Article in English | MEDLINE | ID: mdl-36266569

ABSTRACT

Autism Spectrum Disorder (ASD) diagnosis remains behavior-based and the median age of diagnosis is ~52 months, nearly 5 years after its first-trimester origin. Accurate and clinically-translatable early-age diagnostics do not exist due to ASD genetic and clinical heterogeneity. Here we collected clinical, diagnostic, and leukocyte RNA data from 240 ASD and typically developing (TD) toddlers (175 toddlers for training and 65 for test). To identify gene expression ASD diagnostic classifiers, we developed 42,840 models composed of 3570 gene expression feature selection sets and 12 classification methods. We found that 742 models had AUC-ROC ≥ 0.8 on both Training and Test sets. Weighted Bayesian model averaging of these 742 models yielded an ensemble classifier model with accurate performance in Training and Test gene expression datasets with ASD diagnostic classification AUC-ROC scores of 85-89% and AUC-PR scores of 84-92%. ASD toddlers with ensemble scores above and below the overall ASD ensemble mean of 0.723 (on a scale of 0 to 1) had similar diagnostic and psychometric scores, but those below this ASD ensemble mean had more prenatal risk events than TD toddlers. Ensemble model feature genes were involved in cell cycle, inflammation/immune response, transcriptional gene regulation, cytokine response, and PI3K-AKT, RAS and Wnt signaling pathways. We additionally collected targeted DNA sequencing smMIPs data on a subset of ASD risk genes from 217 of the 240 ASD and TD toddlers. This DNA sequencing found about the same percentage of SFARI Level 1 and 2 ASD risk gene mutations in TD (12 of 105) as in ASD (13 of 112) toddlers, and classification based only on the presence of mutation in these risk genes performed at a chance level of 49%. By contrast, the leukocyte ensemble gene expression classifier correctly diagnostically classified 88% of TD and ASD toddlers with ASD risk gene mutations. 
Our ensemble ASD gene expression classifier is diagnostically predictive and replicable across different toddler ages, races, and ethnicities; out-performs a risk gene mutation classifier; and has potential for clinical translation.
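
The mutation counts quoted above can be checked with a few lines of arithmetic. The sketch below assumes one hypothetical decision rule (call a toddler ASD whenever a risk gene mutation is present) and uses plain accuracy as the metric. The abstract does not say which rule or metric was used, so the agreement with the reported 49% is suggestive rather than a reconstruction of the authors' analysis.

```python
# Counts taken from the abstract: carriers of SFARI Level 1 and 2 risk gene mutations.
td_carriers, td_total = 12, 105      # typically developing toddlers
asd_carriers, asd_total = 13, 112    # ASD toddlers

td_rate = td_carriers / td_total
asd_rate = asd_carriers / asd_total

# Hypothetical rule (an assumption, not stated in the abstract):
# predict ASD if a mutation is present, TD otherwise.
true_pos = asd_carriers
true_neg = td_total - td_carriers
accuracy = (true_pos + true_neg) / (asd_total + td_total)

print(f"TD carrier rate:  {td_rate:.1%}")
print(f"ASD carrier rate: {asd_rate:.1%}")
print(f"Accuracy of mutation-only rule: {accuracy:.1%}")
```

Under these assumptions the two carrier rates differ by well under one percentage point, which is why mutation status alone carries almost no diagnostic signal in this sample.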


Subject(s)
Autism Spectrum Disorder , Humans , Child, Preschool , Infant , Autism Spectrum Disorder/diagnosis , Autism Spectrum Disorder/genetics , Bayes Theorem , Phosphatidylinositol 3-Kinases , Immunity , Gene Expression
2.
Proc Natl Acad Sci U S A ; 116(28): 14011-14018, 2019 07 09.
Article in English | MEDLINE | ID: mdl-31235599

ABSTRACT

Three-dimensional genome structure plays a pivotal role in gene regulation and cellular function. Single-cell analysis of genome architecture has been achieved using imaging and chromatin conformation capture methods such as Hi-C. To study variation in chromosome structure between different cell types, computational approaches are needed that can utilize sparse and heterogeneous single-cell Hi-C data. However, few methods exist that are able to accurately and efficiently cluster such data into constituent cell types. Here, we describe scHiCluster, a single-cell clustering algorithm for Hi-C contact matrices that is based on imputations using linear convolution and random walk. Using both simulated and real single-cell Hi-C data as benchmarks, scHiCluster significantly improves clustering accuracy when applied to low coverage datasets compared with existing methods. After imputation by scHiCluster, topologically associating domain (TAD)-like structures (TLSs) can be identified within single cells, and their consensus boundaries were enriched at the TAD boundaries observed in bulk cell Hi-C samples. In summary, scHiCluster facilitates visualization and comparison of single-cell 3D genomes.
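
The two imputation ideas named above, linear convolution and random walk, can be illustrated on a toy contact matrix. This is a minimal sketch and not the scHiCluster implementation: the window size, restart probability, step count, and all function names are assumptions chosen for illustration.

```python
def smooth(matrix, pad=1):
    """Sum each cell's (2*pad+1) x (2*pad+1) neighborhood (convolution with a ones kernel)."""
    n = len(matrix)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for a in range(max(0, i - pad), min(n, i + pad + 1)):
                for b in range(max(0, j - pad), min(n, j + pad + 1)):
                    out[i][j] += matrix[a][b]
    return out

def row_normalize(matrix):
    """Scale each row to sum to 1 so the matrix can act as transition probabilities."""
    result = []
    for row in matrix:
        total = sum(row)
        result.append([v / total if total else 0.0 for v in row])
    return result

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def random_walk_with_restart(transition, restart=0.5, steps=30):
    """Iterate Q <- (1 - restart) * Q @ P + restart * I, starting from Q = I."""
    n = len(transition)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    q = identity
    for _ in range(steps):
        walked = matmul(q, transition)
        q = [[(1 - restart) * walked[i][j] + restart * identity[i][j]
              for j in range(n)] for i in range(n)]
    return q

# Toy sparse contact matrix for five genomic bins (made-up values).
contacts = [
    [0, 1, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
imputed = random_walk_with_restart(row_normalize(smooth(contacts)))
```

In this toy case, cells that were zero in the sparse input, such as the contact between bins 0 and 2, receive a nonzero imputed value because nearby bins share contacts.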


Subject(s)
Chromatin/ultrastructure , Chromosome Structures/ultrastructure , Computational Biology , Single-Cell Analysis , Algorithms , Cluster Analysis , Genome/genetics , Humans , Molecular Conformation
3.
J Biomed Sci ; 28(1): 50, 2021 Jun 22.
Article in English | MEDLINE | ID: mdl-34158025

ABSTRACT

Cancer immunotherapy has revolutionized treatment and led to an unprecedented wave of immuno-oncology research during the past two decades. In 2018, two pioneer immunotherapy innovators, Tasuku Honjo and James P. Allison, were awarded the Nobel Prize for their landmark cancer immunotherapy work regarding "cancer therapy by inhibition of negative immune regulation" (the CTLA4 and PD-1 immune checkpoints). However, the challenge in the coming decade is to develop cancer immunotherapies that can more consistently treat various patients and cancer types. Overcoming this challenge requires a systemic understanding of the underlying interactions between immune cells, tumor cells, and immunotherapeutics. The role of aberrant glycosylation in this process, and how it influences tumor immunity and immunotherapy, is beginning to emerge. Herein, we review current knowledge of miRNA-mediated regulatory mechanisms of glycosylation machinery, and how these carbohydrate moieties impact immune cell and tumor cell interactions. We discuss these insights in the context of clinical findings and provide an outlook on modulating the regulation of glycosylation to offer new therapeutic opportunities. Finally, we highlight how emerging technologies in systems glycobiology are enabling deeper insights into cancer immuno-oncology, helping identify novel drug targets and key biomarkers of cancer, and facilitating the rational design of glyco-immunotherapies. These hold great promise clinically in the immuno-oncology field.


Subject(s)
Biomarkers , Drug Delivery Systems/methods , Glycomics/methods , Immunotherapy/methods , MicroRNAs/metabolism
4.
Beilstein J Org Chem ; 16: 2645-2662, 2020.
Article in English | MEDLINE | ID: mdl-33178355

ABSTRACT

Systems glycobiology aims to provide models and analysis tools that account for the biosynthesis, regulation, and interactions with glycoconjugates. To facilitate these methods, there is a need for a clear glycan representation accessible to both computers and humans. Linear Code, a linearized and readily parsable glycan structure representation, is such a language. For this reason, Linear Code was adapted to represent reaction rules, but the syntax has drifted from its original description to accommodate new and originally unforeseen challenges. Here, we delineate the consensuses and inconsistencies that have arisen through this adaptation. We recommend options for a consensus-based extension of Linear Code that can be used for reaction rule specification going forward. Through this extension and specification of Linear Code to reaction rules, we aim to minimize inconsistent symbology thereby making glycan database queries easier. With a clear guide for generating reaction rule descriptions, glycan synthesis models will be more interoperable and reproducible thereby moving glycoinformatics closer to compliance with FAIR standards. Here, we present Linear Code for Reaction Rules (LiCoRR), version 1.0, an unambiguous representation for describing glycosylation reactions in both literature and code.

5.
bioRxiv ; 2024 May 23.
Article in English | MEDLINE | ID: mdl-38798633

ABSTRACT

Glycosylation is described as a non-templated biosynthesis. Yet, the template-free premise is antithetical to the observation that different N-glycans are consistently placed at specific sites. It has been proposed that glycosite-proximal protein structures could constrain glycosylation and explain the observed microheterogeneity. Using site-specific glycosylation data, we trained a hybrid neural network to parse glycosites (recurrent neural network) and match them to feasible N-glycosylation events (graph neural network). From glycosite-flanking sequences, the algorithm predicts most human N-glycosylation events documented in the GlyConnect database and proposed structures corresponding to observed monosaccharide composition of the glycans at these sites. The algorithm also recapitulated glycosylation in Enhanced Aromatic Sequons, SARS-CoV-2 spike, and IgG3 variants, thus demonstrating the ability of the algorithm to predict both glycan structure and abundance. Thus, protein structure constrains glycosylation, and the neural network enables predictive in silico glycosylation of uncharacterized or novel protein sequences and genetic variants.

6.
STAR Protoc ; 4(2): 102162, 2023 Mar 13.
Article in English | MEDLINE | ID: mdl-36920914

ABSTRACT

GlyCompareCT is a portable command-line tool that facilitates downstream glycomic data analyses by addressing the sparsity and non-independence inherent in the data. Given glycan abundances as input, users can run GlyCompareCT with one line of code to obtain the abundances of a minimal substructure set, named glycomotif, thereby quantifying hidden biosynthetic relationships between measured glycans. Optional parameter tuning and annotation are supported. For complete details on the use and execution of this protocol, please refer to Bao et al. (2021).

7.
Sci Rep ; 12(1): 4253, 2022 03 11.
Article in English | MEDLINE | ID: mdl-35277549

ABSTRACT

Few clinically validated biomarkers of ASD exist which can rapidly, accurately, and objectively identify autism during the first years of life and be used to support optimized treatment outcomes and advances in precision medicine. As such, the goal of the present study was to leverage both simple and computationally-advanced approaches to validate an eye-tracking measure of social attention preference, the GeoPref Test, among 1,863 ASD, delayed, or typical toddlers (12-48 months) referred from the community or general population via a primary care universal screening program. Toddlers participated in diagnostic and psychometric evaluations and the GeoPref Test: a 1-min movie containing side-by-side dynamic social and geometric images. Following testing, diagnosis was denoted as ASD, ASD features, LD, GDD, Other, typical sibling of ASD proband, or typical. Relative to other diagnostic groups, ASD toddlers exhibited the highest levels of visual attention towards geometric images and those with especially high fixation levels exhibited poor clinical profiles. Using the 69% fixation threshold, the GeoPref Test had 98% specificity, 17% sensitivity, 81% PPV, and 65% NPV. Sensitivity increased to 33% when saccades were included, with comparable validity across sex, ethnicity, or race. The GeoPref Test was also highly reliable up to 24 months following the initial test. Finally, fixation levels among twins concordant for ASD were significantly correlated, indicating that GeoPref Test performance may be genetically driven. As the GeoPref Test yields few false positives (~ 2%) and is equally valid across demographic categories, the current findings highlight the ability of the GeoPref Test to rapidly and accurately detect autism before the 2nd birthday in a subset of children and serve as a biomarker for a unique ASD subtype in clinical trials.
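
The four screening metrics quoted above all derive from one 2x2 confusion table. The sketch below shows how they are computed; the counts are invented for illustration and are not the study's data, so the resulting PPV and NPV only roughly resemble the reported values.

```python
def screening_metrics(true_pos, false_neg, false_pos, true_neg):
    """Return sensitivity, specificity, PPV, and NPV from confusion-table counts."""
    return {
        "sensitivity": true_pos / (true_pos + false_neg),  # ASD cases flagged
        "specificity": true_neg / (true_neg + false_pos),  # non-ASD correctly passed
        "ppv": true_pos / (true_pos + false_pos),          # flagged cases that are ASD
        "npv": true_neg / (true_neg + false_neg),          # passed cases that are non-ASD
    }

# Hypothetical counts: 100 ASD and 150 non-ASD toddlers.
metrics = screening_metrics(true_pos=17, false_neg=83, false_pos=3, true_neg=147)

for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

A test with very high specificity and low sensitivity, as in this example, produces few false positives but misses most cases, which is consistent with the abstract's framing of the test as identifying a subset of children.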


Subject(s)
Autism Spectrum Disorder , Autistic Disorder , Autism Spectrum Disorder/diagnosis , Biomarkers , Eye-Tracking Technology , Humans , Saccades
8.
Nat Commun ; 13(1): 2455, 2022 05 04.
Article in English | MEDLINE | ID: mdl-35508452

ABSTRACT

Human Milk Oligosaccharides (HMOs) are abundant carbohydrates fundamental to infant health and development. Although these oligosaccharides were discovered more than half a century ago, their biosynthesis in the mammary gland remains largely uncharacterized. Here, we use a systems biology framework that integrates glycan and RNA expression data to construct an HMO biosynthetic network and predict glycosyltransferases involved. To accomplish this, we construct models describing the most likely pathways for the synthesis of the oligosaccharides accounting for >95% of the HMO content in human milk. Through our models, we propose candidate genes for elongation, branching, fucosylation, and sialylation of HMOs. Our model aggregation approach recovers 2 of 2 previously known gene-enzyme relations and 2 of 3 empirically confirmed gene-enzyme relations. The top genes we propose for the remaining 5 linkage reactions are consistent with previously published literature. These results provide the molecular basis of HMO biosynthesis necessary to guide progress in HMO research and application with the goal of understanding and improving infant health and development.


Subject(s)
Milk, Human , Oligosaccharides , Glycosyltransferases/genetics , Glycosyltransferases/metabolism , Humans , Infant , Milk, Human/metabolism , Oligosaccharides/metabolism
9.
Nat Commun ; 12(1): 4988, 2021 08 17.
Article in English | MEDLINE | ID: mdl-34404781

ABSTRACT

Glycans are fundamental cellular building blocks, involved in many organismal functions. Advances in glycomics are elucidating the essential roles of glycans. Still, it remains challenging to properly analyze large glycomics datasets, since the abundance of each glycan is dependent on many other glycans that share many intermediate biosynthetic steps. Furthermore, the overlap of measured glycans can be low across samples. We address these challenges with GlyCompare, a glycomic data analysis approach that accounts for shared biosynthetic steps for all measured glycans to correct for sparsity and non-independence in glycomics, which enables direct comparison of different glycoprofiles and increases statistical power. Using GlyCompare, we study diverse N-glycan profiles from glycoengineered erythropoietin. We obtain biologically meaningful clustering of mutant cell glycoprofiles and identify knockout-specific effects of fucosyltransferase mutants on tetra-antennary structures. We further analyze human milk oligosaccharide profiles and find that the mother's fucosyltransferase-dependent secretor status indirectly impacts sialylation. Finally, we apply our method to mucin-type O-glycans, gangliosides, and site-specific compositional glycosylation data to reveal tissue- and disease-specific glycan presentations. Our substructure-oriented approach will enable researchers to take full advantage of the growing power and size of glycomics data.


Subject(s)
Biosynthetic Pathways , Glycomics , Polysaccharides/biosynthesis , Biological Transport , Biosynthetic Pathways/genetics , Cluster Analysis , Data Analysis , Erythropoietin/metabolism , Fucosyltransferases/genetics , Gangliosides , Gene Knockout Techniques , Glycosylation , Humans , Mucins
10.
Curr Res Biotechnol ; 2: 22-36, 2020 Nov.
Article in English | MEDLINE | ID: mdl-32285041

ABSTRACT

Glycosylated biopharmaceuticals are important in the global pharmaceutical market. Despite the importance of their glycan structures, our limited knowledge of the glycosylation machinery still hinders controllability of this critical quality attribute. To facilitate discovery of glycosyltransferase specificity and predict glycoengineering efforts, here we extend the approach to model N-linked protein glycosylation as a Markov process. Our model leverages putative glycosyltransferase (GT) specificity to define the biosynthetic pathways for all measured glycans, and the Markov chain modelling is used to learn glycosyltransferase isoform activities and predict glycosylation following glycosyltransferase knock-in/knockout. We apply our methodology to four different glycoengineered therapeutics (i.e., Rituximab, erythropoietin, Enbrel, and alpha-1 antitrypsin) produced in CHO cells. Our model accurately predicted N-linked glycosylation following glycoengineering and further quantified the impact of glycosyltransferase mutations on reactions catalyzed by other glycosyltransferases. By applying these learned GT-GT interaction rules identified from single glycosyltransferase mutants, our model further predicts the outcome of multi-gene glycosyltransferase mutations on the diverse biotherapeutics. Thus, this modeling approach enables rational glycoengineering and the elucidation of relationships between glycosyltransferases, thereby facilitating biopharmaceutical research and aiding the broader study of glycosylation to elucidate the genetic basis of complex changes in glycosylation.
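
The idea of treating glycosylation as a Markov process can be sketched with a toy chain in which each glycan state either moves to the next structure or is secreted. Everything here is an assumption made for illustration: the state names, the transition probabilities, and the knockout rule (delete one reaction and renormalize the remaining exits) are hypothetical and are not taken from the paper.

```python
def step(distribution, transitions):
    """Advance the probability mass one reaction step along the chain."""
    nxt = {state: 0.0 for state in transitions}
    for state, mass in distribution.items():
        for target, prob in transitions[state].items():
            nxt[target] += mass * prob
    return nxt

def knock_out(transitions, source, target):
    """Remove one reaction and renormalize the remaining exits from `source`."""
    edited = {s: dict(t) for s, t in transitions.items()}
    edited[source].pop(target, None)
    total = sum(edited[source].values())
    edited[source] = {t: p / total for t, p in edited[source].items()}
    return edited

def secreted_profile(transitions, start, steps=50):
    """Run the chain until the mass has settled in the absorbing (secreted) states."""
    dist = {s: 0.0 for s in transitions}
    dist[start] = 1.0
    for _ in range(steps):
        dist = step(dist, transitions)
    return dist

# Hypothetical three-structure pathway; "*_secreted" states are absorbing.
pathway = {
    "precursor": {"extended": 0.7, "precursor_secreted": 0.3},
    "extended": {"capped": 0.6, "extended_secreted": 0.4},
    "capped": {"capped_secreted": 1.0},
    "precursor_secreted": {"precursor_secreted": 1.0},
    "extended_secreted": {"extended_secreted": 1.0},
    "capped_secreted": {"capped_secreted": 1.0},
}

wild_type = secreted_profile(pathway, "precursor")
mutant = secreted_profile(knock_out(pathway, "extended", "capped"), "precursor")
```

In this toy chain, removing the reaction from `extended` to `capped` shifts all of the capped glycan's share onto the extended structure, which mirrors in miniature how a glycoprofile could be predicted after a glycosyltransferase knockout.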

11.
iScience ; 16: 155-161, 2019 Jun 28.
Article in English | MEDLINE | ID: mdl-31174177

ABSTRACT

We present an accessible, fast, and customizable network propagation system for pathway boosting and interpretation of genome-wide association studies. This system, NAGA (Network Assisted Genomic Association), taps the NDEx biological network resource to gain access to thousands of protein networks and select those most relevant and performative for a specific association study. The method works efficiently, completing genome-wide analysis in under 5 minutes on a modern laptop computer. We show that NAGA recovers many known disease genes from analysis of schizophrenia genetic data, and it substantially boosts associations with previously unappreciated genes such as amyloid beta precursor. On this and seven other gene-disease association tasks, NAGA outperforms conventional approaches in recovery of known disease genes and replicability of results. Protein interactions associated with disease are visualized and annotated in Cytoscape, which, in addition to standard programmatic interfaces, allows for downstream analysis.
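
Network propagation in general can be sketched as a random walk with restart that spreads association scores from seed genes to their neighbors. The example below is a generic illustration and not NAGA's code: the gene names, network, seed scores, and restart probability are all hypothetical.

```python
def propagate(adjacency, seed_scores, restart=0.5, iterations=50):
    """Spread seed scores over an undirected network by random walk with restart."""
    genes = list(adjacency)
    total = sum(seed_scores.values())
    start = {g: seed_scores.get(g, 0.0) / total for g in genes}
    scores = dict(start)
    for _ in range(iterations):
        spread = {g: 0.0 for g in genes}
        for gene in genes:
            neighbors = adjacency[gene]
            for neighbor in neighbors:
                # Each gene shares its current score equally among its neighbors.
                spread[neighbor] += scores[gene] / len(neighbors)
        scores = {g: (1 - restart) * spread[g] + restart * start[g] for g in genes}
    return scores

# Hypothetical protein network (every gene has at least one neighbor).
network = {
    "GENE_A": ["GENE_B", "GENE_C"],
    "GENE_B": ["GENE_A", "GENE_C"],
    "GENE_C": ["GENE_A", "GENE_B", "GENE_D"],
    "GENE_D": ["GENE_C"],
}
gwas_seed = {"GENE_A": 1.0}  # a single made-up association signal
propagated = propagate(network, gwas_seed)
```

Genes with no direct association signal, such as `GENE_D` here, still receive a score through their network neighbors, which is the mechanism by which propagation can boost previously unappreciated genes.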

12.
Int J Biol Sci ; 11(1): 11-21, 2015.
Article in English | MEDLINE | ID: mdl-25552925

ABSTRACT

The homocysteine methyltransferase encoded by mmuM is widely distributed among microbial organisms. It is the key enzyme that catalyzes the last step in methionine biosynthesis and plays an important role in metabolism. It also enables microbial organisms to tolerate high concentrations of selenium in the environment. In this research, 533 mmuM gene sequences covering 70 bacterial genera were selected from the GenBank database. The distribution frequency of mmuM differs among the investigated genera. Mapping against 160 mmuM reference sequences showed that mmuM genes were present in 7 species of pathogen genomes sequenced in this work. The polymerase chain reaction products of one mmuM genotype (NC_013951 as the reference) were sequenced, and the sequencing results confirmed the mapping results. Furthermore, 144 representative sequences were chosen for phylogenetic analysis, and some mmuM genes from entirely different genera (such as those from Escherichia and Klebsiella, and from Enterobacter and Kosakonia) shared closer phylogenetic relationships than those from the same genus. Comparative genomic analysis of the mmuM-encoding regions on plasmids and bacterial chromosomes showed that the pKF3-140 and pIP1206 plasmids shared a 21 kb homologous region, and that a 4.9 kb fragment in this region in fact originated from the Escherichia coli chromosome. These results further suggest that the mmuM gene underwent horizontal gene transfer among different species or genera of bacteria. High-throughput sequencing combined with comparative genomic analysis can be used to explore the distribution and dissemination of the mmuM gene among bacteria and its evolution at a molecular level.


Subject(s)
Bacteria/enzymology , Gene Transfer, Horizontal/genetics , Genetic Variation/genetics , Homocysteine S-Methyltransferase/genetics , Phylogeny , Base Sequence , Chromosome Mapping , Cluster Analysis , DNA Primers/genetics , High-Throughput Nucleotide Sequencing , Molecular Sequence Data , RNA, Ribosomal, 16S/genetics , Species Specificity
13.
PLoS One ; 7(10): e47197, 2012.
Article in English | MEDLINE | ID: mdl-23056610

ABSTRACT

Plasmids are important antibiotic resistance determinant carriers that can disseminate various drug resistance genes among species or genera. By using a high throughput sequencing approach, two groups of plasmids of Escherichia coli (named E1 and E2, each consisting of 160 clinical E. coli strains isolated from different periods of time) were sequenced and analyzed. A total of 20 million reads were obtained and mapped onto the known resistance gene sequences. As a result, a total of 9 classes, including 36 types of antibiotic resistant genes, were identified. Among these genes, 25 and 27 single nucleotide polymorphisms (SNPs) appeared, of which 9 and 12 SNPs are nonsynonymous substitutions in the E1 and E2 samples. It is interesting to find that a novel genotype of bla(KLUC), whose close relatives, bla(KLUC-1) and bla(KLUC-2), have been previously reported as carried on the Kluyvera cryocrescens chromosome and Enterobacter cloacae plasmid, was identified. It shares 99% and 98% amino acid identities with Kluc-1 and Kluc-2, respectively. Further PCR screening of 608 Enterobacteriaceae family isolates yielded a second variant (named bla(KLUC-4)). It was interesting to find that Kluc-3 showed resistance to several cephalosporins including cefotaxime, whereas bla(KLUC-4) did not show any resistance to the antibiotics tested. This may be due to a positively charged residue, Arg, replaced by a neutral residue, Leu, at position 167, which is located within an omega-loop. This work represents large-scale studies on resistance gene distribution, diversification and genetic variation in pooled multi-drug resistance plasmids, and provides insight into the use of high throughput sequencing technology for microbial resistance gene detection.


Subject(s)
Drug Resistance, Bacterial/genetics , Escherichia coli/genetics , Plasmids/genetics , Base Sequence , Enterobacteriaceae/drug effects , Escherichia coli/drug effects , Escherichia coli Proteins/genetics , Genotype , High-Throughput Nucleotide Sequencing , Kluyvera/drug effects , Polymerase Chain Reaction , Polymorphism, Single Nucleotide/genetics