ABSTRACT
CONTEXT: Alterations in RNA splicing may influence protein isoform diversity that contributes to or reflects the pathophysiology of certain diseases. Whereas specific RNA splicing events in pancreatic islets have been investigated in models of inflammation in vitro, how RNA splicing in the circulation correlates with or is reflective of T1D disease pathophysiology in humans remains unexplored. OBJECTIVE: To use machine learning to investigate if alternative RNA splicing events differ between individuals with and without new-onset type 1 diabetes (T1D) and to determine if these splicing events provide insight into T1D pathophysiology. METHODS: RNA deep sequencing was performed on whole blood samples from two independent cohorts: a training cohort consisting of 12 individuals with new-onset T1D and 12 age- and sex-matched nondiabetic controls and a validation cohort of the same size and demographics. Machine learning analysis was used to identify specific isoforms that could distinguish individuals with T1D from controls. RESULTS: Distinct patterns of RNA splicing differentiated participants with T1D from unaffected controls. Notably, certain splicing events, particularly involving retained introns, showed significant association with T1D. Machine learning analysis using these splicing events as features from the training cohort demonstrated high accuracy in distinguishing between T1D subjects and controls in the validation cohort. Gene Ontology pathway enrichment analysis of the retained intron category showed evidence for a systemic viral response in T1D subjects. CONCLUSIONS: Alternative RNA splicing events in whole blood are significantly enriched in individuals with new-onset T1D and can effectively distinguish these individuals from unaffected controls. Our findings also suggest that RNA splicing profiles offer the potential to provide insights into disease pathogenesis.
ABSTRACT
Type 1 diabetes (T1D) is a chronic condition caused by autoimmune destruction of the insulin-producing pancreatic β cells. While it is known that gene-environment interactions play a key role in triggering the autoimmune process leading to T1D, the pathogenic mechanism leading to the appearance of islet autoantibodies (biomarkers of autoimmunity) is poorly understood. Here we show that disruption of the complement system precedes the detection of islet autoantibodies and persists through disease onset. Our results suggest that children who exhibit islet autoimmunity and progress to clinical T1D have lower complement protein levels relative to those who do not progress within a similar time frame. Thus, the complement pathway, an understudied mechanistic and therapeutic target in T1D, merits increased attention as a source of protein biomarkers for the prediction and potentially prevention of T1D.
ABSTRACT
Metabolomics provides a unique snapshot into the world of small molecules and the complex biological processes that govern the human, animal, plant, and environmental ecosystems encapsulated by the One Health modeling framework. However, this "molecular snapshot" is only as informative as the number of metabolites confidently identified within it. The spectral similarity (SS) score is traditionally used to identify compounds in mass spectrometry approaches to metabolomics, where spectra are matched to reference libraries of candidate spectra. Unfortunately, there is little consensus on which of the dozens of available SS metrics should be used. The lack of a standard SS score creates analytic uncertainty and potentially leads to issues in reproducibility, especially as these data are integrated across other domains. In this work, we use metabolomic spectral similarity as a case study to showcase the challenges in consistency within just one piece of the One Health framework that must be addressed to enable data science approaches for One Health problems. Here, using a large cohort of datasets comprising both standard and complex datasets with expert-verified truth annotations, we evaluated the effectiveness of 66 similarity metrics to delineate between correct matches (true positives) and incorrect matches (true negatives). We additionally characterize the families of these metrics to make informed recommendations for their use. Our results indicate that specific families of metrics (the Inner Product, Correlative, and Intersection families of scores) tend to perform better than others, with no single similarity metric performing optimally for all queried spectra. This work and its findings provide an empirically based resource for researchers to use in their selection of similarity metrics for GC-MS identification, increasing scientific reproducibility by taking steps towards standardizing identification workflows.
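As an illustration of what a spectral similarity score computes, the sketch below implements cosine similarity, a commonly used normalized inner product. It is assumed here to stand in for the Inner Product family named above; the abstract does not say which specific metrics belong to each family. The function name, the dictionary representation of a spectrum, and the toy peak lists are all hypothetical.

```python
import math

def cosine_similarity(query, reference):
    """Cosine (normalized inner product) similarity between two spectra.

    Each spectrum is a dict mapping an integer m/z value to an intensity.
    """
    shared = set(query) & set(reference)
    dot = sum(query[mz] * reference[mz] for mz in shared)
    norm_q = math.sqrt(sum(v * v for v in query.values()))
    norm_r = math.sqrt(sum(v * v for v in reference.values()))
    if norm_q == 0 or norm_r == 0:
        return 0.0  # an empty spectrum matches nothing
    return dot / (norm_q * norm_r)

# Hypothetical toy spectra (m/z -> intensity); not real library entries.
query = {73: 100.0, 147: 40.0, 217: 15.0}
same_shape = {73: 50.0, 147: 20.0, 217: 7.5}   # same peaks, half the intensity
no_overlap = {57: 80.0, 71: 60.0, 85: 30.0}    # no shared peaks

score_same = cosine_similarity(query, same_shape)
score_none = cosine_similarity(query, no_overlap)
```

Because the score depends only on the relative peak pattern, the rescaled spectrum scores essentially 1 while the spectrum with no shared peaks scores 0. Other metric families weigh peak overlap and intensity differently, which is why candidate rankings can change with the metric chosen.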
ABSTRACT
As metabolomics grows into a high-throughput and high-demand research field, current metrics for the identification of small molecules in gas chromatography-mass spectrometry (GC-MS) still require manual verification. Though steps have been taken to improve scoring metrics by combining spectral similarity (SS) and retention index (RI), the problem persists. A large body of literature has analyzed and refined SS scores, but few studies have explicitly studied improvements to RI scores. Here, we examined whether uninvestigated assumptions of the RI score are valid and propose ways to improve them. Query RIs were matched to library RIs with a generous window of ±35 to avoid unintentional removal of valid compound identifications. Each match was manually verified as a true positive (TP), true negative, or unknown. Metabolites with at least 30 TP identifications were included in downstream analyses, resulting in a total of 87 metabolites from samples of varying complexity and type (e.g., amino acid mixtures, human urine, and fungal species). Our results showed that the RI score assumptions of normality, consistent variance across metabolites, and a mean error centered at 0 are often violated. We demonstrated through a cross-validation analysis that modifying these underlying assumptions according to empirical metabolite-specific distributions improved the TP and negative rankings. Further, we statistically determined the minimum number of samples required to estimate distributional parameters for scoring metrics. Overall, this work proposes a robust statistical pipeline to reduce the time bottleneck of metabolite identification by improving RI scores, thus minimizing the effort needed to complete manual verification.
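The contrast between a generic RI score and a metabolite-specific one can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the RI score is a normal density of the RI error (query minus library), and the default standard deviation of 3.0, the function names, and the toy error values are all hypothetical. Only the ±35 window comes from the abstract.

```python
from statistics import NormalDist, mean, stdev

def within_window(query_ri, library_ri, window=35.0):
    """Candidate filter using the +/-35 RI window stated in the abstract."""
    return abs(query_ri - library_ri) <= window

def ri_score(query_ri, library_ri, mu=0.0, sigma=3.0):
    """Normal density of the RI error; sigma=3.0 is an assumed default."""
    return NormalDist(mu, sigma).pdf(query_ri - library_ri)

def fit_metabolite_params(tp_errors):
    """Estimate a metabolite-specific mean and SD from verified TP errors."""
    return mean(tp_errors), stdev(tp_errors)

# Hypothetical RI errors from verified true positives of one metabolite,
# showing a systematic offset of about +5 rather than a mean of 0.
tp_errors = [4.1, 5.3, 3.8, 6.0, 4.9, 5.5, 4.4]
mu_hat, sigma_hat = fit_metabolite_params(tp_errors)

# Score a candidate whose error (+5) is typical for this metabolite.
generic = ri_score(1305.0, 1300.0)                      # assumes mean 0
specific = ri_score(1305.0, 1300.0, mu_hat, sigma_hat)  # empirical parameters
```

With these toy numbers, the metabolite-specific parameters score the candidate higher than the zero-mean default does, because an error of +5 is normal for this compound. The example still assumes normality, which the abstract reports is often violated, so an empirical distribution could replace `NormalDist` in practice.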
Subjects
Metabolomics , Humans , Gas Chromatography-Mass Spectrometry/methods , Metabolomics/methods
ABSTRACT
The ability to reliably identify small molecules (e.g., metabolites) is key to driving scientific advancement in metabolomics. Gas chromatography-mass spectrometry (GC-MS) is an analytic method that may be applied to facilitate this process. The typical GC-MS identification workflow involves quantifying the similarity of an observed sample spectrum and other features (e.g., retention index) to that of several references, noting the compound of the best-matching reference spectrum as the identified metabolite. While many similarity metrics exist, none quantifies the error rate of generated identifications, thereby presenting an unknown risk of false identification or discovery. To quantify this unknown risk, we propose a model-based framework for estimating the false discovery rate (FDR) among a set of identifications. Extending a traditional mixture modeling framework, our method incorporates both similarity score and experimental information in estimating the FDR. We apply these models to identification lists derived from 548 samples of varying complexity and sample type (e.g., fungal species, standard mixtures, etc.), comparing their performance to that of the traditional Gaussian mixture model (GMM). Through simulation, we additionally assess the impact of reference library size on the accuracy of FDR estimates. In comparing the best-performing model extensions to the GMM, our results indicate relative decreases in median absolute estimation error (MAE) ranging from 12% to 70%, based on comparisons of the median MAEs across all hit-lists. Results indicate that these relative performance improvements generally hold regardless of library size; however, FDR estimation error typically worsens as the set of reference compounds diminishes.
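The baseline idea of mixture-model FDR estimation can be sketched with a two-component Gaussian mixture fitted to similarity scores alone. This is a minimal sketch of the traditional GMM baseline only, not the authors' extended models, which also use experimental information. The function names, the simple expectation-maximization routine, and the simulated scores are hypothetical.

```python
import math
import random

def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def posterior_incorrect(score, params):
    """Posterior probability that a score comes from the low-score component."""
    (w0, mu0, sd0), (w1, mu1, sd1) = params
    p0 = w0 * normal_pdf(score, mu0, sd0)
    p1 = w1 * normal_pdf(score, mu1, sd1)
    total = p0 + p1
    return p0 / total if total > 0 else 0.5

def fit_two_component_gmm(scores, iterations=200):
    """EM fit; returns [(weight, mean, sd)] for incorrect then correct hits."""
    ordered = sorted(scores)
    half = len(ordered) // 2
    params = []
    # Initialise each component from one half of the sorted scores.
    for part in (ordered[:half], ordered[half:]):
        mu = sum(part) / len(part)
        var = sum((s - mu) ** 2 for s in part) / len(part)
        params.append((0.5, mu, max(math.sqrt(var), 1e-3)))
    for _ in range(iterations):
        # E-step: responsibility of the "incorrect" component for each score.
        resp0 = [posterior_incorrect(s, params) for s in scores]
        # M-step: re-estimate weight, mean and SD of both components.
        updated = []
        for resp in (resp0, [1.0 - r for r in resp0]):
            total = sum(resp)
            mu = sum(r * s for r, s in zip(resp, scores)) / total
            var = sum(r * (s - mu) ** 2 for r, s in zip(resp, scores)) / total
            updated.append((total / len(scores), mu, max(math.sqrt(var), 1e-3)))
        params = updated
    return params

def estimated_fdr(scores, threshold, params):
    """Mean posterior probability of being incorrect among accepted hits."""
    accepted = [s for s in scores if s >= threshold]
    if not accepted:
        return 0.0
    return sum(posterior_incorrect(s, params) for s in accepted) / len(accepted)

# Simulated similarity scores (an assumption, not data from the study).
random.seed(0)
scores = ([random.gauss(0.40, 0.08) for _ in range(300)]     # incorrect hits
          + [random.gauss(0.85, 0.05) for _ in range(300)])  # correct hits
params = fit_two_component_gmm(scores)
fdr_strict = estimated_fdr(scores, 0.70, params)
fdr_loose = estimated_fdr(scores, 0.30, params)
```

Raising the acceptance threshold lowers the estimated FDR because fewer scores from the low-scoring component are accepted. The estimate is only as good as the two-Gaussian assumption, which is the limitation that motivates extending the model.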
Subjects
Metabolomics , Gas Chromatography-Mass Spectrometry/methods , Metabolomics/methods
ABSTRACT
Biological systems function through complex interactions between various 'omics (biomolecules), and a more complete understanding of these systems is only possible through an integrated, multi-omic perspective. This has created the need for integration approaches that can capture the complex, often non-linear, interactions that define these biological systems and that are adapted to the challenges of combining heterogeneous data across 'omic views. A principal challenge to multi-omic integration is missing data, because not all biomolecules are measured in all samples. Due to cost, instrument sensitivity, or other experimental factors, data for a biological sample may be missing for one or more 'omic technologies. Recent methodological developments in artificial intelligence and statistical learning have greatly facilitated the analysis of multi-omics data; however, many of these techniques assume access to completely observed data. A subset of these methods incorporate mechanisms for handling partially observed samples, and these methods are the focus of this review. We describe recently developed approaches, noting their primary use cases and highlighting each method's approach to handling missing data. We additionally provide an overview of the more traditional missing data workflows and their limitations, and we discuss potential avenues for further developments as well as how the missing data issue and its current solutions may generalize beyond the multi-omics context.
ABSTRACT
Introduction. Salmonella enterica serovar Typhi (S. Typhi) is the etiological agent of typhoid fever. To establish an infection in the human host, this pathogen must survive the presence of bile salts in the gut and gallbladder. Hypothesis. S. Typhi uses multiple genetic elements to resist the presence of human bile. Aims. To determine the genetic elements that S. Typhi utilizes to tolerate the human bile salt sodium deoxycholate. Methodology. A collection of S. Typhi mutant strains was evaluated for their ability to grow in the presence of sodium deoxycholate and ox-bile. Additionally, transcriptomic and proteomic responses elicited by sodium deoxycholate in S. Typhi cultures were analysed. Results. Multiple transcriptional factors and some of their dependent genes involved in central metabolism, as well as in the cell envelope, are required for deoxycholate resistance. Conclusion. These findings suggest that metabolic adaptation to bile is focused on enhancing energy production to sustain synthesis of cell envelope components exposed to damage by bile salts.
Subjects
Bile Acids and Salts/chemistry , Deoxycholic Acid/chemistry , Salmonella typhi , Bile , Humans , Proteomics , Salmonella typhi/metabolism , Transcriptome
ABSTRACT
OBJECTIVES: A comparative effectiveness trial tested 2 parent-based interventions in improving the psychosocial recovery of hospitalized injured children: (1) Link for Injured Kids (Link), a program of psychological first aid in which parents are taught motivational interviewing and stress-screening skills, and (2) Trauma Education, based on an informational booklet about trauma and its impacts and resources. METHODS: A randomized controlled trial was conducted in 4 children's hospitals in the Midwestern United States. Children aged 10 to 17 years admitted for an unintentional injury and a parent were recruited and randomly assigned to Link or Trauma Education. Parents and children completed questionnaires at baseline, 6 weeks, 3 months, and 6 months posthospitalization. Using an intent-to-treat analysis, changes in child-reported posttraumatic stress symptoms, depression, quality of life, and child behaviors were compared between intervention groups. RESULTS: Of 795 injured children, 314 children and their parents were enrolled into the study (40%). Link and Trauma Education were associated with improved symptoms of posttraumatic stress, depression, and pediatric quality of life at similar rates over time. However, unlike those in Trauma Education, children in the Link group had notable improvement of child emotional behaviors and mild improvement of conduct and peer behaviors. Compared with Trauma Education, Link was also associated with improved peer behaviors in rural children. CONCLUSION: Although children in both programs had reduced posttrauma symptoms over time, Link children, whose parents were trained in communication and referral skills, exhibited a greater reduction in problem behaviors.
Subjects
Health Education/methods , Motivational Interviewing , Parents/education , Psychological First Aid , Stress Disorders, Post-Traumatic/prevention & control , Wounds and Injuries/psychology , Adolescent , Child , Child Behavior Disorders/prevention & control , Child Behavior Disorders/psychology , Child Health Services , Child, Hospitalized/psychology , Depression/prevention & control , Female , Humans , Male , Midwestern United States , Quality of Life , Wounds and Injuries/complications
ABSTRACT
The CRISPR-Cas cluster is found in many prokaryotic genomes including those of the Enterobacteriaceae family. Salmonella enterica serovar Typhi (S. Typhi) harbors a Type I-E CRISPR-Cas locus composed of cas3, cse1, cse2, cas7, cas5, cas6e, cas1, cas2, and a CRISPR1 array. In this work, it was determined that, in the absence of cas5 or cas2, the amount of the OmpC porin decreased substantially, whereas in individual cse2, cas6e, cas1, or cas3 null mutants, the OmpF porin was not observed in an electrophoretic profile of outer membrane proteins. Furthermore, the LysR-type transcriptional regulator LeuO was unable to positively regulate the expression of the quiescent OmpS2 porin, in individual S. Typhi cse2, cas5, cas6e, cas1, cas2, and cas3 mutants. Remarkably, the expression of the master porin regulator OmpR was dependent on the Cse2, Cas5, Cas6e, Cas1, Cas2, and Cas3 proteins. Therefore, the data suggest that the CRISPR-Cas system acts hierarchically on OmpR to control the synthesis of outer membrane proteins in S. Typhi.
ABSTRACT
BACKGROUND: Numerous public health studies, especially in the area of violence, examine the effects of contextual or group-level factors on health outcomes. Often, these contextual factors exhibit strong pairwise correlations, which pose a challenge when these factors are included as covariates in a statistical model. Such models may be characterised by unstable parameter estimates that fluctuate drastically from sample to sample, with the excessive estimation variability reflected in inflated standard errors. METHODS: We propose a three-stage approach for analysing correlated contextual factors that proceeds as follows: (1) a principal components analysis (PCA) is performed on the original set of correlated variables, (2) the primary generated principal components are included in a multilevel multivariable model and (3) the estimated parameters for these components are transformed into estimates for each of the original contextual factors. Using school violence data, we examined the associations between school crime and correlated contextual school factors (i.e., English proficiency, academic performance, pupil to teacher ratio, average class size and children on free and reduced meals). RESULTS: From models ignoring correlations, school crime was not reliably associated with any of the contextual school factors. When models were fit with principal components, school crime was found to be positively associated with a school's student to teacher ratio, average classroom size and academic performance but negatively associated with the proportion of children who were on free and reduced meals. CONCLUSION: Our multistep approach is one way to address multicollinearity encountered in social epidemiological studies of violence.
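The three stages described in METHODS can be sketched in a simplified form. The sketch makes several assumptions that depart from the study: it keeps only the first principal component, it replaces the multilevel multivariable model with ordinary least squares, it reports coefficients in standardized units, and it uses simulated data. All function and variable names are hypothetical.

```python
import math
import random

def standardize(column):
    """Centre a variable and scale it to unit sample variance."""
    m = sum(column) / len(column)
    sd = math.sqrt(sum((v - m) ** 2 for v in column) / (len(column) - 1))
    return [(v - m) / sd for v in column]

def first_principal_component(columns, iterations=500):
    """Leading eigenvector of the correlation matrix via power iteration.

    Assumes standardized, positively correlated columns, so that the
    uniform start vector is not orthogonal to the leading eigenvector.
    """
    n, p = len(columns[0]), len(columns)
    corr = [[sum(a * b for a, b in zip(columns[i], columns[j])) / (n - 1)
             for j in range(p)] for i in range(p)]
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def pca_regression(columns, y):
    """Return one coefficient per (standardized) original factor."""
    z = [standardize(c) for c in columns]
    # Stage 1: PCA on the correlated factors.
    loading = first_principal_component(z)
    scores = [sum(l * col[i] for l, col in zip(loading, z))
              for i in range(len(y))]
    # Stage 2: regress the outcome on the component score (OLS stand-in).
    y_mean = sum(y) / len(y)
    gamma = (sum(s * (yi - y_mean) for s, yi in zip(scores, y))
             / sum(s * s for s in scores))
    # Stage 3: transform the component coefficient back to each factor.
    return [gamma * l for l in loading]

# Simulated, strongly correlated contextual factors (not the school data).
random.seed(1)
base = [random.gauss(0, 1) for _ in range(200)]
x1 = [b + random.gauss(0, 0.3) for b in base]
x2 = [b + random.gauss(0, 0.3) for b in base]
x3 = [b + random.gauss(0, 0.6) for b in base]
y = [2.0 * b + random.gauss(0, 1.0) for b in base]
betas = pca_regression([x1, x2, x3], y)
```

Because the regression is run on an uncorrelated component rather than on the collinear factors themselves, the back-transformed coefficients avoid the instability described in BACKGROUND. The price is that each coefficient is constrained by the component loadings.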
Subjects
Schools , Violence , Crime , Humans , Models, Statistical , Students
ABSTRACT
In nature, microorganisms are constantly exposed to multiple viral infections and thus have developed many strategies to survive phage attack and invasion by foreign DNA. One such strategy is the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated proteins (Cas) bacterial immunological system. This defense mechanism is widespread in prokaryotes, including several families such as Enterobacteriaceae. Much knowledge about the CRISPR-Cas system has been generated, including its biological functions, transcriptional regulation, distribution, and utility as a molecular marker and as a tool for specific genome editing. This review focuses on these aspects and describes the state of the art of the CRISPR-Cas system in the Enterobacteriaceae bacterial family.
Subjects
CRISPR-Cas Systems , Enterobacteriaceae/enzymology , Enterobacteriaceae/genetics , Gene Expression Regulation , Genetic Variation
ABSTRACT
INTRODUCTION: Recent research suggests that anti-bullying laws may be effective in reducing risk of bullying victimization among youth, but no research has determined whether these laws are also effective in reducing disparities in bullying. The aim of this paper was to evaluate the effectiveness of anti-bullying legislation in reducing disparities in sex- and weight-based bullying and cyberbullying victimization. METHODS: Data on anti-bullying legislation were obtained from the U.S. Department of Education, which commissioned a systematic review of 16 key components of state laws in 2011. States were also categorized based on whether their legislation enumerated protected groups and, if so, which groups were enumerated. These policy variables from 28 states were linked to individual-level data on bullying and cyberbullying victimization from students in 9th through 12th grade participating in the 2011 Youth Risk Behavior Surveillance System study (N=79,577). Analyses were conducted in 2016. RESULTS: Anti-bullying legislation had no moderating effect on weight-based disparities in bullying and cyberbullying victimization. Only state laws with high compliance with Department of Education enumeration guidelines were associated with lower sex-based disparities in bullying victimization. CONCLUSIONS: Anti-bullying policies were not associated with lower weight-based disparities in bullying and cyberbullying victimization among youth, and only one form of policy (high compliance with Department of Education enumeration guidelines) was associated with lower sex-based disparities in bullying victimization. Results therefore suggest that anti-bullying legislation requires further refinement to protect youth who are vulnerable to bullying victimization.
Subjects
Bullying/prevention & control , Crime Victims/statistics & numerical data , Schools/statistics & numerical data , Social Control Policies/statistics & numerical data , Students/psychology , Adolescent , Body Weight , Bullying/statistics & numerical data , Crime Victims/psychology , Female , Guidelines as Topic , Health Status Disparities , Health Surveys/statistics & numerical data , Humans , Internet , Male , Sex Factors , Social Control Policies/standards , Students/statistics & numerical data , United States
ABSTRACT
The CRISPR-Cas system is involved in bacterial immunity, virulence, gene regulation, biofilm formation and sporulation. In Salmonella enterica serovar Typhi, this system consists of five transcriptional units including antisense RNAs. It was determined that these genetic elements are expressed in minimal medium and are up-regulated by pH. In addition, a transcriptional characterization of cas3 and ascse2-1 is included herein.