ABSTRACT
MOTIVATION: Software is vital for the advancement of biology and medicine. Impact evaluations of scientific software have primarily emphasized traditional citation metrics of associated papers, despite these metrics inadequately capturing the dynamic picture of impact and despite challenges with improper citation. RESULTS: To understand how software developers evaluate their tools, we conducted a survey of participants in the Informatics Technology for Cancer Research (ITCR) program funded by the National Cancer Institute (NCI). We found that although developers realize the value of more extensive metric collection, they are hindered by a lack of funding and time. We also investigated how often software in this community implemented infrastructure that supports more nontraditional metrics, and how this affected the rate of papers describing usage of the software. We found that infrastructure such as social media presence, more in-depth documentation, the presence of software health metrics, and clear information on how to contact developers seemed to be associated with increased mention rates. Analyzing more diverse metrics can enable developers to better understand user engagement, justify continued funding, identify novel use cases, pinpoint improvement areas, and ultimately amplify their software's impact. These analyses come with challenges, including distorted or misleading metrics, as well as ethical and security concerns. More attention to the nuances involved in capturing impact across the spectrum of biomedical software is needed. For funders and developers, we outline guidance based on experience from our community. By considering how we evaluate software, we can empower developers to create tools that more effectively accelerate biological and medical research progress. AVAILABILITY AND IMPLEMENTATION: More information about the analysis, as well as access to data and code, is available at https://github.com/fhdsl/ITCR_Metrics_manuscript_website.
Subjects
Biomedical Research , Software , Biomedical Research/methods , Humans , United States , Computational Biology/methods
ABSTRACT
Maintenance of the drug-addicted state is thought to involve changes in gene expression in different neuronal cell types and neural circuits. Midbrain dopamine (DA) neurons in particular mediate numerous responses to drugs of abuse. Long noncoding RNAs (lncRNAs) regulate CNS gene expression through a variety of mechanisms, but next to nothing is known about their role in drug abuse. The proportion of lncRNAs that are primate-specific provides a strong rationale for their study in human drug abusers. In this study, we determined a profile of dysregulated putative lncRNAs through the analysis of postmortem human midbrain specimens from chronic cocaine abusers and well-matched control subjects (n = 11 in each group) using a custom lncRNA microarray. A dataset comprising 32 well-annotated lncRNAs with independent evidence of brain expression and robust differential expression in cocaine abusers is presented. For a subset of these lncRNAs, differential expression was validated by quantitative real-time PCR and cellular localization determined by in situ hybridization histochemistry. Examples of lncRNAs exhibiting DA cell-specific expression, different subcellular distributions, and covariance of expression with known cocaine-regulated protein-coding genes were identified. These findings implicate lncRNAs in the cellular responses of human DA neurons to chronic cocaine abuse. Long noncoding RNAs (lncRNAs) regulate the expression of protein-coding genes, but little is known about their potential role in drug abuse. In this study, we identified lncRNAs differentially expressed in human cocaine abusers' midbrains. One up-regulated antisense lncRNA, tumor necrosis factor receptor-associated factor 3-interacting protein 2-antisense 1 (TRAF3IP2-AS1), was found predominantly in the nucleus of human dopamine (DA) neurons, whereas the related TRAF3IP2 protein-coding transcript was distributed throughout these cells. 
The abundances of these transcripts were significantly correlated (left) suggesting that TRAF3IP2-AS1 may regulate TRAF3IP2 gene expression, perhaps through local chromatin changes at this locus (right).
Subjects
Cocaine-Related Disorders/genetics , Mesencephalon/metabolism , Neurons/metabolism , Long Noncoding RNA/metabolism , RNA/metabolism , Tumor Necrosis Factor Receptor-Associated Peptides and Proteins/metabolism , Signal Transducing Adaptor Proteins , Cocaine/pharmacology , Cocaine-Related Disorders/metabolism , Dopamine/genetics , Dopamine/metabolism , Humans , Neurons/drug effects , Genetic Transcription
ABSTRACT
Data science education provides tremendous opportunities but remains inaccessible to many communities. Increasing the accessibility of data science to these communities not only benefits the individuals entering data science, but also increases the field's innovation and potential impact as a whole. Education is the most scalable solution to meet these needs, but many data science educators lack formal training in education. Our group has led education efforts for a variety of audiences: from professional scientists to high school students to lay audiences. These experiences have helped form our teaching philosophy, which we have summarized into three main ideals: 1) motivation, 2) inclusivity, and 3) realism. We also aim to iteratively update our teaching approaches and curriculum as we find ways to better reach these ideals. In this manuscript we discuss these ideals as well as practical ideas for how to implement these philosophies in the classroom.
Subjects
Data Science , Motivation , Humans , Data Science/education , Curriculum , Teaching
ABSTRACT
Data science and informatics tools are developing at a blistering rate, but their users often lack the educational background or resources to efficiently apply the methods to their research. Training resources and vignettes that accompany these tools often fall out of date because their maintenance is not prioritized by funding, giving teams little time to devote to such endeavors. Our group has developed Open-source Tools for Training Resources (OTTR) to offer greater efficiency and flexibility for creating and maintaining these training resources. OTTR empowers creators to customize their work and allows for a simple workflow to publish using multiple platforms. OTTR allows content creators to publish training material to multiple massive online learner communities using familiar rendering mechanics. OTTR allows the incorporation of pedagogical practices like formative and summative assessments in the form of multiple-choice questions and fill-in-the-blank problems that are automatically graded. No local installation of any software is required to begin creating content with OTTR. Thus far, 15 training courses have been created with the OTTR repository template. By using the OTTR system, the maintenance workload for updating these courses across platforms has been drastically reduced. For more information about OTTR and how to get started, go to ottrproject.org.
ABSTRACT
Software is vital for the advancement of biology and medicine. Through analysis of usage and impact metrics of software, developers can help determine user and community engagement. These metrics can be used to justify additional funding, encourage additional use, and identify unanticipated use cases. Such analyses can help define improvement areas and assist with managing project resources. However, there are challenges associated with assessing usage and impact, many of which vary widely depending on the type of software being evaluated. These challenges involve issues of distorted, exaggerated, understated, or misleading metrics, as well as ethical and security concerns. More attention to the nuances, challenges, and considerations involved in capturing impact across the diverse spectrum of biological software is needed. Furthermore, some tools may be especially beneficial to a small audience, yet may not have comparatively compelling metrics of high usage. Although some principles are generally applicable, there is not a single perfect metric or approach to effectively evaluate a software tool's impact, as this depends on aspects unique to each tool, how it is used, and how one wishes to evaluate engagement. We propose more broadly applicable guidelines (such as infrastructure that supports the usage of software and the collection of metrics about usage), as well as strategies for various types of software and resources. We also highlight outstanding issues in the field regarding how communities measure or evaluate software impact. To gain a deeper understanding of the issues hindering software evaluations, as well as to determine what appears to be helpful, we performed a survey of participants involved with scientific software projects for the Informatics Technology for Cancer Research (ITCR) program funded by the National Cancer Institute (NCI). 
We also investigated software among this scientific community and others to assess how often infrastructure that supports such evaluations is implemented and how this impacts rates of papers describing usage of the software. We find that although developers recognize the utility of analyzing data related to the impact or usage of their software, they struggle to find the time or funding to support such analyses. We also find that infrastructure such as social media presence, more in-depth documentation, the presence of software health metrics, and clear information on how to contact developers seems to be associated with increased usage rates. Our findings can help scientific software developers make the most out of the evaluations of their software so that they can more fully benefit from such assessments.
ABSTRACT
Pediatric brain and spinal cancers are collectively the leading disease-related cause of death in children; thus, we urgently need curative therapeutic strategies for these tumors. To accelerate such discoveries, the Children's Brain Tumor Network (CBTN) and Pacific Pediatric Neuro-Oncology Consortium (PNOC) created a systematic process for tumor biobanking, model generation, and sequencing with immediate access to harmonized data. We leverage these data to establish OpenPBTA, an open collaborative project with over 40 scalable analysis modules that genomically characterize 1,074 pediatric brain tumors. Transcriptomic classification reveals universal TP53 dysregulation in mismatch repair-deficient hypermutant high-grade gliomas and TP53 loss as a significant marker for poor overall survival in ependymomas and H3 K28-mutant diffuse midline gliomas. Already being actively applied to other pediatric cancers and PNOC molecular tumor board decision-making, OpenPBTA is an invaluable resource to the pediatric oncology community.
ABSTRACT
Tumor-associated macrophages (TAMs) play an important role in tumor immunity and comprise subsets that have distinct phenotype, function, and ontology. Transcriptomic analyses of human medulloblastoma, the most common malignant pediatric brain cancer, showed that medulloblastomas (MBs) with activated sonic hedgehog signaling (SHH-MB) have significantly more TAMs than other MB subtypes. Therefore, we examined MB-associated TAMs by single-cell RNA sequencing of autochthonous murine SHH-MB at steady state and under two distinct treatment modalities: molecular-targeted inhibitor and radiation. Our analyses reveal significant TAM heterogeneity, identify markers of ontologically distinct TAM subsets, and show the impact of brain microenvironment on the differentiation of tumor-infiltrating monocytes. TAM composition undergoes dramatic changes with treatment and differs significantly between molecular-targeted and radiation therapy. We identify an immunosuppressive monocyte-derived TAM subset that emerges with radiation therapy and demonstrate its role in regulating T cell and neutrophil infiltration in MB.
Subjects
Cerebellar Neoplasms/pathology , Cerebellar Neoplasms/therapy , Hedgehog Proteins/metabolism , Macrophages/metabolism , Macrophages/pathology , Medulloblastoma/pathology , Medulloblastoma/therapy , Animals , CD8-Positive T-Lymphocytes/immunology , Cerebellar Neoplasms/genetics , Cerebellar Neoplasms/immunology , Genetic Markers , Humans , Medulloblastoma/genetics , Medulloblastoma/immunology , Mice , Microglia/pathology , Monocytes/pathology , Single-Cell Analysis , Genetic Transcription , Tumor Microenvironment
ABSTRACT
Epigenetic marks operate at multiple chromosomal levels to regulate gene expression, from direct covalent modification of DNA to three-dimensional chromosomal structure. Research has shown that 5-methylcytosine (5-mC) and its oxidized form, 5-hydroxymethylcytosine (5-hmC), are stable epigenetic marks with distinct genomic distributions and separate regulatory functions. In addition, recent data indicate that 5-hmC plays a critical regulatory role in the mammalian brain, emphasizing the importance of considering this alternative DNA modification in the context of neuroepigenetics. Traditional bisulfite (BS) treatment-based methods to measure the methylome are not able to distinguish between 5-mC and 5-hmC, meaning much of the existing literature does not differentiate these two DNA modifications. Recently developed methods, including Tet-assisted bisulfite treatment and oxidative bisulfite treatment, allow for differentiation of 5-hmC and/or 5-mC levels at base-pair resolution when combined with next-generation sequencing or methylation arrays. Despite these technological advances, there remains a lack of clarity regarding the appropriate statistical methods for integration of 5-mC and 5-hmC data. As a result, it can be difficult to determine the effects of an experimental treatment on 5-mC and 5-hmC dynamics. Here, we propose a statistical approach involving mixed effects to simultaneously model paired 5-mC and 5-hmC data as repeated measures. We tested this approach using publicly available BS/oxidative bisulfite-450K array data and showed that our new approach detected far more CpG probes with paired changes in 5-mC and 5-hmC by Alzheimer's disease status (n = 14,183 probes) compared with the overlapping differential probes generated from separate models for each epigenetic mark (n = 68). Of note, all 68 of the overlapping probe IDs from the separate models were also significant in our new modeling approach, supporting the sensitivity of our new analysis method. 
Using the proposed approach, it will be possible to determine the effects of an experimental treatment on both 5-mC and 5-hmC at the base-pair level.
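One way to write down a mixed-effects model of this kind, as an illustrative sketch only: the abstract states that paired 5-mC and 5-hmC values are modeled simultaneously as repeated measures, but the specific terms below (a per-sample random intercept, a mark indicator, and a mark-by-disease interaction) and all symbols are assumptions, not the authors' published specification. For a single CpG probe, with sample $i$ and epigenetic mark $m$:

```latex
% Hypothetical per-probe model; every symbol here is an assumption.
% y_{im}: measured level of mark m (5-mC or 5-hmC) in sample i
% H_m:    indicator, 1 if the mark is 5-hmC and 0 if it is 5-mC
% D_i:    indicator, 1 if sample i has Alzheimer's disease, 0 otherwise
% u_i:    random intercept linking the two paired measurements of sample i
\begin{aligned}
y_{im} &= \beta_0 + \beta_1 H_m + \beta_2 D_i + \beta_3 \,(H_m \times D_i) + u_i + \varepsilon_{im}, \\
u_i &\sim \mathcal{N}(0, \sigma_u^2), \qquad
\varepsilon_{im} \sim \mathcal{N}(0, \sigma^2).
\end{aligned}
```

Under these assumptions, $\beta_2$ is the disease effect on 5-mC, $\beta_2 + \beta_3$ is the disease effect on 5-hmC, and the shared $u_i$ is what treats the two marks from one sample as repeated measures rather than independent observations.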
ABSTRACT
Opioid abuse is now the most common cause of accidental death in the US. Although opioids and most other drugs of abuse acutely increase signaling mediated by midbrain dopamine (DA)-synthesizing neurons, little is known about long-lasting changes in DA cells that may contribute to continued opioid abuse, craving, and relapse. A better understanding of the molecular and cellular bases of opioid abuse could lead to advancements in therapeutics. This study comprises, to our knowledge, the first unbiased examination of genome-wide changes in midbrain gene expression associated with human opioid abuse. Our analyses identified differentially expressed genes and distinct gene networks associated with opioid abuse, specific genes with predictive capability for subject assignment to the opioid abuse cohort, and genes most similarly affected in chronic opioid and cocaine abusers. We also identified differentially expressed long noncoding RNAs capable of regulating known drug-responsive protein-coding genes. Opioid-regulated genes identified in this study warrant further investigation as potential biomarkers and/or therapeutic targets for human substance abuse.
Subjects
Biomarkers/metabolism , Cocaine/pharmacology , Gene Regulatory Networks , Mesencephalon/metabolism , Opioid-Related Disorders/pathology , Long Noncoding RNA/metabolism , Differentiation Antigens/genetics , Differentiation Antigens/metabolism , Area Under Curve , Case-Control Studies , Humans , Hydrogen-Ion Concentration , Mesencephalon/chemistry , Mesencephalon/drug effects , Middle Aged , NF-KappaB Inhibitor alpha/genetics , NF-KappaB Inhibitor alpha/metabolism , Opioid-Related Disorders/genetics , Opioid-Related Disorders/metabolism , ROC Curve
ABSTRACT
Opioid abuse is now the primary cause of accidental deaths in the United States. Studies over several decades established the cyclical nature of abused drugs of choice, with a current resurgence of heroin abuse and, more recently, fentanyl's emergence as a major precipitant of drug-related deaths. To better understand abuse trends and to explore the potential lethality of specific drug-drug interactions, we conducted statistical analyses of forensic toxicological data from the Wayne County Medical Examiner's Office from 2012 to 2016. We observed clear changes in opioid abuse over this period, including the rapid emergence of fentanyl and its analogs as highly significant causes of lethality starting in 2014. We then used Chi-square Automatic Interaction Detector (CHAID)-based decision tree analyses to obtain insights regarding specific drugs, drug combinations, and biomarkers in blood most predictive of cause of death or circumstances surrounding death. The presence of the non-opioid drug acetaminophen was highly predictive of drug-related deaths, likely reflecting the abuse of various combined acetaminophen-opioid formulations. The short-lived cocaine adulterant levamisole was highly predictive of a short post-cocaine survival time preceding sudden non-drug-related death. The combination of the opioid methadone and the antidepressant citalopram was uniformly linked to drug death, suggesting a potential drug-drug interaction at the level of a pathophysiological effect on the heart and/or drug metabolism. The presence of fentanyl plus the benzodiazepine midazolam was diagnostic for in-hospital deaths following serious medical illness and interventions that included these drugs. These data highlight the power of decision tree analyses not only in the determination of cause of death, but also as a key surveillance tool to inform drug abuse treatment and public health policies for combating the opioid crisis.
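The core step of a CHAID-style tree, choosing the predictor most strongly associated with the outcome by a chi-square test, can be sketched as follows. This is a simplified illustration, not the study's analysis. It omits the category merging and Bonferroni adjustment of full CHAID and ranks binary predictors by the raw Pearson statistic, which orders them the same way as p-values only because they all have one degree of freedom. The function names and the toy data are hypothetical and are not drawn from the toxicology dataset.

```python
from collections import Counter


def chi_square_statistic(predictor, outcome):
    """Pearson chi-square statistic for the contingency table of two categorical lists."""
    n = len(predictor)
    observed = Counter(zip(predictor, outcome))  # cell counts
    row_totals = Counter(predictor)
    col_totals = Counter(outcome)
    stat = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n  # count expected under independence
            stat += (observed[(r, c)] - expected) ** 2 / expected
    return stat


def best_split(predictors, outcome):
    """Name of the predictor with the largest chi-square statistic against the outcome."""
    return max(predictors, key=lambda name: chi_square_statistic(predictors[name], outcome))


# Made-up toy data (1 = present / drug-related death, 0 = absent / other cause).
toy_outcome = [1, 1, 1, 1, 0, 0, 0, 0]
toy_predictors = {
    "drug_a": [1, 1, 1, 1, 0, 0, 0, 0],  # perfectly tracks the outcome
    "drug_b": [1, 0, 1, 0, 1, 0, 1, 0],  # independent of the outcome
}

if __name__ == "__main__":
    for name, values in toy_predictors.items():
        print(name, chi_square_statistic(values, toy_outcome))
    print("first split on:", best_split(toy_predictors, toy_outcome))
```

A full tree would apply the same selection recursively within each branch and stop when no predictor reaches a chosen significance threshold.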
ABSTRACT
The development of new therapeutic strategies for the treatment of complex brain disorders such as drug addiction is likely to be advanced by a more complete understanding of the underlying molecular pathophysiology. Although the study of postmortem human brain represents a unique resource in this regard, it can be challenging to disentangle the relative contribution of chronic pathological processes versus perimortem events to the observed changes in gene expression. To begin to unravel this issue, we analyzed by quantitative PCR the midbrain expression of numerous candidate genes previously associated with cocaine abuse. Data obtained from chronic cocaine abusers (and matched control subjects) dying of gunshot wounds were compared with a prior study of subjects with deaths directly attributable to cocaine abuse. Most of the genes studied (i.e., tyrosine hydroxylase, dopamine transporter, forkhead box A2, histone variant H3 family 3B, nuclear factor kappa B inhibitor alpha, growth arrest and DNA damage-inducible beta) were found to be differentially expressed in chronic cocaine abusers irrespective of immediate cause of death or perimortem levels of cocaine, suggesting that these may represent core pathophysiological changes arising with chronic drug abuse. On the other hand, chemokine C-C motif ligand 2 and jun proto-oncogene expression were unaffected in cocaine-abusing subjects dying of gunshot wounds, in contrast to the differential expression previously reported in cocaine-related fatalities. The possible influence of cause of death and other factors on the cocaine-responsiveness of these genes is discussed.