ABSTRACT
Transcription factors (TFs) are managers of the cellular factory and key components of many diseases. Many non-coding single nucleotide polymorphisms affect transcription factors by altering either the protein directly or its functional activity at individual binding sites. Here we first briefly summarize high-throughput approaches to studying transcription factor activity. We then demonstrate, using published chromatin accessibility data (specifically ATAC-seq), that the genome-wide profile of TF recognition motifs relative to regions of open chromatin can determine the key transcription factor altered by a perturbation. Our method of determining which TFs are altered by a perturbation is simple, quick to implement, and usable when biological samples are limited. In the future, we envision that this method could be applied to determine which TFs show altered activity in response to a wide variety of drugs and diseases.
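The idea of profiling motif positions relative to open chromatin can be illustrated with a minimal sketch. It is not the authors' statistic: the function names, the window sizes (150 bp and 1500 bp), and the single-chromosome coordinates are all assumptions made for illustration. The score is the fraction of motif hits lying close to an open-region center, among those lying within a wider window, and a shift in that fraction between conditions hints at altered TF activity.

```python
from bisect import bisect_left


def nearest_distance(pos, sorted_centers):
    """Distance from pos to the closest center in a sorted, non-empty list."""
    i = bisect_left(sorted_centers, pos)
    # Only the neighbors on either side of the insertion point can be closest.
    candidates = sorted_centers[max(i - 1, 0): i + 1]
    return min(abs(pos - c) for c in candidates)


def colocalization_score(motif_positions, region_centers, small=150, large=1500):
    """Fraction of motif hits within `small` bp of an open-region center,
    among hits within `large` bp. Window sizes are illustrative assumptions."""
    centers = sorted(region_centers)
    if not centers:
        return 0.0
    distances = [nearest_distance(p, centers) for p in motif_positions]
    within_large = [d for d in distances if d <= large]
    if not within_large:
        return 0.0
    within_small = sum(1 for d in within_large if d <= small)
    return within_small / len(within_large)


# Hypothetical coordinates on one chromosome (not real data).
motifs = [1010, 1900, 5100, 6200]
control_centers = [1000, 5000]
treated_centers = [1000, 1900, 5000, 6200]

control_score = colocalization_score(motifs, control_centers)
treated_score = colocalization_score(motifs, treated_centers)
change = treated_score - control_score
```

In this toy input the treated condition opens chromatin directly over two additional motif hits, so the score rises. A real analysis would repeat the comparison per TF motif across all chromosomes and assess significance.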
Subjects
Sequence Analysis, DNA; Transcription Factors/metabolism; Cell Line, Tumor; Disease/genetics; Humans; Mutation/genetics; Nucleotide Motifs/genetics
ABSTRACT
Translational research requires data at multiple scales of biological organization. Advancements in sequencing and multi-omics technologies have increased the availability of these data, but researchers face significant integration challenges. Knowledge graphs (KGs) are used to model complex phenomena, and methods exist to construct them automatically. However, tackling complex biomedical integration problems requires flexibility in the way knowledge is modeled. Moreover, existing KG construction methods provide robust tooling at the cost of fixed or limited choices among knowledge representation models. PheKnowLator (Phenotype Knowledge Translator) is a semantic ecosystem for automating the FAIR (Findable, Accessible, Interoperable, and Reusable) construction of ontologically grounded KGs with fully customizable knowledge representation. The ecosystem includes KG construction resources (e.g., data preparation APIs), analysis tools (e.g., SPARQL endpoint resources and abstraction algorithms), and benchmarks (e.g., prebuilt KGs). We evaluated the ecosystem by systematically comparing it to existing open-source KG construction methods and by analyzing its computational performance when used to construct 12 different large-scale KGs. With flexible knowledge representation, PheKnowLator enables fully customizable KGs without compromising performance or usability.
Subjects
Biological Science Disciplines; Knowledge Bases; Pattern Recognition, Automated; Algorithms; Translational Research, Biomedical
ABSTRACT
Animal methods bias in scientific publishing is a newly defined type of publishing bias describing a preference for animal-based methods where they may not be necessary or where nonanimal-based methods may already be suitable, which impacts the likelihood or timeliness of a manuscript being accepted for publication. This article covers the output from a workshop between stakeholders in publishing, academia, industry, government, and non-governmental organizations. The intent of the workshop was to exchange perspectives on the prevalence, causes, and impact of animal methods bias in scientific publishing, as well as to explore mitigation strategies. Output from the workshop includes summaries of presentations, breakout group discussions, participant polling results, and a synthesis of recommendations for mitigation. Overall, participants felt that animal methods bias has a meaningful impact on scientific publishing, though more evidence is needed to demonstrate its prevalence. Significant consequences of this bias that were identified include the unnecessary use of animals in scientific procedures, the continued reliance on animals in research even where suitable nonanimal methods exist, poor rates of clinical translation, delays in publication, and negative impacts on career trajectories in science. Workshop participants offered recommendations for journals, publishers, funders, governments, and other policy makers, as well as the scientific community at large, to reduce the prevalence and impacts of animal methods bias. The workshop resulted in the creation of working groups committed to addressing animal methods bias, and activities are ongoing.
Subjects
Publishing; Research Design; Humans; Animals
ABSTRACT
The assay for transposase-accessible chromatin followed by sequencing (ATAC-seq) is an inexpensive protocol for measuring open chromatin regions. ATAC-seq is also relatively simple and requires fewer cells than many other high-throughput sequencing protocols. It is therefore tractable in numerous settings where other high-throughput assays are challenging or impossible, which makes it important to understand the limits of what can be inferred from ATAC-seq data. In this work, we leverage ATAC-seq to predict the presence of nascent transcription. Nascent transcription assays are the current gold standard for identifying regions of active transcription, including markers for functional transcription factor (TF) binding. We combine mapped short reads from ATAC-seq with the underlying peak sequence to determine regions of active transcription genome-wide. We show that a hybrid signal/sequence representation classified using recurrent neural networks (RNNs) can identify these regions across different cell types.
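One way to picture a hybrid signal/sequence representation is sketched below. The abstract does not specify the encoding, so everything here is an assumption: the hypothetical function `encode_peak`, the one-hot base channels, and the per-peak normalization of read depth. Each position in a peak becomes a five-value vector, four for the base and one for the coverage signal, giving a sequence of vectors that a recurrent network could consume step by step.

```python
BASES = "ACGT"


def encode_peak(sequence, coverage):
    """Encode a peak as a list of 5-value vectors: a one-hot base (A, C, G, T)
    followed by read depth scaled by the peak's maximum depth.
    This layout is an illustrative assumption, not the paper's encoding."""
    if len(sequence) != len(coverage):
        raise ValueError("sequence and coverage must have the same length")
    # Avoid division by zero for empty or zero-coverage peaks.
    peak_max = max(coverage, default=0) or 1
    encoded = []
    for base, depth in zip(sequence.upper(), coverage):
        # Ambiguous bases such as N match no channel and stay all-zero.
        one_hot = [1.0 if base == b else 0.0 for b in BASES]
        encoded.append(one_hot + [depth / peak_max])
    return encoded


# Hypothetical four-base peak with per-base read depth (not real data).
example = encode_peak("ACGN", [2, 4, 4, 0])
```

Keeping signal and sequence in the same per-position vector lets a single recurrent model see both at each step; separate encoders merged later would be an equally plausible design.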
Subjects
DNA-Directed RNA Polymerases/metabolism; Sequence Analysis, DNA/methods; Transcription Initiation Site; A549 Cells; HCT116 Cells; Humans; MCF-7 Cells; Neural Networks, Computer; Nucleotide Motifs; Protein Binding; Transcription Factors/metabolism
ABSTRACT
Knowledge-based biomedical data science involves the design and implementation of computer systems that act as if they knew about biomedicine. Such systems depend on formally represented knowledge in computer systems, often in the form of knowledge graphs. Here we survey recent progress in systems that use formally represented knowledge to address data science problems in both clinical and biological domains, as well as progress on approaches for creating knowledge graphs. Major themes include the relationships between knowledge graphs and machine learning, the use of natural language processing to construct knowledge graphs, and the expansion of novel knowledge-based approaches to clinical and biological domains.
ABSTRACT
When considering toxic chemicals in the environment, a mechanistic, causal explanation of toxicity may be preferred over a statistical or machine learning-based prediction by itself. Elucidating a mechanism of toxicity is, however, a costly and time-consuming process that requires the participation of specialists from a variety of fields and often relies on animal models. We present an innovative mechanistic inference framework (MechSpy), which can be used as a hypothesis generation aid to narrow the scope of mechanistic toxicology analysis. MechSpy generates hypotheses of the most likely mechanisms of toxicity by combining a semantically interconnected knowledge representation of human biology, toxicology, and biochemistry with gene expression time series from human tissue. Using vector representations of biological entities, MechSpy seeks enrichment in a manually curated list of high-level mechanisms of toxicity, represented as biochemically and causally linked ontology concepts. Besides predicting the canonical mechanism of toxicity for many well-studied compounds, we experimentally validated some of our predictions for other chemicals without an established mechanism of toxicity. This mechanistic inference framework is an advantageous tool for predictive toxicology, and the first of its kind to produce a mechanistic explanation for each prediction. MechSpy can be modified to include additional mechanisms of toxicity, and is generalizable to other types of mechanisms of human biology.