1.
Nucleic Acids Res ; 44(11): e103, 2016 06 20.
Article in English | MEDLINE | ID: mdl-27016733

ABSTRACT

Identifying coding genes is an essential step in genome annotation. Here, we utilize existing whole genome alignments to detect conserved coding exons and then map gene annotations from one genome to many aligned genomes. We show that genome alignments contain thousands of spurious frameshifts and splice site mutations in exons that are truly conserved. To overcome these limitations, we have developed CESAR (Coding Exon-Structure Aware Realigner) that realigns coding exons, while considering reading frame and splice sites of each exon. CESAR effectively avoids spurious frameshifts in conserved genes and detects 91% of shifted splice sites. This results in the identification of thousands of additional conserved exons and 99% of the exons that lack inactivating mutations match real exons. Finally, to demonstrate the potential of using CESAR for comparative gene annotation, we applied it to 188 788 exons of 19 865 human genes to annotate human genes in 99 other vertebrates. These comparative gene annotations are available as a resource (http://bds.mpi-cbg.de/hillerlab/CESAR/). CESAR (https://github.com/hillerlab/CESAR/) can readily be applied to other alignments to accurately annotate coding genes in many other vertebrate and invertebrate genomes.
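The abstract above names two properties that CESAR takes into account: the reading frame and the splice sites of each exon. The following minimal Python sketch only illustrates what checking those two properties on an already aligned exon might look like. It is not CESAR's realignment method, which the abstract does not describe in detail; the function names, the gap character `-`, and the choice of AG/GT as the canonical acceptor and donor dinucleotides are assumptions made for illustration.

```python
from typing import List

# Illustrative only: this is NOT CESAR's realignment algorithm.
# Assumed conventions: '-' marks an alignment gap; AG/GT are the canonical
# acceptor/donor dinucleotides flanking an internal exon.
CANONICAL_ACCEPTOR = "AG"  # last two intron bases upstream of the exon
CANONICAL_DONOR = "GT"     # first two intron bases downstream of the exon


def gap_runs(aligned: str) -> List[int]:
    """Return the lengths of consecutive '-' runs in an aligned sequence."""
    runs, current = [], 0
    for ch in aligned:
        if ch == "-":
            current += 1
        else:
            if current:
                runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


def has_frameshift(ref_aln: str, query_aln: str) -> bool:
    """True if any single insertion or deletion has a length not divisible by 3.

    Simplification: compensating indels (e.g. +1 followed by -1) are still flagged.
    """
    indels = gap_runs(ref_aln) + gap_runs(query_aln)
    return any(n % 3 != 0 for n in indels)


def splice_sites_intact(upstream2: str, downstream2: str) -> bool:
    """True if both flanking dinucleotides match the assumed canonical sites."""
    return (upstream2.upper() == CANONICAL_ACCEPTOR
            and downstream2.upper() == CANONICAL_DONOR)
```

For example, `has_frameshift("ATGGCCAAATGA", "ATGG-CAAATGA")` flags the single-base deletion, while a three-base gap would keep the reading frame. First and last coding exons lack one of the two splice sites, which this sketch does not handle.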


Subjects
Exons , Genome , Genomics/methods , Molecular Sequence Annotation , Software , Animals , Cattle , Codon , Computational Biology/methods , Dogs , Evolution, Molecular , Humans , Introns , Mice , Mutation , Open Reading Frames , Phylogeny , RNA Splice Sites , Rats , Reading Frames , Reproducibility of Results , Web Browser
2.
Stud Health Technol Inform ; 290: 111-115, 2022 Jun 06.
Article in English | MEDLINE | ID: mdl-35672981

ABSTRACT

The CDISC Controlled Terminology (CT) defines the terms that may be used to represent clinical trial data in the CDISC standards. Despite its unique importance, there has been limited systematic examination of the coverage of this terminology. In this work, we performed an assessment of the completeness of CDISC CT's coverage by comparing clinical outcomes for multiple sclerosis (MS) available in CDISC CT with two independent high-fidelity benchmarks: (1) 71 expert-selected outcomes catalogued by the National Institute of Neurological Disorders and Stroke (NINDS), and, (2) 66 common outcomes used in MS trials registered on ClinicalTrials.gov (CTG). We employed a semi-automated search and term-mapping process to identify possible CDISC equivalents to the benchmarks' measures. We found that 55% of the NINDS outcomes and 52% of the CTG outcomes are absent from the CDISC Terminology, indicating a need for expanding the terminology to take into account other established standards and real-world practice.


Subjects
Benchmarking , Controlled Vocabulary , Clinical Trials as Topic , Humans
3.
JMIR Med Inform ; 9(2): e18298, 2021 Feb 08.
Article in English | MEDLINE | ID: mdl-33460388

ABSTRACT

BACKGROUND: Common disease-specific outcomes are vital for ensuring comparability of clinical trial data and enabling meta-analyses and interstudy comparisons. Traditionally, the process of deciding which outcomes should be recommended as common for a particular disease relied on assembling and surveying panels of subject-matter experts. This is usually a time-consuming and laborious process.
OBJECTIVE: The objectives of this work were to develop and evaluate a generalized pipeline that can automatically identify common outcomes specific to any given disease by finding, downloading, and analyzing data of previous clinical trials relevant to that disease.
METHODS: An automated pipeline to interface with ClinicalTrials.gov's application programming interface and download the relevant trials for the input condition was designed. The primary and secondary outcomes of those trials were parsed and grouped based on text similarity and ranked based on frequency. The quality and usefulness of the pipeline's output were assessed by comparing the top outcomes identified by it for chronic obstructive pulmonary disease (COPD) to a list of 80 outcomes manually abstracted from the most frequently cited and comprehensive reviews delineating clinical outcomes for COPD.
RESULTS: The common disease-specific outcome pipeline successfully downloaded and processed 3876 studies related to COPD. Manual verification indicated that the pipeline was downloading and processing the same number of trials as were obtained from the self-service ClinicalTrials.gov portal. Evaluating the automatically identified outcomes against the manually abstracted ones showed that the pipeline achieved a recall of 92% and precision of 79%. The precision number indicated that the pipeline was identifying many outcomes that were not covered in the literature reviews. Assessment of those outcomes indicated that they are relevant to COPD and could be considered in future research.
CONCLUSIONS: An automated evidence-based pipeline can identify common clinical trial outcomes of comparable breadth and quality as the outcomes identified in comprehensive literature reviews. Moreover, such an approach can highlight relevant outcomes for further consideration.
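
The METHODS and RESULTS above describe grouping outcomes by text similarity, ranking them by frequency, and scoring the result with recall and precision. A minimal Python sketch of those three steps follows. It is an illustration under stated assumptions, not the authors' pipeline: the abstract does not say which similarity measure was used, so `difflib.SequenceMatcher` from the standard library is swapped in, and the greedy grouping strategy, the `0.8` threshold, and all function names are hypothetical. The sketch starts from outcome strings that have already been downloaded; it makes no calls to ClinicalTrials.gov.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Set, Tuple


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(text.lower().split())


def group_outcomes(outcomes: List[str], threshold: float = 0.8) -> List[Tuple[str, int]]:
    """Greedy grouping: each outcome joins the first group whose representative
    is at least `threshold` similar; otherwise it starts a new group.
    Returns (representative, count) pairs ranked by descending frequency.
    The threshold value is an assumption, not taken from the paper."""
    groups: Dict[str, int] = {}
    for raw in outcomes:
        text = normalize(raw)
        for rep in groups:
            if SequenceMatcher(None, rep, text).ratio() >= threshold:
                groups[rep] += 1
                break
        else:
            groups[text] = 1
    return sorted(groups.items(), key=lambda kv: kv[1], reverse=True)


def precision_recall(predicted: Set[str], reference: Set[str]) -> Tuple[float, float]:
    """Precision = hits / predicted; recall = hits / reference."""
    hits = len(predicted & reference)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(reference) if reference else 0.0
    return precision, recall
```

Under this definition, a precision below 100% means some identified outcomes are missing from the reference list, which matches the abstract's reading that the pipeline surfaced outcomes not covered in the literature reviews.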

4.
Stud Health Technol Inform ; 284: 505-509, 2021 Dec 15.
Article in English | MEDLINE | ID: mdl-34920582

ABSTRACT

Common outcome sets are vital for ensuring usability of clinical trial results and enabling inter-study comparisons. The task of identifying clinical outcomes for a particular field is cumbersome and time-consuming. The aim of this work was to develop an automated pipeline for identifying common outcomes by analyzing outcomes from relevant trials reported at ClinicalTrials.gov and to assess the pipeline accuracy. We validated the output of our pipeline by comparing the outcomes it identified for acute coronary syndromes and coronary artery disease with the set of outcomes recommended for these conditions by a panel of experts in a widely cited report. We found that our pipeline identified the same or similar outcomes for 100% of the outcomes recommended in the experts' report. The coverage of the pipeline's results dropped only slightly (to 21 out of 23 outcome domains, 91%) when we restricted the pipeline to trials posted before the publication of the report, indicating a great potential for this pipeline to be used in aiding and informing the future development of core outcome measures in clinical trials.


Subjects
Outcome Assessment, Health Care
5.
Stud Health Technol Inform ; 272: 379-382, 2020 Jun 26.
Article in English | MEDLINE | ID: mdl-32604681

ABSTRACT

Common Data Elements (CDEs) are necessary for ensuring data sharing across studies, providing comparability, and enabling aggregation and meta-analyses. The process of developing a set of CDEs for a given clinical research area has typically been arduous and time-consuming. In this work we introduce an automated pipeline that can greatly aid the process by identifying, aggregating, and ranking relevant CDEs from the outcomes of studies registered on ClinicalTrials.gov (CTG). The pipeline uses the Medical Subject Headings (MeSH) ontology to group and rank candidate CDEs by specific diseases. The initial CDE pipeline has been tested using an emerging research domain. The resulting CDE output was aligned with the current recommendations in the corresponding subject area. Further development of automated means for CDE generation based on structured information from CTG and MeSH is warranted.


Subjects
Biomedical Research , Common Data Elements , National Institute of Neurological Disorders and Stroke (U.S.) , United States
6.
Article in English | MEDLINE | ID: mdl-31979291

ABSTRACT

Online social communities are becoming windows for learning more about the health of populations, through information about our health-related behaviors and outcomes from daily life. At the same time, just as public health data and theory have shown that aspects of the built environment can affect our health-related behaviors and outcomes, it is also possible that online social environments (e.g., posts and other attributes of our online social networks) can shape facets of our life. Given the important role of the online environment in public health research and implications, factors which contribute to the generation of such data must be well understood. Here we study the role of the built and online social environments in the expression of dining on Instagram in Abu Dhabi: a ubiquitous social media platform, a city with a vibrant dining culture, and a topic (food posts) which has been studied in relation to public health outcomes. Our study uses available data on user Instagram profiles and their Instagram networks, as well as the local food environment measured through the dining types (e.g., casual dining restaurants, food court restaurants, lounges, etc.) by neighborhood. We find evidence that factors of the online social environment (profiles that post about dining versus profiles that do not post about dining) have different influences on the relationship between a user's built environment and the social dining expression, with effects also varying by dining types in the environment and time of day. We examine the mechanism of the relationships via moderation and mediation analyses. Overall, this study provides evidence that the interplay of online and built environments depends on attributes of said environments and can also vary by time of day. We discuss implications of this synergy for precisely targeting public health interventions, as well as for using online data for public health research.


Subjects
Built Environment , Restaurants , Social Environment , Social Media , Social Networking , United Arab Emirates
7.
Article in English | MEDLINE | ID: mdl-29264592

ABSTRACT

Understanding tobacco- and alcohol-related behavioral patterns is critical for uncovering risk factors and potentially designing targeted social computing intervention systems. Given that we make choices multiple times per day, hourly and daily patterns are critical for better understanding behaviors. Here, we combine natural language processing, machine learning and time series analyses to assess Twitter activity specifically related to alcohol and tobacco consumption and their sub-daily, daily and weekly cycles. Twitter self-reports of alcohol and tobacco use are compared to other data streams available at similar temporal resolution. We assess if discussion of drinking by inferred underage versus legal age people or discussion of use of different types of tobacco products can be differentiated using these temporal patterns. We find that time and frequency domain representations of behaviors on social media can provide meaningful and unique insights, and we discuss the types of behaviors for which the approach may be most useful.
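
The abstract above refers to frequency domain representations of sub-daily, daily and weekly cycles. As a minimal sketch of that idea, the Python code below applies a plain discrete Fourier transform to a series of hourly counts and reports the period of the strongest cycle. The abstract does not specify the authors' actual time series methods, so the DFT, the function name `dominant_period`, and the synthetic input series are all assumptions made for illustration; no real Twitter data is used.

```python
import cmath
import math
from typing import List


def dominant_period(counts: List[float]) -> float:
    """Return the period, in samples, of the strongest non-constant frequency.

    Uses a direct O(n^2) discrete Fourier transform on the mean-centered series.
    """
    n = len(counts)
    mean = sum(counts) / n
    centered = [c - mean for c in counts]
    best_k, best_power = 1, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return n / best_k


# Synthetic data (an assumption, not from the paper): two weeks of hourly
# counts following a pure 24-hour cycle around a baseline of 10.
hourly = [10 + 5 * math.sin(2 * math.pi * t / 24) for t in range(24 * 14)]
period_hours = dominant_period(hourly)
```

On this synthetic series the strongest component has a period of 24 samples, i.e. one day. Real post counts would be noisier and could show several peaks, for example a daily and a weekly one.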
