1.
BMC Bioinformatics ; 15: 267, 2014 Aug 08.
Article in English | MEDLINE | ID: mdl-25103881

ABSTRACT

BACKGROUND: The phenome represents a distinct set of information in the human population. It has been explored particularly in its relationship with the genome to identify correlations for diseases. The phenome has also been explored for drug repositioning, with efforts focusing on the search space for the most similar candidate drugs. For a comprehensive analysis of the phenome, we assumed that all phenotypes (indications and side effects) were inter-connected with a probabilistic distribution and that this characteristic may offer an opportunity to identify new therapeutic indications for a given drug. Correspondingly, we employed Latent Dirichlet Allocation (LDA), which introduces latent variables (topics) to govern the phenome distribution. RESULTS: We developed our model on the phenome information in the Side Effect Resource (SIDER). We first developed an LDA model optimized based on its recovery potential by perturbing the drug-phenotype matrix for each of the drug-indication pairs: each drug-indication relationship was switched to "unknown" one at a time and then recovered based on the remaining drug-phenotype pairs. Of the probabilistically significant pairs, 70% were successfully recovered. Next, we applied the model to the whole phenome to narrow down repositioning candidates and suggest alternative indications. We were able to retrieve approved indications of 6 drugs whose indications were not listed in SIDER. For 908 drugs that were present with their indication information, our model suggested alternative treatment options for further investigation. Several of the suggested new uses can be supported with information from the scientific literature. CONCLUSIONS: The results demonstrated that the phenome can be further analyzed by a generative model, which can discover probabilistic associations between drugs and therapeutic uses. In this regard, LDA serves as an enrichment tool to explore new uses of existing drugs by narrowing down the search space.


Subjects
Computational Biology/methods , Drug Repositioning/methods , Models, Statistical , Phenotype , Data Mining , Databases, Pharmaceutical , Drug-Related Side Effects and Adverse Reactions
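
The hold-one-out recovery test described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the scorer below is a simple co-occurrence stand-in rather than LDA, and the names `cooccurrence_score`, `recovery_rate`, and the threshold value are assumptions introduced for the example.

```python
from typing import Callable, Dict, List, Set, Tuple

# drug -> set of phenotypes (indications and side effects together)
Matrix = Dict[str, Set[str]]


def cooccurrence_score(matrix: Matrix, drug: str, phenotype: str) -> float:
    """Stand-in scorer (NOT LDA): highest Jaccard similarity between `drug`
    and any other drug that carries `phenotype`."""
    own = matrix[drug]
    best = 0.0
    for other, phenotypes in matrix.items():
        if other == drug or phenotype not in phenotypes:
            continue
        union = own | phenotypes
        if union:
            best = max(best, len(own & phenotypes) / len(union))
    return best


def recovery_rate(
    matrix: Matrix,
    indications: List[Tuple[str, str]],
    score: Callable[[Matrix, str, str], float],
    threshold: float,
) -> float:
    """Hide each drug-indication pair one at a time and count how often the
    scorer, seeing only the remaining pairs, still rates it above threshold."""
    recovered = 0
    for drug, indication in indications:
        # Work on a copy so the caller's matrix is never modified.
        perturbed = {d: set(p) for d, p in matrix.items()}
        perturbed[drug].discard(indication)  # switch the pair to "unknown"
        if score(perturbed, drug, indication) >= threshold:
            recovered += 1
    return recovered / len(indications) if indications else 0.0
```

Passing the scorer as a parameter keeps the perturbation loop independent of the model, so a topic-model-based scorer could be substituted without changing the evaluation.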
2.
Am J Pathol ; 182(4): 1180-7, 2013 Apr.
Article in English | MEDLINE | ID: mdl-23395088

ABSTRACT

Drug-induced liver injury (DILI) may present any morphologic characteristic of acute or chronic liver disease with no standardized terminology in place. Defining lexemes of DILI histopathology would allow the development of advanced knowledge discovery and data mining tools for comparisons across publicly available information. For these purposes, a DILI ontology (DILIo) was developed by using the Unified Medical Language System tool and the standardized terminology of the Systematized Nomenclature of Medicine-Clinical Terms (SNOMED CT). The DILIo was entrained on findings of 114 US Food and Drug Administration-approved drugs by extracting all clinically DILI-related histopathologic descriptions for 1082 liver biopsy samples, which were then analyzed using the Unified Medical Language System MetaMap and subsequently mapped to the SNOMED CT. The DILIo provides a standard means to describe and organize liver injury induced by drugs, enabling comparative analysis of drugs within and across histopathologic terms. The analysis showed that flutamide, troglitazone, diclofenac, isoniazid, and tamoxifen were reported to have the most diverse histopathologic observations in liver biopsy. Necrosis, cholestasis, fatty degeneration, fibrosis, infiltrate, and hepatic necrosis were the most frequent terms used as descriptors of histopathologic features of DILI. In conclusion, DILIo entrains different algorithms for an efficient meta-analysis of published findings for an improved understanding of mechanisms and clinical characteristics of DILI.


Subjects
Chemical and Drug Induced Liver Injury/pathology , Terminology as Topic , Drug-Related Side Effects and Adverse Reactions , Humans , Liver/pathology , Publications , Thioguanine/adverse effects
3.
Virus Evol ; 10(1): veae015, 2024.
Article in English | MEDLINE | ID: mdl-38510920

ABSTRACT

We investigated transmission dynamics of a large human immunodeficiency virus (HIV) outbreak among persons who inject drugs (PWID) in KY and OH during 2017-20 by using detailed phylogenetic, network, recombination, and cluster dating analyses. Using polymerase (pol) sequences from 193 people associated with the investigation, we document high HIV-1 diversity, including Subtype B (44.6 per cent); numerous circulating recombinant forms (CRFs) including CRF02_AG (2.5 per cent) and CRF02_AG-like (21.8 per cent); and many unique recombinant forms composed of CRFs with major subtypes and sub-subtypes [CRF02_AG/B (24.3 per cent), B/CRF02_AG/B (0.5 per cent), and A6/D/B (6.4 per cent)]. Cluster analysis of sequences using a 1.5 per cent genetic distance identified thirteen clusters, including a seventy-five-member cluster composed of CRF02_AG-like and CRF02_AG/B, an eighteen-member CRF02_AG/B cluster, Subtype B clusters of sizes ranging from two to twenty-three, and a nine-member A6/D and A6/D/B cluster. Recombination and phylogenetic analyses identified CRF02_AG/B variants with ten unique breakpoints likely originating from Subtype B and CRF02_AG-like viruses in the largest clusters. The addition of contact tracing results from OH to the genetic networks identified linkage between persons with Subtype B, CRF02_AG, and CRF02_AG/B sequences in the clusters supporting de novo recombinant generation. Superinfection prevalence was 13.3 per cent (8/60) in persons with multiple specimens and included infection with B and CRF02_AG; B and CRF02_AG/B; or B and A6/D/B. In addition to the presence of multiple, distinct molecular clusters associated with this outbreak, cluster dating inferred transmission associated with the largest molecular cluster occurred as early as 2006, with high transmission rates during 2017-8 in certain other molecular clusters. 
This outbreak among PWID in KY and OH was likely driven by rapid transmission of multiple HIV-1 variants including de novo viral recombinants from circulating viruses within the community. Our findings documenting the high HIV-1 transmission rate and clustering through partner services and molecular clusters emphasize the importance of leveraging multiple different data sources and analyses, including those from disease intervention specialist investigations, to better understand outbreak dynamics and interrupt HIV spread.
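
Threshold-based molecular clustering of the kind described in this abstract (sequences linked at a 1.5 per cent genetic distance) can be sketched as follows. The abstract does not state which distance model was used, so the uncorrected p-distance here is an assumption; the function names and the single-linkage grouping are also illustrative choices rather than the authors' pipeline.

```python
from itertools import combinations
from typing import Dict, List


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and aligned to equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_clusters(seqs: Dict[str, str], threshold: float = 0.015) -> List[List[str]]:
    """Link every pair at or below `threshold`, then return the connected
    components with at least two members, largest first."""
    parent = {name: name for name in seqs}

    def find(x: str) -> str:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if p_distance(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)

    groups: Dict[str, List[str]] = {}
    for name in seqs:
        groups.setdefault(find(name), []).append(name)
    clusters = [sorted(g) for g in groups.values() if len(g) >= 2]
    return sorted(clusters, key=lambda g: (-len(g), g))
```

Because linkage is transitive, two sequences farther apart than the threshold can still share a cluster when an intermediate sequence connects them, which is one way large clusters can arise.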

4.
BMC Bioinformatics ; 14 Suppl 14: S11, 2013.
Article in English | MEDLINE | ID: mdl-24267543

ABSTRACT

BACKGROUND: High Content Screening (HCS) has become an important tool for toxicity assessment, partly due to its advantage of handling multiple measurements simultaneously. This approach has provided insight and contributed to the understanding of systems biology at the cellular level. To fully realize this potential, the simultaneously measured multiple endpoints from a live cell should be considered in a probabilistic relationship to assess the cell's condition in response to stress from a treatment, which poses a great challenge in extracting hidden knowledge and relationships from these measurements. METHOD: In this work, we applied a text mining method, Latent Dirichlet Allocation (LDA), to analyze cellular endpoints from in vitro HCS assays and related the findings to in vivo histopathological observations. We measured multiple HCS assay endpoints for 122 drugs. Since LDA requires the data to be represented in document-term format, we first converted the continuous values of the measurements to word frequencies that can be processed by the text mining tool. For each of the drugs, we generated a document for each of the 4 time points. Thus, we ended with 488 documents (drug-hour), each having different values for the 10 endpoints, which are treated as words. We extracted three topics using LDA and examined these to identify diagnostic topics for 45 drugs in common with the in vivo experiments of the Japanese Toxicogenomics Project (TGP), observing their necrosis findings at 6 and 24 hours after treatment. RESULTS: We found that assay endpoints assigned to particular topics were in concordance with the histopathology observed. Drugs showing necrosis at 6 hours were linked to severe damage events such as Steatosis, DNA Fragmentation, Mitochondrial Potential, and Lysosome Mass. DNA Damage and Apoptosis were associated with drugs causing necrosis at 24 hours, suggesting an interplay of the two pathways in these drugs. 
Drugs with no sign of necrosis were related to the Cell Loss and Nuclear Size assays, which is suggestive of hepatocyte regeneration. CONCLUSIONS: The evidence from this study suggests that topic modeling with LDA can enable us to interpret relationships of endpoints of in vitro assays along with an in vivo histological finding, necrosis. Effectiveness of this approach may add substantially to our understanding of systems biology.


Subjects
Data Mining , Toxicogenetics/methods , Animals , Apoptosis/drug effects , Cells, Cultured , DNA Damage , Databases, Genetic , Hepatocytes/drug effects , Hepatocytes/metabolism , High-Throughput Screening Assays , Lysosomes/metabolism , Male , Mitochondria/drug effects , Mitochondria/genetics , Mitochondria/metabolism , Necrosis/genetics , Necrosis/metabolism , Rats , Rats, Sprague-Dawley
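
The conversion of continuous endpoint measurements into a document-term representation might look like the sketch below. The abstract does not give the conversion rule, so the scheme here (values assumed normalized to the range 0 to 1, multiplied by a hypothetical `scale` and rounded) is an assumption, as are the drug name, time points, and endpoint names used to exercise it.

```python
from typing import Dict, List, Tuple

# (drug, hour) -> {endpoint name -> measurement normalized to [0, 1]}
Measurements = Dict[Tuple[str, int], Dict[str, float]]


def to_word_counts(measurements: Measurements, scale: int = 10) -> Dict[Tuple[str, int], Dict[str, int]]:
    """Turn each drug-hour record into a 'document' whose 'words' are the
    endpoints and whose word frequencies are the scaled, rounded values."""
    docs = {}
    for doc_id, endpoints in measurements.items():
        docs[doc_id] = {
            endpoint: max(0, round(value * scale))  # negative values clip to 0
            for endpoint, value in endpoints.items()
        }
    return docs


def to_tokens(counts: Dict[str, int]) -> List[str]:
    """Expand word counts into a bag-of-words token list, the input form
    many topic-modeling tools expect."""
    tokens: List[str] = []
    for word in sorted(counts):
        tokens.extend([word] * counts[word])
    return tokens
```

With 122 drugs and 4 time points, a mapping built this way would hold the 488 drug-hour documents mentioned in the abstract, each over a vocabulary of 10 endpoint words.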
5.
Hum Genomics ; 6: 5, 2012 Jul 05.
Article in English | MEDLINE | ID: mdl-23245293

ABSTRACT

A genetic association study is a complicated process that involves collecting phenotypic data, generating genotypic data, analyzing associations between genotypic and phenotypic data, and interpreting genetic biomarkers identified. SNPTrack is an integrated bioinformatics system developed by the US Food and Drug Administration (FDA) to support the review and analysis of pharmacogenetics data resulting from FDA research or submitted by sponsors. The system integrates data management, analysis, and interpretation in a single platform for genetic association studies. Specifically, it stores genotyping data and single-nucleotide polymorphism (SNP) annotations along with study design data in an Oracle database. It also integrates popular genetic analysis tools, such as PLINK and Haploview. SNPTrack provides genetic analysis capabilities and captures analysis results in its database as SNP lists that can be cross-linked for biological interpretation to gene/protein annotations, Gene Ontology, and pathway analysis data. With SNPTrack, users can do the entire stream of bioinformatics jobs for genetic association studies. SNPTrack is freely available to the public at http://www.fda.gov/ScienceResearch/BioinformaticsTools/SNPTrack/default.htm.


Subjects
Computational Biology/methods , Databases, Genetic , Polymorphism, Single Nucleotide , Gene Ontology , Genetic Association Studies/methods , Genetic Predisposition to Disease/genetics , Genotype , Humans , Internet , Phenotype , Signal Transduction/genetics , Software
6.
Viruses ; 15(11), 2023 Nov 02.
Article in English | MEDLINE | ID: mdl-38005885

ABSTRACT

Hantaviruses zoonotically infect humans worldwide with pathogenic consequences and are mainly spread by rodents that shed aerosolized virus particles in urine and feces. Bioinformatics methods for hantavirus diagnostics, genomic surveillance and epidemiology are currently lacking a comprehensive approach for data sharing, integration, visualization, analytics and reporting. With the possibility of hantavirus cases going undetected and spreading over international borders, a significant reporting delay can miss linked transmission events and impedes timely, targeted public health interventions. To overcome these challenges, we built HantaNet, a standalone visualization engine for hantavirus genomes that facilitates viral surveillance and classification for early outbreak detection and response. HantaNet is powered by MicrobeTrace, a browser-based multitool originally developed at the Centers for Disease Control and Prevention (CDC) to visualize HIV clusters and transmission networks. HantaNet integrates coding gene sequences and standardized metadata from hantavirus reference genomes into three separate gene modules for dashboard visualization of phylogenetic trees, viral strain clusters for classification, epidemiological networks and spatiotemporal analysis. We used 85 hantavirus reference datasets from GenBank to validate HantaNet as a classification and enhanced visualization tool, and as a public repository to download standardized sequence data and metadata for building analytic datasets. HantaNet is a model on how to deploy MicrobeTrace-specific tools to advance pathogen surveillance, epidemiology and public health globally.


Subjects
Communicable Diseases , Hantavirus Infections , Orthohantavirus , Animals , Humans , Orthohantavirus/genetics , Phylogeny , Hantavirus Infections/epidemiology , Communicable Diseases/epidemiology , Disease Outbreaks , Genomics , Rodentia
7.
BMC Bioinformatics ; 13 Suppl 15: S6, 2012.
Article in English | MEDLINE | ID: mdl-23046522

ABSTRACT

BACKGROUND: Drug repositioning offers an opportunity to revitalize the slowing drug discovery pipeline by finding new uses for currently existing drugs. Our hypothesis is that drugs sharing similar side effect profiles are likely to be effective for the same disease, and thus repositioning opportunities can be identified by finding drug pairs with similar side effects documented in U.S. Food and Drug Administration (FDA) approved drug labels. The safety information in the drug labels is usually obtained in clinical trials and augmented with observations from the post-market use of the drug. Therefore, our drug repositioning approach can take advantage of more comprehensive safety information compared with the conventional de novo approach. METHOD: A probabilistic topic model was constructed based on the terms in the Medical Dictionary for Regulatory Activities (MedDRA) that appeared in the Boxed Warning, Warnings and Precautions, and Adverse Reactions sections of the labels of 870 drugs. Fifty-two unique topics, each containing a set of terms, were identified by using topic modeling. The resulting probabilistic topic associations were used to measure the distance (similarity) between drugs. The success of the proposed model was evaluated by comparing a drug and its nearest neighbor (i.e., a drug pair) for common indications found in the Indications and Usage section of the drug labels. RESULTS: Given a drug with more than three indications, the model yielded a 75% recall, meaning 75% of drug pairs shared one or more common indications. This is significantly higher than the 22% recall rate achieved by random selection. Additionally, the recall rate grows rapidly as the number of drug indications increases and reaches 84% for drugs with 11 indications. 
The analysis also demonstrated that 65 drugs with a Boxed Warning, which indicates significant risk of serious and possibly life-threatening adverse effects, might be replaced with safer alternatives that do not have a Boxed Warning. In addition, we identified two therapeutic groups of drugs (Musculo-skeletal system and Anti-infective for systemic use) where over 80% of the drugs have a potential replacement with high significance. CONCLUSION: Topic modeling can be a powerful tool for the identification of repositioning opportunities by examining the adverse event terms in FDA approved drug labels. The proposed framework not only suggests drugs that can be repurposed, but also provides insight into the safety of repositioned drugs.


Subjects
Drug Repositioning , Drug-Related Side Effects and Adverse Reactions , Models, Theoretical , Drug Labeling , United States , United States Food and Drug Administration
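
The nearest-neighbor evaluation in this abstract (pair each drug with its closest drug in topic space, then check for a shared indication) can be sketched as below. The abstract does not name the distance measure, so the Hellinger distance is an assumption, and the function names and toy inputs are hypothetical.

```python
import math
from typing import Dict, List, Set


def hellinger(p: List[float], q: List[float]) -> float:
    """Hellinger distance between two discrete probability distributions."""
    return math.sqrt(sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)) / 2)


def nearest_neighbor(drug: str, topics: Dict[str, List[float]]) -> str:
    """Return the other drug whose topic distribution is closest to `drug`'s."""
    return min(
        (other for other in topics if other != drug),
        key=lambda other: hellinger(topics[drug], topics[other]),
    )


def recall(topics: Dict[str, List[float]], indications: Dict[str, Set[str]]) -> float:
    """Fraction of drugs that share at least one indication with their
    nearest neighbor, matching the abstract's definition of recall."""
    hits = sum(
        1 for drug in topics
        if indications[drug] & indications[nearest_neighbor(drug, topics)]
    )
    return hits / len(topics)
```

In the study the topic vectors would have 52 components per drug; the sketch works for any length as long as every drug uses the same topic ordering.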
8.
BMC Genomics ; 13: 325, 2012 Jul 20.
Article in English | MEDLINE | ID: mdl-22817640

ABSTRACT

BACKGROUND: Large amounts of mammalian protein-protein interaction (PPI) data have been generated and are available for public use. From a systems biology perspective, protein/gene interactions encode the key mechanisms distinguishing disease and health, and such mechanisms can be uncovered through network analysis. An effective network analysis tool should integrate different content-specific PPI databases into a comprehensive network format with a user-friendly platform to identify key functional modules/pathways and the underlying mechanisms of disease and toxicity. RESULTS: atBioNet integrates seven publicly available PPI databases into a network-specific knowledge base. Knowledge expansion is achieved by expanding a user-supplied protein/gene list with interactions from its integrated PPI network. The statistically significant functional modules are determined by applying a fast network-clustering algorithm (SCAN: a Structural Clustering Algorithm for Networks). The functional modules can be visualized either separately or together in the context of the whole network. Integration of pathway information enables enrichment analysis and assessment of the biological function of modules. Three case studies are presented using publicly available disease gene signatures as a basis to discover new biomarkers for acute leukemia, systemic lupus erythematosus, and breast cancer. The results demonstrated that atBioNet can not only identify functional modules and pathways related to the studied diseases, but this information can also be used to hypothesize novel biomarkers for future analysis. CONCLUSION: atBioNet is a free web-based network analysis tool that provides a systematic insight into protein/gene interactions through examining significant functional modules. The identified functional modules are useful for determining underlying mechanisms of disease and biomarker discovery. 
It can be accessed at: http://www.fda.gov/ScienceResearch/BioinformaticsTools/ucm285284.htm.


Subjects
Biomarkers/metabolism , Genomics , Software , Algorithms , Cluster Analysis , Databases, Protein , Humans , Metabolic Networks and Pathways , Protein Interaction Maps , User-Computer Interface
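
The core quantity in SCAN is the structural similarity of two connected nodes, computed from their closed neighborhoods (each node's neighbors plus the node itself); a node becomes a cluster "core" when enough of its neighbors are structurally similar to it. The sketch below covers only that part, under the standard SCAN definitions, and omits cluster expansion and hub/outlier labeling. The parameter values and the toy graph are illustrative and are not taken from atBioNet.

```python
import math
from typing import Dict, Iterable, Set, Tuple


def closed_neighborhoods(edges: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Map each node to its neighbors plus itself."""
    gamma: Dict[str, Set[str]] = {}
    for u, v in edges:
        gamma.setdefault(u, {u}).add(v)
        gamma.setdefault(v, {v}).add(u)
    return gamma


def structural_similarity(gamma: Dict[str, Set[str]], u: str, v: str) -> float:
    """Shared closed-neighborhood size, normalized by the geometric mean
    of the two neighborhood sizes."""
    return len(gamma[u] & gamma[v]) / math.sqrt(len(gamma[u]) * len(gamma[v]))


def core_nodes(gamma: Dict[str, Set[str]], eps: float, mu: int) -> Set[str]:
    """Nodes with at least `mu` members of their closed neighborhood
    (the node itself included) at similarity `eps` or higher."""
    cores = set()
    for v, neighborhood in gamma.items():
        similar = [w for w in neighborhood if structural_similarity(gamma, v, w) >= eps]
        if len(similar) >= mu:
            cores.add(v)
    return cores
```

In a PPI setting the nodes would be proteins or genes and the edges their recorded interactions; densely interconnected groups score high similarity and seed the functional modules.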
9.
PLoS Comput Biol ; 7(12): e1002310, 2011 Dec.
Article in English | MEDLINE | ID: mdl-22194678

ABSTRACT

Drug-induced liver injury (DILI) is a significant concern in drug development due to the poor concordance between preclinical and clinical findings of liver toxicity. We hypothesized that the DILI types (hepatotoxic side effects) seen in the clinic can be translated into the development of predictive in silico models for use in the drug discovery phase. We identified 13 hepatotoxic side effects with high accuracy for classifying marketed drugs for their DILI potential. We then developed in silico predictive models for each of these 13 side effects, which were further combined to construct a DILI prediction system (DILIps). The DILIps yielded 60-70% prediction accuracy for three independent validation sets. To enhance the confidence for identification of drugs that cause severe DILI in humans, the "Rule of Three" was developed in DILIps by using a consensus strategy based on 13 models. This gave high positive predictive value (91%) when applied to an external dataset containing 206 drugs from three independent literature datasets. Using the DILIps, we screened all the drugs in DrugBank and investigated their DILI potential in terms of protein targets and therapeutic categories through network modeling. We demonstrated that two therapeutic categories, anti-infectives for systemic use and musculoskeletal system drugs, were enriched for DILI, which is consistent with current knowledge. We also identified protein targets and pathways that are related to drugs that cause DILI by using pathway analysis and co-occurrence text mining. While marketed drugs were the focus of this study, the DILIps has potential as an evaluation tool to screen and prioritize new drug candidates or chemicals, such as environmental chemicals, to avoid those that might cause liver toxicity. We expect that the methodology can also be applied to other drug safety endpoints, such as renal or cardiovascular toxicity.


Subjects
Chemical and Drug Induced Liver Injury/metabolism , Drug-Related Side Effects and Adverse Reactions , Models, Biological , Animals , Anti-Infective Agents/adverse effects , Anti-Inflammatory Agents/adverse effects , Databases, Factual , Humans , Liver/drug effects , Predictive Value of Tests
10.
Microbiol Spectr ; 10(2): e0256421, 2022 Apr 27.
Article in English | MEDLINE | ID: mdl-35234489

ABSTRACT

Next-generation sequencing (NGS) is a powerful tool for detecting and investigating viral pathogens; however, analysis and management of the enormous amounts of data generated from these technologies remains a challenge. Here, we present VPipe (the Viral NGS Analysis Pipeline and Data Management System), an automated bioinformatics pipeline optimized for whole-genome assembly of viral sequences and identification of diverse species. VPipe automates the data quality control, assembly, and contig identification steps typically performed when analyzing NGS data. Users access the pipeline through a secure web-based portal, which provides an easy-to-use interface with advanced search capabilities for reviewing results. In addition, VPipe provides a centralized system for storing and analyzing NGS data, eliminating common bottlenecks in bioinformatics analyses for public health laboratories with limited on-site computational infrastructure. The performance of VPipe was validated through the analysis of publicly available NGS data sets for viral pathogens, generating high-quality assemblies for 12 data sets. VPipe also generated assemblies with greater contiguity than similar pipelines for 41 human respiratory syncytial virus isolates and 23 SARS-CoV-2 specimens. IMPORTANCE Computational infrastructure and bioinformatics analysis are bottlenecks in the application of NGS to viral pathogens. As of September 2021, VPipe has been used by the U.S. Centers for Disease Control and Prevention (CDC) and 12 state public health laboratories to characterize >17,500 and 1,500 clinical specimens and isolates, respectively. VPipe automates genome assembly for a wide range of viruses, including high-consequence pathogens such as SARS-CoV-2. Such automated functionality expedites public health responses to viral outbreaks and pathogen surveillance.


Subjects
COVID-19 , Viruses , Computational Biology/methods , High-Throughput Nucleotide Sequencing/methods , Humans , SARS-CoV-2/genetics , Viruses/genetics
11.
BMC Bioinformatics ; 12 Suppl 10: S3, 2011 Oct 18.
Article in English | MEDLINE | ID: mdl-22166133

ABSTRACT

BACKGROUND: Genomic biomarkers play an increasing role in both preclinical and clinical application. Development of genomic biomarkers with microarrays is an area of intensive investigation. However, despite sustained and continuing effort, developing microarray-based predictive models (i.e., genomics biomarkers) capable of reliable prediction for an observed or measured outcome (i.e., endpoint) of unknown samples in preclinical and clinical practice remains a considerable challenge. No straightforward guidelines exist for selecting a single model that will perform best when presented with unknown samples. In the second phase of the MicroArray Quality Control (MAQC-II) project, 36 analysis teams produced a large number of models for 13 preclinical and clinical endpoints. Before external validation was performed, each team nominated one model per endpoint (referred to here as 'nominated models') from which MAQC-II experts selected 13 'candidate models' to represent the best model for each endpoint. Both the nominated and candidate models from MAQC-II provide benchmarks to assess other methodologies for developing microarray-based predictive models. METHODS: We developed a simple ensemble method by taking a number of the top performing models from cross-validation and developing an ensemble model for each of the MAQC-II endpoints. We compared the ensemble models with both nominated and candidate models from MAQC-II using blinded external validation. RESULTS: For 10 of the 13 MAQC-II endpoints originally analyzed by the MAQC-II data analysis team from the National Center for Toxicological Research (NCTR), the ensemble models achieved equal or better predictive performance than the NCTR nominated models. Additionally, the ensemble models had performance comparable to the MAQC-II candidate models. Most ensemble models also had better performance than the nominated models generated by five other MAQC-II data analysis teams that analyzed all 13 endpoints. 
CONCLUSIONS: Our findings suggest that an ensemble method can often attain a higher average predictive performance in an external validation set than a corresponding "optimized" model method. Using an ensemble method to determine a final model is a potentially important supplement to the good modeling practices recommended by the MAQC-II project for developing microarray-based genomic biomarkers.


Subjects
Models, Genetic , Neoplasms/genetics , Oligonucleotide Array Sequence Analysis/methods , Toxicogenetics/methods , Gene Expression Profiling/methods , Humans , Meta-Analysis as Topic , Quality Control
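
An ensemble built from the top cross-validated models, as this abstract describes, can be sketched as below. The abstract does not say how member predictions were combined, so the majority vote is an assumption; the model names, scores, and binary labels are hypothetical.

```python
from typing import Dict, List


def select_top_models(cv_scores: Dict[str, float], k: int) -> List[str]:
    """Pick the k models with the best cross-validation score
    (ties broken alphabetically so the result is deterministic)."""
    return sorted(cv_scores, key=lambda name: (-cv_scores[name], name))[:k]


def ensemble_predict(members: List[str], predictions: Dict[str, List[int]]) -> List[int]:
    """Majority vote over the members' 0/1 predictions, sample by sample.
    An exact tie is resolved to 0."""
    n_samples = len(predictions[members[0]])
    combined = []
    for i in range(n_samples):
        votes = sum(predictions[name][i] for name in members)
        combined.append(1 if 2 * votes > len(members) else 0)
    return combined
```

Using an odd number of members avoids ties altogether. The members are chosen from cross-validation scores only, so the external validation set plays no part in model selection.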
12.
Hum Genomics ; 4(6): 428-34, 2010 Aug.
Article in English | MEDLINE | ID: mdl-20846933

ABSTRACT

ArrayTrack is a Food and Drug Administration (FDA) bioinformatics tool that has been widely adopted by the research community for genomics studies. It provides an integrated environment for microarray data management, analysis and interpretation. Most of its functionality for statistical, pathway and gene ontology analysis can also be applied independently to data generated by other molecular technologies. ArrayTrack has been undergoing active development and enhancement since its inception in 2001. This review summarises its key functionalities, with emphasis on the most recent extensions in support of the evolving needs of FDA's research programmes. ArrayTrack has added capability to manage, analyse and interpret proteomics and metabolomics data after quantification of peptides and metabolites abundance, respectively. Annotation information about single nucleotide polymorphisms and quantitative trait loci has been integrated to support genetics-related studies. Other extensions have been added to manage and analyse genomics data related to bacterial food-borne pathogens.


Subjects
Biomedical Research/methods , Computational Biology/methods , Software , United States Food and Drug Administration , Humans , Polymorphism, Single Nucleotide/genetics , United States
13.
Chem Res Toxicol ; 24(7): 1062-70, 2011 Jul 18.
Article in English | MEDLINE | ID: mdl-21627106

ABSTRACT

The primary testing strategy to identify nongenotoxic carcinogens largely relies on the 2-year rodent bioassay, which is time-consuming and labor-intensive. There is an increasing effort to develop alternative approaches to prioritize the chemicals for, supplement, or even replace the cancer bioassay. In silico approaches based on quantitative structure-activity relationships (QSAR) are rapid and inexpensive and thus have been investigated for such purposes. A slightly more expensive approach based on short-term animal studies with toxicogenomics (TGx) represents another attractive option for this application. Thus, the primary questions are how much better predictive performance using short-term TGx models can be achieved compared to that of QSAR models, and what length of exposure is sufficient for high quality prediction based on TGx. In this study, we developed predictive models for rodent liver carcinogenicity using gene expression data generated from short-term animal models at different time points and QSAR. The study was focused on the prediction of nongenotoxic carcinogenicity since the genotoxic chemicals can be inexpensively removed from further development using various in vitro assays individually or in combination. We identified 62 chemicals whose hepatocarcinogenic potential was available from the National Center for Toxicological Research liver cancer database (NCTRlcdb). The gene expression profiles of liver tissue obtained from rats treated with these chemicals at different time points (1 day, 3 days, and 5 days) are available from the Gene Expression Omnibus (GEO) database. Both TGx and QSAR models were developed on the basis of the same set of chemicals using the same modeling approach, a nearest-centroid method with a minimum redundancy and maximum relevancy-based feature selection with performance assessed using compound-based 5-fold cross-validation. We found that the TGx models outperformed QSAR in every aspect of modeling. 
For example, the TGx models' predictive accuracy (0.77, 0.77, and 0.82 for the 1-day, 3-day, and 5-day models, respectively) was much higher for an independent validation set than that of a QSAR model (0.55). Permutation tests confirmed the statistical significance of the model's prediction performance. The study concluded that a short-term 5-day TGx animal model holds the potential to predict nongenotoxic hepatocarcinogenicity.


Subjects
Carcinogens/toxicity , Liver/drug effects , Quantitative Structure-Activity Relationship , Toxicogenetics , Animals , Databases, Factual , Gene Expression Profiling , Mice , Models, Animal , Rats , Software , Time Factors , Toxicity Tests
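
The nearest-centroid method named in this abstract assigns a sample to the class whose mean feature vector is closest. The sketch below shows only that classifier; the feature selection step and the compound-based cross-validation from the study are omitted, the Euclidean distance is an assumption, and the feature values and labels are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def fit_centroids(X: Sequence[Sequence[float]], y: Sequence[str]) -> Dict[str, List[float]]:
    """Compute the per-class mean of each feature."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids: Dict[str, List[float]], row: Sequence[float]) -> str:
    """Return the label of the centroid nearest to `row` (Euclidean distance)."""
    return min(centroids, key=lambda label: math.dist(centroids[label], row))
```

In the setting of the abstract, each row would hold either expression values for the selected genes (TGx) or chemical descriptors (QSAR) for one compound, which is what allows the same modeling approach to be applied to both data types.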
14.
Front Genet ; 11: 601870, 2020.
Article in English | MEDLINE | ID: mdl-33324449

ABSTRACT

Effective laboratory-based surveillance and public health response to bacterial meningitis depend on timely characterization of bacterial meningitis pathogens. Traditionally, characterizing bacterial meningitis pathogens such as Neisseria meningitidis (Nm) and Haemophilus influenzae (Hi) required several biochemical and molecular tests. Whole genome sequencing (WGS) has enabled the development of pipelines capable of characterizing the given pathogen with equivalent results to many of the traditional tests. Here, we present the Bacterial Meningitis Genomic Analysis Platform (BMGAP): a secure, web-accessible informatics platform that facilitates automated analysis of WGS data in public health laboratories. BMGAP is a pipeline composed of several components, including both widely used, open-source third-party software and customized analysis modules for the specific target pathogens. BMGAP performs de novo draft genome assembly and identifies the bacterial species by whole-genome comparisons against a curated reference collection of 17 focal species including Nm, Hi, and other closely related species. Genomes identified as Nm or Hi undergo multi-locus sequence typing (MLST) and capsule characterization. Further typing information is captured from Nm genomes, such as peptides for the vaccine antigens FHbp, NadA, and NhbA. Assembled genomes are retained in the BMGAP database, serving as a repository for genomic comparisons. BMGAP's species identification and capsule characterization modules were validated using PCR and slide agglutination from 446 invasive bacterial isolates (273 Nm from nine different serogroups, 150 Hi from seven different serotypes, and 23 from nine other species) collected from 2017 to 2019 through surveillance programs. Among the validation isolates, BMGAP correctly identified the species for all 446 isolates (100% sensitivity and specificity) and accurately characterized all Nm serogroups (99% sensitivity and 98% specificity) and Hi serotypes (100% sensitivity and specificity). BMGAP provides an automated, multi-species analysis pipeline that can be extended to include additional analysis modules as needed. This provides easy-to-interpret and validated Nm and Hi genome analysis capacity to public health laboratories and collaborators. As the BMGAP database accumulates more genomic data, it grows as a valuable resource for rapid comparative genomic analyses during outbreak investigations.

15.
Eur J Hum Genet ; 16(5): 603-13, 2008 May.
Article in English | MEDLINE | ID: mdl-18212815

ABSTRACT

Metabolic response to the triglyceride (TG)-lowering drug, fenofibrate, is shaped by interactions between genetic and environmental factors, yet knowledge regarding the genetic determinants of this response is primarily limited to single-gene effects. Since very low-density lipoprotein (VLDL) is the central carrier of fasting TG, identifying factors that affect both total TG and VLDL-TG response to fenofibrate is critical for predicting individual fenofibrate response. As part of the Genetics of Lipid Lowering Drugs and Diet Network (GOLDN) study, 688 individuals from 161 families were genotyped for 91 single-nucleotide polymorphisms (SNPs) in 25 genes known to be involved in lipoprotein metabolism. Using generalized estimating equations to control for family structure, we performed linear modeling to investigate whether single SNPs, single covariates, SNP-SNP interactions, and/or SNP-covariate interactions had a significant association with the change in total fasting TG and fasting VLDL-TG after 3 weeks of fenofibrate treatment. A 10-iteration fourfold cross-validation procedure was used to validate significant associations and quantify their predictive abilities. More than one-third of the significant, cross-validated SNP-SNP interactions predicting each outcome involved just five SNPs, showing that these SNPs are of key importance to fenofibrate response. Multivariable models constructed using the top-ranked SNP-covariate interactions explained 11.9% more variation in the change in TG and 7.8% more variation in the change in VLDL than baseline TG alone. These results yield insight into the complex biology of fenofibrate response, which can be used to target fenofibrate therapy to individuals who are most likely to benefit from the drug.
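
The "10-iteration fourfold cross-validation" mentioned in this abstract can be illustrated with a minimal sketch of how such splits might be generated. The function name `repeated_kfold`, the seed, and the simple random shuffling are assumptions made for illustration; the abstract does not describe the authors' implementation, and this sketch ignores family structure, which the study handled with generalized estimating equations.

```python
import random


def repeated_kfold(n_samples, k=4, repeats=10, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV.

    Hypothetical helper: each repeat reshuffles the sample indices and
    cuts them into k folds; every fold serves once as the test set.
    """
    rng = random.Random(seed)
    for r in range(repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        # Striding gives k folds whose sizes differ by at most one.
        folds = [idx[i::k] for i in range(k)]
        for f in range(k):
            test = sorted(folds[f])
            train = sorted(
                i for j, fold in enumerate(folds) if j != f for i in fold
            )
            yield r, f, train, test


# 688 is the number of genotyped individuals stated in the abstract.
splits = list(repeated_kfold(688))
```

Each of the 10 repeats holds out every individual exactly once, so a model is fitted and evaluated 40 times in total. A split that respects families would assign whole families to folds instead of individuals.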


Subjects
Fasting/blood , Fenofibrate/pharmacology , Hypolipidemic Agents/pharmacology , Lipid Metabolism/drug effects , Lipid Metabolism/genetics , Triglycerides/blood , Adult , Female , Fenofibrate/administration & dosage , Humans , Hypolipidemic Agents/administration & dosage , Lipoproteins, VLDL/blood , Male , Middle Aged , Polymorphism, Single Nucleotide , Reproducibility of Results
16.
Bioinformatics ; 23(2): 249-51, 2007 Jan 15.
Article in English | MEDLINE | ID: mdl-17032675

ABSTRACT

UNLABELLED: The KGraph is a data visualization system that has been developed to display the complex relationships between the univariate and bivariate associations among an outcome of interest, a set of covariates, and a set of genetic factors, such as single nucleotide polymorphisms (SNPs). It allows for easy viewing and interpretation of genetic associations, correlations among covariates and SNPs, and information about the replication and cross-validation of the associations. The KGraph allows the user to more easily investigate multicollinearity and confounding through visualization of the multidimensional correlation structure underlying genetic associations. It emphasizes gene-environment and gene-gene interaction, both important components of any genetic system that are often overlooked in association frameworks. AVAILABILITY: http://www.epidkardia.sph.umich.edu/software/kgrapher


Subjects
Computer Graphics , Genetic Predisposition to Disease/genetics , Genetics, Population , Linkage Disequilibrium/genetics , Models, Genetic , Software , User-Computer Interface , Algorithms , Computer Simulation , Humans , Information Storage and Retrieval/methods , Statistics as Topic
17.
BMC Med Genet ; 9: 93, 2008 Oct 23.
Article in English | MEDLINE | ID: mdl-18947427

ABSTRACT

BACKGROUND: Persistent stimulation of cardiac beta1-adrenergic receptors by endogenous norepinephrine promotes heart failure progression. Polymorphisms of the gene encoding this receptor are known to alter receptor function or expression, as are polymorphisms of the alpha 2C-adrenergic receptor, which regulates norepinephrine release from cardiac presynaptic nerves. The purpose of this study was to investigate possible synergistic effects of polymorphisms of these two intronless genes (ADRB1 and ADRA2C, respectively) on the risk of death/transplant in heart failure patients. METHODS: Sixteen sequence variations in ADRA2C and 17 sequence variations in ADRB1 were genotyped in a longitudinal study of 655 white heart failure patients. Eleven sequence variations in each gene were polymorphic in the heart failure cohort. Cox proportional hazards modeling was used to identify polymorphisms and potential intra- or intergenic interactions that influenced risk of death or cardiac transplant. A leave-one-out cross-validation method was utilized for internal validation. RESULTS: Three polymorphisms in ADRA2C and five polymorphisms in ADRB1 were involved in eight cross-validated epistatic interactions identifying several two-locus genotype classes with significant relative risks ranging from 3.02 to 9.23. There was no evidence of intragenic epistasis. Combining high-risk genotype classes across epistatic pairs to take into account linkage disequilibrium, the relative risk of death or transplant was 3.35 (1.82, 6.18) relative to all other genotype classes. CONCLUSION: Multiple polymorphisms act synergistically between the ADRA2C and ADRB1 genes to increase risk of death or cardiac transplant in heart failure patients.


Subjects
Heart Failure/genetics , Heart Failure/physiopathology , Receptors, Adrenergic, alpha-2/genetics , Receptors, Adrenergic, beta-1/genetics , Adolescent , Adult , Age of Onset , Aged , Aged, 80 and over , Cohort Studies , Epistasis, Genetic , Female , Heart Failure/mortality , Heart Failure/surgery , Heart Transplantation , Humans , Kaplan-Meier Estimate , Linkage Disequilibrium , Male , Middle Aged , Polymorphism, Single Nucleotide , Prognosis , Proportional Hazards Models , Receptors, Adrenergic, alpha-2/physiology , Receptors, Adrenergic, beta-1/physiology , Risk Factors , Stroke Volume , Ventricular Function, Left , Young Adult
18.
BMC Syst Biol ; 8: 93, 2014 Aug 13.
Article in English | MEDLINE | ID: mdl-25115450

ABSTRACT

BACKGROUND: Toxicogenomics studies often profile gene expression from assays involving multiple doses and time points. The dose- and time-dependent pattern is of great importance for assessing toxicity, but computational approaches that effectively utilize this characteristic in toxicity assessment are lacking. Topic modeling is a text mining approach, but may be used analogously in toxicogenomics due to the similar data structures between text and gene dysregulation. RESULTS: Topic modeling was applied to a very large toxicogenomics dataset containing microarray gene expression data from >15,000 samples associated with 131 drugs tested in three different assay platforms (i.e., in vitro assay, in vivo repeated dose study and in vivo single dose experiment) with a design including multiple doses and time points. A set of "topics", each consisting of a set of genes, was determined, and these topics revealed the varying sensitivity of the three assay systems. We found that the drug-dependent effect was more pronounced in the two in vivo systems than in the in vitro system, while the time-dependent effect was most strongly reflected in the in vitro system, followed by the single dose study and lastly the repeated dose experiment. The dose-dependent effect was similar across the three assay systems. Although the results indicated a challenge in extrapolating the in vitro results to the in vivo situation, we did notice that, for some but not all of the drugs, similar gene expression patterns were observed across all three assay systems, indicating a possibility of using in vitro systems with careful designs (such as the choice of dose and time point) to replace the in vivo testing strategy. Nonetheless, a potential to replace the repeated dose study by the single-dose short-term methodology was strongly implied. CONCLUSIONS: The study demonstrated that text mining methodologies such as topic modeling provide an alternative to traditional means of data reduction in toxicogenomics, enhancing researchers' capabilities to interpret biological information.


Subjects
Computational Biology/methods , Data Mining/methods , Toxicogenetics/methods , Dose-Response Relationship, Drug , Humans , Oligonucleotide Array Sequence Analysis , Time Factors
19.
Toxicol Sci ; 136(1): 242-9, 2013 Nov.
Article in English | MEDLINE | ID: mdl-23997115

ABSTRACT

Drug-induced liver injury (DILI) is one of the leading causes of the termination of drug development programs. Consequently, identifying the risk of DILI in humans for drug candidates during the early stages of the development process would greatly reduce the drug attrition rate in the pharmaceutical industry but would require the implementation of new research and development strategies. In this regard, several in silico models have been proposed as alternative means of prioritizing drug candidates. Because the accuracy and utility of a predictive model rest largely on how to annotate the potential of a drug to cause DILI in a reliable and consistent way, the Food and Drug Administration-approved drug labeling was given prominence. Out of 387 drugs annotated, 197 drugs were used to develop a quantitative structure-activity relationship (QSAR) model, and the model was subsequently challenged by the remaining drugs, which served as an external validation set, with an overall prediction accuracy of 68.9%. The performance of the model was further assessed by the use of 2 additional independent validation sets, and the 3 validation data sets have a total of 483 unique drugs. We observed that the QSAR model's performance varied for drugs with different therapeutic uses; however, it achieved a better estimated accuracy (73.6%) as well as negative predictive value (77.0%) when focusing only on those therapeutic categories with high prediction confidence. Thus, the model's applicability domain was defined. Taken collectively, the developed QSAR model has potential utility for prioritizing compounds by their risk for DILI in humans, particularly for the high-confidence therapeutic subgroups like analgesics, antibacterial agents, and antihistamines.
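
For readers less familiar with the performance measures quoted in this abstract (prediction accuracy, negative predictive value), the following minimal sketch shows how they are conventionally computed from confusion-matrix counts. The function name `confusion_metrics` and the counts used are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Return standard classification metrics from confusion-matrix counts.

    tp/fn: positives (e.g. DILI-causing drugs) predicted positive/negative.
    tn/fp: negatives predicted negative/positive.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,      # share of all predictions that are correct
        "sensitivity": tp / (tp + fn),      # share of true positives detected
        "specificity": tn / (tn + fp),      # share of true negatives detected
        "npv": tn / (tn + fn),              # share of negative predictions that are correct
    }


# Hypothetical counts for illustration only, NOT data from the study.
metrics = confusion_metrics(tp=40, fp=10, tn=30, fn=20)
```

A high negative predictive value means that a compound predicted to be free of the adverse effect is unlikely to cause it, which is why this measure is often reported for models used to prioritize candidates.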


Subjects
Chemical and Drug Induced Liver Injury/etiology , Drug Approval , Drug Labeling , Drug-Related Side Effects and Adverse Reactions/etiology , Pharmaceutical Preparations/chemistry , United States Food and Drug Administration , Computer Simulation , Humans , Models, Molecular , Molecular Structure , Pharmaceutical Preparations/classification , Quantitative Structure-Activity Relationship , Reproducibility of Results , Risk Assessment , Risk Factors , United States
20.
Clin Transl Sci ; 4(1): 17-23, 2011 Feb.
Article in English | MEDLINE | ID: mdl-21348951

ABSTRACT

A three-stage approach was undertaken using genome-wide, case-control, and case-only association studies to identify genetic variants associated with heart failure mortality. In an Amish founder population (n = 851), cardiac hypertrophy, a trait integral to the adaptive response to failure, was found to be heritable (h² = 0.28, p = 0.0002) and GWAS revealed 21 candidate hypertrophy SNPs. In a case (n = 1,610)-control (n = 463) study in unrelated Caucasians, one of the SNPs associated with hypertrophy (rs2207418, p = 8 × 10⁻⁶) was associated with heart failure, RR = 1.85 (1.25-2.73, p = 0.0019). In heart failure cases, rs2207418 was associated with increased mortality, HR = 1.51 (1.20-1.97, p = 0.0004). There was consistency between studies, with the GG allele being associated with increased ventricular mass (~13 g/m²) in the Amish, heart failure risk, and heart failure mortality. This SNP is in a gene desert of chromosome 20p12. Five genes are within 2.0 Mbp of rs2207418 but with low LD between their SNPs and rs2207418. A region near this SNP is highly conserved in multiple vertebrates (lod score = 1,208). This conservation and the internal consistency across studies suggest that this region has biologic importance in heart failure, potentially acting as an enhancer or repressor element. rs2207418 may be useful for predicting a more progressive form of heart failure that may require aggressive therapy.


Subjects
Cardiomegaly/complications , Cardiomegaly/genetics , Founder Effect , Genetic Predisposition to Disease , Heart Failure/genetics , Heart Failure/mortality , Polymorphism, Single Nucleotide/drug effects , Adult , Aged , Aged, 80 and over , Base Sequence , Cardiomegaly/diagnostic imaging , Cohort Studies , Demography , Ethnicity/genetics , Female , Heart Failure/complications , Heart Failure/diagnostic imaging , Heart Ventricles/pathology , Humans , Male , Middle Aged , Organ Size , Polymorphism, Single Nucleotide/genetics , Risk Factors , Sequence Homology, Nucleic Acid , Short Interspersed Nucleotide Elements/genetics , Ultrasonography , Young Adult