Search | BVS - Ministry of Health

Applications of Advanced Natural Language Processing for Clinical Pharmacology.

Hsu, Joy C; Wu, Michael; Kim, Chloe; Vora, Bianca; Lien, Yi Ting Kayla; Jindal, Ashutosh; Yoshida, Kenta; Kawakatsu, Sonoko; Gore, Jeremy; Jin, Jin Y; Lu, Christina; Chen, Bingyuan; Wu, Benjamin.

Clin Pharmacol Ther ; 115(4): 786-794, 2024 Apr.

Article in English | MEDLINE | ID: mdl-38140747

ABSTRACT

Natural language processing (NLP) is a branch of artificial intelligence, which combines computational linguistics, machine learning, and deep learning models to process human language. Although there is a surge in NLP usage across various industries in recent years, NLP has not been widely evaluated and utilized to support drug development. To demonstrate how advanced NLP can expedite the extraction and analyses of information to help address clinical pharmacology questions, inform clinical trial designs, and support drug development, three use cases are described in this article: (1) dose optimization strategy in oncology, (2) common covariates on pharmacokinetic (PK) parameters in oncology, and (3) physiologically-based PK (PBPK) analyses for regulatory review and product label. The NLP workflow includes (1) preparation of source files, (2) NLP model building, and (3) automation of data extraction. The Clinical Pharmacology and Biopharmaceutics Summary Basis of Approval (SBA) documents, US package inserts (USPI), and approval letters from the US Food and Drug Administration (FDA) were used as our source data. As demonstrated in the three example use cases, advanced NLP can expedite the extraction and analyses of large amounts of information from regulatory review documents to help address important clinical pharmacology questions. Although this has not been adopted widely, integrating advanced NLP into the clinical pharmacology workflow can increase efficiency in extracting impactful information to advance drug development.

Subjects

Natural Language Processing; Pharmacology, Clinical; Humans; Artificial Intelligence; Electronic Health Records; Machine Learning

Diagnosis of genetic diseases in seriously ill children by rapid whole-genome sequencing and automated phenotyping and interpretation.

Clark, Michelle M; Hildreth, Amber; Batalov, Sergey; Ding, Yan; Chowdhury, Shimul; Watkins, Kelly; Ellsworth, Katarzyna; Camp, Brandon; Kint, Cyrielle I; Yacoubian, Calum; Farnaes, Lauge; Bainbridge, Matthew N; Beebe, Curtis; Braun, Joshua J A; Bray, Margaret; Carroll, Jeanne; Cakici, Julie A; Caylor, Sara A; Clarke, Christina; Creed, Mitchell P; Friedman, Jennifer; Frith, Alison; Gain, Richard; Gaughran, Mary; George, Shauna; Gilmer, Sheldon; Gleeson, Joseph; Gore, Jeremy; Grunenwald, Haiying; Hovey, Raymond L; Janes, Marie L; Lin, Kejia; McDonagh, Paul D; McBride, Kyle; Mulrooney, Patrick; Nahas, Shareef; Oh, Daeheon; Oriol, Albert; Puckett, Laura; Rady, Zia; Reese, Martin G; Ryu, Julie; Salz, Lisa; Sanford, Erica; Stewart, Lawrence; Sweeney, Nathaly; Tokita, Mari; Van Der Kraan, Luca; White, Sarah; Wigby, Kristen.

Sci Transl Med ; 11(489), 2019 Apr 24.

Article in English | MEDLINE | ID: mdl-31019026

ABSTRACT

By informing timely targeted treatments, rapid whole-genome sequencing can improve the outcomes of seriously ill children with genetic diseases, particularly infants in neonatal and pediatric intensive care units (ICUs). The need for highly qualified professionals to decipher results, however, precludes widespread implementation. We describe a platform for population-scale, provisional diagnosis of genetic diseases with automated phenotyping and interpretation. Genome sequencing was expedited by bead-based genome library preparation directly from blood samples and sequencing of paired 100-nt reads in 15.5 hours. Clinical natural language processing (CNLP) automatically extracted children's deep phenomes from electronic health records with 80% precision and 93% recall. In 101 children with 105 genetic diseases, a mean of 4.3 CNLP-extracted phenotypic features matched the expected phenotypic features of those diseases, compared with a match of 0.9 phenotypic features used in manual interpretation. We automated provisional diagnosis by combining the ranking of the similarity of a patient's CNLP phenome with respect to the expected phenotypic features of all genetic diseases, together with the ranking of the pathogenicity of all of the patient's genomic variants. Automated, retrospective diagnoses concurred well with expert manual interpretation (97% recall and 99% precision in 95 children with 97 genetic diseases). Prospectively, our platform correctly diagnosed three of seven seriously ill ICU infants (100% precision and recall) with a mean time saving of 22:19 hours. In each case, the diagnosis affected treatment. Genome sequencing with automated phenotyping and interpretation in a median of 20:10 hours may increase adoption in ICUs and, thereby, timely implementation of precise treatments.
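As a reading aid for the precision and recall figures quoted in this abstract, here is a minimal sketch of how the two metrics are computed from counts. The function name and the counts are hypothetical and chosen only to reproduce the arithmetic; they are not data from the study.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (precision, recall) from true-positive, false-positive,
    and false-negative counts. Empty denominators yield 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts (not from the study): 80 correctly extracted features,
# 20 spurious extractions, 6 features that were missed.
p, r = precision_recall(tp=80, fp=20, fn=6)
print(f"precision={p:.2f} recall={r:.2f}")  # prints "precision=0.80 recall=0.93"
```

Precision is the share of extracted items that are correct; recall is the share of the true items that were found.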

Subjects

Diabetic Ketoacidosis/genetics; Genomics/methods; Electronic Health Records; Female; Humans; Intensive Care Units/statistics & numerical data; Natural Language Processing; Retrospective Studies

Finding non-coding RNAs through genome-scale clustering.

Tseng, Huei-Hun; Weinberg, Zasha; Gore, Jeremy; Breaker, Ronald R; Ruzzo, Walter L.

J Bioinform Comput Biol ; 7(2): 373-88, 2009 Apr.

Article in English | MEDLINE | ID: mdl-19340921

ABSTRACT

Non-coding RNAs (ncRNAs) are transcripts that do not code for proteins. Recent findings have shown that RNA-mediated regulatory mechanisms influence a substantial portion of typical microbial genomes. We present an efficient method for finding potential ncRNAs in bacteria by clustering genomic sequences based on homology inferred from both primary sequence and secondary structure. We evaluate our approach using a set of predominantly Firmicutes sequences. Our results showed that, although primary-sequence-based homology search was inaccurate for diverged ncRNA sequences, our clustering method allowed us to infer motifs that recovered nearly all members of most known ncRNA families. Hence, our method shows promise for discovering new families of ncRNA.

Subjects

Chromosome Mapping/methods; Cluster Analysis; Genome/genetics; RNA, Untranslated/genetics; Sequence Analysis, RNA/methods

Identification of 22 candidate structured RNAs in bacteria using the CMfinder comparative genomics pipeline.

Weinberg, Zasha; Barrick, Jeffrey E; Yao, Zizhen; Roth, Adam; Kim, Jane N; Gore, Jeremy; Wang, Joy Xin; Lee, Elaine R; Block, Kirsten F; Sudarsan, Narasimhan; Neph, Shane; Tompa, Martin; Ruzzo, Walter L; Breaker, Ronald R.

Nucleic Acids Res ; 35(14): 4809-19, 2007.

Article in English | MEDLINE | ID: mdl-17621584

ABSTRACT

We applied a computational pipeline based on comparative genomics to bacteria, and identified 22 novel candidate RNA motifs. We predicted six to be riboswitches, which are mRNA elements that regulate gene expression on binding a specific metabolite. In separate studies, we confirmed that two of these are novel riboswitches. Three other riboswitch candidates are upstream of either a putative transporter gene in the order Lactobacillales, citric acid cycle genes in Burkholderiales or molybdenum cofactor biosynthesis genes in several phyla. The remaining riboswitch candidate, the widespread Genes for the Environment, for Membranes and for Motility (GEMM) motif, is associated with genes important for natural competence in Vibrio cholerae and the use of metal ions as electron acceptors in Geobacter sulfurreducens. Among the other motifs, one has a genetic distribution similar to a previously published candidate riboswitch, ykkC/yxkD, but has a different structure. We identified possible non-coding RNAs in five phyla, and several additional cis-regulatory RNAs, including one in epsilon-proteobacteria (upstream of purD, involved in purine biosynthesis), and one in Cyanobacteria (within an ATP synthase operon). These candidate RNAs add to the growing list of RNA motifs involved in multiple cellular processes, and suggest that many additional RNAs remain to be discovered.

Subjects

Genomics/methods; RNA, Bacterial/chemistry; Regulatory Sequences, Ribonucleic Acid; Sequence Analysis, RNA/methods; Base Sequence; Computational Biology; Consensus Sequence; Genome, Bacterial; Molecular Sequence Data; Nucleic Acid Conformation; RNA, Messenger/chemistry; RNA, Untranslated/chemistry

Deletion mutations caused by DNA strand slippage in Acinetobacter baylyi.

Gore, Jeremy M; Ran, F Ann; Ornston, L Nicholas.

Appl Environ Microbiol ; 72(8): 5239-45, 2006 Aug.

Article in English | MEDLINE | ID: mdl-16885271

ABSTRACT

Short nucleotide sequence repetitions in DNA can provide selective benefits and also can be a source of genetic instability arising from deletions guided by pairing between misaligned strands. These findings raise the question of how the frequency of deletion mutations is influenced by the length of sequence repetitions and by the distance between them. An experimental approach to this question was presented by the heat-sensitive phenotype conferred by pcaG1102, a 30-bp deletion in one of the structural genes for Acinetobacter baylyi protocatechuate 3,4-dioxygenase, which is required for growth with quinate. The original pcaG1102 deletion appears to have been guided by pairing between slipped DNA strands from nearby repeated sequences in wild-type pcaG. Placement of an in-phase termination codon between the repeated sequences in pcaG prevents growth with quinate and permits selection of sequence-guided deletions that excise the codon and permit quinate to be used as a growth substrate at room temperature. Natural transformation facilitated introduction of 68 different variants of the wild-type repeat structure within pcaG into the A. baylyi chromosome, and the frequency of deletion between the repetitions was determined with a novel method, precision plating. The deletion frequency increases with repeat length, decreases with the distance between repeats, and requires a minimum amount of similarity to occur at measurable rates. Deletions occurred in a recA-deficient background. Their frequency was unaffected by deficiencies in mutS and was increased by inactivation of recG.

Subjects

Acinetobacter/genetics; DNA, Bacterial/genetics; DNA, Single-Stranded/genetics; Mutation; Sequence Deletion; Acinetobacter/enzymology; Acinetobacter/growth & development; Base Sequence; Culture Media; DNA, Bacterial/metabolism; DNA, Single-Stranded/metabolism; Escherichia coli/genetics; Plasmids/genetics; Protocatechuate-3,4-Dioxygenase/genetics; Protocatechuate-3,4-Dioxygenase/metabolism; Repetitive Sequences, Nucleic Acid/genetics; Reproducibility of Results
