1.
NPJ Digit Med ; 7(1): 106, 2024 May 01.
Article in English | MEDLINE | ID: mdl-38693429

ABSTRACT

Existing natural language processing (NLP) methods to convert free-text clinical notes into structured data often require problem-specific annotations and model training. This study aims to evaluate ChatGPT's capacity to extract information from free-text medical notes efficiently and comprehensively. We developed a large language model (LLM)-based workflow, using a systems engineering methodology and a spiral "prompt engineering" process, and leveraging OpenAI's API for batch querying ChatGPT. We evaluated the effectiveness of this method using a dataset of more than 1000 lung cancer pathology reports and a dataset of 191 pediatric osteosarcoma pathology reports, comparing the ChatGPT-3.5 (gpt-3.5-turbo-16k) outputs with expert-curated structured data. ChatGPT-3.5 extracted pathological classifications with an overall accuracy of 89% in the lung cancer dataset, outperforming two traditional NLP methods. The performance is influenced by the design of the instructive prompt. Our case analysis shows that most misclassifications were due to the lack of highly specialized pathology terminology and erroneous interpretation of TNM staging rules. Reproducibility testing shows relatively stable performance of ChatGPT-3.5 over time. In the pediatric osteosarcoma dataset, ChatGPT-3.5 classified grades and margin status with accuracies of 98.6% and 100%, respectively. Our study shows the feasibility of using ChatGPT to process large volumes of clinical notes for structured information extraction without requiring extensive task-specific human annotation and model training. The results underscore the potential role of LLMs in transforming unstructured healthcare data into structured formats, thereby supporting research and aiding clinical decision-making.

2.
Oncotarget ; 10(57): 5958-5969, 2019 Oct 15.
Article in English | MEDLINE | ID: mdl-31666927

ABSTRACT

Anal squamous cell carcinoma (ASCC) is a rare, potentially fatal malignancy primarily caused by high-risk human papillomaviruses (HPV). The prognostic implication of programmed death-ligand 1 (PD-L1) expression remains controversial, and glucose transporter 1 (GLUT1) expression has never been examined in ASCC. Covalently closed circular RNAs have recently been shown to be widespread in cancers and are proposed to be biomarkers. We discovered that HPV16 expresses a circular E7 RNA (circE7), which has not been assessed as a potential biomarker. A retrospective, translational case series at UT Southwestern was conducted to analyze PD-L1, GLUT1, HPV-ISH, and HPV circE7 in relation to the clinical features and overall survival of patients with ASCC. Twenty-two (22) subjects were included in the study. Improved overall survival was predicted by basaloid histology (p = 0.013), PD-L1 expression (p = 0.08), and HPV-ISH positivity (p < 0.001), but not GLUT1 expression. High levels of circE7 by quantitative RT-PCR predicted improved overall survival in ASCC (p = 0.023), and analysis of The Cancer Genome Atlas sequencing from HPV-positive head and neck cancer and cervical cancer suggested that high circE7 marked improved survival in 875 subjects (p = 0.074). While our study suggests that circE7 levels correlate with improved survival in ASCC, larger, prospective studies are necessary to confirm the potential role of circE7 as a biomarker.

3.
Nat Genet ; 46(2): 200-4, 2014 Feb.
Article in English | MEDLINE | ID: mdl-24336170

ABSTRACT

The majority of reported complex disease associations for common genetic variants have been identified through meta-analysis, a powerful approach that enables the use of large sample sizes while protecting against common artifacts due to population structure and repeated small-sample analyses sharing individual-level data. As the focus of genetic association studies shifts to rare variants, genes and other functional units are becoming the focus of analysis. Here we propose and evaluate new approaches for performing meta-analysis of rare variant association tests, including burden tests, weighted burden tests, variable-threshold tests and tests that allow variants with opposite effects to be grouped together. We show that our approach retains useful features from single-variant meta-analysis approaches and demonstrate its use in a study of blood lipid levels in ∼18,500 individuals genotyped with exome arrays.
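
As a rough illustration of how a gene-level burden test can be meta-analyzed from summary statistics, the sketch below assumes that each study shares a vector of single-variant score statistics together with their covariance matrix. These are summed across studies and then collapsed with per-variant weights. The function names, input layout, and toy numbers are hypothetical and are not taken from the authors' software.

```python
import math

def meta_burden_z(study_scores, study_covs, weights):
    """Combine per-study score statistics into one burden-test Z score.

    study_scores: list of per-study score vectors (one entry per variant)
    study_covs:   list of per-study covariance matrices of those scores
    weights:      per-variant weights (all 1.0 gives an unweighted burden test)
    """
    m = len(weights)
    # Sum score vectors and covariance matrices across studies.
    U = [sum(s[i] for s in study_scores) for i in range(m)]
    V = [[sum(c[i][j] for c in study_covs) for j in range(m)] for i in range(m)]
    # Burden statistic: weighted sum of scores divided by its standard deviation.
    num = sum(weights[i] * U[i] for i in range(m))
    var = sum(weights[i] * V[i][j] * weights[j]
              for i in range(m) for j in range(m))
    return num / math.sqrt(var)

def two_sided_p(z):
    """Two-sided p-value for a standard normal Z score."""
    return math.erfc(abs(z) / math.sqrt(2))

# Toy example: two studies, two rare variants (made-up numbers).
scores = [[1.0, 2.0], [0.5, -0.5]]
covs = [[[2.0, 0.0], [0.0, 2.0]], [[2.0, 0.0], [0.0, 2.0]]]
z = meta_burden_z(scores, covs, weights=[1.0, 1.0])
p = two_sided_p(z)
```

Only summary statistics cross study boundaries in this sketch, which is the property that lets a gene-level test be meta-analyzed without pooling individual-level data.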


Subjects
Genetic Association Studies/methods, Genetic Variation, Lipids/genetics, Meta-Analysis as Topic, Research Design, Data Interpretation, Statistical, Exome/genetics, Genetics, Population, Genotype, Humans, Lipids/blood, Models, Genetic, Monte Carlo Method
4.
Biomed Res Int ; 2013: 865181, 2013.
Article in English | MEDLINE | ID: mdl-24319692

ABSTRACT

BACKGROUND: Next generation sequencing (NGS) is being widely used to identify genetic variants associated with human disease. Although the approach is cost effective, the underlying data is susceptible to many types of error. Since NGS technologies and protocols are rapidly evolving, with constantly changing steps ranging from sample preparation to data processing software updates, it is important to enable researchers to routinely assess the quality of sequencing and alignment data prior to downstream analyses. RESULTS: Here we describe QPLOT, an automated tool that can facilitate the quality assessment of sequencing run performance. Taking standard sequence alignments as input, QPLOT generates a series of diagnostic metrics summarizing run quality and produces convenient graphical summaries for these metrics. QPLOT is computationally efficient, generates webpages for interactive exploration of detailed results, and can handle the joint output of many sequencing runs. CONCLUSION: QPLOT is an automated tool that facilitates assessment of sequence run quality. We routinely apply QPLOT to ensure quick detection of sequencing run problems. We hope that QPLOT will be useful to the community as well.


Subjects
High-Throughput Nucleotide Sequencing/standards, Software, Data Interpretation, Statistical, High-Throughput Nucleotide Sequencing/statistics & numerical data, Humans, Quality Control, Sequence Alignment/standards, Sequence Alignment/statistics & numerical data, Sequence Analysis, RNA/standards, Sequence Analysis, RNA/statistics & numerical data
5.
Am J Hum Genet ; 93(5): 891-9, 2013 Nov 07.
Article in English | MEDLINE | ID: mdl-24210252

ABSTRACT

Estimates of the ancestry of specific chromosomal regions in admixed individuals are useful for studies of human evolutionary history and for genetic association studies. Previously, this ancestry inference relied on high-quality genotypes from genome-wide association study (GWAS) arrays. These high-quality genotypes are not always available when samples are exome sequenced, and exome sequencing is the strategy of choice for many ongoing genetic studies. Here we show that off-target reads generated during exome-sequencing experiments can be combined with on-target reads to accurately estimate the ancestry of each chromosomal segment in an admixed individual. To reconstruct local ancestry, our method SEQMIX models aligned bases directly instead of relying on hard genotype calls. We evaluate the accuracy of our method through simulations and analysis of samples sequenced by the 1000 Genomes Project and the NHLBI Grand Opportunity Exome Sequencing Project. In African Americans, we show that local-ancestry estimates derived by our method are very similar to those derived with Illumina's Omni 2.5M genotyping array and much improved in relation to estimates that use only exome genotypes and ignore off-target sequencing reads. Software implementing this method, SEQMIX, can be applied to analysis of human population history or used for genetic association studies in admixed individuals.


Subjects
Exome, Genetic Association Studies/methods, Genetics, Population/methods, Sequence Analysis, DNA/methods, Black or African American/genetics, Algorithms, Chromosome Mapping, Computer Simulation, Empirical Research, Genome, Human, Genotype, Humans, Linkage Disequilibrium, Markov Chains, Models, Genetic, Oligonucleotide Array Sequence Analysis, Software
6.
J Phys Chem B ; 114(1): 36-41, 2010 Jan 14.
Article in English | MEDLINE | ID: mdl-20000370

ABSTRACT

We developed a model system for blend polymers with electron-donating and -accepting compounds. It is found that the optimal energy conversion efficiency can be achieved when the feature size is around 10 nm. The first reaction method is used to describe the key processes in the organic solar cell (e.g., generation, diffusion, and dissociation at the interface for the excitons; drift, injection from the electrodes, and collection by the electrodes for the charge carriers) in a dynamic Monte Carlo simulation. Our simulations indicate that a 5% power conversion efficiency (PCE) is reachable with an optimum combination of charge mobility and morphology. The parameters used in this model study correspond to a blend of novel polymers (bis(thienylenevinylene)-substituted polythiophene and poly(perylene diimide-alt-dithienothiophene)), which features a broad absorption and a high mobility. The I-V curves are well reproduced by our simulations, and the PCE for the polymer blend can reach up to 2.2%, which is higher than the experimental value (>1%), one of the best available experimental results up to now for all-polymer solar cells. In addition, the dependency of PCE on the charge mobility and the material structure is also investigated.
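
For readers unfamiliar with the first reaction method, the sketch below shows a single simplified step under stated assumptions: every enabled event draws an exponentially distributed waiting time from its rate, and the event with the shortest waiting time is executed first. The event names and rate values are hypothetical placeholders, not parameters from the study, and a full simulation would keep a queue of pending events rather than redrawing all of them each step.

```python
import math
import random

def waiting_time(rate, rng):
    """Draw an exponential waiting time for an event with the given rate."""
    u = 1.0 - rng.random()  # uniform in (0, 1], avoids log(0)
    return -math.log(u) / rate

def first_reaction_step(rates, rng):
    """Return the event that fires first and the time until it fires."""
    # Draw a tentative waiting time for every event with a positive rate.
    times = {name: waiting_time(k, rng) for name, k in rates.items() if k > 0}
    # The event with the smallest waiting time happens first.
    event = min(times, key=times.get)
    return event, times[event]

# Hypothetical rates (arbitrary units) for an exciton near an interface.
rates = {"hop": 1.0e3, "dissociate": 5.0e2, "decay": 1.0}
rng = random.Random(0)
event, dt = first_reaction_step(rates, rng)
```

Repeating this step while advancing the simulation clock by `dt` and updating the set of enabled events gives the basic loop of a dynamic Monte Carlo run.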
