Language model enables end-to-end accurate detection of cancer from cell-free DNA.
Shen, Hongru; Liu, Jilei; Chen, Kexin; Li, Xiangchun.
Affiliation
  • Shen H; Tianjin Cancer Institute, Tianjin's Clinical Research Center for Cancer, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China.
  • Liu J; Tianjin Cancer Institute, Tianjin's Clinical Research Center for Cancer, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China.
  • Chen K; Department of Epidemiology and Biostatistics, Key Laboratory of Molecular Cancer Epidemiology of Tianjin, Tianjin's Clinical Research Center for Cancer, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China.
  • Li X; Tianjin Cancer Institute, Tianjin's Clinical Research Center for Cancer, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China.
Brief Bioinform; 25(2), 2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38385880
ABSTRACT
We present Affordable Cancer Interception and Diagnostics (ACID), a language model that achieves high classification performance in cancer diagnosis using only raw cfDNA sequencing reads. We formulate ACID as an autoregressive language model pretrained on sentences formed by concatenating raw sequencing reads with diagnostic labels. We benchmark ACID against three methods. On a testing set subjected to whole-genome sequencing, ACID significantly outperforms the best benchmarked method in diagnosis of cancer [area under the receiver operating characteristic curve (AUROC), 0.924 versus 0.853; P < 0.001] and in detection of hepatocellular carcinoma (AUROC, 0.981 versus 0.917; P < 0.001). ACID achieves high accuracy with just 10 000 reads per sample. ACID also achieves the best performance among the benchmarked methods on testing sets subjected to bisulfite sequencing. In summary, we present an affordable, simple yet efficient end-to-end paradigm for cancer detection using raw cfDNA sequencing reads.
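The abstract says training sentences come from concatenating raw sequencing reads with diagnostic labels, but it does not give the tokenization or the label format. The sketch below shows one way such a sentence could be assembled. The non-overlapping k-mer split, the `[SEP]` and `[LABEL_...]` tokens, and the function names `kmer_tokens` and `build_sentence` are all illustrative assumptions, not details taken from the paper.

```python
import random
from typing import List


def kmer_tokens(read: str, k: int = 6) -> List[str]:
    """Split a read into non-overlapping k-mers (assumed tokenization).

    Trailing bases that do not fill a whole k-mer are dropped.
    """
    read = read.upper()
    return [read[i:i + k] for i in range(0, len(read) - k + 1, k)]


def build_sentence(reads: List[str], label: str,
                   n_reads: int = 3, k: int = 6, seed: int = 0) -> str:
    """Concatenate tokens from sampled reads, then append a label token.

    The separator and label token formats are hypothetical.
    """
    rng = random.Random(seed)
    sampled = rng.sample(reads, min(n_reads, len(reads)))
    tokens: List[str] = []
    for read in sampled:
        tokens.extend(kmer_tokens(read, k))
        tokens.append("[SEP]")  # hypothetical read boundary marker
    tokens.append(f"[LABEL_{label.upper()}]")  # hypothetical label token
    return " ".join(tokens)


if __name__ == "__main__":
    toy_reads = ["ACGTACGTACGT", "TTTTTTGGGGGG", "CCCCCCAAAAAA"]
    print(build_sentence(toy_reads, "cancer", n_reads=2))
```

Placing the label at the end of the sentence means an autoregressive model trained on next-token prediction could, at inference time, be given the reads alone and asked to predict the label token. Whether ACID uses this exact layout is not stated in the abstract.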

Full text: 1 | Database: MEDLINE | Main subject: Carcinoma, Hepatocellular / Cell-Free Nucleic Acids / Liver Neoplasms | Limits: Humans | Language: En | Journal: Brief Bioinform | Journal subject: BIOLOGY / MEDICAL INFORMATICS | Year: 2024 | Document type: Article | Country of affiliation: China