Search | VHL Regional Portal

PhenoPad: Building AI enabled note-taking interfaces for patient encounters.

Wang, Jixuan; Yang, Jingbo; Zhang, Haochi; Lu, Helen; Skreta, Marta; Husic, Mia; Arbabi, Aryan; Sultanum, Nicole; Brudno, Michael.

NPJ Digit Med ; 5(1): 12, 2022 Jan 27.

Article in English | MEDLINE | ID: mdl-35087180

ABSTRACT

Current clinical note-taking approaches cannot capture the entirety of information available from patient encounters and detract from patient-clinician interactions. By surveying healthcare providers' current note-taking practices and attitudes toward new clinical technologies, we developed a patient-centered paradigm for clinical note-taking that makes use of hybrid tablet/keyboard devices and artificial intelligence (AI) technologies. PhenoPad is an intelligent clinical note-taking interface that captures free-form notes and standard phenotypic information via a variety of modalities, including speech and natural language processing techniques, handwriting recognition, and more. The output is unobtrusively presented on mobile devices to clinicians for real-time validation and can be automatically transformed into digital formats that would be compatible with integration into electronic health record systems. Semi-structured interviews and trials in clinical settings rendered positive feedback from both clinicians and patients, demonstrating that AI-enabled clinical note-taking under our design improves ease and breadth of information captured during clinical visits without compromising patient-clinician interactions. We open source a proof-of-concept implementation that can lay the foundation for broader clinical use cases.

Automatically disambiguating medical acronyms with ontology-aware deep learning.

Skreta, Marta; Arbabi, Aryan; Wang, Jixuan; Drysdale, Erik; Kelly, Jacob; Singh, Devin; Brudno, Michael.

Nat Commun ; 12(1): 5319, 2021 09 07.

Article in English | MEDLINE | ID: mdl-34493718

ABSTRACT

Modern machine learning (ML) technologies have great promise for automating diverse clinical and research workflows; however, training them requires extensive hand-labelled datasets. Disambiguating abbreviations is important for automated clinical note processing; however, broad deployment of ML for this task is restricted by the scarcity and imbalance of labeled training data. In this work we present a method that improves a model's ability to generalize through novel data augmentation techniques that utilize information from biomedical ontologies in the form of related medical concepts, as well as global context information within the medical note. We train our model on a public dataset (MIMIC III) and test its performance on automatically generated and hand-labelled datasets from different sources (MIMIC III, CASI, i2b2). Together, these techniques boost the accuracy of abbreviation disambiguation by up to 17% on hand-labeled data, without sacrificing performance on a held-out test set from MIMIC III.

Subject(s)

Data Mining/methods , Deep Learning , Terminology as Topic , Biomedical Research , Datasets as Topic , Humans

Identifying Clinical Terms in Medical Text Using Ontology-Guided Machine Learning.

Arbabi, Aryan; Adams, David R; Fidler, Sanja; Brudno, Michael.

JMIR Med Inform ; 7(2): e12596, 2019 May 10.

Article in English | MEDLINE | ID: mdl-31094361

ABSTRACT

BACKGROUND: Automatic recognition of medical concepts in unstructured text is an important component of many clinical and research applications, and its accuracy has a large impact on electronic health record analysis. The mining of medical concepts is complicated by the broad use of synonyms and nonstandard terms in medical documents. OBJECTIVE: We present a machine learning model for concept recognition in large unstructured text, which optimizes the use of ontological structures and can identify previously unobserved synonyms for concepts in the ontology. METHODS: We present a neural dictionary model that can be used to predict if a phrase is synonymous with a concept in a reference ontology. Our model, called the Neural Concept Recognizer (NCR), uses a convolutional neural network to encode input phrases and then rank medical concepts based on the similarity in that space. It uses the hierarchical structure provided by the biomedical ontology as an implicit prior embedding to better learn embedding of various terms. We trained our model on two biomedical ontologies: the Human Phenotype Ontology (HPO) and Systematized Nomenclature of Medicine - Clinical Terms (SNOMED-CT). RESULTS: We tested our model trained on HPO by using two different data sets: 288 annotated PubMed abstracts and 39 clinical reports. We achieved 1.7%-3% higher F1-scores than those for our strongest manually engineered rule-based baselines (P=.003). We also tested our model trained on the SNOMED-CT by using 2000 Intensive Care Unit discharge summaries from MIMIC (Multiparameter Intelligent Monitoring in Intensive Care) and achieved 0.9%-1.3% higher F1-scores than those of our baseline. The results of our experiments show high accuracy of our model as well as the value of using the taxonomy structure of the ontology in concept recognition. CONCLUSION: Most popular medical concept recognizers rely on rule-based models, which cannot generalize well to unseen synonyms. In addition, most machine learning methods typically require large corpora of annotated text that cover all classes of concepts, which can be extremely difficult to obtain for biomedical ontologies. Without relying on large-scale labeled training data or requiring any custom training, our model can be efficiently generalized to new synonyms and performs as well as or better than state-of-the-art methods custom built for specific ontologies.
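
The ranking step described in METHODS (encode a phrase, then rank concepts by similarity in the embedding space) can be illustrated with a minimal sketch. This is not the NCR implementation: the vectors below are hand-written stand-ins for what a trained encoder would produce, the concept labels are hypothetical, and cosine similarity is an assumed choice, since the abstract does not name the similarity measure.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_concepts(phrase_vec, concept_vecs):
    """Return (concept_id, score) pairs sorted from most to least similar."""
    scored = [(cid, cosine(phrase_vec, vec)) for cid, vec in concept_vecs.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical 3-dimensional embeddings; a real system would learn these.
concept_vecs = {
    "concept_A": [0.9, 0.1, 0.0],
    "concept_B": [0.1, 0.9, 0.1],
    "concept_C": [0.0, 0.2, 0.9],
}

# Hypothetical encoding of an input phrase (assumed to come from an encoder).
phrase_vec = [0.8, 0.2, 0.1]

ranking = rank_concepts(phrase_vec, concept_vecs)
for concept_id, score in ranking:
    print(concept_id, round(score, 3))
```

Because the phrase is matched by position in the embedding space and not by exact string lookup, a synonym that never appeared in the ontology can still land near the right concept, which is the property the abstract highlights.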

Noninvasive Prenatal Detection of Trisomy 21 by Targeted Semiconductor Sequencing: A Technical Feasibility Study.

Xi, Yanwei; Arbabi, Aryan; McNaughton, Amy J M; Hamilton, Alison; Hull, Danna; Perras, Helene; Chiu, Tillie; Morrison, Shawna; Goldsmith, Claire; Creede, Emilie; Anger, Gregory J; Honeywell, Christina; Cloutier, Mireille; Macchio, Natasha; Kiss, Courtney; Liu, Xudong; Crocker, Susan; Davies, Gregory A; Brudno, Michael; Armour, Christine M.

Fetal Diagn Ther ; 42(4): 302-310, 2017.

Article in English | MEDLINE | ID: mdl-28511174

ABSTRACT

OBJECTIVE: To develop an alternate noninvasive prenatal testing method for the assessment of trisomy 21 (T21) using a targeted semiconductor sequencing approach. METHODS: A customized AmpliSeq panel was designed with 1,067 primer pairs targeting specific regions on chromosomes 21, 18, 13, and others. A total of 235 samples, including 30 affected with T21, were sequenced with an Ion Torrent Proton sequencer, and a method was developed for assessing the probability of fetal aneuploidy via derivation of a risk score. RESULTS: Application of the derived risk score yields a bimodal distribution, with the affected samples clustering near 1.0 and the unaffected near 0. For a risk score cutoff of 0.345, above which all would be considered at "high risk," all 30 T21-positive pregnancies were correctly predicted to be affected, and 199 of the 205 non-T21 samples were correctly predicted. The average hands-on time spent on library preparation and sequencing was 19 h in total, and the average number of reads of sequence obtained was 3.75 million per sample. CONCLUSION: With the described targeted sequencing approach on the semiconductor platform using a custom-designed library and a probabilistic statistical approach, we have demonstrated the feasibility of an alternate method of assessment for fetal T21.

Subject(s)

Down Syndrome/diagnosis , Maternal Serum Screening Tests , Sequence Analysis, DNA , Adult , Feasibility Studies , Female , Humans , Middle Aged , Pregnancy , Young Adult
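
The counts reported in RESULTS of the abstract above translate directly into sensitivity and specificity at the 0.345 cutoff. The short calculation below is a sketch of that arithmetic only; the function and variable names are illustrative and do not come from the paper.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of affected samples that were called affected."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Fraction of unaffected samples that were called unaffected."""
    return true_negatives / (true_negatives + false_positives)


# Counts taken from the abstract: 30 of 30 T21 samples and
# 199 of 205 non-T21 samples were predicted correctly.
t21_total, t21_correct = 30, 30
non_t21_total, non_t21_correct = 205, 199

sens = sensitivity(t21_correct, t21_total - t21_correct)
spec = specificity(non_t21_correct, non_t21_total - non_t21_correct)

print(f"sensitivity = {sens:.3f}")  # 1.000
print(f"specificity = {spec:.3f}")  # 0.971
```

The six misclassified unaffected samples correspond to a false-positive rate of about 2.9% in this cohort of 235 samples.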

Cell-free DNA fragment-size distribution analysis for non-invasive prenatal CNV prediction.

Arbabi, Aryan; Rampásek, Ladislav; Brudno, Michael.

Bioinformatics ; 32(11): 1662-9, 2016 06 01.

Article in English | MEDLINE | ID: mdl-27153615

ABSTRACT

BACKGROUND: Non-invasive detection of aneuploidies in a fetal genome through analysis of cell-free DNA circulating in the maternal plasma is becoming a routine clinical test. Such tests, which rely on analyzing the read coverage or the allelic ratios at single-nucleotide polymorphism (SNP) loci, are not sensitive enough for smaller sub-chromosomal abnormalities due to sequencing biases and paucity of SNPs in a genome. RESULTS: We have developed an alternative framework for identifying sub-chromosomal copy number variations in a fetal genome. This framework relies on the size distribution of fragments in a sample, as fetal-origin fragments tend to be smaller than those of maternal origin. By analyzing the local distribution of the cell-free DNA fragment sizes in each region, our method allows for the identification of sub-megabase CNVs, even in the absence of SNP positions. To evaluate the accuracy of our method, we used a plasma sample with a fetal fraction of 13%, down-sampled it to samples with coverage of 10X-40X and simulated samples with CNVs based on it. Our method had perfect accuracy (both specificity and sensitivity) for detecting 5 Mb CNVs, and after reducing the fetal fraction (to 11%, 9% and 7%), it could correctly identify 98.82-100% of the 5 Mb CNVs and had a true-negative rate of 95.29-99.76%. AVAILABILITY AND IMPLEMENTATION: Our source code is available on GitHub at https://github.com/compbio-UofT/FSDA. CONTACT: brudno@cs.toronto.edu.

Subject(s)

DNA Copy Number Variations , Aneuploidy , DNA , Humans , Polymorphism, Single Nucleotide , Prenatal Diagnosis , Sequence Analysis, DNA
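
A back-of-envelope calculation shows why a local shift in fragment sizes can signal a fetal CNV, as the abstract above argues. The sketch assumes a simple mixture model that is not taken from the paper: the mother carries two copies everywhere, the CNV is present only in the fetus, and each genome contributes fragments in proportion to its copy number. Under those assumptions, the local share of fetal fragments changes inside the CNV, and with it the share of short fragments.

```python
def local_fetal_fraction(fetal_fraction, fetal_copies, maternal_copies=2):
    """Share of fetal-origin fragments in a region, given copy numbers.

    Assumes fragment yield is proportional to copy number, with
    `fetal_fraction` measured in normal two-copy regions.
    """
    fetal = fetal_fraction * fetal_copies / 2
    maternal = (1 - fetal_fraction) * maternal_copies / 2
    return fetal / (fetal + maternal)


f = 0.13  # fetal fraction of the plasma sample mentioned in the abstract

normal = local_fetal_fraction(f, fetal_copies=2)
duplication = local_fetal_fraction(f, fetal_copies=3)
deletion = local_fetal_fraction(f, fetal_copies=1)

print(f"normal region:     {normal:.4f}")
print(f"fetal duplication: {duplication:.4f}")
print(f"fetal deletion:    {deletion:.4f}")
```

With a 13% fetal fraction, the model puts the local fetal share at roughly 18% inside a duplication and 7% inside a heterozygous deletion. The gap narrows as the fetal fraction drops, which is consistent with the lower accuracy the abstract reports at 7-11%.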

ARYANA: Aligning Reads by Yet Another Approach.

Gholami, Milad; Arbabi, Aryan; Sharifi-Zarchi, Ali; Chitsaz, Hamidreza; Sadeghi, Mehdi.

BMC Bioinformatics ; 15 Suppl 9: S12, 2014.

Article in English | MEDLINE | ID: mdl-25252881

ABSTRACT

MOTIVATION: Although there are many different algorithms and software tools for aligning sequencing reads, fast gapped sequence search is far from solved. Strong interest in fast alignment is best reflected in the $10^6 prize for the InnoCentive competition on aligning a collection of reads to a given database of reference genomes. In addition, de novo assembly of next-generation sequencing long reads requires fast overlap-layout-consensus algorithms which depend on fast and accurate alignment. CONTRIBUTION: We introduce ARYANA, a fast gapped read aligner, developed on the basis of the BWA indexing infrastructure with a completely new alignment engine that makes it significantly faster than three other aligners: Bowtie2, BWA and SeqAlto, with comparable generality and accuracy. Instead of the time-consuming backtracking procedures for handling mismatches, ARYANA comes with the seed-and-extend algorithmic framework and a significantly improved efficiency by integrating novel algorithmic techniques including dynamic seed selection, bidirectional seed extension, reset-free hash tables, and gap-filling dynamic programming. As the read length increases, ARYANA's superiority in terms of speed and alignment rate becomes more evident. This is in perfect harmony with the read length trend as the sequencing technologies evolve. The algorithmic platform of ARYANA makes it easy to develop mission-specific aligners for other applications using the ARYANA engine. AVAILABILITY: ARYANA with complete source code can be obtained from http://github.com/aryana-aligner.

Subject(s)

Sequence Alignment/methods , Sequence Analysis, DNA/methods , Software , Algorithms , Genome, Human , High-Throughput Nucleotide Sequencing/economics , High-Throughput Nucleotide Sequencing/methods , Humans , Sequence Alignment/economics , Sequence Analysis, DNA/economics
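
The seed-and-extend framework named in the abstract above can be illustrated with a toy example. This sketch is not ARYANA: it uses a plain k-mer hash index where ARYANA builds on BWA indexing, tries every fixed-length seed with no dynamic seed selection, and extends without gaps by counting mismatches over the whole read. All names and sequences are made up for illustration.

```python
from collections import defaultdict


def build_index(reference, k):
    """Map every k-mer in the reference to the positions where it occurs."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def align(read, reference, index, k, max_mismatches=2):
    """Return (start, mismatches) of the best ungapped placement, or None."""
    best = None
    tried = set()
    for offset in range(len(read) - k + 1):
        seed = read[offset:offset + k]
        for hit in index.get(seed, []):
            start = hit - offset  # read start implied by this seed hit
            if start < 0 or start + len(read) > len(reference) or start in tried:
                continue
            tried.add(start)
            # Extension step: compare the full read against the reference window.
            window = reference[start:start + len(read)]
            mismatches = sum(1 for a, b in zip(read, window) if a != b)
            if mismatches <= max_mismatches and (best is None or mismatches < best[1]):
                best = (start, mismatches)
    return best


reference = "ACGTTGCATGCCGATAGCTTAGGCTA"
read = "CCGATTGCTTAG"  # reference[10:22] with one substitution

index = build_index(reference, 4)
result = align(read, reference, index, 4)
print(result)  # (10, 1)
```

Seeding restricts the expensive comparison to a handful of candidate positions, so the mismatch is absorbed during extension and never has to be explored by backtracking through the index.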

Probabilistic method for detecting copy number variation in a fetal genome using maternal plasma sequencing.

Rampásek, Ladislav; Arbabi, Aryan; Brudno, Michael.

Bioinformatics ; 30(12): i212-8, 2014 Jun 15.

Article in English | MEDLINE | ID: mdl-24931986

ABSTRACT

MOTIVATION: The past several years have seen the development of methodologies to identify genomic variation within a fetus through the non-invasive sequencing of maternal blood plasma. These methods are based on the observation that maternal plasma contains a fraction of DNA (typically 5-15%) originating from the fetus, and such methodologies have already been used for the detection of whole-chromosome events (aneuploidies), and to a more limited extent for smaller (typically several megabases long) copy number variants (CNVs). RESULTS: Here we present a probabilistic method for non-invasive analysis of de novo CNVs in the fetal genome based on maternal plasma sequencing. Our novel method combines three types of information within a unified Hidden Markov Model: the imbalance of allelic ratios at SNP positions, the use of parental genotypes to phase nearby SNPs, and depth of coverage to better differentiate between various types of CNVs and improve precision. Our simulation results, based on in silico introduction of novel CNVs into plasma samples with 13% fetal DNA concentration, demonstrate a sensitivity of 90% for CNVs >400 kb (with 13 calls in an unaffected genome), and 40% for 50-400 kb CNVs (with 108 calls in an unaffected genome). AVAILABILITY AND IMPLEMENTATION: Implementation of our model and data simulation method is available at http://github.com/compbio-UofT/fCNV.

Subject(s)

DNA Copy Number Variations , Fetus , Genetic Testing/methods , Genome, Human , Prenatal Diagnosis/methods , Sequence Analysis, DNA/methods , Algorithms , Computer Simulation , DNA/blood , Female , Genotype , Humans , Male , Models, Statistical , Polymorphism, Single Nucleotide , Pregnancy
