1.
Microbiol Spectr ; 10(2): e0256421, 2022 04 27.
Article En | MEDLINE | ID: mdl-35234489

Next-generation sequencing (NGS) is a powerful tool for detecting and investigating viral pathogens; however, analysis and management of the enormous amounts of data generated by these technologies remain a challenge. Here, we present VPipe (the Viral NGS Analysis Pipeline and Data Management System), an automated bioinformatics pipeline optimized for whole-genome assembly of viral sequences and identification of diverse species. VPipe automates the data quality control, assembly, and contig identification steps typically performed when analyzing NGS data. Users access the pipeline through a secure web-based portal, which provides an easy-to-use interface with advanced search capabilities for reviewing results. In addition, VPipe provides a centralized system for storing and analyzing NGS data, eliminating common bottlenecks in bioinformatics analyses for public health laboratories with limited on-site computational infrastructure. The performance of VPipe was validated through the analysis of publicly available NGS data sets for viral pathogens, generating high-quality assemblies for 12 data sets. VPipe also generated assemblies with greater contiguity than similar pipelines for 41 human respiratory syncytial virus isolates and 23 SARS-CoV-2 specimens. IMPORTANCE Computational infrastructure and bioinformatics analysis are bottlenecks in the application of NGS to viral pathogens. As of September 2021, VPipe has been used by the U.S. Centers for Disease Control and Prevention (CDC) and 12 state public health laboratories to characterize >17,500 and 1,500 clinical specimens and isolates, respectively. VPipe automates genome assembly for a wide range of viruses, including high-consequence pathogens such as SARS-CoV-2. Such automated functionality expedites public health responses to viral outbreaks and pathogen surveillance.
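
The abstract compares assemblies by contiguity but does not say which metric was used. One widely used contiguity measure is N50, and the following is a minimal sketch of how it can be computed. The function name `n50` and the contig lengths are illustrative assumptions, not VPipe code or VPipe results.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length L of the contig at which, walking from the longest
    contig downwards, the running total first reaches at least half of
    the total assembly length.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point issues with total / 2.
        if 2 * running >= total:
            return length


# Hypothetical contig lengths (bp) for two assemblies of the same genome.
fragmented_assembly = [4000, 3500, 3000, 2500, 2200]
single_contig_assembly = [15200]

print(n50(fragmented_assembly))
print(n50(single_contig_assembly))
```

Under this metric, an assembly that recovers the genome in fewer, longer contigs scores higher than a fragmented one of the same total length.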


COVID-19 , Viruses , Computational Biology/methods , High-Throughput Nucleotide Sequencing/methods , Humans , SARS-CoV-2/genetics , Viruses/genetics
2.
Stud Health Technol Inform ; 264: 1041-1045, 2019 Aug 21.
Article En | MEDLINE | ID: mdl-31438083

Natural language processing (NLP) technologies have been successfully applied to cancer research by enabling automated phenotypic information extraction from narratives in electronic health records (EHRs) such as pathology reports; however, developing customized NLP solutions requires substantial effort. To facilitate the adoption of NLP in cancer research, we have developed a set of customizable modules for extracting comprehensive types of cancer-related information from pathology reports (e.g., tumor size, tumor stage, and biomarkers), by leveraging the existing CLAMP system, which provides user-friendly interfaces for building customized NLP solutions for individual needs. Evaluation using annotated data at Vanderbilt University Medical Center showed that CLAMP-Cancer could extract diverse types of cancer information with good F-measures (0.80-0.98). We then applied CLAMP-Cancer to an information extraction task at Mayo Clinic and showed that we can quickly build a customized NLP system with performance comparable to that of an existing system at Mayo Clinic. CLAMP-Cancer is freely available for academic use.
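
For readers unfamiliar with the F-measure reported above, the following minimal sketch shows how it is typically computed for an extraction task. It assumes the balanced F1 variant and exact matching of (type, value) pairs; the abstract does not state either choice. The function name and the example annotations are hypothetical and are not taken from CLAMP-Cancer.

```python
def f_measure(gold, predicted, beta=1.0):
    """Compute the F-measure between two sets of extracted items.

    gold and predicted are sets of (entity_type, value) pairs; an item
    counts as correct only if both parts match exactly (an assumption).
    """
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical annotations for one pathology report.
gold = {
    ("tumor_size", "2.3 cm"),
    ("tumor_stage", "pT2"),
    ("biomarker", "ER positive"),
}
predicted = {
    ("tumor_size", "2.3 cm"),
    ("tumor_stage", "pT2"),
    ("biomarker", "HER2 negative"),  # spurious item; the ER result was missed
}

print(round(f_measure(gold, predicted), 3))
```

Here two of three predicted items are correct and two of three gold items are found, so precision and recall are both 2/3.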


Information Storage and Retrieval , Neoplasms , Electronic Health Records , Humans , Natural Language Processing , Research Report
3.
Article En | MEDLINE | ID: mdl-30238070

Purpose: The systemic treatment of cancer is primarily through the administration of complex chemotherapy protocols. To date, this knowledge has not been systematized, because of the lack of a consistent nomenclature and variation in how regimens are documented. For example, recording of treatment events in electronic health record notes is often through shorthand and acronyms, limiting secondary use. A standardized hierarchic ontology of cancer treatments, mapped to standard nomenclatures, would be valuable to a variety of end users. Methods: We leveraged the knowledge contained in a large wiki of hematology/oncology drugs and treatment regimens, HemOnc.org. Through algorithmic parsing, we created a hierarchic ontology of treatment concepts in the World Wide Web Consortium Web Ontology Language. We also mapped drug names to RxNorm codes and created optional filters to restrict the ontology by disease and/or drug class. Results: As of December 2017, the main ontology includes 30,526 axioms (eg, doxorubicin is an anthracycline), 1,196 classes (eg, regimens used in the neoadjuvant treatment of human epidermal growth factor receptor 2-positive breast cancer, nitrogen mustards), and 1,728 individual entities. More than 13,000 of the axioms are annotations including RxNorm codes, drug synonyms, literature references, and direct links to published articles. Conclusion: This approach represents, to our knowledge, the largest effort to date to systematically categorize and relate hematology/oncology drugs and regimens. The ontology can be used to infer individual components from regimens mentioned in electronic health records (eg, R-CHOP maps to rituximab, cyclophosphamide, doxorubicin, vincristine, and prednisone) and also to probabilistically reconstruct regimens from individual drug components. These capabilities may be particularly valuable in the implementation of rapid-learning health systems on the basis of real-world evidence.
The derived Web Ontology Language ontology is freely available for noncommercial use through the Creative Commons 4.0 Attribution-NonCommercial-ShareAlike license.
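
The two uses described in the conclusion, expanding a regimen into its drugs and recovering likely regimens from observed drugs, can be sketched with a toy lookup table. This is only an illustration under stated assumptions: the two-entry table is hand-written here rather than loaded from the ontology, the function names are hypothetical, and Jaccard overlap stands in for whatever probabilistic method the authors use, which the abstract does not describe.

```python
# Toy regimen table (assumption): a hand-written stand-in for the ontology.
REGIMENS = {
    "R-CHOP": frozenset(
        {"rituximab", "cyclophosphamide", "doxorubicin", "vincristine", "prednisone"}
    ),
    "CHOP": frozenset(
        {"cyclophosphamide", "doxorubicin", "vincristine", "prednisone"}
    ),
}


def expand_regimen(name):
    """Return the component drugs of a named regimen, sorted alphabetically."""
    return sorted(REGIMENS[name])


def rank_regimens(observed_drugs):
    """Rank known regimens by Jaccard overlap with the observed drugs.

    The score is a plain set similarity in [0, 1], not a calibrated
    probability.
    """
    observed = set(observed_drugs)
    scored = []
    for name, components in REGIMENS.items():
        union = observed | components
        score = len(observed & components) / len(union) if union else 0.0
        scored.append((name, score))
    # Highest score first; ties are broken by regimen name.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


print(expand_regimen("R-CHOP"))
print(rank_regimens({"cyclophosphamide", "doxorubicin", "vincristine", "prednisone"}))
```

With the four drugs shown, CHOP matches exactly and R-CHOP scores lower because rituximab was not observed.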

...