ABSTRACT
Recent advances in nucleic acid sequencing now permit rapid, genome-scale analysis of genetic variation and transcription, enabling population-scale studies of human biology, disease, and diverse organisms. Likewise, advances in mass spectrometry proteomics now permit highly sensitive and accurate studies of protein expression at the whole-proteome scale. However, most proteomic studies rely on consensus databases to match spectra to peptide and protein sequences, and thus remain limited to the analysis of canonical protein sequences. Here, we develop ProteomeGenerator2 (PG2), based on the scalable and modular ProteomeGenerator framework. PG2 integrates genome and transcriptome sequencing to incorporate protein variants containing amino acid substitutions, insertions, and deletions, as well as noncanonical reading frames, exons, and other variants caused by genomic and transcriptomic variation. We benchmarked PG2 using synthetic data and genomic, transcriptomic, and proteomic analysis of human leukemia cells. PG2 can be integrated with current and emerging sequencing technologies, assemblers, variant callers, and mass spectral analysis algorithms, and is available open-source from https://github.com/kentsisresearchgroup/ProteomeGenerator2.
Subjects
Proteogenomics, Humans, Proteomics/methods, Genomics/methods, Mass Spectrometry, Peptides
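The limitation described in the abstract above, that a canonical database cannot match peptides carrying sample-specific variants, can be illustrated with a small sketch. This is not PG2's implementation: the function names, the toy protein sequence, and the simplified trypsin rule (cleave after K or R unless followed by P, no missed cleavages) are all assumptions made for illustration.

```python
import re


def apply_substitution(protein, position, ref, alt):
    """Return the protein with the residue at a 1-based position changed from ref to alt."""
    if protein[position - 1] != ref:
        raise ValueError("reference residue does not match the sequence")
    return protein[:position - 1] + alt + protein[position:]


def tryptic_peptides(protein, min_len=6):
    """Digest in silico: cleave after K or R unless the next residue is P.

    Simplified rule with no missed cleavages; peptides shorter than
    min_len are dropped.
    """
    pieces = re.split(r"(?<=[KR])(?!P)", protein)
    return {p for p in pieces if len(p) >= min_len}


# Toy canonical sequence (hypothetical, not a real protein).
canonical = "MSTAGKLVEDPRAAGTWYKCCDEFGHR"

# Hypothetical sample-specific substitution: G15V.
variant = apply_substitution(canonical, 15, "G", "V")

# Peptides present only in the sample-specific sequence; a search against
# the canonical database alone could not match spectra from these.
variant_only = tryptic_peptides(variant) - tryptic_peptides(canonical)
print(variant_only)
```

In this toy case the only peptide missing from the canonical digest is the one spanning the substituted residue, which is the kind of sequence a sample-specific database would need to add.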
ABSTRACT
PURPOSE: Non-convulsive seizures are common in critically ill patients, and delays in diagnosis contribute to increased morbidity and mortality. Many intensive care units employ continuous EEG (cEEG) for seizure monitoring. Although cEEG is recorded continuously, it is often reviewed only intermittently, which may delay seizure diagnosis and treatment. Automated seizure detection may mitigate this delay. In this study, we develop and evaluate convolutional neural networks (CNNs) to automate seizure detection on EEG spectrograms. METHODS: Adult EEGs (12 patients, 12 EEGs, 33 seizures) from New-York Presbyterian Hospital (NYP) and pediatric EEGs (22 patients, 130 EEGs, 177 seizures) from Children's Hospital Boston (CHB) were converted into spectrograms. To simulate a telemetry display, seizure and non-seizure events on spectrograms were sequentially sampled as images across a detection window (26,380 total images). Four CNN models of increasing complexity (number of layers) were trained, cross-validated, and tested on CHB and NYP spectrographic images. All CNNs were based on the VGG-net architecture, with adjustments to alleviate overfitting. RESULTS: For spectrographically visible seizures, two CNN models (containing 4 and 7 convolution layers) achieved >90% seizure detection sensitivity and specificity on the CHB test set, and >90% sensitivity and 75-80% specificity on the NYP test set. One CNN model (10 convolution layers) did not converge during training, while another (2 convolution layers) performed poorly on the NYP test set (60% sensitivity and 32% specificity). CONCLUSIONS: Seizure detection on EEG spectrograms with CNN models is feasible, with sensitivity and specificity potentially suitable for clinical use.
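The sequential sampling step in METHODS can be sketched roughly as follows. The abstract does not give the window width, the stride, or the labeling rule, so the values below and the rule "a window is a seizure image if it overlaps any annotated seizure interval" are assumptions, and `sample_windows` is a hypothetical helper rather than the authors' code.

```python
def sample_windows(n_bins, seizures, width, stride):
    """Slide a fixed-width window across the time bins of a spectrogram.

    n_bins:   number of time bins in the spectrogram
    seizures: list of (onset, offset) half-open intervals, in time bins
    Returns a list of (start, end, label), where label is 1 if the window
    overlaps any seizure interval and 0 otherwise (assumed labeling rule).
    """
    windows = []
    for start in range(0, n_bins - width + 1, stride):
        end = start + width
        overlaps = any(start < offset and onset < end for onset, offset in seizures)
        windows.append((start, end, int(overlaps)))
    return windows


# Hypothetical recording: 20 time bins with one seizure spanning bins 8 to 11.
windows = sample_windows(n_bins=20, seizures=[(8, 12)], width=5, stride=5)
for start, end, label in windows:
    print(start, end, label)
```

Each `(start, end)` pair would index the spectrogram columns cropped into one image; a stride smaller than the width yields overlapping images, which is one way a dataset of this kind can reach tens of thousands of samples from a limited number of recordings.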