ABSTRACT
Objectives: Anaphylaxis is a severe life-threatening allergic reaction, and its accurate identification in healthcare databases can harness the potential of "Big Data" for healthcare or public health purposes. Materials and methods: This study used claims data obtained between October 1, 2015 and February 28, 2019 from the CMS database to examine the utility of machine learning in identifying incident anaphylaxis cases. We created a feature selection pipeline to identify critical features between different datasets. Then a variety of unsupervised and supervised methods were used (eg, Sammon mapping and eXtreme Gradient Boosting) to train models on datasets of differing data quality, which reflects the varying availability and potential rarity of ground truth data in medical databases. Results: Resulting machine learning model accuracies ranged from 47.7% to 94.4% when tested on ground truth data. Finally, we found new features to help experts enhance existing case-finding algorithms. Discussion: Developing precise algorithms to detect medical outcomes in claims can be a laborious and expensive process, particularly for conditions presented and coded diversely. We found it beneficial to filter out highly potent codes used for data curation to identify underlying patterns and features. To improve rule-based algorithms where necessary, researchers could use model explainers to determine noteworthy features, which could then be shared with experts and included in the algorithm. Conclusion: Our work suggests machine learning models can perform at similar levels as a previously published expert case-finding algorithm, while also having the potential to improve performance or streamline algorithm construction processes by identifying new relevant features for algorithm construction.
ABSTRACT
In 2020, Novartis Pharmaceuticals Corporation and the U.S. Food and Drug Administration (FDA) started a 4-year scientific collaboration to approach complex new data modalities and advanced analytics. Under a Research Collaboration Agreement, the scientific aim was to find novel radio-genomics-based prognostic and predictive factors for HR+/HER2- metastatic breast cancer. This collaboration has been providing valuable insights to help successfully implement future scientific projects, particularly those using artificial intelligence and machine learning. This tutorial aims to provide tangible guidelines for a multi-omics project that involves multidisciplinary expert teams spanning different institutions. We cover key ideas, such as "maintaining effective communication" and "following good data science practices," followed by the four steps of exploratory projects, namely (1) plan, (2) design, (3) develop, and (4) disseminate. We break each step into smaller concepts with strategies for implementation and provide illustrations from our collaboration to give readers further actionable guidance.
Subjects
Artificial Intelligence; Multiomics; Humans; Machine Learning; Genomics
ABSTRACT
Although phagocytic cells are documented targets of Leishmania parasites, it is unclear whether other cell types can be infected. Here, we use unbiased single-cell RNA sequencing (scRNA-seq) to simultaneously analyze host cell and Leishmania donovani transcriptomes to identify and annotate parasitized cells in the spleen and bone marrow of chronically infected mice. Our dual-scRNA-seq methodology allows the detection of heterogeneous parasitized populations. In the spleen, monocytes and macrophages are the dominant parasitized cells, while megakaryocytes, basophils, and natural killer (NK) cells are unexpectedly found to be infected. In the bone marrow, the hematopoietic stem cells (HSCs) expressing the phagocytic receptors FcγR and CD93 are the main parasitized cells. We also detect parasitized cycling basal cells, eosinophils, and macrophages in chronically infected mice. Flow cytometric analysis confirms the presence of parasitized HSCs. Our unbiased dual-scRNA-seq method identifies rare, parasitized cells, potentially implicated in pathogenesis, persistence, and protective immunity, using a non-targeted approach.
ABSTRACT
Macrophages are a heterogeneous population of mononuclear phagocytes abundantly distributed throughout the intestinal compartments that adapt to microenvironment-specific cues. In adult mice, the majority of intestinal macrophages exhibit a mature phenotype and are derived from blood monocytes. In the steady state, replenishment of these cells is reduced in the absence of the chemokine receptor CCR2. Within the intestine of mice with colitis, there is a marked increase in the accumulation of immature macrophages that demonstrate an inflammatory phenotype. Here, we asked whether CCR2 is necessary for the development of colitis in mice lacking the receptor for IL10. We compared the development of intestinal inflammation in mice lacking IL10RA or both IL10RA and CCR2. The absence of CCR2 interfered with the accumulation of immature macrophages in IL10R-deficient mice, including a novel population of rounded submucosal Iba1+ cells, and reduced the severity of colitis in these mice. In contrast, the absence of CCR2 did not reduce the augmented inflammatory gene expression observed in mature intestinal macrophages isolated from mice lacking IL10RA. These data suggest that both newly recruited CCR2-dependent immature macrophages and CCR2-independent residual mature macrophages contribute to the development of intestinal inflammation observed in IL10R-deficient mice.
Subjects
Colitis/immunology; Interleukin-10 Receptor alpha Subunit/immunology; Intestines/immunology; Monocytes/immunology; Receptors, CCR2/immunology; Animals; Colitis/genetics; Female; Humans; Interleukin-10 Receptor alpha Subunit/genetics; Macrophages/immunology; Male; Mice; Mice, Knockout; Receptors, CCR2/genetics
ABSTRACT
Autophagy drives drug resistance and drug-induced cancer cell cytotoxicity. Targeting the autophagy process could greatly improve chemotherapy outcomes. The discovery of specific inhibitors or activators has been hindered by challenges with reliably measuring autophagy levels in a clinical setting. We investigated drug-induced autophagy in breast cancer cell lines with differing ER/PR/Her2 receptor status by exposing them to known but divergent autophagy inducers, each with a unique molecular target: tamoxifen, trastuzumab, bortezomib, or rapamycin. Differential gene expression analysis of total RNA extracted at the earliest sign of autophagy flux showed both cell- and drug-specific changes. We analyzed the list of differentially expressed genes to find a common, cell- and drug-agnostic autophagy signature. Twelve mRNAs were significantly modulated by all the drugs, and 11 were orthogonally verified with Q-RT-PCR (Klhl24, Hbp1, Crebrf, Ypel2, Fbxo32, Gdf15, Cdc25a, Ddit4, Psat1, Cd22, Ypel3). The drug-agnostic mRNA signature was similarly induced by a mitochondrially targeted agent, MitoQ. In silico analysis on the KM-plotter cancer database showed that the levels of these mRNAs are detectable in human samples and are associated with the breast cancer prognosis outcomes of Relapse-Free Survival in all patients (RFS), Overall Survival in all patients (OS), and Relapse-Free Survival in ER+ patients (RFS ER+). High levels of Klhl24, Hbp1, Crebrf, Ypel2, Cd22 and Ypel3 were correlated with better outcomes, whereas lower levels of Gdf15, Cdc25a, Ddit4 and Psat1 were associated with better prognosis in breast cancer patients. This gene signature uncovers candidate autophagy biomarkers that could be tested during preclinical and clinical studies to monitor the autophagy process.
Subjects
Antineoplastic Agents/pharmacology; Biomarkers, Tumor/genetics; Breast Neoplasms/genetics; Gene Expression Profiling/methods; Gene Regulatory Networks; Antineoplastic Agents/therapeutic use; Autophagy/drug effects; Bortezomib/pharmacology; Bortezomib/therapeutic use; Breast Neoplasms/drug therapy; Cell Line, Tumor; Drug Resistance, Neoplasm; Female; Gene Expression Regulation, Neoplastic/drug effects; Gene Regulatory Networks/drug effects; Humans; MCF-7 Cells; Organophosphorus Compounds/pharmacology; Organophosphorus Compounds/therapeutic use; Receptor, ErbB-2/genetics; Receptors, Estrogen/genetics; Receptors, Progesterone/genetics; Sequence Analysis, RNA; Sirolimus/pharmacology; Sirolimus/therapeutic use; Tamoxifen/pharmacology; Tamoxifen/therapeutic use; Trastuzumab/pharmacology; Trastuzumab/therapeutic use; Ubiquinone/analogs & derivatives; Ubiquinone/pharmacology; Ubiquinone/therapeutic use
ABSTRACT
BACKGROUND: Accurate detection of somatic mutations is challenging but critical in understanding cancer formation, progression, and treatment. We recently proposed NeuSomatic, the first deep convolutional neural network-based somatic mutation detection approach, and demonstrated performance advantages on in silico data. RESULTS: In this study, we use the first comprehensive and well-characterized somatic reference data sets from the SEQC2 consortium to investigate best practices for using a deep learning framework in cancer mutation detection. Using the high-confidence somatic mutations established for a cancer cell line by the consortium, we identify the best strategy for building robust models on multiple data sets derived from samples representing real scenarios; for example, a model trained on a combination of real and spike-in mutations had the highest average performance. CONCLUSIONS: The strategy identified in our study achieved high robustness across multiple sequencing technologies for fresh and FFPE DNA input, varying tumor/normal purities, and different coverages, with significant superiority over conventional detection approaches in general, as well as in challenging situations such as low coverage, low variant allele frequency, DNA damage, and difficult genomic regions.
Subjects
Deep Learning; Neoplasms; Genomics; Humans; Mutation; Neoplasms/genetics; Neural Networks, Computer
ABSTRACT
Asthma diagnosis and management remain a challenging task for the medical community. The aim of the present study was to present the functional and inflammatory profiles of patients with difficult-to-treat asthma in a real-life clinical setting who were referred to the specialized asthma clinic at the University Hospital of Heraklion. The registry included a cohort of 267 patients who were referred to the severe asthma clinic. Patients were assessed with emphasis on the history of allergies, nasal polyposis or other comorbidities. Blood testing for eosinophil counts and total and specific IgE, and pulmonary function tests were performed at baseline. The median age of patients with asthma was 55 years, 68.5% were women and 58.3% were never smokers. The vast majority presented with late-onset asthma (75.7%), whereas eight (3%) patients were on oral corticosteroids. The median number of exacerbations during the last 12 months was 1 (0-3). Furthermore, 50.7% of patients had a positive serum allergy test, the median eosinophil count was 300 (188-508.5) cells/µl of blood and the median total IgE level was 117.5 (29.4-360.5) IU/ml. Patients were retrospectively grouped in the following categories: Group 1, mild-moderate asthma; group 2, patients prescribed a step 4 or 5 asthma therapy according to the Global Initiative for Asthma; and group 3, patients on biologic agents. Group 1 had significantly higher FEV1% than groups 2 and 3 (93.4 vs. 79.9 and 79.4%, respectively; P<0.001). Finally, the median Asthma Control Questionnaire 7 (ACQ7) score was 1.14, with patients from groups 2 and 3 presenting higher ACQ7 scores compared with group 1 patients, as expected (1.1 and 2.1 vs. 0.7, respectively; P<0.001). To the best of our knowledge, this was the first real-life asthma study in Crete that demonstrated that severe asthmatics predominantly have late-onset asthma with airflow obstruction and uncontrolled symptoms.
ABSTRACT
Acute hypoxemic respiratory failure is the principal cause of hospitalization, invasive mechanical ventilation and death in severe COVID-19 infection. Nearly half of intubated patients with COVID-19 eventually die. High-Flow Nasal Oxygen (HFNO) and Noninvasive Ventilation (NIV) constitute valuable tools to avert endotracheal intubation in patients with severe COVID-19 pneumonia who do not respond to conventional oxygen treatment. Sparing Intensive Care Unit beds and reducing intubation-related complications may save lives in the pandemic era. The main drawback of HFNO and/or NIV is intubation delay. Cautious selection of patients with severe hypoxemia due to COVID-19 disease, close monitoring and appropriate employment and titration of HFNO and/or NIV can increase the rate of success and eliminate the risk of intubation delay. At the same time, all precautions to protect the healthcare personnel from viral transmission should be taken. In this review, we summarize the evidence supporting the application of HFNO and NIV in severe COVID-19 hypoxemic respiratory failure, analyse the risks associated with their use and provide a path for their proper implementation.
ABSTRACT
Acute fibrinous and organizing pneumonia (AFOP) is an entity that can be secondary to various conditions leading to lung injury, such as infections, malignancies, and various autoimmune conditions, or can be classified as idiopathic interstitial lung disease when no obvious underlying cause is identified. Myelodysplastic syndromes (MDS), on the other hand, are a spectrum of clonal myeloid disorders, with a higher risk of acute leukemia, characterized by ineffective bone marrow (BM) hematopoiesis and, thus, peripheral blood (PB) cytopenias. Immune deregulation is thought to take part in the pathophysiology of the disease, including abnormal T and/or B cell responses, innate immunity, and cytokine expression. In the literature, there are a few case reports of patients with MDS who presented with pulmonary infiltrates and were diagnosed as having AFOP or organizing pneumonia (OP). It is rare, though, to have isolated pulmonary infiltrates without Sweet's syndrome, or for the pulmonary infiltrates to precede the diagnosis and treatment of MDS, as in our case. We present a 72-year-old female who developed new lung infiltrates that were refractory to antibiotic treatment, responded well to corticosteroids, and were histologically described as OP. The treatment was gradually and successfully switched to mycophenolate mofetil (MMF). The patient was later diagnosed with MDS. This case report suggests, first, that a diagnosis of AFOP or OP should alert the clinician to search for an underlying cause, including MDS, and vice versa; second, that the use of systemic steroids should not be postponed; and, finally, that MMF can successfully be used in these patients.
ABSTRACT
INTRODUCTION: During the first COVID-19 wave, a considerable decline in hospital admissions was observed worldwide. AIM: This retrospective cohort study aimed to assess if there were any changes in the number of patients hospitalized for respiratory diseases in Greece during the first COVID-19 wave. METHODS: In the present study, we evaluated respiratory disease hospitalization rates across 9 tertiary hospitals in Greece during the study period (March-April 2020) and the corresponding period of the 2 previous years (2018-2019) that served as the control periods. Demographic data and discharge diagnosis were documented for every patient. RESULTS: Of the 1,307 patients who were hospitalized during the study period, 444 (35.5%) were males with a mean (±SD) age of 66.1 ± 16.6 years. There was a 47 and 46% reduction in all-cause respiratory morbidity compared to the corresponding periods of 2018 and 2019, respectively. The mean incidence rate for respiratory diseases during the study period was 21.4 admissions per day, and this rate was significantly lower than the rate during the same period in 2018 (40.8 admissions per day; incidence rate ratio [IRR], 0.525; 95% confidence interval [CI], 0.491-0.562; p < 0.001) or the rate during 2019 (39.9 admissions per day; IRR, 0.537; 95% CI, 0.502-0.574; p < 0.001). The greatest reductions (%) in the number of daily admissions in 2020 were observed for sleep apnoea (87% vs. 2018 and 84% vs. 2019) followed by admissions for asthma (76% vs. 2018 and 79% vs. 2019) and chronic obstructive pulmonary disease (60% vs. 2018 and 51% vs. 2019), while the lowest reductions were detected in hospitalizations for pulmonary embolism (6% vs. 2018 and 23% vs. 2019) followed by tuberculosis (25% vs. both 2018 and 2019).
DISCUSSION/CONCLUSION: The significant reduction in respiratory admissions in 2020 raises the reasonable question of whether some patients may have avoided seeking medical attention during the COVID-19 pandemic and suggests an urgent need for transformation of healthcare systems during the pandemic to offer appropriate management of respiratory diseases other than COVID-19.
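The incidence rate ratios reported above can be approximately reproduced from the rounded daily admission rates given in the abstract. The following is a minimal sketch, not the authors' analysis: the function names are illustrative, the inputs are the already rounded rates from the text, and the confidence intervals are not reproduced because they would require the underlying admission counts.

```python
def incidence_rate_ratio(study_rate: float, control_rate: float) -> float:
    """Ratio of the study-period admission rate to a control-period rate."""
    if control_rate <= 0:
        raise ValueError("control_rate must be positive")
    return study_rate / control_rate


def percent_reduction(study_rate: float, control_rate: float) -> float:
    """Relative reduction (%) of the study-period rate versus the control rate."""
    return (1.0 - incidence_rate_ratio(study_rate, control_rate)) * 100.0


# Daily admission rates as reported in the abstract (already rounded).
RATE_2020 = 21.4
RATE_2018 = 40.8
RATE_2019 = 39.9

if __name__ == "__main__":
    for year, control in (("2018", RATE_2018), ("2019", RATE_2019)):
        irr = incidence_rate_ratio(RATE_2020, control)
        drop = percent_reduction(RATE_2020, control)
        print(f"vs. {year}: IRR ~ {irr:.3f}, reduction ~ {drop:.1f}%")
```

Because the inputs are rounded, the computed values may differ from the published ones in the last digit.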
Subjects
COVID-19/epidemiology; Hospitalization/trends; Respiratory Tract Diseases/epidemiology; Aged; Aged, 80 and over; Asthma/epidemiology; Cohort Studies; Female; Greece/epidemiology; Humans; Incidence; Male; Middle Aged; Pulmonary Disease, Chronic Obstructive/epidemiology; Pulmonary Embolism/epidemiology; Retrospective Studies; SARS-CoV-2; Sleep Apnea Syndromes/epidemiology; Tuberculosis, Pulmonary/epidemiology
ABSTRACT
High-throughput sequencing (HTS) has been widely used to characterize HIV-1 genome sequences. There are no algorithms currently that can directly determine genotype and quasispecies population using short HTS reads generated from long genome sequences without additional software. To establish a robust subpopulation, subtype, and recombination analysis workflow, we amplified the HIV-1 3'-half genome from plasma samples of 65 HIV-1-infected individuals and sequenced the entire amplicon (~4,500 bp) by HTS. With direct analysis of raw reads using HIVE-hexahedron, we showed that 48% of samples harbored 2 to 13 subpopulations. We identified various subtypes (17 A1s, 4 Bs, 27 Cs, 6 CRF02_AGs, and 11 unique recombinant forms) and defined recombinant breakpoints of 10 recombinants. These results were validated with viral genome sequences generated by single genome sequencing (SGS) or the analysis of consensus sequence of the HTS reads. The HIVE-hexahedron workflow is more sensitive and accurate than just evaluating the consensus sequence and also more cost-effective than SGS. IMPORTANCE: The highly recombinogenic nature of human immunodeficiency virus type 1 (HIV-1) leads to recombination and emergence of quasispecies. It is important to reliably identify subpopulations to understand the complexity of a viral population for drug resistance surveillance and vaccine development. High-throughput sequencing (HTS) provides improved resolution over Sanger sequencing for the analysis of heterogeneous viral subpopulations. However, current methods of analysis of HTS reads are unable to fully address accurate population reconstruction. Hence, there is a dire need for a more sensitive, accurate, user-friendly, and cost-effective method to analyze viral quasispecies. For this purpose, we have improved the HIVE-hexahedron algorithm that we previously developed with in silico short sequences to analyze raw HTS short reads.
The significance of this study is that our standalone algorithm enables a streamlined analysis of quasispecies, subtype, and recombination patterns from long HIV-1 genome regions without the need of additional sequence analysis tools. Distinct viral populations and recombination patterns identified by HIVE-hexahedron are further validated by comparison with sequences obtained by single genome sequencing (SGS).
Subjects
Algorithms; Genome, Viral; HIV-1/classification; HIV-1/genetics; Recombination, Genetic; Cohort Studies; Computer Simulation; Genetic Variation; Genotype; HIV Infections/blood; HIV Infections/virology; High-Throughput Nucleotide Sequencing; Humans; Phylogeny; Quasispecies/genetics
ABSTRACT
Most viruses are known to spontaneously generate defective viral genomes (DVGs) due to errors during replication. These DVGs are subgenomic and contain deletions that render them unable to complete a full replication cycle in the absence of a co-infecting, non-defective helper virus. DVGs, especially of the copyback type, frequently observed with paramyxoviruses, have been recognized to be important triggers of the antiviral innate immune response. DVGs have therefore gained interest for their potential to alter the attenuation and immunogenicity of vaccines. To investigate this potential, accurate identification and quantification of DVGs is essential. Conventional methods, such as RT-PCR, are labor intensive and will only detect primer sequence-specific species. High throughput sequencing (HTS) is much better suited for this undertaking. Here, we present an HTS-based algorithm called DVG-profiler to identify and quantify all DVG sequences in an HTS data set generated from a virus preparation. DVG-profiler identifies DVG breakpoints relative to a reference genome and reports the directionality of each segment from within the same read. The specificity and sensitivity of the algorithm were assessed using both in silico data sets and HTS data obtained from parainfluenza virus 5, Sendai virus and mumps virus preparations. HTS data from the latter were also compared with conventional RT-PCR data and with data obtained using an alternative algorithm. The data presented here demonstrate the high specificity, sensitivity, and robustness of DVG-profiler. This algorithm was implemented within an open source cloud-based computing environment for analyzing HTS data. DVG-profiler might prove valuable not only in basic virus research but also in monitoring live attenuated vaccines for DVG content and in assuring vaccine lot-to-lot consistency.
Subjects
Algorithms; Chromosome Mapping/statistics & numerical data; Defective Viruses/genetics; Genome, Viral; Mumps virus/genetics; Parainfluenza Virus 5/genetics; Sendai virus/genetics; Animals; Chromosome Mapping/methods; DNA Primers/chemical synthesis; DNA Primers/metabolism; Datasets as Topic; Defective Viruses/classification; High-Throughput Nucleotide Sequencing/statistics & numerical data; Humans; Molecular Typing; Mumps virus/classification; Parainfluenza Virus 5/classification; Real-Time Polymerase Chain Reaction; Sendai virus/classification; Sensitivity and Specificity
ABSTRACT
Can you identify the cause of the acute respiratory failure in this patient with a history of polymyositis? http://ow.ly/jeDO30jLX5R.
ABSTRACT
Mesenchymal stem (stromal) cells (MSCs) are multipotent stromal cells that can modulate the immune response to tissue injury and promote repair in vivo. The therapeutic potential of ex vivo expanded MSCs is currently under investigation for a variety of chronic and acute lung diseases. This review summarizes the encouraging results regarding the safety of MSC administration from recent and current clinical trials for idiopathic pulmonary fibrosis, acute respiratory distress syndrome, and chronic obstructive pulmonary disease. It also reviews the preliminary data from the same trials regarding the efficacy of MSCs in these lung diseases.
ABSTRACT
Deep sequencing was used to determine complete nucleotide sequences of echovirus 11 (EV11) strains isolated from a chronically infected patient with CVID as well as from cases of acute enterovirus infection. Phylogenetic analysis showed that EV11 strains that circulated in Israel in the 1980s-90s could be divided into four clades. EV11 strains isolated from the chronically infected individual belonged to one of the four clades and, over a period of 4 years, accumulated mutations at a relatively constant rate. Extrapolation of the mutation accumulation curve into the past suggested that the individual was infected with circulating EV11 in the first half of the 1990s. Genomic regions coding for individual viral proteins did not appear to be under strong selective pressure, except for protease 3C, which was remarkably conserved. This may suggest that it plays an important role in maintaining persistent infection.
Subjects
Biological Evolution; Enterovirus B, Human/genetics; Enterovirus B, Human/isolation & purification; Enterovirus Infections/virology; Genome, Viral; Immunocompromised Host; Viral Proteins/metabolism; 3' Untranslated Regions; Enterovirus B, Human/classification; Genomics/methods; Humans; Phylogeny; Viral Proteins/genetics
ABSTRACT
Sequence heterogeneity is a common characteristic of RNA viruses that is often referred to as sub-populations or quasispecies. Traditional techniques used for assembly of short sequence reads produced by deep sequencing, such as de novo assemblers, ignore the underlying diversity. Here, we introduce a novel algorithm that simultaneously assembles discrete sequences of multiple genomes present in populations. Using in silico data, we were able to detect populations at frequencies as low as 0.1% with complete global genome reconstruction, and in a single sample we detected 16 resolved sequences with no mismatches. We also applied the algorithm to high throughput sequencing data obtained for viruses present in sewage samples and successfully detected multiple sub-populations and recombination events in these diverse mixtures. The high sensitivity of the algorithm also enables genomic analysis of heterogeneous pathogen genomes from patient samples and accurate detection of intra-host diversity, enabling not just basic research in personalized medicine but also accurate diagnostics and monitoring of drug therapies, which are critical in the clinical and regulatory decision-making process.
Subjects
Algorithms; Computational Biology/methods; Genome, Human/genetics; Genomics/methods; High-Throughput Nucleotide Sequencing/methods; Genome, Viral/genetics; Humans; Phylogeny; Poliovirus/classification; Poliovirus/genetics; Reproducibility of Results
ABSTRACT
Advances in high-throughput sequencing (HTS) technologies have greatly increased the availability of genomic data and potential discovery of clinically significant genomic variants. However, numerous issues still exist with the analysis of these data, including data complexity, the absence of formally agreed upon best practices, and inconsistent reproducibility. Toward a more robust and reproducible variant-calling paradigm, we propose a series of selective noise filtrations and post-alignment quality control (QC) techniques that may reduce the rate of false variant calls. We have implemented both novel and refined post-alignment QC mechanisms to augment existing pre-alignment QC measures. These techniques can be used independently or in combination to identify and correct issues caused during data generation or early analysis stages. The adoption of these procedures by the broader scientific community is expected to improve the identification of clinically significant variants both in terms of computational efficiency and in the confidence of the results. AVAILABILITY: https://hive.biochemistry.gwu.edu/.
Subjects
Algorithms; Genome, Human; High-Throughput Nucleotide Sequencing/methods; Polymorphism, Genetic; Quality Control; Genomics/methods; Humans; Reproducibility of Results; Sequence Analysis, DNA/methods
ABSTRACT
Mumps virus (MuV) is postulated to adhere to the "rule of six" for efficient replication. To examine this requirement for MuV, minigenomes of nonpolyhexameric length (6n-1 and 6n+1) were analyzed. Expression of the reporter gene CAT was significantly reduced with minigenomes of nonpolyhexameric length compared to the wild-type 6n genome, and the reduction was more pronounced for the 6n-1 than for the 6n+1 minigenome. That 6n-1 genomes are impacted by nonconformance with the rule of six to a greater degree than 6n+1 genomes was also suggested by experiments with MuV derived from cDNA coding for 6n+1 or 6n-1 genomes. While viruses recovered from 6n+1 cDNAs maintained a nonpolyhexameric genome length over multiple replication cycles, viruses rescued from the 6n-1 cDNAs acquired length-correcting mutations rapidly following rescue. Our data indicate that polyhexameric genomes are the preferred template for the MuV RNA polymerase, but that this requirement is not absolute.
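The 6n, 6n+1 and 6n-1 notation used above refers to whether a genome length in nucleotides is an exact multiple of six. As a minimal sketch of that classification (the function name and the example lengths are hypothetical and are not taken from the study):

```python
def rule_of_six_class(genome_length: int) -> str:
    """Classify a genome length relative to the nearest multiple of six."""
    if genome_length <= 0:
        raise ValueError("genome_length must be a positive number of nucleotides")
    remainder = genome_length % 6
    if remainder == 0:
        return "6n"      # polyhexameric: conforms to the rule of six
    if remainder == 1:
        return "6n+1"    # one nucleotide longer than a multiple of six
    if remainder == 5:
        return "6n-1"    # one nucleotide shorter than a multiple of six
    return f"6n+{remainder}"


if __name__ == "__main__":
    # Hypothetical minigenome lengths, chosen only to illustrate the classes.
    for length in (1200, 1201, 1199):
        print(length, rule_of_six_class(length))
```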