Results 1 - 20 of 59
1.
Sci Rep ; 14(1): 10000, 2024 May 01.
Article in English | MEDLINE | ID: mdl-38693215

ABSTRACT

Convolutional Neural Networks (CNNs) have been central to the Deep Learning revolution and played a key role in initiating the new age of Artificial Intelligence. However, in recent years newer architectures such as Transformers have dominated both research and practical applications. While CNNs still play critical roles in many of the newer developments such as Generative AI, they are far from being thoroughly understood and utilised to their full potential. Here we show that CNNs can recognise patterns in images with scattered pixels and can be used to analyse complex datasets by transforming them into pseudo images with minimal processing for any high dimensional dataset, representing a more general approach to the application of CNNs to datasets such as in molecular biology, text, and speech. We introduce a pipeline called DeepMapper, which allows analysis of very high dimensional datasets without intermediate filtering and dimension reduction, thus preserving the full texture of the data, enabling detection of small variations normally deemed 'noise'. We demonstrate that DeepMapper can identify very small perturbations in large datasets with mostly random variables, and that it is superior in speed and on par in accuracy to prior work in processing large datasets with large numbers of features.
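
The abstract does not say how DeepMapper lays features out in a pseudo image, so the following is only a minimal sketch of the general idea: a flat feature vector is written row by row into the smallest square grid that can hold it, with no filtering or dimension reduction. The function name `to_pseudo_image` and the zero padding are assumptions made for illustration, not details taken from the paper.

```python
import math


def to_pseudo_image(features, pad_value=0.0):
    """Reshape a flat feature vector into the smallest square grid that holds it.

    Hypothetical helper: the row-major layout and the padding value are
    assumptions, not the published DeepMapper procedure.
    """
    side = math.isqrt(len(features))
    if side * side < len(features):
        side += 1  # grow the grid until every feature fits
    padded = list(features) + [pad_value] * (side * side - len(features))
    # Slice the padded vector into `side` rows of `side` pixels each.
    return [padded[row * side:(row + 1) * side] for row in range(side)]


# Ten features end up in a 4 x 4 grid; the last six cells are padding.
image = to_pseudo_image([0.1 * i for i in range(10)])
```

Every input value is kept as its own pixel, which is the property the abstract emphasises; a CNN would then be trained on such grids.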

2.
Blood Adv ; 8(1): 112-129, 2024 01 09.
Article in English | MEDLINE | ID: mdl-37729615

ABSTRACT

Acute megakaryoblastic leukemia (AMKL) is a rare, developmentally restricted, and highly lethal cancer of early childhood. The paucity and hypocellularity (due to myelofibrosis) of primary patient samples hamper the discovery of cell- and genotype-specific treatments. AMKL is driven by mutually exclusive chimeric fusion oncogenes in two-thirds of the cases, with CBFA2T3::GLIS2 (CG2) and NUP98 fusions (NUP98r) representing the highest-fatality subgroups. We established CD34+ cord blood-derived CG2 models (n = 6) that sustain serial transplantation and recapitulate human leukemia regarding immunophenotype, leukemia-initiating cell frequencies, comutational landscape, and gene expression signature, with distinct upregulation of the prosurvival factor B-cell lymphoma 2 (BCL2). Cell membrane proteomic analyses highlighted CG2 surface markers preferentially expressed on leukemic cells compared with CD34+ cells (eg, NCAM1 and CD151). AMKL differentiation block in the mega-erythroid progenitor space was confirmed by single-cell profiling. Although CG2 cells were rather resistant to BCL2 genetic knockdown or selective pharmacological inhibition with venetoclax, they were vulnerable to strategies that target the megakaryocytic prosurvival factor BCL-XL (BCL2L1), including in vitro and in vivo treatment with BCL2/BCL-XL/BCL-W inhibitor navitoclax and DT2216, a selective BCL-XL proteolysis-targeting chimera degrader developed to limit thrombocytopenia in patients. NUP98r AMKL were also sensitive to BCL-XL inhibition but not the NUP98r monocytic leukemia, pointing to a lineage-specific dependency. Navitoclax or DT2216 treatment in combination with low-dose cytarabine further reduced leukemic burden in mice. This work extends the cellular and molecular diversity set of human AMKL models and uncovers BCL-XL as a therapeutic vulnerability in CG2 and NUP98r AMKL.


Subjects
Antineoplastic Agents , Acute Megakaryoblastic Leukemia , Humans , Child , Preschool Child , Animals , Mice , Acute Megakaryoblastic Leukemia/drug therapy , Acute Megakaryoblastic Leukemia/genetics , Acute Megakaryoblastic Leukemia/pathology , Proteomics , Transcription Factors , Proto-Oncogene Proteins c-bcl-2 , Repressor Proteins
3.
J Virol ; 97(11): e0070523, 2023 Nov 30.
Article in English | MEDLINE | ID: mdl-37843370

ABSTRACT

IMPORTANCE: The lack of a reliable method to accurately detect when replication-competent HIV has been cleared is a major challenge in developing a cure. This study introduces a new approach called the HIVepsilon-seq (HIVε-seq) assay, which uses long-read sequencing technology and bioinformatics to scrutinize the HIV genome at the nucleotide level, distinguishing between defective and intact HIV. This study included 30 participants on antiretroviral therapy, including 17 women, and was able to discriminate between defective and genetically intact viruses at the single DNA strand level. The HIVε-seq assay is an improvement over previous methods, as it requires minimal sample, less specialized lab equipment, and offers a shorter turnaround time. The HIVε-seq assay offers a promising new tool for researchers to measure the intact HIV reservoir, advancing efforts towards finding a cure for this devastating disease.


Subjects
HIV Infections , HIV , Proviruses , Female , Humans , CD4-Positive T-Lymphocytes , Viral DNA/genetics , HIV Infections/drug therapy , HIV Infections/epidemiology , HIV Infections/virology , Nucleotides , Proviruses/genetics , Viral Load , DNA Sequence Analysis , Male , Sex Factors , HIV/genetics
4.
iScience ; 26(1): 105783, 2023 Jan 20.
Article in English | MEDLINE | ID: mdl-36514310

ABSTRACT

Neutralizing antibodies (NAbs) hold great promise for clinical interventions against SARS-CoV-2 variants of concern (VOCs). Understanding NAb epitope-dependent antiviral mechanisms is crucial for developing vaccines and therapeutics against VOCs. Here we characterized two potent NAbs, EH3 and EH8, isolated from an unvaccinated pediatric patient with exceptional plasma neutralization activity. EH3 and EH8 cross-neutralize the early VOCs and mediate strong Fc-dependent effector activity in vitro. Structural analyses of EH3 and EH8 in complex with the receptor-binding domain (RBD) revealed the molecular determinants of the epitope-driven protection and VOC evasion. While EH3 represents the prevalent IGHV3-53 NAb whose epitope substantially overlaps with the ACE2 binding site, EH8 recognizes a narrow epitope exposed in both RBD-up and RBD-down conformations. When tested in vivo, a single-dose prophylactic administration of EH3 fully protected stringent K18-hACE2 mice from lethal challenge with Delta VOC. Our study demonstrates that protective NAb responses converge in pediatric and adult SARS-CoV-2 patients.

5.
Dis Model Mech ; 15(11)2022 11 01.
Article in English | MEDLINE | ID: mdl-36317486

ABSTRACT

A series of well-regulated cellular and molecular events result in the compartmentalization of the anterior foregut into the esophagus and trachea. Disruption of the compartmentalization process leads to esophageal atresia/tracheoesophageal fistula (EA/TEF). The cause of EA/TEF remains largely unknown. Therefore, to mimic the early development of the esophagus and trachea, we differentiated induced pluripotent stem cells (iPSCs) from EA/TEF patients, and iPSCs and embryonic stem cells from healthy individuals into mature three-dimensional esophageal organoids. CXCR4, SOX17 and GATA4 expression was similar in both patient-derived and healthy endodermal cells. The expression of the key transcription factor SOX2 was significantly lower in the patient-derived anterior foregut. We also observed an abnormal expression of NKX2.1 (or NKX2-1) in the patient-derived mature esophageal organoids. At the anterior foregut stage, RNA sequencing revealed the critical genes GSTM1 and RAB37 to be significantly lower in the patient-derived anterior foregut. We therefore hypothesize that a transient dysregulation of SOX2 and the abnormal expression of NKX2.1 in patient-derived cells could be responsible for the abnormal foregut compartmentalization.


Subjects
Esophageal Atresia , Induced Pluripotent Stem Cells , Tracheoesophageal Fistula , Humans , Esophageal Atresia/genetics , Esophageal Atresia/complications , Induced Pluripotent Stem Cells/metabolism , Tracheoesophageal Fistula/etiology , Tracheoesophageal Fistula/metabolism , SOXB1 Transcription Factors/genetics
6.
J Assoc Med Microbiol Infect Dis Can ; 7(3): 283-291, 2022 Sep.
Article in English | MEDLINE | ID: mdl-36337604

ABSTRACT

BACKGROUND: COVID-19 is usually a time-limited disease. However, prolonged infections and reinfections can occur among immunocompromised patients. It can be difficult to distinguish a prolonged infection from a new one, especially when reinfection occurs early. METHODS: We report the case of a 57-year-old man infected with SARS-CoV-2 while undergoing chemotherapy for follicular lymphoma. He experienced prolonged symptomatic infection for 3 months despite a 5-day course of remdesivir and eventually deteriorated and died. RESULTS: Viral genome sequencing showed that his final deterioration was most likely due to reinfection. Serologic studies confirmed that the patient did not seroconvert. CONCLUSIONS: This case report highlights that reinfection can occur rapidly (62-67 d) among immunocompromised patients after a prolonged disease. We provide substantial proof of prolonged infection through repeated nucleic acid amplification tests and positive viral culture at day 56 of the disease course, and we put forward evidence of reinfection with viral genome sequencing.


BACKGROUND: COVID-19 is generally a time-limited disease. However, prolonged infections and reinfections can occur in immunocompromised patients. It can be difficult to distinguish a prolonged infection from a new infection, particularly when reinfection occurs rapidly. METHODS: The authors report the case of a 57-year-old man infected with SARS-CoV-2 while he was receiving chemotherapy for follicular lymphoma. He had a prolonged symptomatic infection lasting three months despite a five-day course of remdesivir. His condition eventually deteriorated and he died. RESULTS: Viral genome sequencing showed that the final deterioration of his condition was probably caused by reinfection. Serologic studies confirmed that he had not seroconverted. CONCLUSIONS: This case report establishes that rapid reinfection (after 62 to 67 days) is possible in immunocompromised patients following a long illness. The authors provide substantial evidence of prolonged infection through repeated nucleic acid amplification tests and positive viral cultures at day 56 of the disease course, and they present evidence of reinfection through viral genome sequencing.

7.
Antimicrob Agents Chemother ; 66(7): e0019822, 2022 07 19.
Article in English | MEDLINE | ID: mdl-35708323

ABSTRACT

In vitro selection of remdesivir-resistant severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) revealed the emergence of a V166L substitution, located outside of the polymerase active site of the Nsp12 protein, after 9 passages of a single lineage. V166L remained the only Nsp12 substitution after 17 passages (10 µM remdesivir), conferring a 2.3-fold increase in 50% effective concentration (EC50). When V166L was introduced into a recombinant SARS-CoV-2 virus, a 1.5-fold increase in EC50 was observed, indicating a high in vitro barrier to remdesivir resistance.


Subjects
COVID-19 Drug Treatment , SARS-CoV-2 , Adenosine Monophosphate/analogs & derivatives , Adenosine Monophosphate/chemistry , Alanine/analogs & derivatives , Alanine/metabolism , Antiviral Agents/chemistry , Humans
8.
Nat Biotechnol ; 40(7): 1026-1029, 2022 07.
Article in English | MEDLINE | ID: mdl-34980914

ABSTRACT

Nanopore sequencing depends on the FAST5 file format, which does not allow efficient parallel analysis. Here we introduce SLOW5, an alternative format engineered for efficient parallelization and acceleration of nanopore data analysis. Using the example of DNA methylation profiling of a human genome, analysis runtime is reduced from more than two weeks to approximately 10.5 h on a typical high-performance computer. SLOW5 is approximately 25% smaller than FAST5 and delivers consistent improvements on different computer architectures.
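
The figures quoted above imply a lower bound on the speed-up, which can be worked out directly. This is only arithmetic on the abstract's numbers: "more than two weeks" is treated as at least 14 days, and the 1000 GB FAST5 dataset size is a hypothetical value chosen purely to illustrate the 25% reduction.

```python
HOURS_PER_DAY = 24

# "More than two weeks" is taken as a lower bound of 14 days.
fast5_hours_lower_bound = 14 * HOURS_PER_DAY
slow5_hours = 10.5

# Minimum speed-up implied by the two runtimes.
min_speedup = fast5_hours_lower_bound / slow5_hours

# Hypothetical dataset size, used only to illustrate "approximately 25% smaller".
fast5_size_gb = 1000.0
slow5_size_gb = fast5_size_gb * (1 - 0.25)

print(f"speed-up of at least {min_speedup:.0f}x")
print(f"{fast5_size_gb:.0f} GB of FAST5 -> about {slow5_size_gb:.0f} GB of SLOW5")
```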


Subjects
Nanopore Sequencing , Nanopores , Data Analysis , Human Genome/genetics , High-Throughput Nucleotide Sequencing , Humans , DNA Sequence Analysis
9.
Cell Syst ; 13(2): 143-157.e3, 2022 02 16.
Article in English | MEDLINE | ID: mdl-34637888

ABSTRACT

The rapid, global dispersion of SARS-CoV-2 has led to the emergence of a diverse range of variants. Here, we describe how the mutational landscape of SARS-CoV-2 has shaped HLA-restricted T cell immunity at the population level during the first year of the pandemic. We analyzed a total of 330,246 high-quality SARS-CoV-2 genome assemblies, sampled across 143 countries and all major continents from December 2019 to December 2020 before mass vaccination or the rise of the Delta variant. We observed that proline residues are preferentially removed from the proteome of prevalent mutants, leading to a predicted global loss of SARS-CoV-2 T cell epitopes in individuals expressing HLA-B alleles of the B7 supertype family; this is largely driven by a dominant C-to-U mutation type at the RNA level. These results indicate that B7-supertype-associated epitopes, including the most immunodominant ones, were more likely to escape CD8+ T cell immunosurveillance during the first year of the pandemic.
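
The link between a C-to-U mutation bias and proline loss follows from the standard genetic code: all four proline codons have the form CCN, so a C-to-U change at the first or second codon position replaces proline with serine or leucine. The sketch below enumerates those outcomes. It illustrates the genetic-code reasoning only and is not the study's analysis; the helper name `c_to_u_outcomes` and the reduced codon table are illustrative choices.

```python
# Subset of the standard genetic code (RNA alphabet) needed for this example.
CODON_TABLE = {
    "CCU": "Pro", "CCC": "Pro", "CCA": "Pro", "CCG": "Pro",
    "UCU": "Ser", "UCC": "Ser", "UCA": "Ser", "UCG": "Ser",
    "CUU": "Leu", "CUC": "Leu", "CUA": "Leu", "CUG": "Leu",
}
PROLINE_CODONS = ("CCU", "CCC", "CCA", "CCG")


def c_to_u_outcomes(codon):
    """Return (mutant codon, amino acid) for every single C-to-U change."""
    outcomes = []
    for position, base in enumerate(codon):
        if base == "C":
            mutant = codon[:position] + "U" + codon[position + 1:]
            outcomes.append((mutant, CODON_TABLE[mutant]))
    return outcomes


for codon in PROLINE_CODONS:
    print(codon, c_to_u_outcomes(codon))
```

Only a third-position change (CCC to CCU) is synonymous; every first- or second-position C-to-U change in a proline codon removes the proline.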


Subjects
COVID-19 , T-Lymphocyte Epitopes , SARS-CoV-2 , COVID-19/virology , T-Lymphocyte Epitopes/genetics , T-Lymphocyte Epitopes/immunology , Humans , Mutation , SARS-CoV-2/genetics
10.
PLoS One ; 16(12): e0260714, 2021.
Article in English | MEDLINE | ID: mdl-34855869

ABSTRACT

The first confirmed case of COVID-19 in Quebec, Canada, occurred at Verdun Hospital on February 25, 2020. A month later, a localized outbreak was observed at this hospital. We performed tiled amplicon whole genome nanopore sequencing on nasopharyngeal swabs from all SARS-CoV-2 positive samples from 31 March to 17 April 2020 in 2 local hospitals to assess viral diversity (unknown at the time in Quebec) and potential associations with clinical outcomes. We report 264 viral genomes from 242 individuals (both staff and patients) with associated clinical features and outcomes, as well as longitudinal samples and technical replicates. Viral lineage assessment identified multiple subclades in both hospitals, with a predominant subclade in the Verdun outbreak, indicative of hospital-acquired transmission. Dimensionality reduction identified two subclades with mutations of clinical interest, namely in the Spike protein, that evaded supervised lineage assignment methods, including the Pangolin and NextClade tools. We also report that certain symptoms (headache, myalgia and sore throat) are significantly associated with favorable patient outcomes. Our findings demonstrate the strength of unsupervised, data-driven analyses whilst suggesting that caution should be used when employing supervised genomic workflows, particularly during the early stages of a pandemic.


Subjects
COVID-19/virology , Cross Infection/virology , Disease Outbreaks , Viral Genome/genetics , SARS-CoV-2/genetics , Adolescent , Adult , Aged , Aged 80 and over , COVID-19/epidemiology , COVID-19/mortality , Child , Preschool Child , Cross Infection/epidemiology , Disease Outbreaks/statistics & numerical data , Female , Haplotypes/genetics , Humans , Male , Middle Aged , Phylogeny , Quebec/epidemiology , SARS-CoV-2/pathogenicity , RNA Sequence Analysis , Treatment Outcome , Young Adult
11.
Cell Rep ; 36(12): 109722, 2021 09 21.
Article in English | MEDLINE | ID: mdl-34551299

ABSTRACT

DNA replication timing and three-dimensional (3D) genome organization are associated with distinct epigenome patterns across large domains. However, whether alterations in the epigenome, in particular cancer-related DNA hypomethylation, affect higher-order levels of genome architecture is still unclear. Here, using Repli-Seq, single-cell Repli-Seq, and Hi-C, we show that genome-wide methylation loss is associated with both concordant loss of replication timing precision and deregulation of 3D genome organization. Notably, we find distinct disruption in 3D genome compartmentalization, striking gains in cell-to-cell replication timing heterogeneity and loss of allelic replication timing in cancer hypomethylation models, potentially through the gene deregulation of DNA replication and genome organization pathways. Finally, we identify ectopic H3K4me3-H3K9me3 domains from across large hypomethylated domains, where late replication is maintained, which we purport serves to protect against catastrophic genome reorganization and aberrant gene transcription. Our results highlight a potential role for the methylome in the maintenance of 3D genome regulation.


Subjects
DNA Methylation , DNA Replication Timing/physiology , Human Genome , Tumor Cell Line , Chromatin/metabolism , DNA (Cytosine-5-)-Methyltransferase 1/genetics , DNA (Cytosine-5-)-Methyltransferase 1/metabolism , Genetic Databases , Gene Expression , Histones/metabolism , Humans , DNA Sequence Analysis/methods
12.
J Vis Exp ; (173)2021 07 12.
Article in English | MEDLINE | ID: mdl-34309607

ABSTRACT

Transgenic mouse models have proved to be powerful tools in studying various aspects of human neurological disorders, including epilepsy. The SCN1A-associated genetic epilepsies comprise a wide spectrum of seizure disorders with incomplete penetrance and clinical variability. SCN1A mutations can result in a large variety of seizure phenotypes, ranging from simple, self-limited fever-associated febrile seizures (FS) and moderate-level genetic epilepsy with febrile seizures plus (GEFS+) to more severe Dravet Syndrome (DS). Although FS are commonly seen in children below 6-7 years of age who do not have genetic epilepsy, FS in GEFS+ patients continue to occur into adulthood. Traditionally, experimental FS have been induced in mice by exposing the animal to a stream of dry air or heating lamps, and the rate of change in body temperature is often not well controlled. Here, we describe a custom-built heating chamber, with a plexiglass front, that is fitted with a digital temperature controller and a heater-equipped electric fan, which can send heated forced air into the test arena in a temperature-controlled manner. The body temperature of a mouse placed in the chamber, monitored through a rectal probe, can be increased to 40-42 °C in a reproducible manner by increasing the temperature inside the chamber. Continual visual monitoring of the animals during the heating period demonstrates induction of heat-induced seizures in mice carrying an FS mutation at a body temperature that does not elicit behavioral seizures in wild-type littermates. Animals can be easily removed from the chamber and placed on a cooling pad to rapidly return body temperature to normal. This method provides a simple, rapid, and reproducible screening protocol for the occurrence of heat-induced seizures in epilepsy mouse models.


Subjects
Epilepsy , NAV1.1 Voltage-Gated Sodium Channel , Adult , Animals , Epilepsy/genetics , Hot Temperature , Humans , Mice , Transgenic Mice , Mutation , NAV1.1 Voltage-Gated Sodium Channel/genetics , Phenotype , Seizures/etiology
13.
medRxiv ; 2021 Jun 01.
Article in English | MEDLINE | ID: mdl-34100030

ABSTRACT

The first confirmed case of COVID-19 in Quebec, Canada, occurred at Verdun Hospital on February 25, 2020. A month later, a localized outbreak was observed at this hospital. We performed tiled amplicon whole genome nanopore sequencing on nasopharyngeal swabs from all SARS-CoV-2 positive samples from 31 March to 17 April 2020 in 2 local hospitals to assess the viral diversity of the outbreak. We report 264 viral genomes from 242 individuals (both staff and patients) with associated clinical features and outcomes, as well as longitudinal samples, technical replicates and the first publicly disseminated SARS-CoV-2 genomes in Quebec. Viral lineage assessment identified multiple subclades in both hospitals, with a predominant subclade in the Verdun outbreak, indicative of hospital-acquired transmission. Dimensionality reduction identified two subclades that evaded supervised lineage assignment methods, including Pangolin, and identified certain symptoms (headache, myalgia and sore throat) that are significantly associated with favorable patient outcomes. We also address certain limitations of standard SARS-CoV-2 bioinformatics procedures, notably when presented with multiple viral haplotypes.

14.
BMC Genomics ; 22(1): 148, 2021 Mar 02.
Article in English | MEDLINE | ID: mdl-33653280

ABSTRACT

BACKGROUND: Hepatitis C (HCV) and many other RNA viruses exist as rapidly mutating quasi-species populations in a single infected host. High throughput characterization of full genome, within-host variants is still not possible despite advances in next generation sequencing. This limitation constrains viral genomic studies that depend on accurate identification of hemi-genome or whole genome, within-host variants, especially those occurring at low frequencies. With the advent of third generation long read sequencing technologies, including Oxford Nanopore Technology (ONT) and PacBio platforms, this problem is potentially surmountable. ONT is particularly attractive in this regard due to the portable nature of the MinION sequencer, which makes real-time sequencing in remote and resource-limited locations possible. However, this technology (termed here 'nanopore sequencing') has a comparatively high technical error rate. The present study aimed to assess the utility, accuracy and cost-effectiveness of nanopore sequencing for HCV genomes. We also introduce a new bioinformatics tool (Nano-Q) to differentiate within-host variants from nanopore sequencing. RESULTS: The Nanopore platform, when the coverage exceeded 300 reads, generated comparable consensus sequences to Illumina sequencing. Using HCV Envelope plasmids (~ 1800 nt) mixed in known proportions, the capacity of nanopore sequencing to reliably identify variants with an abundance as low as 0.1% was demonstrated, provided the autologous reference sequence was available to identify the matching reads. Successful pooling and nanopore sequencing of 52 samples from patients with HCV infection demonstrated its cost effectiveness (AUD$ 43 per sample with nanopore sequencing versus $100 with paired-end short read technology). 
The Nano-Q tool successfully separated between-host sequences, including those from the same subtype, by bulk sorting and phylogenetic clustering without an autologous reference sequence (using only a subtype-specific generic reference). The pipeline also identified within-host viral variants and their abundance when the parameters were appropriately adjusted. CONCLUSION: Cost effective HCV whole genome sequencing and within-host variant identification without haplotype reconstruction are potential advantages of nanopore sequencing.
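
The cost comparison and the low-frequency detection claim above can be put into numbers. The sketch below assumes both per-sample prices are in the same currency (the abstract labels only the first as AUD), and the sequencing depth of 10,000 reads is a hypothetical value, not one reported in the study.

```python
SAMPLES = 52
NANOPORE_COST_PER_SAMPLE = 43     # AUD, from the abstract
SHORT_READ_COST_PER_SAMPLE = 100  # assumed to be AUD as well

nanopore_total = SAMPLES * NANOPORE_COST_PER_SAMPLE
short_read_total = SAMPLES * SHORT_READ_COST_PER_SAMPLE
saving_fraction = 1 - NANOPORE_COST_PER_SAMPLE / SHORT_READ_COST_PER_SAMPLE


def expected_variant_reads(depth, frequency):
    """Expected number of reads that carry a variant present at `frequency`."""
    return depth * frequency


print(f"52 samples: {nanopore_total} vs {short_read_total} "
      f"({saving_fraction:.0%} cheaper per sample)")
# Hypothetical depth: a 0.1% variant yields about 10 supporting reads at 10,000x.
print(expected_variant_reads(10_000, 0.001))
```

The second calculation shows why very deep coverage is needed before a 0.1% variant is supported by more than a handful of reads.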


Subjects
Hepatitis C , Nanopores , High-Throughput Nucleotide Sequencing , Humans , Phylogeny , DNA Sequence Analysis , Technology , Whole Genome Sequencing
15.
Sci Rep ; 10(1): 18196, 2020 10 23.
Article in English | MEDLINE | ID: mdl-33097792

ABSTRACT

Current methods for dengue virus (DENV) genome amplification amplify parts of the genome in at least 5 overlapping segments and then combine the output to characterize a full genome. This process is laborious, costly and requires at least 10 primers per serotype, thus increasing the likelihood of PCR bias. We introduce an assay to amplify near full-length dengue virus genomes as intact molecules, sequence these amplicons with third generation "nanopore" technology without fragmenting and use the sequence data to differentiate within-host viral variants with a bioinformatics tool (Nano-Q). The new assay successfully generated near full-length amplicons from DENV serotypes 1, 2 and 3 samples which were sequenced with nanopore technology. Consensus DENV sequences generated by nanopore sequencing had over 99.5% pairwise sequence similarity to Illumina generated counterparts provided the coverage was > 100 with both platforms. Maximum likelihood phylogenetic trees generated from nanopore consensus sequences were able to reproduce the exact trees made from Illumina sequencing with a conservative 99% bootstrapping threshold (after 1000 replicates and 10% burn-in). Pairwise genetic distances of within-host variants identified from the Nano-Q tool were smaller than those of between-host variants, thus enabling the phylogenetic segregation of variants from the same host.
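
Pairwise sequence similarity between two consensus sequences, as quoted above, can be computed in several ways; the abstract does not state which one was used. The following is a minimal sketch for two already-aligned sequences of equal length. Counting a gap opposite a base as a mismatch and skipping columns where both sequences have a gap are assumptions of this example.

```python
def pairwise_identity(seq_a, seq_b):
    """Percent identity between two aligned sequences of equal length.

    Assumptions of this sketch: columns where both sequences have a gap
    are ignored, and a gap opposite a base counts as a mismatch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        if base_a == "-" and base_b == "-":
            continue
        compared += 1
        if base_a == base_b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# One mismatch in ten aligned positions.
identity = pairwise_identity("ACGTACGTAC", "ACGTACGTAT")
```

On a genome of roughly 10,700 nt, a threshold of 99.5% would tolerate about 50 differing positions; that genome length is a typical figure for dengue virus rather than a value from the abstract.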


Subjects
Dengue Virus/genetics , Viral Genome , High-Throughput Nucleotide Sequencing/methods , Humans , Likelihood Functions , Phylogeny
16.
Genome Res ; 30(9): 1345-1353, 2020 09.
Article in English | MEDLINE | ID: mdl-32907883

ABSTRACT

Nanopore sequencing enables direct measurement of RNA molecules without conversion to cDNA, thus opening the gates to a new era for RNA biology. However, the lack of molecular barcoding of direct RNA nanopore sequencing data sets severely affects the applicability of this technology to biological samples, where RNA availability is often limited. Here, we provide the first experimental protocol and associated algorithm to barcode and demultiplex direct RNA nanopore sequencing data sets. Specifically, we present a novel and robust approach to accurately classify raw nanopore signal data by transforming current intensities into images or arrays of pixels, followed by classification using a deep learning algorithm. We demonstrate the power of this strategy by developing the first experimental protocol for barcoding and demultiplexing direct RNA sequencing libraries. Our method, DeePlexiCon, can classify 93% of reads with 95.1% accuracy or 60% of reads with 99.9% accuracy. The availability of an efficient and simple multiplexing strategy for native RNA sequencing will improve the cost-effectiveness of this technology, as well as facilitate the analysis of lower-input biological samples. Overall, our work exemplifies the power, simplicity, and robustness of signal-to-image conversion for nanopore data analysis using deep learning.
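
The abstract does not name the signal-to-image transform, so the sketch below uses one established option, the Gramian angular summation field (GASF), as an assumed stand-in. It rescales a one-dimensional series of current intensities to [-1, 1], maps each value to an angle, and builds a square matrix whose entries are the cosine of the summed angles. The function name and the toy current values are illustrative only.

```python
import math


def gramian_angular_summation_field(signal):
    """Encode a 1D signal as a square matrix of cos(phi_i + phi_j) values.

    Assumed stand-in for the signal-to-image step; not necessarily the
    transform used by DeePlexiCon.
    """
    low, high = min(signal), max(signal)
    if high == low:
        raise ValueError("signal must not be constant")
    # Rescale to [-1, 1] so that every value has a well-defined arccosine.
    scaled = [2 * (value - low) / (high - low) - 1 for value in signal]
    scaled = [max(-1.0, min(1.0, value)) for value in scaled]
    angles = [math.acos(value) for value in scaled]
    return [[math.cos(a + b) for b in angles] for a in angles]


# Three toy current intensities (arbitrary units) become a 3 x 3 "image".
gasf = gramian_angular_summation_field([80.0, 95.0, 110.0])
```

The resulting matrix is symmetric and can be fed to an image classifier in the same way as a greyscale picture.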


Subjects
Deep Learning , Nanopore Sequencing/methods , RNA Sequence Analysis/methods , Algorithms
17.
BMC Bioinformatics ; 21(1): 343, 2020 Aug 05.
Article in English | MEDLINE | ID: mdl-32758139

ABSTRACT

BACKGROUND: Nanopore sequencing enables portable, real-time sequencing applications, including point-of-care diagnostics and in-the-field genotyping. Achieving these outcomes requires efficient bioinformatic algorithms for the analysis of raw nanopore signal data. However, comparing raw nanopore signals to a biological reference sequence is a computationally complex task. The dynamic programming algorithm called Adaptive Banded Event Alignment (ABEA) is a crucial step in polishing sequencing data and identifying non-standard nucleotides, such as measuring DNA methylation. Here, we parallelise and optimise an implementation of the ABEA algorithm (termed f5c) to efficiently run on heterogeneous CPU-GPU architectures. RESULTS: By optimising memory, computations and load balancing between CPU and GPU, we demonstrate how f5c can perform ∼3-5 × faster than an optimised version of the original CPU-only implementation of ABEA in the Nanopolish software package. We also show that f5c enables DNA methylation detection on-the-fly using an embedded System on Chip (SoC) equipped with GPUs. CONCLUSIONS: Our work not only demonstrates that complex genomics analyses can be performed on lightweight computing systems, but also benefits High-Performance Computing (HPC). The associated source code for f5c along with GPU optimised ABEA is available at https://github.com/hasindu2008/f5c .
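
To illustrate why banding makes signal-to-reference alignment cheaper, the sketch below runs dynamic time warping restricted to a fixed-width band around the scaled diagonal, so only a fraction of the dynamic programming table is filled. This is a deliberately simplified stand-in: the band here is fixed rather than adaptive, the cost is a plain absolute difference, and none of it reflects the f5c or Nanopolish implementation.

```python
import math


def banded_dtw(events, reference, band_width):
    """Dynamic time warping cost, computed only inside a fixed diagonal band.

    Simplified illustration (fixed band, absolute-difference cost); returns
    math.inf when the band is too narrow to connect both ends.
    """
    n, m = len(events), len(reference)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        center = (i * m) // n  # diagonal position scaled to the reference
        j_low = max(1, center - band_width)
        j_high = min(m, center + band_width)
        for j in range(j_low, j_high + 1):
            step = abs(events[i - 1] - reference[j - 1])
            best_previous = min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
            cost[i][j] = step + best_previous
    return cost[n][m]


# A repeated event still aligns to the reference at zero cost.
alignment_cost = banded_dtw([1.0, 1.0, 2.0, 3.0], [1.0, 2.0, 3.0], band_width=1)
```

With a band of width w, the work per event is proportional to w rather than to the full reference length, which is the property that makes this family of algorithms attractive for GPUs and small devices.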


Subjects
Computer Graphics , Nanopores , Computer-Assisted Signal Processing , Algorithms , Computational Biology , Databases as Topic , Human Genome , Humans , Sequence Analysis
18.
RNA ; 26(9): 1104-1117, 2020 09.
Article in English | MEDLINE | ID: mdl-32393525

ABSTRACT

Noncoding RNA has a proven ability to direct and regulate chromatin modifications by acting as scaffolds between DNA and histone-modifying complexes. However, it is unknown if ncRNA plays any role in DNA replication and epigenome maintenance, including histone eviction and reinstallment of histone modifications after genome duplication. Isolation of nascent chromatin has identified a large number of RNA-binding proteins in addition to unknown components of the replication and epigenetic maintenance machinery. Here, we isolated and characterized long and short RNAs associated with nascent chromatin at active replication forks and tracked RNA composition during chromatin maturation across the cell cycle. Shortly after fork passage, GA-rich-, alpha- and TElomeric Repeat-containing RNAs (TERRA) are associated with replicated DNA. These repeat-containing RNAs arise from loci undergoing replication, suggesting an interaction in cis. Post-replication during chromatin maturation, and even after mitosis in G1, the repeats remain enriched on DNA. This suggests that specific types of repeat RNAs are transcribed shortly after DNA replication and stably associate with their loci of origin throughout the cell cycle. The presented method and data enable studies of RNA interactions with replication forks and post-replicative chromatin and provide insights into how repeat RNAs and their engagement with chromatin are regulated with respect to DNA replication and across the cell cycle.


Subjects
DNA Replication/genetics , DNA/genetics , Post-Translational Protein Processing/genetics , RNA/genetics , Cell Cycle/genetics , Tumor Cell Line , Chromatin/genetics , HeLa Cells , Histones/genetics , Humans
20.
Gigascience ; 9(4)2020 04 01.
Article in English | MEDLINE | ID: mdl-32236524

ABSTRACT

BACKGROUND: The German Shepherd Dog (GSD) is one of the most common breeds on earth and has been bred for its utility and intelligence. It is often the first choice for police and military work, as well as protection, disability assistance, and search-and-rescue. Yet, GSDs are well known to be susceptible to a range of genetic diseases that can interfere with their training. Such diseases are of particular concern when they occur later in life, and fully trained animals are not able to continue their duties. FINDINGS: Here, we provide the draft genome sequence of a healthy German Shepherd female as a reference for future disease and evolutionary studies. We generated this improved canid reference genome (CanFam_GSD) utilizing a combination of Pacific Biosciences, Oxford Nanopore, 10X Genomics, Bionano, and Hi-C technologies. The GSD assembly is ∼80 times as contiguous as the current canid reference genome (20.9 vs 0.267 Mb contig N50), containing far fewer gaps (306 vs 23,876) and fewer scaffolds (429 vs 3,310) than the current canid reference genome CanFam v3.1. Two chromosomes (4 and 35) are assembled into single scaffolds with no gaps. BUSCO analyses of the genome assembly results show that 93.0% of the conserved single-copy genes are complete in the GSD assembly compared with 92.2% for CanFam v3.1. Homology-based gene annotation increases this value to ∼99%. Detailed examination of the evolutionarily important pancreatic amylase region reveals that there are most likely 7 copies of the gene, indicative of a duplication of 4 ancestral copies and the disruption of 1 copy. CONCLUSIONS: GSD genome assembly and annotation were produced with major improvement in completeness, continuity, and quality over the existing canid reference. This resource will enable further research related to canine diseases, the evolutionary relationships of canids, and other aspects of canid biology.
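
The contiguity comparison above rests on the contig N50 statistic: the length of the contig at which, going from longest to shortest, the running total first reaches half of the assembly size. A minimal sketch of that calculation follows, together with the ratio behind the "∼80 times" figure. The toy contig lengths are invented for illustration.

```python
def n50(lengths):
    """Contig N50: length at which the cumulative sum reaches half the total."""
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy assembly (invented lengths): total 20, and 6 + 5 = 11 passes the halfway mark.
toy_n50 = n50([2, 3, 4, 5, 6])

# Ratio of the two contig N50 values quoted in the abstract, in Mb.
contiguity_ratio = 20.9 / 0.267
print(toy_n50, round(contiguity_ratio, 1))
```

The ratio comes out at roughly 78, consistent with the rounded "∼80 times" in the text.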


Subjects
Chromosomes/genetics , Genome/genetics , DNA Sequence Analysis/methods , Whole Genome Sequencing/methods , Animals , Dogs , Genomics , Molecular Sequence Annotation