Results 1 - 20 of 50
1.
Eur J Neurosci ; 2024 Apr 04.
Article in English | MEDLINE | ID: mdl-38576196

ABSTRACT

Detection and measurement of amyloid-beta (Aβ) in the brain is a key factor for early identification and diagnosis of Alzheimer's disease (AD). We aimed to develop a deep learning model to predict Aβ cerebrospinal fluid (CSF) concentration directly from amyloid PET images, independent of tracers, brain reference regions or preselected regions of interest. We used 1870 Aβ PET images and CSF measurements to train and validate a convolutional neural network ("ArcheD"). We evaluated the ArcheD performance in relation to episodic memory and the standardized uptake value ratio (SUVR) of cortical Aβ. We also compared the relevance of brain regions for the model's CSF prediction within clinical-based and biological-based classifications. ArcheD-predicted Aβ CSF values correlated with measured Aβ CSF values (r = 0.92; q < 0.01), SUVR (rAV45 = -0.64, rFBB = -0.69; q < 0.01) and episodic memory measures (0.33 < r < 0.44; q < 0.01). For both classifications, cerebral white matter significantly contributed to CSF prediction (q < 0.01), specifically in non-symptomatic and early stages of AD. However, in late-stage disease, the brain stem, subcortical areas, cortical lobes, limbic lobe and basal forebrain made more significant contributions (q < 0.01). Considering cortical grey matter separately, the parietal lobe was the strongest predictor of CSF amyloid levels in those with prodromal or early AD, while the temporal lobe played a more crucial role for those with AD. In summary, ArcheD reliably predicted Aβ CSF concentration from Aβ PET scans, offering potential clinical utility for Aβ level determination.

2.
bioRxiv ; 2023 Oct 27.
Article in English | MEDLINE | ID: mdl-37425778

ABSTRACT

Detection and measurement of amyloid-beta (Aβ) aggregation in the brain is a key factor for early identification and diagnosis of Alzheimer's disease (AD). We aimed to develop a deep learning model to predict Aβ cerebrospinal fluid (CSF) concentration directly from amyloid PET images, independent of tracers, brain reference regions or preselected regions of interest. We used 1870 Aβ PET images and CSF measurements to train and validate a convolutional neural network ("ArcheD"). We evaluated the ArcheD performance in relation to episodic memory and the standardized uptake value ratio (SUVR) of cortical Aβ. We also compared the relevance of brain regions for the model's CSF prediction within clinical-based and biological-based classifications. ArcheD-predicted Aβ CSF values correlated strongly with measured Aβ CSF values (r = 0.81; p < 0.001) and showed correlations with SUVR and episodic memory measures in all participants except in those with AD. For both clinical and biological classifications, cerebral white matter significantly contributed to CSF prediction (q < 0.01), specifically in non-symptomatic and early stages of AD. However, in late-stage disease, the brain stem, subcortical areas, cortical lobes, limbic lobe, and basal forebrain made more significant contributions (q < 0.01). Considering cortical gray matter separately, the parietal lobe was the strongest predictor of CSF amyloid levels in those with prodromal or early AD, while the temporal lobe played a more crucial role for those with AD. In summary, ArcheD reliably predicted Aβ CSF concentration from Aβ PET scans, offering potential clinical utility for Aβ level determination and early AD detection.

3.
Semin Hematol ; 60(3): 132-141, 2023 07.
Article in English | MEDLINE | ID: mdl-37455222

ABSTRACT

Liquid biopsies utilizing plasma circulating tumor DNA (ctDNA) are anticipated to revolutionize decision-making in cancer care. In the field of lymphomas, ctDNA-based blood tests represent the forefront of clinically applicable tools to harness decades of genomic research for disease profiling, quantification, and detection. More recently, the discovery of nonrandom fragmentation patterns in cell-free DNA (cfDNA) has opened another avenue of liquid biopsy research beyond mutational interrogation of ctDNA. Through examination of structural features, nucleotide content, and genomic distribution of massive numbers of plasma cfDNA molecules, the study of fragmentomics aims at identifying new tools that augment existing ctDNA-based analyses and discover new ways to profile cancer from blood tests. Indeed, the characterization of aberrant lymphoma ctDNA fragment patterns and harnessing them with powerful machine-learning techniques are expected to unleash the potential of nonmutant molecules for liquid biopsy purposes. In this article, we review cfDNA fragmentomics as an emerging approach in the ctDNA research of B-cell lymphomas. We summarize the biology behind the formation of cfDNA fragment patterns and discuss the preanalytical and technical limitations faced with current methodologies. Then we go through the advances in the field of lymphomas and envision what other noninvasive tools based on fragment characteristics could be explored. Last, we place fragmentomics as one of the facets of ctDNA analyses in emerging multiview and multiomics liquid biopsies. We pay attention to the unknowns in the field of cfDNA fragmentation biology that warrant further mechanistic investigation to provide rational background for the development of these precision oncology tools and understanding of their limitations.


Subjects
Cell-Free Nucleic Acids, B-Cell Lymphoma, Lymphoma, Neoplasms, Humans, Precision Medicine, Liquid Biopsy/methods, B-Cell Lymphoma/diagnosis, B-Cell Lymphoma/genetics, Mutation, Tumor Biomarkers/genetics
4.
Genome Med ; 15(1): 47, 2023 Jul 07.
Article in English | MEDLINE | ID: mdl-37420249

ABSTRACT

BACKGROUND: Cancer genome sequencing enables accurate classification of tumours and tumour subtypes. However, prediction performance is still limited using exome-only sequencing and for tumour types with low somatic mutation burden such as many paediatric tumours. Moreover, the ability to leverage deep representation learning in discovery of tumour entities remains unknown. METHODS: We introduce here Mutation-Attention (MuAt), a deep neural network to learn representations of simple and complex somatic alterations for prediction of tumour types and subtypes. In contrast to many previous methods, MuAt utilizes the attention mechanism on individual mutations instead of aggregated mutation counts. RESULTS: We trained MuAt models on 2587 whole cancer genomes (24 tumour types) from the Pan-Cancer Analysis of Whole Genomes (PCAWG) and 7352 cancer exomes (20 types) from the Cancer Genome Atlas (TCGA). MuAt achieved prediction accuracy of 89% for whole genomes and 64% for whole exomes, and a top-5 accuracy of 97% and 90%, respectively. MuAt models were found to be well-calibrated and to perform well in three independent whole cancer genome cohorts with 10,361 tumours in total. We show MuAt to be able to learn clinically and biologically relevant tumour entities including acral melanoma, SHH-activated medulloblastoma, SPOP-associated prostate cancer, microsatellite instability, POLE proofreading deficiency, and MUTYH-associated pancreatic endocrine tumours without these tumour subtypes and subgroups being provided as training labels. Finally, scrutiny of MuAt attention matrices revealed both ubiquitous and tumour-type specific patterns of simple and complex somatic mutations. CONCLUSIONS: Integrated representations of somatic alterations learnt by MuAt were able to accurately identify histological tumour types and identify tumour entities, with potential to impact precision cancer medicine.


Subjects
Mutation, Neoplasms, Neoplasms/genetics, Neoplasms/pathology, Humans, Deep Learning, Benchmarking
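
The top-5 accuracy quoted in the abstract above counts a prediction as correct when the true tumour type is among the five highest-scoring classes. A minimal sketch of that metric follows; the function name and the toy scores are illustrative assumptions and are not taken from the MuAt code.

```python
def top_k_accuracy(scores, labels, k=5):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices ordered by descending score; ties keep index order.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Hypothetical scores for 3 tumours over 4 classes (not real data).
example_scores = [
    [0.1, 0.6, 0.2, 0.1],
    [0.4, 0.3, 0.2, 0.1],
    [0.25, 0.25, 0.1, 0.4],
]
example_labels = [1, 1, 2]
```

With `k=1` this reduces to ordinary accuracy, which is why top-5 figures are always at least as high as the plain accuracy reported alongside them.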
5.
Cytometry A ; 103(10): 807-817, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37276178

ABSTRACT

Imaging flow cytometry (IFC) combines flow cytometry with microscopy, allowing rapid characterization of cellular and molecular properties via high-throughput single-cell fluorescent imaging. However, fluorescent labeling is costly and time-consuming. We present a computational method called DeepIFC based on the Inception U-Net neural network architecture, able to generate fluorescent marker images and learn morphological features from IFC brightfield and darkfield images. Furthermore, the DeepIFC workflow identifies cell types from the generated fluorescent images and visualizes the single-cell features generated in a 2D space. We demonstrate that rarer cell types are predicted well when a balanced data set is used to train the model, and the model is able to recognize red blood cells not seen during model training as a distinct entity. In summary, DeepIFC allows accurate cell reconstruction, typing and recognition of unseen cell types from brightfield and darkfield images via virtual fluorescent labeling.

6.
BMC Bioinformatics ; 23(1): 522, 2022 Dec 06.
Article in English | MEDLINE | ID: mdl-36474143

ABSTRACT

BACKGROUND: A deep understanding of carcinogenesis at the DNA level underpins many advances in cancer prevention and treatment. Mutational signatures provide a breakthrough conceptualisation, as well as an analysis framework, that can be used to build such understanding. They capture somatic mutation patterns and at best identify their causes. Most studies in this context have focused on an inherently additive analysis, e.g. by non-negative matrix factorization, where the mutations within a cancer sample are explained by a linear combination of independent mutational signatures. However, other recent studies show that the mutational signatures exhibit non-additive interactions. RESULTS: We carefully analysed such additive model fits from the PCAWG study cataloguing mutational signatures as well as their activities across thousands of cancers. Our analysis identified systematic and non-random structure of residuals that is left unexplained by the additive model. We used hierarchical clustering to identify cancer subsets with similar residual profiles to show that both systematic mutation count overestimation and underestimation take place. We propose an extension to the additive mutational signature model (multiplicatively acting modulatory processes) and develop a maximum-likelihood framework to identify such modulatory mutational signatures. The augmented model is expressive enough to almost fully remove the observed systematic residual patterns. CONCLUSION: We suggest the modulatory processes biologically relate to sample-specific DNA repair propensities with cancer- or tissue-type-specific profiles. Overall, our results identify an interesting direction in which to expand signature analysis.


Subjects
Neoplasms, Humans, Mutation, Neoplasms/genetics
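
The additive model described in the abstract above explains a sample's mutation counts as a linear combination of signatures, and the residuals are whatever that combination leaves unexplained. The sketch below illustrates only this additive step under simplifying assumptions; the function names and toy numbers are hypothetical, and the multiplicative modulatory extension proposed by the authors is not reproduced here.

```python
def additive_expected_counts(signatures, exposures):
    """Expected count per mutation category: sum over k of exposures[k] * signatures[k][c]."""
    n_categories = len(signatures[0])
    return [
        sum(e * sig[c] for e, sig in zip(exposures, signatures))
        for c in range(n_categories)
    ]


def residuals(observed, expected):
    """Observed minus expected counts; positive values mean the model underestimates."""
    return [o - x for o, x in zip(observed, expected)]


# Toy example: 2 signatures over 3 mutation categories (made-up numbers).
toy_signatures = [[0.5, 0.25, 0.25], [0.0, 0.5, 0.5]]
toy_exposures = [100.0, 40.0]
toy_observed = [50, 60, 30]

toy_expected = additive_expected_counts(toy_signatures, toy_exposures)
toy_residuals = residuals(toy_observed, toy_expected)
```

In this toy sample the second category is underestimated and the third overestimated by the same amount, the kind of systematic pattern that a purely additive fit cannot absorb.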
7.
Front Mol Biosci ; 9: 974799, 2022.
Article in English | MEDLINE | ID: mdl-36310597

ABSTRACT

Personalised medicine (PM) presents a great opportunity to improve the future of individualised healthcare. Recent advances in -omics technologies have led to unprecedented efforts characterising the biology and molecular mechanisms that underlie the development and progression of a wide array of complex human diseases, supporting further development of PM. This article reflects the outcome of the 2021 EATRIS-Plus Multi-omics Stakeholder Group workshop organised to 1) outline a global overview of common promises and challenges that key European stakeholders are facing in the field of multi-omics research, 2) assess the potential of new technologies, such as artificial intelligence (AI), and 3) establish an initial dialogue between key initiatives in this space. Our focus is on the alignment of agendas of European initiatives in multi-omics research and the centrality of patients in designing solutions that have the potential to advance PM in long-term healthcare strategies.

8.
Front Genet ; 13: 913163, 2022.
Article in English | MEDLINE | ID: mdl-35873465

ABSTRACT

Microsatellite sequences are particularly prone to slippage during DNA replication, forming insertion-deletion loops that, if left unrepaired, result in de novo mutations (expansions or contractions of the repeat array). Mismatch repair (MMR) is a critical DNA repair mechanism that corrects these insertion-deletion loops, thereby maintaining microsatellite stability. MMR deficiency gives rise to the molecular phenotype known as microsatellite instability (MSI). By sequencing MMR-proficient and -deficient (Mlh1+/+ and Mlh1-/-) single-cell exomes from mouse T cells, we reveal here several previously unrecognized features of in vivo MSI. Specifically, mutational dynamics of insertions and deletions were different on multiple levels. Factors associated with the propensity of mononucleotide microsatellites to insertions versus deletions were: microsatellite length, nucleotide composition of the mononucleotide tract, gene length and transcriptional status, as well as replication timing. Here, we show on a single-cell level that deletions (the predominant MSI type in MMR-deficient cells) are preferentially associated with longer A/T tracts, long or transcribed genes and later-replicating genes.

9.
Sci Rep ; 12(1): 10670, 2022 06 23.
Article in English | MEDLINE | ID: mdl-35739278

ABSTRACT

Despite recent progress in acute lymphoblastic leukemia (ALL) therapies, a significant subset of adult and pediatric ALL patients has a dismal prognosis. Better understanding of leukemogenesis and recognition of germline genetic changes may provide new tools for treating patients. Given that hematopoietic stem cell transplantation, often from a family member, is a major form of treatment in ALL, acknowledging the possibility of hereditary predisposition is of special importance. Reports of comprehensive germline analyses performed in adult ALL patients are scarce. To fill this knowledge gap, we investigated variants in 93 genes predisposing to hematologic malignancies and 70 other cancer-predisposing genes from exome data obtained from 61 adult and 87 pediatric ALL patients. Our results show that pathogenic (P) or likely pathogenic (LP) germline variants in genes associated with predisposition to ALL or other cancers are prevalent in ALL patients: 8% of adults and 11% of children. Comparison of P/LP germline variants in patients to population-matched controls (gnomAD Finns) revealed a 2.6-fold enrichment in ALL cases (95% CI 1.5-4.2, p = 0.00071). Acknowledging inherited factors is crucial, especially when considering hematopoietic stem cell transplantation and planning post-therapy follow-up. Harmful germline variants may also predispose patients to excessive toxicity, potentially compromising the outcome. We propose integrating germline genetics into precise ALL patient care and providing families with genetic counseling.


Subjects
Germ-Line Mutation, Precursor Cell Lymphoblastic Leukemia-Lymphoma, Adult, Child, Exome, Genetic Predisposition to Disease, Germ Cells, Humans, Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics, Precursor Cell Lymphoblastic Leukemia-Lymphoma/therapy
10.
iScience ; 25(2): 103767, 2022 Feb 18.
Article in English | MEDLINE | ID: mdl-35146385

ABSTRACT

Many neural networks for medical imaging generalize poorly to data unseen during training. Such behavior can be caused by overfitting easy-to-learn features while disregarding other potentially informative features. A recent implicit bias mitigation technique called spectral decoupling provably encourages neural networks to learn more features by regularizing the networks' unnormalized prediction scores with an L2 penalty. We show that spectral decoupling increases the networks' robustness to data distribution shifts and prevents overfitting on easy-to-learn features in medical images. To validate our findings, we train networks with and without spectral decoupling to detect prostate cancer on tissue slides and COVID-19 in chest radiographs. Networks trained with spectral decoupling achieve up to 9.5 percentage points higher performance on external datasets. Spectral decoupling alleviates generalization issues associated with neural networks and can be used to complement or replace computationally expensive explicit bias mitigation methods, such as stain normalization in histological images.
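
As the abstract above describes it, spectral decoupling adds an L2 penalty on the unnormalized prediction scores (logits) rather than on the weights. A minimal sketch for a binary classifier follows, assuming a logistic loss and a penalty coefficient of 0.1; both the function names and the coefficient are illustrative assumptions, not values from the paper.

```python
import math


def logistic_loss(logit, label):
    """Binary logistic loss for a label in {-1, +1}."""
    return math.log1p(math.exp(-label * logit))


def spectral_decoupling_loss(logits, labels, penalty=0.1):
    """Mean logistic loss plus an L2 penalty on the raw logits (not on the weights)."""
    n = len(logits)
    data_term = sum(logistic_loss(z, y) for z, y in zip(logits, labels)) / n
    # (penalty / 2) * mean squared logit; `penalty` is an assumed hyperparameter.
    logit_penalty = (penalty / 2.0) * sum(z * z for z in logits) / n
    return data_term + logit_penalty
```

Because the penalty grows with the magnitude of the logits, a network cannot lower the total loss indefinitely by becoming ever more confident on a single easy feature.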

11.
Genome Biol ; 23(1): 32, 2022 01 24.
Article in English | MEDLINE | ID: mdl-35073941

ABSTRACT

Meta-analysis has been established as an effective approach to combining summary statistics of several genome-wide association studies (GWAS). However, the accuracy of meta-analysis can be attenuated in the presence of cross-study heterogeneity. We present sPLINK, a hybrid federated and user-friendly tool, which performs privacy-aware GWAS on distributed datasets while preserving the accuracy of the results. sPLINK is robust against heterogeneous distributions of data across cohorts, while meta-analysis considerably loses accuracy in such scenarios. sPLINK achieves practical runtime and acceptable network usage for chi-square and linear/logistic regression tests. sPLINK is available at https://exbio.wzw.tum.de/splink.


Subjects
Genome-Wide Association Study, Privacy, Genome-Wide Association Study/methods, Linear Models, Logistic Models, Meta-Analysis as Topic
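
For context on the meta-analysis baseline mentioned in the abstract above, one common way to combine per-cohort summary statistics is fixed-effect inverse-variance weighting. The sketch below shows that generic technique only; it is an assumption chosen for illustration, it is not sPLINK's federated algorithm, and the abstract does not say which meta-analysis method the authors compared against.

```python
import math


def inverse_variance_meta(betas, standard_errors):
    """Fixed-effect pooled effect size and standard error from per-cohort estimates."""
    # Each cohort is weighted by the inverse of its variance (1 / SE^2).
    weights = [1.0 / (se * se) for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se
```

This estimator assumes every cohort measures the same underlying effect, which is the assumption that cross-study heterogeneity violates.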
12.
Blood ; 139(12): 1863-1877, 2022 03 24.
Article in English | MEDLINE | ID: mdl-34932792

ABSTRACT

Inadequate molecular and clinical stratification of the patients with high-risk diffuse large B-cell lymphoma (DLBCL) is a clinical challenge hampering the establishment of personalized therapeutic options. We studied the translational significance of liquid biopsy in a uniformly treated trial cohort. Pretreatment circulating tumor DNA (ctDNA) revealed hidden clinical and biological heterogeneity, and high ctDNA burden determined increased risk of relapse and death independently of conventional risk factors. Genomic dissection of pretreatment ctDNA revealed translationally relevant phenotypic, molecular, and prognostic information that extended beyond diagnostic tissue biopsies. During therapy, chemorefractory lymphomas exhibited diverging ctDNA kinetics, whereas end-of-therapy negativity for minimal residual disease (MRD) characterized cured patients and resolved clinical enigmas, including false residual PET positivity. Furthermore, we discovered fragmentation disparities in the cell-free DNA that characterize lymphoma-derived ctDNA and, as a proof-of-concept for their clinical application, used machine learning to show that end-of-therapy fragmentation patterns predict outcome. Altogether, we have discovered novel molecular determinants in the liquid biopsy that can noninvasively guide treatment decisions.


Subjects
Circulating Tumor DNA, Diffuse Large B-Cell Lymphoma, Tumor Biomarkers/genetics, Circulating Tumor DNA/genetics, Humans, Diffuse Large B-Cell Lymphoma/diagnosis, Diffuse Large B-Cell Lymphoma/genetics, Diffuse Large B-Cell Lymphoma/therapy
13.
Nat Commun ; 10(1): 4022, 2019 09 06.
Article in English | MEDLINE | ID: mdl-31492840

ABSTRACT

Genomic instability pathways in colorectal cancer (CRC) have been extensively studied, but the role of retrotransposition in colorectal carcinogenesis remains poorly understood. Although retrotransposons are usually repressed, they become active in several human cancers, in particular those of the gastrointestinal tract. Here we characterize retrotransposon insertions in 202 colorectal tumor whole genomes and investigate their associations with molecular and clinical characteristics. We find highly variable retrotransposon activity among tumors and identify recurrent insertions in 15 known cancer genes. In approximately 1% of the cases we identify insertions in APC, likely to be tumor-initiating events. Insertions are positively associated with the CpG island methylator phenotype and the genomic fraction of allelic imbalance. Clinically, a high number of insertions is independently associated with poor disease-specific survival.


Subjects
Colorectal Neoplasms/genetics, Gene Expression Profiling, Neoplastic Gene Expression Regulation, Long Interspersed Nucleotide Elements/genetics, Insertional Mutagenesis, Aged, Caco-2 Cells, Carcinogenesis/genetics, Tumor Cell Line, Colorectal Neoplasms/pathology, CpG Islands/genetics, DNA Methylation, Female, Genomic Instability, Humans, Kaplan-Meier Estimate, Male, Middle Aged
15.
Br J Cancer ; 120(9): 922-930, 2019 04.
Article in English | MEDLINE | ID: mdl-30894686

ABSTRACT

BACKGROUND: Approximately 4% of colorectal cancer (CRC) patients have at least two simultaneous cancers in the colon. Due to the shared environment, these synchronous CRCs (SCRCs) provide a unique setting to study colorectal carcinogenesis. Understanding whether these tumours are genetically similar or distinct is essential when designing therapeutic approaches. METHODS: We performed exome sequencing of 47 primary cancers and corresponding normal samples from 23 patients. Additionally, we carried out a comprehensive mutational signature analysis to assess whether tumours had undergone similar mutational processes and the first immune cell score analysis (IS) of SCRC to analyse the interplay between immune cell invasion and mutation profile in both lesions of an individual. RESULTS: The tumour pairs shared only a few mutations, favouring different mutations in known CRC genes and signalling pathways, and displayed variation in their signature content. Two tumour pairs had discordant mismatch repair statuses. In the majority of the pairs, IS varied between primaries. Differences were not explained by any clinicopathological variable or mutation burden. CONCLUSIONS: The study shows major diversity within SCRCs. Rather than relying on data from one tumour, our study highlights the need to evaluate both tumours of a synchronous pair for optimised targeted therapy.


Subjects
Colorectal Neoplasms/genetics, Colorectal Neoplasms/immunology, Lymphocytes/immunology, Multiple Primary Neoplasms/genetics, Multiple Primary Neoplasms/immunology, Aged, Aged 80 and over, CD3 Complex/immunology, CD8 Antigens/immunology, CD8-Positive T-Lymphocytes/immunology, CD8-Positive T-Lymphocytes/pathology, Case-Control Studies, Colorectal Neoplasms/pathology, DNA Mutational Analysis, Exome/genetics, Exome/immunology, Female, Humans, Lymphocytes/pathology, Male, Microsatellite Instability, Middle Aged, Mutation, Multiple Primary Neoplasms/pathology
16.
Nat Protoc ; 13(11): 2580-2600, 2018 11.
Article in English | MEDLINE | ID: mdl-30323186

ABSTRACT

Next-generation sequencing (NGS) is routinely applied in life sciences and clinical practice, but interpretation of the massive quantities of genomic data produced has become a critical challenge. The genome-wide mutation analyses enabled by NGS have had a revolutionary impact in revealing the predisposing and driving DNA alterations behind a multitude of disorders. The workflow to identify causative mutations from NGS data, for example in cancer and rare diseases, commonly involves phases such as quality filtering, case-control comparison, genome annotation, and visual validation, which require multiple processing steps and usage of various tools and scripts. To this end, we have introduced an interactive and user-friendly multi-platform-compatible software, BasePlayer, which allows scientists, regardless of bioinformatics training, to carry out variant analysis in disease genetics settings. A genome-wide scan of regulatory regions for mutation clusters can be carried out with a desktop computer in ~10 min with a dataset of 3 million somatic variants in 200 whole-genome-sequenced (WGS) cancers.


Subjects
DNA Mutational Analysis/methods, Neoplasm DNA/genetics, Human Genome, Mutation, Neoplasms/genetics, Software, Base Sequence, Computational Biology, Intergenic DNA, Exome, Medical Genetics/methods, High-Throughput Nucleotide Sequencing/statistics & numerical data, Humans, Neoplasms/diagnosis, Neoplasms/pathology, Whole Genome Sequencing
17.
Nat Commun ; 9(1): 3664, 2018 09 10.
Article in English | MEDLINE | ID: mdl-30202008

ABSTRACT

Point mutations in cancer have been extensively studied, but chromosomal gains and losses have been more challenging to interpret due to their unspecific nature. Here we examine the high-resolution allelic imbalance (AI) landscape in 1699 colorectal cancers, 256 of which have been whole-genome sequenced (WGSed). The imbalances pinpoint 38 genes as plausible AI targets based on previous knowledge. Unbiased CRISPR-Cas9 knockout and activation screens identified in total 79 genes within AI peaks regulating cell growth. Genetic and functional data implicate loss of TP53 as a sufficient driver of AI. The WGS highlights an influence of copy number aberrations on the rate of detected somatic point mutations. Importantly, the data reveal several associations between AI target genes, suggesting a role for a network of lineage-determining transcription factors in colorectal tumorigenesis. Overall, the results unravel the contribution of AI in colorectal cancer and provide a plausible explanation for why so few genes are commonly affected by point mutations in cancers.


Subjects
Allelic Imbalance, Colorectal Neoplasms/genetics, Genetic Predisposition to Disease, CRISPR-Cas Systems, Chromosome Aberrations, Human Chromosome Pair 8, Colorectal Neoplasms/pathology, DNA Copy Number Variations, Denmark, Gene Expression Profiling, Genomics, Genotype, Humans, Loss of Heterozygosity, Microsatellite Repeats, Phenotype, Point Mutation, Proto-Oncogene Proteins p21(ras)/genetics, Small Interfering RNA/genetics, Transcription Factors/genetics, Tumor Suppressor Protein p53/genetics, Whole Genome Sequencing
18.
EMBO Mol Med ; 10(9)2018 09.
Article in English | MEDLINE | ID: mdl-30108113

ABSTRACT

Microsatellite instability (MSI) leads to accumulation of an excessive number of mutations in the genome, mostly small insertions and deletions. MSI colorectal cancers (CRCs), however, also contain more point mutations than microsatellite-stable (MSS) tumors, yet they have not been as comprehensively studied. To identify candidate driver genes affected by point mutations in MSI CRC, we ranked genes based on mutation significance while correcting for replication timing and gene expression utilizing an algorithm, MutSigCV. Somatic point mutation data from the exome kit-targeted area from 24 exome-sequenced sporadic MSI CRCs and respective normals, and 12 whole-genome-sequenced sporadic MSI CRCs and respective normals were utilized. The top 73 genes were validated in 93 additional MSI CRCs. The MutSigCV ranking identified several well-established MSI CRC driver genes and provided additional evidence for previously proposed CRC candidate genes as well as shortlisted genes that have to our knowledge not been linked to CRC before. Two genes, SMARCB1 and STK38L, were also functionally scrutinized, providing evidence of a tumorigenic role, for SMARCB1 mutations in particular.


Subjects
Colorectal Neoplasms/genetics, Colorectal Neoplasms/pathology, Microsatellite Instability, Point Mutation, Gene Regulatory Networks, Humans, Molecular Sequence Annotation, DNA Sequence Analysis
19.
BMC Genomics ; 19(Suppl 2): 87, 2018 May 09.
Article in English | MEDLINE | ID: mdl-29764365

ABSTRACT

BACKGROUND: A typical human genome differs from the reference genome at 4-5 million sites. This diversity is increasingly catalogued in repositories such as ExAC/gnomAD, consisting of >15,000 whole genomes and >126,000 exome sequences from different individuals. Despite this enormous diversity, resequencing data workflows are still based on a single human reference genome. Identification and genotyping of genetic variants is typically carried out on short-read data aligned to a single reference, disregarding the underlying variation. RESULTS: We propose a new unified framework for variant calling with short-read data utilizing a representation of human genetic variation: a pan-genomic reference. We provide a modular pipeline that can be seamlessly incorporated into existing sequencing data analysis workflows. Our tool is open source and available online: https://gitlab.com/dvalenzu/PanVC. CONCLUSIONS: Our experiments show that by replacing a standard human reference with a pan-genomic one we achieve an improvement in single-nucleotide variant calling accuracy and in short indel calling accuracy over the widely adopted Genome Analysis Toolkit (GATK) in difficult genomic regions.


Subjects
Genetic Variation, DNA Sequence Analysis/methods, Access to Information, Human Genome, Humans, Internet, Sequence Alignment, Software, Workflow
...