Results 1 - 8 of 8
1.
Nature ; 631(8019): 134-141, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38867047

ABSTRACT

Mosaic loss of the X chromosome (mLOX) is the most common clonal somatic alteration in leukocytes of female individuals [1,2], but little is known about its genetic determinants or phenotypic consequences. Here, to address this, we used data from 883,574 female participants across 8 biobanks; 12% of participants exhibited detectable mLOX in approximately 2% of leukocytes. Female participants with mLOX had an increased risk of myeloid and lymphoid leukaemias. Genetic analyses identified 56 common variants associated with mLOX, implicating genes with roles in chromosomal missegregation, cancer predisposition and autoimmune diseases. Exome-sequence analyses identified rare missense variants in FBXO10 that confer a twofold increased risk of mLOX. Only a small fraction of associations was shared with mosaic Y chromosome loss, suggesting that distinct biological processes drive formation and clonal expansion of sex chromosome missegregation. Allelic shift analyses identified X chromosome alleles that are preferentially retained in mLOX, demonstrating variation at many loci under cellular selection. A polygenic score including 44 allelic shift loci correctly inferred the retained X chromosomes in 80.7% of mLOX cases in the top decile. Our results support a model in which germline variants predispose female individuals to acquiring mLOX, with the allelic content of the X chromosome possibly shaping the magnitude of clonal expansion.


Subjects
Aneuploidy; Chromosomes, Human, X; Clone Cells; Leukocytes; Mosaicism; Adult; Female; Humans; Male; Middle Aged; Alleles; Autoimmune Diseases/genetics; Biological Specimen Banks; Chromosome Segregation/genetics; Chromosomes, Human, X/genetics; Chromosomes, Human, Y/genetics; Clone Cells/metabolism; Clone Cells/pathology; Exome/genetics; F-Box Proteins/genetics; Genetic Predisposition to Disease/genetics; Germ-Line Mutation; Leukemia/genetics; Leukocytes/metabolism; Models, Genetic; Multifactorial Inheritance/genetics; Mutation, Missense/genetics
2.
Hum Mutat ; 42(6): 777-786, 2021 Jun.
Article in English | MEDLINE | ID: mdl-33715282

ABSTRACT

KATK is a fast and accurate software tool for calling variants directly from raw next-generation sequencing reads. It uses predefined k-mers to retrieve only the reads of interest from the FASTQ file and calls genotypes by aligning the retrieved reads locally. KATK does not use data about known polymorphisms and has NC (no call) as the default genotype. The reference or variant allele is called only if there is sufficient evidence for its presence in the data. Thus, it is not biased against rare variants or de novo mutations. With simulated datasets, we achieved a false-negative rate of 0.23% (sensitivity 99.77%) and a false discovery rate of 0.19%. Calling all human exonic regions with KATK requires 1-2 h, depending on sequencing coverage.


Subjects
DNA Mutational Analysis/methods; High-Throughput Nucleotide Sequencing/methods; Software; Algorithms; Alleles; Chromosome Mapping/methods; Datasets as Topic; Female; Genome, Human; Genotype; Humans; Male; Polymorphism, Single Nucleotide; Reproducibility of Results; Sequence Analysis, DNA/methods
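
The two ideas in the KATK abstract above (retrieving reads with predefined k-mers, and NC as the default genotype) can be illustrated with a minimal Python sketch. This is not KATK's implementation: the function names, the `min_reads` threshold, and the single-base check that stands in for local alignment are all assumptions made for illustration.

```python
from collections import Counter

def retrieve_reads(reads, kmers):
    """Keep only the reads that contain at least one predefined k-mer."""
    return [r for r in reads if any(k in r for k in kmers)]

def call_genotype(reads, left_flank, ref, alt, min_reads=3):
    """Call the base that follows left_flank; the default result is NC.

    Hypothetical stand-in for local alignment: the variant position is
    located by exact match of the flanking sequence.
    """
    counts = Counter()
    for r in reads:
        i = r.find(left_flank)
        pos = i + len(left_flank)
        if i >= 0 and pos < len(r):
            counts[r[pos]] += 1
    has_ref = counts[ref] >= min_reads  # assumed evidence threshold
    has_alt = counts[alt] >= min_reads
    if has_ref and has_alt:
        return f"{ref}/{alt}"
    if has_ref:
        return f"{ref}/{ref}"
    if has_alt:
        return f"{alt}/{alt}"
    return "NC"  # insufficient evidence for either allele

# Toy data: 4 reads with the reference base, 3 with the variant, 5 unrelated.
reads = ["ACGTTGCAAG"] * 4 + ["ACGTTGCATG"] * 3 + ["GGGGGGGGGG"] * 5
selected = retrieve_reads(reads, kmers=["GTTGCA"])
print(call_genotype(selected, "GTTGCA", ref="A", alt="T"))
```

Because an allele is reported only when its own read count passes the threshold, a position with too few reads stays NC instead of silently defaulting to the reference.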
3.
Nat Commun ; 15(1): 6277, 2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39054313

ABSTRACT

A diagnosis of epilepsy has significant consequences for an individual but is often challenging in clinical practice. Novel biomarkers are thus greatly needed. Here, we investigated how common genetic factors (epilepsy polygenic risk scores, [PRSs]) influence epilepsy risk in detailed longitudinal electronic health records (EHRs) of >700k Finns and Estonians. We found that a high genetic generalized epilepsy PRS (PRSGGE) increased risk for genetic generalized epilepsy (GGE) (hazard ratio [HR] 1.73 per PRSGGE standard deviation [SD]) across lifetime and within 10 years after an unspecified seizure event. The effect of PRSGGE was significantly larger on idiopathic generalized epilepsies, in females and for earlier epilepsy onset. Analogously, we found a significant but more modest focal epilepsy PRS burden associated with non-acquired focal epilepsy (NAFE). Here, we outline the potential of epilepsy-specific PRSs to serve as biomarkers after a first seizure event.


Subjects
Epilepsy, Generalized; Genetic Predisposition to Disease; Multifactorial Inheritance; Seizures; Humans; Female; Male; Adult; Multifactorial Inheritance/genetics; Seizures/genetics; Middle Aged; Risk Factors; Epilepsy, Generalized/genetics; Young Adult; Adolescent; Epilepsy/genetics; Epilepsy/epidemiology; Biomarkers; Epilepsies, Partial/genetics; Child; Aged; Longitudinal Studies; Electronic Health Records; Genetic Risk Score
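
As a worked illustration of what a hazard ratio "per SD" implies, the short Python sketch below scales the reported HR of 1.73 to other PRS differences. It assumes the PRS enters a proportional hazards model as a linear term, so that the relative hazard is multiplicative in SD units; that assumption and the helper name `relative_hazard` are not taken from the abstract.

```python
def relative_hazard(hr_per_sd, delta_sd):
    """Relative hazard for a PRS difference of delta_sd standard deviations,
    assuming a log-linear (multiplicative) effect per SD."""
    return hr_per_sd ** delta_sd

hr_gge = 1.73  # HR per PRSGGE SD, as reported in the abstract above

# Individual at +2 SD versus an individual at the mean PRS.
print(round(relative_hazard(hr_gge, 2), 2))

# Individual at +2 SD versus an individual at -2 SD (a 4 SD difference).
print(round(relative_hazard(hr_gge, 4), 2))
```

Under this assumption the hazard roughly triples at +2 SD relative to the mean; the figures are extrapolations for illustration, not results reported by the study.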
4.
Sci Rep ; 13(1): 17765, 2023 Oct 18.
Article in English | MEDLINE | ID: mdl-37853040

ABSTRACT

Genomes exhibit large regions with segmental copy number variation, many of which include entire genes and are multiallelic. We have developed a computational method, GeneToCN, that counts the frequencies of gene-specific k-mers in FASTQ files and uses this information to infer the copy number of the gene. We validated the copy number predictions for amylase genes (AMY1, AMY2A, AMY2B) using experimental data from digital droplet PCR (ddPCR) on 39 individuals and observed a strong correlation (R = 0.99) between GeneToCN predictions and experimentally determined copy numbers. An additional validation on FCGR3 genes showed a higher concordance for FCGR3A compared to two other methods, but reduced accuracy for FCGR3B. We further tested the method on three different genomic regions (SMN, NPY4R, and LPA Kringle IV-2 domain). Predicted copy number distributions of these genes in a set of 500 individuals from the Estonian Biobank were in good agreement with previously published studies. In addition, we investigated the possibility of using GeneToCN on sequencing data generated by different technologies by comparing copy number predictions from Illumina, PacBio, and Oxford Nanopore data of the same sample. Despite the differences in variability of k-mer frequencies, all three sequencing technologies give similar predictions with GeneToCN.


Subjects
DNA Copy Number Variations; Genome; Humans; DNA Copy Number Variations/genetics; Gene Dosage; Polymerase Chain Reaction/methods; High-Throughput Nucleotide Sequencing
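
The idea of inferring copy number from k-mer frequencies, as described in the GeneToCN abstract above, can be sketched in a few lines of Python. The normalization used here (median count of gene-specific k-mers divided by the median count of k-mers from an assumed single-copy reference region, times the ploidy) is an illustrative assumption, and the names `count_kmers` and `estimate_copy_number` are hypothetical, not GeneToCN's interface.

```python
from collections import Counter
from statistics import median

def count_kmers(reads, k):
    """Count every k-mer in the reads (forward strand only, for brevity)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

def estimate_copy_number(counts, gene_kmers, reference_kmers, ploidy=2):
    """Estimate copies per diploid genome from k-mer depth.

    Assumes reference_kmers come from a region present in exactly one
    copy per haploid genome, so their depth corresponds to `ploidy` copies.
    """
    gene_depth = median(counts[k] for k in gene_kmers)
    ref_depth = median(counts[k] for k in reference_kmers)
    if ref_depth == 0:
        raise ValueError("reference k-mers have no coverage")
    return ploidy * gene_depth / ref_depth

# Toy counts: gene k-mers are seen about three times as often as reference k-mers.
toy_counts = Counter({"AAAC": 30, "AACG": 32, "CCGT": 10, "CGTA": 10})
print(estimate_copy_number(toy_counts, ["AAAC", "AACG"], ["CCGT", "CGTA"]))
```

Using the median over many k-mers, instead of a single k-mer count, limits the influence of individual k-mers affected by sequencing errors or sequence variants.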
5.
medRxiv ; 2023 Nov 27.
Article in English | MEDLINE | ID: mdl-38076931

ABSTRACT

A diagnosis of epilepsy has significant consequences for an individual but is often challenging in clinical practice. Novel biomarkers are thus greatly needed. Here, we investigated how common genetic factors (epilepsy polygenic risk scores, [PRSs]) influence epilepsy risk in detailed longitudinal electronic health records (EHRs) of >360k Finns spanning up to 50 years of individuals' lifetimes. Individuals with a high genetic generalized epilepsy PRS (PRSGGE) in FinnGen had an increased risk for genetic generalized epilepsy (GGE) (hazard ratio [HR] 1.55 per PRSGGE standard deviation [SD]) across their lifetime and after unspecified seizure events. Effect sizes of epilepsy PRSs were comparable to effect sizes in clinically curated data, supporting our EHR-derived epilepsy diagnoses. Within 10 years after an unspecified seizure, the GGE rate was 37% when PRSGGE > 2 SD compared to 5.6% when PRSGGE < -2 SD. The effect of PRSGGE was even larger on the GGE subtype idiopathic generalized epilepsy (IGE) (HR 2.1 per SD PRSGGE). We further report significantly larger effects of PRSGGE on epilepsy in females and in younger age groups. Analogously, we found a significant but more modest focal epilepsy PRS burden associated with non-acquired focal epilepsy (NAFE). We found PRSGGE specifically associated with GGE in comparison with >2000 independent diseases, while PRSNAFE was also associated with diseases other than NAFE, such as back pain. Here, we show that epilepsy-specific PRSs have good discriminative ability after a first seizure event, i.e. in circumstances where the prior probability of epilepsy is high, outlining their potential to serve as biomarkers for an epilepsy diagnosis.

6.
medRxiv ; 2023 Jan 31.
Article in English | MEDLINE | ID: mdl-36778285

ABSTRACT

Mosaic loss of the X chromosome (mLOX) is the most commonly occurring clonal somatic alteration detected in the leukocytes of women, yet little is known about its genetic determinants or phenotypic consequences. To address this, we estimated mLOX in >900,000 women across eight biobanks, identifying 10% of women with detectable X loss in approximately 2% of their leukocytes. Out of 1,253 diseases examined, women with mLOX had an elevated risk of myeloid and lymphoid leukemias and pneumonia. Genetic analyses identified 49 common variants influencing mLOX, implicating genes with established roles in chromosomal missegregation, cancer predisposition, and autoimmune diseases. Complementary exome-sequence analyses identified rare missense variants in FBXO10 that confer a two-fold increased risk of mLOX. A small fraction of these associations were shared with mosaic Y chromosome loss in men, suggesting that different biological processes drive the formation and clonal expansion of sex chromosome missegregation events. Allelic shift analyses identified alleles on the X chromosome that are preferentially retained, demonstrating that variation at many loci across the X chromosome is under cellular selection. A novel polygenic score including 44 independent X chromosome allelic shift loci correctly inferred the retained X chromosomes in 80.7% of mLOX cases in the top decile. Collectively, our results support a model where germline variants predispose women to acquiring mLOX, with the allelic content of the X chromosome possibly shaping the magnitude of subsequent clonal expansion.

7.
Mob DNA ; 10: 31, 2019.
Article in English | MEDLINE | ID: mdl-31360240

ABSTRACT

BACKGROUND: Recently, alignment-free sequence analysis methods have gained popularity in the field of personal genomics. These methods are based on counting frequencies of short k-mer sequences, thus allowing faster and more robust analysis compared to traditional alignment-based methods. RESULTS: We have created a fast alignment-free method, AluMine, to analyze polymorphic insertions of Alu elements in the human genome. We tested the method on 2,241 individuals from the Estonian Genome Project and identified 28,962 potential polymorphic Alu element insertions. Each tested individual had on average 1,574 Alu element insertions that were different from those in the reference genome. In addition, we propose an alignment-free genotyping method that uses the frequency of insertion/deletion-specific 32-mer pairs to call the genotype directly from raw sequencing reads. Using this method, the concordance between the predicted and experimentally observed genotypes was 98.7%. The running time of the discovery pipeline is approximately 2 h per individual. The genotyping of potential polymorphic insertions takes between 0.4 and 4 h per individual, depending on the hardware configuration. CONCLUSIONS: AluMine provides tools that allow discovery of novel Alu element insertions and/or genotyping of known Alu element insertions from personal genomes within a few hours.
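
Genotyping from a pair of allele-specific k-mers, as described in the AluMine abstract above, can be sketched as follows in Python. The decision rule (fraction of insertion-specific counts with fixed thresholds), the `min_total` and `het_band` values, and the function name are assumptions made for illustration; they are not AluMine's actual calling procedure.

```python
def genotype_from_kmer_pair(ins_count, del_count, min_total=5, het_band=(0.2, 0.8)):
    """Call an insertion genotype from the counts of one k-mer pair.

    ins_count: occurrences of the k-mer specific to the allele with the insertion
    del_count: occurrences of the k-mer specific to the allele without it
    Returns "+/+", "+/-", "-/-", or "NC" when coverage is too low.
    """
    total = ins_count + del_count
    if total < min_total:  # assumed minimum coverage for a call
        return "NC"
    fraction = ins_count / total
    if fraction < het_band[0]:
        return "-/-"
    if fraction > het_band[1]:
        return "+/+"
    return "+/-"

# Toy counts for three individuals at one candidate insertion site.
for ins, dele in [(0, 30), (14, 16), (28, 1)]:
    print(ins, dele, genotype_from_kmer_pair(ins, dele))
```

Since only two k-mer counts per site are needed, a rule of this kind can be applied directly to k-mer counts from raw reads without aligning them.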

8.
Sci Rep ; 7(1): 2537, 2017 May 31.
Article in English | MEDLINE | ID: mdl-28566690

ABSTRACT

We have developed a computational method that counts the frequencies of unique k-mers in FASTQ-formatted genome data and uses this information to infer the genotypes of known variants. FastGT can detect the variants in a 30x genome in less than 1 hour using ordinary low-cost server hardware. The overall concordance with the genotypes of two Illumina "Platinum" genomes is 99.96%, and the concordance with the genotypes of the Illumina HumanOmniExpress is 99.82%. Our method provides a k-mer database that can be used for the simultaneous genotyping of approximately 30 million single nucleotide variants (SNVs), including >23,000 SNVs from the Y chromosome. The source code of the FastGT software is available on GitHub (https://github.com/bioinfo-ut/GenomeTester4/).


Subjects
Algorithms; Genome, Human; Polymorphism, Single Nucleotide; Sequence Analysis, DNA/methods; Software; Bayes Theorem; Benchmarking; Genotype; High-Throughput Nucleotide Sequencing; Humans; Reproducibility of Results; Sequence Analysis, DNA/statistics & numerical data
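
The workflow summarized in the FastGT abstract above (count variant-specific k-mers in FASTQ data, then infer genotypes from the counts) can be illustrated with a small Python sketch. It is not the FastGT code or its statistical model: the canonical k-mer lookup, the Poisson likelihood with a flat prior, the `error` rate parameter, and all function names are assumptions chosen for illustration.

```python
import io
import math
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    return min(kmer, kmer.translate(_COMPLEMENT)[::-1])

def count_target_kmers(fastq, targets, k):
    """Count occurrences of the target k-mers in a 4-line-per-record FASTQ stream."""
    wanted = {canonical(t) for t in targets}
    counts = Counter()
    for n, line in enumerate(fastq):
        if n % 4 != 1:  # the sequence is the second line of each record
            continue
        seq = line.strip().upper()
        for i in range(len(seq) - k + 1):
            c = canonical(seq[i:i + k])
            if c in wanted:
                counts[c] += 1
    return counts

def genotype_posteriors(ref_n, alt_n, depth, error=0.5):
    """Posterior over 0, 1 or 2 alternative alleles, flat prior, Poisson counts.

    depth is the expected k-mer count for two copies; error is an assumed
    background rate that keeps the likelihood finite for absent alleles.
    """
    def log_poisson(n, lam):
        return n * math.log(lam) - lam - math.lgamma(n + 1)

    logs = {}
    for alt_copies in (0, 1, 2):
        lam_alt = alt_copies * depth / 2 + error
        lam_ref = (2 - alt_copies) * depth / 2 + error
        logs[alt_copies] = log_poisson(ref_n, lam_ref) + log_poisson(alt_n, lam_alt)
    top = max(logs.values())
    weights = {g: math.exp(v - top) for g, v in logs.items()}
    total = sum(weights.values())
    return {g: w / total for g, w in weights.items()}

# Toy FASTQ: the second read is the reverse complement of the first.
toy_fastq = io.StringIO("@r1\nACGTA\n+\nIIIII\n@r2\nTACGT\n+\nIIIII\n")
toy_counts = count_target_kmers(toy_fastq, targets=["ACGTA"], k=5)
print(toy_counts[canonical("ACGTA")])

post = genotype_posteriors(ref_n=15, alt_n=14, depth=30)
print(max(post, key=post.get))
```

Counting canonical k-mers lets reads from either strand contribute to the same allele count, which is why the reverse-complement read in the toy input is included.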