1.
Funct Integr Genomics ; 24(3): 104, 2024 May 20.
Article in English | MEDLINE | ID: mdl-38764005

ABSTRACT

Accurate estimation of population allele frequency (AF) is crucial for gene discovery and genetic diagnostics. However, determining AF for frameshift-inducing small insertions and deletions (indels) faces challenges due to discrepancies in mapping and variant calling methods. Here, we propose an innovative approach to assess indel AF. We developed CRAFTS-indels (Calculating Regional Allele Frequency Targeting Small indels), an algorithm that combines AF of distinct indels within a given region and provides "regional AF" (rAF). We tested and validated CRAFTS-indels using three independent datasets: gnomAD v2 (n=125,748 samples), an internal dataset (IGM; n=39,367), and the UK Biobank (UKBB; n=469,835). By comparing rAF against standard AF, we identified rare indels with rAF exceeding standard AF (sAF ≤ 10⁻⁴ and rAF > 10⁻⁴) as "rAF-hi" indels. Notably, a high percentage of rare indels were "rAF-hi", with a higher proportion in gnomAD v2 (11-20%) and IGM (11-22%) compared to the UKBB (5-9%, depending on the CRAFTS-indels parameters). Analysis of the overlap of regions based on their rAF with low complexity regions and with ClinVar classification supported the pertinence of rAF. Using the internal dataset, we illustrated the utility of CRAFTS-indels in the analysis of de novo variants and the potential negative impact of rAF-hi indels in gene discovery. In summary, annotation of indels with cohort-specific rAF can be used to handle some of the limitations of current annotation pipelines and facilitate detection of novel gene-disease associations. CRAFTS-indels offers a user-friendly approach to providing rAF annotation. It can be integrated into public databases such as gnomAD and UKBB and used by ClinVar to revise indel classifications.
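
The notion of a regional allele frequency can be illustrated with a short Python sketch. This is not the CRAFTS-indels implementation: it assumes, purely for illustration, that the rAF of an indel is the sum of the AFs of all indels on the same chromosome within a fixed window around it. The window size, the function names, and the toy records are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Indel:
    chrom: str
    pos: int    # 1-based start position
    af: float   # standard allele frequency (sAF) in the cohort


def regional_af(target, indels, window=10):
    """Sum the AF of every indel within +/- `window` bp of the target.

    Assumed definition for illustration only; the published algorithm
    may define regions and combine frequencies differently.
    """
    total = sum(
        v.af
        for v in indels
        if v.chrom == target.chrom and abs(v.pos - target.pos) <= window
    )
    return min(total, 1.0)  # a frequency cannot exceed 1


def is_raf_hi(target, indels, threshold=1e-4, window=10):
    """Flag a rare indel (sAF <= threshold) whose regional AF exceeds the threshold."""
    return target.af <= threshold and regional_af(target, indels, window) > threshold


# Hypothetical toy cohort: two nearby rare indels and one isolated rare indel.
cohort = [
    Indel("chr1", 1000, 5e-5),
    Indel("chr1", 1003, 8e-5),
    Indel("chr1", 5000, 2e-5),
]
```

In this toy cohort the first indel is rare on its own but shares a window with a second rare indel, so their combined frequency crosses the threshold, while the isolated indel stays below it.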


Subjects
Gene Frequency , INDEL Mutation , Humans , Algorithms
2.
Theor Appl Genet ; 137(6): 136, 2024 May 20.
Article in English | MEDLINE | ID: mdl-38764078

ABSTRACT

KEY MESSAGE: Different kinship and resistance to cotton leaf curl disease (CLCuD) and heat were found between upland cotton cultivars from China and Pakistan. 175 SNP and 82 InDel loci related to yield, fiber quality, CLCuD, and heat resistance were identified. Elite alleles found in Pakistani accessions aided local adaptation to the climatic conditions of the two countries. Adaptation of upland cotton (Gossypium hirsutum) beyond its center of origin is expected to be driven by tailoring of the genome and genes to enhance yield and quality in new ecological niches. Here, resequencing of 456 upland cotton accessions revealed two distinct kinships according to the associated country. Fiber quality and lint percentage were consistent across kinships, but resistance to cotton leaf curl disease (CLCuD) and heat was distinctly exhibited by accessions from Pakistan, illustrating highly local adaptation. A total of 175 SNP and 82 InDel loci related to yield, fiber quality, CLCuD and heat resistance were identified; among them, only two overlapped between Pakistani and Chinese accessions, underscoring the divergent domestication and improvement targets in each country. Loci associated with resistance alleles to leaf curl disease and high temperature were largely found in Pakistani accessions to counter these stresses prevalent in Pakistan. These results revealed that breeding activities led to the accumulation of unique alleles and helped upland cotton become adapted to the respective climatic conditions, which will contribute to elucidating the genetic mechanisms that underlie resilience traits and help develop climate-resilient cotton cultivars for use worldwide.


Subjects
Gossypium , Polymorphism, Single Nucleotide , Gossypium/genetics , Pakistan , China , Disease Resistance/genetics , Plant Diseases/genetics , INDEL Mutation , Adaptation, Physiological/genetics , Genome, Plant , Alleles , Plant Breeding , Cotton Fiber , Phenotype
3.
Orphanet J Rare Dis ; 19(1): 209, 2024 May 21.
Article in English | MEDLINE | ID: mdl-38773661

ABSTRACT

BACKGROUND: Marfan syndrome (MFS) is an autosomal dominant connective tissue disease with wide clinical heterogeneity, mainly caused by pathogenic variants in fibrillin-1 (FBN1). METHODS: A Chinese 4-generation MFS pedigree with 16 family members was recruited and exome sequencing (ES) was performed in the proband. Transcript analysis (patient RNA and minigene assays) and in silico structural analysis were used to determine the pathogenicity of the variant. In addition, germline mosaicism in a family member (I:1) was assessed using quantitative fluorescent polymerase chain reaction (QF-PCR) and short tandem repeat PCR (STR) analyses. RESULTS: Two cis-compound benign intronic variants of FBN1 (c.3464-4A>G and c.3464-5G>A) were identified in the proband by ES. As a compound variant, c.3464-5_3464-4delGAinsAG was found to be pathogenic and co-segregated with MFS. RNA studies indicated that aberrant transcripts were found only in patients and mutant-type clones. The variant c.3464-5_3464-4delGAinsAG caused erroneous integration of a 3 bp sequence into intron 28 and resulted in the insertion of one amino acid in the protein sequence (p.Ile1154_Asp1155insAla). Structural analyses suggested that p.Ile1154_Asp1155insAla affected the protein's secondary structure by interfering with one disulfide bond between Cys1140 and Cys1153 and causing the extension of an anti-parallel β sheet in the calcium-binding epidermal growth factor-like (cbEGF)13 domain. In addition, the asymptomatic family member I:1 was deduced to be a gonadal mosaic, based on inconsistent results of sequencing and STR analysis. CONCLUSIONS: To our knowledge, FBN1 c.3464-5_3464-4delGAinsAG is the first identified pathogenic intronic indel variant affecting non-canonical splice sites in this gene. Our study reinforces the importance of assessing the pathogenic role of intronic variants at the mRNA level, with structural analysis, and the occurrence of mosaicism.


Subjects
Fibrillin-1 , Introns , Marfan Syndrome , Mosaicism , Pedigree , Humans , Fibrillin-1/genetics , Marfan Syndrome/genetics , Marfan Syndrome/pathology , Female , Male , Adult , Introns/genetics , INDEL Mutation/genetics , Middle Aged , Adipokines
4.
BMC Genomics ; 25(1): 475, 2024 May 14.
Article in English | MEDLINE | ID: mdl-38745120

ABSTRACT

BACKGROUND: Single nucleotide polymorphism (SNP) markers play significant roles in accelerating breeding and basic crop research. Several soybean SNP panels have been developed. However, there is still a lack of SNP panels for differentiating between wild and cultivated populations, as well as for detecting polymorphisms within both wild and cultivated populations. RESULTS: This study utilized publicly available resequencing data from over 3,000 soybean accessions to identify differentiating and highly conserved SNP and insertion/deletion (InDel) markers between wild and cultivated soybean populations. Additionally, a naturally occurring mutant gene library was constructed by analyzing large-effect SNPs and InDels in the population. CONCLUSION: The markers obtained in this study are associated with numerous genes governing agronomic traits, thus facilitating the evaluation of soybean germplasms and the efficient differentiation between wild and cultivated soybeans. The natural mutant gene library permits the quick identification of individuals with natural mutations in functional genes, providing convenience for accelerating soybean breeding using reverse genetics.


Subjects
Glycine max , INDEL Mutation , Polymorphism, Single Nucleotide , Glycine max/genetics , Genome, Plant , Gene Library , Plant Breeding
5.
Genome Biol Evol ; 16(5)2024 May 02.
Article in English | MEDLINE | ID: mdl-38735759

ABSTRACT

A fundamental goal in evolutionary biology and population genetics is to understand how selection shapes the fate of new mutations. Here, we test the null hypothesis that insertion-deletion (indel) events in protein-coding regions occur randomly with respect to secondary structures. We identified indels across 11,444 sequence alignments in mouse, rat, human, chimp, and dog genomes and then quantified their overlap with four different types of secondary structure (alpha helices, beta strands, protein bends, and protein turns) predicted by deep-learning methods of AlphaFold2. Indels overlapped secondary structures 54% as much as expected and were especially underrepresented over beta strands, which tend to form internal, stable regions of proteins. In contrast, indels were enriched by 155% over regions without any predicted secondary structures. These skews were stronger in the rodent lineages compared to the primate lineages, consistent with population genetic theory predicting that natural selection will be more efficient in species with larger effective population sizes. Nonsynonymous substitutions were also less common in regions of protein secondary structure, although not as strongly reduced as in indels. In a complementary analysis of thousands of human genomes, we showed that indels overlapping secondary structure segregated at significantly lower frequency than indels outside of secondary structure. Taken together, our study shows that indels are selected against if they overlap secondary structure, presumably because they disrupt the tertiary structure and function of a protein.
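
A figure such as "54% as much as expected" is an observed-to-expected ratio. The Python sketch below shows one simple way to compute it, assuming a null model in which indels fall uniformly at random across aligned sites. That null model, the function name, and the example counts are assumptions for illustration and are not taken from the study.

```python
def overlap_ratio(n_indels, n_overlapping, structured_sites, total_sites):
    """Observed / expected number of indels overlapping structured sites.

    Under the assumed uniform null, the expected count is the number of
    indels times the fraction of sites annotated with secondary structure.
    """
    if total_sites <= 0 or structured_sites <= 0 or n_indels <= 0:
        raise ValueError("counts must be positive")
    expected = n_indels * structured_sites / total_sites
    return n_overlapping / expected


# Hypothetical counts: half of all sites are structured, so 500 of 1,000
# indels are expected to overlap structure; only 270 are observed to.
ratio = overlap_ratio(n_indels=1000, n_overlapping=270,
                      structured_sites=500, total_sites=1000)
```

A ratio below 1 indicates depletion relative to the null and a ratio above 1 indicates enrichment.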


Subjects
INDEL Mutation , Protein Structure, Secondary , Humans , Animals , Mice , Rats , Evolution, Molecular , Proteins/genetics , Proteins/chemistry , Dogs , Selection, Genetic , Genome
6.
Physiol Genomics ; 56(6): 436-444, 2024 Jun 01.
Article in English | MEDLINE | ID: mdl-38586874

ABSTRACT

This study aimed to investigate the relationship between pre- and postexercise cardiac biomarker release according to athletic status (trained vs. untrained) and to establish whether the I/D polymorphism in the angiotensin-converting enzyme (ACE) gene had an influence on cardiac biomarker release, with specific regard to the influence of the training state. We determined cardiac troponin I (cTnI) and N-terminal pro-brain natriuretic peptide (NT-proBNP) in 29 trained and 27 untrained male soccer players before and after moderate-intensity continuous exercise (MICE) and high-intensity interval exercise (HIIE) running tests. Trained soccer players had higher pre (trained: 0.014 ± 0.007 ng/mL; untrained: 0.010 ± 0.005 ng/mL) and post HIIE (trained: 0.031 ± 0.008 ng/mL; untrained: 0.0179 ± 0.007) and MICE (trained: 0.030 ± 0.007 ng/mL; untrained: 0.018 ± 0.007) cTnI values than untrained subjects, but the change with exercise (ΔcTnI) was similar between groups. There was no significant difference in baseline and postexercise NT-proBNP between groups. NT-proBNP levels were elevated after both HIIE and MICE. Considering three ACE genotypes, the mean pre-exercise cTnI values of the trained group (DD: 0.015 ± 0.008 ng/mL, ID: 0.015 ± 0.007 ng/mL, and II: 0.014 ± 0.008 ng/mL) and their untrained counterparts (DD: 0.010 ± 0.004 ng/mL, ID: 0.011 ± 0.004 ng/mL, and II: 0.010 ± 0.006 ng/mL) did not show any significant difference. To sum up, a noticeable difference in baseline cTnI was observed, which was related to athletic status but not ACE genotypes. Neither athletic status nor ACE genotypes seemed to affect the changes in cardiac biomarkers in response to HIIE and MICE, indicating that the ACE gene does not play a significant role in the release of exercise-induced cardiac biomarkers indicative of cardiac damage in Iranian soccer players. NEW & NOTEWORTHY: Our study investigated the impact of athletic status and angiotensin-converting enzyme (ACE) gene I/D polymorphism on cardiac biomarkers in soccer players. Trained players showed higher baseline cardiac troponin I (cTnI) levels, whereas postexercise ΔcTnI remained consistent across groups. N-terminal pro-brain natriuretic peptide increased after exercise in both groups, staying within normal limits. ACE genotypes did not significantly affect pre-exercise cTnI. Overall, athletic status influences baseline cTnI, but neither it nor ACE genotypes significantly impact exercise-induced cardiac biomarker responses in this population.


Subjects
Biomarkers , Exercise , Natriuretic Peptide, Brain , Peptide Fragments , Peptidyl-Dipeptidase A , Polymorphism, Genetic , Troponin I , Male , Humans , Peptidyl-Dipeptidase A/genetics , Biomarkers/blood , Natriuretic Peptide, Brain/blood , Natriuretic Peptide, Brain/genetics , Troponin I/blood , Troponin I/genetics , Peptide Fragments/blood , Exercise/physiology , Young Adult , Adult , High-Intensity Interval Training/methods , Soccer/physiology , INDEL Mutation/genetics , Heart/physiology
7.
Electrophoresis ; 45(9-10): 867-876, 2024 May.
Article in English | MEDLINE | ID: mdl-38651903

ABSTRACT

Short tandem repeat analysis is challenging when dealing with unbalanced mixtures in forensic cases due to the presence of stutter peaks and large amplicons. In this research, we propose a novel genetic marker called DIP-TriSNP, which combines a deletion/insertion polymorphism (DIP) with a tri-allelic single nucleotide polymorphism within less than 230 bp of the human genome. Based on multiplex PCR and SNaPshot, a panel including 14 autosomal DIP-TriSNPs and one Y-chromosomal DIP-SNP was developed and applied to genotype 102 unrelated Han Chinese individuals from Sichuan, China, and to a simulated mixture study. The results indicate that the panel is sensitive down to 0.1 ng of DNA template and that the minor contributor of DNA can be detected at mixture ratios of up to 19:1. In the Sichuan Han population, the cumulative probability of informative genotypes reached 0.997092, with a combined power of discrimination of 0.999999998801. The panel was estimated to detect more than two alleles in at least one locus in 99.69% of mixtures of the Sichuan Han population. In conclusion, DIP-TriSNPs have shown promise as an innovative DNA marker for identifying the minor contributor in unbalanced DNA mixtures, offering advantages such as short amplicons, increased polymorphism, and heightened sensitivity.


Subjects
DNA , Forensic Genetics , Multiplex Polymerase Chain Reaction , Polymorphism, Single Nucleotide , Humans , Multiplex Polymerase Chain Reaction/methods , Forensic Genetics/methods , Genetic Markers/genetics , DNA/genetics , DNA/analysis , China , Asian People/genetics , Genotype , Reproducibility of Results , INDEL Mutation , Microsatellite Repeats/genetics , Male , Genotyping Techniques/methods
8.
BMC Biol ; 22(1): 90, 2024 Apr 22.
Article in English | MEDLINE | ID: mdl-38644496

ABSTRACT

BACKGROUND: Accurate identification of genetic variants, such as point mutations and insertions/deletions (indels), is crucial for various genetic studies into epidemic tracking, population genetics, and disease diagnosis. Genetic studies into microbiomes often require processing numerous sequencing datasets, necessitating variant identifiers with high speed, accuracy, and robustness. RESULTS: We present QuickVariants, a bioinformatics tool that effectively summarizes variant information from read alignments and identifies variants. When tested on diverse bacterial sequencing data, QuickVariants demonstrates a ninefold higher median speed than bcftools, a widely used variant identifier, with higher accuracy in identifying both point mutations and indels. This accuracy extends to variant identification in virus samples, including SARS-CoV-2, particularly with significantly fewer false negative indels than bcftools. The high accuracy of QuickVariants is further demonstrated by its detection of a greater number of Omicron-specific indels (5 versus 0) and point mutations (61 versus 48-54) than bcftools in sewage metagenomes predominated by Omicron variants. Much of the reduced accuracy of bcftools was attributable to its misinterpretation of indels, often producing false negative indels and false positive point mutations at the same locations. CONCLUSIONS: We introduce QuickVariants, a fast, accurate, and robust bioinformatics tool designed for identifying genetic variants for microbial studies. QuickVariants is available at https://github.com/caozhichongchong/QuickVariants .


Subjects
INDEL Mutation , SARS-CoV-2 , SARS-CoV-2/genetics , Computational Biology/methods , Humans , Software , COVID-19/virology , High-Throughput Nucleotide Sequencing/methods , Point Mutation , Genetic Variation , Sequence Analysis, DNA/methods
9.
Anim Biotechnol ; 35(1): 2337751, 2024 Nov.
Article in English | MEDLINE | ID: mdl-38597900

ABSTRACT

Economic efficiency, pursued through enhanced productivity, is a focal point for the improvement of sheep breeding. Recent studies highlight the involvement of the Early Region 2 Binding Factor transcription factor 8 (E2F8) gene in female reproduction. Our group's recent genome-wide association study (GWAS) emphasizes the potential impact of the E2F8 gene on prolificacy traits in Australian White sheep (AUW). Herein, the purpose of this study was to assess the correlation of the E2F8 gene with litter size in the AUW sheep breed. This work encompassed 659 AUW sheep, which were genotyped using PCR-based genotyping technology. The results showed significant associations between the P1-del-32bp InDel and litter size at the fourth and fifth parities in AUW sheep; the litter size of those with genotype ID was superior to that of those with DD and II genotypes. Thus, these results indicate that the P1-del-32bp InDel within the E2F8 gene can be useful in marker-assisted selection (MAS) in sheep.


Subjects
Genome-Wide Association Study , INDEL Mutation , Female , Animals , Sheep/genetics , Pregnancy , Australia , Litter Size/genetics , Genotype , INDEL Mutation/genetics
10.
Sci Rep ; 14(1): 8165, 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-38589653

ABSTRACT

Accurately calling indels with next-generation sequencing (NGS) data is critical for clinical application. The precisionFDA team collaborated with the U.S. Food and Drug Administration's (FDA's) National Center for Toxicological Research (NCTR) and successfully completed the NCTR Indel Calling from Oncopanel Sequencing Data Challenge, to evaluate the performance of indel calling pipelines. Top performers were selected based on precision, recall, and F1-score. The performance of many other pipelines was close to the top performers, which produced a top cluster of performers. The performance was significantly higher in high confidence regions and coding regions, and significantly lower in low complexity regions. Oncopanel capture and other issues may have occurred that affected the recall rate. Indels with higher variant allele frequency (VAF) may generally be called with higher confidence. Many of the indel calling pipelines had good performance. Some of them performed generally well across all three oncopanels, while others were better for a specific oncopanel. The performance of indel calling can further be improved by restricting the calls to high confidence intervals (HCIs) and coding regions, and by excluding low complexity regions (LCRs). Certain VAF cut-offs could be applied according to the applications.
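
Precision, recall, and F1-score have standard definitions in terms of true positives, false positives, and false negatives. The Python sketch below computes them by comparing a call set against a truth set. Representing each variant as a (chromosome, position, ref, alt) tuple and matching on exact equality are simplifying assumptions made here, and the toy variants are hypothetical. Real benchmarking tools typically normalize indel representations before comparison.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from raw counts; 0.0 when undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def score_calls(called, truth):
    """Score a set of called variants against a truth set by exact match."""
    tp = len(called & truth)   # called and true
    fp = len(called - truth)   # called but not true
    fn = len(truth - called)   # true but missed
    return precision_recall_f1(tp, fp, fn)


# Hypothetical toy variants: (chrom, pos, ref, alt).
truth = {("chr1", 100, "AT", "A"), ("chr1", 250, "G", "GCC"),
         ("chr2", 40, "CAG", "C"), ("chr2", 900, "T", "TA")}
called = {("chr1", 100, "AT", "A"), ("chr1", 250, "G", "GCC"),
          ("chr2", 40, "CAG", "C"), ("chr3", 7, "A", "AG")}

precision, recall, f1 = score_calls(called, truth)
```

Here three of four calls are correct and three of four true indels are recovered, so all three metrics equal 0.75.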


Subjects
High-Throughput Nucleotide Sequencing , INDEL Mutation , Polymorphism, Single Nucleotide
11.
J Agric Food Chem ; 72(17): 10138-10148, 2024 May 01.
Article in English | MEDLINE | ID: mdl-38637271

ABSTRACT

Passion fruit (Passiflora spp.) is an important fruit tree in the family Passifloraceae. The color of the fruit skin, a significant agricultural trait, is determined by the content of anthocyanin in passion fruit. However, the regulatory mechanisms behind the accumulation of anthocyanin in different passion fruit skin colors remain unclear. In this study, we identified and characterized an R2R3-MYB transcription factor, PeMYB114, which functions as a transcriptional activator in anthocyanin biosynthesis. Yeast one-hybrid system and dual-luciferase analysis showed that PeMYB114 could directly activate the expression of anthocyanin structural genes (PeCHS and PeDFR). Furthermore, a natural variation in the promoter region of PeMYB114 alters its expression. PeMYB114purple accessions with the 224-bp insertion have a higher anthocyanin level than PeMYB114yellow accessions with the 224-bp deletion. The findings enhance our understanding of anthocyanin accumulation in fruits and provide genetic resources for genome design for improving passion fruit quality.


Subjects
Anthocyanins , Fruit , Gene Expression Regulation, Plant , Passiflora , Plant Proteins , Promoter Regions, Genetic , Transcription Factors , Anthocyanins/metabolism , Anthocyanins/genetics , Passiflora/genetics , Passiflora/metabolism , Passiflora/chemistry , Fruit/metabolism , Fruit/genetics , Fruit/chemistry , Plant Proteins/genetics , Plant Proteins/metabolism , Transcription Factors/genetics , Transcription Factors/metabolism , INDEL Mutation
12.
BMC Genomics ; 25(1): 428, 2024 Apr 30.
Article in English | MEDLINE | ID: mdl-38689225

ABSTRACT

BACKGROUND: Although many studies have been done to reveal artificial selection signatures in commercial and indigenous chickens, a limited number of genes have been linked to specific traits. To identify more trait-related artificial selection signatures and genes, we re-sequenced a total of 85 individuals of five indigenous chicken breeds with distinct traits from Yunnan Province, China. RESULTS: We found 30 million non-redundant single nucleotide variants and small indels (<50 bp) in the indigenous chickens, of which 10 million were not seen in the 60 broilers, 56 layers and 35 red jungle fowls (RJFs) that we used for comparison. The variants in each breed are enriched in non-coding regions, while those in coding regions are largely tolerant, suggesting that most variants might affect cis-regulatory sequences. Based on 27 million bi-allelic single nucleotide polymorphisms identified in the chickens, we found numerous selective sweeps and affected genes in each indigenous chicken breed and substantially larger numbers of selective sweeps and affected genes in the broilers and layers than previously reported using a rigorous statistical model. Consistent with the locations of the variants, the vast majority (~98.3%) of the identified selective sweeps overlap known quantitative trait loci (QTLs). Meanwhile, 74.2% of known QTLs overlap our identified selective sweeps. We confirmed most of the previously identified trait-related genes and identified many novel ones, some of which might be related to body size and high egg production traits. Using RT-qPCR, we validated differential expression of eight genes (GHR, GHRHR, IGF2BP1, OVALX, ELF2, MGARP, NOCT, SLC25A15) that might be related to body size and high egg production traits in relevant tissues of relevant breeds. CONCLUSION: We identify 30 million single nucleotide variants and small indels in the five indigenous chicken breeds, 10 million of which are novel. We predict substantially more selective sweeps and affected genes than previously reported in both indigenous and commercial breeds. These variants and affected genes are good candidates for further experimental investigations of genotype-phenotype relationships and practical applications in chicken breeding programs.


Subjects
Chickens , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Selection, Genetic , Animals , Chickens/genetics , Genome , INDEL Mutation , Breeding , Phenotype , Genomics/methods
13.
Genome Biol ; 25(1): 101, 2024 Apr 19.
Article in English | MEDLINE | ID: mdl-38641647

ABSTRACT

Many bioinformatics methods seek to reduce reference bias, but no methods exist to comprehensively measure it. Biastools analyzes and categorizes instances of reference bias. It works in various scenarios: when the donor's variants are known and reads are simulated; when donor variants are known and reads are real; and when variants are unknown and reads are real. Using biastools, we observe that more inclusive graph genomes result in fewer biased sites. We find that end-to-end alignment reduces bias at indels relative to local aligners. Finally, we use biastools to characterize how T2T references improve large-scale bias.


Subjects
Genome , Genomics , Genomics/methods , Computational Biology , INDEL Mutation , Bias , Sequence Analysis, DNA/methods , Software , High-Throughput Nucleotide Sequencing/methods
14.
BMC Genomics ; 25(1): 329, 2024 Apr 02.
Article in English | MEDLINE | ID: mdl-38566035

ABSTRACT

BACKGROUND: Previously, a novel multiplex system of 64 loci was constructed based on capillary electrophoresis platform, including 59 autosomal insertion/deletions (A-InDels), two Y-chromosome InDels, two mini short tandem repeats (miniSTRs), and an Amelogenin gene. The aim of this study is to evaluate the efficiencies of this multiplex system for individual identification, paternity testing and biogeographic ancestry inference in Chinese Hezhou Han (CHH) and Hubei Tujia (CTH) groups, providing valuable insights for forensic anthropology and population genetics research. RESULTS: The cumulative values of power of discrimination (CDP) and probability of exclusion (CPE) for the 59 A-InDels and two miniSTRs were 0.99999999999999999999999999754, 0.99999905; and 0.99999999999999999999999999998, 0.99999898 in CTH and CHH groups, respectively. When the likelihood ratio thresholds were set to 1 or 10, more than 95% of the full sibling pairs could be identified from unrelated individual pairs, and the false positive rates were less than 1.2% in both CTH and CHH groups. Biogeographic ancestry inference models based on 35 populations were constructed with three algorithms: random forest, adaptive boosting and extreme gradient boosting, and then 10-fold cross-validation analyses were applied to test these three models with the average accuracies of 86.59%, 84.22% and 87.80%, respectively. In addition, we also investigated the genetic relationships between the two studied groups with 33 reference populations using population statistical methods of FST, DA, phylogenetic tree, PCA, STRUCTURE and TreeMix analyses. The present results showed that compared to other continental populations, the CTH and CHH groups had closer genetic affinities to East Asian populations. CONCLUSIONS: This novel multiplex system has high CDP and CPE in CTH and CHH groups, which can be used as a powerful tool for individual identification and paternity testing. 
According to various genetic analysis methods, the genetic structures of CTH and CHH groups are relatively similar to the reference East Asian populations.
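
Cumulative forensic parameters of this kind are conventionally obtained by combining per-locus values under an independence assumption: the cumulative value is 1 minus the product of the per-locus complements. The Python sketch below illustrates that standard formula. The function names and the per-locus values are hypothetical and are not taken from the study.

```python
import math


def cumulative_complement(per_locus):
    """Return prod(1 - p_i) over loci, assuming independent loci.

    The cumulative power of discrimination (or probability of exclusion)
    is 1 minus this value.
    """
    return math.prod(1.0 - p for p in per_locus)


def nines(per_locus):
    """Number of leading nines in the cumulative value, as -log10 of the complement."""
    return -math.log10(cumulative_complement(per_locus))


# Hypothetical panel: 61 loci, each with a per-locus value of 0.6.
panel = [0.6] * 61
complement = cumulative_complement(panel)   # a very small positive number
naive = 1.0 - complement                    # rounds to exactly 1.0 in floats
```

Working with the complement (or its logarithm) is a deliberate choice: values with more than about 16 nines cannot be represented as `1 - x` in double precision, so the subtraction silently rounds to 1.0.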


Subjects
Genetics, Population , Siblings , Humans , Phylogeny , China , INDEL Mutation , Microsatellite Repeats , Forensic Genetics/methods , Gene Frequency
15.
BMC Genomics ; 25(1): 391, 2024 Apr 22.
Article in English | MEDLINE | ID: mdl-38649797

ABSTRACT

Developmental delay (DD) or intellectual disability (ID) is a very large group of early-onset disorders that affects 1-2% of children worldwide and has diverse genetic causes that should be identified. Genetic studies can elucidate the pathogenesis underlying DD/ID. In this study, whole-exome sequencing (WES) was performed on 225 Chinese DD/ID children (208 cases were sequenced as proband-parent trios) who were classified into seven phenotype subgroups. The phenotype and genomic data of patients with DD/ID were further retrospectively analyzed. In total, 96/225 (42.67%; 95% confidence interval [CI] 36.15-49.18%) patients were found to have causative single nucleotide variants (SNVs) and small insertions/deletions (Indels) associated with DD/ID based on WES data. The diagnostic yields among the seven subgroups ranged from 31.25 to 71.43%. Three specific clinical features (hearing loss, visual loss, and facial dysmorphism) can significantly increase the diagnostic yield of WES in patients with DD/ID (P = 0.005, P = 0.005, and P = 0.039, respectively). Of note, hearing loss (odds ratio [OR] = 1.86; 95% CI = 1.00-3.46, P = 0.046) or abnormal brainstem auditory evoked potential (BAEP) (OR = 1.91, 95% CI = 1.02-3.50, P = 0.042) was independently associated with causative genetic variants in DD/ID children. Our findings enrich the variation spectrum of SNVs/Indels associated with DD/ID, highlight the value of genetic testing for DD/ID children, stress the importance of BAEP screening in DD/ID children, and help to facilitate early diagnosis, clinical management and reproductive decisions, and to improve the therapeutic response to medical treatment.
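
The diagnostic yield and its confidence interval can be reproduced approximately with a short calculation. The Python sketch below uses the normal-approximation (Wald) interval, which is an assumption made here because the abstract does not state which interval method was used. For 96 of 225 it gives a yield of 42.67% and an interval of roughly 36.2% to 49.1%, close to but not identical with the reported 36.15-49.18%.

```python
import math


def wald_ci(successes, n, z=1.96):
    """Proportion with a normal-approximation (Wald) confidence interval.

    z=1.96 corresponds to a two-sided 95% interval.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("require 0 <= successes <= n and n > 0")
    p = successes / n
    half_width = z * math.sqrt(p * (1.0 - p) / n)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Counts taken from the abstract: 96 diagnosed patients out of 225.
yield_, lower, upper = wald_ci(96, 225)
```

Other interval methods (for example Wilson or exact binomial intervals) would give slightly different bounds for the same counts.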


Subjects
Developmental Disabilities , Exome Sequencing , Intellectual Disability , Child , Child, Preschool , Female , Humans , Infant , Male , Developmental Disabilities/genetics , Developmental Disabilities/diagnosis , East Asian People/genetics , INDEL Mutation , Intellectual Disability/genetics , Phenotype , Polymorphism, Single Nucleotide
16.
Cell ; 187(8): 1955-1970.e23, 2024 Apr 11.
Article in English | MEDLINE | ID: mdl-38503282

ABSTRACT

Characterizing somatic mutations in the brain is important for disentangling the complex mechanisms of aging, yet little is known about mutational patterns in different brain cell types. Here, we performed whole-genome sequencing (WGS) of 86 single oligodendrocytes, 20 mixed glia, and 56 single neurons from neurotypical individuals spanning 0.4-104 years of age and identified >92,000 somatic single-nucleotide variants (sSNVs) and small insertions/deletions (indels). Although both cell types accumulate somatic mutations linearly with age, oligodendrocytes accumulated sSNVs 81% faster than neurons and indels 28% slower than neurons. Correlation of mutations with single-nucleus RNA profiles and chromatin accessibility from the same brains revealed that oligodendrocyte mutations are enriched in inactive genomic regions and are distributed across the genome similarly to mutations in brain cancers. In contrast, neuronal mutations are enriched in open, transcriptionally active chromatin. These stark differences suggest an assortment of active mutagenic processes in oligodendrocytes and neurons.


Subjects
Aging , Brain , Neurons , Oligodendroglia , Humans , Aging/genetics , Aging/pathology , Chromatin/genetics , Chromatin/metabolism , Mutation , Neurons/metabolism , Neurons/pathology , Oligodendroglia/metabolism , Oligodendroglia/pathology , Single-Cell Gene Expression Analysis , Whole Genome Sequencing , Brain/metabolism , Brain/pathology , Polymorphism, Single Nucleotide , INDEL Mutation , Biological Specimen Banks , Oligodendrocyte Precursor Cells/metabolism , Oligodendrocyte Precursor Cells/pathology
17.
Bioinformatics ; 40(3)2024 Mar 04.
Article in English | MEDLINE | ID: mdl-38426352

ABSTRACT

MOTIVATION: Intra-host variants refer to genetic variations or mutations that occur within an individual host organism. These variants are typically studied in the context of viruses, bacteria, or other pathogens to understand the evolution of pathogens. Moreover, intra-host variants are also explored in the field of tumor biology and mitochondrial biology to characterize somatic mutations and inherited heteroplasmic mutations. Intra-host variants can involve long insertions, deletions, and combinations of different mutation types, which poses challenges in their identification. The performance of current methods in detecting complex intra-host variants is unknown. RESULTS: First, we simulated a dataset comprising 10 samples with 1869 intra-host variants involving various mutation patterns and benchmarked current variant detection software. The results indicated that though current software can detect most variants with F1-scores between 0.76 and 0.97, their performance in detecting long indels and low frequency variants was limited. Thus, we developed a new software tool, PySNV, for the detection of complex intra-host variations. On the simulated dataset, PySNV successfully detected 1863 variant cases (F1-score: 0.99) and exhibited the highest Pearson correlation coefficient (PCC: 0.99) to the ground truth in predicting variant frequencies. The results demonstrated that PySNV delivered promising performance even for long indels and low frequency variants, while maintaining computational speed comparable to other methods. Finally, we tested its performance on SARS-CoV-2 replicate sequencing data and found that it reported 21% more variants compared to LoFreq, the best-performing benchmarked software, while showing higher consistency (62% over 54%) within replicates. The discrepancies mostly exist in low-depth regions and low frequency variants. AVAILABILITY AND IMPLEMENTATION: https://github.com/bnuLyndon/PySNV/.
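
The Pearson correlation coefficient used to compare predicted and true variant frequencies can be computed directly from its definition. The Python sketch below is a generic implementation with hypothetical example frequencies. It is not code from PySNV or from the benchmark.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical variant frequencies: simulated truth vs. a caller's estimates.
true_freq = [0.02, 0.10, 0.25, 0.50, 0.90]
predicted = [0.03, 0.09, 0.27, 0.48, 0.91]
pcc = pearson(true_freq, predicted)
```

A value near 1 indicates that the estimated frequencies track the true ones closely. It does not by itself show that the estimates are unbiased, since a constant offset or scaling leaves the coefficient unchanged.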


Subjects
High-Throughput Nucleotide Sequencing , Software , High-Throughput Nucleotide Sequencing/methods , Mutation , INDEL Mutation , Genetic Variation
18.
Sci Rep ; 14(1): 7028, 2024 Mar 25.
Article in English | MEDLINE | ID: mdl-38528062

ABSTRACT

Accurate indel calling plays an important role in precision medicine. A benchmarking indel set is essential for thoroughly evaluating the indel calling performance of bioinformatics pipelines. A reference sample with a set of known-positive variants was developed in the FDA-led Sequencing Quality Control Phase 2 (SEQC2) project, but the known indels in the known-positive set were limited. This project sought to provide an enriched set of known indels that would be more translationally relevant by focusing on additional cancer-related regions. A thorough manual review process completed by 42 reviewers, two advisors, and a judging panel of three researchers enriched the known indel set by an additional 516 indels. The extended benchmarking indel set covers a large range of variant allele frequencies (VAFs), with 87% of the indels having a VAF below 20% in reference Sample A. Reference Sample A and the indel set can therefore be used for comprehensive benchmarking of indel calling across a wider range of VAF values, particularly in the lower range. Indel length was also variable, but the majority were under 10 base pairs (bp). Most of the indels were within coding regions, with the remainder in gene regulatory regions. Although high confidence can be derived from the robust study design and meticulous human review, this extensive indel set has not undergone orthogonal validation. The extended benchmarking indel set, along with the indels in the previously published known-positive set, was the truth set used to benchmark indel calling pipelines in a community challenge hosted on the precisionFDA platform. This benchmarking indel set and the reference samples can be utilized for a comprehensive evaluation of indel calling pipelines. Additionally, the insights and solutions obtained during the manual review process can aid in improving the performance of these pipelines.
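As a brief illustration of the VAF summary quoted in this abstract, the sketch below computes a variant allele frequency as the share of reads supporting the alternate allele, and then the fraction of variants falling below a threshold. It assumes VAF is defined as alternate-supporting reads divided by total reads at the site; the function names and read counts are hypothetical and are not data from the SEQC2 study.

```python
def variant_allele_frequency(alt_reads, total_reads):
    """VAF as alternate-supporting reads divided by total reads at the site."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("alt_reads must lie between 0 and total_reads")
    return alt_reads / total_reads


def fraction_below(vafs, threshold=0.20):
    """Fraction of variants whose VAF is strictly below the threshold."""
    if not vafs:
        raise ValueError("vafs must not be empty")
    return sum(1 for vaf in vafs if vaf < threshold) / len(vafs)


# Hypothetical (alt_reads, total_reads) pairs for four indels.
read_counts = [(5, 100), (30, 100), (12, 80), (45, 90)]
vafs = [variant_allele_frequency(alt, total) for alt, total in read_counts]
low_vaf_fraction = fraction_below(vafs, threshold=0.20)
print(vafs, low_vaf_fraction)
```

A statement such as "87% of indels have a VAF below 20%" corresponds to `fraction_below` returning 0.87 for the full indel set.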


Subjects
Benchmarking , High-Throughput Nucleotide Sequencing , Humans , Computational Biology , Quality Control , INDEL Mutation , Polymorphism, Single Nucleotide
19.
CRISPR J ; 7(1): 29-40, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38353621

ABSTRACT

The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas9 system has been widely used to create animal models for biomedical and agricultural use owing to its low cost and easy handling. However, the occurrence of erroneous cleavage (off-targeting) may raise certain concerns for the practical application of the CRISPR-Cas9 system. In this study, we created a melanocortin 1 receptor (MC1R)-edited pig model through somatic cell nuclear transfer (SCNT) by using porcine kidney cells modified by the CRISPR-Cas9 system. We then carried out whole-genome sequencing of two MC1R-edited pigs and two cloned wild-type siblings, together with the donor cells, to assess the genome-wide presence of single-nucleotide variants and small insertions and deletions (indels) and found only one candidate off-target indel in both MC1R-edited pigs. In summary, our study indicates that the minimal off-targeting effect induced by CRISPR-Cas9 may not be a major concern in gene-edited pigs created by SCNT.


Subjects
CRISPR-Cas Systems , Receptor, Melanocortin, Type 1 , Animals , Swine/genetics , Receptor, Melanocortin, Type 1/genetics , CRISPR-Cas Systems/genetics , Gene Editing , Mutation , INDEL Mutation/genetics
20.
Cells ; 13(3), 2024 Jan 30.
Article in English | MEDLINE | ID: mdl-38334653

ABSTRACT

Successful genome editing depends on the cleavage efficiency of programmable nucleases (PNs) such as the CRISPR-Cas system. Various methods have been developed to assess the efficiency of PNs, most of which estimate the occurrence of indels caused by PN-induced double-strand breaks. In these methods, PN genomic target sites are amplified by PCR, and the resulting PCR products are analyzed using Sanger sequencing, high-throughput sequencing, or mismatch detection assays. Among these methods, Sanger sequencing of PCR products followed by indel analysis using online web tools has gained popularity due to its user-friendly nature. This approach estimates indel frequencies by computationally analyzing sequencing trace data. However, the accuracy of these computational tools remains uncertain. In this study, we compared the performance of four web tools, TIDE, ICE, DECODR, and SeqScreener, using artificial sequencing templates with predetermined indels. Our results demonstrated that these tools estimated indel frequency with acceptable accuracy when the indels were simple and contained only a few base changes. However, the estimated values became more variable among the tools when the sequencing templates contained more complex indels or knock-in sequences. Moreover, although these tools effectively estimated the net indel sizes, their ability to deconvolute indel sequences was variable and subject to certain limitations. These findings underscore the importance of selecting an appropriate tool and using it with caution, depending on the type of genome editing being performed.


Subjects
CRISPR-Cas Systems , Gene Editing , Gene Editing/methods , CRISPR-Cas Systems/genetics , INDEL Mutation/genetics , Genome/genetics , Genomics