Results 1 - 20 of 3,739
1.
Genome Biol Evol ; 16(5), 2024 May 02.
Article in English | MEDLINE | ID: mdl-38735759

ABSTRACT

A fundamental goal in evolutionary biology and population genetics is to understand how selection shapes the fate of new mutations. Here, we test the null hypothesis that insertion-deletion (indel) events in protein-coding regions occur randomly with respect to secondary structures. We identified indels across 11,444 sequence alignments in mouse, rat, human, chimp, and dog genomes and then quantified their overlap with four different types of secondary structure (alpha helices, beta strands, protein bends, and protein turns) predicted by the deep-learning methods of AlphaFold2. Indels overlapped secondary structures 54% as much as expected and were especially underrepresented over beta strands, which tend to form internal, stable regions of proteins. In contrast, indels were enriched by 155% over regions without any predicted secondary structures. These skews were stronger in the rodent lineages compared to the primate lineages, consistent with population genetic theory predicting that natural selection will be more efficient in species with larger effective population sizes. Nonsynonymous substitutions were also less common in regions of protein secondary structure, although not as strongly reduced as in indels. In a complementary analysis of thousands of human genomes, we showed that indels overlapping secondary structure segregated at significantly lower frequency than indels outside of secondary structure. Taken together, our study shows that indels are selected against if they overlap secondary structure, presumably because they disrupt the tertiary structure and function of a protein.


Subject(s)
INDEL Mutation, Protein Secondary Structure, Humans, Animals, Mice, Rats, Molecular Evolution, Proteins/genetics, Proteins/chemistry, Dogs, Genetic Selection, Genome
2.
BMC Genomics ; 25(1): 475, 2024 May 14.
Article in English | MEDLINE | ID: mdl-38745120

ABSTRACT

BACKGROUND: Single nucleotide polymorphism (SNP) markers play significant roles in accelerating breeding and basic crop research. Several soybean SNP panels have been developed. However, there is still a lack of SNP panels for differentiating between wild and cultivated populations, as well as for detecting polymorphisms within both wild and cultivated populations. RESULTS: This study utilized publicly available resequencing data from over 3,000 soybean accessions to identify differentiating and highly conserved SNP and insertion/deletion (InDel) markers between wild and cultivated soybean populations. Additionally, a naturally occurring mutant gene library was constructed by analyzing large-effect SNPs and InDels in the population. CONCLUSION: The markers obtained in this study are associated with numerous genes governing agronomic traits, thus facilitating the evaluation of soybean germplasms and the efficient differentiation between wild and cultivated soybeans. The natural mutant gene library permits the quick identification of individuals with natural mutations in functional genes, providing convenience for accelerating soybean breeding using reverse genetics.


Subject(s)
Glycine max, INDEL Mutation, Single Nucleotide Polymorphism, Glycine max/genetics, Plant Genome, Gene Library, Plant Breeding
3.
PLoS One ; 19(5): e0302870, 2024.
Article in English | MEDLINE | ID: mdl-38776345

ABSTRACT

The systematic identification of insertion/deletion (InDel) length polymorphisms from the entire lentil genome can be used to map the quantitative trait loci (QTL) and also for the marker-assisted selection (MAS) for various linked traits. The InDels were identified by comparing the whole-genome resequencing (WGRS) data of two extreme bulks (early- and late-flowering bulk) and a parental genotype (Globe Mutant) of lentil. The bulks were made by pooling 20 extreme recombinant inbred lines (RILs) each, derived by crossing Globe Mutant (late flowering parent) with L4775 (early flowering parent). Finally, 734,716 novel InDels were identified, which is nearly one InDel per 5,096 bp of lentil genome. Furthermore, 74.94% of InDels were within the intergenic region and 99.45% displayed modifier effects. Of these, 15,732 had insertions or deletions of 20 bp or more, making them amenable to the development of PCR-based markers. An InDel marker I-SP-356.6 (chr. 3; position 356,687,623; positioned 174.5 Kb from the LcFRI gene) was identified as having a phenotypic variance explained (PVE) value of 47.7% for earliness when validated in a RIL population. Thus, I-SP-356.6 marker can be deployed in MAS to facilitate the transfer of the earliness trait to other elite late-maturing cultivars. Two InDel markers viz., I-SP-356.6 and I-SP-383.9 (chr. 3; linked to LcELF3a gene) when tested in 9 lentil genotypes differing for maturity duration, clearly distinguished three early (L4775, ILL7663, Precoz) and four late genotypes (Globe Mutant, MFX, L4602, L830). However, these InDels could not be validated in two genotypes (L4717, L4727), suggesting either absence of polymorphism and/or presence of other loci causing earliness. The identified InDel markers can act as valuable tools for MAS for the development of early maturing lentil varieties.


Subject(s)
Plant Genome, Genotype, INDEL Mutation, Lens (Plant), Quantitative Trait Loci, Lens (Plant)/genetics, Lens (Plant)/growth & development, Genetic Markers, Polymerase Chain Reaction/methods, Chromosome Mapping/methods
4.
F1000Res ; 13: 146, 2024.
Article in English | MEDLINE | ID: mdl-38779312

ABSTRACT

Background: Previous studies have linked genetics to knee osteoarthritis (OA). The angiotensin-converting enzyme (ACE) gene I/D polymorphism may cause OA; however, the evidence remains inconsistent. This study examines the association between knee OA risk and the ACE gene I/D polymorphism. Methods: We searched Europe PMC, Medline, Scopus, and Cochrane Library using keywords. Three bias domains were assessed using the Newcastle-Ottawa Scale (NOS). Inclusion criteria: (1) the study population was split into knee OA patients and healthy controls; (2) the ACE gene I/D polymorphism was analysed; (3) the design was a case-control or cross-sectional survey. Studies of non-knee OA, with incomplete data, or without full text were excluded. Odds ratios (OR) and 95% confidence intervals (95% CI) were calculated using random-effects models. Results: A total of 6 case-control studies comprising 1,226 patients with knee OA and 1,145 healthy controls were included. Our pooled analysis revealed a significant association between the ACE gene I/D polymorphism and risk of knee OA only in the dominant (DD + ID vs. II) [OR 1.69 (95% CI 1.14-2.50), p = 0.009, I² = 72%] and ID vs. II [OR 1.37 (95% CI 1.01-1.86), p = 0.04, I² = 43%] genotype models. Other genotype models, including the recessive (DD vs. ID + II), allele (D vs. I), DD vs. ID, and DD vs. II models, did not show a significant association with knee OA risk. Further regression analysis revealed that ethnicity and sex may influence those relationships in several genotype models. Conclusions: The dominant and ID vs. II genotype models of the ACE gene I/D polymorphism were associated with significantly increased knee OA risk. More research with larger samples and different ethnic groups is needed to confirm our findings. After ethnicity subgroup analysis, some genetic models in our study showed significant heterogeneity, and most studies are from Asian countries with Asian populations, with little evidence on Arabs.


Subject(s)
Genetic Predisposition to Disease, Knee Osteoarthritis, Peptidyl-Dipeptidase A, Genetic Polymorphism, Humans, Case-Control Studies, Genetic Association Studies, INDEL Mutation, Knee Osteoarthritis/genetics, Peptidyl-Dipeptidase A/genetics, Risk Factors
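
As a rough illustration of the kind of per-study estimate that feeds a pooled analysis like the one in the abstract above, the following Python sketch computes an odds ratio and its 95% CI from a single 2x2 genotype table using the standard log (Woolf) method. The function name `odds_ratio_ci` and the counts are hypothetical, and the random-effects pooling reported by the authors is not reproduced here.

```python
import math


def odds_ratio_ci(case_exp, case_unexp, ctrl_exp, ctrl_unexp, z=1.96):
    """Odds ratio with a Wald confidence interval on the log scale.

    case_exp / case_unexp: cases with / without the risk genotype
    ctrl_exp / ctrl_unexp: controls with / without the risk genotype
    All four counts must be positive (no continuity correction is applied).
    """
    odds_ratio = (case_exp * ctrl_unexp) / (case_unexp * ctrl_exp)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(
        1 / case_exp + 1 / case_unexp + 1 / ctrl_exp + 1 / ctrl_unexp
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for a dominant model (DD + ID vs. II); not from the study.
or_value, ci_low, ci_high = odds_ratio_ci(20, 10, 10, 20)
print(f"OR = {or_value:.2f}, 95% CI {ci_low:.2f}-{ci_high:.2f}")
```

Because the interval is symmetric on the log scale, the product of its two limits equals the squared odds ratio, which is a convenient sanity check on reported values.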
5.
Funct Integr Genomics ; 24(3): 104, 2024 May 20.
Article in English | MEDLINE | ID: mdl-38764005

ABSTRACT

Accurate estimation of population allele frequency (AF) is crucial for gene discovery and genetic diagnostics. However, determining AF for frameshift-inducing small insertions and deletions (indels) faces challenges due to discrepancies in mapping and variant calling methods. Here, we propose an innovative approach to assess indel AF. We developed CRAFTS-indels (Calculating Regional Allele Frequency Targeting Small indels), an algorithm that combines AF of distinct indels within a given region and provides "regional AF" (rAF). We tested and validated CRAFTS-indels using three independent datasets: gnomAD v2 (n = 125,748 samples), an internal dataset (IGM; n = 39,367), and the UK BioBank (UKBB; n = 469,835). By comparing rAF against standard AF, we identified rare indels with rAF exceeding standard AF (sAF ≤ 10⁻⁴ and rAF > 10⁻⁴) as "rAF-hi" indels. Notably, a high percentage of rare indels were "rAF-hi", with a higher proportion in gnomAD v2 (11-20%) and IGM (11-22%) compared to the UKBB (5-9% depending on the CRAFTS-indels parameters). Analysis of the overlap of regions based on their rAF with low complexity regions and with ClinVar classification supported the pertinence of rAF. Using the internal dataset, we illustrated the utility of CRAFTS-indels in the analysis of de novo variants and the potential negative impact of rAF-hi indels in gene discovery. In summary, annotation of indels with cohort-specific rAF can be used to handle some of the limitations of current annotation pipelines and facilitate detection of novel gene-disease associations. CRAFTS-indels offers a user-friendly approach to providing rAF annotation. It can be integrated into public databases such as gnomAD and UKBB and used by ClinVar to revise indel classifications.


Subject(s)
Gene Frequency, INDEL Mutation, Humans, Algorithms
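
The idea of a regional allele frequency described in the abstract above can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the abstract says that AFs of distinct indels within a region are combined, but it does not say how, so the summation, the fixed `window` in base pairs, and the names `Indel`, `regional_af`, and `is_raf_hi` are all hypothetical and are not the CRAFTS-indels implementation. Only the 10⁻⁴ threshold for "rAF-hi" comes from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Indel:
    chrom: str
    pos: int
    af: float  # standard allele frequency (sAF) of this indel


def regional_af(target, indels, window=10):
    """Sum the AFs of all indels within `window` bp of `target` (assumption).

    `target` itself contributes if it is present in `indels`.
    """
    return sum(
        v.af
        for v in indels
        if v.chrom == target.chrom and abs(v.pos - target.pos) <= window
    )


def is_raf_hi(target, indels, window=10, threshold=1e-4):
    """Flag a rare indel (sAF <= threshold) whose regional AF exceeds threshold."""
    return target.af <= threshold and regional_af(target, indels, window) > threshold


# Hypothetical data: two rare neighbouring indels that are jointly not rare.
cohort = [
    Indel("1", 100, 6e-5),
    Indel("1", 104, 7e-5),
    Indel("1", 500, 9e-5),
]
print([is_raf_hi(v, cohort) for v in cohort])
```

The design point is that each neighbouring indel looks rare in isolation, while the region as a whole does not, which is the situation the rAF annotation is meant to expose.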
6.
Theor Appl Genet ; 137(6): 136, 2024 May 20.
Article in English | MEDLINE | ID: mdl-38764078

ABSTRACT

KEY MESSAGE: Different kinship and resistance to cotton leaf curl disease (CLCuD) and heat were found between upland cotton cultivars from China and Pakistan. 175 SNP and 82 InDel loci related to yield, fiber quality, CLCuD, and heat resistance were identified. Elite alleles found in Pakistani accessions aided local adaptation to the climatic conditions of the two countries. Adaptation of upland cotton (Gossypium hirsutum) beyond its center of origin is expected to be driven by tailoring of the genome and genes to enhance yield and quality in new ecological niches. Here, resequencing of 456 upland cotton accessions revealed two distinct kinships according to the associated country. Fiber quality and lint percentage were consistent across kinships, but resistance to cotton leaf curl disease (CLCuD) and heat was distinctly exhibited by accessions from Pakistan, illustrating highly local adaptation. A total of 175 SNP and 82 InDel loci related to yield, fiber quality, CLCuD and heat resistance were identified; among them, only two overlapped between Pakistani and Chinese accessions, underscoring the divergent domestication and improvement targets in each country. Loci associated with resistance alleles to leaf curl disease and high temperature were largely found in Pakistani accessions to counter these stresses prevalent in Pakistan. These results revealed that breeding activities led to the accumulation of unique alleles and helped upland cotton become adapted to the respective climatic conditions, which will contribute to elucidating the genetic mechanisms that underlie resilience traits and help develop climate-resilient cotton cultivars for use worldwide.


Subject(s)
Gossypium, Single Nucleotide Polymorphism, Gossypium/genetics, Pakistan, China, Disease Resistance/genetics, Plant Diseases/genetics, INDEL Mutation, Physiological Adaptation/genetics, Plant Genome, Alleles, Plant Breeding, Cotton Fiber, Phenotype
7.
Orphanet J Rare Dis ; 19(1): 209, 2024 May 21.
Article in English | MEDLINE | ID: mdl-38773661

ABSTRACT

BACKGROUND: Marfan syndrome (MFS) is an autosomal dominant connective tissue disease with wide clinical heterogeneity, and mainly caused by pathogenic variants in fibrillin-1 (FBN1). METHODS: A Chinese 4-generation MFS pedigree with 16 family members was recruited and exome sequencing (ES) was performed in the proband. Transcript analysis (patient RNA and minigene assays) and in silico structural analysis were used to determine the pathogenicity of the variant. In addition, germline mosaicism in family member (I:1) was assessed using quantitative fluorescent polymerase chain reaction (QF-PCR) and short tandem repeat PCR (STR) analyses. RESULTS: Two cis-compound benign intronic variants of FBN1 (c.3464-4A>G and c.3464-5G>A) were identified in the proband by ES. As a compound variant, c.3464-5_3464-4delGAinsAG was found to be pathogenic and co-segregated with MFS. RNA studies indicated that aberrant transcripts were found only in patients and mutant-type clones. The variant c.3464-5_3464-4delGAinsAG caused erroneous integration of a 3 bp sequence into intron 28 and resulted in the insertion of one amino acid in the protein sequence (p.Ile1154_Asp1155insAla). Structural analyses suggested that p.Ile1154_Asp1155insAla affected the protein's secondary structure by interfering with one disulfide bond between Cys1140 and Cys1153 and causing the extension of an anti-parallel β sheet in the calcium-binding epidermal growth factor-like (cbEGF)13 domain. In addition, the asymptomatic family member I:1 was deduced to be a gonadal mosaic as assessed by inconsistent results of sequencing and STR analysis. CONCLUSIONS: To our knowledge, FBN1 c.3464-5_3464-4delGAinsAG is the first identified pathogenic intronic indel variant affecting non-canonical splice sites in this gene. Our study reinforces the importance of assessing the pathogenic role of intronic variants at the mRNA level, with structural analysis, and the occurrence of mosaicism.


Subject(s)
Fibrillin-1, Introns, Marfan Syndrome, Mosaicism, Pedigree, Humans, Fibrillin-1/genetics, Marfan Syndrome/genetics, Marfan Syndrome/pathology, Female, Male, Adult, Introns/genetics, INDEL Mutation/genetics, Middle Aged, Adipokines
8.
Physiol Genomics ; 56(6): 436-444, 2024 Jun 01.
Article in English | MEDLINE | ID: mdl-38586874

ABSTRACT

This study aimed to investigate the relationship between pre- and postexercise cardiac biomarker release according to athletic status (trained vs. untrained) and to establish whether the I/D polymorphism in the angiotensin-converting enzyme (ACE) gene had an influence on cardiac biomarkers release with specific regard on the influence of the training state. We determined cardiac troponin I (cTnI) and N-terminal pro-brain natriuretic peptide (NT-proBNP) in 29 trained and 27 untrained male soccer players before and after moderate-intensity continuous exercise (MICE) and high-intensity interval exercise (HIIE) running tests. Trained soccer players had higher pre (trained: 0.014 ± 0.007 ng/mL; untrained: 0.010 ± 0.005 ng/mL) and post HIIE (trained: 0.031 ± 0.008 ng/mL; untrained: 0.0179 ± 0.007) and MICE (trained: 0.030 ± 0.007 ng/mL; untrained: 0.018 ± 0.007) cTnI values than untrained subjects, but the change with exercise (ΔcTnI) was similar between groups. There was no significant difference in baseline and postexercise NT-proBNP between groups. NT-proBNP levels were elevated after both HIIE and MICE. Considering three ACE genotypes, the mean pre exercise cTnI values of the trained group (DD: 0.015 ± 0.008 ng/mL, ID: 0.015 ± 0.007 ng/mL, and II: 0.014 ± 0.008 ng/mL) and their untrained counterparts (DD: 0.010 ± 0.004 ng/mL, ID: 0.011 ± 0.004 ng/mL, and II: 0.010 ± 0.006 ng/mL) did not show any significant difference. To sum up, noticeable difference in baseline cTnI was observed, which was related to athletic status but not ACE genotypes. 
Neither athletic status nor ACE genotypes seemed to affect the changes in cardiac biomarkers in response to HIIE and MICE, indicating that the ACE gene does not play a significant role in the release of exercise-induced cardiac biomarkers indicative of cardiac damage in Iranian soccer players. NEW & NOTEWORTHY: Our study investigated the impact of athletic status and angiotensin-converting enzyme (ACE) gene I/D polymorphism on cardiac biomarkers in soccer players. Trained players showed higher baseline cardiac troponin I (cTnI) levels, whereas postexercise ΔcTnI remained consistent across groups. N-terminal pro-brain natriuretic peptide increased after exercise in both groups, staying within normal limits. ACE genotypes did not significantly affect pre-exercise cTnI. Overall, athletic status influences baseline cTnI, but neither it nor ACE genotypes significantly impact exercise-induced cardiac biomarker responses in this population.


Subject(s)
Biomarkers, Exercise, Brain Natriuretic Peptide, Peptide Fragments, Peptidyl-Dipeptidase A, Genetic Polymorphism, Troponin I, Male, Humans, Peptidyl-Dipeptidase A/genetics, Biomarkers/blood, Brain Natriuretic Peptide/blood, Brain Natriuretic Peptide/genetics, Troponin I/blood, Troponin I/genetics, Peptide Fragments/blood, Exercise/physiology, Young Adult, Adult, High-Intensity Interval Training/methods, Soccer/physiology, INDEL Mutation/genetics, Heart/physiology
9.
BMC Genomics ; 25(1): 329, 2024 Apr 02.
Article in English | MEDLINE | ID: mdl-38566035

ABSTRACT

BACKGROUND: Previously, a novel multiplex system of 64 loci was constructed based on capillary electrophoresis platform, including 59 autosomal insertion/deletions (A-InDels), two Y-chromosome InDels, two mini short tandem repeats (miniSTRs), and an Amelogenin gene. The aim of this study is to evaluate the efficiencies of this multiplex system for individual identification, paternity testing and biogeographic ancestry inference in Chinese Hezhou Han (CHH) and Hubei Tujia (CTH) groups, providing valuable insights for forensic anthropology and population genetics research. RESULTS: The cumulative values of power of discrimination (CDP) and probability of exclusion (CPE) for the 59 A-InDels and two miniSTRs were 0.99999999999999999999999999754, 0.99999905; and 0.99999999999999999999999999998, 0.99999898 in CTH and CHH groups, respectively. When the likelihood ratio thresholds were set to 1 or 10, more than 95% of the full sibling pairs could be identified from unrelated individual pairs, and the false positive rates were less than 1.2% in both CTH and CHH groups. Biogeographic ancestry inference models based on 35 populations were constructed with three algorithms: random forest, adaptive boosting and extreme gradient boosting, and then 10-fold cross-validation analyses were applied to test these three models with the average accuracies of 86.59%, 84.22% and 87.80%, respectively. In addition, we also investigated the genetic relationships between the two studied groups with 33 reference populations using population statistical methods of FST, DA, phylogenetic tree, PCA, STRUCTURE and TreeMix analyses. The present results showed that compared to other continental populations, the CTH and CHH groups had closer genetic affinities to East Asian populations. CONCLUSIONS: This novel multiplex system has high CDP and CPE in CTH and CHH groups, which can be used as a powerful tool for individual identification and paternity testing. 
According to various genetic analysis methods, the genetic structures of CTH and CHH groups are relatively similar to the reference East Asian populations.


Subject(s)
Population Genetics, Siblings, Humans, Phylogeny, China, INDEL Mutation, Microsatellite Repeats, Forensic Genetics/methods, Gene Frequency
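
Cumulative values such as the CDP and CPE quoted in the abstract above are conventionally obtained by combining per-locus values as one minus the product of the per-locus complements, assuming independent loci. The Python sketch below shows that calculation; the function names and per-locus values are hypothetical, and the abstract does not state which software the authors used.

```python
import math


def combined_complement(per_locus_values):
    """Return prod(1 - v_i), i.e. 1 - CDP (or 1 - CPE) for independent loci."""
    return math.prod(1.0 - v for v in per_locus_values)


def combined_power(per_locus_values):
    """Return the cumulative value 1 - prod(1 - v_i)."""
    return 1.0 - combined_complement(per_locus_values)


# Three hypothetical loci, each with power of discrimination 0.5.
print(combined_power([0.5, 0.5, 0.5]))

# With many loci the cumulative value is indistinguishable from 1 in
# double precision, so the complement is the more informative number.
print(combined_complement([0.6] * 59))
```

Reporting the complement, or using arbitrary-precision arithmetic, avoids the rounding to 1.0 that would otherwise hide differences between panels whose cumulative values differ only after twenty or more decimal places.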
10.
Anim Biotechnol ; 35(1): 2337751, 2024 Nov.
Article in English | MEDLINE | ID: mdl-38597900

ABSTRACT

Economic efficiency, with the aim of enhancing productivity, is a focal point for the improvement of sheep breeding. Recent studies highlight the involvement of the Early Region 2 Binding Factor transcription factor 8 (E2F8) gene in female reproduction. Our group's recent genome-wide association study (GWAS) emphasizes the potential impact of the E2F8 gene on prolificacy traits in Australian White sheep (AUW). Herein, the purpose of this study was to assess the correlation of the E2F8 gene with litter size in the AUW sheep breed. This work encompassed 659 AUW sheep, subject to genotyping through PCR-based genotyping technology. The results of PCR-based genotyping showed significant associations between the P1-del-32bp InDel and the litter size of the fourth and fifth parities in AUW sheep; the litter size of those with genotype ID was superior to that of those with DD and II genotypes. Thus, these results indicate that the P1-del-32bp InDel within the E2F8 gene can be useful in marker-assisted selection (MAS) in sheep.


Subject(s)
Genome-Wide Association Study, INDEL Mutation, Female, Animals, Sheep/genetics, Pregnancy, Australia, Litter Size/genetics, Genotype, INDEL Mutation/genetics
11.
Sci Rep ; 14(1): 8165, 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-38589653

ABSTRACT

Accurately calling indels with next-generation sequencing (NGS) data is critical for clinical application. The precisionFDA team collaborated with the U.S. Food and Drug Administration's (FDA's) National Center for Toxicological Research (NCTR) and successfully completed the NCTR Indel Calling from Oncopanel Sequencing Data Challenge, to evaluate the performance of indel calling pipelines. Top performers were selected based on precision, recall, and F1-score. The performance of many other pipelines was close to the top performers, which produced a top cluster of performers. The performance was significantly higher in high confidence regions and coding regions, and significantly lower in low complexity regions. Oncopanel capture and other issues may have occurred that affected the recall rate. Indels with higher variant allele frequency (VAF) may generally be called with higher confidence. Many of the indel calling pipelines had good performance. Some of them performed generally well across all three oncopanels, while others were better for a specific oncopanel. The performance of indel calling can be further improved by restricting the calls to high confidence intervals (HCIs) and coding regions, and by excluding low complexity regions (LCRs). Certain VAF cut-offs could be applied according to the application.


Subject(s)
High-Throughput Nucleotide Sequencing, INDEL Mutation, Single Nucleotide Polymorphism
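
The three ranking metrics named in the abstract above (precision, recall, and F1-score) can be illustrated with a short Python sketch that compares a call set against a truth set by exact match. The function name `score_calls`, the tuple representation `(chrom, pos, ref, alt)`, and the example variants are assumptions made for illustration; the challenge's actual comparison procedure is not described in the abstract and may normalise or match indels differently.

```python
def score_calls(truth, calls):
    """Return (precision, recall, f1) for two sets of variant tuples.

    A call counts as a true positive only if it matches a truth variant
    exactly; real benchmarks usually normalise indel representation first.
    """
    true_pos = len(truth & calls)
    false_pos = len(calls - truth)
    false_neg = len(truth - calls)

    precision = true_pos / (true_pos + false_pos) if calls else 0.0
    recall = true_pos / (true_pos + false_neg) if truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical truth set (5 indels) and call set (4 calls, 3 of them correct).
truth_set = {
    ("1", 100, "AT", "A"),
    ("1", 250, "G", "GCC"),
    ("2", 40, "CAG", "C"),
    ("2", 900, "T", "TA"),
    ("3", 15, "GA", "G"),
}
call_set = {
    ("1", 100, "AT", "A"),
    ("1", 250, "G", "GCC"),
    ("2", 40, "CAG", "C"),
    ("3", 77, "C", "CT"),
}
print(score_calls(truth_set, call_set))
```

Restricting both sets to a region of interest (for example, high confidence or coding intervals) before calling the function would give region-specific scores of the kind the abstract compares.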
12.
Genome Biol ; 25(1): 101, 2024 Apr 19.
Article in English | MEDLINE | ID: mdl-38641647

ABSTRACT

Many bioinformatics methods seek to reduce reference bias, but no methods exist to comprehensively measure it. Biastools analyzes and categorizes instances of reference bias. It works in various scenarios: when the donor's variants are known and reads are simulated; when donor variants are known and reads are real; and when variants are unknown and reads are real. Using biastools, we observe that more inclusive graph genomes result in fewer biased sites. We find that end-to-end alignment reduces bias at indels relative to local aligners. Finally, we use biastools to characterize how T2T references improve large-scale bias.


Subject(s)
Genome, Genomics, Genomics/methods, Computational Biology, INDEL Mutation, Bias, DNA Sequence Analysis/methods, Software, High-Throughput Nucleotide Sequencing/methods
13.
BMC Genomics ; 25(1): 391, 2024 Apr 22.
Article in English | MEDLINE | ID: mdl-38649797

ABSTRACT

Developmental delay (DD) or intellectual disability (ID) is a very large group of early-onset disorders that affects 1-2% of children worldwide and has diverse genetic causes that should be identified. Genetic studies can elucidate the pathogenesis underlying DD/ID. In this study, whole-exome sequencing (WES) was performed on 225 Chinese DD/ID children (208 cases were sequenced as proband-parent trios) who were classified into seven phenotype subgroups. The phenotype and genomic data of patients with DD/ID were further retrospectively analyzed. Based on WES data, 96/225 (42.67%; 95% confidence interval [CI] 36.15-49.18%) patients were found to have causative single nucleotide variants (SNVs) and small insertions/deletions (Indels) associated with DD/ID. The diagnostic yields among the seven subgroups ranged from 31.25 to 71.43%. Three specific clinical features, hearing loss, visual loss, and facial dysmorphism, can significantly increase the diagnostic yield of WES in patients with DD/ID (P = 0.005, P = 0.005, and P = 0.039, respectively). Of note, hearing loss (odds ratio [OR] = 1.86; 95% CI = 1.00-3.46, P = 0.046) or abnormal brainstem auditory evoked potential (BAEP) (OR = 1.91, 95% CI = 1.02-3.50, P = 0.042) was independently associated with causative genetic variants in DD/ID children. Our findings enrich the variation spectrum of SNVs/Indels associated with DD/ID, highlight the value of genetic testing for DD/ID children, stress the importance of BAEP screening in DD/ID children, and help to facilitate early diagnosis, clinical management and reproductive decisions, and to improve therapeutic response to medical treatment.


Subject(s)
Developmental Disabilities, Exome Sequencing, Intellectual Disability, Child, Preschool Child, Female, Humans, Infant, Male, Developmental Disabilities/genetics, Developmental Disabilities/diagnosis, East Asian People/genetics, INDEL Mutation, Intellectual Disability/genetics, Phenotype, Single Nucleotide Polymorphism
14.
Mol Ecol ; 33(11): e17364, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38651830

ABSTRACT

Despite receiving significant recent attention, the relevance of structural variation (SV) in driving phenotypic diversity remains understudied, although recent advances in long-read sequencing, bioinformatics and pangenomic approaches have enhanced SV detection. We review the role of SVs in shaping phenotypes in avian model systems, and identify some general patterns in SV type, length and their associated traits. We found that most of the avian SVs so far identified are short indels in chickens, which are frequently associated with changes in body weight and plumage colouration. Overall, we found that relatively short SVs are more frequently detected, likely due to a combination of their prevalence compared to large SVs, and a detection bias, stemming primarily from the widespread use of short-read sequencing and associated analytical methods. SVs most commonly involve non-coding regions, especially introns, and when patterns of inheritance were reported, SVs associated primarily with dominant discrete traits. We summarise several examples of phenotypic convergence across different species, mediated by different SVs in the same or different genes and different types of changes in the same gene that can lead to various phenotypes. Complex rearrangements and supergenes, which can simultaneously affect and link several genes, tend to have pleiotropic phenotypic effects. Additionally, SVs commonly co-occur with single-nucleotide polymorphisms, highlighting the need to consider all types of genetic changes to understand the basis of phenotypic traits. We end by summarising expectations for when long-read technologies become commonly implemented in non-model birds, likely leading to an increase in SV discovery and characterisation. The growing interest in this subject suggests an increase in our understanding of the phenotypic effects of SVs in upcoming years.


Subject(s)
Chickens, Phenotype, Animals, Chickens/genetics, Birds/genetics, Genomic Structural Variation, INDEL Mutation
15.
Electrophoresis ; 45(9-10): 867-876, 2024 May.
Article in English | MEDLINE | ID: mdl-38651903

ABSTRACT

Short tandem repeat analysis is challenging when dealing with unbalanced mixtures in forensic cases due to the presence of stutter peaks and large amplicons. In this research, we propose a novel genetic marker called DIP-TriSNP, which combines a deletion/insertion polymorphism (DIP) with a tri-allelic single nucleotide polymorphism within less than 230 bp of the human genome. Based on multiplex PCR and SNaPShot, a panel including 14 autosomal DIP-TriSNPs and one Y-chromosomal DIP-SNP was developed and applied to genotyping 102 unrelated Han Chinese individuals in Sichuan, China, and to a simulated mixture study. The results indicate that the panel sensitivity can reach as low as 0.1 ng of DNA template, and that the minor contributor of DNA can be detected at ratios of up to 19:1. In the Sichuan Han population, the cumulative probability of informative genotypes reached 0.997092, with a combined power of discrimination of 0.999999998801. The panel was estimated to detect more than two alleles in at least one locus in 99.69% of mixtures of the Sichuan Han population. In conclusion, DIP-TriSNPs have shown promise as an innovative DNA marker for identifying the minor contributor in unbalanced DNA mixtures, offering advantages such as short amplicons, increased polymorphism, and heightened sensitivity.


Subject(s)
DNA, Forensic Genetics, Multiplex Polymerase Chain Reaction, Single Nucleotide Polymorphism, Humans, Multiplex Polymerase Chain Reaction/methods, Forensic Genetics/methods, Genetic Markers/genetics, DNA/genetics, DNA/analysis, China, Asian People/genetics, Genotype, Reproducibility of Results, INDEL Mutation, Microsatellite Repeats/genetics, Male, Genotyping Techniques/methods
16.
J Agric Food Chem ; 72(17): 10138-10148, 2024 May 01.
Article in English | MEDLINE | ID: mdl-38637271

ABSTRACT

Passion fruit (Passiflora spp.) is an important fruit tree in the family Passifloraceae. The color of the fruit skin, a significant agricultural trait, is determined by the content of anthocyanin in passion fruit. However, the regulatory mechanisms behind the accumulation of anthocyanin in different passion fruit skin colors remain unclear. In the study, we identified and characterized a R2R3-MYB transcription factor, PeMYB114, which functions as a transcriptional activator in anthocyanin biosynthesis. Yeast one-hybrid system and dual-luciferase analysis showed that PeMYB114 could directly activate the expression of anthocyanin structural genes (PeCHS and PeDFR). Furthermore, a natural variation in the promoter region of PeMYB114 alters its expression. PeMYB114purple accessions with the 224-bp insertion have a higher anthocyanin level than PeMYB114yellow accessions with the 224-bp deletion. The findings enhance our understanding of anthocyanin accumulation in fruits and provide genetic resources for genome design for improving passion fruit quality.


Subject(s)
Anthocyanins, Fruit, Plant Gene Expression Regulation, Passiflora, Plant Proteins, Genetic Promoter Regions, Transcription Factors, Anthocyanins/metabolism, Anthocyanins/genetics, Passiflora/genetics, Passiflora/metabolism, Passiflora/chemistry, Fruit/metabolism, Fruit/genetics, Fruit/chemistry, Plant Proteins/genetics, Plant Proteins/metabolism, Transcription Factors/genetics, Transcription Factors/metabolism, INDEL Mutation
17.
BMC Genomics ; 25(1): 428, 2024 Apr 30.
Article in English | MEDLINE | ID: mdl-38689225

ABSTRACT

BACKGROUND: Although many studies have been done to reveal artificial selection signatures in commercial and indigenous chickens, a limited number of genes have been linked to specific traits. To identify more trait-related artificial selection signatures and genes, we re-sequenced a total of 85 individuals of five indigenous chicken breeds with distinct traits from Yunnan Province, China. RESULTS: We found 30 million non-redundant single nucleotide variants and small indels (< 50 bp) in the indigenous chickens, of which 10 million were not seen in the 60 broilers, 56 layers and 35 red jungle fowls (RJFs) used for comparison. The variants in each breed are enriched in non-coding regions, while those in coding regions are largely tolerant, suggesting that most variants might affect cis-regulatory sequences. Based on 27 million bi-allelic single nucleotide polymorphisms identified in the chickens, we found numerous selective sweeps and affected genes in each indigenous chicken breed and substantially larger numbers of selective sweeps and affected genes in the broilers and layers than previously reported using a rigorous statistical model. Consistent with the locations of the variants, the vast majority (~98.3%) of the identified selective sweeps overlap known quantitative trait loci (QTLs). Meanwhile, 74.2% of known QTLs overlap our identified selective sweeps. We confirmed most of the previously identified trait-related genes and identified many novel ones, some of which might be related to body size and high egg production traits. Using RT-qPCR, we validated differential expression of eight genes (GHR, GHRHR, IGF2BP1, OVALX, ELF2, MGARP, NOCT, SLC25A15) that might be related to body size and high egg production traits in relevant tissues of relevant breeds. CONCLUSION: We identify 30 million single nucleotide variants and small indels in the five indigenous chicken breeds, 10 million of which are novel. 
We predict substantially more selective sweeps and affected genes than previously reported in both indigenous and commercial breeds. These variants and affected genes are good candidates for further experimental investigations of genotype-phenotype relationships and practical applications in chicken breeding programs.


Subject(s)
Chickens , Single Nucleotide Polymorphism , Quantitative Trait Loci , Genetic Selection , Animals , Chickens/genetics , Genome , INDEL Mutation , Breeding , Phenotype , Genomics/methods
18.
BMC Biol ; 22(1): 90, 2024 Apr 22.
Article in English | MEDLINE | ID: mdl-38644496

ABSTRACT

BACKGROUND: Accurate identification of genetic variants, such as point mutations and insertions/deletions (indels), is crucial for various genetic studies into epidemic tracking, population genetics, and disease diagnosis. Genetic studies into microbiomes often require processing numerous sequencing datasets, necessitating variant identifiers with high speed, accuracy, and robustness. RESULTS: We present QuickVariants, a bioinformatics tool that effectively summarizes variant information from read alignments and identifies variants. When tested on diverse bacterial sequencing data, QuickVariants demonstrates a ninefold higher median speed than bcftools, a widely used variant identifier, with higher accuracy in identifying both point mutations and indels. This accuracy extends to variant identification in virus samples, including SARS-CoV-2, particularly with significantly fewer false negative indels than bcftools. The high accuracy of QuickVariants is further demonstrated by its detection of a greater number of Omicron-specific indels (5 versus 0) and point mutations (61 versus 48-54) than bcftools in sewage metagenomes predominated by Omicron variants. Much of the reduced accuracy of bcftools was attributable to its misinterpretation of indels, often producing false negative indels and false positive point mutations at the same locations. CONCLUSIONS: We introduce QuickVariants, a fast, accurate, and robust bioinformatics tool designed for identifying genetic variants for microbial studies. QuickVariants is available at https://github.com/caozhichongchong/QuickVariants .


Subject(s)
INDEL Mutation , SARS-CoV-2 , SARS-CoV-2/genetics , Computational Biology/methods , Humans , Software , COVID-19/virology , High-Throughput Nucleotide Sequencing/methods , Point Mutation , Genetic Variation , DNA Sequence Analysis/methods
19.
Sci Rep ; 14(1): 7028, 2024 Mar 25.
Article in English | MEDLINE | ID: mdl-38528062

ABSTRACT

Accurate indel calling plays an important role in precision medicine. A benchmarking indel set is essential for thoroughly evaluating the indel calling performance of bioinformatics pipelines. A reference sample with a set of known-positive variants was developed in the FDA-led Sequencing Quality Control Phase 2 (SEQC2) project, but the known indels in the known-positive set were limited. This project sought to provide an enriched set of known indels that would be more translationally relevant by focusing on additional cancer related regions. A thorough manual review process completed by 42 reviewers, two advisors, and a judging panel of three researchers significantly enriched the known indel set by an additional 516 indels. The extended benchmarking indel set has a large range of variant allele frequencies (VAFs), with 87% of them having a VAF below 20% in reference Sample A. The reference Sample A and the indel set can be used for comprehensive benchmarking of indel calling across a wider range of VAF values in the lower range. Indel length was also variable, but the majority were under 10 base pairs (bps). Most of the indels were within coding regions, with the remainder in the gene regulatory regions. Although high confidence can be derived from the robust study design and meticulous human review, this extensive indel set has not undergone orthogonal validation. The extended benchmarking indel set, along with the indels in the previously published known-positive set, was the truth set used to benchmark indel calling pipelines in a community challenge hosted on the precisionFDA platform. This benchmarking indel set and reference samples can be utilized for a comprehensive evaluation of indel calling pipelines. Additionally, the insights and solutions obtained during the manual review process can aid in improving the performance of these pipelines.


Subject(s)
Benchmarking , High-Throughput Nucleotide Sequencing , Humans , Computational Biology , Quality Control , INDEL Mutation , Single Nucleotide Polymorphism
20.
Cell ; 187(8): 1955-1970.e23, 2024 Apr 11.
Article in English | MEDLINE | ID: mdl-38503282

ABSTRACT

Characterizing somatic mutations in the brain is important for disentangling the complex mechanisms of aging, yet little is known about mutational patterns in different brain cell types. Here, we performed whole-genome sequencing (WGS) of 86 single oligodendrocytes, 20 mixed glia, and 56 single neurons from neurotypical individuals spanning 0.4-104 years of age and identified >92,000 somatic single-nucleotide variants (sSNVs) and small insertions/deletions (indels). Although both cell types accumulate somatic mutations linearly with age, oligodendrocytes accumulated sSNVs 81% faster than neurons and indels 28% slower than neurons. Correlation of mutations with single-nucleus RNA profiles and chromatin accessibility from the same brains revealed that oligodendrocyte mutations are enriched in inactive genomic regions and are distributed across the genome similarly to mutations in brain cancers. In contrast, neuronal mutations are enriched in open, transcriptionally active chromatin. These stark differences suggest an assortment of active mutagenic processes in oligodendrocytes and neurons.


Subject(s)
Aging , Brain , Neurons , Oligodendroglia , Humans , Aging/genetics , Aging/pathology , Chromatin/genetics , Chromatin/metabolism , Mutation , Neurons/metabolism , Neurons/pathology , Oligodendroglia/metabolism , Oligodendroglia/pathology , Single-Cell Gene Expression Analysis , Whole Genome Sequencing , Brain/metabolism , Brain/pathology , Single Nucleotide Polymorphism , INDEL Mutation , Biological Specimen Banks , Oligodendrocyte Precursor Cells/metabolism , Oligodendrocyte Precursor Cells/pathology