Results 1 - 20 of 50
1.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38517697

ABSTRACT

Non-coding variants associated with complex traits can alter the motifs of transcription factor (TF)-deoxyribonucleic acid binding. Although many computational models have been developed to predict the effects of non-coding variants on TF binding, their predictive power lacks systematic evaluation. Here we have evaluated 14 different models built on position weight matrices (PWMs), support vector machines, ordinary least squares and deep neural networks (DNNs), using large-scale in vitro (i.e. SNP-SELEX) and in vivo (i.e. allele-specific binding, ASB) TF binding data. Our results show that the accuracy of each model in predicting SNP effects in vitro significantly exceeds that achieved in vivo. For in vitro variant impact prediction, kmer/gkm-based machine learning methods (deltaSVM_HT-SELEX, QBiC-Pred) trained on in vitro datasets exhibit the best performance. For in vivo ASB variant prediction, DNN-based multitask models (DeepSEA, Sei, Enformer) trained on the ChIP-seq dataset exhibit relatively superior performance. Among the PWM-based methods, tRap demonstrates better performance in both in vitro and in vivo evaluations. In addition, we find that TF classes such as basic leucine zipper factors could be predicted more accurately, whereas those such as C2H2 zinc finger factors are predicted less accurately, aligning with the evolutionary conservation of these TF classes. We also underscore the significance of non-sequence factors such as cis-regulatory element type, TF expression, interactions and post-translational modifications in influencing the in vivo predictive performance of TFs. Our research provides valuable insights into selecting prioritization methods for non-coding variants and further optimizing such models.
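The PWM-based scoring idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the implementation of tRap or of any other benchmarked model: the count matrix, the pseudocount, the uniform background, and the function names (`log_odds_matrix`, `best_window_score`, `delta_pwm_score`) are all assumptions chosen for illustration. The sketch scores the reference and alternate sequences with a log-odds matrix and reports the difference of the best window scores.

```python
import math

BASES = "ACGT"

def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Turn per-position base counts into log2 odds against a uniform background."""
    matrix = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        matrix.append({
            b: math.log2(((column[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return matrix

def best_window_score(seq, pwm):
    """Highest PWM score over all windows of the motif width in seq."""
    width = len(pwm)
    return max(
        sum(pwm[i][seq[start + i]] for i in range(width))
        for start in range(len(seq) - width + 1)
    )

def delta_pwm_score(ref_seq, alt_seq, pwm):
    """Alt minus ref best score; negative values suggest weakened binding."""
    return best_window_score(alt_seq, pwm) - best_window_score(ref_seq, pwm)

# Hypothetical 4-bp motif with consensus TGAC (toy counts, not real data).
toy_counts = [
    {"A": 1, "C": 0, "G": 1, "T": 18},
    {"A": 0, "C": 1, "G": 19, "T": 0},
    {"A": 17, "C": 1, "G": 1, "T": 1},
    {"A": 0, "C": 18, "G": 1, "T": 1},
]
toy_pwm = log_odds_matrix(toy_counts)

# The alternate allele replaces the A of the TGAC site with C.
delta = delta_pwm_score("AATGACTT", "AATGCCTT", toy_pwm)
print(f"delta PWM score: {delta:.2f}")
```

Taking the best window on each allele, rather than a fixed position, lets the score account for a variant that shifts the strongest site; a strand-aware version would also scan the reverse complement.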


Subjects
Polymorphism, Single Nucleotide; Transcription Factors; Binding Sites/genetics; Protein Binding/genetics; Transcription Factors/genetics; Transcription Factors/metabolism; DNA/genetics
2.
Brief Bioinform ; 25(5)2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39276327

ABSTRACT

Recent advancements in high-throughput sequencing technologies have significantly enhanced our ability to unravel the intricacies of gene regulatory processes. A critical challenge in this endeavor is the identification of variant effects, a key factor in comprehending the mechanisms underlying gene regulation. Non-coding variants, constituting over 90% of all variants, have garnered increasing attention in recent years. The exploration of gene variant impacts and regulatory mechanisms has spurred the development of various deep learning approaches, providing new insights into the global regulatory landscape through the analysis of extensive genetic data. Here, we provide a comprehensive overview of the development of non-coding variant models based on bulk and single-cell sequencing data, together with their model-based interpretation and downstream tasks. This review delineates the popular sequencing technologies for epigenetic profiling and deep learning approaches for discerning the effects of non-coding variants. Additionally, we summarize the limitations of current approaches in variant effect prediction research and outline opportunities for improvement. We anticipate that our study will offer a practical and useful guide for the bioinformatic community to further advance the unraveling of genetic variant effects.


Subjects
Deep Learning; Genetic Variation; Humans; High-Throughput Nucleotide Sequencing/methods; Computational Biology/methods; Epigenesis, Genetic
3.
Genet Epidemiol ; 2024 Sep 24.
Article in English | MEDLINE | ID: mdl-39318036

ABSTRACT

The introduction of Next-Generation Sequencing technologies in the clinics has improved rare disease diagnosis. Nonetheless, for very heterogeneous or very rare diseases, more than half of cases still lack molecular diagnosis. Novel strategies are needed to prioritize variants within a single individual. The Population Sampling Probability (PSAP) method was developed to meet this aim but only for coding variants in exome data. Here, we propose an extension of the PSAP method to the non-coding genome called PSAP-genomic-regions. In this extension, instead of considering genes as testing units (PSAP-genes strategy), we use genomic regions defined over the whole genome that pinpoint potential functional constraints. We conceived an evaluation protocol for our method using artificially generated disease exomes and genomes, by inserting coding and non-coding pathogenic ClinVar variants in large data sets of exomes and genomes from the general population. PSAP-genomic-regions significantly improves the ranking of these variants compared to using a pathogenicity score alone. Using PSAP-genomic-regions, more than 50% of non-coding ClinVar variants were among the top 10 variants of the genome. On real sequencing data from six patients with Cerebral Small Vessel Disease and nine patients with male infertility, all causal variants were ranked in the top 100 variants with PSAP-genomic-regions. By revisiting the testing units used in the PSAP method to include non-coding variants, we have developed PSAP-genomic-regions, an efficient whole-genome prioritization tool which offers promising results for the diagnosis of unresolved rare diseases.

4.
Genomics ; 116(2): 110782, 2024 03.
Article in English | MEDLINE | ID: mdl-38176574

ABSTRACT

There is an increasing understanding that a reference genome representing a single individual cannot capture the entire gene repertoire of a species. Here, we conduct a population-scale detection of missing sequences in Chinese domestic pigs using whole-genome sequencing data from 534 individuals. We identify 132.41 Mb of sequences absent from the reference assembly, including eight novel genes. In particular, the breeds distributed in Chinese high-altitude regions show significantly different frequencies of new sequences in promoters than other breeds. Furthermore, we dissect the role of non-coding variants and identify a novel sequence inserted in the 3'UTR of the FMO3 gene, which may be associated with the intramuscular fat phenotype. This novel sequence could be a candidate marker for meat quality. Our study provides a comprehensive overview of the missing sequences in Chinese domestic pigs and indicates that this dataset is a valuable resource for understanding the diversity and biology of pigs.


Subjects
Genome; Sus scrofa; Animals; Breeding; China; Phenotype; Sus scrofa/genetics; Swine/genetics
5.
J Cell Mol Med ; 27(12): 1621-1636, 2023 06.
Article in English | MEDLINE | ID: mdl-37183561

ABSTRACT

Cardiovascular diseases (CVDs) constitute one of the significant causes of death worldwide. Different pathological states are linked to CVDs, which, despite interventions and treatments, still have poor prognoses. The genetic component, as a beneficial tool in the risk stratification of CVD development, plays a role in the pathogenesis of this group of diseases. The emergence of genome-wide association studies (GWAS) has led to the identification of non-coding regions associated with cardiovascular traits and disorders. Variants located in functional non-coding regions, including promoters/enhancers, introns, miRNAs and 5'/3' UTRs, account for 90% of all identified single-nucleotide polymorphisms associated with CVDs. Here, for the first time, we conducted a comprehensive review of the reported non-coding variants for different CVDs, including hypercholesterolemia, cardiomyopathies, congenital heart diseases, thoracic aortic aneurysms/dissections and coronary artery diseases. Additionally, we present the most commonly reported genes involved in each CVD. In total, 1469 non-coding variants constitute most reports on familial hypercholesterolemia, hypertrophic cardiomyopathy and dilated cardiomyopathy. The application and identification of non-coding variants are beneficial for the genetic diagnosis and better therapeutic management of CVDs.


Subjects
Cardiovascular Diseases; MicroRNAs; Humans; Cardiovascular Diseases/genetics; Genome-Wide Association Study; Polymorphism, Single Nucleotide/genetics; Phenotype; MicroRNAs/genetics
6.
Int J Mol Sci ; 24(15)2023 Jul 27.
Article in English | MEDLINE | ID: mdl-37569400

ABSTRACT

Utilizing large-scale epigenomics data, deep learning tools can predict the regulatory activity of genomic sequences, annotate non-coding genetic variants, and uncover mechanisms behind complex traits. However, these tools primarily rely on human or mouse data for training, limiting their performance when applied to other species. Furthermore, the limited exploration of many species, particularly in the case of livestock, has led to a scarcity of comprehensive and high-quality epigenetic data, posing challenges in developing reliable deep learning models for decoding their non-coding genomes. The cross-species prediction of the regulatory genome can be achieved by leveraging publicly available data from extensively studied organisms and making use of the conserved DNA binding preferences of transcription factors within the same tissue. In this study, we introduced DeepSATA, a novel deep learning-based sequence analyzer that incorporates the transcription factor binding affinity for the cross-species prediction of chromatin accessibility. By applying DeepSATA to analyze the genomes of pigs, chickens, cattle, humans, and mice, we demonstrated its ability to improve the prediction accuracy of chromatin accessibility and achieve reliable cross-species predictions in animals. Additionally, we showcased its effectiveness in analyzing pig genetic variants associated with economic traits and in increasing the accuracy of genomic predictions. Overall, our study presents a valuable tool to explore the epigenomic landscape of various species and pinpoint regulatory deoxyribonucleic acid (DNA) variants associated with complex traits.


Subjects
Deep Learning; Animals; Humans; Cattle; Swine; Mice; Chickens/genetics; Chromatin/genetics; Transcription Factors/genetics; Transcription Factors/metabolism; DNA
7.
J Headache Pain ; 24(1): 78, 2023 Jun 29.
Article in English | MEDLINE | ID: mdl-37380951

ABSTRACT

Migraine is a common and complex neurological disease potentially caused by a polygenic interaction of multiple gene variants. Many genes associated with migraine are involved in pathways controlling synaptic function and neurotransmitter release. However, the molecular mechanisms underpinning migraine need to be further explored. Recent studies raised the possibility that migraine may arise from the effect of regulatory non-coding variants. In this study, we explored the effect of candidate non-coding variants potentially associated with migraine and predicted to lie within regulatory elements: VAMP2_rs1150, SNAP25_rs2327264, and STX1A_rs6951030. The involvement of these genes, which are constituents of the SNARE complex involved in membrane fusion and neurotransmitter release, underscores their significance in migraine pathogenesis. Our reporter gene assays confirmed the impact of at least two of these non-coding variants. VAMP2 and SNAP25 risk alleles were associated with a decrease and an increase in gene expression, respectively, while the STX1A risk allele showed a tendency to reduce luciferase activity in neuronal-like cells. Therefore, the VAMP2_rs1150 and SNAP25_rs2327264 non-coding variants affect gene expression, which may have implications for migraine susceptibility. Based on previous in silico analysis, it is plausible that these variants influence the binding of regulators, such as transcription factors and micro-RNAs. Still, further studies exploring these mechanisms would be important to shed light on the association between SNARE dysregulation and migraine susceptibility.


Subjects
Migraine Disorders; Vesicle-Associated Membrane Protein 2; Humans; Vesicle-Associated Membrane Protein 2/genetics; Membrane Fusion; Alleles; Migraine Disorders/genetics; Gene Expression; Synaptosomal-Associated Protein 25/genetics
8.
Vascular ; 30(5): 842-847, 2022 Oct.
Article in English | MEDLINE | ID: mdl-34281442

ABSTRACT

BACKGROUND: Visceral artery aneurysms (VAAs) can be fatal if ruptured. Although a relatively rare incident, it holds a contemporary mortality rate of approximately 12%. VAAs have multiple possible causes, one of which is genetic predisposition. Here, we present a striking family with seven individuals affected by VAAs, and one individual affected by a visceral artery pseudoaneurysm. METHODS: We exome sequenced the affected family members and the parents of the proband to find a possible underlying genetic defect. As exome sequencing did not reveal any feasible protein-coding variants, we combined whole-genome sequencing of two individuals with linkage analysis to find a plausible non-coding culprit variant. Variants were ranked by the deep learning framework DeepSEA. RESULTS: Two of seven top-ranking variants, NC_000013.11:g.108154659C>T and NC_000013.11:g.110409638C>T, were found in all VAA-affected individuals, but not in the individual affected by the pseudoaneurysm. The second variant is in a candidate cis-regulatory element in the fourth intron of COL4A2, proximal to COL4A1. CONCLUSIONS: As type IV collagens are essential for the stability and integrity of the vascular basement membrane and involved in vascular disease, we conclude that COL4A1 and COL4A2 are strong candidates for VAA susceptibility genes.
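The two variants in this abstract are written in HGVS genomic notation (accession, `g.` position, reference base, alternate base). As a minimal sketch of how such strings can be split into fields before ranking or annotation, the snippet below handles only simple single-base substitutions; the class name `GenomicSubstitution`, the function name, and the regular expression are illustrative assumptions, not part of the study's pipeline or of any HGVS library.

```python
import re
from typing import NamedTuple

class GenomicSubstitution(NamedTuple):
    accession: str  # versioned reference sequence, e.g. NC_000013.11
    position: int   # 1-based genomic coordinate
    ref: str        # reference base
    alt: str        # alternate base

# Covers only simple substitutions such as "NC_000013.11:g.110409638C>T".
_HGVS_G_SUB = re.compile(
    r"^(?P<acc>[A-Z]{2}_\d+\.\d+):g\.(?P<pos>\d+)(?P<ref>[ACGT])>(?P<alt>[ACGT])$"
)

def parse_genomic_substitution(hgvs: str) -> GenomicSubstitution:
    """Parse a single-base genomic substitution written in HGVS notation."""
    match = _HGVS_G_SUB.match(hgvs.strip())
    if match is None:
        raise ValueError(f"not a simple genomic substitution: {hgvs!r}")
    return GenomicSubstitution(
        match["acc"], int(match["pos"]), match["ref"], match["alt"]
    )

first = parse_genomic_substitution("NC_000013.11:g.108154659C>T")
second = parse_genomic_substitution("NC_000013.11:g.110409638C>T")

# Both variants sit on the same reference sequence, so positions are comparable.
print(second.position - first.position)
```

Insertions, deletions, and other HGVS forms are deliberately rejected with a `ValueError` rather than guessed at.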


Subjects
Aneurysm, False; Aneurysm; Collagen Type IV; Aneurysm/etiology; Arteries; Collagen Type IV/genetics; High-Throughput Nucleotide Sequencing; Humans; Pedigree
9.
Int J Mol Sci ; 23(21)2022 Oct 26.
Article in English | MEDLINE | ID: mdl-36361767

ABSTRACT

The advent of Whole Genome Sequencing (WGS) broadened the genetic variation detection range, revealing the presence of variants even in non-coding regions of the genome, which would have been missed using targeted approaches. One of the most challenging issues in WGS analysis regards the interpretation of annotated variants. This review focuses on tools suitable for the functional annotation of variants falling into non-coding regions. It couples the description of non-coding genomic areas with the results and performance of existing tools for a functional interpretation of the effect of variants in these regions. Tools were tested in a controlled genomic scenario, representing the ground-truth and allowing us to determine software performance.


Subjects
Genomics; Software; Humans; Genomics/methods; Whole Genome Sequencing/methods; Genome; Genome, Human
10.
BMC Bioinformatics ; 22(Suppl 6): 128, 2021 Jun 02.
Article in English | MEDLINE | ID: mdl-34078253

ABSTRACT

BACKGROUND: Understanding the functional effects of non-coding variants is important as they are often associated with gene-expression alteration and disease development. Over the past few years, many computational tools have been developed to predict their functional impact. However, the intrinsic difficulty in dealing with the scarcity of data leads to the necessity to further improve the algorithms. In this work, we propose a novel method, employing a semi-supervised deep-learning model with pseudo labels, which takes advantage of learning from both experimentally annotated and unannotated data. RESULTS: We prepared known functional non-coding variants with histone marks, DNA accessibility, and sequence context in GM12878, HepG2, and K562 cell lines. Applying our method to the dataset demonstrated its outstanding performance, compared with that of existing tools. Our results also indicated that the semi-supervised model with pseudo labels achieves higher predictive performance than the supervised model without pseudo labels. Interestingly, a model trained with the data in a certain cell line is unlikely to succeed in other cell lines, which implies the cell-type-specific nature of the non-coding variants. Remarkably, we found that DNA accessibility significantly contributes to the functional consequence of variants, which suggests the importance of open chromatin conformation prior to establishing the interaction of non-coding variants with gene regulation. CONCLUSIONS: The semi-supervised deep learning model coupled with pseudo labeling has advantages in studying with limited datasets, which is not unusual in biology. Our study provides an effective approach in finding non-coding mutations potentially associated with various biological phenomena, including human diseases.
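The pseudo-labeling loop described in this abstract can be sketched in a few lines. This is not the authors' deep-learning model: a nearest-centroid classifier stands in for the network, and the two-feature vectors (imagined here as accessibility and histone-mark signals), the 0.8 confidence threshold, and all function names are assumptions made for illustration. The sketch trains on the labeled variants, assigns labels to unlabeled variants the model is confident about, and retrains on the enlarged set.

```python
def centroid(points):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def fit(features, labels):
    """Nearest-centroid 'model': one centroid per class label."""
    by_class = {}
    for x, y in zip(features, labels):
        by_class.setdefault(y, []).append(x)
    return {y: centroid(xs) for y, xs in by_class.items()}

def predict_with_confidence(model, x):
    """Return (label, confidence); with two classes confidence lies in [0.5, 1]."""
    dists = {y: distance(x, c) for y, c in model.items()}
    label = min(dists, key=dists.get)
    total = sum(dists.values())
    confidence = 1.0 - dists[label] / total if total > 0 else 0.5
    return label, confidence

def pseudo_label_train(features, labels, unlabeled, threshold=0.8, max_rounds=5):
    """Iteratively add confidently predicted unlabeled points as pseudo-labels."""
    X, y, pool = list(features), list(labels), list(unlabeled)
    model = fit(X, y)
    for _ in range(max_rounds):
        confident, remaining = [], []
        for x in pool:
            label, conf = predict_with_confidence(model, x)
            (confident if conf >= threshold else remaining).append((x, label))
        if not confident:
            break  # nothing new to learn from
        for x, label in confident:
            X.append(x)
            y.append(label)
        pool = [x for x, _ in remaining]
        model = fit(X, y)
    return model, len(X)

# Toy data: label 1 = functional, 0 = non-functional (hypothetical values).
labeled_X = [[0.9, 0.8], [0.1, 0.2]]
labeled_y = [1, 0]
unlabeled_X = [[0.85, 0.9], [0.15, 0.1], [0.5, 0.5]]

model, n_training_points = pseudo_label_train(labeled_X, labeled_y, unlabeled_X)
print(n_training_points, predict_with_confidence(model, [0.8, 0.8])[0])
```

The ambiguous point at (0.5, 0.5) never reaches the threshold and is left out, which is the intended behavior: pseudo-labels are only as useful as the confidence filter that admits them.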


Subjects
Deep Learning; Algorithms; Genomics; Histone Code; Humans; Supervised Machine Learning
11.
Int J Mol Sci ; 22(14)2021 Jul 19.
Article in English | MEDLINE | ID: mdl-34299313

ABSTRACT

With the progress of sequencing technologies, an ever-increasing number of variants of unknown functional and clinical significance (VUS) have been identified in both coding and non-coding regions of the main Breast Cancer (BC) predisposition genes. The aim of this study is to identify a mutational profile of coding and intron-exon junction regions of 12 moderate penetrance genes (ATM, BRIP1, CDH1, CHEK2, NBN, PALB2, PTEN, RAD50, RAD51C, RAD51D, STK11, TP53) in a cohort of 450 Italian patients with Hereditary Breast/Ovarian Cancer Syndrome, wild type for germline mutation in BRCA1/2 genes. The analysis was extended to the 5'UTR and 3'UTR of all the genes listed above and to the known BRCA1 and BRCA2 regulatory regions in a subset of 120 patients. The screening was performed through NGS target resequencing on the Illumina MiSeq platform. Of the patients analyzed, 8.7% are carriers of class 5/4 coding variants in the ATM (3.6%), BRIP1 (1.6%), CHEK2 (1.8%), PALB2 (0.7%), RAD51C (0.4%), RAD51D (0.4%), and TP53 (0.2%) genes, while variants of uncertain pathological significance (VUSs)/class 3 were identified in 9.1% of the samples. In intron-exon junctions and in regulatory regions, variants were detected in 5.1% and 32.5% of the cases analyzed, respectively. The average age of disease onset in non-coding variant carriers (44.4) is very similar to that in coding variant carriers for each proband group with the same cancer type. Furthermore, there is no statistically significant difference between the two groups in the proportion of cases with tumor onset under the age of 40, but the presence of multiple non-coding variants in the same patient may affect the aggressiveness of the tumor, and it is worth underlining that 25% of patients with an aggressive tumor are carriers of a PTEN 3'UTR variant. These data provide initial information on how important it might be to extend mutational screening to the regulatory regions in clinical practice.


Subjects
Hereditary Breast and Ovarian Cancer Syndrome/genetics; Adult; Age of Onset; Cohort Studies; Female; Genes, BRCA1; Genes, BRCA2; Genetic Predisposition to Disease; Genetic Variation; Germ-Line Mutation; Humans; Italy; Middle Aged; PTEN Phosphohydrolase/genetics; Penetrance; Regulatory Sequences, Nucleic Acid
12.
J Mol Cell Cardiol ; 144: 54-62, 2020 07.
Article in English | MEDLINE | ID: mdl-32437778

ABSTRACT

Recent genome-wide association studies identified several polymorphisms in the APOA5/A4/C3/A1 gene cluster influencing lipid levels and the risk of coronary heart disease (CHD). However, few studies have explored the molecular mechanism. The purposes of this study were to fine-map the non-coding region between APOA1 and APOC3 and then explore its clinical relevance in CHD and the potential underlying mechanisms. In this study, a 2.7-kb non-coding region between APOA1 and APOC3 was screened, five polymorphisms were investigated in a case-control study, and the molecular mechanism was explored. Our data confirmed the association of rs7123454, rs12721030, rs10750098, and rs12721028 with CHD in 828 patients and 828 controls and replicated it in an independent population of 405 patients and 405 controls. In addition, rs10750098 and rs12721030 are significantly associated with decreased serum APOA1 levels (P = 4.2 × 10^-4 and P = 3.2 × 10^-5, combined analysis), while a significant association was observed between serum APOA1 level and CHD (OR: 0.43, 95% CI: 0.28-0.64, P < .01) with adjustment for clinical covariates and different population sets. In vitro evaluation of the potential function of non-coding variants between APOA1 and APOC3 demonstrated that rs10750098 was the most sufficient to confer the haplotype-specific effect on the regulation of APO gene transcription. Our results strongly implicate the involvement of common non-coding DNA variants in the APOA5/A4/C3/A1 gene cluster in the pathogenesis of dyslipidemia and the risk of CHD.


Subjects
Apolipoprotein A-I/genetics; Apolipoprotein A-V/genetics; Apolipoprotein C-III/genetics; Apolipoproteins A/genetics; Coronary Disease/etiology; Multigene Family; RNA, Untranslated; Aged; Alleles; Apolipoprotein A-I/metabolism; Apolipoprotein A-V/metabolism; Apolipoprotein C-III/metabolism; Apolipoproteins A/metabolism; Biomarkers; Case-Control Studies; Coronary Disease/diagnosis; Coronary Disease/metabolism; Female; Genetic Association Studies; Genetic Predisposition to Disease; Genetic Testing; Genetic Variation; Genotype; Humans; Male; Middle Aged; Polymorphism, Single Nucleotide
13.
Int J Mol Sci ; 21(22)2020 Nov 13.
Article in English | MEDLINE | ID: mdl-33202810

ABSTRACT

Brugada syndrome (BrS) is an inherited electrical heart disease associated with a high risk of sudden cardiac death (SCD). The genetic characterization of BrS has always been challenging. Although several cardiac ion channel genes have been associated with BrS, SCN5A is the only gene that presents definitive evidence for causality to be used for clinical diagnosis of BrS. However, more than 65% of diagnosed cases cannot be explained by variants in SCN5A or other genes. Therefore, in an important number of BrS cases, the underlying mechanisms are still elusive. Common variants, mostly located in non-coding regions, have emerged as potential modulators of the disease by affecting different regulatory mechanisms, including transcription factors (TFs), three-dimensional organization of the genome, or non-coding RNAs (ncRNAs). These common variants have been hypothesized to modulate the interindividual susceptibility of the disease, which could explain incomplete penetrance of BrS observed within families. Altogether, the study of both common and rare variants in parallel is becoming increasingly important to better understand the genetic basis underlying BrS. In this review, we aim to describe the challenges of studying non-coding variants associated with disease, re-examine the studies that have linked non-coding variants with BrS, and provide further evidence for the relevance of regulatory elements in understanding this cardiac disorder.


Subjects
Brugada Syndrome; Genome, Human; RNA, Untranslated; Regulatory Elements, Transcriptional; Brugada Syndrome/genetics; Brugada Syndrome/metabolism; Death, Sudden, Cardiac; Female; Humans; Male; NAV1.5 Voltage-Gated Sodium Channel/genetics; NAV1.5 Voltage-Gated Sodium Channel/metabolism; RNA, Untranslated/genetics; RNA, Untranslated/metabolism
14.
Breast Cancer Res Treat ; 168(2): 311-325, 2018 Apr.
Article in English | MEDLINE | ID: mdl-29236234

ABSTRACT

PURPOSE: The molecular mechanism of breast and/or ovarian cancer susceptibility remains unclear in the majority of patients. While germline mutations in the regulatory non-coding regions of BRCA1 and BRCA2 genes have been described, screening has generally been limited to coding regions. The aim of this study was to evaluate the contribution of BRCA1/2 non-coding variants. METHODS: Four BRCA1/2 non-coding regions were screened using high-resolution melting analysis/Sanger sequencing or next-generation sequencing on DNA extracted from index cases with breast and ovarian cancer predisposition (3926 for BRCA1 and 3910 for BRCA2). The impact of a set of variants on BRCA1/2 gene regulation was evaluated by site-directed mutagenesis and transfection, followed by a luciferase gene reporter assay. RESULTS: We identified a total of 117 variants and tested 12 BRCA1 and 8 BRCA2 variants mapping to promoter and intronic regions. We highlighted two neighboring BRCA1 promoter variants (c.-130del; c.-125C>T) and one BRCA2 promoter variant (c.-296C>T) that significantly inhibit promoter activity. In the functional assays, a regulatory region within intron 12 was found to have the same enhancing impact as the one within intron 2. Furthermore, the variants c.81-3980A>G and c.4186-2022C>T suppress the positive effect of introns 2 and 12, respectively, on BRCA1 promoter activity. We also found some variants that induce promoter activity. CONCLUSION: In this study, we highlighted some variants, among many, that negatively modulate the promoter activity of BRCA1 or BRCA2 and thus have a potential impact on the risk of developing cancer. This selection makes it possible to conduct future validation studies on a limited number of variants.


Subjects
BRCA1 Protein/genetics; BRCA2 Protein/genetics; Genes, BRCA1; Genes, BRCA2; Hereditary Breast and Ovarian Cancer Syndrome/genetics; Adult; Aged; Cohort Studies; Computational Biology; Female; Genetic Predisposition to Disease; Germ-Line Mutation; High-Throughput Nucleotide Sequencing; Humans; Introns/genetics; Middle Aged; Pedigree; Polymorphism, Single Nucleotide; Promoter Regions, Genetic/genetics; Untranslated Regions/genetics
15.
Biochem Biophys Res Commun ; 462(4): 301-13, 2015 Jul 10.
Article in English | MEDLINE | ID: mdl-25976673

ABSTRACT

The recombination-activating genes (RAGs) encode V(D)J recombinases responsible for rearrangements of antigen-receptor genes during T and B cell development, and RAG expression is known to correlate strictly with the process of rearrangement. Several studies of RAG1 have illustrated its biochemical, physiological and immunological properties. To date, however, few studies of RAG1 have focused on molecular phylogenetic analyses, evolutionary traits, and genetic variants in human populations. Hence, there is a need for a comprehensive study on this topic. In the current report, we shed light on the evolutionary traits and genetic variants of the human RAG1 gene using 1092 genomes from human populations. Syntenic analyses revealed that the two RAG genes are physically linked and conserved at the same locus in head-to-head orientation from sea urchin to human for about 550 MY. Spliceosomal introns have invaded the gene in fishes and sea urchin, whereas the RAG1 gene in tetrapods has retained a single-exon architecture. We compiled 751 genetic variants in the human RAG1 gene using 1092 human genomes; the major variant classes are single nucleotide polymorphisms (SNPs, 79%), somatic single nucleotide variants (somatic SNVs, 12.2%) and deletions (6.8%). Out of 267 missense variants, 140 are deleterious mutations. We identified 284 non-coding variants, 94% of which are regulatory in nature.


Subjects
Genetic Variation; Homeodomain Proteins/genetics; Mutation, Missense; Phylogeny; V(D)J Recombination; Genome, Human; Humans; Introns; Spliceosomes
16.
J Genet Genomics ; 51(2): 230-242, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38142743

ABSTRACT

The application of whole genome sequencing is expanding in clinical diagnostics across various genetic disorders, and the significance of non-coding variants in penetrant diseases is increasingly being demonstrated. Therefore, it is urgent to improve the diagnostic yield by exploring the pathogenic mechanisms of variants in non-coding regions. However, the interpretation of non-coding variants remains a significant challenge, due to the complex functional regulatory mechanisms of non-coding regions and the current limitations of available databases and tools. Hence, we develop the non-coding variant annotation database (NCAD, http://www.ncawdb.net/), encompassing comprehensive insights into 665,679,194 variants, regulatory elements, and element interaction details. Integrating data from 96 sources, spanning both GRCh37 and GRCh38 versions, NCAD v1.0 provides vital information to support the genetic diagnosis of non-coding variants, including allele frequencies of 12 diverse populations, with a particular focus on the population frequency information for 230,235,698 variants in 20,964 Chinese individuals. Moreover, it offers prediction scores for variant functionality, five categories of regulatory elements, and four types of non-coding RNAs. With its rich data and comprehensive coverage, NCAD serves as a valuable platform, empowering researchers and clinicians with profound insights into non-coding regulatory mechanisms while facilitating the interpretation of non-coding variants.


Subjects
Databases, Genetic; Regulatory Sequences, Nucleic Acid; Humans; Molecular Sequence Annotation; Gene Frequency; Regulatory Sequences, Nucleic Acid/genetics; Genetic Variation/genetics
17.
Best Pract Res Clin Rheumatol ; 38(2): 101937, 2024 05.
Article in English | MEDLINE | ID: mdl-38429183

ABSTRACT

Systemic Lupus Erythematosus (SLE) is a multifactorial autoimmune disease that arises from a dynamic interplay between genetics and environmental triggers. The advent of sophisticated genomics technology has catalyzed a shift in our understanding of disease etiology, spotlighting the pivotal role of non-coding DNA variants in SLE pathogenesis. In this review, we present a comprehensive examination of the non-coding variants associated with SLE, shedding light on their role in influencing disease risk and progression. We discuss the latest methodological advancements that have been instrumental in the identification and functional characterization of these genomic elements, with a special focus on the transformative power of CRISPR-based gene-editing technologies. Additionally, the review probes into the therapeutic opportunities that arise from modulating non-coding regions associated with SLE. Through an exploration of the complex network of non-coding DNA, this review aspires to decode the genetic puzzle of SLE and set the stage for groundbreaking gene-based therapeutic interventions and the advancement of precision medicine strategies tailored to SLE management.


Subjects
Genetic Predisposition to Disease; Lupus Erythematosus, Systemic; Humans; Lupus Erythematosus, Systemic/genetics; Genetic Variation; Gene Editing
18.
Front Biosci (Schol Ed) ; 16(1): 4, 2024 Mar 01.
Article in English | MEDLINE | ID: mdl-38538340

ABSTRACT

Genome-wide association studies (GWAS) have mapped over 90% of disease- and quantitative-trait-associated variants within the non-coding genome. Non-coding regulatory DNA (e.g., promoters and enhancers) and RNA (e.g., 5' and 3' UTRs and splice sites) are essential in regulating temporal and tissue-specific gene expression. Non-coding variants can potentially impact the phenotype of an organism by altering the molecular recognition of the cis-regulatory elements, leading to gene dysregulation. However, determining causality between non-coding variants, gene regulation, and human disease has remained challenging. Experimental and computational methods have been developed to understand the molecular mechanisms involved in non-coding variant interference at the transcriptional and post-transcriptional levels. This review discusses recent approaches to evaluating disease-associated single-nucleotide variants (SNVs) and determining their impact on transcription factor (TF) binding, gene expression, chromatin conformation, post-transcriptional regulation, and translation.


Subjects
Gene Expression Regulation , Genome-Wide Association Study , Humans , Gene Expression Regulation/genetics , Regulatory Sequences, Nucleic Acid , Promoter Regions, Genetic , Protein Binding , Polymorphism, Single Nucleotide/genetics
19.
Expert Rev Mol Diagn ; 24(9): 753-765, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39194060

ABSTRACT

INTRODUCTION: Sensorineural hearing impairment (SNHI), a common childhood disorder with heterogeneous genetic causes, can lead to delayed language development and psychosocial problems. Next-generation sequencing (NGS) offers high-throughput screening and high-sensitivity detection of genetic etiologies of SNHI, enabling clinicians to make informed medical decisions, provide tailored treatments, and improve prognostic outcomes. AREAS COVERED: This review covers the diverse etiologies of SNHI and the utility of different NGS modalities (targeted sequencing and whole exome/genome sequencing), and includes SNHI-related studies on newborn screening, genetic counseling, prognostic prediction, and personalized treatment. Challenges such as the trade-off between cost and diagnostic yield, detection of structural variants, and exploration of the non-coding genome are also highlighted. EXPERT OPINION: In the current landscape of NGS-based diagnostics for SNHI, there are both challenges (e.g. detection of structural variants and non-coding genome variants) and opportunities (e.g. the emergence of medical artificial intelligence tools). The authors advocate the use of technological advances such as long-read sequencing for structural variant detection, multi-omics analysis for non-coding variant exploration, and medical artificial intelligence for pathogenicity assessment and outcome prediction. By integrating these innovations into clinical practice, precision medicine in the diagnosis and management of SNHI can be further improved.


Subjects
High-Throughput Nucleotide Sequencing , Humans , High-Throughput Nucleotide Sequencing/methods , Genetic Testing/methods , Neonatal Screening/methods , Hearing Loss, Sensorineural/genetics , Hearing Loss, Sensorineural/diagnosis , Hearing Loss, Sensorineural/therapy , Genetic Predisposition to Disease , Infant, Newborn , Disease Management , Genetic Counseling
20.
Front Immunol ; 15: 1387253, 2024.
Article in English | MEDLINE | ID: mdl-38947339

ABSTRACT

Type 1 diabetes (T1D) is an autoimmune disease mediated by T-cell destruction of β cells in pancreatic islets. Currently, there is no known cure, and treatment consists of daily insulin injections. Genome-wide association studies and twin studies have indicated a strong genetic heritability for type 1 diabetes and implicated several genes. As most strongly associated variants are noncoding, the functional and, therefore, likely causal variants remain largely unidentified. Given that many of these genetic variants reside in enhancer elements, we have tested 121 CD4+ T-cell enhancer variants associated with T1D. We found four to be functional through massively parallel reporter assays. Three of the enhancer variants weaken activity, while the fourth strengthens activity. We link these to their cognate genes using 3D genome architecture or eQTL data and validate them using CRISPR editing. Validated target genes include CLEC16A and SOCS1. While these genes have been previously implicated in type 1 diabetes and other autoimmune diseases, we show that enhancers controlling their expression harbor functional variants. These variants, therefore, may act as causal type 1 diabetes variants.


Subjects
CD4-Positive T-Lymphocytes , Diabetes Mellitus, Type 1 , Enhancer Elements, Genetic , Genetic Predisposition to Disease , Diabetes Mellitus, Type 1/genetics , Diabetes Mellitus, Type 1/immunology , Humans , CD4-Positive T-Lymphocytes/immunology , CD4-Positive T-Lymphocytes/metabolism , Enhancer Elements, Genetic/genetics , Suppressor of Cytokine Signaling 1 Protein/genetics , Genome-Wide Association Study , Lectins, C-Type/genetics , Genetic Variation , Polymorphism, Single Nucleotide , Quantitative Trait Loci