Results 1 - 20 of 31,822
1.
BMC Bioinformatics ; 24(1): 341, 2023 Sep 13.
Article in English | MEDLINE | ID: mdl-37704952

ABSTRACT

BACKGROUND: Mitochondria are the cell organelles that produce most of the chemical energy required to power the cell's biochemical reactions. Despite being a part of a eukaryotic host cell, the mitochondria contain a separate genome whose origin is linked with the endosymbiosis of a prokaryotic cell by the host cell and encode independent genomic information throughout their genomes. Mitochondrial genomes accommodate essential genes and are regularly utilized in biotechnology and phylogenetics. Various assemblers capable of generating complete mitochondrial genomes are being continuously developed. These tools often use whole-genome sequencing data as an input containing reads from the mitochondrial genome. Till now, no published work has explored the systematic comparison of all the available tools for assembling human mitochondrial genomes using short-read sequencing data. This evaluation is required to identify the best tool that can be well-optimized for small-scale projects or even national-level research. RESULTS: In this study, we have tested the mitochondrial genome assemblers for both simulated datasets and whole genome sequencing (WGS) datasets of humans. For the highest computational setting of 16 computational threads with the simulated dataset having 1000X read depth, MitoFlex took the least execution time of 69 s, and IOGA took the longest execution time of 1278 s. NOVOPlasty utilized the least computational memory of approximately 0.098 GB for the same setting, whereas IOGA utilized the highest computational memory of 11.858 GB. In the case of WGS datasets for humans, GetOrganelle and MitoFlex performed the best in capturing the SNPs information with a mean F1-score of 0.919 at the sequencing depth of 10X. MToolBox and NOVOPlasty performed consistently across all sequencing depths with a mean F1 score of 0.897 and 0.890, respectively. 
CONCLUSIONS: Based on the overall performance metrics and consistency in assembly quality for all sequencing data, MToolBox performed the best. However, NOVOPlasty was the second fastest tool in execution time despite being single-threaded, and it utilized the least computational resources among all the assemblers when tested on simulated datasets. Therefore, NOVOPlasty may be more practical when there is a significant sample size and a lack of computational resources. Besides, as long-read sequencing gains popularity, mitochondrial genome assemblers must be developed to use long-read sequencing data.
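The F1-scores reported above summarize how well each assembler recovers known SNPs. As a rough illustration only, the sketch below computes precision, recall, and F1 from two sets of variant calls. The function name `f1_score`, the tuple layout `(position, ref, alt)`, and the toy variants are assumptions made for this example; they are not taken from the study's evaluation pipeline.

```python
def f1_score(called, truth):
    """Return the F1-score of a set of called variants against a truth set.

    Both arguments are sets of hypothetical (position, ref, alt) tuples.
    """
    true_pos = len(called & truth)    # variants found in both sets
    false_pos = len(called - truth)   # called but absent from the truth set
    false_neg = len(truth - called)   # present in the truth set but missed
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Toy data: three of four truth variants are recovered, plus one false call.
truth = {(73, "A", "G"), (263, "A", "G"), (750, "A", "G"), (1438, "A", "G")}
called = {(73, "A", "G"), (263, "A", "G"), (750, "A", "G"), (16519, "T", "C")}
score = f1_score(called, truth)
print(f"F1 = {score:.3f}")
```

Because F1 is the harmonic mean of precision and recall, an assembler that misses variants and one that introduces spurious ones are penalized symmetrically.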


Subjects
Mitochondrial Genome , Humans , Human Genome , Mitochondria/genetics , Benchmarking , Biotechnology
2.
Nat Commun ; 14(1): 5530, 2023 09 14.
Article in English | MEDLINE | ID: mdl-37709751

ABSTRACT

Markedly expanded tandem repeats (TRs) have been correlated with ~60 diseases. TR diversity has been considered a clue toward understanding missing heritability. However, haplotype-resolved long TRs remain mostly hidden or blacked out because their complex structures (TRs composed of various units and minisatellites containing >10-bp units) make them difficult to determine accurately with existing methods. Here, using a high-precision algorithm to determine complex TR structures from long, accurate reads of PacBio HiFi, an investigation of 270 Japanese control samples yields several genome-wide findings. Approximately 322,000 TRs are difficult to impute from the surrounding single-nucleotide variants. Greater genetic divergence of TR loci is significantly correlated with more events of younger replication slippage. Complex TRs are more abundant than single-unit TRs, and a tendency for complex TRs to consist of <10-bp units and single-unit TRs to be minisatellites is statistically significant at loci with ≥500-bp TRs. Of note, 8909 loci with extended TRs (>100b longer than the mode) contain several known disease-associated TRs and are considered candidates for association with disorders. Overall, complex TRs and minisatellites are found to be abundant and diverse, even in genetically small Japanese populations, yielding insights into the landscape of long TRs.


Subjects
Human Genome , Tandem Repeat Sequences , Humans , Human Genome/genetics , Minisatellite Repeats , Algorithms , Genetic Drift
3.
Science ; 381(6664): 1289-1290, 2023 Sep 22.
Article in English | MEDLINE | ID: mdl-37733865
4.
Science ; 381(6664): eadg7492, 2023 Sep 22.
Article in English | MEDLINE | ID: mdl-37733863

ABSTRACT

The vast majority of missense variants observed in the human genome are of unknown clinical significance. We present AlphaMissense, an adaptation of AlphaFold fine-tuned on human and primate variant population frequency databases to predict missense variant pathogenicity. By combining structural context and evolutionary conservation, our model achieves state-of-the-art results across a wide range of genetic and experimental benchmarks, all without explicitly training on such data. The average pathogenicity score of genes is also predictive for their cell essentiality, capable of identifying short essential genes that existing statistical approaches are underpowered to detect. As a resource to the community, we provide a database of predictions for all possible human single amino acid substitutions and classify 89% of missense variants as either likely benign or likely pathogenic.


Subjects
Amino Acid Substitution , Disease , Missense Mutation , Proteome , Sequence Alignment , Humans , Amino Acid Substitution/genetics , Benchmarking , Conserved Sequence , Genetic Databases , Disease/genetics , Human Genome , Protein Conformation , Proteome/genetics , Sequence Alignment/methods , Machine Learning
5.
Commun Biol ; 6(1): 954, 2023 09 19.
Article in English | MEDLINE | ID: mdl-37726397

ABSTRACT

Repetitive DNA sequences play critical roles in driving evolution, inducing variation, and regulating gene expression. In this review, we summarize the definition, arrangement, and structural characteristics of repeats. We also introduce the diverse biological functions of repeats and review existing methods for automatic repeat detection, classification, and masking. Finally, we analyze the type, structure, and regulation of repeats in the human genome and their role in the induction of complex diseases. We believe that this review will facilitate a comprehensive understanding of repeats and provide guidance for repeat annotation and in-depth exploration of their association with human diseases.


Subjects
Human Genome , Humans , Base Sequence
6.
Science ; 381(6662): eadk5693, 2023 09 08.
Article in English | MEDLINE | ID: mdl-37676963

ABSTRACT

There are no cures for the most common neurodegenerative diseases. None of the currently approved treatments cure or halt these conditions; rather, they address symptoms or slow disease progression. A focus on protein deposits in the brain, a hallmark of Alzheimer's disease (AD) and Parkinson's disease (PD), has led to the development of immunotherapy drugs. Other promising avenues of investigation include the roles of neuroinflammation in neurodegeneration. However, the clinical impact of these approaches is still uncertain. What about exploiting our knowledge of the human genome and the ability to modify it with surgically precise tools? Can functional genomics approaches in neurodegenerative disease research provide the breakthroughs we need?


Subjects
Alzheimer Disease , Parkinson Disease , Humans , Alzheimer Disease/genetics , Alzheimer Disease/therapy , Brain , Human Genome , Genomics , Parkinson Disease/genetics , Parkinson Disease/therapy
7.
Nat Commun ; 14(1): 5528, 2023 09 08.
Article in English | MEDLINE | ID: mdl-37684230

ABSTRACT

Breakage-fusion-bridge (BFB) is a complex rearrangement that leads to tumor malignancy. Existing models for detecting BFBs rely on the ideal BFB hypothesis, ruling out the possibility of BFBs entangled with other structural variations, that is, complex BFBs. We propose an algorithm, Ambigram, to identify complex BFBs and reconstruct the rearranged structure of the local genome during the cancer subclone evolution process. Ambigram handles data from short, linked, long, and single-cell sequences, and optical mapping technologies. Ambigram successfully deciphers the gold- or silver-standard complex BFBs against the state-of-the-art in multiple cancers. Ambigram dissects the intratumor heterogeneity of complex BFB events with single-cell reads from melanoma and gastric cancer. Furthermore, applying Ambigram to liver and cervical cancer data suggests that the BFB mechanism may mediate oncovirus integrations. BFB also exists in noncancer genomics. Investigating the complete human genome reference with Ambigram suggests that the BFB mechanism may be involved in two genome reorganizations of Homo sapiens during evolution. Moreover, Ambigram discovers the signals of recurrent foldback inversions and complex BFBs in whole genome data from the 1000 Genomes Project and from congenital heart diseases, respectively.


Subjects
Melanoma , Uterine Cervical Neoplasms , Humans , Female , Genomics , Liver , Human Genome/genetics
8.
BMC Med Ethics ; 24(1): 72, 2023 Sep 21.
Article in English | MEDLINE | ID: mdl-37735670

ABSTRACT

BACKGROUND: Forward-looking, democratically oriented governance is needed to ensure that human genome editing serves rather than undercuts public values. Scientific, policy, and ethics communities have recognized this necessity but have demonstrated limited understanding of how to fulfill it. The field of bioethics has long attempted to grapple with the unintended consequences of emerging technologies, but too often such foresight has lacked adequate scientific grounding, overemphasized regulation to the exclusion of examining underlying values, and failed to adequately engage the public. METHODS: This research investigates the application of scenario planning, a tool developed in the high-stakes, uncertainty-ridden world of corporate strategy, for the equally high-stakes and uncertain world of the governance of emerging technologies. The scenario planning methodology is non-predictive, looking instead at a spread of plausible futures which diverge in their implications for different communities' needs, cares, and desires. RESULTS: In this article we share how the scenario development process can further understandings of the complex and dynamic systems which generate and shape new biomedical technologies and provide opportunities to re-examine and re-think questions of governance, ethics and values. We detail the results of a year-long scenario planning study that engaged experts from the biological sciences, bioethics, social sciences, law, policy, private industry, and civic organizations to articulate alternative futures of human genome editing. CONCLUSIONS: Through sharing and critiquing our methodological approach and results of this study, we advance understandings of anticipatory methods deployed in bioethics, demonstrating how this approach provides unique insights and helps to derive better research questions and policy strategies.


Subjects
Bioethics , Gene Editing , Humans , Social Sciences , Human Genome , Policy
9.
Int J Mol Sci ; 24(16)2023 Aug 08.
Article in English | MEDLINE | ID: mdl-37628742

ABSTRACT

We have developed a new method for promoter sequence classification based on a genetic algorithm and the MAHDS sequence alignment method. We have created four classes of human promoters, combining 17,310 sequences out of the 29,598 present in the EPD database. We searched the human genome for potential promoter sequences (PPSs) using dynamic programming and position weight matrices representing each of the promoter sequence classes. A total of 3,065,317 potential promoter sequences were found. Only 1,241,206 of them were located in unannotated parts of the human genome. Every other PPS found intersected with either true promoters, transposable elements, or interspersed repeats. We found a strong intersection between PPSs and Alu elements as well as transcript start sites. The number of false positive PPSs is estimated to be 3 × 10⁻⁸ per nucleotide, which is several orders of magnitude lower than for any other promoter prediction method. The developed method can be used to search for PPSs in various eukaryotic genomes.
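To make the idea of scanning with a position weight matrix concrete, here is a minimal sketch of ungapped log-odds scoring over a sliding window. It is not the MAHDS method described above: it uses no dynamic programming and allows no insertions or deletions. The helper names `build_pwm` and `scan`, the uniform background of 0.25, the pseudocount, the threshold, and the toy TATA-like sites are all assumptions chosen for illustration.

```python
import math


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from equal-length aligned sites."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(len(sites[0])):
        column = [site[i] for site in sites]
        # log2 ratio of the smoothed column frequency to the background frequency
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def scan(sequence, pwm, threshold):
    """Return (start, score) for every window scoring at or above the threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, score))
    return hits


# Toy example: four hypothetical aligned sites and a short sequence to scan.
sites = ["TATAAA", "TATAAA", "TATATA", "TATAAG"]
pwm = build_pwm(sites)
hits = scan("GGCGTATAAAGGC", pwm, threshold=5.0)
print(hits)
```

The sketch assumes the sequence contains only A, C, G, and T; ambiguous bases such as N would need explicit handling before a genome-scale scan.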


Subjects
Brachytherapy , Human Genome , Humans , Alu Elements/genetics , DNA Transposable Elements/genetics , Factual Databases
10.
Bioinformatics ; 39(8)2023 08 01.
Article in English | MEDLINE | ID: mdl-37555812

ABSTRACT

MOTIVATION: The investigation of DNA methylation can shed light on the processes underlying human well-being and help determine overall human health. However, insufficient coverage makes it challenging to implement single-stranded DNA methylation sequencing technologies, highlighting the need for an efficient prediction model. Models are required to create an understanding of the underlying biological systems and to project single-cell (methylated) data accurately. RESULTS: In this study, we developed positional features for predicting CpG sites. Positional characteristics of the sequence are derived using data from CpG regions and the separation between nearby CpG sites. Multiple optimized classifiers and different ensemble learning approaches are evaluated. The OPTUNA framework is used to optimize the algorithms. The CatBoost algorithm followed by the stacking algorithm outperformed existing DNA methylation identifiers. AVAILABILITY AND IMPLEMENTATION: The data and methodologies used in this study are openly accessible to the research community. Researchers can access the positional features and algorithms used for predicting CpG site methylation patterns. To achieve superior performance, we employed the CatBoost algorithm followed by the stacking algorithm, which outperformed existing DNA methylation identifiers. The proposed iCpG-Pos approach utilizes only positional features, resulting in a substantial reduction in computational complexity compared to other known approaches for detecting CpG site methylation patterns. In conclusion, our study introduces a novel approach, iCpG-Pos, for predicting CpG site methylation patterns. By focusing on positional features, our model offers both accuracy and efficiency, making it a promising tool for advancing DNA methylation research and its applications in human health and well-being.


Subjects
Algorithms , DNA Methylation , Humans , CpG Islands , DNA Sequence Analysis/methods , Human Genome
11.
Biomolecules ; 13(8)2023 08 12.
Article in English | MEDLINE | ID: mdl-37627305

ABSTRACT

With the development of accurate protein structure prediction algorithms, artificial intelligence (AI) has emerged as a powerful tool in the field of structural biology. AI-based algorithms have been used to analyze large amounts of protein sequence data including the human proteome, complementing experimental structure data found in resources such as the Protein Data Bank. The EBI AlphaFold Protein Structure Database (for example) contains over 230 million structures. In this study, these data have been analyzed to find all human proteins containing (or predicted to contain) the cytosolic glutathione transferase (cGST) fold. A total of 39 proteins were found, including the alpha-, mu-, pi-, sigma-, zeta- and omega-class GSTs, intracellular chloride channels, metaxins, multisynthetase complex components, elongation factor 1 complex components and others. Three broad themes emerge: cGST domains as enzymes, as chloride ion channels and as protein-protein interaction mediators. As the majority of cGSTs are dimers, the AI-based structure prediction algorithm AlphaFold-multimer was used to predict structures of all pairwise combinations of these cGST domains. Potential homo- and heterodimers are described. Experimental biochemical and structure data is used to highlight the strengths and limitations of AI-predicted structures.


Subjects
Human Genome , Glutathione Transferase , Humans , Glutathione Transferase/genetics , Artificial Intelligence , Algorithms , Amino Acid Sequence
12.
Nat Commun ; 14(1): 5164, 2023 08 24.
Article in English | MEDLINE | ID: mdl-37620373

ABSTRACT

Long-read sequencing has dramatically increased our understanding of human genome variation. Here, we demonstrate that long-read technology can give new insights into the genomic architecture of individual cells. Clonally expanded CD8+ T-cells from a human donor were subjected to droplet-based multiple displacement amplification (dMDA) to generate long molecules with reduced bias. PacBio sequencing generated up to 40% genome coverage per single-cell, enabling detection of single nucleotide variants (SNVs), structural variants (SVs), and tandem repeats, also in regions inaccessible by short reads. 28 somatic SNVs were detected, including one case of mitochondrial heteroplasmy. 5473 high-confidence SVs/cell were discovered, a sixteen-fold increase compared to Illumina-based results from clonally related cells. Single-cell de novo assembly generated a genome size of up to 598 Mb and 1762 (12.8%) complete gene models. In summary, our work shows the promise of long-read sequencing toward characterization of the full spectrum of genetic variation in single cells.


Subjects
Human Genome , Genomics , Humans , Genome Size , Human Genome/genetics , CD8-Positive T-Lymphocytes , Cell Cycle
13.
PLoS Genet ; 19(8): e1010399, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37578977

ABSTRACT

Evidence of interbreeding between archaic hominins and humans comes from methods that infer the locations of segments of archaic haplotypes, or 'archaic coverage', using the genomes of people living today. As more estimates of archaic coverage have emerged, it has become clear that most of this coverage is found on the autosomes; very little is retained on chromosome X. Here, we summarize published estimates of archaic coverage on autosomes and chromosome X from extant human samples. We find on average 7 times more archaic coverage on autosomes than on chromosome X, and identify broad continental patterns in this ratio: greatest in European samples, and least in South Asian samples. We also perform extensive simulation studies to investigate how the amount of archaic coverage, lengths of coverage, and rates of purging of archaic coverage are affected by sex-bias caused by an unequal sex ratio within the archaic introgressors. Our results generally confirm that, with increasing male sex-bias, less archaic coverage is retained on chromosome X. Ours is the first study to explicitly model such sex-bias and its potential role in creating the dearth of archaic coverage on chromosome X.


Subjects
Genetic Introgression , Human Genome , Hominidae , X Chromosome , Animals , Humans , Male , Asian People/genetics , Genome , Human Genome/genetics , Hominidae/genetics , Neanderthals/genetics , X Chromosome/genetics , Sex Factors , Haplotypes/genetics , Genetic Introgression/genetics , Human Chromosomes/genetics , Female , South Asian People/genetics , European People/genetics
14.
Comput Biol Chem ; 106: 107939, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37598466

ABSTRACT

In this paper we propose that the high copy number of the mitochondrial genome in neurons is a functional adaptation. We simulated the proliferation of deletion mutants of the human mitochondrial genome in a virtual mitochondrion and recorded the cell loss rates due to deletions overwhelming the wild-type. Our results showed that cell loss increased with mtDNA copy number. Given that neuron loss equates to cognitive dysfunction, it would seem counterintuitive that there would be a selective pressure for high copy number over low. However, for a low copy number, the onset of cognitive decline, while mild, started early in life, whereas for a high copy number it did not start until middle age but progressed rapidly. There could have been an advantage to high copy number in the brain if it delayed the onset of cognitive decline until after reproductive age. The prevalence of dementia in our aged population is a consequence of this functional adaptation.


Subjects
Brain , Dementia , Middle Aged , Humans , Aged , Prevalence , Mitochondrial DNA , Human Genome , Dementia/epidemiology , Dementia/genetics
15.
Brief Bioinform ; 24(5)2023 09 20.
Article in English | MEDLINE | ID: mdl-37594299

ABSTRACT

Genome assembly is a computational technique that involves piecing together deoxyribonucleic acid (DNA) fragments generated by sequencing technologies to create a comprehensive and precise representation of the entire genome. Generating a high-quality human reference genome is a crucial prerequisite for comprehending human biology, and it is also vital for downstream genomic variation analysis. Many efforts have been made over the past few decades to create a complete and gapless reference genome for humans by using a diverse range of advanced sequencing technologies. Several available tools are aimed at enhancing the quality of haploid and diploid human genome assemblies, which include contig assembly, polishing of contig errors, scaffolding and variant phasing. Selecting the appropriate tools and technologies remains a daunting task even though several studies have investigated the pros and cons of different assembly strategies. The goal of this paper was to benchmark various strategies for human genome assembly by combining sequencing technologies and tools on two publicly available samples (NA12878 and NA24385) from Genome in a Bottle. We then compared their performances in terms of continuity, accuracy, completeness, variant calling and phasing. We observed that PacBio HiFi long-reads are the optimal choice for generating an assembly with low base errors. On the other hand, we were able to produce the most continuous contigs with Oxford Nanopore long-reads, but they may require further polishing to improve quality. We recommend using short-reads rather than the long-reads themselves to improve the base accuracy of contigs from Oxford Nanopore long-reads. Hi-C is the best choice for chromosome-level scaffolding because it can capture the longest-range DNA connectedness compared to 10× linked-reads and Bionano optical maps. However, a combination of multiple technologies can be used to further improve the quality and completeness of genome assembly.
For diploid assembly, hifiasm is the best tool for human diploid genome assembly using PacBio HiFi and Hi-C data. Looking to the future, we expect that further advancements in human diploid assemblers will leverage the power of PacBio HiFi reads and other technologies with long-range DNA connectedness to enable the generation of high-quality, chromosome-level and haplotype-resolved human genome assemblies.


Subjects
Benchmarking , Human Genome , Humans , DNA Sequence Analysis/methods , High-Throughput Nucleotide Sequencing/methods , DNA/genetics
16.
PLoS Comput Biol ; 19(8): e1010974, 2023 08.
Article in English | MEDLINE | ID: mdl-37590332

ABSTRACT

Proteolysis-targeting chimeras (PROTACs) are hetero-bifunctional molecules that induce the degradation of target proteins by recruiting an E3 ligase. PROTACs have the potential to inactivate disease-related genes that are considered undruggable by small molecules, making them a promising therapy for the treatment of incurable diseases. However, only a few hundred proteins have been experimentally tested for their amenability to PROTACs, and it remains unclear which other proteins in the entire human genome can be targeted by PROTACs. In this study, we have developed PrePROTAC, an interpretable machine learning model based on a transformer-based protein sequence descriptor and random forest classification. PrePROTAC predicts genome-wide targets that can be degraded by CRBN, one of the E3 ligases. In the benchmark studies, PrePROTAC achieved a ROC-AUC of 0.81, an average precision of 0.84, and over 40% sensitivity at a false positive rate of 0.05. When evaluated by an external test set which comprised proteins from different structural folds than those in the training set, the performance of PrePROTAC did not drop significantly, indicating its generalizability. Furthermore, we developed an embedding SHapley Additive exPlanations (eSHAP) method, which extends conventional SHAP analysis for original features to an embedding space through in silico mutagenesis. This method allowed us to identify key residues in the protein structure that play critical roles in PROTAC activity. The identified key residues were consistent with existing knowledge. Using PrePROTAC, we identified over 600 novel understudied proteins that are potentially degradable by CRBN and proposed PROTAC compounds for three novel drug targets associated with Alzheimer's disease.


Subjects
Alzheimer Disease , Ubiquitin-Protein Ligases , Humans , Amino Acid Sequence , Human Genome , Machine Learning , Proteolysis Targeting Chimera
17.
Nat Rev Cancer ; 23(10): 657-672, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37537310

ABSTRACT

The human genome is organized into multiple structural layers, ranging from chromosome territories to progressively smaller substructures, such as topologically associating domains (TADs) and chromatin loops. These substructures, collectively referred to as long-range chromatin interactions (LRIs), have a significant role in regulating gene expression. TADs are regions of the genome that harbour groups of genes and regulatory elements that frequently interact with each other and are insulated from other regions, thereby preventing widespread uncontrolled DNA contacts. Chromatin loops formed within TADs through enhancer and promoter interactions are elastic, allowing transcriptional heterogeneity and stochasticity. Over the past decade, it has become evident that the 3D genome structure, also referred to as the chromatin architecture, is central to many transcriptional cellular decisions. In this Review, we delve into the intricate relationship between steroid receptors and LRIs, discussing how steroid receptors interact with and modulate these chromatin interactions. Genetic alterations in the many processes involved in organizing the nuclear architecture are often associated with the development of hormone-dependent cancers. A better understanding of the interplay between architectural proteins and hormone regulatory networks can ultimately be exploited to develop improved approaches for cancer treatment.


Subjects
Chromatin , Neoplasms , Humans , Chromatin/genetics , Chromosomes , DNA , Human Genome , Gene Expression Regulation , Genetic Enhancer Elements , Neoplasms/genetics
18.
Cell Rep ; 42(8): 112930, 2023 08 29.
Article in English | MEDLINE | ID: mdl-37540596

ABSTRACT

The somatic mutations found in a cancer genome are imprinted by different mutational processes. Each process exhibits a characteristic mutational signature, which can be affected by the genome architecture. However, the interplay between mutational signatures and topographical genomic features has not been extensively explored. Here, we integrate mutations from 5,120 whole-genome-sequenced tumors from 40 cancer types with 516 topographical features from ENCODE to evaluate the effect of nucleosome occupancy, histone modifications, CTCF binding, replication timing, and transcription/replication strand asymmetries on the cancer-specific accumulation of mutations from distinct mutagenic processes. Most mutational signatures are affected by topographical features, with signatures of related etiologies being similarly affected. Certain signatures exhibit periodic behaviors or cancer-type-specific enrichments/depletions near topographical features, revealing further information about the processes that imprinted them. Our findings, disseminated via the COSMIC (Catalog of Somatic Mutations in Cancer) signatures database, provide a comprehensive online resource for exploring the interactions between mutational signatures and topographical features across human cancer.


Subjects
Neoplasms , Humans , Mutation/genetics , Neoplasms/genetics , Genomics , Base Sequence , Human Genome
19.
Nature ; 621(7979): 610-619, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37557913

ABSTRACT

The proper regulation of transcription is essential for maintaining genome integrity and executing other downstream cellular functions. Here we identify a stable association between the genome-stability regulator sensor of single-stranded DNA (SOSS) and the transcription regulator Integrator-PP2A (INTAC). Through SSB1-mediated recognition of single-stranded DNA, SOSS-INTAC stimulates promoter-proximal termination of transcription and attenuates R-loops associated with paused RNA polymerase II to prevent R-loop-induced genome instability. SOSS-INTAC-dependent attenuation of R-loops is enhanced by the ability of SSB1 to form liquid-like condensates. Deletion of NABP2 (encoding SSB1) or introduction of cancer-associated mutations into its intrinsically disordered region leads to a pervasive accumulation of R-loops, highlighting a genome surveillance function of SOSS-INTAC that enables timely termination of transcription at promoters to constrain R-loop accumulation and ensure genome stability.


Subjects
Genomic Instability , Genetic Promoter Regions , R-Loop Structures , Genetic Transcription Termination , Humans , Single-Stranded DNA/metabolism , Genomic Instability/genetics , Mutation , R-Loop Structures/genetics , RNA Polymerase II/metabolism , Genetic Promoter Regions/genetics , Human Genome , DNA-Binding Proteins/metabolism
20.
Sci Adv ; 9(32): eadg6319, 2023 08 09.
Article in English | MEDLINE | ID: mdl-37556544

ABSTRACT

Underrepresentation of non-European (EUR) populations hinders growth of global precision medicine. Resources such as imputation reference panels that match the study population are necessary to find low-frequency variants with substantial effects. We created a reference panel consisting of 14,393 whole-genome sequences including more than 11,000 Asian individuals. Genome-wide association studies were conducted using the reference panel and a population-specific genotype array of 72,298 subjects for eight phenotypes. This panel yields improved imputation accuracy of rare and low-frequency variants within East Asian populations compared with the largest reference panel. Thirty-nine previously unidentified associations were found, and more than half of the variants were East Asian specific. We discovered genes with rare protein-altering variants, including LTBP1 for height and GPR75 for body mass index, as well as putative regulatory mechanisms for rare noncoding variants with cell type-specific effects. We suggest that this dataset will add to the potential value of Asian precision medicine.


Subjects
East Asian People , Genome-Wide Association Study , Humans , Human Genome , Single Nucleotide Polymorphism , Genotype , G-Protein-Coupled Receptors/genetics