1.
Cell ; 185(11): 1986-2005.e26, 2022 05 26.
Article in English | MEDLINE | ID: mdl-35525246

ABSTRACT

Unlike copy number variants (CNVs), inversions remain an underexplored genetic variation class. By integrating multiple genomic technologies, we discover 729 inversions in 41 human genomes. Approximately 85% of inversions <2 kbp form by twin-priming during L1 retrotransposition; 80% of the larger inversions are balanced and affect twice as many nucleotides as CNVs. Balanced inversions show an excess of common variants, and 72% are flanked by segmental duplications (SDs) or retrotransposons. Since flanking repeats promote non-allelic homologous recombination, we developed complementary approaches to identify recurrent inversion formation. We describe 40 recurrent inversions encompassing 0.6% of the genome, showing inversion rates up to 2.7 × 10⁻⁴ per locus per generation. Recurrent inversions exhibit a sex-chromosomal bias and co-localize with genomic disorder critical regions. We propose that inversion recurrence results in an elevated number of heterozygous carriers and structural SD diversity, which increases mutability in the population and predisposes specific haplotypes to disease-causing CNVs.


Subjects
Chromosome Inversion, Genomic Segmental Duplications, Chromosome Inversion/genetics, DNA Copy Number Variations/genetics, Human Genome, Genomics, Humans
2.
BMC Genomics ; 24(1): 226, 2023 May 01.
Article in English | MEDLINE | ID: mdl-37127568

ABSTRACT

Open reading frames (ORFs) with fewer than 100 codons are generally not annotated in genomes, although bona fide genes of that size are known. Newer biochemical studies have suggested that thousands of small protein-coding ORFs (smORFs) may exist in the human genome, but the true number and the biological significance of the micropeptides they encode remain uncertain. Here, we used a comparative genomics approach to identify high-confidence smORFs that are likely protein-coding. We identified 3,326 high-confidence smORFs using constraint within human populations and evolutionary conservation as additional lines of evidence. Next, we validated that, as a group, our high-confidence smORFs are conserved at the amino-acid level rather than merely residing in highly conserved non-coding regions. Finally, we found that high-confidence smORFs are enriched among disease-associated variants from GWAS. Overall, our results highlight that smORF-encoded peptides likely have important functional roles in human disease.


Subjects
Peptides, Proteins, Humans, Open Reading Frames, Proteins/genetics, Peptides/genetics, Human Genome, Micropeptides
3.
Int J Mol Sci ; 24(3)2023 Jan 27.
Article in English | MEDLINE | ID: mdl-36768784

ABSTRACT

Next Generation Sequencing (NGS) technologies are rapidly entering clinical practice. A promising area for their use lies in the field of newborn screening. The mass screening of newborns using NGS technology leads to the discovery of a large number of new missense variants that need to be assessed for association with the development of hereditary diseases. Currently, the primary analysis and identification of pathogenic variations is carried out using bioinformatic tools. Although extensive efforts have been made in the computational approach to variant interpretation, there is currently no generally accepted pathogenicity predictor. In this study, we used the sequence-structure-property relationships (SSPR) approach, based on the representation of protein fragments by molecular structural formula. The approach predicts the pathogenic effect of single amino acid substitutions in proteins associated with twenty-five monogenic heritable diseases from the Uniform Screening Panel for Major Conditions recommended by the Advisory Committee on Hereditary Disorders in Newborns and Children. In order to create SSPR models of classification, we modified a piece of cheminformatics software, MultiPASS, that was originally developed for the prediction of activity spectra for drug-like substances. The created SSPR models were compared with traditional bioinformatic tools (SIFT 4G, Polyphen-2 HDIV, MutationAssessor, PROVEAN and FATHMM). The average AUC of our approach was 0.804 ± 0.040. Better quality scores were achieved for 15 of 25 proteins, with a significantly higher accuracy for some proteins (IVD, HADHB, HBB). The best SSPR models of classification are freely available in the online resource SAV-Pred (Single Amino acid Variants Predictor).


Subjects
Neonatal Screening, Software, Newborn Infant, Child, Humans, Amino Acid Substitution, Missense Mutation, Computational Biology
4.
BMC Bioinformatics ; 23(1): 401, 2022 Sep 29.
Article in English | MEDLINE | ID: mdl-36175857

ABSTRACT

BACKGROUND: Population variant analysis is of great importance for gathering insights into the links between human genotype and phenotype. The 1000 Genomes Project established a valuable reference for human genetic variation; however, the integrative use of the corresponding data with other datasets within existing repositories and pipelines is not fully supported. Particularly, there is a pressing need for flexible and fast selection of population partitions based on their variant and metadata-related characteristics. RESULTS: Here, we target general germline or somatic mutation data sources for their seamless inclusion within an interoperable-format repository, supporting integration among them and with other genomic data, as well as their integrated use within bioinformatic workflows. In addition, we provide VarSum, a data summarization service working on sub-populations of interest selected using filters on population metadata and/or variant characteristics. The service is developed as an optimized computational framework with an Application Programming Interface (API) that can be called from within any existing computing pipeline or programming script. Provided example use cases of biological interest show the relevance, power and ease of use of the API functionalities. CONCLUSIONS: The proposed data integration pipeline and data set extraction and summarization API pave the way for solid computational infrastructures that quickly process cumbersome variation data, and allow biologists and bioinformaticians to easily perform scalable analysis on user-defined partitions of large cohorts from increasingly available genetic variation studies. With the current tendency to large (cross)nation-wide sequencing and variation initiatives, we expect an ever growing need for the kind of computational support hereby proposed.


Subjects
Genomics, Metadata, Computational Biology, Genotype, Humans, Software
5.
Pediatr Int ; 63(8): 918-922, 2021 Aug.
Article in English | MEDLINE | ID: mdl-33260258

ABSTRACT

BACKGROUND: Wilson disease (WD) is an autosomal recessive disorder caused by mutations in the ATP7B gene. In 1984, Scheinberg and Sternlieb estimated the prevalence of WD to be 1:30 000. However, recent epidemiological studies have reported increasing prevalence rates in different populations. The carrier frequency of ATP7B variants and the prevalence of WD in the Japanese population have not been reported using multiple databases. METHODS: Multiple public databases were used. First, we included mutations in the ATP7B gene that were registered in the Human Gene Mutation Database (HGMD) Professional, where 885 ATP7B variants were identified as pathogenic. Next, we investigated the allele frequencies of these 885 variants in Japanese individuals using the Human Genetic Variation Database (HGVD) and the Japanese Multi Omics Reference Panel (jMorp). RESULTS: Of the 885 variants of ATP7B, 7 and 12 missense and nonsense variants, zero and three splicing variants, and zero and two small deletions were found in the HGVD and in jMorp, respectively. The total allele frequencies of the ATP7B mutations were 0.011 in the HGVD and 0.014 in the jMorp. According to these data, the carrier frequencies were 0.022 (2.2%) and 0.028 (2.8%), respectively, and patient frequencies were 0.000121 (1.21/10 000 individuals) and 0.000196 (1.96/10 000 individuals), respectively. CONCLUSIONS: This is the first study to report the carrier frequency of ATP7B variants and the prevalence of WD in Japan using multiple databases. The calculated prevalence of WD was comparatively higher than that of previous reports, indicating previous underdiagnosis or the existence of less severe phenotypes.
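
The abstract does not state how carrier and patient frequencies were derived from the allele frequencies. The reported numbers are consistent with a Hardy-Weinberg calculation for an autosomal recessive disorder, so the following is a minimal sketch under that assumption. The function name `recessive_frequencies` is hypothetical, and the carrier frequency is shown both as the exact 2pq and as the 2q approximation that matches the rounded figures in the abstract.

```python
def recessive_frequencies(q: float) -> dict[str, float]:
    """Hardy-Weinberg frequencies for a recessive disease allele.

    q is the summed frequency of pathogenic alleles (assumption: the
    population is in Hardy-Weinberg equilibrium and all pathogenic
    alleles behave as a single recessive class).
    """
    p = 1.0 - q
    return {
        "carrier_exact": 2.0 * p * q,   # heterozygotes, 2pq
        "carrier_approx": 2.0 * q,      # 2q, valid when q is small
        "patient": q * q,               # affected individuals, q^2
    }


# Allele frequencies reported in the abstract for each database.
for database, q in (("HGVD", 0.011), ("jMorp", 0.014)):
    freqs = recessive_frequencies(q)
    print(
        f"{database}: carrier ~ {freqs['carrier_approx']:.3f} "
        f"(exact {freqs['carrier_exact']:.4f}), "
        f"patient = {freqs['patient'] * 10_000:.2f} per 10 000"
    )
```

Under these assumptions the 2q approximation gives 0.022 and 0.028, and q² gives 1.21 and 1.96 per 10 000, in line with the values quoted above.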


Subjects
Hepatolenticular Degeneration, Copper-Transporting ATPases/genetics, Gene Frequency, Hepatolenticular Degeneration/diagnosis, Hepatolenticular Degeneration/epidemiology, Hepatolenticular Degeneration/genetics, Humans, Japan/epidemiology, Mutation, Prevalence
6.
Proc Natl Acad Sci U S A ; 114(52): E11257-E11266, 2017 12 26.
Article in English | MEDLINE | ID: mdl-29229813

ABSTRACT

The CRISPR-Cas9 nuclease system holds enormous potential for therapeutic genome editing of a wide spectrum of diseases. Large efforts have been made to further understanding of on- and off-target activity to assist the design of CRISPR-based therapies with optimized efficacy and safety. However, current efforts have largely focused on the reference genome or the genome of cell lines to evaluate guide RNA (gRNA) efficiency, safety, and toxicity. Here, we examine the effect of human genetic variation on both on- and off-target specificity. Specifically, we utilize 7,444 whole-genome sequences to examine the effect of variants on the targeting specificity of ∼3,000 gRNAs across 30 therapeutically implicated loci. We demonstrate that human genetic variation can alter the off-target landscape genome-wide including creating and destroying protospacer adjacent motifs (PAMs). Furthermore, single-nucleotide polymorphisms (SNPs) and insertions/deletions (indels) can result in altered on-target sites and novel potent off-target sites, which can predispose patients to treatment failure and adverse effects, respectively; however, these events are rare. Taken together, these data highlight the importance of considering individual genomes for therapeutic genome-editing applications for the design and evaluation of CRISPR-based therapies to minimize risk of treatment failure and/or adverse outcomes.
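
To illustrate what "creating and destroying" a PAM means, here is a minimal sketch that assumes the canonical NGG motif recognized by Streptococcus pyogenes Cas9. The abstract does not say which PAM definitions were used, and the helper names `is_ngg_pam` and `pam_effect` are hypothetical. Only the three PAM bases on one strand are compared; this is not the authors' pipeline.

```python
def is_ngg_pam(triplet: str) -> bool:
    """Return True if a 3-nt sequence matches the NGG pattern (N = any base)."""
    t = triplet.upper()
    return len(t) == 3 and t[0] in "ACGT" and t[1:] == "GG"


def pam_effect(ref_triplet: str, alt_triplet: str) -> str:
    """Classify how a variant changes a candidate PAM site.

    ref_triplet: the three bases at the site in the reference genome.
    alt_triplet: the same three bases in an individual carrying a variant.
    """
    ref_ok = is_ngg_pam(ref_triplet)
    alt_ok = is_ngg_pam(alt_triplet)
    if ref_ok and not alt_ok:
        return "destroyed"
    if alt_ok and not ref_ok:
        return "created"
    return "unchanged"


print(pam_effect("TGG", "TGA"))  # a G>A SNP removes the second G of the PAM
print(pam_effect("TAG", "TGG"))  # an A>G SNP introduces a new NGG site
print(pam_effect("AGG", "CGG"))  # a change at the N position leaves the PAM intact
```

A destroyed PAM at the intended target would be expected to reduce on-target cutting, while a created PAM next to a near-matching protospacer could open a new off-target site, which is the distinction the abstract draws.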


Subjects
CRISPR-Cas Systems, Genetic Loci, Genetic Therapy, Single Nucleotide Polymorphism, Kinetoplastida Guide RNA/genetics, Humans
7.
Retrovirology ; 15(1): 78, 2018 12 17.
Article in English | MEDLINE | ID: mdl-30558640

ABSTRACT

BACKGROUND: The APOBEC3 (A3) family of DNA cytosine deaminases provides an innate barrier to infection by retroviruses including HIV-1. A total of five enzymes, A3C, A3D, A3F, A3G and A3H, are degraded by the viral accessory protein Vif and expressed at high levels in CD4+ T cells, the primary reservoir for HIV-1 replication in vivo. Apart from A3C, all of these enzymes mediate restriction of Vif-deficient HIV-1. However, a rare variant of human A3C (Ile188) was shown recently to restrict Vif-deficient HIV-1 in a 293T-based single cycle infection system. The potential activity of this naturally occurring A3C variant has yet to be characterized in a T cell-based spreading infection system. Here we employ a combination of Cas9/gRNA disruption and transient and stable protein expression to assess the roles of major Ser188 and minor Ile188 A3C variants in HIV-1 restriction in T cell lines. RESULTS: Cas9-mediated mutation of endogenous A3C in the non-permissive CEM2n T cell line did not alter HIV-1 replication kinetics, and complementation with A3C-Ser188 or A3C-Ile188 was similarly aphenotypic. Stable expression of A3C-Ser188 in the permissive T cell line SupT11 also had little effect. However, stable expression of A3C-Ile188 in SupT11 cells inhibited Vif-deficient virus replication and inflicted G-to-A mutations. CONCLUSIONS: A3C-Ile188 is capable of inhibiting Vif-deficient HIV-1 replication in T cells. Although A3C is eclipsed by the dominant anti-viral activities of other A3s in non-permissive T cell lines and primary T lymphocytes, this enzyme may still be able to contribute to HIV-1 diversification in vivo. Our results highlight the functional redundancy in the human A3 family with regards to HIV-1 restriction and the need to consider naturally occurring variants.


Subjects
Cytidine Deaminase/genetics, Genetic Variation, HIV-1/immunology, CRISPR-Associated Protein 9/genetics, HEK293 Cells, HIV-1/physiology, Host Microbial Interactions, Humans, Innate Immunity, Virus Replication, Human Immunodeficiency Virus vif Gene Products/genetics
8.
Curr HIV/AIDS Rep ; 15(6): 431-440, 2018 12.
Article in English | MEDLINE | ID: mdl-30218255

ABSTRACT

PURPOSE OF REVIEW: Human genetic polymorphisms known to influence HIV acquisition and disease progression occur in Papua New Guinea (PNG). However, no genetic association study has been reported so far. In this article, we review research findings, with a view to stimulate genotype-to-phenotype research. RECENT FINDINGS: PNG, a country in Oceania, has a high prevalence of HIV and many sexually transmitted infections. While limited data is available from this country regarding the distribution of human genetic polymorphisms known to influence clinical outcomes of HIV/AIDS, genetic association studies are lacking. Our studies, in the past decade, have revealed that polymorphisms in chemokine receptor-ligand (CCR2-CCR5, CXCL12), innate immune (Toll-like receptor, β-defensin), and antiretroviral drug-metabolism enzyme (CYP2B6, UGT2B7) genes are prevalent in PNG. Although our results need to be validated in further studies, it is urgent to pursue large-scale, comprehensive genetic association studies that include these as well as additional genetic polymorphisms.


Subjects
Genetic Predisposition to Disease, Genetic Variation, HIV Infections/epidemiology, HIV Infections/genetics, HIV, Humans, Papua New Guinea/epidemiology
9.
J Hist Biol ; 51(4): 841-873, 2018 12.
Article in English | MEDLINE | ID: mdl-30338423

ABSTRACT

In this article we examine the history of the production of microarray technologies and their role in constructing and operationalizing views of human genetic difference in contemporary genomics. Rather than the "turn to difference" emerging as a post-Human Genome Project (HGP) phenomenon, interest in individual and group differences was a central, motivating concept in human genetics throughout the twentieth century. This interest was entwined with efforts to develop polymorphic "genetic markers" for studying human traits and diseases. We trace the technological, methodological and conceptual strategies in the late twentieth century that established single nucleotide polymorphisms (SNPs) as key focal points for locating difference in the genome. By embedding SNPs in microarrays, researchers created a technology that they used to catalog and assess human genetic variation. In the process of making genetic markers and array-based technologies to track variation, scientists also made commitments to ways of describing, cataloging and "knowing" human genetic differences that refracted difference through a continental geographic lens. We show how difference came to matter in both senses of the term: difference was made salient to, and inscribed on, genetic matter(s), as a result of the decisions, assessments and choices of collaborative and hybrid research collectives in medical genomics research.


Subjects
Genetic Markers, Genomics/history, Oligonucleotide Array Sequence Analysis/history, Single Nucleotide Polymorphism, 20th Century History, Human Genome Project/history, Humans
10.
J Med Philos ; 42(3): 328-341, 2017 Jun 01.
Article in English | MEDLINE | ID: mdl-28419342

ABSTRACT

The possibility of performing germline modifications on currently living individuals targets future generations' health and well-being by reducing the diversity of the human gene pool. This can have two negative repercussions: (1) reduction of heterozygosity, the latter being associated with a health or performance advantage; (2) uniformization of the genes involved in reproductive recombination, which may lead to the health risks involved in asexual reproduction. I argue that germline interventions aimed at modifying the genomes of future people cannot be ethically justifiable if there is no possibility of controlling the intervention either by reversing or altering it, whenever need demands it. This argument is challenged on six different grounds: safety, population versus individual focus, spontaneous mutations, exceptionalism, the intentional pursuit of genetic diversity through germline interventions, and harm reduction potential.


Subjects
Forecasting, Genetic Enhancement, Heterozygote, Humans, Genetic Recombination, Asexual Reproduction
11.
J Biol Chem ; 289(7): 4455-69, 2014 Feb 14.
Article in English | MEDLINE | ID: mdl-24338022

ABSTRACT

Pancreastatin (PST), a chromogranin A-derived peptide, is a potent physiological inhibitor of glucose-induced insulin secretion. PST also triggers glycogenolysis in liver and reduces glucose uptake in adipocytes and hepatocytes. Here, we probed for genetic variations in PST sequence and identified two variants within its functionally important carboxyl terminus domain: E287K and G297S. To understand functional implications of these amino acid substitutions, we tested the effects of wild-type (PST-WT), PST-287K, and PST-297S peptides on various cellular processes/events. The rank order of efficacy to inhibit insulin-stimulated glucose uptake was: PST-297S > PST-287K > PST-WT. The PST peptides also displayed the same order of efficacy for enhancing intracellular nitric oxide and Ca²⁺ levels in various cell types. In addition, PST peptides activated gluconeogenic genes in the following order: PST-297S ≈ PST-287K > PST-WT. Consistent with these in vitro results, the common PST variant allele Ser-297 was associated with significantly higher (by ∼17 mg/dl, as compared with the wild-type Gly-297 allele) plasma glucose level in our study population (n = 410). Molecular modeling and molecular dynamics simulations predicted the following rank order of α-helical content: PST-297S > PST-287K > PST-WT. Corroboratively, circular dichroism analysis of PST peptides revealed significant differences in global structures (e.g. the order of propensity to form α-helix was: PST-297S ≈ PST-287K > PST-WT). This study provides a molecular basis for enhanced potencies/efficacies of human PST variants (likely to occur in ∼300 million people worldwide) and has quantitative implications for inter-individual variations in glucose/insulin homeostasis.


Subjects
Genetic Variation, Missense Mutation, Pancreatic Hormones, 3T3-L1 Cells, Adult, Amino Acid Substitution, Animals, Blood Glucose/metabolism, Circular Dichroism, Female, Hep G2 Cells, Humans, Insulin/blood, Male, Mice, Pancreatic Hormones/blood, Pancreatic Hormones/chemistry, Pancreatic Hormones/genetics, Pancreatic Hormones/pharmacology, Tertiary Protein Structure, Structure-Activity Relationship
12.
Hum Biol ; 87(4): 361-371, 2015 Oct.
Article in English | MEDLINE | ID: mdl-27737584

ABSTRACT

Despite major public health initiatives, significant disparities persist among racially and ethnically defined groups in the prevalence of disease, access to medical care, quality of medical care, and health outcomes for common causes of morbidity and mortality in the United States. It is critical that we develop new and creative strategies to address such inequities; mitigate the social, environmental, institutional, and genetic determinants of poor health; and combat the persistence of racial profiling in clinical contexts that further exacerbates racial/ethnic health disparities. This article argues that medical education is a prime target for intervention and that anthropologists and human population geneticists should play a role in efforts to reform US medical curricula. Medical education would benefit greatly by incorporating anthropological and genetic perspectives on the complexities of race, human genetic variation, epigenetics, and the causes of racial/ethnic disparities. Medical students and practicing physicians should also receive training on how to use this knowledge to improve clinical practice, diagnosis, and treatment for racially diverse populations.


Subjects
Abdominal Pain/ethnology, Sickle Cell Anemia/ethnology, Medical Education/organization & administration, Genetic Variation/genetics, Health Status Disparities, Healthcare Disparities/ethnology, Abdominal Pain/diagnosis, Abdominal Pain/etiology, Sickle Cell Anemia/diagnosis, Anthropology, Child, Curriculum, Disease/ethnology, Epigenomics, Ethnicity, Humans, Male, Morbidity, Mortality/ethnology, Racial Groups, Social Segregation, United States/epidemiology, United States/ethnology, Young Adult
13.
Sci Rep ; 14(1): 14208, 2024 06 20.
Article in English | MEDLINE | ID: mdl-38902252

ABSTRACT

The COVID-19 disease is an ongoing global health concern. Although vaccination provides some protection, people are still susceptible to re-infection. Ostensibly, certain populations or clinical groups may be more vulnerable. Factors causing these differences are unclear and whilst socioeconomic and cultural differences are likely to be important, human genetic factors could influence susceptibility. Experimental studies indicate SARS-CoV-2 uses innate immune suppression as a strategy to speed up entry and replication into the host cell. Therefore, it is necessary to understand the impact of variants in immunity-associated human proteins on susceptibility to COVID-19. In this work, we analysed missense coding variants in several SARS-CoV-2 proteins and their human protein interactors that could enhance binding affinity to SARS-CoV-2. We curated a dataset of 19 SARS-CoV-2:human protein 3D complexes, from the experimentally determined structures in the Protein Data Bank and models built using AlphaFold2-multimer, and analysed the impact of missense variants occurring in the protein-protein interface region. We analysed 468 missense variants from human proteins and 212 variants from SARS-CoV-2 proteins and computationally predicted their impacts on binding affinities for the human-viral protein complexes. We predicted a total of 26 affinity-enhancing variants from 13 human proteins implicated in increased binding affinity to SARS-CoV-2. These include key immunity-associated genes (TOMM70, ISG15, IFIH1, IFIT2, RPS3, PALS1, NUP98, AXL, ARF6, TRIMM, TRIM25) as well as important spike receptors (KREMEN1, AXL and ACE2). We report both common (e.g., Y13N in IFIH1) and rare variants in these proteins and discuss their likely structural and functional impact, using information on known and predicted functional sites. Potential mechanisms associated with immune suppression implicated by these variants are discussed. Occurrence of certain predicted affinity-enhancing variants should be monitored as they could lead to increased susceptibility and reduced immune response to SARS-CoV-2 infection in individuals/populations carrying them. Our analyses aid in understanding the potential impact of genetic variation in immunity-associated proteins on COVID-19 susceptibility and help guide drug-repurposing strategies.


Subjects
COVID-19, Missense Mutation, SARS-CoV-2, Humans, SARS-CoV-2/genetics, SARS-CoV-2/immunology, COVID-19/genetics, COVID-19/virology, COVID-19/immunology, Drug Repositioning, Viral Proteins/genetics, Viral Proteins/metabolism, Protein Binding, Genetic Predisposition to Disease, Disease Susceptibility, COVID-19 Drug Treatment
14.
Ann Hum Genet ; 77(5): 392-408, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23808542

ABSTRACT

South Asian populations harbor a high degree of genetic diversity, due in part to demographic history. Two studies on genome-wide variation in Indian populations have shown that most Indian populations show varying degrees of admixture between ancestral north Indian and ancestral south Indian components. As a result of this structure, genetic variation in India appears to follow a geographic cline. Similarly, Indian populations seem to show detectable differences in diabetes and obesity prevalence between different geographic regions of the country. We tested the hypothesis that genetic variation at diabetes- and obesity-associated loci may be potentially related to different genetic ancestries. We genotyped 2977 individuals from 61 populations across India for 18 SNPs in genes implicated in T2D and obesity. We examined patterns of variation in allele frequency across different geographical gradients and considered state of origin and language affiliation. Our results show that most of the 18 SNPs show no significant correlation with latitude, the geographic cline reported in previous studies, or by language family. Exceptions include KCNQ1 with latitude and THADA and JAK1 with language, which suggests that genetic variation at previously ascertained diabetes-associated loci may only partly mirror geographic patterns of genome-wide diversity in Indian populations.


Subjects
Diabetes Mellitus/epidemiology, Diabetes Mellitus/genetics, Genetic Loci, Genetic Variation, Obesity/genetics, Alleles, Type 2 Diabetes Mellitus/epidemiology, Type 2 Diabetes Mellitus/genetics, Gene Frequency, Genetic Predisposition to Disease, Genotype, Humans, India/epidemiology, Single Nucleotide Polymorphism, Prevalence
15.
Toxicol Appl Pharmacol ; 271(3): 395-404, 2013 Sep 15.
Article in English | MEDLINE | ID: mdl-21291902

ABSTRACT

Response to environmental chemicals can vary widely among individuals and between population groups. In human health risk assessment, data on susceptibility can be utilized by deriving risk levels based on a study of a susceptible population and/or an uncertainty factor may be applied to account for the lack of information about susceptibility. Defining genetic susceptibility in response to environmental chemicals across human populations is an area of interest in the NAS' new paradigm of toxicity pathway-based risk assessment. Data from high-throughput/high content (HT/HC), including -omics (e.g., genomics, transcriptomics, proteomics, metabolomics) technologies, have been integral to the identification and characterization of drug target and disease loci, and have been successfully utilized to inform the mechanism of action for numerous environmental chemicals. Large-scale population genotyping studies may help to characterize levels of variability across human populations at identified target loci implicated in response to environmental chemicals. By combining mechanistic data for a given environmental chemical with next generation sequencing data that provides human population variation information, one can begin to characterize differential susceptibility due to genetic variability to environmental chemicals within and across genetically heterogeneous human populations. The integration of such data sources will be informative to human health risk assessment.


Subjects
Factual Databases, Environmental Pollutants/toxicity, Genetic Predisposition to Disease, Humans, Genetic Polymorphism, Risk Assessment/methods
16.
J Mol Biol ; 435(20): 168260, 2023 10 15.
Article in English | MEDLINE | ID: mdl-37678708

ABSTRACT

Short tandem repeats (STRs) are consecutive repetitions of one to six nucleotide motifs. They are hypervariable due to the high prevalence of repeat unit insertions or deletions primarily caused by polymerase slippage during replication. Genetic variation at STRs has been shown to influence a range of traits in humans, including gene expression, cancer risk, and autism. Until recently STRs have been poorly studied since they pose significant challenges to bioinformatics analyses. Moreover, genome-wide analysis of STR variation in population-scale cohorts requires large amounts of data and computational resources. However, the recent advent of genome-wide analysis tools has resulted in multiple large genome-wide datasets of STR variation spanning nearly two million genomic loci in thousands of individuals from diverse populations. Here we present WebSTR, a database of genetic variation and other characteristics of genome-wide STRs across human populations. WebSTR is based on reference panels of more than 1.7 million human STRs created with state of the art repeat annotation methods and can easily be extended to include additional cohorts or species. It currently contains data based on STR genotypes for individuals from the 1000 Genomes Project, H3Africa, the Genotype-Tissue Expression (GTEx) Project and colorectal cancer patients from the TCGA dataset. WebSTR is implemented as a relational database with programmatic access available through an API and a web portal for browsing data. The web portal is publicly available at https://webstr.ucsd.edu.
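
The definition in the first sentence (consecutive repetitions of a 1 to 6 nucleotide motif) can be made concrete with a short regular-expression scan. This is only an illustrative sketch: it is not the repeat annotation method behind WebSTR, the function name `find_strs` and the `min_copies` threshold of 3 are assumptions, and it detects perfect repeats only.

```python
import re


def find_strs(sequence: str, min_copies: int = 3) -> list[tuple[int, str, int]]:
    """Find perfect tandem repeats of 1-6 nt motifs.

    Returns (start_index, motif, copy_number) tuples. The lazy {1,6}?
    quantifier makes the regex try the shortest motif first, so "AAAAAA"
    is reported as motif "A" x 6 rather than "AA" x 3.
    """
    if min_copies < 2:
        raise ValueError("min_copies must be at least 2")
    # Group 1 is the motif; the backreference requires it to repeat
    # at least (min_copies - 1) more times immediately afterwards.
    pattern = re.compile(rf"([ACGT]{{1,6}}?)\1{{{min_copies - 1},}}")
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits


# Toy sequence containing a CAT motif repeated three times.
for start, motif, copies in find_strs("GGCATCATCATTT"):
    print(f"position {start}: ({motif}) x {copies}")
```

Changes in the copy number at such a locus, for example through polymerase slippage, are the kind of variation a genotype database of STRs records per individual.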


Subjects
Genetic Databases, Genetic Variation, Human Genome, Microsatellite Repeats, Humans, Computational Biology, Genotype, Microsatellite Repeats/genetics, Genome-Wide Association Study, Datasets as Topic, Colorectal Neoplasms/genetics
17.
J Cell Commun Signal ; 17(3): 485-493, 2023 Sep.
Article in English | MEDLINE | ID: mdl-36689135

ABSTRACT

Matricellular proteins comprise several families of secreted proteins that function in higher animals at the interface between cells and their surrounding extracellular matrix. Targeted gene disruptions that result in loss of viability in mice have revealed critical roles for several matricellular proteins in murine embryonic development, including two members of the cellular communication network (CCN) gene family. In contrast, mice lacking single or multiple members of the thrombospondin (THBS) gene family remain viable and fertile. The frequency of loss of function mutants, identified using human deep exome sequencing data, provided evidence that some of the essential genes in mice, including Ccn1, are also essential genes in humans. However, a deficit in loss of function mutants in humans indicated that THBS1 is also highly loss-intolerant. In addition to roles in embryonic development or adult reproduction, genes may be loss-intolerant in humans because their function is needed to survive environmental stresses that are encountered between birth and reproduction. Laboratory mice live in a protected environment that lacks the exposures to pathogens and injury that humans routinely face. However, subjecting Thbs1-/- mice to defined stresses has provided valuable insights into functions of thrombospondin-1 that could account for the loss-intolerance of THBS1 in humans. Stress response models using transgenic mice have identified protective functions of thrombospondin-1 in the cardiovascular system and immune defenses that could account for its intolerance to loss of function mutants in humans.

18.
Dev Cell ; 58(20): 2112-2127.e4, 2023 10 23.
Article in English | MEDLINE | ID: mdl-37586368

ABSTRACT

Controlled release of promoter-proximal paused RNA polymerase II (RNA Pol II) is crucial for gene regulation. However, studying RNA Pol II pausing is challenging, as pause-release factors are almost all essential. In this study, we identified heterozygous loss-of-function mutations in SUPT5H, which encodes SPT5, in individuals with β-thalassemia. During erythropoiesis in healthy human cells, cell cycle genes were highly paused as cells transition from progenitors to precursors. When the pathogenic mutations were recapitulated by SUPT5H editing, RNA Pol II pause release was globally disrupted, and as cells began transitioning from progenitors to precursors, differentiation was delayed, accompanied by a transient lag in erythroid-specific gene expression and cell cycle kinetics. Despite this delay, cells terminally differentiate, and cell cycle phase distributions normalize. Therefore, hindering pause release perturbs proliferation and differentiation dynamics at a key transition during erythropoiesis, identifying a role for RNA Pol II pausing in temporally coordinating the cell cycle and erythroid differentiation.


Subjects
Gene Expression Regulation , RNA Polymerase II , Humans , RNA Polymerase II/genetics , RNA Polymerase II/metabolism , Cell Differentiation , Cell Cycle , Transcription, Genetic , Nuclear Proteins/metabolism , Transcriptional Elongation Factors/genetics
19.
HGG Adv ; 3(3): 100123, 2022 Jul 14.
Article in English | MEDLINE | ID: mdl-35789587

ABSTRACT

The 1000 Genomes Project (TGP) is a foundational resource that serves the biomedical community as a standard reference cohort for human genetic variation. There are now seven public versions of these genomes. The TGP Consortium produced the first by mapping its final data release against human reference sequence GRCh37, then "lifted over" these genomes to the improved reference sequence (GRCh38) when it was released, and remapped the original data to GRCh38 with two similar pipelines. As best-practice quality validation, the pipelines that generated these versions were benchmarked against the Genome In A Bottle Consortium's "platinum quality" genome (NA12878). The New York Genome Center recently released the results of independently resequencing the cohort at greater depth (30×), a phased version informed by the inclusion of related individuals, and independently remapped the original variant calls to GRCh38. We performed a cross-comparison evaluation of all seven versions using genome fingerprinting, which supports ultrafast genome comparison even across reference versions. We noted multiple issues, including discrepancies in cohort membership, disagreement on the overall level of variation, evidence of substandard pipeline performance on specific genomes and in specific regions of the genome, cryptic relationships between individuals, inconsistent phasing, and annotation distortions caused by the history of the reference genome itself. We therefore recommend global quality assessment by rapid genome comparisons, alongside benchmarking as part of best-practice quality assessment of large genome datasets. Our observations also help inform the decision of which version to use, to support analyses by individual researchers.

20.
OMICS ; 25(1): 23-37, 2021 01.
Article in English | MEDLINE | ID: mdl-33058752

ABSTRACT

Single-nucleotide polymorphisms (SNPs) are single-base variants that contribute to human biological variation and pathogenesis of many human diseases. Among all SNP types, nonsynonymous single-nucleotide polymorphisms (nsSNPs) can alter many structural, biochemical, and functional features of a protein such as folding characteristics, charge distribution, stability, dynamics, and interactions with other proteins/nucleotides. These modifications in the protein structure can lead nsSNPs to be closely associated with many multifactorial diseases such as cancer, diabetes, and neurodegenerative diseases. Predicting structural and functional effects of nsSNPs with experimental approaches can be time-consuming and costly; hence, computational prediction tools and algorithms are being widely and increasingly utilized in biology and medical research. This expert review examines the in silico tools and algorithms for the prediction of functional or structural effects of SNP variants, in addition to the description of the phenotypic effects of nsSNPs on protein structure, association between pathogenicity of variants, and functional or structural features of disease-associated variants. Finally, case studies investigating the functional and structural effects of nsSNPs on selected protein structures are highlighted. We conclude that creating a consistent workflow with a combination of in silico approaches or tools should be considered to increase the performance, accuracy, and precision of the biological and clinical predictions made in silico.


Subjects
Computational Biology/methods , Models, Biological , Models, Molecular , Polymorphism, Single Nucleotide , Proteins/chemistry , Proteins/genetics , Algorithms , Disease Susceptibility , Humans , Reproducibility of Results , Structure-Activity Relationship