Results 1 - 20 of 26
1.
Proc Natl Acad Sci U S A ; 115(8): E1859-E1866, 2018 02 20.
Article in English | MEDLINE | ID: mdl-29434036

ABSTRACT

In individuals with autism spectrum disorder (ASD), de novo mutations have previously been shown to be significantly correlated with lower IQ but not with the core characteristics of ASD: deficits in social communication and interaction and restricted interests and repetitive patterns of behavior. We extend these findings by demonstrating in the Simons Simplex Collection that damaging de novo mutations in ASD individuals are also significantly and convincingly correlated with measures of impaired motor skills. This correlation is not explained by a correlation between IQ and motor skills. We find that IQ and motor skills are distinctly associated with damaging mutations and, in particular, that motor skills are a more sensitive indicator of mutational severity than is IQ, as judged by mutational type and target gene. We use this finding to propose a combined classification of phenotypic severity: mild (little impairment of either), moderate (impairment mainly to motor skills), and severe (impairment of both IQ and motor skills).


Subjects
Autism Spectrum Disorder/genetics, Motor Skills/physiology, Child, Female, Genotype, Humans, Male, Mutation
2.
Nucleic Acids Res ; 44(6): 2501-13, 2016 Apr 07.
Article in English | MEDLINE | ID: mdl-26926108

ABSTRACT

Existing methods for interpreting protein variation focus on annotating mutation pathogenicity rather than detailed interpretation of variant deleteriousness and frequently use only sequence-based or structure-based information. We present VIPUR, a computational framework that seamlessly integrates sequence analysis and structural modelling (using the Rosetta protein modelling suite) to identify and interpret deleterious protein variants. To train VIPUR, we collected 9477 protein variants with known effects on protein function from multiple organisms and curated structural models for each variant from crystal structures and homology models. VIPUR can be applied to mutations in any organism's proteome with improved generalized accuracy (AUROC .83) and interpretability (AUPR .87) compared to other methods. We demonstrate that VIPUR's predictions of deleteriousness match the biological phenotypes in ClinVar and provide a clear ranking of prediction confidence. We use VIPUR to interpret known mutations associated with inflammation and diabetes, demonstrating the structural diversity of disrupted functional sites and improved interpretation of mutations associated with human diseases. Lastly, we demonstrate VIPUR's ability to highlight candidate variants associated with human diseases by applying VIPUR to de novo variants associated with autism spectrum disorders.


Subjects
Autism Spectrum Disorder/genetics, Celiac Disease/genetics, Crohn Disease/genetics, Diabetes Mellitus/genetics, Mutation, Proteins/genetics, Software, Animals, Autism Spectrum Disorder/metabolism, Autism Spectrum Disorder/pathology, Benchmarking, Celiac Disease/metabolism, Celiac Disease/pathology, Crohn Disease/metabolism, Crohn Disease/pathology, Data Mining, Protein Databases, Diabetes Mellitus/metabolism, Diabetes Mellitus/pathology, Humans, Inflammation, Molecular Models, Molecular Sequence Annotation, Proteins/chemistry, Proteins/metabolism
3.
PLoS Genet ; 9(9): e1003816, 2013.
Article in English | MEDLINE | ID: mdl-24086153

ABSTRACT

Single base substitutions constitute the most frequent type of human gene mutation and are a leading cause of cancer and inherited disease. These alterations occur non-randomly in DNA, being strongly influenced by the local nucleotide sequence context. However, the molecular mechanisms underlying such sequence context-dependent mutagenesis are not fully understood. Using bioinformatics, computational and molecular modeling analyses, we have determined the frequencies of mutation at G•C bp in the context of all 64 5'-NGNN-3' motifs that contain the mutation at the second position. Twenty-four datasets were employed, comprising >530,000 somatic single base substitutions from 21 cancer genomes, >77,000 germline single-base substitutions causing or associated with human inherited disease and 16.7 million benign germline single-nucleotide variants. In several cancer types, the number of mutated motifs correlated both with the free energies of base stacking and the energies required for abstracting an electron from the target guanines (ionization potentials). Similar correlations were also evident for the pathological missense and nonsense germline mutations, but only when the target guanines were located on the non-transcribed DNA strand. Likewise, pathogenic splicing mutations predominantly affected positions in which a purine was located on the non-transcribed DNA strand. Novel candidate driver mutations and tissue-specific mutational patterns were also identified in the cancer datasets. We conclude that electron transfer reactions within the DNA molecule contribute to sequence context-dependent mutagenesis, involving both somatic driver and passenger mutations in cancer, as well as germline alterations causing or associated with inherited disease.


Subjects
Amino Acid Substitution/genetics, Inborn Genetic Diseases/genetics, Guanine, Neoplasms/genetics, Computational Biology, Neoplasm DNA/genetics, Inborn Genetic Diseases/pathology, Germ-Line Mutation, Humans, Molecular Models, Neoplasms/pathology, Nucleotide Motifs/genetics
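
The 64 5'-NGNN-3' motifs mentioned in the abstract above follow from fixing G at the second position and letting the other three positions range over the four bases (4 × 4 × 4 = 64). The following is a minimal sketch of how such motifs could be enumerated and tallied, not the authors' pipeline. The function name `count_ngnn_contexts` and its inputs are hypothetical, and for simplicity it only handles mutated guanines on the given strand (a mutated C would first need to be reverse-complemented).

```python
from collections import Counter
from itertools import product

BASES = "ACGT"

# All 5'-NGNN-3' motifs: G is fixed at the second position, the other three vary.
NGNN_MOTIFS = [f"{a}G{b}{c}" for a, b, c in product(BASES, repeat=3)]


def count_ngnn_contexts(sequence, mutated_positions):
    """Tally the NGNN motif surrounding each mutated guanine.

    sequence: reference DNA string (hypothetical input).
    mutated_positions: 0-based indices of mutated G bases (hypothetical input).
    """
    counts = Counter()
    for pos in mutated_positions:
        # The motif spans one base upstream and two bases downstream of the G.
        if pos < 1 or pos + 2 >= len(sequence):
            continue  # too close to either end to form a full 4-mer
        motif = sequence[pos - 1:pos + 3].upper()
        if motif[1] == "G" and set(motif) <= set(BASES):
            counts[motif] += 1
    return counts


example_counts = count_ngnn_contexts("AGCTTGACA", [1, 5])
```

Motif counts of this kind are what one would then correlate with per-motif physical quantities such as stacking free energies or ionization potentials, which the sketch does not attempt to compute.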
4.
Genome Res ; 22(5): 870-84, 2012 May.
Article in English | MEDLINE | ID: mdl-22367191

ABSTRACT

Endogenous retrotransposons have caused extensive genomic variation within mammalian species, but the functional implications of such mobilization are mostly unknown. We mapped thousands of endogenous retrovirus (ERV) germline integrants in highly divergent, previously unsequenced mouse lineages, facilitating a comparison of gene expression in the presence or absence of local insertions. Polymorphic ERVs occur relatively infrequently in gene introns and are particularly depleted from genes involved in embryogenesis or that are highly expressed in embryonic stem cells. Their genomic distribution implies ongoing negative selection due to deleterious effects on gene expression and function. A polymorphic, intronic ERV at Slc15a2 triggers up to 49-fold increases in premature transcriptional termination and up to 39-fold reductions in full-length transcripts in adult mouse tissues, thereby disrupting protein expression and functional activity. Prematurely truncated transcripts also occur at Polr1a, Spon1, and up to ∼5% of other genes when intronic ERV polymorphisms are present. Analysis of expression quantitative trait loci (eQTLs) in recombinant BxD mouse strains demonstrated very strong genetic associations between the polymorphic ERV in cis and disrupted transcript levels. Premature polyadenylation is triggered at genomic distances up to >12.5 kb upstream of the ERV, both in cis and between alleles. The parent of origin of the ERV is associated with variable expression of nonterminated transcripts and differential DNA methylation at its 5'-long terminal repeat. This study defines an unexpectedly strong functional impact of ERVs in disrupting gene transcription at a distance and demonstrates that ongoing retrotransposition can contribute significantly to natural phenotypic diversity.


Subjects
Endogenous Retroviruses/genetics, Gene Expression Regulation, Genetic Transcription, Animals, Base Sequence, Chromosome Mapping, DNA Methylation, Female, Genetic Variation, Heterozygote, Introns, Male, Mice, Inbred C57BL Mice, Molecular Sequence Data, Genetic Polymorphism, Protein Biosynthesis/genetics, Quantitative Trait Loci, DNA Sequence Analysis, Symporters/genetics, Symporters/metabolism, Terminal Repeat Sequences
5.
Bioinformatics ; 30(7): 1013-4, 2014 Apr 01.
Article in English | MEDLINE | ID: mdl-24215028

ABSTRACT

MOTIVATION: The plethora of information that emerges from large-scale genome characterization studies has triggered the development of computational frameworks and tools for efficient analysis, interpretation and visualization of genomic data. Functional annotation of genomic variations and the ability to visualize the data in the context of whole genome and/or multiple genomes has remained a challenging task. We have developed an interactive web-based tool, AVIA (Annotation, Visualization and Impact Analysis), to explore and interpret large sets of genomic variations (single nucleotide variations and insertion/deletions) and to help guide and summarize genomic experiments. The annotation, summary plots and tables are packaged and can be downloaded by the user from the email link provided. AVAILABILITY AND IMPLEMENTATION: http://avia.abcc.ncifcrf.gov.


Subjects
Gene Deletion, Genome, Genomics/methods, Insertional Mutagenesis, Single Nucleotide Polymorphism, Software, Internet
6.
Nucleic Acids Res ; 41(Database issue): D94-D100, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23125372

ABSTRACT

The non-B DB, available at http://nonb.abcc.ncifcrf.gov, catalogs predicted non-B DNA-forming sequence motifs, including Z-DNA, G-quadruplex, A-phased repeats, inverted repeats, mirror repeats, direct repeats and their corresponding subsets: cruciforms, triplexes and slipped structures, in several genomes. Version 2.0 of the database revises and re-implements the motif discovery algorithms to better align with accepted definitions and thresholds for motifs, expands the non-B DNA-forming motifs coverage by including short tandem repeats and adds key visualization tools to compare motif locations relative to other genomic annotations. Non-B DB v2.0 extends the ability for comparative genomics by including re-annotation of the five organisms reported in non-B DB v1.0, human, chimpanzee, dog, macaque and mouse, and adds seven additional organisms: orangutan, rat, cow, pig, horse, platypus and Arabidopsis thaliana. Additionally, the non-B DB v2.0 provides an overall improved graphical user interface and faster query performance.


Subjects
DNA/chemistry, Nucleic Acid Databases, Animals, Computer Graphics, Dogs, Humans, Internet, Mice, Molecular Sequence Annotation, Nucleotide Motifs, Rats, Nucleic Acid Repetitive Sequences, Software, User-Computer Interface
7.
BMC Genomics ; 15: 394, 2014 May 22.
Article in English | MEDLINE | ID: mdl-24885769

ABSTRACT

BACKGROUND: Closely spaced long inverted repeats, also known as DNA palindromes, can undergo intrastrand annealing to form DNA hairpins. The ability to form these hairpins results in genome instability, difficulties in maintaining clones in Escherichia coli and major problems for most DNA sequencing approaches. Because of their role in genomic instability and gene amplification in some human cancers, it is important to develop systematic approaches to detect and characterize DNA palindromes. RESULTS: We developed a new protocol to identify palindromes that couples the S1 nuclease treated Cot0 DNA (GAPF) with high-throughput sequencing (GAP-Seq). Unlike earlier protocols, it does not involve restriction enzymatic digestion prior to DNA snap-back thereby preserving longer DNA sequences. It also indicates the location of the novel junction, which can then be recovered. Using MCF-7 breast cancer cell line as the proof-of-principle analysis, we have identified 35 palindrome candidates and physically characterized the top 5 candidates and their junctions. Because this protocol eliminates many of the false positives that plague earlier techniques, we have improved palindrome identification. CONCLUSIONS: The GAP-Seq approach underscores the importance of developing new tools for identifying and characterizing palindromes, and provides a new strategy to systematically assess palindromes in genomes. It will be useful for studying human cancers and other diseases associated with palindromes.


Subjects
DNA/genetics, High-Throughput Nucleotide Sequencing/methods, Computational Biology, Humans, MCF-7 Cells, Polymerase Chain Reaction
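
GAP-Seq itself is a laboratory protocol, but the sequence feature it targets can be illustrated in a few lines. A DNA palindrome (inverted repeat) is a stretch whose left arm is followed, after an optional short spacer, by the reverse complement of that arm, which is what allows intrastrand annealing into a hairpin. The sketch below is a naive scan under assumed parameters (`arm_len`, `max_spacer`); it is not the GAP-Seq analysis pipeline, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq, arm_len=6, max_spacer=4):
    """Naively list (start, spacer, arm) for every inverted repeat found.

    arm_len and max_spacer are illustrative defaults, not published thresholds.
    """
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - 2 * arm_len + 1):
        arm = seq[start:start + arm_len]
        target = reverse_complement(arm)
        for spacer in range(max_spacer + 1):
            right = start + arm_len + spacer
            # A slice running past the end is shorter than target and cannot match.
            if seq[right:right + arm_len] == target:
                hits.append((start, spacer, arm))
    return hits


example_hits = find_inverted_repeats("ACGGTTCCGT", arm_len=4, max_spacer=2)
```

The quadratic scan is adequate for short sequences; genome-scale searches would need an indexed approach.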
8.
bioRxiv ; 2024 Feb 11.
Article in English | MEDLINE | ID: mdl-38370639

ABSTRACT

The exploration of genotypic variants impacting phenotypes is a cornerstone in genetics research. The emergence of vast collections containing deeply genotyped and phenotyped families has made it possible to pursue the search for variants associated with complex diseases. However, managing these large-scale datasets requires specialized computational tools tailored to organize and analyze the extensive data. GPF (Genotypes and Phenotypes in Families) is an open-source platform ( https://github.com/iossifovlab/gpf ) that manages genotypes and phenotypes derived from collections of families. The GPF interface allows interactive exploration of genetic variants, enrichment analysis for de novo mutations, and phenotype/genotype association tools. In addition, GPF allows researchers to share their data securely with the broader scientific community. GPF is used to disseminate two large-scale family collection datasets (SSC, SPARK) for the study of autism funded by the SFARI foundation. However, GPF is versatile and can manage genotypic data from other small or large family collections. Our GPF-SFARI GPF instance ( https://gpf.sfari.org/ ) provides protected access to comprehensive genotypic and phenotypic data for the SSC and SPARK. In addition, GPF-SFARI provides public access to an extensive collection of de novo mutations identified in individuals with autism and related disorders and to gene-level statistics of the protected datasets characterizing the genes' roles in autism. Here, we highlight the primary features of GPF within the context of GPF-SFARI.

9.
Mol Cancer ; 12: 13, 2013 Feb 14.
Article in English | MEDLINE | ID: mdl-23409773

ABSTRACT

BACKGROUND: Ultraconserved regions (UCR) are genomic segments of more than 200 base pairs that are evolutionarily conserved among mammalian species. They are thought to have functions as transcriptional enhancers and regulators of alternative splicing. Recently, it was shown that numerous RNAs are transcribed from these regions. These UCR-encoded transcripts (ucRNAs) were found to be expressed in a tissue- and disease-specific manner and may interfere with the function of other RNAs through RNA:RNA interactions. We hypothesized that ucRNAs have unidentified roles in the pathogenesis of human prostate cancer. In a pilot study, we examined ucRNA expression profiles in human prostate tumors. METHODS: Using a custom microarray with 962 probesets representing sense and antisense sequences for the 481 human UCRs, we examined ucRNA expression in resected, fresh-frozen human prostate tissues (57 tumors, 7 non-cancerous prostate tissues) and in cultured prostate cancer cells treated with either epigenetic drugs (the hypomethylating agent, 5-Aza 2'deoxycytidine, and the histone deacetylase inhibitor, trichostatin A) or a synthetic androgen, R1881. Expression of selected ucRNAs was also assessed by qRT-PCR and NanoString®-based assays. Because ucRNAs may function as RNAs that target protein-coding genes through direct and inhibitory RNA:RNA interactions, computational analyses were applied to identify candidate ucRNA:mRNA binding pairs. RESULTS: We observed altered ucRNA expression in prostate cancer (e.g., uc.106+, uc.477+, uc.363+A, uc.454+A) and found that these ucRNAs were associated with cancer development, Gleason score, and extraprostatic extension after controlling for false discovery (false discovery rate < 5% for many of the transcripts). We also identified several ucRNAs that were responsive to treatment with either epigenetic drugs or androgen (R1881).
For example, experiments with LNCaP human prostate cancer cells showed that uc.287+ is induced by R1881 (P < 0.05) whereas uc.283+A was up-regulated following treatment with combined 5-Aza 2'deoxycytidine and trichostatin A (P < 0.05). Additional computational analyses predicted RNA loop-loop interactions of 302 different sense and antisense ucRNAs with 1058 different mRNAs, inferring possible functions of ucRNAs via direct interactions with mRNAs. CONCLUSIONS: This first study of ucRNA expression in human prostate cancer indicates an altered transcript expression in the disease.


Subjects
Adenocarcinoma/genetics, Prostatic Neoplasms/genetics, Neoplasm RNA/genetics, Transcriptome, Adenocarcinoma/metabolism, Aged, Azacitidine/analogs & derivatives, Azacitidine/pharmacology, Case-Control Studies, Tumor Cell Line, Conserved Sequence, Decitabine, Genetic Epigenesis/drug effects, Gene Expression, Neoplastic Gene Expression Regulation, Human Genome, Histone Deacetylase Inhibitors/pharmacology, Humans, Hydroxamic Acids/pharmacology, Male, Metribolone/pharmacology, Middle Aged, Oligonucleotide Array Sequence Analysis, Prostate/metabolism, Prostatic Neoplasms/metabolism, Messenger RNA/genetics, Neoplasm RNA/metabolism, Untranslated RNA/genetics, Untranslated RNA/metabolism, Testosterone Congeners/pharmacology
10.
Retrovirology ; 10: 18, 2013 Feb 13.
Article in English | MEDLINE | ID: mdl-23402264

ABSTRACT

BACKGROUND: 454 sequencing technology is a promising approach for characterizing HIV-1 populations and for identifying low frequency mutations. The utility of 454 technology for determining allele frequencies and linkage associations in HIV infected individuals has not been extensively investigated. We evaluated the performance of 454 sequencing for characterizing HIV populations with defined allele frequencies. RESULTS: We constructed two HIV-1 RT clones. Clone A was a wild type sequence. Clone B was identical to clone A except it contained 13 introduced drug resistant mutations. The clones were mixed at ratios ranging from 1% to 50% and were amplified by standard PCR conditions and by PCR conditions aimed at reducing PCR-based recombination. The products were sequenced using 454 pyrosequencing. Sequence analysis from standard PCR amplification revealed that 14% of all sequencing reads from a sample with a 50:50 mixture of wild type and mutant DNA were recombinants. The majority of the recombinants were the result of a single crossover event which can happen during PCR when the DNA polymerase terminates synthesis prematurely. The incompletely extended template then competes for primer sites in subsequent rounds of PCR. Although less often, a spectrum of other distinct crossover patterns was also detected. In addition, we observed point mutation errors ranging from 0.01% to 1.0% per base as well as indel (insertion and deletion) errors ranging from 0.02% to nearly 50%. The point errors (single nucleotide substitution errors) were mainly introduced during PCR while indels were the result of pyrosequencing. We then used new PCR conditions designed to reduce PCR-based recombination. Using these new conditions, the frequency of recombination was reduced 27-fold. The new conditions had no effect on point mutation errors. We found that 454 pyrosequencing was capable of identifying minority HIV-1 mutations at frequencies down to 0.1% at some nucleotide positions. 
CONCLUSION: Standard PCR amplification results in a high frequency of PCR-introduced recombination precluding its use for linkage analysis of HIV populations using 454 pyrosequencing. We designed a new PCR protocol that resulted in a much lower recombination frequency and provided a powerful technique for linkage analysis and haplotype determination in HIV-1 populations. Our analyses of 454 sequencing results also demonstrated that at some specific HIV-1 drug resistant sites, mutations can reliably be detected at frequencies down to 0.1%.


Subjects
Artifacts, Viral Drug Resistance, HIV-1/genetics, Microbial Sensitivity Tests/methods, Mutation, Genetic Recombination, DNA Sequence Analysis/methods, HIV Infections/virology, HIV-1/drug effects, Humans, Molecular Sequence Data, Polymerase Chain Reaction/methods, Research Design
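
The abstract above distinguishes wild-type reads, mutant reads, and PCR recombinants arising from one or more crossover events. One way such a classification could work, assuming each read is already aligned base-for-base to both clone sequences, is to look only at the positions where the clones differ and count how often the read switches parent along its length. This is a hedged sketch of that idea, not the authors' analysis code; `classify_read` and the toy sequences are hypothetical.

```python
def classify_read(read, clone_a, clone_b):
    """Label an aligned read as 'A', 'B', 'recombinant', or 'ambiguous'.

    Returns (label, number_of_crossovers). All three strings are assumed to be
    the same length and already aligned (an assumption of this sketch).
    """
    origins = []
    for base, a, b in zip(read, clone_a, clone_b):
        if a == b:
            continue  # position does not distinguish the two clones
        if base == a:
            origins.append("A")
        elif base == b:
            origins.append("B")
        # any other base is treated as a point error and ignored
    if not origins:
        return "ambiguous", 0
    # Each change of parent between consecutive informative sites is one crossover.
    crossovers = sum(1 for x, y in zip(origins, origins[1:]) if x != y)
    if crossovers == 0:
        return origins[0], 0
    return "recombinant", crossovers


CLONE_A = "ACGTACGT"  # toy wild-type sequence
CLONE_B = "ATGTCCGA"  # toy mutant, differing at positions 1, 4 and 7
single_crossover = classify_read("ACGTCCGA", CLONE_A, CLONE_B)
```

With 13 marker positions, as in the study's clone B, a single point error at one marker would register as two crossovers under this scheme, so a real analysis would need some tolerance for isolated mismatches.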
11.
Nucleic Acids Res ; 39(Database issue): D383-91, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21097885

ABSTRACT

Although the capability of DNA to form a variety of non-canonical (non-B) structures has long been recognized, the overall significance of these alternate conformations in biology has only recently become accepted en masse. In order to provide access to genome-wide locations of these classes of predicted structures, we have developed non-B DB, a database integrating annotations and analysis of non-B DNA-forming sequence motifs. The database provides the most complete list of alternative DNA structure predictions available, including Z-DNA motifs, quadruplex-forming motifs, inverted repeats, mirror repeats and direct repeats and their associated subsets of cruciforms, triplex and slipped structures, respectively. The database also contains motifs predicted to form static DNA bends, short tandem repeats and homo(purine•pyrimidine) tracts that have been associated with disease. The database has been built using the latest releases of the human, chimp, dog, macaque and mouse genomes, so that the results can be compared directly with other data sources. In order to make the data interpretable in a genomic context, features such as genes, single-nucleotide polymorphisms and repetitive elements (SINE, LINE, etc.) have also been incorporated. The database is accessed through query pages that produce results with links to the UCSC browser and a GBrowse-based genomic viewer. It is freely accessible at http://nonb.abcc.ncifcrf.gov.


Subjects
DNA/chemistry, Nucleic Acid Databases, Animals, Base Sequence, Dogs, Genomics, Humans, Macaca, Mice, Nucleic Acid Conformation, Pan troglodytes/genetics, Nucleic Acid Repetitive Sequences
12.
Nucleic Acids Res ; 38(Database issue): D600-6, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19933259

ABSTRACT

MouseIndelDB is an integrated database resource containing thousands of previously unreported mouse genomic indel (insertion and deletion) polymorphisms ranging from approximately 100 nt to 10 Kb in size. The database currently includes polymorphisms identified from our alignment of 26 million whole-genome shotgun sequence traces from four laboratory mouse strains mapped against the reference C57BL/6J genome using GMAP. They can be queried on a local level by chromosomal coordinates, nearby gene names or other genomic feature identifiers, or in bulk format using categories including mouse strain(s), class of polymorphism(s) and chromosome number. The results of such queries are presented either as a custom track on the UCSC mouse genome browser or in tabular format. We anticipate that the MouseIndelDB database will be widely useful for research in mammalian genetics, genomics, and evolutionary biology. Access to the MouseIndelDB database is freely available at: http://variation.osu.edu/.


Subjects
Computational Biology/methods, Genetic Databases, Nucleic Acid Databases, Genetic Polymorphism, Animals, Computational Biology/trends, Protein Databases, Genome, Information Storage and Retrieval/methods, Internet, Mice, Inbred C57BL Mice, Genetic Models, Software, Species Specificity, User-Computer Interface
13.
Nat Genet ; 54(9): 1305-1319, 2022 09.
Article in English | MEDLINE | ID: mdl-35982159

ABSTRACT

To capture the full spectrum of genetic risk for autism, we performed a two-stage analysis of rare de novo and inherited coding variants in 42,607 autism cases, including 35,130 new cases recruited online by SPARK. We identified 60 genes with exome-wide significance (P < 2.5 × 10⁻⁶), including five new risk genes (NAV3, ITSN1, MARK2, SCAF1 and HNRNPUL2). The association of NAV3 with autism risk is primarily driven by rare inherited loss-of-function (LoF) variants, with an estimated relative risk of 4, consistent with moderate effect. Autistic individuals with LoF variants in the four moderate-risk genes (NAV3, ITSN1, SCAF1 and HNRNPUL2; n = 95) have less cognitive impairment than 129 autistic individuals with LoF variants in highly penetrant genes (CHD8, SCN2A, ADNP, FOXP1 and SHANK3) (59% vs 88%, P = 1.9 × 10⁻⁶). Power calculations suggest that much larger numbers of autism cases are needed to identify additional moderate-risk genes.


Subjects
Autism Spectrum Disorder, Autistic Disorder, Autism Spectrum Disorder/genetics, Autistic Disorder/genetics, Exome/genetics, Forkhead Transcription Factors/genetics, Genetic Predisposition to Disease, Humans, Mutation, Repressor Proteins/genetics, Exome Sequencing
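
The exome-wide significance threshold quoted in the abstract above (2.5 × 10⁻⁶) is numerically what a Bonferroni correction gives for a family-wise error rate of 0.05 spread over roughly 20,000 protein-coding genes. The abstract does not state how the threshold was derived, so both the gene count and the correction method in this short sketch are assumptions.

```python
import math


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# Assumed values: family-wise alpha of 0.05 and about 20,000 genes tested.
threshold = bonferroni_threshold(0.05, 20_000)
matches_abstract = math.isclose(threshold, 2.5e-6)
```

Under this reading, a gene reaches exome-wide significance only if its P value falls below the per-gene threshold, which is why moderate-effect genes require very large cohorts to detect.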
14.
J Infect Dis ; 202(7): 1126-35, 2010 Oct 01.
Article in English | MEDLINE | ID: mdl-20715927

ABSTRACT

BACKGROUND: We recently identified polymorphisms in Kaposi sarcoma-associated herpesvirus (KSHV)-encoded microRNA (miRNA) sequences from clinical subjects. Here, we examine whether any of these may contribute to KS risk in a European AIDS-KS case-control study. METHODS: KSHV load in peripheral blood was determined by real-time quantitative polymerase chain reaction. Samples that had detectable viral loads were used to amplify the 2.8-kb miRNA encoding region plus a 646-bp fragment of the K12/T0.7 gene. Additionally, we characterized an 840-bp fragment of the K1 gene to determine KSHV subtypes. RESULTS: KSHV DNA was detected in peripheral blood mononuclear cells of 49.6% of case patients and 6.8% of controls, and viral loads tended to be higher in case patients. Sequences from the miRNA-encoding regions were conserved overall, but distinct polymorphisms were detected, some of which occurred in primary miRNAs, pre-miRNAs, or mature miRNAs. CONCLUSIONS: Patients with KS were more likely to have detectable viral loads than were controls without disease. Despite high conservation in KSHV miRNA-encoded sequences, polymorphisms were observed, including some that have been reported elsewhere. Some polymorphisms could affect mature miRNA processing and appear to be associated with KS risk.


Subjects
Acquired Immunodeficiency Syndrome/complications, Viral DNA/genetics, Human Herpesvirus 8/genetics, MicroRNAs/genetics, Genetic Polymorphism, Kaposi Sarcoma/epidemiology, Case-Control Studies, Viral DNA/blood, Viral DNA/chemistry, Humans, Mononuclear Leukocytes/virology, Polymerase Chain Reaction/methods, Risk Factors, Sequence Analysis, Viral Load
15.
BMC Genomics ; 10: 51, 2009 Jan 26.
Article in English | MEDLINE | ID: mdl-19171065

ABSTRACT

BACKGROUND: Understanding structure and function of human genome requires knowledge of genomes of our closest living relatives, the primates. Nucleotide insertions and deletions (indels) play a significant role in differentiation that underlies phenotypic differences between humans and chimpanzees. In this study, we evaluated distribution, evolutionary history, and function of indels found by comparing syntenic regions of the human and chimpanzee genomes. RESULTS: Specifically, we identified 6,279 indels of 10 bp or greater in a ~33 Mb alignment between human and chimpanzee chromosome 22. After the exclusion of those in repetitive DNA, 1,429 or 23% of indels still remained. This group was characterized according to the local or genome-wide repetitive nature, size, location relative to genes, and other genomic features. We defined three major classes of these indels, using local structure analysis: (i) those indels found uniquely without additional copies of indel sequence in the surrounding (10 Kb) region, (ii) those with at least one exact copy found nearby, and (iii) those with similar but not identical copies found locally. Among these classes, we encountered a high number of exactly repeated indel sequences, most likely due to recent duplications. Many of these indels (683 of 1,429) were in proximity of known human genes. Coding sequences and splice sites contained significantly fewer of these indels than expected from random expectations, suggesting that selection is a factor in limiting their persistence. A subset of indels from coding regions was experimentally validated and their impacts were predicted based on direct sequencing in several human populations as well as chimpanzees, bonobos, gorillas, and two subspecies of orangutans. CONCLUSION: Our analysis demonstrates that while indels are distributed essentially randomly in intergenic and intronic genomic regions, they are significantly under-represented in coding sequences. 
There are substantial differences in representation of indel classes among genomic elements, most likely caused by differences in their evolutionary histories. Using local sequence context, we predicted origins and phylogenetic relationships of gene-impacting indels in primate species. These results suggest that genome plasticity is a major force behind speciation events separating the great ape lineages.


Subjects
Human Chromosome Pair 22/genetics, Molecular Evolution, INDEL Mutation, Pan troglodytes/genetics, Animals, Human Genome, Humans, Sequence Alignment, DNA Sequence Analysis, Synteny
16.
Mol Cancer Res ; 6(2): 212-21, 2008 Feb.
Article in English | MEDLINE | ID: mdl-18314482

ABSTRACT

The PVT1 locus is identified as a cluster of T(2;8) and T(8;22) "variant" MYC-activating chromosomal translocation breakpoints extending 400 kb downstream of MYC in a subset (approximately 20%) of Burkitt's lymphoma (vBL). Recent reports that microRNAs (miRNA) may be associated with fragile sites and cancer-associated genomic regions prompted us to investigate whether the PVT1 region on chromosome 8q24 may contain miRNAs. Computational analysis of the genomic sequence covering the PVT1 locus and experimental verification identified seven miRNAs. One miRNA, hsa-miR-1204, resides within a previously described PVT1 exon (1b) that is often fused to the immunoglobulin light chain constant region in vBLs and is present in high copy number in MYC/PVT1-amplified tumors. Like its human counterpart, mouse mmu-miR-1204 represents the closest miRNA to Myc (~50 kb) and is found only 1 to 2 kb downstream of a cluster of retroviral integration sites. Another miRNA, mmu-miR-1206, is close to a cluster of variant translocation breakpoints associated with mouse plasmacytoma and exon 1 of mouse Pvt1. Virtually all the miRNA precursor transcripts are expressed at higher levels in late-stage B cells (including plasmacytoma and vBL cell lines) compared with immature B cells, suggesting possible roles in lymphoid development and/or lymphoma. In addition, lentiviral vector-mediated overexpression of the miR-1204 precursor (human and mouse) in a mouse pre-B-cell line increased expression of Myc. High levels of expression of the hsa-miR-1204 precursor are also seen in several epithelial cancer cell lines with MYC/PVT1 coamplification, suggesting a potentially broad role for these miRNAs in tumorigenesis.


Subjects
Human Chromosome Pair 8/genetics, Genomic Instability/genetics, MicroRNAs/genetics, Animals, B-Lymphocytes/metabolism, Base Sequence, Northern Blotting, Cell Line, Computational Biology, Gene Dosage, Human Genome/genetics, Humans, Mice, Molecular Sequence Data, Proto-Oncogene Proteins c-myc/genetics, RNA Precursors/genetics, RNA Precursors/metabolism, Reproducibility of Results, Reverse Transcriptase Polymerase Chain Reaction, Genetic Transduction
17.
NPJ Genom Med ; 4: 19, 2019.
Article in English | MEDLINE | ID: mdl-31452935

ABSTRACT

Autism spectrum disorder (ASD) is a genetically heterogeneous condition, caused by a combination of rare de novo and inherited variants as well as common variants in at least several hundred genes. However, significantly larger sample sizes are needed to identify the complete set of genetic risk factors. We conducted a pilot study for SPARK (SPARKForAutism.org) of 457 families with ASD, all consented online. Whole exome sequencing (WES) and genotyping data were generated for each family using DNA from saliva. We identified variants in genes and loci that are clinically recognized causes or significant contributors to ASD in 10.4% of families without previous genetic findings. In addition, we identified variants that are possibly associated with ASD in an additional 3.4% of families. A meta-analysis using the TADA framework at a false discovery rate (FDR) of 0.1 provides statistical support for 26 ASD risk genes. While most of these genes are already known ASD risk genes, BRSK2 has the strongest statistical support and reaches genome-wide significance as a risk gene for ASD (p-value = 2.3e-06). Future studies leveraging the thousands of individuals with ASD who have enrolled in SPARK are likely to further clarify the genetic risk factors associated with ASD as well as accelerate ASD research that incorporates genetic etiology.

18.
Nat Neurosci ; 19(11): 1454-1462, 2016 11.
Article in English | MEDLINE | ID: mdl-27479844

ABSTRACT

Autism spectrum disorder (ASD) is a complex neurodevelopmental disorder with a strong genetic basis. Yet, only a small fraction of potentially causal genes (about 65 genes out of an estimated several hundred) are known with strong genetic evidence from sequencing studies. We developed a complementary machine-learning approach based on a human brain-specific gene network to present a genome-wide prediction of autism risk genes, including hundreds of candidates for which there is minimal or no prior genetic evidence. Our approach was validated in a large independent case-control sequencing study. Leveraging these genome-wide predictions and the brain-specific network, we demonstrated that the large set of ASD genes converges on a smaller number of key pathways and developmental stages of the brain. Finally, we identified likely pathogenic genes within frequent autism-associated copy-number variants and proposed genes and pathways that are likely mediators of ASD across multiple copy-number variants. All predictions and functional insights are available at http://asd.princeton.edu.


Subjects
Autism Spectrum Disorder/genetics , DNA Copy Number Variations/genetics , Polymorphism, Single Nucleotide/genetics , Gene Regulatory Networks , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans
19.
EMBO Mol Med ; 8(3): 268-87, 2016 Mar 01.
Article in English | MEDLINE | ID: mdl-26881967

ABSTRACT

MicroRNA-10b (miR-10b) is a unique oncogenic miRNA that is highly expressed in all glioblastoma (GBM) subtypes, while absent in normal neuroglial cells of the brain. miR-10b inhibition strongly impairs proliferation and survival of cultured glioma cells, including glioma-initiating stem-like cells (GSC). Although several miR-10b targets have been identified previously, the common mechanism conferring the miR-10b-sustained viability of GSC is unknown. Here, we demonstrate that in heterogeneous GSC, miR-10b regulates cell cycle and alternative splicing, often through non-canonical targeting via the 5'UTRs of its target genes, including MBNL1-3, SART3, and RSRC1. We have further assessed the inhibition of miR-10b in intracranial human GSC-derived xenograft and murine GL261 allograft models in athymic and immunocompetent mice. Three delivery routes for the miR-10b antisense oligonucleotide inhibitors (ASO) have been explored: direct intratumoral injections, continuous osmotic delivery, and systemic intravenous injections. In all cases, treatment with miR-10b ASO led to derepression of its targets and attenuated the growth and progression of established intracranial GBM. No significant systemic toxicity was observed upon ASO administration by local or systemic routes. Our results indicate that miR-10b is a promising candidate for the development of targeted therapies against all GBM subtypes.


Subjects
Antineoplastic Agents/administration & dosage , Glioblastoma/drug therapy , MicroRNAs/antagonists & inhibitors , Oligonucleotides, Antisense/administration & dosage , Allografts , Animals , Disease Models, Animal , Heterografts , Humans , Mice , Treatment Outcome
20.
Nat Commun ; 4: 1806, 2013.
Article in English | MEDLINE | ID: mdl-23651994

ABSTRACT

Although human papillomavirus was identified as an aetiological factor in cervical cancer, the key human gene drivers of this disease remain unknown. Here we apply an unbiased approach integrating gene expression and chromosomal aberration data. In an independent group of patients, we reconstruct and validate a gene regulatory meta-network, and identify cell cycle and antiviral genes that constitute two major subnetworks upregulated in tumour samples. These genes are located within the same regions as chromosomal amplifications, most frequently on 3q. We propose a model in which selected chromosomal gains drive activation of antiviral genes contributing to episomal virus elimination, which synergizes with cell cycle dysregulation. These findings may help to explain the paradox of episomal human papillomavirus decline in women with invasive cancer who were previously unable to clear the virus.


Subjects
Antiviral Agents/metabolism , Cell Cycle/genetics , Gene Regulatory Networks/genetics , Genes, Neoplasm/genetics , Papillomaviridae/genetics , Uterine Cervical Neoplasms/genetics , Uterine Cervical Neoplasms/virology , Chromosome Aberrations , Chromosomes, Human/genetics , Databases, Genetic , Female , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Genome, Human/genetics , Genomic Instability , Humans , Lysosomal Membrane Proteins/metabolism , Meta-Analysis as Topic , Neoplasm Proteins/metabolism , Papillomavirus Infections/genetics , Papillomavirus Infections/virology , Reproducibility of Results , Uterine Cervical Neoplasms/pathology , Virus Integration/genetics