Results 1 - 10 of 10
1.
Nat Commun ; 15(1): 5053, 2024 Jun 13.
Article in English | MEDLINE | ID: mdl-38871684

ABSTRACT

Childhood radioactive iodine exposure from the Chornobyl accident increased papillary thyroid carcinoma (PTC) risk. While cervical lymph node metastases (cLNM) are well-recognized in pediatric PTC, the PTC metastatic process and potential radiation association are poorly understood. Here, we analyze cLNM occurrence among 428 PTC with genomic landscape analyses and known drivers (131I-exposed = 349, unexposed = 79; mean age = 27.9 years). We show that cLNM are more frequent in PTC with fusion (55%) versus mutation (30%) drivers, although the proportion varies by specific driver gene (RET-fusion = 71%, BRAF-mutation = 38%, RAS-mutation = 5%). cLNM frequency is not associated with other characteristics, including radiation dose. cLNM molecular profiling (N = 47) demonstrates 100% driver concordance with matched primary PTCs and highly concordant mutational spectra. Transcriptome analysis reveals 17 differentially expressed genes, particularly in the HOXC cluster and BRINP3; the strongest differentially expressed microRNA also is near HOXC10. Our findings underscore the critical role of driver alterations and provide promising candidates for elucidating the biological underpinnings of PTC cLNM.


Subject(s)
Chernobyl Nuclear Accident , Iodine Radioisotopes , Lymphatic Metastasis , Mutation , Thyroid Cancer, Papillary , Thyroid Neoplasms , Humans , Thyroid Cancer, Papillary/genetics , Thyroid Cancer, Papillary/pathology , Lymphatic Metastasis/genetics , Male , Adult , Female , Thyroid Neoplasms/genetics , Thyroid Neoplasms/pathology , Adolescent , Proto-Oncogene Proteins B-raf/genetics , Young Adult , Lymph Nodes/pathology , Proto-Oncogene Proteins c-ret/genetics , Child , Genomics , Middle Aged , Homeodomain Proteins/genetics , Homeodomain Proteins/metabolism , Gene Expression Profiling , MicroRNAs/genetics , MicroRNAs/metabolism , Neoplasms, Radiation-Induced/genetics , Neoplasms, Radiation-Induced/pathology , Neck/pathology , Gene Expression Regulation, Neoplastic
2.
PLoS Comput Biol ; 18(5): e1009123, 2022 05.
Article in English | MEDLINE | ID: mdl-35639788

ABSTRACT

Since its introduction in 2011, the variant call format (VCF) has been widely adopted for processing DNA and RNA variants in practically all population studies, as well as in somatic and germline mutation studies. VCF can represent single nucleotide variants, multi-nucleotide variants, insertions and deletions, and simple structural variants called and anchored against a reference genome. Here we present a spectrum of over 125 useful, complementary, free and open source software tools and libraries that we wrote and made available through the vcflib, bio-vcf, cyvcf2, hts-nim and slivar projects. These tools are applied for comparison, filtering, normalisation, smoothing and annotation of VCF, as well as output of statistics, visualisation, and transformation of variant files. These tools run every day in critical biomedical pipelines and countless shell scripts. Our tools are part of the wider bioinformatics ecosystem and we highlight best practices. We briefly discuss the design of VCF, lessons learnt, and how we can address more complex variation through pangenome graph formats, variation that cannot easily be represented by VCF.


Subject(s)
Ecosystem , Genetic Variation , Computational Biology , Genetic Variation/genetics , Nucleotides , Software
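
As a rough illustration of the record layout the abstract above refers to, the following minimal sketch splits one tab-delimited VCF data line into its eight fixed columns and labels each REF/ALT pair by variant class. It uses plain Python rather than any of the libraries named in the abstract; the function names `parse_vcf_record` and `classify_allele` are hypothetical, and the classification rule is a simplification that ignores normalisation and breakend notation.

```python
def parse_vcf_record(line):
    """Split one VCF data line into its eight fixed columns (hypothetical helper)."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, vid, ref, alt, qual, flt, info = fields[:8]
    info_dict = {}
    if info != ".":
        for item in info.split(";"):
            key, _, value = item.partition("=")
            # Flag-style INFO keys carry no value; record them as True.
            info_dict[key] = value if value else True
    return {
        "chrom": chrom,
        "pos": int(pos),
        "id": vid,
        "ref": ref,
        "alts": alt.split(","),
        "qual": None if qual == "." else float(qual),
        "filter": flt,
        "info": info_dict,
    }


def classify_allele(ref, alt):
    """Very coarse variant class from allele lengths (simplifying assumption)."""
    if alt.startswith("<"):
        return "symbolic"  # e.g. <DEL>, used for structural variants
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"
    if len(ref) == len(alt):
        return "MNV"
    return "insertion" if len(alt) > len(ref) else "deletion"


record = parse_vcf_record("chr1\t100\t.\tA\tG,AT\t50\tPASS\tDP=10;DB")
for alt in record["alts"]:
    print(record["chrom"], record["pos"], record["ref"], alt,
          classify_allele(record["ref"], alt))
```

Production pipelines would normally rely on a dedicated parser, since real files also carry header metadata, FORMAT and per-sample columns that this sketch skips.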
3.
Science ; 372(6543), 2021 05 14.
Article in English | MEDLINE | ID: mdl-33888599

ABSTRACT

The 1986 Chernobyl nuclear power plant accident increased papillary thyroid carcinoma (PTC) incidence in surrounding regions, particularly for radioactive iodine (131I)-exposed children. We analyzed genomic, transcriptomic, and epigenomic characteristics of 440 PTCs from Ukraine (from 359 individuals with estimated childhood 131I exposure and 81 unexposed children born after 1986). PTCs displayed radiation dose-dependent enrichment of fusion drivers, nearly all in the mitogen-activated protein kinase pathway, and increases in small deletions and simple/balanced structural variants that were clonal and bore hallmarks of nonhomologous end-joining repair. Radiation-related genomic alterations were more pronounced for individuals who were younger at exposure. Transcriptomic and epigenomic features were strongly associated with driver events but not radiation dose. Our results point to DNA double-strand breaks as early carcinogenic events that subsequently enable PTC growth after environmental radiation exposure.


Subject(s)
Chernobyl Nuclear Accident , Mutation , Neoplasms, Radiation-Induced/genetics , Thyroid Cancer, Papillary/etiology , Thyroid Cancer, Papillary/genetics , Thyroid Neoplasms/etiology , Thyroid Neoplasms/genetics , Adolescent , Adult , Child , Child, Preschool , DNA Copy Number Variations , Epigenome , Female , Gene Expression Profiling , Genes, ras , Genetic Variation , Humans , Infant , Iodine Radioisotopes , Loss of Heterozygosity , Male , Middle Aged , Proto-Oncogene Proteins B-raf/genetics , RNA-Seq , Radiation Dosage , Thyroid Gland/physiology , Thyroid Gland/radiation effects , Translocation, Genetic , Ukraine , Whole Genome Sequencing , Young Adult
4.
Genome Biol ; 21(1): 35, 2020 02 12.
Article in English | MEDLINE | ID: mdl-32051000

ABSTRACT

Structural variants (SVs) remain challenging to represent and study relative to point mutations despite their demonstrated importance. We show that variation graphs, as implemented in the vg toolkit, provide an effective means for leveraging SV catalogs for short-read SV genotyping experiments. We benchmark vg against state-of-the-art SV genotypers using three sequence-resolved SV catalogs generated by recent long-read sequencing studies. In addition, we use assemblies from 12 yeast strains to show that graphs constructed directly from aligned de novo assemblies improve genotyping compared to graphs built from intermediate SV catalogs in the VCF format.


Subject(s)
Genomic Structural Variation , Genotyping Techniques/methods , Software , Genome, Fungal , Saccharomyces cerevisiae , Whole Genome Sequencing/methods
5.
Article in English | MEDLINE | ID: mdl-31535074

ABSTRACT

SUMMARY: GFA has emerged as a standard format for the exchange of genome assemblies and sequence graphs. To encourage further adoption in high-performance software we have developed an open-source C++ library for GFA and a set of utilities for summarizing and manipulating the format. AVAILABILITY: The gfakluge source code is freely available under the MIT license at https://github.com/edawson/gfakluge. It has been tested on both Mac OS X and Linux.
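
To make the format mentioned in the summary above more concrete, here is a minimal sketch that reads the two most common GFA 1 record types: `S` (segment) lines and `L` (link) lines. It is written in plain Python and does not use or mirror the gfakluge C++ API; the function name `parse_gfa` and the toy graph are assumptions for illustration only.

```python
def parse_gfa(text):
    """Collect segments and links from GFA 1 text (hypothetical helper).

    S lines: S <name> <sequence>
    L lines: L <from> <from_orient> <to> <to_orient> <overlap CIGAR>
    Other record types and optional tags are ignored in this sketch.
    """
    segments = {}
    links = []
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if fields[0] == "S":
            segments[fields[1]] = fields[2]
        elif fields[0] == "L":
            links.append(tuple(fields[1:6]))
    return segments, links


toy_gfa = "\n".join([
    "H\tVN:Z:1.0",
    "S\t1\tACGT",
    "S\t2\tGTTA",
    "L\t1\t+\t2\t+\t2M",
])
segments, links = parse_gfa(toy_gfa)
total_length = sum(len(seq) for seq in segments.values())
print(len(segments), "segments,", len(links), "links,", total_length, "bp")
```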

6.
BMC Bioinformatics ; 20(1): 389, 2019 Jul 12.
Article in English | MEDLINE | ID: mdl-31299914

ABSTRACT

BACKGROUND: Human papillomavirus (HPV) is a common sexually transmitted infection associated with cervical cancer that frequently occurs as a coinfection of types and subtypes. Highly similar sublineages that show over 100-fold differences in cancer risk are not distinguishable in coinfections with current typing methods. RESULTS: We describe an efficient set of computational tools, rkmh, for analyzing complex mixed infections of related viruses based on sequence data. rkmh makes extensive use of MinHash similarity measures, and includes utilities for removing host DNA and classifying reads by type, lineage, and sublineage. We show that rkmh is capable of assigning reads to their HPV type as well as HPV16 lineage and sublineages. CONCLUSIONS: Accurate read classification enables estimates of percent composition when there are multiple infecting lineages or sublineages. While we demonstrate rkmh for HPV with multiple sequencing technologies, it is also applicable to other mixtures of related sequences.


Subject(s)
Coinfection/diagnosis , Coinfection/virology , Computational Biology/methods , Human papillomavirus 16/physiology , Software , DNA, Viral/genetics , Human papillomavirus 16/classification , Humans , Papillomavirus Infections/virology , Phylogeny , Sequence Analysis, DNA , Time Factors
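
The MinHash similarity measure mentioned in the abstract above can be sketched as follows. This is a generic bottom-k MinHash estimate of Jaccard similarity between k-mer sets, not rkmh's actual implementation; the k-mer size, sketch size, hash function, helper names, and toy sequences are all assumptions, and reverse-complement handling is omitted.

```python
import hashlib


def kmers(seq, k):
    """Set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hash_kmer(kmer):
    """Deterministic 64-bit hash of a k-mer (SHA-1 is an arbitrary choice here)."""
    return int.from_bytes(hashlib.sha1(kmer.encode()).digest()[:8], "big")


def sketch(seq, k=4, size=64):
    """Bottom-k sketch: the `size` smallest k-mer hashes."""
    return sorted(hash_kmer(km) for km in kmers(seq, k))[:size]


def jaccard_estimate(a, b, size=64):
    """Estimate Jaccard similarity from two bottom-k sketches."""
    union = sorted(set(a) | set(b))[:size]
    if not union:
        return 0.0
    shared = set(a) & set(b)
    return sum(1 for h in union if h in shared) / len(union)


def classify_read(read, references, k=4, size=64):
    """Assign a read to the reference whose sketch is most similar."""
    read_sketch = sketch(read, k, size)
    return max(
        references,
        key=lambda name: jaccard_estimate(
            read_sketch, sketch(references[name], k, size), size
        ),
    )


# Toy references standing in for two related viral types (made-up sequences).
references = {"typeA": "ACGTACGTGGCCTTAA", "typeB": "TTTTGGGGCCCCAAAA"}
print(classify_read("ACGTACGTGG", references))
```

In a mixed sample, tallying the per-read assignments would give the kind of percent-composition estimate the abstract describes.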
7.
Nat Biotechnol ; 36(9): 875-879, 2018 10.
Article in English | MEDLINE | ID: mdl-30125266

ABSTRACT

Reference genomes guide our interpretation of DNA sequence data. However, conventional linear references represent only one version of each locus, ignoring variation in the population. Poor representation of an individual's genome sequence impacts read mapping and introduces bias. Variation graphs are bidirected DNA sequence graphs that compactly represent genetic variation across a population, including large-scale structural variation such as inversions and duplications. Previous graph genome software implementations have been limited by scalability or topological constraints. Here we present vg, a toolkit of computational methods for creating, manipulating, and using these structures as references at the scale of the human genome. vg provides an efficient approach to mapping reads onto arbitrary variation graphs using generalized compressed suffix arrays, with improved accuracy over alignment to a linear reference, and effectively removing reference bias. These capabilities make using variation graphs as references for DNA sequencing practical at a gigabase scale, or at the topological complexity of de novo assemblies.


Subject(s)
Genetic Variation , Computer Simulation , DNA/genetics , Humans
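
A toy model may help picture the bidirected sequence graphs described in the abstract above. In this sketch each node holds a DNA string, and a path is a list of (node id, is_reverse) steps, so the same node can be read on either strand. The data layout and the names `NODES`, `reverse_complement`, and `spell_path` are assumptions for illustration; they are not vg's data structures or API.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Reverse-complement a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


# A SNP "bubble": node 1 branches to node 2 (T) or node 3 (A), then rejoins at 4.
NODES = {1: "ACG", 2: "T", 3: "A", 4: "GGC"}


def spell_path(nodes, path):
    """Concatenate node sequences along a path of (node_id, is_reverse) steps."""
    parts = []
    for node_id, is_reverse in path:
        seq = nodes[node_id]
        parts.append(reverse_complement(seq) if is_reverse else seq)
    return "".join(parts)


reference = [(1, False), (2, False), (4, False)]
alternate = [(1, False), (3, False), (4, False)]
inverted = [(1, False), (2, False), (4, True)]  # node 4 traversed on the other strand

print(spell_path(NODES, reference))
print(spell_path(NODES, alternate))
print(spell_path(NODES, inverted))
```

Representing an inversion as a reverse traversal of an existing node, rather than as a new sequence, is what lets such graphs encode structural variation compactly.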
8.
J Mol Evol ; 79(3-4): 130-42, 2014 Oct.
Article in English | MEDLINE | ID: mdl-25217382

ABSTRACT

Several recent works have shown that protein structure can predict site-specific evolutionary sequence variation. In particular, sites that are buried and/or have many contacts with other sites in a structure have been shown to evolve more slowly, on average, than surface sites with few contacts. Here, we present a comprehensive study of the extent to which numerous structural properties can predict sequence variation. The quantities we considered include buriedness (as measured by relative solvent accessibility), packing density (as measured by contact number), structural flexibility (as measured by B factors, root-mean-square fluctuations, and variation in dihedral angles), and variability in designed structures. We obtained structural flexibility measures both from molecular dynamics simulations performed on nine non-homologous viral protein structures and from variation in homologous variants of those proteins, where they were available. We obtained measures of variability in designed structures from flexible-backbone design in the Rosetta software. We found that most of the structural properties correlate with site variation in the majority of structures, though the correlations are generally weak (correlation coefficients of 0.1-0.4). Moreover, we found that buriedness and packing density were better predictors of evolutionary variation than structural flexibility. Finally, variability in designed structures was a weaker predictor of evolutionary variability than buriedness or packing density, but it was comparable in its predictive power to the best structural flexibility measures. We conclude that simple measures of buriedness and packing density are better predictors of evolutionary variation than the more complicated predictors obtained from dynamic simulations, ensembles of homologous structures, or computational protein design.


Subject(s)
Evolution, Molecular , Viral Proteins/chemistry , Amino Acid Sequence , Entropy , Molecular Dynamics Simulation , Protein Conformation
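
Packing density, measured by contact number in the abstract above, can be illustrated with a short sketch that counts, for each residue, how many other residues lie within a distance cutoff. The use of one representative coordinate per residue, the 13 Å default cutoff, and the function name `contact_numbers` are assumptions made here; the abstract does not state the exact definition the authors used.

```python
import math


def contact_numbers(coords, cutoff=13.0):
    """Count, per residue, the other residues within `cutoff` angstroms.

    `coords` is a list of (x, y, z) tuples, one representative atom per
    residue (for example the C-alpha atom). The cutoff is an assumed value.
    """
    counts = []
    for i, a in enumerate(coords):
        n = sum(
            1
            for j, b in enumerate(coords)
            if i != j and math.dist(a, b) <= cutoff
        )
        counts.append(n)
    return counts


# Three toy residues on a line: the first two are close, the third is far away.
toy_coords = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
print(contact_numbers(toy_coords))
```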
9.
Mol Biol Evol ; 31(9): 2496-500, 2014 Sep.
Article in English | MEDLINE | ID: mdl-24899665

ABSTRACT

Errors in multiple sequence alignments (MSAs) can reduce accuracy in positive-selection inference. Therefore, it has been suggested to filter MSAs before conducting further analyses. One widely used filter, Guidance, allows users to remove MSA positions aligned with low confidence. However, Guidance's utility in positive-selection inference has been disputed in the literature. We have conducted an extensive simulation-based study to characterize fully how Guidance impacts positive-selection inference, specifically for protein-coding sequences of realistic divergence levels. We also investigated whether novel scoring algorithms, which phylogenetically corrected confidence scores, and a new gap-penalization score-normalization scheme improved Guidance's performance. We found that no filter, including original Guidance, consistently benefitted positive-selection inferences. Moreover, all improvements detected were exceedingly minimal, and in certain circumstances, Guidance-based filters worsened inferences.


Subject(s)
Computational Biology/methods , Sequence Alignment/methods , Computer Simulation , Proteins/genetics , Selection, Genetic , Software
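
The kind of filtering evaluated in the abstract above can be pictured as dropping alignment columns whose confidence score falls below a cutoff. The sketch below is a generic column filter, not the Guidance algorithm or its scoring scheme; the scores, the 0.5 cutoff, and the name `filter_alignment` are assumptions, and it ignores the codon structure that a protein-coding analysis would need to preserve.

```python
def filter_alignment(alignment, column_scores, cutoff):
    """Keep only columns whose confidence score is at least `cutoff`.

    `alignment` maps sequence names to equal-length aligned strings;
    `column_scores` holds one (assumed) confidence value per column.
    """
    keep = [i for i, score in enumerate(column_scores) if score >= cutoff]
    return {
        name: "".join(seq[i] for i in keep)
        for name, seq in alignment.items()
    }


toy_alignment = {"seq_a": "ATG-CA", "seq_b": "ATGGCA"}
toy_scores = [1.0, 1.0, 1.0, 0.2, 0.9, 1.0]
print(filter_alignment(toy_alignment, toy_scores, cutoff=0.5))
```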
10.
Philos Trans R Soc Lond B Biol Sci ; 368(1614): 20120334, 2013 Mar 19.
Article in English | MEDLINE | ID: mdl-23382434

ABSTRACT

We investigate the causes of site-specific evolutionary-rate variation in influenza haemagglutinin (HA) between human and avian influenza, for subtypes H1, H3, and H5. By calculating the evolutionary-rate ratio, ω = dN/dS as a function of a residue's solvent accessibility in the three-dimensional protein structure, we show that solvent accessibility has a significant but relatively modest effect on site-specific rate variation. By comparing rates within HA subtypes among host species, we derive an upper limit to the amount of variation that can be explained by structural constraints of any kind. Protein structure explains only 20-40% of the variation in ω. Finally, by comparing ω at sites near the sialic-acid-binding region to ω at other sites, we show that ω near the sialic-acid-binding region is significantly elevated in both human and avian influenza, with the exception of avian H5. We conclude that protein structure, HA subtype, and host biology all impose distinct selection pressures on sites in influenza HA.


Subject(s)
Evolution, Molecular , Genetic Variation , Hemagglutinins, Viral/genetics , Influenza A Virus, H1N1 Subtype/genetics , Influenza A Virus, H3N2 Subtype/genetics , Influenza A Virus, H5N1 Subtype/genetics , Influenza in Birds/virology , Influenza, Human/virology , Animals , Birds , Humans , Models, Genetic , Selection, Genetic , Species Specificity
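
The evolutionary-rate ratio ω = dN/dS from the abstract above can be illustrated with a simple counting-based estimate. The sketch assumes the numbers of nonsynonymous and synonymous differences and sites are already known, and applies a Jukes-Cantor correction to each proportion in the style of the Nei-Gojobori method. The abstract does not say which estimator the authors used, so this is only one possible way to compute ω, and the function names and example counts are hypothetical.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for a proportion of differing sites."""
    if not 0 <= p < 0.75:
        raise ValueError("proportion must be in [0, 0.75) for the correction")
    return -0.75 * math.log(1 - 4 * p / 3)


def omega(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Estimate omega = dN/dS from counts of differences and sites."""
    d_n = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    d_s = jukes_cantor(syn_diffs / syn_sites)
    if d_s == 0:
        raise ValueError("dS is zero; omega is undefined")
    return d_n / d_s


# Made-up counts: 3 nonsynonymous differences over 300 sites,
# 5 synonymous differences over 100 sites.
value = omega(3, 300, 5, 100)
print(round(value, 3))
```

Values of ω below 1 are conventionally read as purifying selection and values above 1 as positive selection, which is the sense in which the abstract describes ω as elevated near the sialic-acid-binding region.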