Search | VHL Regional Portal

1.

Functional Architecture of Deleterious Genetic Variants in the Genome of a Wrangel Island Mammoth.

Fry, Erin; Kim, Sun K; Chigurapti, Sravanthi; Mika, Katelyn M; Ratan, Aakrosh; Dammermann, Alexander; Mitchell, Brian J; Miller, Webb; Lynch, Vincent J.

Genome Biol Evol ; 12(3): 48-58, 2020 Mar 01.

Article in English | MEDLINE | ID: mdl-32031213

ABSTRACT

Woolly mammoths were among the most abundant cold-adapted species during the Pleistocene. Their once-large populations went extinct in two waves, an end-Pleistocene extinction of continental populations followed by the mid-Holocene extinction of relict populations on St. Paul Island ~5,600 years ago and Wrangel Island ~4,000 years ago. Wrangel Island mammoths experienced an episode of rapid demographic decline coincident with their isolation, leading to a small population, reduced genetic diversity, and the fixation of putatively deleterious alleles, but the functional consequences of these processes are unclear. Here, we show that a Wrangel Island mammoth genome had many putative deleterious mutations that are predicted to cause diverse behavioral and developmental defects. Resurrection and functional characterization of several genes from the Wrangel Island mammoth carrying putatively deleterious substitutions identified both loss and gain of function mutations in genes associated with developmental defects (HYLS1), oligozoospermia and reduced male fertility (NKD1), diabetes (NEUROG3), and the ability to detect floral scents (OR5A1). These data suggest that at least one Wrangel Island mammoth may have suffered adverse consequences from reduced population size and isolation.

Subject(s)

Evolution, Molecular , Mammoths/genetics , Mutation , Animals , Genome

2.

Genomic Variants Among Threatened Acropora Corals.

Kitchen, Sheila A; Ratan, Aakrosh; Bedoya-Reina, Oscar C; Burhans, Richard; Fogarty, Nicole D; Miller, Webb; Baums, Iliana B.

G3 (Bethesda) ; 9(5): 1633-1646, 2019 May 07.

Article in English | MEDLINE | ID: mdl-30914426

ABSTRACT

Genomic sequence data for non-model organisms are increasingly available, requiring the development of efficient and reproducible workflows. Here, we develop the first genomic resources and reproducible workflows for two threatened members of the reef-building coral genus Acropora. We generated genomic sequence data from multiple samples of the Caribbean A. cervicornis (staghorn coral) and A. palmata (elkhorn coral), and predicted millions of nucleotide variants among these two species and the Pacific A. digitifera. A subset of predicted nucleotide variants were verified using restriction length polymorphism assays and proved useful in distinguishing the two Caribbean acroporids and the hybrid they form ("A. prolifera"). Nucleotide variants are freely available from the Galaxy server (usegalaxy.org), and can be analyzed there with computational tools and stored workflows that require only an internet browser. We describe these data and some of the analysis tools, concentrating on fixed differences between A. cervicornis and A. palmata. In particular, we found that fixed amino acid differences between these two species were enriched in proteins associated with development, cellular stress response, and the host's interactions with associated microbes, for instance in the ABC transporters and superoxide dismutase. Identified candidate genes may underlie functional differences in how these threatened species respond to changing environments. Users can expand the presented analyses easily by adding genomic data from additional species, as they become available.

Subject(s)

Anthozoa/genetics , Endangered Species , Genetic Variation , Genome , Genomics , Animals , Anthozoa/classification , Evolution, Molecular , Genetics, Population , Genomics/methods , Geography , INDEL Mutation , Phylogeny , Polymorphism, Restriction Fragment Length , Polymorphism, Single Nucleotide

3.

Giraffe genome sequence reveals clues to its unique morphology and physiology.

Agaba, Morris; Ishengoma, Edson; Miller, Webb C; McGrath, Barbara C; Hudson, Chelsea N; Bedoya Reina, Oscar C; Ratan, Aakrosh; Burhans, Rico; Chikhi, Rayan; Medvedev, Paul; Praul, Craig A; Wu-Cavener, Lan; Wood, Brendan; Robertson, Heather; Penfold, Linda; Cavener, Douglas R.

Nat Commun ; 7: 11519, 2016 May 17.

Article in English | MEDLINE | ID: mdl-27187213

ABSTRACT

The origins of giraffe's imposing stature and associated cardiovascular adaptations are unknown. Okapi, which lacks these unique features, is giraffe's closest relative and provides a useful comparison, to identify genetic variation underlying giraffe's long neck and cardiovascular system. The genomes of giraffe and okapi were sequenced, and through comparative analyses genes and pathways were identified that exhibit unique genetic changes and likely contribute to giraffe's unique features. Some of these genes are in the HOX, NOTCH and FGF signalling pathways, which regulate both skeletal and cardiovascular development, suggesting that giraffe's stature and cardiovascular adaptations evolved in parallel through changes in a small number of genes. Mitochondrial metabolism and volatile fatty acids transport genes are also evolutionarily diverged in giraffe and may be related to its unusual diet that includes toxic plants. Unexpectedly, substantial evolutionary changes have occurred in giraffe and okapi in double-strand break repair and centrosome functions.

Subject(s)

Genome , Giraffes/genetics , Giraffes/physiology , Adaptation, Physiological , Amino Acid Sequence , Amino Acid Substitution/genetics , Animals , Base Sequence , Biological Evolution , Bone Development/genetics , Cluster Analysis , Gene Ontology , Gene Regulatory Networks , Genetic Variation , Giraffes/anatomy & histology , Sequence Analysis, DNA

4.

Sex determination by SRY PCR and sequencing of Tasmanian devil facial tumour cell lines reveals non-allograft transmission.

Cui, Xianlan; Wang, Yunfeng; Hua, Bobby; Miller, Webb; Zhao, Yan; Cui, Hongyu; Kong, Xiangang.

Biochem Biophys Res Commun ; 474(1): 29-34, 2016 May 20.

Article in English | MEDLINE | ID: mdl-27084454

ABSTRACT

Devil facial tumour disease (DFTD) is an infectious tumour disease and was hypothesised to be transmitted by allograft during biting based on two cytogenetic findings of DFTD tumours in 2006. It was then believed that DFTD tumours were originally from a female devil. In this study the devil sex-determining region Y (SRY) gene was PCR amplified and sequenced, and six pairs of devil SRY PCR primers were used for detection of devil SRY gene fragments in purified DFTD tumour cell lines. Using three pairs of devil SRY PCR primers, devil SRY gene sequence was detected by PCR and sequencing in genomic DNA of DFTD tumour cell lines from six male devils, but not from six female devils. Four out of six DFTD tumour cell lines from male devils contained nucleotides 288-482 of the devil SRY gene, and another two DFTD tumour cell lines contained nucleotides 381-577 and 493-708 of the gene, respectively. These results indicate that the different portions of the SRY gene in the DFTD tumours of the male devils were originally from the male hosts, rejecting the currently believed DFTD allograft transmission theory. The reasons why DFTD transmission was incorrectly defined as allograft are discussed.

Subject(s)

Facial Neoplasms/genetics , Marsupialia/genetics , Polymerase Chain Reaction/methods , Sequence Analysis, DNA/methods , Sex Determination Analysis/methods , Sex-Determining Region Y Protein/genetics , Allografts/transplantation , Animals , Cell Line, Tumor , Female , Male , Sex Characteristics

5.

Elephantid Genomes Reveal the Molecular Bases of Woolly Mammoth Adaptations to the Arctic.

Lynch, Vincent J; Bedoya-Reina, Oscar C; Ratan, Aakrosh; Sulak, Michael; Drautz-Moses, Daniela I; Perry, George H; Miller, Webb; Schuster, Stephan C.

Cell Rep ; 12(2): 217-28, 2015 Jul 14.

Article in English | MEDLINE | ID: mdl-26146078

ABSTRACT

Woolly mammoths and living elephants are characterized by major phenotypic differences that have allowed them to live in very different environments. To identify the genetic changes that underlie the suite of woolly mammoth adaptations to extreme cold, we sequenced the nuclear genome from three Asian elephants and two woolly mammoths, and we identified and functionally annotated genetic changes unique to woolly mammoths. We found that genes with mammoth-specific amino acid changes are enriched in functions related to circadian biology, skin and hair development and physiology, lipid metabolism, adipose development and physiology, and temperature sensation. Finally, we resurrected and functionally tested the mammoth and ancestral elephant TRPV3 gene, which encodes a temperature-sensitive transient receptor potential (thermoTRP) channel involved in thermal sensation and hair growth, and we show that a single mammoth-specific amino acid substitution in an otherwise highly conserved region of the TRPV3 channel strongly affects its temperature sensitivity.

Subject(s)

Adaptation, Physiological , Genome , Mammoths/genetics , Amino Acid Sequence , Amino Acid Substitution , Animals , Arctic Regions , Elephants/classification , Elephants/genetics , Elephants/metabolism , Evolution, Molecular , HEK293 Cells , High-Throughput Nucleotide Sequencing , Humans , Mammoths/classification , Mammoths/metabolism , Molecular Sequence Annotation , Molecular Sequence Data , Phylogeny , Protein Structure, Tertiary , Sequence Analysis, DNA , TRPV Cation Channels/chemistry , TRPV Cation Channels/genetics , TRPV Cation Channels/metabolism

6.

Genome-wide analysis of signatures of selection in populations of African honey bees (Apis mellifera) using new web-based tools.

Fuller, Zachary L; Niño, Elina L; Patch, Harland M; Bedoya-Reina, Oscar C; Baumgarten, Tracey; Muli, Elliud; Mumoki, Fiona; Ratan, Aakrosh; McGraw, John; Frazier, Maryann; Masiga, Daniel; Schuster, Stephen; Grozinger, Christina M; Miller, Webb.

BMC Genomics ; 16: 518, 2015 Jul 10.

Article in English | MEDLINE | ID: mdl-26159619

ABSTRACT

BACKGROUND: With the development of inexpensive, high-throughput sequencing technologies, it has become feasible to examine questions related to population genetics and molecular evolution of non-model species in their ecological contexts on a genome-wide scale. Here, we employed a newly developed suite of integrated, web-based programs to examine population dynamics and signatures of selection across the genome using several well-established tests, including FST, pN/pS, and McDonald-Kreitman. We applied these techniques to study populations of honey bees (Apis mellifera) in East Africa. In Kenya, there are several described A. mellifera subspecies, which are thought to be localized to distinct ecological regions. RESULTS: We performed whole genome sequencing of 11 worker honey bees from apiaries distributed throughout Kenya and identified 3.6 million putative single-nucleotide polymorphisms. The dense coverage allowed us to apply several computational procedures to study population structure and the evolutionary relationships among the populations, and to detect signs of adaptive evolution across the genome. While there is considerable gene flow among the sampled populations, there are clear distinctions between populations from the northern desert region and those from the temperate, savannah region. We identified several genes showing population genetic patterns consistent with positive selection within African bee populations, and between these populations and European A. mellifera or Asian Apis florea. CONCLUSIONS: These results lay the groundwork for future studies of adaptive ecological evolution in honey bees, and demonstrate the use of new, freely available web-based tools and workflows (http://usegalaxy.org/r/kenyanbee) that can be applied to any model system with genomic information.

Subject(s)

Bees/genetics , Genome, Insect/genetics , Selection, Genetic/genetics , Transcriptome/genetics , Animals , Evolution, Molecular , Genetics, Population/methods , Genomics/methods , Kenya , Models, Genetic , Polymorphism, Single Nucleotide/genetics , Population Dynamics

7.

Identification of indels in next-generation sequencing data.

Ratan, Aakrosh; Olson, Thomas L; Loughran, Thomas P; Miller, Webb.

BMC Bioinformatics ; 16: 42, 2015 Feb 13.

Article in English | MEDLINE | ID: mdl-25879703

ABSTRACT

BACKGROUND: The discovery and mapping of genomic variants is an essential step in most analyses done using sequencing reads. There are a number of mature software packages and associated pipelines that can identify single nucleotide polymorphisms (SNPs) with a high degree of concordance. However, the same cannot be said for tools that are used to identify the other types of variants. Indels represent the second most frequent class of variants in the human genome, after single nucleotide polymorphisms. The reliable detection of indels is still a challenging problem, especially for variants that are longer than a few bases. RESULTS: We have developed a set of algorithms and heuristics collectively called indelMINER to identify indels from whole genome resequencing datasets using paired-end reads. indelMINER uses a split-read approach to identify the precise breakpoints for indels of size less than a user specified threshold, and supplements that with a paired-end approach to identify larger variants that are frequently missed with the split-read approach. We use simulated and real datasets to show that an implementation of the algorithm performs favorably when compared to several existing tools. CONCLUSIONS: indelMINER can be used effectively to identify indels in whole-genome resequencing projects. The output is provided in the VCF format along with additional information about the variant, including information about its presence or absence in another sample. The source code and documentation for indelMINER can be freely downloaded from www.bx.psu.edu/miller_lab/indelMINER.tar.gz.

Subject(s)

Algorithms , Biomarkers, Tumor/genetics , Genome, Human , High-Throughput Nucleotide Sequencing/methods , INDEL Mutation/genetics , Neoplasms/genetics , Sequence Analysis, DNA/methods , Case-Control Studies , Genomics/methods , Humans , Polymorphism, Single Nucleotide/genetics

8.

Comparative and population mitogenomic analyses of Madagascar's extinct, giant 'subfossil' lemurs.

Kistler, Logan; Ratan, Aakrosh; Godfrey, Laurie R; Crowley, Brooke E; Hughes, Cris E; Lei, Runhua; Cui, Yinqiu; Wood, Mindy L; Muldoon, Kathleen M; Andriamialison, Haingoson; McGraw, John J; Tomsho, Lynn P; Schuster, Stephan C; Miller, Webb; Louis, Edward E; Yoder, Anne D; Malhi, Ripan S; Perry, George H.

J Hum Evol ; 79: 45-54, 2015 Feb.

Article in English | MEDLINE | ID: mdl-25523037

ABSTRACT

Humans first arrived on Madagascar only a few thousand years ago. Subsequent habitat destruction and hunting activities have had significant impacts on the island's biodiversity, including the extinction of megafauna. For example, we know of 17 recently extinct 'subfossil' lemur species, all of which were substantially larger (body mass ~11-160 kg) than any living population of the ~100 extant lemur species (largest body mass ~6.8 kg). We used ancient DNA and genomic methods to study subfossil lemur extinction biology and update our understanding of extant lemur conservation risk factors by i) reconstructing a comprehensive phylogeny of extinct and extant lemurs, and ii) testing whether low genetic diversity is associated with body size and extinction risk. We recovered complete or near-complete mitochondrial genomes from five subfossil lemur taxa, and generated sequence data from population samples of two extinct and eight extant lemur species. Phylogenetic comparisons resolved prior taxonomic uncertainties and confirmed that the extinct subfossil species did not comprise a single clade. Genetic diversity estimates for the two sampled extinct species were relatively low, suggesting small historical population sizes. Low genetic diversity and small population sizes are both risk factors that would have rendered giant lemurs especially susceptible to extinction. Surprisingly, among the extant lemurs, we did not observe a relationship between body size and genetic diversity. The decoupling of these variables suggests that risk factors other than body size may have as much or more meaning for establishing future lemur conservation priorities.

Subject(s)

Body Size , Extinction, Biological , Genomics/methods , Lemur , Paleontology/methods , Animals , Body Size/genetics , Body Size/physiology , DNA/analysis , DNA/genetics , Fossils , Lemur/classification , Lemur/genetics , Lemur/physiology , Madagascar , Phylogeny

9.

Khoisan hunter-gatherers have been the largest population throughout most of modern-human demographic history.

Kim, Hie Lim; Ratan, Aakrosh; Perry, George H; Montenegro, Alvaro; Miller, Webb; Schuster, Stephan C.

Nat Commun ; 5: 5692, 2014 Dec 04.

Article in English | MEDLINE | ID: mdl-25471224

ABSTRACT

The Khoisan people from Southern Africa maintained ancient lifestyles as hunter-gatherers or pastoralists up to modern times, though little else is known about their early history. Here we infer early demographic histories of modern humans using whole-genome sequences of five Khoisan individuals and one Bantu speaker. Comparison with a 420 K SNP data set from worldwide individuals demonstrates that two of the Khoisan genomes from the Ju/'hoansi population contain exclusive Khoisan ancestry. Coalescent analysis shows that the Khoisan and their ancestors have been the largest populations since their split with the non-Khoisan population ~100-150 kyr ago. In contrast, the ancestors of the non-Khoisan groups, including Bantu-speakers and non-Africans, experienced population declines after the split and lost more than half of their genetic diversity. Paleoclimate records indicate that the precipitation in southern Africa increased ~80-100 kyr ago while west-central Africa became drier. We hypothesize that these climate differences might be related to the divergent-ancient histories among human populations.

Subject(s)

Black People/genetics , Genetic Variation , Genetics, Population , Sequence Analysis, DNA , Africa, Southern , Demography , Female , Humans , Male

10.

Polar bears exhibit genome-wide signatures of bioenergetic adaptation to life in the arctic environment.

Welch, Andreanna J; Bedoya-Reina, Oscar C; Carretero-Paulet, Lorenzo; Miller, Webb; Rode, Karyn D; Lindqvist, Charlotte.

Genome Biol Evol ; 6(2): 433-50, 2014 Feb.

Article in English | MEDLINE | ID: mdl-24504087

ABSTRACT

Polar bears (Ursus maritimus) face extremely cold temperatures and periods of fasting, which might result in more severe energetic challenges than those experienced by their sister species, the brown bear (U. arctos). We have examined the mitochondrial and nuclear genomes of polar and brown bears to investigate whether polar bears demonstrate lineage-specific signals of molecular adaptation in genes associated with cellular respiration/energy production. We observed increased evolutionary rates in the mitochondrial cytochrome c oxidase I gene in polar but not brown bears. An amino acid substitution occurred near the interaction site with a nuclear-encoded subunit of the cytochrome c oxidase complex and was predicted to lead to a functional change, although the significance of this remains unclear. The nuclear genomes of brown and polar bears demonstrate different adaptations related to cellular respiration. Analyses of the genomes of brown bears exhibited substitutions that may alter the function of proteins that regulate glucose uptake, which could be beneficial when feeding on carbohydrate-dominated diets during hyperphagia, followed by fasting during hibernation. In polar bears, genes demonstrating signatures of functional divergence and those potentially under positive selection were enriched in functions related to production of nitric oxide (NO), which can regulate energy production in several different ways. This suggests that polar bears may be able to fine-tune intracellular levels of NO as an adaptive response to control trade-offs between energy production in the form of adenosine triphosphate versus generation of heat (thermogenesis).

Subject(s)

Energy Metabolism , Genome , Ursidae/genetics , Ursidae/metabolism , Adaptation, Physiological , Animals , Arctic Regions , Biological Evolution , Nitric Oxide/metabolism , Phylogeny , Proteins/genetics , Proteins/metabolism , Ursidae/classification

11.

Updates of the HbVar database of human hemoglobin variants and thalassemia mutations.

Giardine, Belinda; Borg, Joseph; Viennas, Emmanouil; Pavlidis, Cristiana; Moradkhani, Kamran; Joly, Philippe; Bartsakoulia, Marina; Riemer, Cathy; Miller, Webb; Tzimas, Giannis; Wajcman, Henri; Hardison, Ross C; Patrinos, George P.

Nucleic Acids Res ; 42(Database issue): D1063-9, 2014 Jan.

Article in English | MEDLINE | ID: mdl-24137000

ABSTRACT

HbVar (http://globin.bx.psu.edu/hbvar) is one of the oldest and most appreciated locus-specific databases launched in 2001 by a multi-center academic effort to provide timely information on the genomic alterations leading to hemoglobin variants and all types of thalassemia and hemoglobinopathies. Database records include extensive phenotypic descriptions, biochemical and hematological effects, associated pathology and ethnic occurrence, accompanied by mutation frequencies and references. Here, we report updates to >600 HbVar entries, inclusion of population-specific data for 28 populations and 27 ethnic groups for α- and β-thalassemias and additional querying options in the HbVar query page. HbVar content was also inter-connected with two other established genetic databases, namely FINDbase (http://www.findbase.org) and Leiden Open-Access Variation database (http://www.lovd.nl), which allows comparative data querying and analysis. HbVar data content has contributed to the realization of two collaborative projects to identify genomic variants that lie on different globin paralogs. Most importantly, HbVar data content has contributed to demonstrate the microattribution concept in practice. These updates significantly enriched the database content and querying potential, enhanced the database profile and data quality and broadened the inter-relation of HbVar with other databases, which should increase the already high impact of this resource to the globin and genetic database community.

Subject(s)

Databases, Nucleic Acid , Genetic Variation , Hemoglobins/genetics , Mutation , Thalassemia/genetics , Genotype , Humans , Internet , Phenotype , Thalassemia/ethnology

12.

Aye-aye population genomic analyses highlight an important center of endemism in northern Madagascar.

Perry, George H; Louis, Edward E; Ratan, Aakrosh; Bedoya-Reina, Oscar C; Burhans, Richard C; Lei, Runhua; Johnson, Steig E; Schuster, Stephan C; Miller, Webb.

Proc Natl Acad Sci U S A ; 110(15): 5823-8, 2013 Apr 09.

Article in English | MEDLINE | ID: mdl-23530231

ABSTRACT

We performed a population genomics study of the aye-aye, a highly specialized nocturnal lemur from Madagascar. Aye-ayes have low population densities and extensive range requirements that could make this flagship species particularly susceptible to extinction. Therefore, knowledge of genetic diversity and differentiation among aye-aye populations is critical for conservation planning. Such information may also advance our general understanding of Malagasy biogeography, as aye-ayes have the largest species distribution of any lemur. We generated and analyzed whole-genome sequence data for 12 aye-ayes from three regions of Madagascar (North, West, and East). We found that the North population is genetically distinct, with strong differentiation from other aye-ayes over relatively short geographic distances. For comparison, the average FST value between the North and East aye-aye populations--separated by only 248 km--is over 2.1-times greater than that observed between human Africans and Europeans. This finding is consistent with prior watershed- and climate-based hypotheses of a center of endemism in northern Madagascar. Taken together, these results suggest a strong and long-term biogeographical barrier to gene flow. Thus, the specific attention that should be directed toward preserving large, contiguous aye-aye habitats in northern Madagascar may also benefit the conservation of other distinct taxonomic units. To help facilitate future ecological- and conservation-motivated population genomic analyses by noncomputational biologists, the analytical toolkit used in this study is available on the Galaxy Web site.

Subject(s)

Genetics, Population , Genomics , Lemur/genetics , Lemur/physiology , Animals , Evolution, Molecular , Genome , Genotype , Geography , Internet , Madagascar , Phylogeny , Polymorphism, Single Nucleotide , Sequence Analysis, DNA , Time Factors

13.

Comparison of sequencing platforms for single nucleotide variant calls in a human sample.

Ratan, Aakrosh; Miller, Webb; Guillory, Joseph; Stinson, Jeremy; Seshagiri, Somasekar; Schuster, Stephan C.

PLoS One ; 8(2): e55089, 2013.

Article in English | MEDLINE | ID: mdl-23405114

ABSTRACT

Next-generation sequencing platforms coupled with advanced bioinformatic tools enable re-sequencing of the human genome at high speed and large cost savings. We compare sequencing platforms from Roche/454 (GS FLX), Illumina/HiSeq (HiSeq 2000), and Life Technologies/SOLiD (SOLiD 3 ECC) for their ability to identify single nucleotide substitutions in whole genome sequences from the same human sample. We report on significant GC-related bias observed in the data sequenced on Illumina and SOLiD platforms. The differences in the variant calls were investigated with regard to coverage and sequencing error. Some of the variants called by only one or two of the platforms were experimentally tested using mass spectrometry, a method that is independent of DNA sequencing. We establish several causes why variants remained unreported, specific to each platform. We report the indels called using the three sequencing technologies, and from the obtained results we conclude that sequencing human genomes with more than a single platform and multiple libraries is beneficial when a high level of accuracy is required.

Subject(s)

Genome, Human , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , Humans , Polymorphism, Single Nucleotide

14.

Galaxy tools to study genome diversity.

Bedoya-Reina, Oscar C; Ratan, Aakrosh; Burhans, Richard; Kim, Hie Lim; Giardine, Belinda; Riemer, Cathy; Li, Qunhua; Olson, Thomas L; Loughran, Thomas P; Vonholdt, Bridgett M; Perry, George H; Schuster, Stephan C; Miller, Webb.

Gigascience ; 2(1): 17, 2013 Dec 30.

Article in English | MEDLINE | ID: mdl-24377391

ABSTRACT

BACKGROUND: Intra-species genetic variation can be used to investigate population structure, selection, and gene flow in non-model vertebrates; and due to the plummeting costs for genome sequencing, it is now possible for small labs to obtain full-genome variation data from their species of interest. However, those labs may not have easy access to, and familiarity with, computational tools to analyze those data. RESULTS: We have created a suite of tools for the Galaxy web server aimed at handling nucleotide and amino-acid polymorphisms discovered by full-genome sequencing of several individuals of the same species, or using a SNP genotyping microarray. In addition to providing user-friendly tools, a main goal is to make published analyses reproducible. While most of the examples discussed in this paper deal with nuclear-genome diversity in non-human vertebrates, we also illustrate the application of the tools to fungal genomes, human biomedical data, and mitochondrial sequences. CONCLUSIONS: This project illustrates that a small group can design, implement, test, document, and distribute a Galaxy tool collection to meet the needs of a particular community of biologists.

15.

Some phenotype association tools in Galaxy: looking for disease SNPs in a full genome.

Giardine, Belinda M; Riemer, Cathy; Burhans, Richard; Ratan, Aakrosh; Miller, Webb.

Curr Protoc Bioinformatics ; Chapter 15: 15.2.1-15.2.27, 2012 Sep.

Article in English | MEDLINE | ID: mdl-22948727

ABSTRACT

This unit focuses on some of the tools available on the public Galaxy server that are useful for exploring possible associations between human genetic variants and phenotypes. We trace step-by-step through an example illustrating several methods for examining a single full-coverage genome to look for single-nucleotide polymorphisms (SNPs) that are either known to be associated with disease or suspected to have impact for other reasons. It makes use of public genomic data, tools designed specifically for working with variants, and also some general tools for text manipulation and operations on genomic coordinates.

Subject(s)

Phenotype , Polymorphism, Single Nucleotide , Software , Genetic Variation , Genome, Human , Humans

16.

Sequencing and analysis of a South Asian-Indian personal genome.

Gupta, Ravi; Ratan, Aakrosh; Rajesh, Changanamkandath; Chen, Rong; Kim, Hie Lim; Burhans, Richard; Miller, Webb; Santhosh, Sam; Davuluri, Ramana V; Butte, Atul J; Schuster, Stephan C; Seshagiri, Somasekar; Thomas, George.

BMC Genomics ; 13: 440, 2012 Aug 31.

Article in English | MEDLINE | ID: mdl-22938532

ABSTRACT

BACKGROUND: With over 1.3 billion people, India is estimated to contain three times more genetic diversity than does Europe. Next-generation sequencing technologies have facilitated the understanding of diversity by enabling whole genome sequencing at greater speed and lower cost. While genomes from people of European and Asian descent have been sequenced, only recently has a single male genome from the Indian subcontinent been published at sufficient depth and coverage. In this study we have sequenced and analyzed the genome of a South Asian Indian female (SAIF) from the Indian state of Kerala. RESULTS: We identified over 3.4 million SNPs in this genome including over 89,873 private variations. Comparison of the SAIF genome with several published personal genomes revealed that this individual shared ~50% of the SNPs with each of these genomes. Analysis of the SAIF mitochondrial genome showed that it was closely related to the U1 haplogroup which has been previously observed in Kerala. We assessed the SAIF genome for SNPs with health and disease consequences and found that the individual was at a higher risk for multiple sclerosis and a few other diseases. In analyzing SNPs that modulate drug response, we found a variation that predicts a favorable response to metformin, a drug used to treat diabetes. SNPs predictive of adverse reaction to warfarin indicated that the SAIF individual is not at risk for bleeding if treated with typical doses of warfarin. In addition, we report the presence of several additional SNPs of medical relevance. CONCLUSIONS: This is the first study to report the complete whole genome sequence of a female from the state of Kerala in India. The availability of this complete genome and variants will further aid studies aimed at understanding genetic diversity, identifying clinically relevant changes and assessing disease burden in the Indian population.

Subject(s)

Asian People/genetics , Chromosome Mapping , Genome, Human , Genome, Mitochondrial , Polymorphism, Single Nucleotide , Anticoagulants/adverse effects , DNA Copy Number Variations , Diabetes Mellitus/genetics , Diabetes Mellitus/prevention & control , Female , Genetic Predisposition to Disease , Genetic Variation , Haplotypes , Hemorrhage/chemically induced , Hemorrhage/genetics , Hemorrhage/prevention & control , Humans , Hypoglycemic Agents/therapeutic use , India , Metformin/therapeutic use , Middle Aged , Multiple Sclerosis/genetics , Multiple Sclerosis/prevention & control , Sequence Analysis, DNA , Warfarin/adverse effects

17.

Polar and brown bear genomes reveal ancient admixture and demographic footprints of past climate change.

Miller, Webb; Schuster, Stephan C; Welch, Andreanna J; Ratan, Aakrosh; Bedoya-Reina, Oscar C; Zhao, Fangqing; Kim, Hie Lim; Burhans, Richard C; Drautz, Daniela I; Wittekindt, Nicola E; Tomsho, Lynn P; Ibarra-Laclette, Enrique; Herrera-Estrella, Luis; Peacock, Elizabeth; Farley, Sean; Sage, George K; Rode, Karyn; Obbard, Martyn; Montiel, Rafael; Bachmann, Lutz; Ingólfsson, Olafur; Aars, Jon; Mailund, Thomas; Wiig, Oystein; Talbot, Sandra L; Lindqvist, Charlotte.

Proc Natl Acad Sci U S A ; 109(36): E2382-90, 2012 Sep 04.

Article in English | MEDLINE | ID: mdl-22826254

ABSTRACT

Polar bears (PBs) are superbly adapted to the extreme Arctic environment and have become emblematic of the threat to biodiversity from global climate change. Their divergence from the lower-latitude brown bear provides a textbook example of rapid evolution of distinct phenotypes. However, limited mitochondrial and nuclear DNA evidence conflicts in the timing of PB origin as well as placement of the species within versus sister to the brown bear lineage. We gathered extensive genomic sequence data from contemporary polar, brown, and American black bear samples, in addition to a 130,000- to 110,000-y old PB, to examine this problem from a genome-wide perspective. Nuclear DNA markers reflect a species tree consistent with expectation, showing polar and brown bears to be sister species. However, for the enigmatic brown bears native to Alaska's Alexander Archipelago, we estimate that not only their mitochondrial genome, but also 5-10% of their nuclear genome, is most closely related to PBs, indicating ancient admixture between the two species. Explicit admixture analyses are consistent with ancient splits among PBs, brown bears and black bears that were later followed by occasional admixture. We also provide paleodemographic estimates that suggest bear evolution has tracked key climate events, and that PB in particular experienced a prolonged and dramatic decline in its effective population size during the last ca. 500,000 years. We demonstrate that brown bears and PBs have had sufficiently independent evolutionary histories over the last 4-5 million years to leave imprints in the PB nuclear genome that likely are associated with ecological adaptation to the Arctic environment.

Subject(s)

Adaptation, Biological/genetics , Climate Change/history , Evolution, Molecular , Genetics, Population , Genome/genetics , Ursidae/genetics , Animals , Arctic Regions , Base Sequence , Genetic Markers/genetics , History, Ancient , Molecular Sequence Data , Population Density , Population Dynamics , Sequence Analysis, DNA , Species Specificity

18.

Revealing mammalian evolutionary relationships by comparative analysis of gene clusters.

Song, Giltae; Riemer, Cathy; Dickins, Benjamin; Kim, Hie Lim; Zhang, Louxin; Zhang, Yu; Hsu, Chih-Hao; Hardison, Ross C; Green, Eric D; Miller, Webb.

Genome Biol Evol ; 4(4): 586-601, 2012.

Article in English | MEDLINE | ID: mdl-22454131

ABSTRACT

Many software tools for comparative analysis of genomic sequence data have been released in recent decades. Despite this, it remains challenging to determine evolutionary relationships in gene clusters due to their complex histories involving duplications, deletions, inversions, and conversions. One concept describing these relationships is orthology. Orthologs derive from a common ancestor by speciation, in contrast to paralogs, which derive from duplication. Discriminating orthologs from paralogs is a necessary step in most multispecies sequence analyses, but doing so accurately is impeded by the occurrence of gene conversion events. We propose a refined method of orthology assignment based on two paradigms for interpreting its definition: by genomic context or by sequence content. X-orthology (based on context) traces orthology resulting from speciation and duplication only, while N-orthology (based on content) includes the influence of conversion events. We developed a computational method for automatically mapping both types of orthology on a per-nucleotide basis in gene cluster regions studied by comparative sequencing, and we make this mapping accessible by visualizing the output. All of these steps are incorporated into our newly extended CHAP 2 package. We evaluate our method using both simulated data and real gene clusters (including the well-characterized α-globin and β-globin clusters). We also illustrate use of CHAP 2 by analyzing four more loci: CCL (chemokine ligand), IFN (interferon), CYP2abf (part of cytochrome P450 family 2), and KIR (killer cell immunoglobulin-like receptors). These new methods facilitate and extend our understanding of evolution at these and other loci by adding automated accurate evolutionary inference to the biologist's toolkit. The CHAP 2 package is freely available from http://www.bx.psu.edu/miller_lab.

Subject(s)

Evolution, Molecular , Mammals/genetics , Multigene Family , Proteins/genetics , Animals , Gene Conversion , Gene Duplication , Genome , Humans , Mammals/classification , Phylogeny

19.

A genome sequence resource for the aye-aye (Daubentonia madagascariensis), a nocturnal lemur from Madagascar.

Perry, George H; Reeves, Darryl; Melsted, Páll; Ratan, Aakrosh; Miller, Webb; Michelini, Katelyn; Louis, Edward E; Pritchard, Jonathan K; Mason, Christopher E; Gilad, Yoav.

Genome Biol Evol ; 4(2): 126-35, 2012.

Article in English | MEDLINE | ID: mdl-22155688

ABSTRACT

We present a high-coverage draft genome assembly of the aye-aye (Daubentonia madagascariensis), a highly unusual nocturnal primate from Madagascar. Our assembly totals ~3.0 billion bp (3.0 Gb), roughly the size of the human genome, comprised of ~2.6 million scaffolds (N50 scaffold size = 13,597 bp) based on short paired-end sequencing reads. We compared the aye-aye genome sequence data with four other published primate genomes (human, chimpanzee, orangutan, and rhesus macaque) as well as with the mouse and dog genomes as nonprimate outgroups. Unexpectedly, we observed strong evidence for a relatively slow substitution rate in the aye-aye lineage compared with these and other primates. In fact, the aye-aye branch length is estimated to be ~10% shorter than that of the human lineage, which is known for its low substitution rate. This finding may be explained, in part, by the protracted aye-aye life-history pattern, including late weaning and age of first reproduction relative to other lemurs. Additionally, the availability of this draft lemur genome sequence allowed us to polarize nucleotide and protein sequence changes to the ancestral primate lineage, a critical period in primate evolution for which the relevant fossil record is sparse. Finally, we identified 293,800 high-confidence single nucleotide polymorphisms in the donor individual for our aye-aye genome sequence, a captive-born individual from two wild-born parents. The resulting heterozygosity estimate of 0.051% is the lowest of any primate studied to date, which is understandable considering the aye-aye's extensive home-range size and relatively low population densities. Yet this level of genetic diversity also suggests that conservation efforts benefiting this unusual species should be prioritized, especially in the face of the accelerating degradation and fragmentation of Madagascar's forests.

Subject(s)

Darkness , Genome/genetics , Lemur/genetics , Animals , Base Sequence , Dogs , Evolution, Molecular , Genetic Variation , Geography , Humans , Madagascar , Mice , Molecular Sequence Data , Nucleotides/genetics , Open Reading Frames/genetics , Phylogeny , Species Specificity

20.

Conversion events in gene clusters.

Song, Giltae; Hsu, Chih-Hao; Riemer, Cathy; Zhang, Yu; Kim, Hie Lim; Hoffmann, Federico; Zhang, Louxin; Hardison, Ross C; Green, Eric D; Miller, Webb.

BMC Evol Biol ; 11: 226, 2011 Jul 28.

Article in English | MEDLINE | ID: mdl-21798034

ABSTRACT

BACKGROUND: Gene clusters containing multiple similar genomic regions in close proximity are of great interest for biomedical studies because of their associations with inherited diseases. However, such regions are difficult to analyze due to their structural complexity and their complicated evolutionary histories, reflecting a variety of large-scale mutational events. In particular, conversion events can mislead inferences about the relationships among these regions, as traced by traditional methods such as construction of phylogenetic trees or multi-species alignments. RESULTS: To correct the distorted information generated by such methods, we have developed an automated pipeline called CHAP (Cluster History Analysis Package) for detecting conversion events. We used this pipeline to analyze the conversion events that affected two well-studied gene clusters (α-globin and β-globin) and three gene clusters for which comparative sequence data were generated from seven primate species: CCL (chemokine ligand), IFN (interferon), and CYP2abf (part of cytochrome P450 family 2). CHAP is freely available at http://www.bx.psu.edu/miller_lab. CONCLUSIONS: These studies reveal the value of characterizing conversion events in the context of studying gene clusters in complex genomes.

Subject(s)

Gene Conversion , Multigene Family , Primates/genetics , alpha-Globins/genetics , beta-Globins/genetics , Animals , Evolution, Molecular , Genome , Humans , Molecular Sequence Data , Phylogeny , Primates/classification , Software
