Results 1 - 20 of 108
1.
Biochem Biophys Res Commun ; 474(1): 29-34, 2016 05 20.
Article in English | MEDLINE | ID: mdl-27084454

ABSTRACT

Devil facial tumour disease (DFTD) is an infectious tumour disease and was hypothesised to be transmitted by allograft during biting based on two cytogenetic findings of DFTD tumours in 2006. It was then believed that DFTD tumours were originally from a female devil. In this study the devil sex-determining region Y (SRY) gene was PCR amplified and sequenced, and six pairs of devil SRY PCR primers were used for detection of devil SRY gene fragments in purified DFTD tumour cell lines. Using three pairs of devil SRY PCR primers, devil SRY gene sequence was detected by PCR and sequencing in genomic DNA of DFTD tumour cell lines from six male devils, but not from six female devils. Four out of six DFTD tumour cell lines from male devils contained nucleotides 288-482 of the devil SRY gene, and another two DFTD tumour cell lines contained nucleotides 381-577 and 493-708 of the gene, respectively. These results indicate that the different portions of the SRY gene in the DFTD tumours of the male devils were originally from the male hosts, rejecting the currently believed DFTD allograft transmission theory. The reasons why DFTD transmission was incorrectly defined as allograft are discussed.


Subject(s)
Facial Neoplasms/genetics , Marsupialia/genetics , Polymerase Chain Reaction/methods , Sequence Analysis, DNA/methods , Sex Determination Analysis/methods , Sex-Determining Region Y Protein/genetics , Allografts/transplantation , Animals , Cell Line, Tumor , Female , Male , Sex Characteristics
2.
Nature ; 463(7283): 943-7, 2010 Feb 18.
Article in English | MEDLINE | ID: mdl-20164927

ABSTRACT

The genetic structure of the indigenous hunter-gatherer peoples of southern Africa, the oldest known lineage of modern human, is important for understanding human diversity. Studies based on mitochondrial and small sets of nuclear markers have shown that these hunter-gatherers, known as Khoisan, San, or Bushmen, are genetically divergent from other humans. However, until now, fully sequenced human genomes have been limited to recently diverged populations. Here we present the complete genome sequences of an indigenous hunter-gatherer from the Kalahari Desert and a Bantu from southern Africa, as well as protein-coding regions from an additional three hunter-gatherers from disparate regions of the Kalahari. We characterize the extent of whole-genome and exome diversity among the five men, reporting 1.3 million novel DNA differences genome-wide, including 13,146 novel amino acid variants. In terms of nucleotide substitutions, the Bushmen seem to be, on average, more different from each other than, for example, a European and an Asian. Observed genomic differences between the hunter-gatherers and others may help to pinpoint genetic adaptations to an agricultural lifestyle. Adding the described variants to current databases will facilitate inclusion of southern Africans in medical research efforts, particularly when family and medical histories can be correlated with genome-wide data.


Subject(s)
Black People/genetics , Ethnicity/genetics , Genome, Human/genetics , Asian People/genetics , Exons/genetics , Genetics, Medical , Humans , Phylogeny , Polymorphism, Single Nucleotide/genetics , South Africa/ethnology , White People/genetics
3.
Nucleic Acids Res ; 42(Database issue): D1063-9, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24137000

ABSTRACT

HbVar (http://globin.bx.psu.edu/hbvar) is one of the oldest and most appreciated locus-specific databases launched in 2001 by a multi-center academic effort to provide timely information on the genomic alterations leading to hemoglobin variants and all types of thalassemia and hemoglobinopathies. Database records include extensive phenotypic descriptions, biochemical and hematological effects, associated pathology and ethnic occurrence, accompanied by mutation frequencies and references. Here, we report updates to >600 HbVar entries, inclusion of population-specific data for 28 populations and 27 ethnic groups for α- and β-thalassemias and additional querying options in the HbVar query page. HbVar content was also inter-connected with two other established genetic databases, namely FINDbase (http://www.findbase.org) and Leiden Open-Access Variation database (http://www.lovd.nl), which allows comparative data querying and analysis. HbVar data content has contributed to the realization of two collaborative projects to identify genomic variants that lie on different globin paralogs. Most importantly, HbVar data content has contributed to demonstrate the microattribution concept in practice. These updates significantly enriched the database content and querying potential, enhanced the database profile and data quality and broadened the inter-relation of HbVar with other databases, which should increase the already high impact of this resource to the globin and genetic database community.


Subject(s)
Databases, Nucleic Acid , Genetic Variation , Hemoglobins/genetics , Mutation , Thalassemia/genetics , Genotype , Humans , Internet , Phenotype , Thalassemia/ethnology
4.
Proc Natl Acad Sci U S A ; 110(15): 5823-8, 2013 Apr 09.
Article in English | MEDLINE | ID: mdl-23530231

ABSTRACT

We performed a population genomics study of the aye-aye, a highly specialized nocturnal lemur from Madagascar. Aye-ayes have low population densities and extensive range requirements that could make this flagship species particularly susceptible to extinction. Therefore, knowledge of genetic diversity and differentiation among aye-aye populations is critical for conservation planning. Such information may also advance our general understanding of Malagasy biogeography, as aye-ayes have the largest species distribution of any lemur. We generated and analyzed whole-genome sequence data for 12 aye-ayes from three regions of Madagascar (North, West, and East). We found that the North population is genetically distinct, with strong differentiation from other aye-ayes over relatively short geographic distances. For comparison, the average FST value between the North and East aye-aye populations--separated by only 248 km--is over 2.1-times greater than that observed between human Africans and Europeans. This finding is consistent with prior watershed- and climate-based hypotheses of a center of endemism in northern Madagascar. Taken together, these results suggest a strong and long-term biogeographical barrier to gene flow. Thus, the specific attention that should be directed toward preserving large, contiguous aye-aye habitats in northern Madagascar may also benefit the conservation of other distinct taxonomic units. To help facilitate future ecological- and conservation-motivated population genomic analyses by noncomputational biologists, the analytical toolkit used in this study is available on the Galaxy Web site.


Subject(s)
Genetics, Population , Genomics , Lemur/genetics , Lemur/physiology , Animals , Evolution, Molecular , Genome , Genotype , Geography , Internet , Madagascar , Phylogeny , Polymorphism, Single Nucleotide , Sequence Analysis, DNA , Time Factors
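
The FST comparison quoted in the abstract above can be made concrete with a small calculation. The following is a minimal sketch of Wright's fixation index for a single biallelic site in two equally weighted populations; it is not the estimator used in the study (the abstract does not say which one was applied), and the function names and example frequencies are illustrative assumptions.

```python
def heterozygosity(p: float) -> float:
    """Expected heterozygosity 2p(1 - p) at a biallelic site with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst_two_populations(p1: float, p2: float) -> float:
    """Wright's FST = (HT - HS) / HT for two equally weighted populations.

    p1 and p2 are the frequencies of the same allele in each population.
    This is a textbook single-site definition, used here only for illustration.
    """
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie in [0, 1]")
    p_bar = (p1 + p2) / 2.0
    h_total = heterozygosity(p_bar)  # heterozygosity of the pooled population
    h_within = (heterozygosity(p1) + heterozygosity(p2)) / 2.0  # mean within-population value
    if h_total == 0.0:
        return 0.0  # site is monomorphic overall; FST is defined as 0 here by convention
    return (h_total - h_within) / h_total


if __name__ == "__main__":
    # Hypothetical allele frequencies, not taken from the study.
    print(fst_two_populations(0.2, 0.8))
```

Identical frequencies give 0 and fixed differences give 1, so a higher value indicates stronger differentiation between the two populations. Genome-wide averages such as the one reported in the abstract combine many sites, typically with sample-size corrections that this sketch omits.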
5.
BMC Bioinformatics ; 16: 42, 2015 Feb 13.
Article in English | MEDLINE | ID: mdl-25879703

ABSTRACT

BACKGROUND: The discovery and mapping of genomic variants is an essential step in most analyses done using sequencing reads. There are a number of mature software packages and associated pipelines that can identify single nucleotide polymorphisms (SNPs) with a high degree of concordance. However, the same cannot be said for tools that are used to identify the other types of variants. Indels represent the second most frequent class of variants in the human genome, after single nucleotide polymorphisms. The reliable detection of indels is still a challenging problem, especially for variants that are longer than a few bases. RESULTS: We have developed a set of algorithms and heuristics collectively called indelMINER to identify indels from whole genome resequencing datasets using paired-end reads. indelMINER uses a split-read approach to identify the precise breakpoints for indels of size less than a user specified threshold, and supplements that with a paired-end approach to identify larger variants that are frequently missed with the split-read approach. We use simulated and real datasets to show that an implementation of the algorithm performs favorably when compared to several existing tools. CONCLUSIONS: indelMINER can be used effectively to identify indels in whole-genome resequencing projects. The output is provided in the VCF format along with additional information about the variant, including information about its presence or absence in another sample. The source code and documentation for indelMINER can be freely downloaded from www.bx.psu.edu/miller_lab/indelMINER.tar.gz.


Subject(s)
Algorithms , Biomarkers, Tumor/genetics , Genome, Human , High-Throughput Nucleotide Sequencing/methods , INDEL Mutation/genetics , Neoplasms/genetics , Sequence Analysis, DNA/methods , Case-Control Studies , Genomics/methods , Humans , Polymorphism, Single Nucleotide/genetics
6.
BMC Genomics ; 16: 518, 2015 Jul 10.
Article in English | MEDLINE | ID: mdl-26159619

ABSTRACT

BACKGROUND: With the development of inexpensive, high-throughput sequencing technologies, it has become feasible to examine questions related to population genetics and molecular evolution of non-model species in their ecological contexts on a genome-wide scale. Here, we employed a newly developed suite of integrated, web-based programs to examine population dynamics and signatures of selection across the genome using several well-established tests, including FST, pN/pS, and McDonald-Kreitman. We applied these techniques to study populations of honey bees (Apis mellifera) in East Africa. In Kenya, there are several described A. mellifera subspecies, which are thought to be localized to distinct ecological regions. RESULTS: We performed whole genome sequencing of 11 worker honey bees from apiaries distributed throughout Kenya and identified 3.6 million putative single-nucleotide polymorphisms. The dense coverage allowed us to apply several computational procedures to study population structure and the evolutionary relationships among the populations, and to detect signs of adaptive evolution across the genome. While there is considerable gene flow among the sampled populations, there are clear distinctions between populations from the northern desert region and those from the temperate, savannah region. We identified several genes showing population genetic patterns consistent with positive selection within African bee populations, and between these populations and European A. mellifera or Asian Apis florea. CONCLUSIONS: These results lay the groundwork for future studies of adaptive ecological evolution in honey bees, and demonstrate the use of new, freely available web-based tools and workflows (http://usegalaxy.org/r/kenyanbee) that can be applied to any model system with genomic information.


Subject(s)
Bees/genetics , Genome, Insect/genetics , Selection, Genetic/genetics , Transcriptome/genetics , Animals , Evolution, Molecular , Genetics, Population/methods , Genomics/methods , Kenya , Models, Genetic , Polymorphism, Single Nucleotide/genetics , Population Dynamics
7.
J Hum Evol ; 79: 45-54, 2015 Feb.
Article in English | MEDLINE | ID: mdl-25523037

ABSTRACT

Humans first arrived on Madagascar only a few thousand years ago. Subsequent habitat destruction and hunting activities have had significant impacts on the island's biodiversity, including the extinction of megafauna. For example, we know of 17 recently extinct 'subfossil' lemur species, all of which were substantially larger (body mass ∼11-160 kg) than any living population of the ∼100 extant lemur species (largest body mass ∼6.8 kg). We used ancient DNA and genomic methods to study subfossil lemur extinction biology and update our understanding of extant lemur conservation risk factors by i) reconstructing a comprehensive phylogeny of extinct and extant lemurs, and ii) testing whether low genetic diversity is associated with body size and extinction risk. We recovered complete or near-complete mitochondrial genomes from five subfossil lemur taxa, and generated sequence data from population samples of two extinct and eight extant lemur species. Phylogenetic comparisons resolved prior taxonomic uncertainties and confirmed that the extinct subfossil species did not comprise a single clade. Genetic diversity estimates for the two sampled extinct species were relatively low, suggesting small historical population sizes. Low genetic diversity and small population sizes are both risk factors that would have rendered giant lemurs especially susceptible to extinction. Surprisingly, among the extant lemurs, we did not observe a relationship between body size and genetic diversity. The decoupling of these variables suggests that risk factors other than body size may have as much or more meaning for establishing future lemur conservation priorities.


Subject(s)
Body Size , Extinction, Biological , Genomics/methods , Lemur , Paleontology/methods , Animals , Body Size/genetics , Body Size/physiology , DNA/analysis , DNA/genetics , Fossils , Lemur/classification , Lemur/genetics , Lemur/physiology , Madagascar , Phylogeny
9.
Proc Natl Acad Sci U S A ; 109(36): E2382-90, 2012 Sep 04.
Article in English | MEDLINE | ID: mdl-22826254

ABSTRACT

Polar bears (PBs) are superbly adapted to the extreme Arctic environment and have become emblematic of the threat to biodiversity from global climate change. Their divergence from the lower-latitude brown bear provides a textbook example of rapid evolution of distinct phenotypes. However, limited mitochondrial and nuclear DNA evidence conflicts in the timing of PB origin as well as placement of the species within versus sister to the brown bear lineage. We gathered extensive genomic sequence data from contemporary polar, brown, and American black bear samples, in addition to a 130,000- to 110,000-y old PB, to examine this problem from a genome-wide perspective. Nuclear DNA markers reflect a species tree consistent with expectation, showing polar and brown bears to be sister species. However, for the enigmatic brown bears native to Alaska's Alexander Archipelago, we estimate that not only their mitochondrial genome, but also 5-10% of their nuclear genome, is most closely related to PBs, indicating ancient admixture between the two species. Explicit admixture analyses are consistent with ancient splits among PBs, brown bears and black bears that were later followed by occasional admixture. We also provide paleodemographic estimates that suggest bear evolution has tracked key climate events, and that PB in particular experienced a prolonged and dramatic decline in its effective population size during the last ca. 500,000 years. We demonstrate that brown bears and PBs have had sufficiently independent evolutionary histories over the last 4-5 million years to leave imprints in the PB nuclear genome that likely are associated with ecological adaptation to the Arctic environment.


Subject(s)
Adaptation, Biological/genetics , Climate Change/history , Evolution, Molecular , Genetics, Population , Genome/genetics , Ursidae/genetics , Animals , Arctic Regions , Base Sequence , Genetic Markers/genetics , History, Ancient , Molecular Sequence Data , Population Density , Population Dynamics , Sequence Analysis, DNA , Species Specificity
10.
Genome Res ; 21(7): 1139-49, 2011 Jul.
Article in English | MEDLINE | ID: mdl-21628450

ABSTRACT

Plasticity of gene regulatory encryption can permit DNA sequence divergence without loss of function. Functional information is preserved through conservation of the composition of transcription factor binding sites (TFBS) in a regulatory element. We have developed a method that can accurately identify pairs of functional noncoding orthologs at evolutionarily diverged loci by searching for conserved TFBS arrangements. With an estimated 5% false-positive rate (FPR) in approximately 3000 human and zebrafish syntenic loci, we detected approximately 300 pairs of diverged elements that are likely to share common ancestry and have similar regulatory activity. By analyzing a pool of experimentally validated human enhancers, we demonstrated that 7/8 (88%) of their predicted functional orthologs retained in vivo regulatory control. Moreover, in 5/7 (71%) of assayed enhancer pairs, we observed concordant expression patterns. We argue that TFBS composition is often necessary to retain and sufficient to predict regulatory function in the absence of overt sequence conservation, revealing an entire class of functionally conserved, evolutionarily diverged regulatory elements that we term "covert."


Subject(s)
Conserved Sequence , Enhancer Elements, Genetic , Gene Expression Regulation, Developmental , Sequence Analysis, DNA/methods , Animals , Animals, Genetically Modified/genetics , Computational Biology/methods , Evolution, Molecular , Genetic Loci , Genome, Human , Humans , Models, Genetic , Oligonucleotide Array Sequence Analysis , Sequence Alignment , Synteny , Transcription Factors/genetics , Zebrafish/genetics
11.
Genome Res ; 21(10): 1659-71, 2011 Oct.
Article in English | MEDLINE | ID: mdl-21795386

ABSTRACT

Interplays among lineage-specific nuclear proteins, chromatin modifying enzymes, and the basal transcription machinery govern cellular differentiation, but their dynamics of action and coordination with transcriptional control are not fully understood. Alterations in chromatin structure appear to establish a permissive state for gene activation at some loci, but they play an integral role in activation at other loci. To determine the predominant roles of chromatin states and factor occupancy in directing gene regulation during differentiation, we mapped chromatin accessibility, histone modifications, and nuclear factor occupancy genome-wide during mouse erythroid differentiation dependent on the master regulatory transcription factor GATA1. Notably, despite extensive changes in gene expression, the chromatin state profiles (proportions of a gene in a chromatin state dominated by activating or repressive histone modifications) and accessibility remain largely unchanged during GATA1-induced erythroid differentiation. In contrast, gene induction and repression are strongly associated with changes in patterns of transcription factor occupancy. Our results indicate that during erythroid differentiation, the broad features of chromatin states are established at the stage of lineage commitment, largely independently of GATA1. These determine permissiveness for expression, with subsequent induction or repression mediated by distinctive combinations of transcription factors.


Subject(s)
Cell Differentiation/genetics , Epigenesis, Genetic , Erythropoiesis/genetics , GATA1 Transcription Factor/metabolism , Animals , Basic Helix-Loop-Helix Transcription Factors/metabolism , Cell Line , Chromatin Assembly and Disassembly , Chromatin Immunoprecipitation , Estradiol/pharmacology , Estradiol/physiology , GATA1 Transcription Factor/genetics , GATA2 Transcription Factor/metabolism , Gene Expression Profiling , Gene Silencing , Mice , Multivariate Analysis , Peptide Hydrolases/metabolism , Protein Binding , Proto-Oncogene Proteins/metabolism , Receptors, Estrogen/genetics , Recombinant Proteins/genetics , Recombinant Proteins/metabolism , Regulatory Sequences, Nucleic Acid , T-Cell Acute Lymphocytic Leukemia Protein 1
12.
Nature ; 456(7220): 387-90, 2008 Nov 20.
Article in English | MEDLINE | ID: mdl-19020620

ABSTRACT

In 1994, two independent groups extracted DNA from several Pleistocene epoch mammoths and noted differences among individual specimens. Subsequently, DNA sequences have been published for a number of extinct species. However, such ancient DNA is often fragmented and damaged, and studies to date have typically focused on short mitochondrial sequences, never yielding more than a fraction of a per cent of any nuclear genome. Here we describe 4.17 billion bases (Gb) of sequence from several mammoth specimens, 3.3 billion (80%) of which are from the woolly mammoth (Mammuthus primigenius) genome and thus comprise an extensive set of genome-wide sequence from an extinct species. Our data support earlier reports that elephantid genomes exceed 4 Gb. The estimated divergence rate between mammoth and African elephant is half of that between human and chimpanzee. The observed number of nucleotide differences between two particular mammoths was approximately one-eighth of that between one of them and the African elephant, corresponding to a separation between the mammoths of 1.5-2.0 Myr. The estimated probability that orthologous elephant and mammoth amino acids differ is 0.002, corresponding to about one residue per protein. Differences were discovered between mammoth and African elephant in amino-acid positions that are otherwise invariant over several billion years of combined mammalian evolution. This study shows that nuclear genome sequencing of extinct species can reveal population differences not evident from the fossil record, and perhaps even discover genetic factors that affect extinction.


Subject(s)
Cell Nucleus/genetics , Elephants/genetics , Evolution, Molecular , Extinction, Biological , Fossils , Genome/genetics , Genomics , Sequence Analysis, DNA/methods , Africa , Animals , Conserved Sequence/genetics , Elephants/anatomy & histology , Female , Hair/metabolism , Humans , India , Male , Phylogeny
13.
Nature ; 453(7192): 175-83, 2008 May 08.
Article in English | MEDLINE | ID: mdl-18464734

ABSTRACT

We present a draft genome sequence of the platypus, Ornithorhynchus anatinus. This monotreme exhibits a fascinating combination of reptilian and mammalian characters. For example, platypuses have a coat of fur adapted to an aquatic lifestyle; platypus females lactate, yet lay eggs; and males are equipped with venom similar to that of reptiles. Analysis of the first monotreme genome aligned these features with genetic innovations. We find that reptile and platypus venom proteins have been co-opted independently from the same gene families; milk protein genes are conserved despite platypuses laying eggs; and immune gene family expansions are directly related to platypus biology. Expansions of protein, non-protein-coding RNA and microRNA families, as well as repeat elements, are identified. Sequencing of this genome now provides a valuable resource for deep mammalian comparative analyses, as well as for monotreme biology and conservation.


Subject(s)
Evolution, Molecular , Genome/genetics , Platypus/genetics , Animals , Base Composition , Dentition , Female , Genomic Imprinting/genetics , Humans , Immunity/genetics , Male , Mammals/genetics , MicroRNAs/genetics , Milk Proteins/genetics , Phylogeny , Platypus/immunology , Platypus/physiology , Receptors, Odorant/genetics , Repetitive Sequences, Nucleic Acid/genetics , Reptiles/genetics , Sequence Analysis, DNA , Spermatozoa/metabolism , Venoms/genetics , Zona Pellucida/metabolism
14.
Proc Natl Acad Sci U S A ; 108(30): 12348-53, 2011 Jul 26.
Article in English | MEDLINE | ID: mdl-21709235

ABSTRACT

The Tasmanian devil (Sarcophilus harrisii) is threatened with extinction because of a contagious cancer known as Devil Facial Tumor Disease. The inability to mount an immune response and to reject these tumors might be caused by a lack of genetic diversity within a dwindling population. Here we report a whole-genome analysis of two animals originating from extreme northwest and southeast Tasmania, the maximal geographic spread, together with the genome from a tumor taken from one of them. A 3.3-Gb de novo assembly of the sequence data from two complementary next-generation sequencing platforms was used to identify 1 million polymorphic genomic positions, roughly one-quarter of the number observed between two genetically distant human genomes. Analysis of 14 complete mitochondrial genomes from current and museum specimens, as well as mitochondrial and nuclear SNP markers in 175 animals, suggests that the observed low genetic diversity in today's population preceded the Devil Facial Tumor Disease outbreak by at least 100 y. Using a genetically characterized breeding stock based on the genome sequence will enable preservation of the extant genetic diversity in future Tasmanian devil populations.


Subject(s)
Genetic Variation , Marsupialia/genetics , Animals , Breeding , DNA, Mitochondrial/genetics , DNA, Neoplasm/genetics , Extinction, Biological , Facial Neoplasms/genetics , Facial Neoplasms/veterinary , Genetics, Population , Genome, Mitochondrial , Humans , Models, Molecular , Molecular Sequence Data , Neoplasm Proteins/chemistry , Neoplasm Proteins/genetics , Neoplasms/genetics , Neoplasms/veterinary , Phylogeny , Polymorphism, Single Nucleotide , Tasmania , Time Factors
15.
Proc Natl Acad Sci U S A ; 107(11): 5053-7, 2010 Mar 16.
Article in English | MEDLINE | ID: mdl-20194737

ABSTRACT

The polar bear has become the flagship species in the climate-change discussion. However, little is known about how past climate impacted its evolution and persistence, given an extremely poor fossil record. Although it is undisputed from analyses of mitochondrial (mt) DNA that polar bears constitute a lineage within the genetic diversity of brown bears, timing estimates of their divergence have differed considerably. Using next-generation sequencing technology, we have generated a complete, high-quality mt genome from a stratigraphically validated 130,000- to 110,000-year-old polar bear jawbone. In addition, six mt genomes were generated of extant polar bears from Alaska and brown bears from the Admiralty and Baranof islands of the Alexander Archipelago of southeastern Alaska and Kodiak Island. We show that the phylogenetic position of the ancient polar bear lies almost directly at the branching point between polar bears and brown bears, elucidating a unique morphologically and molecularly documented fossil link between living mammal species. Molecular dating and stable isotope analyses also show that by very early in their evolutionary history, polar bears were already inhabitants of the Arctic sea ice and had adapted very rapidly to their current and unique ecology at the top of the Arctic marine food chain. As such, polar bears provide an excellent example of evolutionary opportunism within a widespread mammalian lineage.


Subject(s)
Biological Evolution , Genome, Mitochondrial/genetics , Jaw/anatomy & histology , Ursidae/anatomy & histology , Ursidae/genetics , Animals , Base Sequence , Genetic Variation , Molecular Sequence Data , Phylogeny , Time Factors
16.
Nat Genet ; 33(4): 514-7, 2003 Apr.
Article in English | MEDLINE | ID: mdl-12612582

ABSTRACT

Although mutation is commonly thought of as a random process, evolutionary studies show that different types of nucleotide substitution occur with widely varying rates that presumably reflect biases intrinsic to mutation and repair mechanisms. A strand asymmetry, the occurrence of particular substitution types at higher rates than their complementary types, that is associated with DNA replication has been found in bacteria and mitochondria. A strand asymmetry that is associated with transcription and attributable to higher rates of cytosine deamination on the coding strand has been observed in enterobacteria. Here, we describe a qualitatively different transcription-associated strand asymmetry in mammals, which may be a byproduct of transcription-coupled repair in germline cells. This mutational asymmetry has acted over long periods of time to produce a compositional asymmetry, an excess of G+T over A+C on the coding strand, in most genes. The mutational and compositional asymmetries can be used to detect the orientations and approximate extents of transcribed regions.


Subject(s)
DNA Mutational Analysis , Transcription, Genetic , Animals , Biological Evolution , Cell Lineage , Chromosomes, Human, Pair 22 , CpG Islands , Databases as Topic , Humans , Models, Genetic , Papio , RNA, Messenger/metabolism , Sequence Analysis, DNA
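
The compositional asymmetry described in the abstract above, an excess of G+T over A+C on the coding strand, can be measured with a simple normalized skew. The sketch below is one plausible way to compute it; the function names, the normalization, and the non-overlapping window scheme are illustrative assumptions and are not taken from the paper.

```python
def gt_ac_skew(seq: str) -> float:
    """Return ((G+T) - (A+C)) / (G+T+A+C) for one strand of a DNA sequence.

    Positive values mean an excess of G+T over A+C on the given strand.
    Characters other than A, C, G, T (for example N) are ignored.
    """
    seq = seq.upper()
    gt = seq.count("G") + seq.count("T")
    ac = seq.count("A") + seq.count("C")
    total = gt + ac
    if total == 0:
        return 0.0  # no informative bases in this stretch
    return (gt - ac) / total


def windowed_skew(seq: str, window: int) -> list:
    """Skew in consecutive non-overlapping windows; a trailing partial window is dropped."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        gt_ac_skew(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, window)
    ]


if __name__ == "__main__":
    # Toy sequence, chosen only to show a sign change between windows.
    print(windowed_skew("GGTTGTTGAACCACCA", 8))
```

Because a G+T excess on one strand corresponds to an A+C excess on its complement, the sign of the skew along a chromosome hints at the orientation of a transcribed region, and a sign change hints at a boundary. This is the intuition behind the abstract's final sentence, not a reimplementation of the authors' detection procedure.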
17.
BMC Genomics ; 13: 440, 2012 Aug 31.
Article in English | MEDLINE | ID: mdl-22938532

ABSTRACT

BACKGROUND: With over 1.3 billion people, India is estimated to contain three times more genetic diversity than does Europe. Next-generation sequencing technologies have facilitated the understanding of diversity by enabling whole genome sequencing at greater speed and lower cost. While genomes from people of European and Asian descent have been sequenced, only recently has a single male genome from the Indian subcontinent been published at sufficient depth and coverage. In this study we have sequenced and analyzed the genome of a South Asian Indian female (SAIF) from the Indian state of Kerala. RESULTS: We identified over 3.4 million SNPs in this genome including over 89,873 private variations. Comparison of the SAIF genome with several published personal genomes revealed that this individual shared ~50% of the SNPs with each of these genomes. Analysis of the SAIF mitochondrial genome showed that it was closely related to the U1 haplogroup which has been previously observed in Kerala. We assessed the SAIF genome for SNPs with health and disease consequences and found that the individual was at a higher risk for multiple sclerosis and a few other diseases. In analyzing SNPs that modulate drug response, we found a variation that predicts a favorable response to metformin, a drug used to treat diabetes. SNPs predictive of adverse reaction to warfarin indicated that the SAIF individual is not at risk for bleeding if treated with typical doses of warfarin. In addition, we report the presence of several additional SNPs of medical relevance. CONCLUSIONS: This is the first study to report the complete whole genome sequence of a female from the state of Kerala in India. The availability of this complete genome and variants will further aid studies aimed at understanding genetic diversity, identifying clinically relevant changes and assessing disease burden in the Indian population.


Subject(s)
Asian People/genetics , Chromosome Mapping , Genome, Human , Genome, Mitochondrial , Polymorphism, Single Nucleotide , Anticoagulants/adverse effects , DNA Copy Number Variations , Diabetes Mellitus/genetics , Diabetes Mellitus/prevention & control , Female , Genetic Predisposition to Disease , Genetic Variation , Haplotypes , Hemorrhage/chemically induced , Hemorrhage/genetics , Hemorrhage/prevention & control , Humans , Hypoglycemic Agents/therapeutic use , India , Metformin/therapeutic use , Middle Aged , Multiple Sclerosis/genetics , Multiple Sclerosis/prevention & control , Sequence Analysis, DNA , Warfarin/adverse effects
18.
Genome Res ; 19(12): 2172-84, 2009 Dec.
Article in English | MEDLINE | ID: mdl-19887574

ABSTRACT

The transcription factor GATA1 regulates an extensive program of gene activation and repression during erythroid development. However, the associated mechanisms, including the contributions of distal versus proximal cis-regulatory modules, co-occupancy with other transcription factors, and the effects of histone modifications, are poorly understood. We studied these problems genome-wide in a Gata1 knockout erythroblast cell line that undergoes GATA1-dependent terminal maturation, identifying 2616 GATA1-responsive genes and 15,360 GATA1-occupied DNA segments after restoration of GATA1. Virtually all occupied DNA segments have high levels of H3K4 monomethylation and low levels of H3K27me3 around the canonical GATA binding motif, regardless of whether the nearby gene is induced or repressed. Induced genes tend to be bound by GATA1 close to the transcription start site (most frequently in the first intron), have multiple GATA1-occupied segments that are also bound by TAL1, and show evolutionary constraint on the GATA1-binding site motif. In contrast, repressed genes are further away from GATA1-occupied segments, and a subset shows reduced TAL1 occupancy and increased H3K27me3 at the transcription start site. Our data expand the repertoire of GATA1 action in erythropoiesis by defining a new cohort of target genes and determining the spatial distribution of cis-regulatory modules throughout the genome. In addition, we begin to establish functional criteria and mechanisms that distinguish GATA1 activation from repression at specific target genes. More broadly, these studies illustrate how a "master regulator" transcription factor coordinates tissue differentiation through a panoply of DNA and protein interactions.


Subject(s)
Erythropoiesis/drug effects , GATA1 Transcription Factor/metabolism , Gene Expression Regulation, Developmental , Genome , Histones/metabolism , RNA, Messenger/metabolism , Binding Sites , Cell Differentiation , Cell Line , Chromatin/metabolism , Chromatin Immunoprecipitation , Erythroblasts/cytology , Erythroid Cells/cytology , GATA1 Transcription Factor/pharmacology , Oligonucleotide Array Sequence Analysis , RNA, Messenger/genetics
19.
BMC Bioinformatics ; 12 Suppl 1: S45, 2011 Feb 15.
Article in English | MEDLINE | ID: mdl-21342577

ABSTRACT

BACKGROUND: Gene clusters are genetically important, but their analysis poses significant computational challenges. One of the major reasons for these difficulties is gene conversion among the duplicated regions of the cluster, which can obscure their true relationships. Many computational methods for detecting gene conversion events have been released, but their performance has not been assessed for wide deployment in evolutionary history studies due to a lack of accurate evaluation methods. RESULTS: We designed a new method that simulates gene cluster evolution, including large-scale events of duplication, deletion, and conversion as well as small mutations. We used this simulation data to evaluate several different programs for detecting gene conversion events. CONCLUSIONS: Our evaluation identifies strengths and weaknesses of several methods for detecting gene conversion, which can contribute to more accurate analysis of gene cluster evolution.


Subject(s)
Computational Biology/methods , Gene Conversion , Multigene Family , Animals , Biological Evolution , Computer Simulation , Humans , Primates/genetics , Sequence Alignment
20.
BMC Evol Biol ; 11: 226, 2011 Jul 28.
Article in English | MEDLINE | ID: mdl-21798034

ABSTRACT

BACKGROUND: Gene clusters containing multiple similar genomic regions in close proximity are of great interest for biomedical studies because of their associations with inherited diseases. However, such regions are difficult to analyze due to their structural complexity and their complicated evolutionary histories, reflecting a variety of large-scale mutational events. In particular, conversion events can mislead inferences about the relationships among these regions, as traced by traditional methods such as construction of phylogenetic trees or multi-species alignments. RESULTS: To correct the distorted information generated by such methods, we have developed an automated pipeline called CHAP (Cluster History Analysis Package) for detecting conversion events. We used this pipeline to analyze the conversion events that affected two well-studied gene clusters (α-globin and β-globin) and three gene clusters for which comparative sequence data were generated from seven primate species: CCL (chemokine ligand), IFN (interferon), and CYP2abf (part of cytochrome P450 family 2). CHAP is freely available at http://www.bx.psu.edu/miller_lab. CONCLUSIONS: These studies reveal the value of characterizing conversion events in the context of studying gene clusters in complex genomes.


Subject(s)
Gene Conversion , Multigene Family , Primates/genetics , alpha-Globins/genetics , beta-Globins/genetics , Animals , Evolution, Molecular , Genome , Humans , Molecular Sequence Data , Phylogeny , Primates/classification , Software