Results 1 - 20 of 29
1.
BMC Biol ; 21(1): 32, 2023 02 13.
Article in English | MEDLINE | ID: mdl-36782149

ABSTRACT

BACKGROUND: Sex determination occurs across animal species, but most of our knowledge about its mechanisms comes from only a handful of bilaterian taxa. This limits our ability to infer the evolutionary history of sex determination within animals. RESULTS: In this study, we generated a linkage map of the genome of the colonial cnidarian Hydractinia symbiolongicarpus and used it to demonstrate that this species has an XX/XY sex determination system. We demonstrate that the X and Y chromosomes have pseudoautosomal and non-recombining regions. We then use the linkage map and a method based on the depth of sequencing coverage to identify genes encoded in the non-recombining region and show that many of them have male gonad-specific expression. In addition, we demonstrate that recombination rates are enhanced in the female genome and that the haploid chromosome number in Hydractinia is n = 15. CONCLUSIONS: These findings establish Hydractinia as a tractable non-bilaterian model system for the study of sex determination and the evolution of sex chromosomes.


Subject(s)
Hydrozoa , Sex Chromosomes , Male , Female , Animals , Sex Chromosomes/genetics , Chromosome Mapping , Y Chromosome/genetics , Hydrozoa/genetics , Evolution, Molecular
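The abstract above mentions a method based on depth of sequencing coverage for finding genes in the non-recombining region. A minimal sketch of that general idea follows, assuming an XX/XY system in which male-hemizygous windows show roughly half the normalized depth in males and male-specific windows show almost no depth in females. The function name, window labels, tolerance, and depth values are hypothetical illustrations, not the authors' pipeline.

```python
from statistics import median


def classify_by_depth(male_depth, female_depth, tol=0.2):
    """Label windows from male vs. female read depth (illustrative only).

    male_depth / female_depth: dicts mapping window name -> mean read depth.
    Each sample is scaled by its own median depth, which assumes that most
    windows are autosomal, so a typical window has normalized depth near 1.0.
    """
    m_med = median(male_depth.values())
    f_med = median(female_depth.values())
    calls = {}
    for window, m_raw in male_depth.items():
        m = m_raw / m_med
        f = female_depth[window] / f_med
        if f < tol and m >= tol:
            # Reads in the male but essentially none in the female.
            calls[window] = "Y-like"
        elif abs(m - 0.5) <= tol and abs(f - 1.0) <= tol:
            # About half depth in the male, full depth in the female.
            calls[window] = "X-like"
        else:
            calls[window] = "autosomal-like"
    return calls


# Hypothetical mean depths per window for one male and one female sample.
male = {"w1": 30, "w2": 31, "w3": 15, "w4": 14, "w5": 29}
female = {"w1": 30, "w2": 29, "w3": 31, "w4": 0, "w5": 30}
calls = classify_by_depth(male, female)
print(calls)
```

In practice the thresholds would need calibration against known autosomal regions, and a linkage map (as used in the study) offers independent confirmation.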
2.
Am J Hum Genet ; 110(1): 3-12, 2023 01 05.
Article in English | MEDLINE | ID: mdl-36608682

ABSTRACT

Although genomic research has predominantly relied on phenotypic ascertainment of individuals affected with heritable disease, the falling costs of sequencing allow consideration of genomic ascertainment and reverse phenotyping (the ascertainment of individuals with specific genomic variants and subsequent evaluation of physical characteristics). In this research modality, the scientific question is inverted: investigators gather individuals with a genomic variant and test the hypothesis that there is an associated phenotype via targeted phenotypic evaluations. Genomic ascertainment research is thus a model of predictive genomic medicine and genomic screening. Here, we provide our experience implementing this research method. We describe the infrastructure we developed to perform reverse phenotyping studies, including aggregating a super-cohort of sequenced individuals who consented to recontact for genomic ascertainment research. We assessed 13 studies completed at the National Institutes of Health (NIH) that piloted our reverse phenotyping approach. The studies can be broadly categorized as (1) facilitating novel genotype-disease associations, (2) expanding the phenotypic spectra, or (3) demonstrating ex vivo functional mechanisms of disease. We highlight three examples of reverse phenotyping studies in detail and describe how using a targeted reverse phenotyping approach (as opposed to phenotypic ascertainment or clinical informatics approaches) was crucial to the conclusions reached. Finally, we propose a framework and address challenges to building collaborative genomic ascertainment research programs at other institutions. Our goal is for more researchers to take advantage of this approach, which will expand our understanding of the predictive capability of genomic medicine and increase the opportunity to mitigate genomic disease.


Subject(s)
Genome , Medical Informatics , Phenotype , Genotype , Genomics/methods
3.
Front Immunol ; 13: 941839, 2022.
Article in English | MEDLINE | ID: mdl-36466872

ABSTRACT

Rationale: Previous studies identified an interaction between HLA and oral peanut exposure. HLA-DQA1*01:02 had a protective role with the induction of Ara h 2 epitope-specific IgG4 associated with peanut consumption during the LEAP clinical trial for prevention of peanut allergy, while it was a risk allele for peanut allergy in the peanut avoidance group. We have now evaluated this gene-environment interaction in two subsequent peanut oral immunotherapy (OIT) trials - IMPACT and POISED - to better understand the potential for the HLA-DQA1*01:02 allele as an indicator of higher likelihood of desensitization, sustained unresponsiveness, and peanut allergy remission. Methods: We determined HLA-DQA1*01:02 carrier status using genome sequencing from POISED (N=118, age: 7-55yr) and IMPACT (N=126, age: 12-<48mo). We tested for association with remission, sustained unresponsiveness (SU), and desensitization in the OIT groups, as well as peanut component-specific IgG4 (psIgG4) using generalized linear models and adjusting for relevant covariates and ancestry. Results: While not quite statistically significant, a higher proportion of HLA-DQA1*01:02 carriers receiving OIT in IMPACT were desensitized (93%) compared to non-carriers (78%); odds ratio (OR)=5.74 (p=0.06). In this sample we also observed that a higher proportion of carriers achieved remission (35%) compared to non-carriers (22%); OR=1.26 (p=0.80). In POISED, carriers more frequently attained continued desensitization (80% versus 61% among non-carriers; OR=1.28, p=0.86) and achieved SU (52% versus 31%; OR=2.32, p=0.19). psIgG4 associations with HLA-DQA1*01:02 in the OIT arm of IMPACT, which included younger study subjects, recapitulated patterns noted in LEAP, but no associations of note were observed in the older POISED study subjects. Conclusions: Findings across three clinical trials show a pattern of a gene-environment interaction between HLA and oral peanut exposure. Age and prior sensitization are additional determinants of outcomes, consistent with a mechanism of restricted antigen recognition fundamental to driving protective immune responses to OIT.


Subject(s)
Arachis , Peanut Hypersensitivity , Adolescent , Adult , Child , Humans , Middle Aged , Young Adult , Immunoglobulin G , Immunologic Factors , Immunotherapy , Peanut Hypersensitivity/genetics , Peanut Hypersensitivity/therapy , Clinical Trials as Topic
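For readers comparing the percentages and odds ratios quoted in the abstract above, the sketch below computes a crude (unadjusted) odds ratio from two proportions. The abstract states that its estimates come from generalized linear models adjusted for covariates and ancestry, so the crude values produced here should not be expected to match the reported ORs. The function name is a hypothetical helper.

```python
def crude_odds_ratio(p_carriers, p_noncarriers):
    """Unadjusted odds ratio from the outcome proportion in each group.

    odds = p / (1 - p); the odds ratio is the carrier odds divided by
    the non-carrier odds. Proportions must lie strictly between 0 and 1.
    """
    odds_carriers = p_carriers / (1.0 - p_carriers)
    odds_noncarriers = p_noncarriers / (1.0 - p_noncarriers)
    return odds_carriers / odds_noncarriers


# Proportions quoted in the abstract (carriers vs. non-carriers).
comparisons = {
    "IMPACT desensitization": (0.93, 0.78),
    "IMPACT remission": (0.35, 0.22),
    "POISED desensitization": (0.80, 0.61),
    "POISED sustained unresponsiveness": (0.52, 0.31),
}
for label, (p1, p0) in comparisons.items():
    print(f"{label}: crude OR = {crude_odds_ratio(p1, p0):.2f}")
```

The gap between a crude value and the reported adjusted value indicates how much the covariate adjustment, and rounding of the published percentages, moves the estimate.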
4.
Genes (Basel) ; 13(7), 2022 07 18.
Article in English | MEDLINE | ID: mdl-35886053

ABSTRACT

The Hawaiian monk seal (HMS) is the single extant species of tropical earless seals of the genus Neomonachus. The species survived a severe bottleneck in the late 19th century and experienced subsequent population declines until becoming the subject of a NOAA-led species recovery effort beginning in 1976 when the population was fewer than 1000 animals. Like other recovering species, the Hawaiian monk seal has been reported to have reduced genetic heterogeneity due to the bottleneck and subsequent inbreeding. Here, we report a chromosomal reference assembly for a male animal produced using a variety of methods. The final assembly consisted of 16 autosomes, an X, and portions of the Y chromosomes. We compared variants in this animal to other HMS and to a frequently sequenced human sample, confirming about 12% of the variation seen in man. To confirm that the reference animal was representative of the HMS, we compared his sequence to that of 10 other individuals and noted similarly low variation in all. Variation in the major histocompatibility (MHC) genes was nearly absent compared to the orthologous human loci. Demographic analysis predicts that Hawaiian monk seals have had a long history of small populations preceding the bottleneck, and their current low levels of heterozygosity may indicate specialization to a stable environment. When we compared our reference assembly to that of other species, we observed significant conservation of chromosomal architecture with other pinnipeds, especially other phocids. This reference should be a useful tool for future evolutionary studies as well as the long-term management of this species.


Subject(s)
Seals, Earless , Animals , Chromosomes , Genomic Instability , Hawaii/epidemiology , Humans , Male , Seals, Earless/genetics
5.
Genes (Basel) ; 13(5), 2022 05 03.
Article in English | MEDLINE | ID: mdl-35627201

ABSTRACT

Craniosynostosis (CS) is a major birth defect in which one or more skull sutures fuse prematurely. We previously performed a genome-wide association study (GWAS) for sagittal non-syndromic CS (sNCS), identifying associations downstream from BMP2 on 20p12.3 and intronic to BBS9 on 7p14.3; analyses of imputed variants in DLG1 on 3q29 were also genome-wide significant. We followed this work with a GWAS for metopic non-syndromic CS (mNCS), discovering a significant association intronic to BMP7 on 20q13.31. In the current study, we sequenced the associated regions on 3q29, 7p14.3, and 20p12.3, including two candidate genes (BMP2 and BMPER) near some of these regions in 83 sNCS child-parent trios, and sequenced regions on 7p14.3 and 20q13.2-q13.32 in 80 mNCS child-parent trios. These child-parent trios were selected from the original GWAS cohorts if the probands carried at least one copy of the top associated GWAS variant (rs1884302 C allele for sNCS; rs6127972 T allele for mNCS). Many of the variants sequenced in these targeted regions are strongly predicted to be within binding sites for transcription factors involved in craniofacial development or bone morphogenesis. Variants enriched in more than one trio and predicted to be damaging to gene function are prioritized for functional studies.


Subject(s)
Craniosynostoses , Genome-Wide Association Study , Alleles , Carrier Proteins/genetics , Craniosynostoses/genetics , Humans
6.
Cell Syst ; 9(6): 609-613.e3, 2019 12 18.
Article in English | MEDLINE | ID: mdl-31812694

ABSTRACT

The decreasing cost of DNA sequencing over the past decade has led to an explosion of sequencing datasets, leaving us with petabytes of data to analyze. However, current sequencing visualization tools are designed to run on single machines, which limits their scalability and interactivity on modern genomic datasets. Here, we leverage the scalability of Apache Spark to provide Mango, consisting of a Jupyter notebook and genome browser, which removes scalability and interactivity constraints by leveraging multi-node compute clusters to allow interactive analysis over terabytes of sequencing data. We demonstrate scalability of the Mango tools by performing quality control analyses on 10 terabytes of 100 high-coverage sequencing samples from the Simons Genome Diversity Project, enabling capability for interactive genomic exploration of multi-sample datasets that surpass the computational limitations of single-node visualization tools. Mango is freely available for download with full documentation at https://bdg-mango.readthedocs.io/en/latest/.


Subject(s)
Genomics/methods , Sequence Analysis, DNA/methods , Algorithms , Big Data , Data Analysis , Genome/genetics , High-Throughput Nucleotide Sequencing/methods , Software
7.
F1000Res ; 6: 1795, 2017.
Article in English | MEDLINE | ID: mdl-29123647

ABSTRACT

The impact of structural variants (SVs) on a variety of organisms and diseases like cancer has become increasingly evident. Methods for SV detection when studying genomic differences across cells, individuals or populations are being actively developed. Currently, just a few methods are available to compare different SV callsets, and no specialized methods are available to annotate SVs that account for the unique characteristics of these variant types. Here, we introduce SURVIVOR_ant, a tool that compares types and breakpoints for candidate SVs from different callsets and enables fast comparison of SVs to genomic features such as genes and repetitive regions, as well as to previously established SV datasets such as from the 1000 Genomes Project. As proof of concept we compared 16 SV callsets generated by different SV calling methods on a single genome, the Genome in a Bottle sample HG002 (Ashkenazi son), and annotated the SVs with gene annotations, 1000 Genomes Project SV calls, and four different types of repetitive regions. Computation time to annotate 134,528 SVs with 33,954 annotations was 22 seconds on a laptop.
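
The abstract describes comparing SV types and breakpoints across callsets and relating SVs to genomic features. The sketch below illustrates one simple way such a comparison could work: two calls count as the same event when type and chromosome agree and both breakpoints fall within a distance tolerance. The class, function names, and the 1,000 bp tolerance are assumptions for illustration and do not reproduce SURVIVOR_ant's actual algorithm or options.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SV:
    chrom: str
    start: int
    end: int
    svtype: str  # e.g. "DEL", "DUP", "INV"


def same_event(a, b, max_dist=1000):
    """True if two calls agree on chromosome, type, and both breakpoints
    to within max_dist base pairs (hypothetical tolerance)."""
    return (
        a.chrom == b.chrom
        and a.svtype == b.svtype
        and abs(a.start - b.start) <= max_dist
        and abs(a.end - b.end) <= max_dist
    )


def overlapping_features(sv, features):
    """Names of features whose interval overlaps the SV interval.

    features: iterable of (name, chrom, start, end) tuples.
    """
    return [
        name
        for name, chrom, start, end in features
        if chrom == sv.chrom and start <= sv.end and end >= sv.start
    ]


# Hypothetical calls from two callers and two made-up annotations.
call_a = SV("chr1", 10_000, 15_000, "DEL")
call_b = SV("chr1", 10_200, 15_100, "DEL")
call_c = SV("chr1", 10_200, 15_100, "DUP")
genes = [("GENE_A", "chr1", 14_000, 20_000), ("GENE_B", "chr2", 14_000, 20_000)]

print(same_event(call_a, call_b))
print(same_event(call_a, call_c))
print(overlapping_features(call_a, genes))
```

A linear scan like this is fine for small inputs; annotating very large numbers of SVs quickly would normally rely on sorted intervals or an interval tree.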

9.
Am J Hum Genet ; 100(5): 695-705, 2017 May 04.
Article in English | MEDLINE | ID: mdl-28475856

ABSTRACT

Provision of a molecularly confirmed diagnosis in a timely manner for children and adults with rare genetic diseases shortens their "diagnostic odyssey," improves disease management, and fosters genetic counseling with respect to recurrence risks while assuring reproductive choices. In a general clinical genetics setting, the current diagnostic rate is approximately 50%, but for those who do not receive a molecular diagnosis after the initial genetics evaluation, that rate is much lower. Diagnostic success for these more challenging affected individuals depends to a large extent on progress in the discovery of genes associated with, and mechanisms underlying, rare diseases. Thus, continued research is required for moving toward a more complete catalog of disease-related genes and variants. The International Rare Diseases Research Consortium (IRDiRC) was established in 2011 to bring together researchers and organizations invested in rare disease research to develop a means of achieving molecular diagnosis for all rare diseases. Here, we review the current and future bottlenecks to gene discovery and suggest strategies for enabling progress in this regard. Each successful discovery will define potential diagnostic, preventive, and therapeutic opportunities for the corresponding rare disease, enabling precision medicine for this patient population.


Subject(s)
International Cooperation , Rare Diseases/diagnosis , Rare Diseases/genetics , Databases, Factual , Exome , Genome, Human , Humans
10.
Nucleic Acids Res ; 45(D1): D985-D994, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899665

ABSTRACT

We have designed and developed a data integration and visualization platform that provides evidence about the association of known and potential drug targets with diseases. The platform is designed to support identification and prioritization of biological targets for follow-up. Each drug target is linked to a disease using integrated genome-wide data from a broad range of data sources. The platform provides either a target-centric workflow to identify diseases that may be associated with a specific target, or a disease-centric workflow to identify targets that may be associated with a specific disease. Users can easily transition between these target- and disease-centric workflows. The Open Targets Validation Platform is accessible at https://www.targetvalidation.org.


Subject(s)
Computational Biology/methods , Molecular Targeted Therapy , Search Engine , Software , Databases, Factual , Humans , Molecular Targeted Therapy/methods , Reproducibility of Results , Web Browser , Workflow
11.
Hum Mutat ; 36(10): 915-21, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26295439

ABSTRACT

There are few better examples of the need for data sharing than in the rare disease community, where patients, physicians, and researchers must search for "the needle in a haystack" to uncover rare, novel causes of disease within the genome. Impeding the pace of discovery has been the existence of many small siloed datasets within individual research or clinical laboratory databases and/or disease-specific organizations, hoping for serendipitous occasions when two distant investigators happen to learn they have a rare phenotype in common and can "match" these cases to build evidence for causality. However, serendipity has never proven to be a reliable or scalable approach in science. As such, the Matchmaker Exchange (MME) was launched to provide a robust and systematic approach to rare disease gene discovery through the creation of a federated network connecting databases of genotypes and rare phenotypes using a common application programming interface (API). The core building blocks of the MME have been defined and assembled. Three MME services have now been connected through the API and are available for community use. Additional databases that support internal matching are anticipated to join the MME network as it continues to grow.


Subject(s)
Genetic Predisposition to Disease/genetics , Information Dissemination/methods , Rare Diseases/genetics , Database Management Systems , Databases, Genetic , Genetic Association Studies , Humans , Software
13.
J Gen Intern Med ; 29 Suppl 3: S780-7, 2014 Aug.
Article in English | MEDLINE | ID: mdl-25029978

ABSTRACT

Research into rare diseases is typically fragmented by data type and disease. Individual efforts often have poor interoperability and do not systematically connect data across clinical phenotype, genomic data, biomaterial availability, and research/trial data sets. Such data must be linked at both an individual-patient and whole-cohort level to enable researchers to gain a complete view of their disease and patient population of interest. Data access and authorization procedures are required to allow researchers in multiple institutions to securely compare results and gain new insights. Funded by the European Union's Seventh Framework Programme under the International Rare Diseases Research Consortium (IRDiRC), RD-Connect is a global infrastructure project initiated in November 2012 that links genomic data with registries, biobanks, and clinical bioinformatics tools to produce a central research resource for rare diseases.


Subject(s)
Biological Specimen Banks , Computational Biology , Databases, Factual , Health Information Exchange , Rare Diseases , Registries , Humans
14.
Nucleic Acids Res ; 41(Database issue): D936-41, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23193291

ABSTRACT

Much has changed in the last two years at DGVa (http://www.ebi.ac.uk/dgva) and dbVar (http://www.ncbi.nlm.nih.gov/dbvar). We are now processing direct submissions rather than only curating data from the literature and our joint study catalog includes data from over 100 studies in 11 organisms. Studies from human dominate with data from control and case populations, tumor samples as well as three large curated studies derived from multiple sources. During the processing of these data, we have made improvements to our data model, submission process and data representation. Additionally, we have made significant improvements in providing access to these data via web and FTP interfaces.


Subject(s)
Databases, Nucleic Acid , Genomic Structural Variation , Genotype , Humans , Internet , Phenotype
15.
Genet Epidemiol ; 35(8): 887-98, 2011 Dec.
Article in English | MEDLINE | ID: mdl-22125226

ABSTRACT

Genome-wide association studies (GWAS) are a useful approach in the study of the genetic components of complex phenotypes. Aside from large cohorts, GWAS have generally been limited to the study of one or a few diseases or traits. The emergence of biobanks linked to electronic medical records (EMRs) allows the efficient reuse of genetic data to yield meaningful genotype-phenotype associations for multiple phenotypes or traits. Phase I of the electronic MEdical Records and GEnomics (eMERGE-I) Network is a National Human Genome Research Institute-supported consortium composed of five sites to perform various genetic association studies using DNA repositories and EMR systems. Each eMERGE site has developed EMR-based algorithms that define a core set of 14 phenotypes for extraction of study samples from each site's DNA repository. Each eMERGE site selected samples for a specific phenotype, and these samples were genotyped at either the Broad Institute or at the Center for Inherited Disease Research using the Illumina Infinium BeadChip technology. In all, approximately 17,000 samples from across the five sites were genotyped. A unified quality control (QC) pipeline was developed by the eMERGE Genomics Working Group and used to ensure thorough cleaning of the data. This process includes examination of sample and marker quality and various batch effects. Upon completion of the genotyping and QC analyses for each site's primary study, the eMERGE Coordinating Center merged the datasets from all five sites. This larger merged dataset reentered the established eMERGE QC pipeline. Based on lessons learned during the process, additional analyses and QC checkpoints were added to the pipeline to ensure proper merging. Here, we explore the challenges associated with combining datasets from different genotyping centers and describe the expansion of the eMERGE QC pipeline for merged datasets. These additional steps will be useful as the eMERGE project expands to include additional sites in eMERGE-II, and also serve as a starting point for investigators merging multiple genotype datasets accessible through the National Center for Biotechnology Information in the database of Genotypes and Phenotypes. Our experience demonstrates that merging multiple datasets after additional QC can be an efficient use of genotype data despite new challenges that appear in the process.


Subject(s)
Electronic Health Records , Genome-Wide Association Study/standards , Quality Control , Algorithms , Genotype , Humans , National Human Genome Research Institute (U.S.) , Phenotype , United States
16.
Nat Rev Genet ; 12(10): 730-6, 2011 09 16.
Article in English | MEDLINE | ID: mdl-21921928

ABSTRACT

Access to genetic data across studies is an important aspect of identifying new genetic associations through genome-wide association studies (GWASs). Meta-analysis across multiple GWASs with combined cohort sizes of tens of thousands of individuals often uncovers many more genome-wide associated loci than the original individual studies; this emphasizes the importance of tools and mechanisms for data sharing. However, even sharing summary-level data, such as allele frequencies, inherently carries some degree of privacy risk to study participants. Here we discuss mechanisms and resources for sharing data from GWASs, particularly focusing on approaches for assessing and quantifying the privacy risks to participants that result from the sharing of summary-level data.


Subject(s)
Data Collection , Genetic Variation , Genome-Wide Association Study , Information Dissemination/methods , Cohort Studies , Confidentiality , Data Collection/legislation & jurisprudence , Databases, Genetic , Genetic Variation/physiology , Genome-Wide Association Study/methods , Genome-Wide Association Study/statistics & numerical data , Humans , Information Dissemination/legislation & jurisprudence , Meta-Analysis as Topic , Polymorphism, Single Nucleotide , Risk Assessment
17.
Genet Med ; 13(9): 777-84, 2011 Sep.
Article in English | MEDLINE | ID: mdl-21844811

ABSTRACT

PURPOSE: Copy number variants have emerged as a major cause of human disease such as autism and intellectual disabilities. Because copy number variants are common in normal individuals, determining the functional and clinical significance of rare copy number variants in patients remains challenging. The adoption of whole-genome chromosomal microarray analysis as a first-tier diagnostic test for individuals with unexplained developmental disabilities provides a unique opportunity to obtain large copy number variant datasets generated through routine patient care. METHODS: A consortium of diagnostic laboratories was established (the International Standards for Cytogenomic Arrays consortium) to share copy number variant and phenotypic data in a central, public database. We present the largest copy number variant case-control study to date comprising 15,749 International Standards for Cytogenomic Arrays cases and 10,118 published controls, focusing our initial analysis on recurrent deletions and duplications involving 14 copy number variant regions. RESULTS: Compared with controls, 14 deletions and seven duplications were significantly overrepresented in cases, providing a clinical diagnosis as pathogenic. CONCLUSION: Given the rapid expansion of clinical chromosomal microarray analysis testing, very large datasets will be available to determine the functional significance of increasingly rare copy number variants. This data will provide an evidence-based guide to clinicians across many disciplines involved in the diagnosis, management, and care of these patients and their families.


Subject(s)
DNA Copy Number Variations , Developmental Disabilities/genetics , Evidence-Based Medicine/methods , Intellectual Disability/genetics , Cytogenetic Analysis , Gene Dosage , Genome, Human , Humans
18.
Curr Protoc Hum Genet ; Chapter 1: Unit1.19, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21234875

ABSTRACT

Genome-wide association studies (GWAS) are being conducted at an unprecedented rate in population-based cohorts and have increased our understanding of the pathophysiology of complex disease. Regardless of context, the practical utility of this information will ultimately depend upon the quality of the original data. Quality control (QC) procedures for GWAS are computationally intensive, operationally challenging, and constantly evolving. Here we enumerate some of the challenges in QC of GWAS data and describe the approaches that the electronic MEdical Records and Genomics (eMERGE) network is using for quality assurance in GWAS data, thereby minimizing potential bias and error in GWAS results. We discuss common issues associated with QC of GWAS data, including data file formats, software packages for data manipulation and analysis, sex chromosome anomalies, sample identity, sample relatedness, population substructure, batch effects, and marker quality. We propose best practices and discuss areas of ongoing and future research.


Subject(s)
Genome-Wide Association Study/standards , Software , Electronic Health Records , Genome-Wide Association Study/methods , Genomics , Genotype , Humans , Phenotype , Quality Control
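The abstract above lists sample and marker quality among the common QC issues for GWAS data. As a minimal sketch of one widely used check, the code below filters samples and then markers by call rate (the fraction of non-missing genotypes). The abstract does not specify metrics or thresholds, so the function names, the data layout, and the cutoffs are all assumptions for illustration, not the eMERGE pipeline.

```python
def call_rate(calls):
    """Fraction of genotype calls that are not missing (None = missing)."""
    return sum(1 for g in calls if g is not None) / len(calls)


def qc_call_rates(genotypes, sample_min=0.98, marker_min=0.98):
    """Filter samples, then markers, by call rate.

    genotypes: dict mapping sample ID -> list of genotype codes
    (0, 1, 2, or None), with markers in the same order for every sample.
    Returns (kept sample IDs, kept marker indices). Marker call rates are
    computed only over the samples that passed the sample filter.
    """
    kept_samples = [
        s for s, calls in genotypes.items() if call_rate(calls) >= sample_min
    ]
    n_markers = len(next(iter(genotypes.values())))
    kept_markers = [
        j
        for j in range(n_markers)
        if kept_samples
        and call_rate([genotypes[s][j] for s in kept_samples]) >= marker_min
    ]
    return kept_samples, kept_markers


# Tiny hypothetical dataset; thresholds lowered so the toy example is readable.
toy = {
    "s1": [0, 1, 2, None],
    "s2": [0, 0, 1, None],
    "s3": [None, None, None, 1],
    "s4": [1, 1, 0, 2],
}
samples, markers = qc_call_rates(toy, sample_min=0.75, marker_min=0.75)
print(samples, markers)
```

Filtering samples before markers keeps a few poor-quality samples from causing otherwise good markers to be dropped; real pipelines add the other checks the abstract names, such as sex concordance, relatedness, population substructure, and batch effects.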
20.
Nat Genet ; 42(9): 781-5, 2010 Sep.
Article in English | MEDLINE | ID: mdl-20711177

ABSTRACT

Parkinson's disease is a common disorder that leads to motor and cognitive disability. We performed a genome-wide association study of 2,000 individuals with Parkinson's disease (cases) and 1,986 unaffected controls from the NeuroGenetics Research Consortium (NGRC). We confirmed associations with SNCA and MAPT, replicated an association with GAK (using data from the NGRC and a previous study, P = 3.2 × 10^-9) and detected a new association with the HLA region (using data from the NGRC only, P = 2.9 × 10^-8), which replicated in two datasets (meta-analysis P = 1.9 × 10^-10). The HLA association was uniform across all genetic and environmental risk strata and was strong in sporadic (P = 5.5 × 10^-10) and late-onset (P = 2.4 × 10^-8) disease. The association peak we found was at rs3129882, a noncoding variant in HLA-DRA. Two studies have previously suggested that rs3129882 influences expression of HLA-DR and HLA-DQ. The brains of individuals with Parkinson's disease show upregulation of DR antigens and the presence of DR-positive reactive microglia, and nonsteroidal anti-inflammatory drugs reduce Parkinson's disease risk. The genetic association with HLA supports the involvement of the immune system in Parkinson's disease and offers new targets for drug development.


Subject(s)
HLA Antigens/genetics , Parkinson Disease/genetics , Adult , Age of Onset , Aged , Aged, 80 and over , Case-Control Studies , Female , Genetic Linkage , Genetic Predisposition to Disease , Genetic Variation/physiology , Genome-Wide Association Study , Humans , Male , Meta-Analysis as Topic , Middle Aged , Odds Ratio , Parkinson Disease/epidemiology , Young Adult