1.
Am J Hum Genet ; 108(4): 597-607, 2021 04 01.
Article in English | MEDLINE | ID: mdl-33675682

ABSTRACT

Each human genome includes de novo mutations that arose during gametogenesis. While these germline mutations represent a fundamental source of new genetic diversity, they can also create deleterious alleles that impact fitness. Whereas the rate and patterns of point mutations in the human germline are now well understood, far less is known about the frequency and features that impact de novo structural variants (dnSVs). We report a family-based study of germline mutations among 9,599 human genomes from 33 multigenerational CEPH-Utah families and 2,384 families from the Simons Foundation Autism Research Initiative. We find that de novo structural mutations detected by alignment-based, short-read WGS occur at an overall rate of at least 0.160 events per genome in unaffected individuals, and we observe a significantly higher rate (0.206 per genome) in ASD-affected individuals. In both probands and unaffected samples, nearly 73% of de novo structural mutations arose in paternal gametes, and we predict most de novo structural mutations to be caused by mutational mechanisms that do not require sequence homology. After multiple testing correction, we did not observe a statistically significant correlation between parental age and the rate of de novo structural variation in offspring. These results highlight that a spectrum of mutational mechanisms contribute to germline structural mutations and that these mechanisms most likely have markedly different rates and selective pressures than those leading to point mutations.


Subject(s)
Family , Genome, Human/genetics , Germ Cells , Germ-Line Mutation/genetics , Mutation Rate , Aging/genetics , Autistic Disorder/genetics , Bias , DNA Copy Number Variations/genetics , DNA Mutational Analysis , Female , Humans , Male , Paternal Age , Point Mutation/genetics
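Worked arithmetic on the two rates reported in the abstract above (0.160 and 0.206 de novo structural variants per genome). This is a minimal sketch, not part of the study: the function names are hypothetical, and the Poisson model used in the last function is an illustrative assumption that the abstract does not state.

```python
import math

# Rates reported in the abstract (events per genome).
RATE_UNAFFECTED = 0.160
RATE_ASD = 0.206


def rate_ratio(rate_a: float, rate_b: float) -> float:
    """Ratio of two per-genome event rates."""
    return rate_a / rate_b


def genomes_per_event(rate: float) -> float:
    """Average number of genomes sequenced per observed event."""
    return 1.0 / rate


def prob_at_least_one(rate: float) -> float:
    """P(at least one event in a genome), ASSUMING events are Poisson
    distributed. The Poisson model is an assumption made here for
    illustration only."""
    return 1.0 - math.exp(-rate)


if __name__ == "__main__":
    print(f"ASD / unaffected rate ratio: {rate_ratio(RATE_ASD, RATE_UNAFFECTED):.3f}")
    print(f"Genomes per dnSV (unaffected): {genomes_per_event(RATE_UNAFFECTED):.2f}")
    print(f"P(>=1 dnSV), Poisson assumption: {prob_at_least_one(RATE_UNAFFECTED):.3f}")
```

Under these assumptions, the unaffected rate corresponds to roughly one detected dnSV per six genomes, and the ASD rate is about 1.29 times the unaffected rate.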
2.
Bioinformatics ; 37(24): 4860-4861, 2021 12 11.
Article in English | MEDLINE | ID: mdl-34146087

ABSTRACT

SUMMARY: Unfazed is a command-line tool to determine the parental gamete of origin for de novo mutations from paired-end Illumina DNA sequencing reads. Unfazed uses variant information for a sequenced trio to identify the parental gamete of origin by linking phase-informative inherited variants to de novo mutations using read-based phasing. It achieves a high success rate by chaining reads into haplotype groups, thus increasing the search space for informative sites. Unfazed provides a simple command-line interface and scales well to large inputs, determining parent-of-origin for nearly 30 000 de novo variants in under 60 h. AVAILABILITY AND IMPLEMENTATION: Unfazed is available at https://github.com/jbelyeu/unfazed. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Algorithms , Software , Sequence Analysis, DNA , Haplotypes , High-Throughput Nucleotide Sequencing
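The read-based phasing idea described in the Unfazed abstract above can be illustrated with a toy sketch. This is not Unfazed's implementation or API: the function names, the genotype encoding (tuples of 0/1 alleles), and the read representation (a dict from position to observed allele) are all hypothetical, and the sketch omits the chaining of reads into haplotype groups that the abstract mentions.

```python
from collections import Counter


def allele_origins(child_gt, father_gt, mother_gt):
    """For a heterozygous child site, return {allele: parent} when the
    parental origin of each allele is unambiguous, else None.
    Genotypes are tuples of alleles, e.g. (0, 1). Hypothetical helper."""
    a, b = child_gt
    if a == b:
        return None  # homozygous sites carry no phase information
    options = []
    if a in father_gt and b in mother_gt:
        options.append({a: "father", b: "mother"})
    if b in father_gt and a in mother_gt:
        options.append({b: "father", a: "mother"})
    # Informative only if exactly one transmission pattern is possible.
    return options[0] if len(options) == 1 else None


def phase_de_novo(reads, dn_pos, inf_pos, origins):
    """Vote on the parental haplotype carrying a heterozygous de novo
    allele (encoded as 1) using reads that span both the de novo site
    and one informative inherited site. Each read is a dict mapping
    position -> observed allele. Returns 'father', 'mother', or None."""
    votes = Counter()
    for read in reads:
        if dn_pos not in read or inf_pos not in read:
            continue  # read does not link the two sites
        parent = origins.get(read[inf_pos])
        if parent is None:
            continue  # unexpected allele, e.g. a sequencing error
        if read[dn_pos] == 1:
            votes[parent] += 1  # de novo allele lies on this haplotype
        else:
            # Reference allele here implies the de novo allele is on
            # the other parent's haplotype.
            votes["mother" if parent == "father" else "father"] += 1
    if not votes:
        return None
    ranked = votes.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # tie: no confident call
    return ranked[0][0]


if __name__ == "__main__":
    # Father is 0/0 and mother is 0/1, so the child's allele 1 came from the mother.
    origins = allele_origins((0, 1), (0, 0), (0, 1))
    reads = [{100: 1, 150: 0}, {100: 1, 150: 0}, {100: 0, 150: 1}]
    print(phase_de_novo(reads, dn_pos=100, inf_pos=150, origins=origins))
```

In this toy input every linking read supports the paternal haplotype. A real tool also has to handle read pairs, base qualities, and informative sites that lie beyond a single read's span, which is where chaining reads into haplotype groups helps.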
3.
PLoS Comput Biol ; 16(1): e1007625, 2020 01.
Article in English | MEDLINE | ID: mdl-32004313

ABSTRACT

Ribosome profiling, an application of nucleic acid sequencing for monitoring ribosome activity, has revolutionized our understanding of protein translation dynamics. This technique has been available for a decade, yet the current state and standardization of publicly available computational tools for these data is bleak. We introduce XPRESSyourself, an analytical toolkit that eliminates barriers and bottlenecks associated with this specialized data type by filling gaps in the computational toolset for both experts and non-experts of ribosome profiling. XPRESSyourself automates and standardizes analysis procedures, decreasing time-to-discovery and increasing reproducibility. This toolkit acts as a reference implementation of current best practices in ribosome profiling analysis. We demonstrate this toolkit's performance on publicly available ribosome profiling data by rapidly identifying hypothetical mechanisms related to neurodegenerative phenotypes and neuroprotective mechanisms of the small-molecule ISRIB during acute cellular stress. XPRESSyourself brings robust, rapid analysis of ribosome-profiling data to a broad and ever-expanding audience and will lead to more reproducible and accessible measurements of translation regulation. XPRESSyourself software is perpetually open-source under the GPL-3.0 license and is hosted at https://github.com/XPRESSyourself, where users can access additional documentation and report software issues.


Subject(s)
Computational Biology/methods , RNA/genetics , Ribosomes/genetics , Sequence Analysis, RNA/methods , Software , Databases, Genetic , HEK293 Cells , High-Throughput Nucleotide Sequencing/methods , Humans , Internet , Protein Biosynthesis/genetics , Reproducibility of Results
4.
BMC Med Genomics ; 17(1): 255, 2024 Oct 24.
Article in English | MEDLINE | ID: mdl-39449055

ABSTRACT

The abundance of Lp(a) protein holds significant implications for the risk of cardiovascular disease (CVD) and is directly impacted by the copy number (CN) of KIV-2, a 5.5 kbp sub-region. KIV-2 is highly polymorphic in the population, and accurate analysis is challenging. In this study, we present the DRAGEN KIV-2 CN caller, which utilizes short reads. Data across 166 WGS samples show that the caller has high accuracy compared to optical mapping and can further phase approximately 50% of the samples. We compared KIV-2 CN to 24 previously postulated KIV-2-relevant SNVs, revealing that many are ineffective predictors of KIV-2 copy number. Population studies, including USA-based cohorts, showed distinct KIV-2 CN distributions for European-, African-, and Hispanic-American populations and further underscored the limitations of SNV predictors. We demonstrate that the CN estimates correlate significantly with the available Lp(a) protein levels and that phasing is highly important.


Subject(s)
Alleles , Cardiovascular Diseases , Lipoprotein(a) , Humans , Cardiovascular Diseases/genetics , Lipoprotein(a)/genetics , Lipoprotein(a)/blood , DNA Copy Number Variations , Genetic Predisposition to Disease , Polymorphism, Single Nucleotide
5.
Nat Commun ; 12(1): 2151, 2021 04 12.
Article in English | MEDLINE | ID: mdl-33846313

ABSTRACT

The rapid increase in the amount of genomic data provides researchers with an opportunity to integrate diverse datasets and annotations when addressing a wide range of biological questions. However, genomic datasets are deposited on different platforms and are stored in numerous formats from multiple genome builds, which complicates the task of collecting, annotating, transforming, and integrating data as needed. Here, we developed Go Get Data (GGD) as a fast, reproducible approach to installing standardized data recipes. GGD is available on GitHub (https://gogetdata.github.io/), is extendable to other data types, and can streamline the complexities typically associated with data integration, saving researchers time and improving research reproducibility.


Subject(s)
Algorithms , Genomics , Reproducibility of Results , User-Computer Interface
6.
Genome Biol ; 22(1): 161, 2021 05 25.
Article in English | MEDLINE | ID: mdl-34034781

ABSTRACT

Visual validation is an important step to minimize false-positive predictions from structural variant (SV) detection. We present Samplot, a tool for creating images that display the read depth and sequence alignments necessary to adjudicate purported SVs across samples and sequencing technologies. These images can be rapidly reviewed to curate large SV call sets. Samplot is applicable to many biological problems such as SV prioritization in disease studies, analysis of inherited variation, or de novo SV review. Samplot includes a machine learning package that dramatically decreases the number of false positives without human review. Samplot is available at https://github.com/ryanlayer/samplot.


Subject(s)
Genomic Structural Variation , Software , Automation , Chromosome Inversion , Gene Duplication , Reproducibility of Results , Translocation, Genetic
7.
Gigascience ; 7(7)2018 07 01.
Article in English | MEDLINE | ID: mdl-29860504

ABSTRACT

SV-plaudit is a framework for rapidly curating structural variant (SV) predictions. For each SV, we generate an image that visualizes the coverage and alignment signals from a set of samples. Images are uploaded to our cloud framework where users assess the quality of each image using a client-side web application. Reports can then be generated as a tab-delimited file or annotated Variant Call Format (VCF) file. As a proof of principle, nine researchers collaborated for 1 hour to evaluate 1,350 SVs each. We anticipate that SV-plaudit will become a standard step in variant calling pipelines and the crowd-sourced curation of other biological results. Code available at https://github.com/jbelyeu/SV-plaudit. Demonstration video available at https://www.youtube.com/watch?v=ono8kHMKxDs.


Subject(s)
Genomics/methods , High-Throughput Nucleotide Sequencing , Medical Informatics/methods , Sequence Alignment , Sequence Analysis, DNA , False Positive Reactions , Genetic Variation , Genome, Human , Humans , Internet , Software