1.
Sci Rep ; 14(1): 7028, 2024 03 25.
Article En | MEDLINE | ID: mdl-38528062

Accurate indel calling plays an important role in precision medicine. A benchmarking indel set is essential for thoroughly evaluating the indel calling performance of bioinformatics pipelines. A reference sample with a set of known-positive variants was developed in the FDA-led Sequencing Quality Control Phase 2 (SEQC2) project, but the known indels in the known-positive set were limited. This project sought to provide an enriched set of known indels that would be more translationally relevant by focusing on additional cancer related regions. A thorough manual review process completed by 42 reviewers, two advisors, and a judging panel of three researchers significantly enriched the known indel set by an additional 516 indels. The extended benchmarking indel set has a large range of variant allele frequencies (VAFs), with 87% of them having a VAF below 20% in reference Sample A. The reference Sample A and the indel set can be used for comprehensive benchmarking of indel calling across a wider range of VAF values in the lower range. Indel length was also variable, but the majority were under 10 base pairs (bps). Most of the indels were within coding regions, with the remainder in the gene regulatory regions. Although high confidence can be derived from the robust study design and meticulous human review, this extensive indel set has not undergone orthogonal validation. The extended benchmarking indel set, along with the indels in the previously published known-positive set, was the truth set used to benchmark indel calling pipelines in a community challenge hosted on the precisionFDA platform. This benchmarking indel set and reference samples can be utilized for a comprehensive evaluation of indel calling pipelines. Additionally, the insights and solutions obtained during the manual review process can aid in improving the performance of these pipelines.


Benchmarking , High-Throughput Nucleotide Sequencing , Humans , Computational Biology , Quality Control , INDEL Mutation , Polymorphism, Single Nucleotide
2.
Curr Biol ; 31(12): 2728-2736.e8, 2021 06 21.
Article En | MEDLINE | ID: mdl-33878301

Analysis of ancient environmental DNA (eDNA) has revolutionized our ability to describe biological communities in space and time [1-3] by allowing for parallel sequencing of DNA from all trophic levels [4-8]. However, because environmental samples contain sparse and fragmented data from multiple individuals, and often contain closely related species [9], the field of ancient eDNA has so far been limited to organellar genomes in its contribution to population and phylogenetic studies [5,6,10,11]. This is in contrast to data from fossils [12,13], where full-genome studies are routine, despite these being rare and their destruction for sequencing undesirable [14-16]. Here, we report the retrieval of three low-coverage (0.03×) environmental genomes from American black bear (Ursus americanus) and a 0.04× environmental genome of the extinct giant short-faced bear (Arctodus simus) from cave sediment samples from northern Mexico dated to 16-14 thousand calibrated years before present (cal kyr BP), which we contextualize with a new high-coverage (26×) and two lower-coverage giant short-faced bear genomes obtained from fossils recovered from Yukon Territory, Canada, which date to ∼22-50 cal kyr BP. We show that the Late Pleistocene black bear population in Mexico is ancestrally related to the present-day Eastern American black bear population, and that the extinct giant short-faced bears present in Mexico were deeply divergent from the earlier Beringian population. Our findings demonstrate the ability to separately analyze genomic-scale DNA sequences of closely related species co-preserved in environmental samples, which brings the use of ancient eDNA into the era of population genomics and phylogenetics.


Ursidae , Animals , DNA, Ancient , DNA, Mitochondrial , Fossils , Humans , Metagenomics , Phylogeny , Ursidae/genetics
3.
J Hered ; 112(4): 377-384, 2021 07 15.
Article En | MEDLINE | ID: mdl-33882130

The Andean bear is the only extant member of the Tremarctine subfamily and the only extant ursid species to inhabit South America. Here, we present an annotated de novo assembly of a nuclear genome from a captive-born female Andean bear, Mischief, generated using a combination of short and long DNA and RNA reads. Our final assembly has a length of 2.23 Gb, and a scaffold N50 of 21.12 Mb, contig N50 of 23.5 kb, and BUSCO score of 88%. The Andean bear genome will be a useful resource for exploring the complex phylogenetic history of extinct and extant bear species and for future population genetics studies of Andean bears.


Ursidae , Animals , Cell Nucleus , Female , Genome , Molecular Sequence Annotation , Phylogeny , South America , Ursidae/genetics
4.
Nature ; 591(7848): 87-91, 2021 03.
Article En | MEDLINE | ID: mdl-33442059

Dire wolves are considered to be one of the most common and widespread large carnivores in Pleistocene America [1], yet relatively little is known about their evolution or extinction. Here, to reconstruct the evolutionary history of dire wolves, we sequenced five genomes from sub-fossil remains dating from 13,000 to more than 50,000 years ago. Our results indicate that although they were similar morphologically to the extant grey wolf, dire wolves were a highly divergent lineage that split from living canids around 5.7 million years ago. In contrast to numerous examples of hybridization across Canidae [2,3], there is no evidence for gene flow between dire wolves and either North American grey wolves or coyotes. This suggests that dire wolves evolved in isolation from the Pleistocene ancestors of these species. Our results also support an early New World origin of dire wolves, while the ancestors of grey wolves, coyotes and dholes evolved in Eurasia and colonized North America only relatively recently.


Extinction, Biological , Phylogeny , Wolves/classification , Animals , Fossils , Gene Flow , Genome/genetics , Genomics , Geographic Mapping , North America , Paleontology , Phenotype , Wolves/genetics
5.
Nucleic Acids Res ; 48(13): e75, 2020 07 27.
Article En | MEDLINE | ID: mdl-32491177

A high-quality genome assembly is a vital first step for the study of an organism. Recent advances in technology have made the creation of high-quality chromosome-scale assemblies feasible and low cost. However, the amount of input DNA needed for an assembly project can be a limiting factor for small organisms or precious samples. Here we demonstrate the feasibility of creating a chromosome-scale assembly using a hybrid method for a low-input sample, a single outbred Drosophila melanogaster. Our approach combines an Illumina shotgun library, Oxford Nanopore long reads, and chromosome conformation capture for long-range scaffolding. This single-fly genome assembly has an N50 of 26 Mb, a length that encompasses entire chromosome arms, contains 95% of expected single-copy orthologs, and includes a nearly complete assembly of this individual's Wolbachia endosymbiont. The methods described here enable the accurate and complete assembly of genomes from small, field-collected organisms as well as precious clinical samples.


Chromosomes, Bacterial/genetics , Chromosomes, Insect/genetics , Drosophila melanogaster/genetics , Genome, Bacterial/genetics , Genome, Insect/genetics , Wolbachia/genetics , Animals , Genomics/methods
7.
Nat Commun ; 10(1): 4769, 2019 10 18.
Article En | MEDLINE | ID: mdl-31628318

Pumas are the most widely distributed felid in the Western Hemisphere. Increasingly, however, human persecution and habitat loss are isolating puma populations. To explore the genomic consequences of this isolation, we assemble a draft puma genome and a geographically broad panel of resequenced individuals. We estimate that the lineage leading to present-day North American pumas diverged from South American lineages 300-100 thousand years ago. We find signatures of close inbreeding in geographically isolated North American populations, but also that tracts of homozygosity are rarely shared among these populations, suggesting that assisted gene flow would restore local genetic diversity. The genome of a Florida panther descended from translocated Central American individuals has long tracts of homozygosity despite recent outbreeding. This suggests that while translocations may introduce diversity, sustaining diversity in small and isolated populations will require either repeated translocations or restoration of landscape connectivity. Our approach provides a framework for genome-wide analyses that can be applied to the management of similarly small and isolated populations.


Genome-Wide Association Study/methods , Genomics/methods , Inbreeding/methods , Puma/genetics , Animals , Gene Flow , Genetic Variation , Genetics, Population , Geography , North America , Phylogeny , Puma/classification , South America
9.
Commun Biol ; 1: 197, 2018.
Article En | MEDLINE | ID: mdl-30456315

Recent advances in genomic sequencing technology and computational assembly methods have allowed scientists to improve reference genome assemblies in terms of contiguity and composition. EquCab2, a reference genome for the domestic horse, was released in 2007. Although of equal or better quality compared to other first-generation Sanger assemblies, it had many of the shortcomings common to them. In 2014, the equine genomics research community began a project to improve the reference sequence for the horse, building upon the solid foundation of EquCab2 and incorporating new short-read data, long-read data, and proximity ligation data. Here, we present EquCab3. The count of non-N bases in the incorporated chromosomes is improved from 2.33 Gb in EquCab2 to 2.41 Gb in EquCab3. Contiguity has also been improved nearly 40-fold, with a contig N50 of 4.5 Mb, and scaffold contiguity has been enhanced to the point where all but one of the 32 chromosomes consists of a single scaffold.

...