1.
Wellcome Open Res ; 8: 507, 2023.
Article in English | MEDLINE | ID: mdl-38046191

ABSTRACT

We present a genome assembly from an individual male Anopheles moucheti (the malaria mosquito; Arthropoda; Insecta; Diptera; Culicidae), from a wild population in Cameroon. The genome sequence is 271 megabases in span. The majority of the assembly is scaffolded into three chromosomal pseudomolecules with the X sex chromosome assembled. The complete mitochondrial genome was also assembled and is 15.5 kilobases in length.

2.
Wellcome Open Res ; 8: 74, 2023.
Article in English | MEDLINE | ID: mdl-37424773

ABSTRACT

We present a genome assembly from an individual female Anopheles gambiae (the malaria mosquito; Arthropoda; Insecta; Diptera; Culicidae), Ifakara strain. The genome sequence is 264 megabases in span. Most of the assembly is scaffolded into three chromosomal pseudomolecules with the X sex chromosome assembled. The complete mitochondrial genome was also assembled and is 15.4 kilobases in length.

3.
Wellcome Open Res ; 7: 287, 2022.
Article in English | MEDLINE | ID: mdl-36874567

ABSTRACT

We present a genome assembly from an individual female Anopheles funestus (the malaria mosquito; Arthropoda; Insecta; Diptera; Culicidae). The genome sequence is 251 megabases in span. The majority of the assembly is scaffolded into three chromosomal pseudomolecules with the X sex chromosome assembled. The complete mitochondrial genome was also assembled and is 15.4 kilobases in length.

4.
Nat Methods ; 17(6): 615-620, 2020 06.
Article in English | MEDLINE | ID: mdl-32366989

ABSTRACT

Methods to deconvolve single-cell RNA-sequencing (scRNA-seq) data are necessary for samples containing a mixture of genotypes, whether they are natural or experimentally combined. Multiplexing across donors is a popular experimental design that can avoid batch effects, reduce costs and improve doublet detection. By using variants detected in scRNA-seq reads, it is possible to assign cells to their donor of origin and identify cross-genotype doublets that may have highly similar transcriptional profiles, precluding detection by transcriptional profile. More subtle cross-genotype variant contamination can be used to estimate the amount of ambient RNA. Ambient RNA is caused by cell lysis before droplet partitioning and is an important confounder of scRNA-seq analysis. Here we develop souporcell, a method to cluster cells using the genetic variants detected within the scRNA-seq reads. We show that it achieves high accuracy on genotype clustering, doublet detection and ambient RNA estimation, as demonstrated across a range of challenging scenarios.


Subject(s)
RNA-Seq/methods , RNA/genetics , Single-Cell Analysis/methods , Algorithms , Base Sequence , Cell Line , Cluster Analysis , Genotype , Humans , Polymorphism, Single Nucleotide , Sensitivity and Specificity , Software
5.
Science ; 365(6455)2019 08 23.
Article in English | MEDLINE | ID: mdl-31439762

ABSTRACT

Malaria parasites adopt a remarkable variety of morphological life stages as they transition through multiple mammalian host and mosquito vector environments. We profiled the single-cell transcriptomes of thousands of individual parasites, deriving the first high-resolution transcriptional atlas of the entire Plasmodium berghei life cycle. We then used our atlas to precisely define developmental stages of single cells from three different human malaria parasite species, including parasites isolated directly from infected individuals. The Malaria Cell Atlas provides both a comprehensive view of gene usage in a eukaryotic parasite and an open-access reference dataset for the study of malaria parasites.


Subject(s)
Atlases as Topic , Genes, Protozoan/physiology , Life Cycle Stages/genetics , Malaria/parasitology , Plasmodium berghei/genetics , Plasmodium berghei/physiology , Transcriptome , Animals , Anopheles/parasitology , HeLa Cells , Humans , Plasmodium berghei/isolation & purification , Single-Cell Analysis
6.
Nat Biotechnol ; 37(5): 561-566, 2019 05.
Article in English | MEDLINE | ID: mdl-30936564

ABSTRACT

Benchmark small variant calls are required for developing, optimizing and assessing the performance of sequencing and bioinformatics methods. Here, as part of the Genome in a Bottle (GIAB) Consortium, we apply a reproducible, cloud-based pipeline to integrate multiple short- and linked-read sequencing datasets and provide benchmark calls for human genomes. We generate benchmark calls for one previously analyzed GIAB sample, as well as six genomes from the Personal Genome Project. These new genomes have broad, open consent, making this a 'first of its kind' resource that is available to the community for multiple downstream applications. We produce 17% more benchmark single nucleotide variations, 176% more indels and 12% larger benchmark regions than previously published GIAB benchmarks. We demonstrate that this benchmark reliably identifies errors in existing callsets and highlight challenges in interpreting performance metrics when using benchmarks that are not perfect or comprehensive. Finally, we identify strengths and weaknesses of callsets by stratifying performance according to variant type and genome context.


Subject(s)
Benchmarking , Computational Biology/trends , Genome, Human/genetics , Genomics/trends , Genetic Variation/genetics , High-Throughput Nucleotide Sequencing , Humans , INDEL Mutation/genetics , Polymorphism, Single Nucleotide , Software/trends
7.
Genome Res ; 29(4): 635-645, 2019 04.
Article in English | MEDLINE | ID: mdl-30894395

ABSTRACT

Large-scale population analyses coupled with advances in technology have demonstrated that the human genome is more diverse than originally thought. To date, this diversity has largely been uncovered using short-read whole-genome sequencing. However, these short-read approaches fail to give a complete picture of a genome. They struggle to identify structural events, cannot access repetitive regions, and fail to resolve the human genome into haplotypes. Here, we describe an approach that retains long range information while maintaining the advantages of short reads. Starting from ∼1 ng of high molecular weight DNA, we produce barcoded short-read libraries. Novel informatic approaches allow for the barcoded short reads to be associated with their original long molecules producing a novel data type known as "Linked-Reads". This approach allows for simultaneous detection of small and large variants from a single library. In this manuscript, we show the advantages of Linked-Reads over standard short-read approaches for reference-based analysis. Linked-Reads allow mapping to 38 Mb of sequence not accessible to short reads, adding sequence in 423 difficult-to-sequence genes including disease-relevant genes STRC, SMN1, and SMN2. Both Linked-Read whole-genome and whole-exome sequencing identify complex structural variations, including balanced events and single exon deletions and duplications. Further, Linked-Reads extend the region of high-confidence calls by 68.9 Mb. The data presented here show that Linked-Reads provide a scalable approach for comprehensive genome analysis that is not possible using short reads alone.


Subject(s)
Genome-Wide Association Study/methods , Polymorphism, Genetic , Whole Genome Sequencing/methods , Cell Line , Genome, Human , Humans , Intercellular Signaling Peptides and Proteins , Membrane Proteins/genetics , Survival of Motor Neuron 1 Protein/genetics , Survival of Motor Neuron 2 Protein/genetics
8.
Genes (Basel) ; 10(1)2019 01 18.
Article in English | MEDLINE | ID: mdl-30669388

ABSTRACT

A high-quality reference genome is a fundamental resource for functional genetics, comparative genomics, and population genomics, and is increasingly important for conservation biology. PacBio Single Molecule, Real-Time (SMRT) sequencing generates long reads with uniform coverage and high consensus accuracy, making it a powerful technology for de novo genome assembly. Improvements in throughput and concomitant reductions in cost have made PacBio an attractive core technology for many large genome initiatives; however, relatively high DNA input requirements (~5 µg for standard library protocol) have placed PacBio out of reach for many projects on small organisms that have lower DNA content, or on projects with limited input DNA for other reasons. Here we present a high-quality de novo genome assembly from a single Anopheles coluzzii mosquito. A modified SMRTbell library construction protocol without DNA shearing and size selection was used to generate a SMRTbell library from just 100 ng of starting genomic DNA. The sample was run on the Sequel System with chemistry 3.0 and software v6.0, generating, on average, 25 Gb of sequence per SMRT Cell with 20 h movies, followed by diploid de novo genome assembly with FALCON-Unzip. The resulting curated assembly had high contiguity (contig N50 3.5 Mb) and completeness (more than 98% of conserved genes were present and full-length). In addition, this single-insect assembly now places 667 (>90%) of formerly unplaced genes into their appropriate chromosomal contexts in the AgamP4 PEST reference. We were also able to resolve maternal and paternal haplotypes for over 1/3 of the genome. By sequencing and assembling material from a single diploid individual, only two haplotypes were present, simplifying the assembly process compared to samples from multiple pooled individuals. The method presented here can be applied to samples with starting DNA amounts as low as 100 ng per 1 Gb genome size. This new low-input approach puts PacBio-based assemblies in reach for small highly heterozygous organisms that comprise much of the diversity of life.


Subject(s)
Anopheles/genetics , Genome, Insect , Sequence Analysis, DNA/methods , Animals , Contig Mapping/methods , Contig Mapping/standards , Ploidies , Polymorphism, Genetic , Sequence Analysis, DNA/standards