Results 1 - 19 of 19
1.
Sci Rep ; 8(1): 384, 2018 01 10.
Article in English | MEDLINE | ID: mdl-29321653

ABSTRACT

Current approaches to integrated single-cell DNA-RNA sequencing make SNP calling difficult, because a large amount of DNA and RNA is lost during DNA-RNA separation. Here, we performed simultaneous single-cell exome and transcriptome sequencing on individual mouse oocytes. Using microinjection, we kept the nuclei intact to avoid DNA loss, while retaining the cytoplasm inside the cell membrane, to maximize the amount of DNA and RNA captured from the single cell. We then conducted exome sequencing on the isolated nuclei and mRNA sequencing on the enucleated cytoplasm. For single oocytes, exome-seq can cover up to 92% of the exome region with an average sequencing depth of 10+, while mRNA sequencing reveals more than 10,000 expressed genes in enucleated cytoplasm, with similar performance for intact oocytes. This approach provides unprecedented opportunities to study DNA-RNA regulation, such as RNA editing at the single-nucleotide level in oocytes. In the future, this method can also be applied to other large cells, including neurons, large dendritic cells and large tumour cells, for integrated exome and transcriptome sequencing.


Subject(s)
Cell Nucleus/genetics , Cytoplasm/genetics , Exome Sequencing/methods , Sequence Analysis, RNA/methods , Animals , Female , Gene Expression Profiling , High-Throughput Nucleotide Sequencing/methods , Mice , Oocytes/chemistry , Oocytes/cytology , Polymorphism, Single Nucleotide , Single-Cell Analysis/methods
2.
Sci Rep ; 8(1): 1640, 2018 01 26.
Article in English | MEDLINE | ID: mdl-29374225

ABSTRACT

The longest possible haplotype is the chromosome haplotype: the set of co-inherited alleles located on a single chromosome inherited from one parent. Standard whole-genome shotgun sequencing technologies cannot independently resolve the haplotypes of homologous chromosomes, because of the short-read sequencing strategy and interference between homologous chromosomes. Here, we investigated several types of chromosomal abnormalities using a dilution-based method to separate an intact copy of a homologous chromosome from human metaphase cells. Single chromosomes were then independently amplified by whole-genome amplification methods, converted into barcoded sequencing libraries, and sequenced in multiplexed pools on Illumina sequencers. We analyzed single chromosomes derived from single metaphase cells of one patient with the balanced chromosomal translocation t(3;5)(q24;q13), one patient with a (47, XXY) karyotype and one with (47, XY, 21+) Down syndrome. We determined the translocation region of the chromosomes in the patient with the t(3;5)(q24;q13) balanced chromosomal translocation by shallow whole-genome sequencing, which helps to pinpoint the chromosomal break point. We showed that SCS can physically separate and independently sequence the three copies of chromosome 21 of a Down syndrome patient. SCS has potential applications in personal genomics, single-cell genomics, and clinical diagnosis, particularly in revealing genetic diseases at the chromosomal level.


Subject(s)
Chromosomes, Human , Genotyping Techniques/methods , Haplotypes , Single-Cell Analysis/methods , Chromosome Disorders/genetics , Humans , Metaphase , Nucleic Acid Amplification Techniques , Sequence Analysis, DNA
3.
PLoS One ; 12(12): e0188181, 2017.
Article in English | MEDLINE | ID: mdl-29253901

ABSTRACT

Next generation sequencing (NGS) has revolutionized life sciences research. However, GC bias and costly, time-intensive library preparation make NGS an ill fit for increasing sequencing demands in the clinic. A new class of third-generation sequencing platforms has arrived to meet this need, capable of directly measuring DNA and RNA sequences at the single-molecule level without amplification. Here, we use the new GenoCare single-molecule sequencing platform from Direct Genomics to sequence the genome of the M13 virus. Our platform detects single-molecule fluorescence by total internal reflection microscopy, with sequencing-by-synthesis chemistry. We sequenced the genome of M13 to a depth of 316x, with 100% coverage. We determined a consensus sequence accuracy of 100%. In contrast to GC bias inherent to NGS results, we demonstrated that our single-molecule sequencing method yields minimal GC bias.


Subject(s)
Bacteriophage M13/genetics , Genome, Viral , High-Throughput Nucleotide Sequencing/methods , Nucleic Acid Amplification Techniques , Base Composition/genetics , Base Sequence
4.
Sci Rep ; 7(1): 7526, 2017 08 08.
Article in English | MEDLINE | ID: mdl-28790338

ABSTRACT

Cell-free DNA (cfDNA) in plasma has emerged as a potentially important biomarker in clinical diagnostics, particularly in cancer. However, somatic mutations are also commonly found in healthy individuals, which interferes with the effectiveness of cfDNA for cancer diagnostics. This study examined the background somatic mutations in white blood cells (WBC) and cfDNA in healthy controls, based on sequencing data from 821 non-cancer individuals and several cancer samples, with the aim of understanding the patterns of mutations detected in cfDNA. We determined the mutation allele frequencies in both WBC and cfDNA using a panel of 50 cancer-associated genes that covers a 20,000-nucleotide region and ultra-deep sequencing with an average depth of >40000-fold. Our results showed that most of the mutations in cfDNA were highly correlated with those in WBC. We also observed that the NPM1 gene was the most frequently mutated gene in both WBC and cfDNA. Our study highlighted the importance of sequencing both cfDNA and WBC to improve the sensitivity and accuracy of calling cancer-related mutations from circulating tumour DNA, and shed light on developing a strategy for early cancer diagnosis by cfDNA sequencing.


Subject(s)
Cell-Free Nucleic Acids/genetics , Circulating Tumor DNA/genetics , Gene Frequency , Mutation , Neoplasms/genetics , Nuclear Proteins/genetics , Adolescent , Adult , Aged , Aged, 80 and over , Alleles , Case-Control Studies , Cell-Free Nucleic Acids/blood , Circulating Tumor DNA/blood , Early Detection of Cancer , Female , Gene Expression , High-Throughput Nucleotide Sequencing , Humans , Leukocytes, Mononuclear/immunology , Leukocytes, Mononuclear/metabolism , Male , Middle Aged , Neoplasms/blood , Neoplasms/diagnosis , Neoplasms/pathology , Nuclear Proteins/blood , Nucleophosmin , Sequence Analysis, DNA
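
The abstract of entry 4 describes comparing mutation allele frequencies between cfDNA and matched WBC. The following is a minimal sketch of that idea, not the authors' pipeline: the function names, the input layout (alternate and total read counts per variant) and the 0.1% threshold are all assumptions made for illustration.

```python
from typing import Dict, Tuple

# Hypothetical variant key: (chromosome, position, reference allele, alternate allele).
Variant = Tuple[str, int, str, str]
# Hypothetical read counts per variant: (alt-supporting reads, total reads).
Counts = Tuple[int, int]


def allele_frequency(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads supporting the alternate allele (0.0 if uncovered)."""
    if total_reads <= 0:
        return 0.0
    return alt_reads / total_reads


def cfdna_only_variants(
    cfdna: Dict[Variant, Counts],
    wbc: Dict[Variant, Counts],
    min_vaf: float = 0.001,  # assumed threshold, not taken from the abstract
) -> Dict[Variant, float]:
    """Keep cfDNA variants at or above min_vaf that stay below min_vaf in WBC.

    Variants missing from the WBC table (or with zero WBC coverage) are
    treated as absent from WBC, which is a simplification.
    """
    kept: Dict[Variant, float] = {}
    for variant, (alt, total) in cfdna.items():
        vaf = allele_frequency(alt, total)
        if vaf < min_vaf:
            continue  # too rare in cfDNA to report
        wbc_alt, wbc_total = wbc.get(variant, (0, 0))
        if allele_frequency(wbc_alt, wbc_total) >= min_vaf:
            continue  # also present in WBC, so likely background
        kept[variant] = vaf
    return kept
```

Filtering on the matched WBC sample, rather than on a fixed blacklist, reflects the abstract's point that most cfDNA mutations in healthy individuals are also seen in WBC.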
5.
Genomics Proteomics Bioinformatics ; 14(6): 338-348, 2016 Dec.
Article in English | MEDLINE | ID: mdl-28024918

ABSTRACT

Type 1 diabetes mellitus (T1D) is an immune-mediated disease. The autoreactive T cells in T1D patients attack and destroy their own pancreatic cells. In order to systematically investigate the potential autoreactive T cell receptors (TCRs), we used a high-throughput immune repertoire sequencing technique to profile the spectrum of TCRs in individual T1D patients and controls. We sequenced the T cell repertoire of nine T1D patients, four type 2 diabetes (T2D) patients, and six nondiabetic controls. The diversity of the T cell repertoire in T1D patients was significantly decreased in comparison with T2D patients (P=7.0E-08 for CD4+ T cells, P=1.4E-04 for CD8+ T cells) and nondiabetic controls (P=2.7E-09 for CD4+ T cells, P=7.6E-06 for CD8+ T cells). Moreover, T1D patients had significantly more highly-expanded T cell clones than T2D patients (P=5.2E-06 for CD4+ T cells, P=1.9E-07 for CD8+ T cells) and nondiabetic controls (P=1.7E-07 for CD4+ T cells, P=3.3E-03 for CD8+ T cells). Furthermore, we identified a group of highly-expanded T cell receptor clones that are shared by more than two T1D patients. Although further validation in larger cohorts is needed, our data suggest that T cell receptor diversity measurements may become a valuable tool in investigating diabetes, such as using the diversity as an index to distinguish different types of diabetes.


Subject(s)
CD4-Positive T-Lymphocytes/metabolism , CD8-Positive T-Lymphocytes/metabolism , Diabetes Mellitus, Type 1/pathology , Receptors, Antigen, T-Cell/metabolism , Adolescent , Adult , Aged , Amino Acid Sequence , Base Sequence , CD4-Positive T-Lymphocytes/cytology , CD4-Positive T-Lymphocytes/immunology , CD8-Positive T-Lymphocytes/cytology , CD8-Positive T-Lymphocytes/immunology , Case-Control Studies , Child , Diabetes Mellitus, Type 1/immunology , Diabetes Mellitus, Type 1/metabolism , Diabetes Mellitus, Type 2/immunology , Diabetes Mellitus, Type 2/metabolism , Diabetes Mellitus, Type 2/pathology , Female , Histocompatibility Testing , Humans , Male , Middle Aged , Polymerase Chain Reaction , Protein Domains , Receptors, Antigen, T-Cell/chemistry , Receptors, Antigen, T-Cell/genetics , Young Adult
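
The abstract of entry 5 compares T cell repertoire diversity and highly expanded clones between groups, but does not say which diversity measure was used. As a hedged illustration only, the sketch below uses Shannon entropy over clonotype frequencies; the function names and the 1% expansion cutoff are assumptions, not the authors' definitions.

```python
import math
from collections import Counter
from typing import Dict, Iterable


def shannon_diversity(clonotypes: Iterable[str]) -> float:
    """Shannon entropy (natural log) of the clonotype frequency distribution.

    Each element of `clonotypes` is one sequenced read's clonotype label,
    for example a CDR3 sequence. Higher values mean a more diverse repertoire.
    """
    counts = Counter(clonotypes)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log(n / total) for n in counts.values())


def expanded_clones(
    clonotypes: Iterable[str],
    min_fraction: float = 0.01,  # assumed cutoff for "highly expanded"
) -> Dict[str, float]:
    """Return clonotypes whose share of all reads is at least min_fraction."""
    counts = Counter(clonotypes)
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items() if n / total >= min_fraction}
```

A repertoire dominated by a few expanded clones gives a lower entropy than an evenly distributed one, which is the direction of the difference the abstract reports for T1D patients.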
6.
Sci Rep ; 6: 26110, 2016 05 19.
Article in English | MEDLINE | ID: mdl-27193446

ABSTRACT

With the rapid decline in the cost of sequencing, it is now affordable to examine multiple genes in a single disease-targeted clinical test using next generation sequencing. Current targeted sequencing methods require a separate step of targeted capture enrichment during sample preparation before sequencing. Although fast sample preparation methods are available on the market, the library preparation process is still relatively complicated for physicians to use routinely. Here, we introduced an amplification-free Single Molecule Targeted Sequencing (SMTS) technology, which combined targeted capture and sequencing in one step. We demonstrated that this technology can detect low-frequency mutations in an artificially synthesized DNA sample. SMTS has several potential advantages, including simple sample preparation and the absence of the biases and errors introduced by PCR. SMTS has the potential to be an easy and quick sequencing technology for clinical diagnosis, such as cancer gene mutation detection, infectious disease detection, inherited condition screening and noninvasive prenatal diagnosis.


Subject(s)
Genes, Neoplasm , High-Throughput Nucleotide Sequencing/methods , Mutation , Neoplasms/pathology , Pathology, Molecular/methods , Sequence Analysis, DNA/methods , Humans , Neoplasms/diagnosis
7.
Interdiscip Sci ; 8(1): 23-7, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26267707

ABSTRACT

Warfarin is a drug normally used in the prevention of thrombosis and the formation of blood clots. The dosage of warfarin is strongly affected by genetic variants of CYP2C9 and VKORC1 genes. Current technologies for detecting the variants of these genes are mainly based on real-time PCR. In recent years, due to the rapidly dropping cost of whole genome sequencing and genotyping, more and more people get their whole genome sequenced or genotyped. However, current software for warfarin dosing prediction is based on low-throughput genetic information from either real-time PCR or melting curve methods. There is no bioinformatics tool available that can take the high-throughput genome sequencing data as input and determine the accurate dosage of warfarin. Here, we present PGWD, a web tool that analyzes personal genome sequencing data and integrates with clinical information for warfarin dosing.


Subject(s)
Genome, Human , Software , Warfarin/administration & dosage , Aged , Algorithms , Dose-Response Relationship, Drug , Genotype , Humans , Middle Aged , Pharmacogenetics , Polymorphism, Single Nucleotide/genetics , User-Computer Interface
8.
PLoS One ; 10(10): e0141105, 2015.
Article in English | MEDLINE | ID: mdl-26496198

ABSTRACT

We present Virtual Pharmacist, a web-based platform that takes common types of high-throughput data, namely microarray SNP genotyping data, FASTQ and Variant Call Format (VCF) files, as inputs, and reports potential drug responses in terms of efficacy, dosage and toxicity at a glance. Batch submission facilitates multivariate analysis or data mining of targeted groups. Individual analysis consists of a report that is readily comprehensible to patients and practitioners who have basic knowledge of pharmacology, a table that summarizes variants and potentially affected drug responses according to the US Food and Drug Administration pharmacogenomic biomarker labeled drug list and PharmGKB, and a visualization of a gene-drug-target network. Group analysis provides the distribution of the variants and potentially affected drug responses of a target group, a sample-gene variant count table, and a sample-drug count table. Our analysis of genomes from the 1000 Genomes Project underlines the potentially differential drug responses among different human populations. Even within the same population, the findings from Watson's genome highlight the importance of personalized medicine. Virtual Pharmacist can be accessed freely at http://www.sustc-genome.org.cn/vp or installed as a local web server. The code and documentation are available at the GitHub repository (https://github.com/VirtualPharmacist/vp). Administrators can download the source code to customize access settings for further development.


Subject(s)
Genome, Human , Pharmacogenetics , Prescription Drugs/therapeutic use , User-Computer Interface , Data Mining , Genotype , Humans , Internet , Multivariate Analysis , Precision Medicine , Prescription Drugs/adverse effects , Prescription Drugs/pharmacokinetics
9.
Sci Rep ; 5: 11415, 2015 Jun 19.
Article in English | MEDLINE | ID: mdl-26091148

ABSTRACT

Single-cell genomic analysis has grown rapidly in recent years and finds widespread applications in various fields of biology, including cancer biology, development, immunology, pre-implantation genetic diagnosis, and neurobiology. To date, the amplification bias, amplification uniformity and reproducibility of the three major single-cell whole genome amplification methods (GenomePlex WGA4, MDA and MALBAC) have not been systematically investigated using mammalian cells. In this study, we amplified genomic DNA from individual hippocampal neurons using three single-cell DNA amplification methods, and sequenced them at shallow depth. We then systematically evaluated the GC bias, reproducibility, and copy number variations among individual neurons. Our results showed that single-cell genome sequencing results obtained from the MALBAC and WGA4 methods are highly reproducible and have a high success rate. MALBAC displays significant bias towards high GC content. We then attempted to correct the GC bias by developing a bioinformatics pipeline that allows us to call CNVs in single-cell sequencing data; with it, chromosome-level and sub-chromosomal-level CNVs among individual neurons can be detected. We also proposed a metric to determine the CNV detection limits. Overall, MALBAC and WGA4 perform better than MDA in detecting CNVs.


Subject(s)
DNA Copy Number Variations , Gene Dosage , Genome , Genomics , Pyramidal Cells/metabolism , Single-Cell Analysis , Animals , Base Composition , Genomics/methods , High-Throughput Nucleotide Sequencing , Humans , Polymerase Chain Reaction , Rats , Reproducibility of Results , Sensitivity and Specificity , Single-Cell Analysis/methods
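
The abstract of entry 9 mentions correcting GC bias before calling CNVs but gives no details of the pipeline. The sketch below shows one generic way such a correction is often approached, namely scaling each genomic bin's read count by the median count of bins with similar GC content. It is an assumed illustration, not the published method; the function names and the number of GC strata are hypothetical.

```python
from statistics import median
from typing import Dict, List, Sequence


def gc_fraction(seq: str) -> float:
    """GC fraction of a sequence, ignoring non-ACGT characters such as N."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def gc_normalize(
    read_counts: Sequence[float],
    gc_values: Sequence[float],
    n_strata: int = 10,  # assumed number of equal-width GC strata
) -> List[float]:
    """Divide each bin's read count by the median count of its GC stratum.

    After normalization, a value near 1.0 means the bin has typical coverage
    for its GC content; sustained deviations along a chromosome are CNV
    candidates.
    """
    if len(read_counts) != len(gc_values):
        raise ValueError("read_counts and gc_values must have the same length")
    # Assign each bin to a GC stratum; gc == 1.0 falls into the last stratum.
    strata = [min(int(gc * n_strata), n_strata - 1) for gc in gc_values]
    by_stratum: Dict[int, List[float]] = {}
    for stratum, count in zip(strata, read_counts):
        by_stratum.setdefault(stratum, []).append(count)
    medians = {s: median(counts) for s, counts in by_stratum.items()}
    return [
        count / medians[s] if medians[s] > 0 else 0.0
        for s, count in zip(strata, read_counts)
    ]
```

Stratified median scaling is chosen here only because it is short and easy to follow; smoother fits such as LOESS over GC content are a common alternative.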
10.
Sci Rep ; 5: 10092, 2015 May 11.
Article in English | MEDLINE | ID: mdl-25961410

ABSTRACT

Profiling immune repertoires by high throughput sequencing enhances our understanding of immune system complexity and immune-related diseases in humans. Previously, cloning and Sanger sequencing identified limited numbers of T cell receptor (TCR) nucleotide sequences in rhesus monkeys, thus their full immune repertoire is unknown. We applied multiplex PCR and Illumina high throughput sequencing to study the TCRβ of rhesus monkeys. We identified 1.26 million TCRβ sequences corresponding to 643,570 unique TCRβ sequences and 270,557 unique complementarity-determining region 3 (CDR3) gene sequences. Precise measurements of CDR3 length distribution, CDR3 amino acid distribution, length distribution of N nucleotide of junctional region, and TCRV and TCRJ gene usage preferences were performed. A comprehensive profile of rhesus monkey immune repertoire might aid human infectious disease studies using rhesus monkeys.


Subject(s)
Complementarity Determining Regions/genetics , Receptors, Antigen, T-Cell, alpha-beta/genetics , Sequence Analysis, DNA , Animals , Macaca mulatta , Multiplex Polymerase Chain Reaction
12.
Interdiscip Sci ; 2015 Feb 06.
Article in English | MEDLINE | ID: mdl-25663116

ABSTRACT

Warfarin is a drug normally used in the prevention of thrombosis and the formation of blood clots. The dosage of warfarin is strongly affected by genetic variants of CYP2C9 and VKORC1 genes. Current technologies for detecting the variants of these genes are mainly based on real-time PCR. In recent years, due to the rapidly dropping cost of whole genome sequencing and genotyping, more and more people get their whole genome sequenced or genotyped. However, current software for warfarin dosing prediction is based on low-throughput genetic information either from real-time PCR, or melting curve methods. There is no bioinformatics tool available that can take the high-throughput genome sequencing data as input and determine the accurate dosage of warfarin. Here, we present PGWD, a web tool that analyzes personal genome sequencing data, and integrates with clinical information for warfarin dosing.

13.
Front Oncol ; 4: 7, 2014.
Article in English | MEDLINE | ID: mdl-24478987

ABSTRACT

Single cell genomics is a rapidly growing field with many new techniques emerging in the past few years. However, few bioinformatics tools specific for single cell genomics analysis are available. Single cell DNA/RNA sequencing data usually have low genome coverage and high amplification bias, which makes bioinformatics analysis challenging. Many current bioinformatics tools developed for bulk cell sequencing do not work well with single cell sequencing data. Here, we summarize current challenges in the bioinformatics analysis of single cell genomic DNA sequencing and single cell transcriptomes. These challenges include calling copy number variations, identifying mutated genes in tumor samples, reconstructing cell lineages, recovering low abundant transcripts, and improving the accuracy of quantitative analysis of transcripts. Development in single cell genomics bioinformatics analysis will promote the application of this technology to basic biology and medical research.

14.
Sci Transl Med ; 5(171): 171ra19, 2013 Feb 06.
Article in English | MEDLINE | ID: mdl-23390249

ABSTRACT

The human antibody repertoire is one of the most important defenses against infectious disease, and the development of vaccines has enabled the conferral of targeted protection to specific pathogens. However, there are many challenges to measuring and analyzing the immunoglobulin sequence repertoire, including that each B cell's genome encodes a distinct antibody sequence, that the antibody repertoire changes over time, and the high similarity between antibody sequences. We have addressed these challenges by using high-throughput long read sequencing to perform immunogenomic characterization of expressed human antibody repertoires in the context of influenza vaccination. Informatic analysis of 5 million antibody heavy chain sequences from healthy individuals allowed us to perform global characterizations of isotype distributions, determine the lineage structure of the repertoire, and measure age- and antigen-related mutational activity. Our analysis of the clonal structure and mutational distribution of individuals' repertoires shows that elderly subjects have a decreased number of lineages but an increased prevaccination mutation load in their repertoire and that some of these subjects have an oligoclonal character to their repertoire in which the diversity of the lineages is greatly reduced relative to younger subjects. We have thus shown that global analysis of the immune system's clonal structure provides direct insight into the effects of vaccination and provides a detailed molecular portrait of age-related effects.


Subject(s)
Antibodies, Viral/chemistry , Antibodies, Viral/immunology , Influenza, Human/immunology , Phylogeny , Vaccination , Adolescent , Adult , Aged , Aged, 80 and over , Aging/immunology , Child , Cluster Analysis , Genetic Variation , Humans , Immunoglobulin Isotypes , Immunoglobulin M/immunology , Middle Aged , Mutation/genetics , Somatic Hypermutation, Immunoglobulin/genetics , Young Adult
15.
Protein Eng Des Sel ; 23(12): 935-46, 2010 Dec.
Article in English | MEDLINE | ID: mdl-21036781

ABSTRACT

Influenza has been circulating in the human population and caused three pandemics in the last century (1918 H1N1, 1957 H2N2 and 1968 H3N2). The 2009 A(H1N1) was classified by the World Health Organization as the fourth pandemic. Influenza has a high evolution rate, which makes vaccine design challenging. We here consider an approach for early detection of new dominant strains. By clustering the 2009 A(H1N1) sequence data, we found two main clusters. We then define a metric to detect the emergence of dominant strains. We show on historical H3N2 data that this method is able to identify a cluster around an incipient dominant strain before it becomes dominant. For example, for H3N2 as of 30 March 2009, the method detects the cluster for the new A/British Columbia/RV1222/2009 strain. This strain detection tool would appear to be useful for annual influenza vaccine selection.


Subject(s)
Disease Outbreaks , Influenza A Virus, H1N1 Subtype/chemistry , Influenza A Virus, H3N2 Subtype/chemistry , Influenza, Human/virology , Algorithms , Cluster Analysis , Computational Biology , Evolution, Molecular , Hemagglutinin Glycoproteins, Influenza Virus/genetics , Hemagglutinin Glycoproteins, Influenza Virus/metabolism , Humans , Influenza A Virus, H1N1 Subtype/genetics , Influenza A Virus, H3N2 Subtype/genetics , Influenza, Human/epidemiology , Models, Biological
16.
Phys Rev Lett ; 105(12): 128102, 2010 Sep 17.
Article in English | MEDLINE | ID: mdl-20867676

ABSTRACT

Clustered regularly interspaced short palindromic repeats (CRISPR) in bacterial and archaeal DNA have recently been shown to be a new type of antiviral immune system in these organisms. We here study the diversity of spacers in CRISPR under selective pressure. We propose a population dynamics model that explains the biological observation that the leader-proximal end of CRISPR is more diversified and the leader-distal end of CRISPR is more conserved. This result is shown to be in agreement with recent experiments. Our results show that the CRISPR spacer structure is influenced by and provides a record of the viral challenges that bacteria face.


Subject(s)
Bacteria/genetics , DNA, Intergenic/genetics , Genome, Bacterial/genetics , Inverted Repeat Sequences , Bacteria/classification , Bacteria/metabolism , Cluster Analysis , DNA, Bacterial/genetics , DNA, Bacterial/physiology , DNA, Intergenic/physiology , Genome, Bacterial/physiology , Polymorphism, Genetic , Time Factors
17.
Phys Rev Lett ; 105(19): 198701, 2010 Nov 05.
Article in English | MEDLINE | ID: mdl-21231202

ABSTRACT

We examine how the structure of the world trade network has been shaped by globalization and recessions over the last 40 years. We show that by treating the world trade network as an evolving system, theory predicts the trade network is more sensitive to recessionary shocks and recovers more slowly from them now than it did 40 years ago, due to structural changes in the world trade network induced by globalization. We also show that recession-induced change to the world trade network leads to an increased hierarchical structure of the global trade network for a few years after the recession.

18.
Dev Biol ; 337(1): 157-61, 2010 Jan 01.
Article in English | MEDLINE | ID: mdl-19799894

ABSTRACT

An open question in animal evolution is why the phylum- and superphylum-level body plans have changed so little, while the class- and family-level body plans have changed so greatly since the early Cambrian. Davidson and Erwin (Davidson and Erwin, 2006; Erwin and Davidson, 2009) proposed that the hierarchical structure of gene regulatory networks leads to different observed evolutionary rates for terminal properties of the body plan versus major aspects of body plan morphology. Here, we calculated the speed of evolution of genes in these gene regulatory networks. We found that the genes which determine the phylum and superphylum characters evolve slowly, while those genes which determine the classes, families, and speciation evolve more rapidly. This result furnishes genetic support to the hypothesis that the hierarchical structure of developmental regulatory networks provides an organizing structure which guides the evolution of aspects of the body plan.


Subject(s)
Biological Evolution , Body Patterning/genetics , Gene Expression Regulation, Developmental , Morphogenesis/genetics , Animals , Embryonic Development , Sea Urchins/embryology , Transcription Factors/genetics
19.
Phys Rev E Stat Nonlin Soft Matter Phys ; 79(3 Pt 1): 031907, 2009 Mar.
Article in English | MEDLINE | ID: mdl-19391971

ABSTRACT

We investigate the selective forces that promote the emergence of modularity in nature. We demonstrate the spontaneous emergence of modularity in a population of individuals that evolve in a changing environment. We show that the level of modularity correlates with the rapidity and severity of environmental change. The modularity arises as a synergistic response to the noise in the environment in the presence of horizontal gene transfer. We suggest that the hierarchical structure observed in the natural world may be a broken symmetry state, which generically results from evolution in a changing environment. To support our results, we analyze experimental protein interaction data and show that protein interaction networks became increasingly modular as evolution proceeded over the last four billion years. We also discuss a method to determine the divergence time of a protein.


Subject(s)
Evolution, Molecular , Models, Biological , Bacteria/metabolism , Environment , Gene Transfer, Horizontal , Metabolic Networks and Pathways , Protein Binding , Protein Structure, Tertiary