1.
Mol Biol Rep ; 48(3): 2963-2971, 2021 Mar.
Article En | MEDLINE | ID: mdl-33635471

Due mainly to large genome size and prevalence of repetitive sequences in the nuclear genome of spruce (Picea Mill.), it is very difficult to develop single-copy genomic microsatellite markers. We have developed and characterized 25 polymorphic, single-copy genic microsatellites from white spruce (Picea glauca (Moench) Voss) EST sequences and determined their informativeness in white spruce and black spruce (Picea mariana (Mill.) B.S.P.) and inheritance in black spruce. White spruce EST sequences from NCBI dbEST were searched for the presence of microsatellite repeats. Forty-seven sequences containing dinucleotide, trinucleotide, tetranucleotide and compound repeats were selected to develop primers. Twenty-five of the designed primer pairs yielded scorable amplicons, with single-locus patterns, and were characterized in 20 individuals each of white spruce and black spruce. All 25 microsatellites were polymorphic in white spruce and 24 in black spruce. The number of alleles at a locus ranged from two to 18, with a mean of 8.8 in white spruce, and from one to 17, with a mean of 7.6 in black spruce. The expected heterozygosity/polymorphic information content ranged from 0.10 to 0.92, with a mean of 0.67 in white spruce, and from 0 to 0.93, with a mean of 0.59 in black spruce. Microsatellites with dinucleotide and compound repeats were more informative than those with trinucleotide and tetranucleotide repeats. Eighteen microsatellite markers polymorphic between the parents of a black spruce controlled cross inherited in a single-locus Mendelian fashion. The microsatellite markers developed can be applied for various genetics, genomics, breeding, and conservation studies and applications.


DNA, Plant/genetics , Expressed Sequence Tags/metabolism , Gene Dosage , Microsatellite Repeats/genetics , Picea/genetics , Chi-Square Distribution , Genotype , Inheritance Patterns/genetics , Nucleotide Motifs/genetics
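
The abstract above reports expected heterozygosity and polymorphic information content (PIC) per locus. As a minimal sketch, assuming the standard textbook definitions (expected heterozygosity as one minus the sum of squared allele frequencies, and the Botstein-style PIC), these statistics can be computed from genotype calls as follows. The function names and the toy genotypes are hypothetical and are not taken from the paper; the authors may have used a bias-corrected estimator or dedicated software.

```python
from collections import Counter
from itertools import combinations


def allele_frequencies(genotypes):
    """Return {allele: frequency} from diploid genotypes given as (allele, allele) pairs."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2), without small-sample correction."""
    return 1.0 - sum(p * p for p in freqs.values())


def polymorphic_information_content(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    ps = list(freqs.values())
    homozygous = sum(p * p for p in ps)
    cross = sum(2 * (pi ** 2) * (pj ** 2) for pi, pj in combinations(ps, 2))
    return 1.0 - homozygous - cross


# Hypothetical genotypes at one locus (allele sizes in bp), for illustration only.
toy_genotypes = [(150, 150), (150, 154), (154, 158), (150, 158)]
toy_freqs = allele_frequencies(toy_genotypes)
print(expected_heterozygosity(toy_freqs), polymorphic_information_content(toy_freqs))
```

A locus with a single allele gives zero for both statistics, which is consistent with the lower bound of 0 reported for black spruce, where one locus was monomorphic.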
2.
Nat Commun ; 9(1): 3647, 2018 09 07.
Article En | MEDLINE | ID: mdl-30194434

Here we develop a high-throughput single-cell ATAC-seq (assay for transposition of accessible chromatin) method to measure physical access to DNA in whole cells. Our approach integrates fluorescence imaging and addressable reagent deposition across a massively parallel (5184) nano-well array, yielding a nearly 20-fold improvement in throughput (up to ~1800 cells/chip, 4-5 h on-chip processing time) and library preparation cost (~81¢ per cell) compared to prior microfluidic implementations. We apply this method to measure regulatory variation in peripheral blood mononuclear cells (PBMCs) and show robust, de novo clustering of single cells by hematopoietic cell type.


Chromatin Assembly and Disassembly , High-Throughput Screening Assays , Optical Imaging/methods , Single-Cell Analysis/methods , Animals , Cell Line , Epigenesis, Genetic , Humans , Mice
3.
BMC Genomics ; 18(1): 519, 2017 07 07.
Article En | MEDLINE | ID: mdl-28687070

BACKGROUND: Technological advances have enabled transcriptome characterization of cell types at the single-cell level providing new biological insights. New methods that enable simple yet high-throughput single-cell expression profiling are highly desirable. RESULTS: Here we report a novel nanowell-based single-cell RNA sequencing system, ICELL8, which enables processing of thousands of cells per sample. The system employs a 5,184-nanowell-containing microchip to capture ~1,300 single cells and process them. Each nanowell contains preprinted oligonucleotides encoding poly-d(T), a unique well barcode, and a unique molecular identifier. The ICELL8 system uses imaging software to identify nanowells containing viable single cells and only wells with single cells are processed into sequencing libraries. Here, we report the performance and utility of ICELL8 using samples of increasing complexity from cultured cells to mouse solid tissue samples. Our assessment of the system to discriminate between mixed human and mouse cells showed that ICELL8 has a low cell multiplet rate (< 3%) and low cross-cell contamination. We characterized single-cell transcriptomes of more than a thousand cultured human and mouse cells as well as 468 mouse pancreatic islets cells. We were able to identify distinct cell types in pancreatic islets, including alpha, beta, delta and gamma cells. CONCLUSIONS: Overall, ICELL8 provides efficient and cost-effective single-cell expression profiling of thousands of cells, allowing researchers to decipher single-cell transcriptomes within complex biological samples.


Gene Expression Profiling/methods , High-Throughput Nucleotide Sequencing/methods , Nanotechnology/methods , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Tissue Array Analysis/methods , Cell Line , Humans , Islets of Langerhans/cytology , Islets of Langerhans/metabolism
4.
G3 (Bethesda) ; 5(9): 1909-18, 2015 Jul 16.
Article En | MEDLINE | ID: mdl-26185160

To achieve proper spatiotemporal control of gene expression, transcription factors cooperatively assemble onto specific DNA sequences. The ETS domain protein monomer of GABPα and the B-ZIP domain protein dimer of CREB1 cooperatively bind DNA only when the ETS ((C/G)CGGAAGT) and CRE (GTGACGTCAC) motifs overlap precisely, producing the ETS↔CRE motif ((C/G)CGGAAGTGACGTCAC). We designed a Protein Binding Microarray (PBM) with 60-bp DNAs containing four identical sectors, each with 177,440 features that explore the cooperative interactions between GABPα and CREB1 upon binding the ETS↔CRE motif. The DNA sequences include all 15-mers of the form (C/G)CGGA--CG-, the ETS↔CRE motif, and all single nucleotide polymorphisms (SNPs), and occurrences in the human and mouse genomes. CREB1 enhanced GABPα binding to the canonical ETS↔CRE motif CCGGAAGT two-fold, and up to 23-fold for several SNPs at the beginning and end of the ETS motif, which is suggestive of two separate and distinct allosteric mechanisms of cooperative binding. We show that the ETS-CRE array data can be used to identify regions likely cooperatively bound by GABPα and CREB1 in vivo, and demonstrate their ability to identify human genetic variants that might inhibit cooperative binding.


Binding Sites , Cyclic AMP Response Element-Binding Protein/metabolism , GA-Binding Protein Transcription Factor/metabolism , Nucleotide Motifs , Proto-Oncogene Proteins c-ets/metabolism , Animals , Cell Line , Genetic Loci , Humans , Mice , Oligonucleotide Array Sequence Analysis , Polymorphism, Single Nucleotide , Protein Binding , Recombinant Fusion Proteins/metabolism
5.
Cell ; 158(6): 1431-1443, 2014 Sep 11.
Article En | MEDLINE | ID: mdl-25215497

Transcription factor (TF) DNA sequence preferences direct their regulatory activity, but are currently known for only ∼1% of eukaryotic TFs. Broadly sampling DNA-binding domain (DBD) types from multiple eukaryotic clades, we determined DNA sequence preferences for >1,000 TFs encompassing 54 different DBD classes from 131 diverse eukaryotes. We find that closely related DBDs almost always have very similar DNA sequence preferences, enabling inference of motifs for ∼34% of the ∼170,000 known or predicted eukaryotic TFs. Sequences matching both measured and inferred motifs are enriched in chromatin immunoprecipitation sequencing (ChIP-seq) peaks and upstream of transcription start sites in diverse eukaryotic lineages. SNPs defining expression quantitative trait loci in Arabidopsis promoters are also enriched for predicted TF binding sites. Importantly, our motif "library" can be used to identify specific TFs whose binding may be altered by human disease risk alleles. These data present a powerful resource for mapping transcriptional networks across eukaryotes.


Arabidopsis/genetics , Nucleotide Motifs , Sequence Analysis, DNA , Transcription Factors/metabolism , Arabidopsis/metabolism , Chromatin Immunoprecipitation , Humans , Polymorphism, Single Nucleotide , Promoter Regions, Genetic , Protein Binding , Quantitative Trait Loci
6.
Biochem Biophys Res Commun ; 449(2): 248-55, 2014 Jun 27.
Article En | MEDLINE | ID: mdl-24835951

Three oxidative products of 5-methylcytosine (5mC) occur in mammalian genomes. We evaluated if these cytosine modifications in a CG dinucleotide altered DNA binding of four B-HLH homodimers and three heterodimers to the E-Box motif CGCAG|GTG. We examined 25 DNA probes containing all combinations of cytosine in a CG dinucleotide and none changed binding except for carboxylation of cytosine (5caC) in the strand CGCAG|GTG. 5caC enhanced binding of all examined B-HLH homodimers and heterodimers, particularly the Tcf3|Ascl1 heterodimer which increased binding ~10-fold. These results highlight a potential function of the oxidative products of 5mC, changing the DNA binding of sequence-specific transcription factors.


Basic Helix-Loop-Helix Transcription Factors/chemistry , Basic Helix-Loop-Helix Transcription Factors/metabolism , Cytosine/analogs & derivatives , 5-Methylcytosine/chemistry , 5-Methylcytosine/metabolism , Amino Acid Sequence , Animals , Base Sequence , Basic Helix-Loop-Helix Transcription Factors/genetics , Circular Dichroism , Cytosine/chemistry , Cytosine/metabolism , Dinucleoside Phosphates/chemistry , Dinucleoside Phosphates/metabolism , E-Box Elements , Humans , Models, Molecular , Molecular Sequence Data , Protein Binding , Protein Multimerization
7.
BMC Genomics ; 14: 702, 2013 Oct 11.
Article En | MEDLINE | ID: mdl-24119028

BACKGROUND: EST (expressed sequence tag) sequences and their annotation provide a highly valuable resource for gene discovery, genome sequence annotation, and other genomics studies that can be applied in genetics, breeding and conservation programs for non-model organisms. Conifers are long-lived plants that are ecologically and economically important globally, and have a large genome size. Black spruce (Picea mariana), is a transcontinental species of the North American boreal and temperate forests. However, there are limited transcriptomic and genomic resources for this species. The primary objective of our study was to develop a black spruce transcriptomic resource to facilitate on-going functional genomics projects related to growth and adaptation to climate change. RESULTS: We conducted bidirectional sequencing of cDNA clones from a standard cDNA library constructed from black spruce needle tissues. We obtained 4,594 high quality (2,455 5' end and 2,139 3' end) sequence reads, with an average read-length of 532 bp. Clustering and assembly of ESTs resulted in 2,731 unique sequences, consisting of 2,234 singletons and 497 contigs. Approximately two-thirds (63%) of unique sequences were functionally annotated. Genes involved in 36 molecular functions and 90 biological processes were discovered, including 24 putative transcription factors and 232 genes involved in photosynthesis. Most abundantly expressed transcripts were associated with photosynthesis, growth factors, stress and disease response, and transcription factors. A total of 216 full-length genes were identified. About 18% (493) of the transcripts were novel, representing an important addition to the Genbank EST database (dbEST). Fifty-seven di-, tri-, tetra- and penta-nucleotide simple sequence repeats were identified. 
CONCLUSIONS: We have developed the first high quality EST resource for black spruce and identified 493 novel transcripts, which may be species-specific related to life history and ecological traits. We have also identified full-length genes and microsatellite-containing ESTs. Based on EST sequence similarities, black spruce showed close evolutionary relationships with congeneric Picea glauca and Picea sitchensis compared to other Pinaceae members and angiosperms. The EST sequences reported here provide an important resource for genome annotation, functional and comparative genomics, molecular breeding, conservation and management studies and applications in black spruce and related conifer species.


Expressed Sequence Tags/metabolism , Genomics , Molecular Sequence Annotation/methods , Picea/genetics , Base Sequence , Conserved Sequence/genetics , Contig Mapping , DNA, Complementary/genetics , Databases, Protein , Evolution, Molecular , Gene Expression Regulation, Plant , Gene Ontology , Genes, Plant/genetics , Genetic Association Studies , Molecular Sequence Data , Multigene Family/genetics , Peptides/genetics , Pinus/genetics , RNA, Messenger/genetics , RNA, Messenger/metabolism , Sequence Homology, Nucleic Acid
8.
Genome Res ; 23(6): 988-97, 2013 Jun.
Article En | MEDLINE | ID: mdl-23590861

To evaluate the effect of CG methylation on DNA binding of sequence-specific B-ZIP transcription factors (TFs) in a high-throughput manner, we enzymatically methylated the cytosine in the CG dinucleotide on protein binding microarrays. Two Agilent DNA array designs were used. One contained 40,000 features using de Bruijn sequences where each 8-mer occurs 32 times in various positions in the DNA sequence. The second contained 180,000 features with each CG containing 8-mer occurring three times. The first design was better for identification of binding motifs, while the second was better for quantification. Using this novel technology, we show that CG methylation enhanced binding for CEBPA and CEBPB and inhibited binding for CREB, ATF4, JUN, JUND, CEBPD, and CEBPG. The CEBPB|ATF4 heterodimer bound a novel motif CGAT|GCAA 10-fold better when methylated. The electrophoretic mobility shift assay (EMSA) confirmed these results. CEBPB ChIP-seq data using primary female mouse dermal fibroblasts with 50× methylome coverage for each strand indicate that the methylated sequences well-bound on the arrays are also bound in vivo. CEBPB bound 39% of the methylated canonical 10-mers ATTGC|GCAAT in the mouse genome. After ATF4 protein induction by thapsigargin which results in ER stress, CEBPB binds methylated CGAT|GCAA in vivo, recapitulating what was observed on the arrays. This methodology can be used to identify new methylated DNA sequences preferentially bound by TFs, which may be functional in vivo.


Activating Transcription Factor 4/metabolism , CCAAT-Enhancer-Binding Protein-beta/metabolism , CpG Islands , DNA Methylation , Activating Transcription Factor 4/chemistry , Animals , Base Sequence , Binding Sites , CCAAT-Enhancer-Binding Protein-beta/chemistry , Female , Fibroblasts , Mice , Nucleotide Motifs , Position-Specific Scoring Matrices , Protein Binding/drug effects , Protein Multimerization , Thapsigargin/immunology , Transcription Factors/metabolism
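
The first array design in the abstract above relies on de Bruijn sequences to cover every 8-mer. As an illustration of the underlying idea only, the sketch below builds a cyclic de Bruijn sequence over the DNA alphabet with the standard Lyndon-word construction, in which every k-mer occurs exactly once. How the authors split such a sequence into probes, and how they reached 32 occurrences per 8-mer, is not described in the abstract and is not modeled here; the function name is hypothetical.

```python
def de_bruijn(alphabet, k):
    """Return a cyclic de Bruijn sequence containing every k-mer over `alphabet` exactly once."""
    n = len(alphabet)
    a = [0] * (n * k)
    sequence = []

    def db(t, p):
        # Standard recursive generation of Lyndon words whose length divides k.
        if t > k:
            if k % p == 0:
                sequence.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, n):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return "".join(alphabet[i] for i in sequence)


# Small k for readability; k=8 would give a sequence of 4**8 = 65,536 bases.
seq = de_bruijn("ACGT", 3)
# Append the first k-1 bases so that the wrap-around k-mers can be read linearly.
linear = seq + seq[:2]
kmers = [linear[i:i + 3] for i in range(len(seq))]
print(len(seq), len(set(kmers)))
```

Reading the sequence cyclically is what guarantees full coverage; a linear array probe set would need the wrap-around bases appended, as done for `linear` here.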
9.
G3 (Bethesda) ; 2(10): 1243-56, 2012 Oct.
Article En | MEDLINE | ID: mdl-23050235

Previously, we identified 8-bps long DNA sequences (8-mers) that localize in human proximal promoters and grouped them into known transcription factor binding sites (TFBS). We now examine split 8-mers consisting of two 4-mers separated by 1-bp to 30-bps (X(4)-N(1-30)-X(4)) to identify pairs of TFBS that localize in proximal promoters at a precise distance. These include two overlapping TFBS: the ETS⇔ETS motif ((C/G)CCGGAAGCGGAA) and the ETS⇔CRE motif ((C/G)CGGAAGTGACGTCAC). The nucleotides in bold are part of both TFBS. Molecular modeling shows that the ETS⇔CRE motif can be bound simultaneously by both the ETS and the B-ZIP domains without protein-protein clashes. The electrophoretic mobility shift assay (EMSA) shows that the ETS protein GABPα and the B-ZIP protein CREB preferentially bind to the ETS⇔CRE motif only when the two TFBS overlap precisely. In contrast, the ETS domain of ETV5 and CREB interfere with each other for binding the ETS⇔CRE. The 11-mer (CGGAAGTGACG), the conserved part of the ETS⇔CRE motif, occurs 226 times in the human genome and 83% are in known regulatory regions. In vivo GABPα and CREB ChIP-seq peaks identified the ETS⇔CRE as the most enriched motif occurring in promoters of genes involved in mRNA processing, cellular catabolic processes, and stress response, suggesting that a specific class of genes is regulated by this composite motif.


Cyclic AMP Response Element-Binding Protein/metabolism , GA-Binding Protein Transcription Factor/metabolism , Nucleotide Motifs , Promoter Regions, Genetic , Proto-Oncogene Proteins c-ets/genetics , Animals , Base Sequence , Binding Sites , Conserved Sequence , Cyclic AMP Response Element-Binding Protein/chemistry , DNA Methylation , GA-Binding Protein Transcription Factor/chemistry , Humans , Mice , Molecular Docking Simulation , Nucleic Acid Conformation , Protein Binding , Protein Conformation , Proto-Oncogene Proteins c-ets/chemistry
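
The split 8-mer scan described in the abstract above, X(4)-N(1-30)-X(4), can be sketched as a simple sliding-window count. This is a minimal illustration assuming plain exact counting on one strand; the function name and the toy sequence are hypothetical, and the paper's actual statistics for promoter localization are not reproduced.

```python
from collections import Counter


def count_split_kmers(sequence, half=4, min_gap=1, max_gap=30):
    """Count (left half, gap length, right half) triples for every window in `sequence`."""
    seq = sequence.upper()
    counts = Counter()
    for gap in range(min_gap, max_gap + 1):
        span = 2 * half + gap
        for start in range(len(seq) - span + 1):
            left = seq[start:start + half]
            right = seq[start + half + gap:start + span]
            counts[(left, gap, right)] += 1
    return counts


# Toy sequence, for illustration only: two 4-mers separated by a single base.
toy_counts = count_split_kmers("AAAACTTTT")
print(toy_counts[("AAAA", 1, "TTTT")])
```

In a real analysis the counts for each triple would be compared between proximal promoters and background sequence to find pairs of sites that co-occur at a fixed spacing.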
10.
BMC Genomics ; 11: 515, 2010 Sep 24.
Article En | MEDLINE | ID: mdl-20868486

BACKGROUND: Genetic maps provide an important genomic resource for understanding genome organization and evolution, comparative genomics, mapping genes and quantitative trait loci, and associating genomic segments with phenotypic traits. Spruce (Picea) genomics work is quite challenging, mainly because of extremely large size and highly repetitive nature of its genome, unsequenced and poorly understood genome, and the general lack of advanced-generation pedigrees. Our goal was to construct a high-density genetic linkage map of black spruce (Picea mariana, 2n = 24), which is a predominant, transcontinental species of the North American boreal and temperate forests, with high ecological and economic importance. RESULTS: We have developed a near-saturated and complete genetic linkage map of black spruce using a three-generation outbred pedigree and amplified fragment length polymorphism (AFLP), selectively amplified microsatellite polymorphic loci (SAMPL), expressed sequence tag polymorphism (ESTP), and microsatellite (mostly cDNA based) markers. Maternal, paternal, and consensus genetic linkage maps were constructed. The maternal, paternal, and consensus maps in our study consistently coalesced into 12 linkage groups, corresponding to the haploid chromosome number (1n = 1x = 12) of 12 in the genus Picea. The maternal map had 816 and the paternal map 743 markers distributed over 12 linkage groups each. The consensus map consisted of 1,111 markers distributed over 12 linkage groups, and covered almost the entire (> 97%) black spruce genome. The mapped markers included 809 AFLPs, 255 SAMPL, 42 microsatellites, and 5 ESTPs. Total estimated length of the genetic map was 1,770 cM, with an average of one marker every 1.6 cM. The maternal, paternal and consensus genetic maps aligned almost perfectly. CONCLUSION: We have constructed the first high density to near-saturated genetic linkage map of black spruce, with greater than 97% genome coverage. 
Also, this is the first genetic map based on a three-generation outbred pedigree in the genus Picea. The genome length in P. mariana is likely to be about 1,800 cM. The genetic maps developed in our study can serve as a reference map for various genomics studies and applications in Picea and Pinaceae.


Chromosome Mapping , Genetic Linkage , Picea/genetics , Amplified Fragment Length Polymorphism Analysis , DNA, Plant/genetics , Genetic Markers , Genome, Plant/genetics , Microsatellite Repeats/genetics , Poisson Distribution
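
The consensus-map figures quoted in the abstract above can be checked with a few lines of arithmetic. The sketch below uses only numbers stated in the abstract; the variable names are illustrative.

```python
# Marker counts by type on the consensus map, as stated in the abstract.
marker_counts = {"AFLP": 809, "SAMPL": 255, "microsatellite": 42, "ESTP": 5}
map_length_cm = 1770  # total estimated map length in centimorgans

total_markers = sum(marker_counts.values())
average_spacing_cm = map_length_cm / total_markers

print(total_markers, round(average_spacing_cm, 1))
```

The marker types sum to the 1,111 markers reported for the consensus map, and the average spacing rounds to the stated one marker every 1.6 cM.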
...