ABSTRACT
Considerable efforts have been made to develop technologies for selection of peptidic molecules that act as substrates or binders to a protein of interest. Here we demonstrate the combination of rational peptide array library design, parallel screening and stepwise evolution to discover novel peptide hotspots. These hotspots can be systematically evolved to create high-affinity, high-specificity binding peptides to a protein target in a reproducible and digitally controlled process. The method can be applied to synthesize both linear and cyclic peptides, as well as peptides composed of natural and non-natural amino acid analogs, thereby enabling screens in a much more diverse chemical space. We apply this method to evolve, in a stepwise manner, peptide binders to streptavidin, a protein studied for over two decades, and report novel peptides that mimic key interactions of biotin with streptavidin.
Subjects
Peptide Library , Peptides/metabolism , Streptavidin/metabolism , Amino Acid Sequence , Binding Sites , Molecular Docking Simulation , Peptides/chemistry , Cyclic Peptides/chemistry , Cyclic Peptides/metabolism , Protein Binding , Proteins/chemistry , Proteins/metabolism , Streptavidin/chemistry
ABSTRACT
Microbial transglutaminases (MTGs) catalyze the formation of Gln-Lys isopeptide bonds and are widely used for the cross-linking of proteins and peptides in food and biotechnological applications (e.g. to improve the texture of protein-rich foods or in generating antibody-drug conjugates). Currently used MTGs have low substrate specificity, impeding their biotechnological use as enzymes that do not cross-react with nontarget substrates (i.e. as bio-orthogonal labeling systems). Here, we report the discovery of an MTG from Kutzneria albida (KalbTG), which exhibited no cross-reactivity with known MTG substrates or commonly used target proteins, such as antibodies. KalbTG was produced in Escherichia coli as soluble and active enzyme in the presence of its natural inhibitor ammonium to prevent potentially toxic cross-linking activity. The crystal structure of KalbTG revealed a conserved core similar to other MTGs but very short surface loops, making it the smallest MTG characterized to date. Ultra-dense peptide array technology involving a pool of 1.4 million unique peptides identified specific recognition motifs for KalbTG in these peptides. We determined that the motifs YRYRQ and RYESK are the best Gln and Lys substrates of KalbTG, respectively. By first reacting a bifunctionalized peptide with the more specific KalbTG and in a second step with the less specific MTG from Streptomyces mobaraensis, a successful bio-orthogonal labeling system was demonstrated. Fusing the KalbTG recognition motif to an antibody allowed for site-specific and ratio-controlled labeling using low label excess. Its site specificity, favorable kinetics, ease of use, and cost-effective production render KalbTG an attractive tool for a broad range of applications, including production of therapeutic antibody-drug conjugates.
Subjects
Actinomycetales/enzymology , Proteins/chemistry , Proteins/metabolism , Transglutaminases/metabolism , Binding Sites , Molecular Models , Peptides/chemistry , Peptides/metabolism , Protein Conformation , Staining and Labeling , Substrate Specificity , Transglutaminases/chemistry
ABSTRACT
Although the locations of promoters and enhancers have been identified in several cell types, we still have limited information on their connectivity. We developed HiCap, which combines a 4-cutter restriction enzyme Hi-C with sequence capture of promoter regions. Applying the method to mouse embryonic stem cells, we identified promoter-anchored interactions involving 15,905 promoters and 71,984 distal regions. The distal regions were enriched for enhancer marks and transcription, and had a mean fragment size of only 699 bp--close to single-enhancer resolution. High-resolution maps of promoter-anchored interactions with HiCap will be important for detailed characterizations of chromatin interaction landscapes.
Subjects
Chromatin/chemistry , Genetic Enhancer Elements , Genomics/methods , Genetic Promoter Regions , Animals , Chromosome Mapping , Gene Expression , Gene Regulatory Networks , Mice , Transcription Factors/metabolism
ABSTRACT
Antibodies are of importance for the field of proteomics, both as reagents for imaging cells, tissues, and organs and as capturing agents for affinity enrichment in mass-spectrometry-based techniques. It is important to gain basic insights regarding the binding sites (epitopes) of antibodies and potential cross-reactivity to nontarget proteins. Knowledge about an antibody's linear epitopes is also useful in, for instance, developing assays involving the capture of peptides obtained from trypsin cleavage of samples prior to mass spectrometry analysis. Here, we describe, for the first time, the design and use of peptide arrays covering all human proteins for the analysis of antibody specificity, based on parallel in situ photolithic synthesis of a total of 2.1 million overlapping peptides. This has allowed analysis of on- and off-target binding of both monoclonal and polyclonal antibodies, complemented with precise mapping of epitopes based on full amino acid substitution scans. The analysis suggests that linear epitopes are relatively short, confined to five to seven residues, resulting in apparent off-target binding to peptides corresponding to a large number of unrelated human proteins. However, subsequent analysis using recombinant proteins suggests that these linear epitopes have a strict conformational component, thus giving us new insights regarding how antibodies bind to their antigens.
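The two library designs named in the abstract above, overlapping peptides and full amino acid substitution scans, can be illustrated with a minimal sketch. The function names, the peptide length, and the step size are hypothetical choices made for illustration; the abstract does not state the values used on the arrays.

```python
# Standard 20 amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def tile_peptides(protein: str, length: int, step: int) -> list[str]:
    """Return overlapping peptides of a fixed length, shifted by `step` residues.

    `length` and `step` are illustrative parameters, not values from the study.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    return [protein[i:i + length]
            for i in range(0, len(protein) - length + 1, step)]


def substitution_scan(peptide: str) -> list[str]:
    """Return every single-residue variant of `peptide` (19 per position)."""
    variants = []
    for pos, original in enumerate(peptide):
        for aa in AMINO_ACIDS:
            if aa != original:
                variants.append(peptide[:pos] + aa + peptide[pos + 1:])
    return variants
```

A peptide of n residues yields 19 × n single-substitution variants, so comparing binding across them indicates which positions tolerate replacement and which do not.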
Subjects
Antibodies/genetics , Epitope Mapping/methods , Peptide Biosynthesis/genetics , Proteome , Amino Acid Sequence , Antibodies/immunology , Binding Sites , Epitopes/genetics , Epitopes/immunology , Humans , Mass Spectrometry , Peptide Biosynthesis/immunology , Trypsin
ABSTRACT
SmMAK16 from the trematode Schistosoma mansoni is a protein that is known to localize in the nucleolus. Recent findings show that SmMAK16 is involved in 60S ribosomal subunit synthesis. Although the SmMAK16 protein contains putative nuclear localization signals (NLS), little is known about their precise function, redundancy or regulation. The goal of the current study was to identify and characterize the presence and functional regulation of the localization signals in SmMAK16. The SmMAK16 coding sequence and specific fragments were individually cloned in-frame into the pEGFP-C2 expression vector to encode Green Fluorescent Protein (GFP) fusion proteins. Constructs were individually transfected into COS-7 cells and fluorescent microscopy used to determine the cellular location and thus the presence of signals regulating nuclear and nucleolar localization. SmMAK16 was found to contain two NLSs and one nucleolar localization signal (NoLS). One of the signals contains a sequence identical to an established nucleolar detention signal that reportedly functions only under acidic cellular conditions. The localization of the SmMAK16-GFP constructs was analyzed under acidic conditions; however, altering pH did not influence the localization of SmMAK16. It has been previously reported that casein kinase 2 (CK2) can phosphorylate SmMAK16 at serines adjacent to one of the NLSs. One of these CK2 sites and the adjacent NLS are conserved with that of the SV40 Large T Antigen (LTA) and phosphorylation of this site in the SV40 LTA regulates the kinetics of the NLS. To discover if kinetic regulation also occurs in SmMAK16, mutant and wild type SmMAK16-GFP proteins were purified and injected into individual COS-7 cells. No difference in the rate of transport was found between wt and mutant SmMAK16 proteins. 
Therefore, SmMAK16 localizes to the nucleolus using three separate signals, two NLSs and one NoLS; however, these signals appear to function independently of pH and phosphorylation by CK2.
Subjects
Helminth Proteins/genetics , Helminth Proteins/metabolism , Nuclear Localization Signals , Schistosoma mansoni/genetics , Schistosoma mansoni/metabolism , Animals , COS Cells , Casein Kinase II/metabolism , Cell Nucleus/chemistry , Chlorocebus aethiops , DNA Mutational Analysis , Reporter Genes , Green Fluorescent Proteins/analysis , Green Fluorescent Proteins/genetics , Hydrogen-Ion Concentration , Fluorescence Microscopy , Phosphorylation , Recombinant Fusion Proteins/analysis , Recombinant Fusion Proteins/genetics
ABSTRACT
BACKGROUND: Enrichment of loci by DNA hybridization-capture, followed by high-throughput sequencing, is an important tool in modern genetics. Currently, the most common targets for enrichment are the protein coding exons represented by the consensus coding DNA sequence (CCDS). The CCDS, however, excludes many actual or computationally predicted coding exons present in other databases, such as RefSeq and Vega, and non-coding functional elements such as untranslated and regulatory regions. The number of variants per base pair (variant density) and our ability to interrogate regions outside of the CCDS regions is consequently less well understood. RESULTS: We examine capture sequence data from outside of the CCDS regions and find that extremes of GC content that are present in different subregions of the genome can reduce the local capture sequence coverage to less than 50% relative to the CCDS. This effect is due to biases inherent in both the Illumina and SOLiD sequencing platforms that are exacerbated by the capture process. Interestingly, for two subregion types, microRNA and predicted exons, the capture process yields higher than expected coverage when compared to whole genome sequencing. Lastly, we examine the variation present in non-CCDS regions and find that predicted exons, as well as exonic regions specific to RefSeq and Vega, show much higher variant densities than the CCDS. CONCLUSIONS: We show that regions outside of the CCDS perform less efficiently in capture sequence experiments. Further, we show that the variant density in computationally predicted exons is more than 2.5-times higher than that observed in the CCDS.
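The three quantities this abstract relies on (GC content, coverage relative to the CCDS, and variant density) are simple ratios. The sketch below shows one way to compute them; the function names and any input values are hypothetical, and the study's own pipeline is not described in the abstract.

```python
def gc_content(sequence: str) -> float:
    """Fraction of G and C bases in a DNA sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def relative_coverage(region_depth: float, reference_depth: float) -> float:
    """Mean depth of a region expressed relative to a reference set (e.g. CCDS)."""
    if reference_depth <= 0:
        raise ValueError("reference_depth must be positive")
    return region_depth / reference_depth


def variant_density(n_variants: int, n_bases: int) -> float:
    """Number of variants per base pair."""
    if n_bases <= 0:
        raise ValueError("n_bases must be positive")
    return n_variants / n_bases
```

Under these definitions, a subregion with relative coverage below 0.5 corresponds to the "less than 50% relative to the CCDS" case mentioned in the abstract, and the ratio of two variant densities gives a fold difference such as the reported "more than 2.5-times higher".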
Subjects
Consensus Sequence , Exome , Exons , Open Reading Frames/genetics , DNA Sequence Analysis , Alleles , Computational Biology , Gene Frequency , Human Genome , Humans , Introns , Mutation Rate , Single Nucleotide Polymorphism
ABSTRACT
Measurements of serum prostate-specific antigen (PSA) protein levels form the basis for a widely used test to screen men for prostate cancer. Germline variants in the gene that encodes the PSA protein (KLK3) have been shown to be associated with both serum PSA levels and prostate cancer. Based on a resequencing analysis of a 56 kb region on chromosome 19q13.33, centered on the KLK3 gene, we fine-mapped this locus by genotyping tag SNPs in 3,522 prostate cancer cases and 3,338 controls from five case-control studies. We did not observe a strong association with the KLK3 variant reported in previous studies to confer risk for prostate cancer (rs2735839; P = 0.20), but did observe three highly correlated SNPs (rs17632542, rs62113212 and rs62113214) associated with prostate cancer [P = 3.41 × 10^-4, per-allele trend odds ratio (OR) = 0.77, 95% CI = 0.67-0.89]. The signal was apparent only for nonaggressive prostate cancer cases with Gleason score <7 and disease stage
Subjects
Human Chromosome Pair 19 , Genetic Predisposition to Disease , Kallikreins/genetics , Prostate-Specific Antigen/biosynthesis , Prostatic Neoplasms/genetics , Case-Control Studies , Chromosome Mapping , Germ-Line Mutation , Humans , Male , Single Nucleotide Polymorphism
ABSTRACT
Soybean (Glycine max) is a self-pollinating species that has relatively low nucleotide polymorphism rates compared with other crop species. Despite the low rate of nucleotide polymorphisms, a wide range of heritable phenotypic variation exists. There is even evidence for heritable phenotypic variation among individuals within some cultivars. Williams 82, the soybean cultivar used to produce the reference genome sequence, was derived from backcrossing a Phytophthora root rot resistance locus from the donor parent Kingwa into the recurrent parent Williams. To explore the genetic basis of intracultivar variation, we investigated the nucleotide, structural, and gene content variation of different Williams 82 individuals. Williams 82 individuals exhibited variation in the number and size of introgressed Kingwa loci. In these regions of genomic heterogeneity, the reference Williams 82 genome sequence consists of a mosaic of Williams and Kingwa haplotypes. Genomic structural variation between Williams and Kingwa was maintained between the Williams 82 individuals within the regions of heterogeneity. Additionally, the regions of heterogeneity exhibited gene content differences between Williams 82 individuals. These findings show that genetic heterogeneity in Williams 82 primarily originated from the differential segregation of polymorphic chromosomal regions following the backcross and single-seed descent generations of the breeding process. We conclude that soybean haplotypes can possess a high rate of structural and gene content variation, and the impact of intracultivar genetic heterogeneity may be significant. This detailed characterization will be useful for interpreting soybean genomic data sets and highlights important considerations for research communities that are developing or utilizing a reference genome sequence.
Subjects
Genetic Variation , Plant Genome , Glycine max/genetics , Comparative Genomic Hybridization , Plant DNA/genetics , Haplotypes , Inbreeding , Oligonucleotide Array Sequence Analysis , Single Nucleotide Polymorphism , DNA Sequence Analysis
ABSTRACT
Massively parallel DNA sequencing technologies have greatly increased our ability to generate large amounts of sequencing data at a rapid pace. Several methods have been developed to enrich for genomic regions of interest for targeted sequencing. We have compared three of these methods: Molecular Inversion Probes (MIP), Solution Hybrid Selection (SHS), and Microarray-based Genomic Selection (MGS). Using HapMap DNA samples, we compared each of these methods with respect to their ability to capture an identical set of exons and evolutionarily conserved regions associated with 528 genes (2.61 Mb). For sequence analysis, we developed and used a novel Bayesian genotype-assigning algorithm, Most Probable Genotype (MPG). All three capture methods were effective, but sensitivities (percentage of targeted bases associated with high-quality genotypes) varied for an equivalent amount of pass-filtered sequence: for example, 70% (MIP), 84% (SHS), and 91% (MGS) for 400 Mb. In contrast, all methods yielded similar accuracies of >99.84% when compared to Infinium 1M SNP BeadChip-derived genotypes and >99.998% when compared to 30-fold coverage whole-genome shotgun sequencing data. We also observed a low false-positive rate with all three methods; of the heterozygous positions identified by each of the capture methods, >99.57% agreed with 1M SNP BeadChip, and >98.840% agreed with the whole-genome shotgun data. In addition, we successfully piloted the genomic enrichment of a set of 12 pooled samples via the MGS method using molecular bar codes. We find that these three genomic enrichment methods are highly accurate and practical, with sensitivities comparable to that of 30-fold coverage whole-genome shotgun data.
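To make the idea of Bayesian genotype assignment concrete, here is a minimal sketch of a three-genotype caller based on read counts. It is not the authors' MPG algorithm, whose details the abstract does not give. The binomial read model, the error rate, and the prior values are all assumptions chosen for illustration. The `sensitivity` helper follows the abstract's definition: the percentage of targeted bases associated with high-quality genotypes.

```python
from math import comb

GENOTYPES = ("ref/ref", "ref/alt", "alt/alt")


def genotype_posteriors(ref_reads: int, alt_reads: int,
                        error_rate: float = 0.01,
                        priors: tuple = (0.998, 0.001, 0.001)) -> dict:
    """Posterior probability of each diploid genotype given allele read counts.

    Assumed model (hypothetical, not from the paper): each read shows the
    alternate allele with probability error_rate, 0.5, or 1 - error_rate
    under ref/ref, ref/alt, and alt/alt respectively.
    """
    n = ref_reads + alt_reads
    alt_prob = (error_rate, 0.5, 1.0 - error_rate)
    likelihoods = [comb(n, alt_reads) * p ** alt_reads * (1 - p) ** ref_reads
                   for p in alt_prob]
    joint = [lik * prior for lik, prior in zip(likelihoods, priors)]
    total = sum(joint)
    return {g: j / total for g, j in zip(GENOTYPES, joint)}


def call_genotype(ref_reads: int, alt_reads: int) -> str:
    """Return the genotype with the highest posterior probability."""
    posteriors = genotype_posteriors(ref_reads, alt_reads)
    return max(posteriors, key=posteriors.get)


def sensitivity(n_high_quality_bases: int, n_targeted_bases: int) -> float:
    """Percentage of targeted bases that received a high-quality genotype."""
    if n_targeted_bases <= 0:
        raise ValueError("n_targeted_bases must be positive")
    return 100.0 * n_high_quality_bases / n_targeted_bases
```

In practice a caller of this kind would also report the posterior of the winning genotype, so that low-confidence positions can be excluded from the "high-quality" count that feeds the sensitivity figure.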
Subjects
Type 2 Diabetes Mellitus/genetics , Human Genome , Oligonucleotide Array Sequence Analysis/methods , DNA Sequence Analysis/methods , Algorithms , Bayes Theorem , DNA/genetics , DNA Probes/genetics , Exons , Genotype , Humans , Reproducibility of Results , Sensitivity and Specificity
ABSTRACT
We have developed a solution-based method for targeted DNA capture-sequencing that is directed to the complete human exome. This approach allows the discovery of greater than 95% of all expected heterozygous single-base variants, requires as little as 3 Gbp of raw sequence data and constitutes an effective tool for identifying rare coding alleles in large-scale genomic studies.
Subjects
Base Pairing/genetics , Nucleic Acid Databases , Exons/genetics , DNA Sequence Analysis/methods , Gene Library , Haplotypes/genetics , Humans , Single Nucleotide Polymorphism/genetics , Reproducibility of Results , Sequence Alignment , Solutions
ABSTRACT
Sequence capture technologies, pioneered in mammalian genomes, enable the resequencing of targeted genomic regions. Most capture protocols require blocking DNA, the production of which in large quantities can prove challenging. A blocker-free, two-stage capture protocol was developed using NimbleGen arrays. The first capture depletes the library of repetitive sequences, while the second enriches for target loci. This strategy was used to resequence non-repetitive portions of an approximately 2.2 Mb chromosomal interval and a set of 43 genes dispersed in the 2.3 Gb maize genome. This approach achieved approximately 1800-3000-fold enrichment and 80-98% coverage of targeted bases. More than 2500 SNPs were identified in target genes. Low rates of false-positive SNP predictions were obtained, even in the presence of captured paralogous sequences. Importantly, it was possible to recover novel sequences from non-reference alleles. The ability to design novel repeat-subtraction and target capture arrays makes this technology accessible in any species.
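Fold enrichment, as reported above, is commonly computed as the observed on-target fraction divided by the fraction expected from genome size alone. The sketch below assumes that definition; the abstract does not state the exact formula used. The 2.3 Gb genome size comes from the abstract, while the target size and on-target fraction in the usage note are hypothetical.

```python
def fold_enrichment(on_target_fraction: float, target_bp: int, genome_bp: int) -> float:
    """Observed on-target fraction divided by the fraction expected by chance.

    Without enrichment, the expected on-target fraction is target_bp / genome_bp.
    """
    if not 0.0 <= on_target_fraction <= 1.0:
        raise ValueError("on_target_fraction must be between 0 and 1")
    if target_bp <= 0 or genome_bp <= 0 or target_bp > genome_bp:
        raise ValueError("need 0 < target_bp <= genome_bp")
    expected_fraction = target_bp / genome_bp
    return on_target_fraction / expected_fraction
```

For example, with a hypothetical 100 kb non-repetitive target in the 2.3 Gb genome, an on-target fraction of 0.1 corresponds to roughly 2300-fold enrichment, which falls within the range the abstract reports.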
Subjects
Plant Genome , Oligonucleotide Array Sequence Analysis/methods , DNA Sequence Analysis/methods , Comparative Genomic Hybridization , Plant DNA/genetics , Plant Genes , Single Nucleotide Polymorphism , Zea mays/genetics
ABSTRACT
Many disease-associated variants identified by genome-wide association (GWA) studies are expected to regulate gene expression. Allele-specific expression (ASE) quantifies transcription from both haplotypes using individuals heterozygous at tested SNPs. We performed deep human transcriptome-wide resequencing (RNA-seq) for ASE analysis and expression quantitative trait locus discovery. We resequenced double poly(A)-selected RNA from primary CD4+ T cells (n = 4 individuals, both activated and untreated conditions) and developed tools for paired-end RNA-seq alignment and ASE analysis. We generated an average of 20 million uniquely mapping 45-base reads per sample. We obtained sufficient read depth to test 1371 unique transcripts for ASE. Multiple biases inflate the false discovery rate, which we estimate to be approximately 50% for random SNPs. However, after controlling for these biases and considering the subset of SNPs that pass HapMap QC, 4.6% of heterozygous SNP-sample pairs show evidence of imbalance (P < 0.001). We validated four findings by both bacterial cloning and Sanger sequencing assays. We also found convincing evidence for allelic imbalance at multiple reporter exonic SNPs in CD6 for two samples heterozygous at the multiple sclerosis-associated variant rs17824933, linking GWA findings with variation in gene expression. Finally, we show in CD4+ T cells from a further individual that high-throughput sequencing of genomic DNA and RNA-seq following enrichment for targeted gene sequences by sequence capture methods offers an unbiased means to increase the read depth for transcripts of interest, and therefore a method to investigate the regulatory role of many disease-associated genetic variants.
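One common way to test a heterozygous SNP for allelic imbalance is an exact two-sided binomial test of the reference and alternate read counts against an expected 50:50 ratio. The sketch below assumes that test; the abstract does not say which statistic the authors used, and it does not model the mapping biases they describe. The function name is hypothetical.

```python
from math import comb


def allelic_imbalance_p(ref_reads: int, alt_reads: int) -> float:
    """Exact two-sided binomial p-value against a 50:50 allelic ratio.

    Sums the probabilities of all outcomes that are no more likely than
    the observed one under Binomial(n, 0.5).
    """
    if ref_reads < 0 or alt_reads < 0:
        raise ValueError("read counts must be non-negative")
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0
    # Integer arithmetic in comb() and 2 ** n avoids underflow at high depth.
    probs = [comb(n, k) / 2 ** n for k in range(n + 1)]
    observed = probs[ref_reads]
    p_value = sum(p for p in probs if p <= observed * (1 + 1e-9))
    return min(1.0, p_value)
```

A SNP-sample pair would then be flagged when the p-value falls below a chosen threshold, such as the P < 0.001 cutoff quoted in the abstract, ideally after the bias controls the authors mention.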
Subjects
Allelic Imbalance/genetics , Gene Expression Profiling/methods , Genome-Wide Association Study , High-Throughput Screening Assays/methods , DNA Sequence Analysis/methods , Alleles , Base Pairing/genetics , Bias , Cultured Cells , Computational Biology , Disease/genetics , Genetic Epigenesis , False Positive Reactions , Genetic Loci/genetics , Heterozygote , Humans , Single Nucleotide Polymorphism/genetics , Messenger RNA/genetics , Messenger RNA/metabolism , Reproducibility of Results
ABSTRACT
Single nucleotide polymorphisms (SNPs) in the KLK3 gene on chromosome 19q13.33 are associated with serum prostate-specific antigen (PSA) levels. Recent genome-wide association studies of prostate cancer have yielded conflicting results for association of the same SNPs with prostate cancer risk. Since the KLK3 gene encodes the PSA protein that forms the basis for a widely used screening test for prostate cancer, it is critical to fully characterize genetic variation in this region and assess its relationship with the risk of prostate cancer. We have conducted a next-generation sequence analysis in 78 individuals of European ancestry to characterize common (minor allele frequency, MAF >1%) genetic variation in a 56 kb region on chromosome 19q13.33 centered on the KLK3 gene (chr19:56,019,829-56,076,043 bp). We identified 555 polymorphic loci in the process, including 116 novel SNPs and 182 novel insertion/deletion polymorphisms (indels). Based on tagging analysis, 144 loci are necessary to tag the region at an r^2 threshold of 0.8 and MAF of 1% or higher, while 86 loci are required to tag the region at an r^2 threshold of 0.8 and MAF >5%. Our sequence data augment coverage by 35 and 78% as compared to variants in dbSNP and HapMap, respectively. We observed six non-synonymous amino acid or frameshift changes in the KLK3 gene and three changes in each of the neighboring genes, KLK15 and KLK2. Our study has generated a detailed map of common genetic variation in the genomic region surrounding the KLK3 gene, which should be useful for fine-mapping the association signal as well as determining the contribution of this locus to prostate cancer risk and/or regulation of PSA expression.
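The tagging thresholds above are expressed in terms of the squared correlation between pairs of loci. As a minimal sketch, that statistic can be computed from phased haplotypes as follows. The input format (0/1 allele codes on phased haplotypes) and the function names are assumptions for illustration; the abstract does not describe the tagging software used.

```python
def r_squared(haplotypes: list[tuple[int, int]]) -> float:
    """Squared correlation between two biallelic loci.

    `haplotypes` holds one (allele_at_A, allele_at_B) pair per phased
    haplotype, with alleles coded 0 or 1.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / denominator


def is_tagged(haplotypes: list[tuple[int, int]], threshold: float = 0.8) -> bool:
    """True if locus B is captured by locus A at the given r^2 threshold."""
    return r_squared(haplotypes) >= threshold
```

A tag set is then any subset of loci such that every remaining locus passing the MAF filter has an r^2 of at least 0.8 with some member of the subset.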
Subjects
Human Chromosome Pair 19/genetics , Kallikreins/genetics , Single Nucleotide Polymorphism , Prostate-Specific Antigen/genetics , Tissue Kallikreins/genetics , Female , Gene Frequency , Haplotypes , Humans , INDEL Mutation , Linkage Disequilibrium , Male , Mutation , Prostatic Neoplasms/ethnology , Prostatic Neoplasms/genetics , DNA Sequence Analysis , White People/genetics
ABSTRACT
Forward genetics (phenotype-driven approaches) remain the primary source for allelic variants in the mouse. Unfortunately, the gap between observable phenotype and causative genotype limits the widespread use of spontaneous and induced mouse mutants. As alternatives to traditional positional cloning and mutation detection approaches, sequence capture and next-generation sequencing technologies can be used to rapidly sequence subsets of the genome. Application of these technologies to mutation detection efforts in the mouse has the potential to significantly reduce the time and resources required for mutation identification by abrogating the need for high-resolution genetic mapping, long-range PCR, and sequencing of individual PCR amplimers. As proof of principle, we used array-based sequence capture and pyrosequencing to sequence an allelic series from the classically defined Kit locus (approximately 200 kb) from each of five noncomplementing Kit mutants (one known allele and four unknown alleles) and have successfully identified and validated a nonsynonymous coding mutation for each allele. These data represent the first documentation and validation that these new technologies can be used to efficiently discover causative mutations. Importantly, these data also provide a specific methodological foundation for the development of large-scale mutation detection efforts in the laboratory mouse.
Subjects
DNA Mutational Analysis/methods , Mice/genetics , Mutation , Oligonucleotide Array Sequence Analysis/methods , Alleles , Amino Acid Sequence , Animals , Base Sequence , Female , Male , Inbred C57BL Mice , Inbred DBA Mice , Molecular Sequence Data , Sequence Alignment
ABSTRACT
We have generated extreme ionizing radiation resistance in a relatively sensitive bacterial species, Escherichia coli, by directed evolution. Four populations of Escherichia coli K-12 were derived independently from strain MG1655, with each specifically adapted to survive exposure to high doses of ionizing radiation. D(37) values for strains isolated from two of the populations approached that exhibited by Deinococcus radiodurans. Complete genomic sequencing was carried out on nine purified strains derived from these populations. Clear mutational patterns were observed that both pointed to key underlying mechanisms and guided further characterization of the strains. In these evolved populations, passive genomic protection is not in evidence. Instead, enhanced recombinational DNA repair makes a prominent but probably not exclusive contribution to genome reconstitution. Multiple genes, multiple alleles of some genes, multiple mechanisms, and multiple evolutionary pathways all play a role in the evolutionary acquisition of extreme radiation resistance. Several mutations in the recA gene and a deletion of the e14 prophage both demonstrably contribute to and partially explain the new phenotype. Mutations in additional components of the bacterial recombinational repair system and the replication restart primosome are also prominent, as are mutations in genes involved in cell division, protein turnover, and glutamate transport. At least some evolutionary pathways to extreme radiation resistance are constrained by the temporally ordered appearance of specific alleles.
Subjects
Directed Molecular Evolution , Escherichia coli/genetics , Escherichia coli/radiation effects , Ionizing Radiation , High Pressure Liquid Chromatography , Pulsed-Field Gel Electrophoresis , Escherichia coli/growth & development , Mutation , Phylogeny , Rec A Recombinases/genetics , Rec A Recombinases/physiology
ABSTRACT
BACKGROUND: The syphilis spirochete Treponema pallidum ssp. pallidum remains an enigmatic pathogen, since no virulence factors have been identified and the pathogenesis of the disease is poorly understood. Increasing rates of new syphilis cases per year have been observed recently. RESULTS: The genome of the SS14 strain was sequenced to high accuracy by an oligonucleotide array strategy requiring hybridization to only three arrays (Comparative Genome Sequencing, CGS). Gaps in the resulting sequence were filled with targeted dideoxy-terminator (DDT) sequencing and the sequence was confirmed by whole genome fingerprinting (WGF). When compared to the Nichols strain, 327 single nucleotide substitutions (224 transitions, 103 transversions), 14 deletions, and 18 insertions were found. On the proteome level, the highest frequency of amino acid-altering substitution polymorphisms was in novel genes, while the lowest was in housekeeping genes, as expected from their evolutionary conservation. Evidence was also found for hypervariable regions and multiple regions showing intrastrain heterogeneity in the T. pallidum chromosome. CONCLUSION: The observed genetic changes do not influence the ability of Treponema pallidum to cause syphilitic infection, since both SS14 and Nichols are virulent in rabbits. However, this is the first assessment of the degree of variation between the two syphilis pathogens and paves the way for phylogenetic studies of this fascinating organism.
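The split of substitutions into transitions and transversions follows from base chemistry: a transition exchanges two purines (A, G) or two pyrimidines (C, T), and a transversion exchanges a purine for a pyrimidine. A minimal sketch of that classification is shown below; the function name is hypothetical and the code is not taken from the study.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Classify a single-nucleotide substitution as a transition or transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("ref and alt are identical; not a substitution")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# Counts reported in the abstract: 224 transitions and 103 transversions.
TRANSITIONS, TRANSVERSIONS = 224, 103
TS_TV_RATIO = TRANSITIONS / TRANSVERSIONS  # about 2.17
```

The two reported counts sum to the 327 substitutions stated in the abstract, giving a transition-to-transversion ratio of about 2.17.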
Subjects
Bacterial Genome , Oligonucleotide Array Sequence Analysis/methods , Treponema pallidum/genetics , Animals , Chromosome Mapping , DNA Fingerprinting , Humans , Molecular Sequence Data , Open Reading Frames , Single Nucleotide Polymorphism , Rabbits , Reproducibility of Results , DNA Sequence Analysis , Syphilis/microbiology , Treponema pallidum/isolation & purification , Treponema pallidum/pathogenicity
ABSTRACT
We have developed an optimized array-based approach for customizable allele-specific gene expression (ASE) analysis. The central features of the approach are the ability to select SNPs at will for detection and the absence of any need to PCR-amplify the target. A surprisingly long probe length (39-49 nt) was needed for allelic discrimination. Reconstitution experiments demonstrate linearity of ASE over a broad range. Using this approach, we have discovered at least two novel imprinted genes: NLRP2, which encodes a member of the inflammasome, and OSBPL1A, which encodes a presumed oxysterol-binding protein, both of which were preferentially expressed from the maternal allele. In contrast, ERAP2, which encodes an aminopeptidase, did not show preferential parent-of-origin expression, but rather cis-acting nonimprinted differential allelic control. The approach is scalable to the whole genome and can be used for discovery of functional epigenetic modifications in patient samples.
Subjects
Alleles , Oligonucleotide Array Sequence Analysis/methods , Single Nucleotide Polymorphism , Signal Transducing Adaptor Proteins/genetics , Aminopeptidases/genetics , Apoptosis Regulatory Proteins , Carrier Proteins/genetics , Cell Line , Genomic Imprinting , Heterozygote , Humans , Steroid Receptors , Reproducibility of Results
ABSTRACT
Increasingly powerful sequencing technologies are ushering in an era of personal genome sequences and raising the possibility of using such information to guide medical decisions. Genome resequencing also promises to accelerate the identification of disease-associated mutations. Roughly 98% of the human genome is composed of repeats and intergenic or non-protein-coding sequences. Thus, it is crucial to focus resequencing on high-value genomic regions. Protein-coding exons represent one such type of high-value target. We have developed a method of using flexible, high-density microarrays to capture any desired fraction of the human genome, in this case corresponding to more than 200,000 protein-coding exons. Depending on the precise protocol, up to 55-85% of the captured fragments are associated with targeted regions and up to 98% of intended exons can be recovered. This methodology provides an adaptable route toward rapid and efficient resequencing of any sizeable, non-repeat portion of the human genome.
Subjects
Exons , Human Genome , DNA Sequence Analysis/methods , Humans , Oligonucleotide Array Sequence Analysis/methods , Oligonucleotides/genetics
ABSTRACT
We applied high-density microarrays to the enrichment of specific sequences from the human genome for high-throughput sequencing. After capture of 6,726 approximately 500-base 'exon' segments, and of 'locus-specific' regions ranging in size from 200 kb to 5 Mb, followed by sequencing on a 454 Life Sciences FLX sequencer, most sequence reads represented selection targets. These direct selection methods supersede multiplex PCR for the large-scale analysis of genomic regions.
Subjects
Human Genome/genetics , Oligonucleotide Array Sequence Analysis/methods , BRCA1 Protein/genetics , Tumor Cell Line , Gene Library , Humans , Nucleic Acid Hybridization/methods , Oligodeoxyribonucleotides/genetics , Polymerase Chain Reaction , Single Nucleotide Polymorphism , DNA Sequence Analysis/methods
ABSTRACT
We developed a general method, microarray-based genomic selection (MGS), capable of selecting and enriching targeted sequences from complex eukaryotic genomes without the repeat blocking steps necessary for bacterial artificial chromosome (BAC)-based genomic selection. We demonstrate that large human genomic regions, on the order of hundreds of kilobases, can be enriched and resequenced with resequencing arrays. MGS, when combined with a next-generation resequencing technology, can enable large-scale resequencing in single-investigator laboratories.