Results 1 - 7 of 7
1.
Virol J ; 9: 261, 2012 Nov 06.
Article in English | MEDLINE | ID: mdl-23131097

ABSTRACT

BACKGROUND: In a high-throughput environment, to PCR amplify and sequence a large set of viral isolates from populations that are potentially heterogeneous and continuously evolving, the use of degenerate PCR primers is an important strategy. Degenerate primers allow for the PCR amplification of a wider range of viral isolates with only one set of pre-mixed primers, thus increasing amplification success rates and minimizing the necessity for genome finishing activities. To successfully select a large set of degenerate PCR primers necessary to tile across an entire viral genome and maximize their success, this process is best performed computationally. RESULTS: We have developed a fully automated degenerate PCR primer design system that plays a key role in the J. Craig Venter Institute's (JCVI) high-throughput viral sequencing pipeline. A consensus viral genome, or a set of consensus segment sequences in the case of a segmented virus, is specified using IUPAC ambiguity codes in the consensus template sequence to represent the allelic diversity of the target population. PCR primer pairs are then selected computationally to produce a minimal amplicon set capable of tiling across the full length of the specified target region. As part of the tiling process, primer pairs are computationally screened to meet the criteria for successful PCR with one of two described amplification protocols. The actual sequencing success rates for designed primers for measles virus, mumps virus, human parainfluenza virus 1 and 3, human respiratory syncytial virus A and B and human metapneumovirus are described, where >90% of designed primer pairs were able to consistently successfully amplify >75% of the isolates. CONCLUSIONS: Augmenting our previously developed and published JCVI Primer Design Pipeline, we achieved similarly high sequencing success rates with only minor software modifications. 
The recommended methodology for the construction of the consensus sequence that encapsulates the allelic variation of the targeted population and is a key step prior to designing degenerate primers is also formally described.
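
The role of IUPAC ambiguity codes in the abstract above can be illustrated with a minimal sketch. It assumes gap-free, equal-length aligned sequences; the function names (`degeneracy`, `expand`, `consensus`) are hypothetical and are not taken from the JCVI pipeline.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct plain sequences a degenerate primer stands for."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer: str) -> list[str]:
    """Enumerate every plain sequence in the pre-mixed primer pool."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


def consensus(aligned: list[str]) -> str:
    """Column-wise IUPAC consensus of gap-free, equal-length sequences."""
    lookup = {frozenset(bases): code for code, bases in IUPAC.items()}
    columns = zip(*(seq.upper() for seq in aligned))
    return "".join(lookup[frozenset(column)] for column in columns)
```

A primer with high degeneracy dilutes each individual oligonucleotide in the mix, so a design system would plausibly cap this value; the abstract does not state the cap used.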


Subjects
DNA Primers/genetics, High-Throughput Nucleotide Sequencing/methods, Viruses/genetics, Viral Genome, Humans, Polymerase Chain Reaction, Viruses/isolation & purification
2.
Bioinformatics ; 24(24): 2818-24, 2008 Dec 15.
Article in English | MEDLINE | ID: mdl-18952627

ABSTRACT

MOTIVATION: DNA sequence reads from Sanger and pyrosequencing platforms differ in cost, accuracy, typical coverage, average read length and the variety of available paired-end protocols. Both read types can complement one another in a 'hybrid' approach to whole-genome shotgun sequencing projects, but assembly software must be modified to accommodate their different characteristics. This is true even of pyrosequencing mated and unmated read combinations. Without special modifications, assemblers tuned for homogeneous sequence data may perform poorly on hybrid data. RESULTS: Celera Assembler was modified for combinations of ABI 3730 and 454 FLX reads. The revised pipeline called CABOG (Celera Assembler with the Best Overlap Graph) is robust to homopolymer run length uncertainty, high read coverage and heterogeneous read lengths. In tests on four genomes, it generated the longest contigs among all assemblers tested. It exploited the mate constraints provided by paired-end reads from either platform to build larger contigs and scaffolds, which were validated by comparison to a finished reference sequence. A low rate of contig mis-assembly was detected in some CABOG assemblies, but this was reduced in the presence of sufficient mate pair data. AVAILABILITY: The software is freely available as open-source from http://wgs-assembler.sf.net under the GNU Public License.
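
The idea of keeping only the best overlap per read can be sketched as follows. This is a toy, not CABOG's implementation: it assumes error-free reads on one strand, exact suffix-prefix matches, and a single outgoing overlap per read. The names `overlap_len` and `best_overlaps` are hypothetical.

```python
def overlap_len(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def best_overlaps(reads: list[str], min_len: int = 3) -> dict[int, tuple[int, int]]:
    """Map each read index to (successor index, overlap length).

    Only the single longest outgoing overlap is kept per read, which
    prunes the all-against-all overlap graph to a much sparser one.
    """
    best: dict[int, tuple[int, int]] = {}
    for i, a in enumerate(reads):
        for j, b in enumerate(reads):
            if i == j:
                continue
            n = overlap_len(a, b, min_len)
            if n > 0 and (i not in best or n > best[i][1]):
                best[i] = (j, n)
    return best
```

Real reads contain errors, so a production assembler would score approximate alignments rather than exact string matches.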


Subjects
DNA Sequence Analysis/methods, Software, Computational Biology/methods, Genome, Genomics
3.
BMC Bioinformatics ; 9: 191, 2008 Apr 11.
Article in English | MEDLINE | ID: mdl-18405373

ABSTRACT

BACKGROUND: Polymerase chain reaction (PCR) is used in directed sequencing for the discovery of novel polymorphisms. As the first step in PCR directed sequencing, effective PCR primer design is crucial for obtaining high-quality sequence data for target regions. Since current computational primer design tools are not fully tuned with stable underlying laboratory protocols, researchers may still be forced to iteratively optimize protocols for failed amplifications after the primers have been ordered. Furthermore, potentially identifiable factors which contribute to PCR failures have yet to be elucidated. This inefficient approach to primer design is further intensified in a high-throughput laboratory, where hundreds of genes may be targeted in one experiment. RESULTS: We have developed a fully integrated computational PCR primer design pipeline that plays a key role in our high-throughput directed sequencing pipeline. Investigators may specify target regions defined through a rich set of descriptors, such as Ensembl accessions and arbitrary genomic coordinates. Primer pairs are then selected computationally to produce a minimal amplicon set capable of tiling across the specified target regions. As part of the tiling process, primer pairs are computationally screened to meet the criteria for success with one of two PCR amplification protocols. In the process of improving our sequencing success rate, which currently exceeds 95% for exons, we have discovered novel and accurate computational methods capable of identifying primers that may lead to PCR failures. We reveal the laboratory protocols and their associated, empirically determined computational parameters, as well as describe the novel computational methods which may benefit others in future primer design research. 
CONCLUSION: The high-throughput PCR primer design pipeline has been very successful in providing the basis for high-quality directed sequencing results and for minimizing costs associated with labor and reprocessing. The modular architecture of the primer design software has made it possible to readily integrate additional primer critique tests based on iterative feedback from the laboratory. As a result, the primer design software, coupled with the laboratory protocols, serves as a powerful tool for low and high-throughput primer design to enable successful directed sequencing.
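
Selecting a minimal amplicon set that tiles a target region resembles the classic minimum interval cover problem. The sketch below uses the standard greedy solution under simplifying assumptions: candidate amplicons are half-open `(start, end)` coordinate pairs that have already passed primer screening, and no minimum overlap between neighbours is enforced. The function name `tile` is hypothetical; the abstract does not describe the pipeline's actual selection algorithm.

```python
def tile(candidates: list[tuple[int, int]], start: int, end: int) -> list[tuple[int, int]]:
    """Greedy minimum cover of [start, end) by half-open candidate intervals."""
    ordered = sorted(candidates)
    chosen: list[tuple[int, int]] = []
    pos = start  # first position not yet covered
    i = 0
    while pos < end:
        best = None
        # Among amplicons starting at or before `pos`, keep the one reaching furthest.
        while i < len(ordered) and ordered[i][0] <= pos:
            if best is None or ordered[i][1] > best[1]:
                best = ordered[i]
            i += 1
        if best is None or best[1] <= pos:
            raise ValueError(f"no candidate amplicon covers position {pos}")
        chosen.append(best)
        pos = best[1]
    return chosen
```

In practice adjacent amplicons would need to overlap by at least the primer length plus a margin, because sequence quality is poor near primer sites; that constraint is omitted here.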


Subjects
Algorithms, DNA Primers/genetics, Polymerase Chain Reaction/methods, Sequence Alignment/methods, DNA Sequence Analysis/methods, Base Sequence, Molecular Sequence Data, Reproducibility of Results, Sensitivity and Specificity
4.
J Comput Biol ; 19(3): 279-92, 2012 Mar.
Article in English | MEDLINE | ID: mdl-22175250

ABSTRACT

Unchained base reads on self-assembling DNA nanoarrays have recently emerged as a promising approach to low-cost, high-quality resequencing of human genomes. Because of unique characteristics of these mated pair reads, existing computational methods for resequencing assembly, such as those based on map-consensus calling, are not adequate for accurate variant calling. We describe novel computational methods developed for accurate calling of SNPs and short substitutions and indels (<100 bp); the same methods apply to evaluation of hypothesized larger, structural variations. We use an optimization process that iteratively adjusts the genome sequence to maximize its a posteriori probability given the observed reads. For each candidate sequence, this probability is computed using Bayesian statistics with a simple read generation model and simplifying assumptions that make the problem computationally tractable. The optimization process iteratively applies one-base substitutions, insertions, and deletions until convergence is achieved to an optimum diploid sequence. A local de novo assembly procedure that generalizes approaches based on De Bruijn graphs is used to seed the optimization process in order to reduce the chance of converging to local optima. Finally, a correlation-based filter is applied to reduce the false positive rate caused by the presence of repetitive regions in the reference genome.
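
The core quantity in the abstract above, the posterior probability of a candidate diploid sequence given the reads, can be shown for a single site. This is a deliberately reduced sketch and not the paper's method: it assumes a uniform prior over the ten unordered genotypes, independent reads, equal sampling of both alleles, and a fixed per-base error rate split evenly over the three wrong bases. The function name and the default `error` value are illustrative.

```python
from itertools import combinations_with_replacement
from math import exp, log

BASES = "ACGT"


def genotype_posteriors(observed: str, error: float = 0.01) -> dict[str, float]:
    """Posterior over diploid genotypes at one site given the observed bases."""

    def p_base(obs: str, allele: str) -> float:
        # Simple read-generation model: correct with prob 1 - error,
        # otherwise one of the three other bases with equal probability.
        return 1 - error if obs == allele else error / 3

    log_like = {}
    for a, b in combinations_with_replacement(BASES, 2):
        # Each read comes from either allele with probability 1/2.
        log_like[a + b] = sum(
            log(0.5 * p_base(o, a) + 0.5 * p_base(o, b)) for o in observed
        )

    # Uniform prior, so the posterior is the normalised likelihood.
    top = max(log_like.values())
    weights = {g: exp(ll - top) for g, ll in log_like.items()}
    total = sum(weights.values())
    return {g: w / total for g, w in weights.items()}
```

An optimizer of the kind the abstract describes would evaluate such a probability for each proposed substitution, insertion, or deletion and keep the change only if the posterior increases.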


Subjects
Contig Mapping/methods, Human Genome, DNA Sequence Analysis/methods, Algorithms, Alleles, Base Sequence, Bayes Theorem, Chromosome Mapping, Computer Simulation, Statistical Data Interpretation, Humans, Genetic Models
5.
Methods Mol Biol ; 630: 271-99, 2010.
Article in English | MEDLINE | ID: mdl-20301004

ABSTRACT

Primer design is a crucial initial step in any experiment utilizing PCR to target and amplify a known nucleotide sequence of interest. Properly designed primers will increase PCR amplification efficiency as well as isolate the targeted sequence of interest with higher specificity. Many factors that may limit the success of a primer pair can be detected a priori with computational methods. For example, primer dimer detection, amplification of alternative products, stem loop interference, extreme melting temperatures, and genotype-specific variations in the target sequence can all be considered computationally to minimize subsequent PCR failures. The use of computational sequence analysis tools to select the best primer pair from the available candidates will not only reduce experimental rates of failure but also avoid the generation of misleading results arising from the amplification of alternative products.
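
Two of the a priori checks listed above, extreme melting temperature and primer dimer formation, can be approximated with very simple rules. The sketch below uses the Wallace rule (2 °C per A/T, 4 °C per G/C), which is only a rough estimate for short oligonucleotides, and a perfect-match test on the 3' end. The function names and the default window `k=4` are assumptions for illustration, not values from the chapter.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def wallace_tm(primer: str) -> int:
    """Rough melting temperature in degrees Celsius by the Wallace rule."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


def three_prime_dimer(p1: str, p2: str, k: int = 4) -> bool:
    """True if the last k bases of p1 can pair perfectly somewhere on p2.

    Both primers are written 5'->3', so the site on p2 that anneals to
    the 3' tail of p1 is the reverse complement of that tail.
    """
    tail = p1.upper()[-k:]
    return reverse_complement(tail) in p2.upper()
```

A fuller screen would also test each primer against itself (hairpins and self-dimers) and use a nearest-neighbour thermodynamic model for the melting temperature.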


Subjects
DNA Primers/genetics, Reverse Transcriptase Polymerase Chain Reaction/methods, DNA Sequence Analysis/methods, Amino Acid Sequence, Base Sequence, Computational Biology, DNA Primers/chemistry, Genetic Databases, Molecular Sequence Data, Nucleic Acid Conformation, Reverse Transcriptase Polymerase Chain Reaction/instrumentation, Sequence Alignment
6.
Science ; 327(5961): 78-81, 2010 Jan 01.
Article in English | MEDLINE | ID: mdl-19892942

ABSTRACT

Genome sequencing of large numbers of individuals promises to advance the understanding, treatment, and prevention of human diseases, among other applications. We describe a genome sequencing platform that achieves efficient imaging and low reagent consumption with combinatorial probe anchor ligation chemistry to independently assay each base from patterned nanoarrays of self-assembling DNA nanoballs. We sequenced three human genomes with this platform, generating an average of 45- to 87-fold coverage per genome and identifying 3.2 to 4.5 million sequence variants per genome. Validation of one genome data set demonstrates a sequence accuracy of about 1 false variant per 100 kilobases. The high accuracy, affordable cost of $4400 for sequencing consumables, and scalability of this platform enable complete human genome sequencing for the detection of rare variants in large-scale genetic studies.


Subjects
DNA/chemistry, Human Genome, Microarray Analysis, DNA Sequence Analysis/methods, Base Sequence, Computational Biology, Costs and Cost Analysis, DNA/genetics, Nucleic Acid Databases, Genomic Library, Genotype, Haplotypes, Human Genome Project, Humans, Male, Nanostructures, Nanotechnology, Nucleic Acid Amplification Techniques, Single Nucleotide Polymorphism, DNA Sequence Analysis/economics, DNA Sequence Analysis/instrumentation, DNA Sequence Analysis/standards, Software
7.
Science ; 319(5867): 1215-20, 2008 Feb 29.
Article in English | MEDLINE | ID: mdl-18218864

ABSTRACT

We have synthesized a 582,970-base pair Mycoplasma genitalium genome. This synthetic genome, named M. genitalium JCVI-1.0, contains all the genes of wild-type M. genitalium G37 except MG408, which was disrupted by an antibiotic marker to block pathogenicity and to allow for selection. To identify the genome as synthetic, we inserted "watermarks" at intergenic sites known to tolerate transposon insertions. Overlapping "cassettes" of 5 to 7 kilobases (kb), assembled from chemically synthesized oligonucleotides, were joined by in vitro recombination to produce intermediate assemblies of approximately 24 kb, 72 kb ("1/8 genome"), and 144 kb ("1/4 genome"), which were all cloned as bacterial artificial chromosomes in Escherichia coli. Most of these intermediate clones were sequenced, and clones of all four 1/4 genomes with the correct sequence were identified. The complete synthetic genome was assembled by transformation-associated recombination cloning in the yeast Saccharomyces cerevisiae, then isolated and sequenced. A clone with the correct sequence was identified. The methods described here will be generally useful for constructing large DNA molecules from chemically synthesized pieces and also from combinations of natural and synthetic DNA segments.
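
The hierarchical assembly sizes quoted above are consistent with simple arithmetic on the genome length. The sketch below only checks that arithmetic; it ignores the overlaps between neighbouring pieces, so the counts are lower bounds, and the helper name `pieces_needed` is hypothetical.

```python
from math import ceil

GENOME_BP = 582_970  # synthetic M. genitalium JCVI-1.0 length from the abstract


def pieces_needed(total_bp: int, piece_bp: int) -> int:
    """Minimum number of non-overlapping pieces of a given size."""
    return ceil(total_bp / piece_bp)


# Sizes of the named intermediate assemblies.
eighth_bp = GENOME_BP / 8   # about 72.9 kb, quoted as 72 kb
quarter_bp = GENOME_BP / 4  # about 145.7 kb, quoted as 144 kb

# Bounds on piece counts, ignoring overlaps.
min_cassettes = pieces_needed(GENOME_BP, 7_000)  # if every cassette were 7 kb
max_cassettes = pieces_needed(GENOME_BP, 5_000)  # if every cassette were 5 kb
first_stage = pieces_needed(GENOME_BP, 24_000)   # assemblies of about 24 kb

print(f"1/8 genome: {eighth_bp:.0f} bp, 1/4 genome: {quarter_bp:.0f} bp")
print(f"cassettes: between {min_cassettes} and {max_cassettes}")
print(f"24 kb assemblies: at least {first_stage}")
```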


Subjects
Molecular Cloning, Bacterial DNA/chemical synthesis, Bacterial Genome, Genomics/methods, Mycoplasma genitalium/genetics, Base Sequence, Bacterial Artificial Chromosomes, Yeast Artificial Chromosomes, Recombinant DNA, Escherichia coli/genetics, Genetic Vectors, Oligodeoxyribonucleotides/chemical synthesis, Plasmids, Genetic Recombination, Saccharomyces cerevisiae/genetics, DNA Sequence Analysis, Genetic Transformation