Results 1 - 11 of 11
1.
J Biol Chem ; 293(30): 11687-11708, 2018 07 27.
Article in English | MEDLINE | ID: mdl-29773649

ABSTRACT

HIV-1 subtype C (HIV-1C) may duplicate longer amino acid stretches in the p6 Gag protein, leading to the creation of an additional Pro-Thr/Ser-Ala-Pro (PTAP) motif necessary for viral packaging. However, the biological significance of a duplication of the PTAP motif for HIV-1 replication and pathogenesis has not been experimentally validated. In a longitudinal study of two different clinical cohorts of select HIV-1 seropositive, drug-naive individuals from India, we found that 8 of 50 of these individuals harbored a mixed infection of viral strains discordant for the PTAP duplication. Conventional and next-generation sequencing of six primary viral quasispecies at multiple time points disclosed that in a mixed infection, the viral strains containing the PTAP duplication dominated the infection. The dominance of the double-PTAP viral strains over a genetically similar single-PTAP viral clone was confirmed in viral proliferation and pairwise competition assays. Of note, in the proximity ligation assay, double-PTAP Gag proteins exhibited a significantly enhanced interaction with the host protein tumor susceptibility gene 101 (Tsg101). Moreover, Tsg101 overexpression resulted in a biphasic effect on HIV-1C proliferation, an enhanced effect at low concentration and an inhibitory effect only at higher concentrations, unlike a uniformly inhibitory effect on subtype B strains. In summary, our results indicate that the duplication of the PTAP motif in the p6 Gag protein enhances the replication fitness of HIV-1C by engaging the Tsg101 host protein with a higher affinity. Our results have implications for HIV-1 pathogenesis, especially of HIV-1C.


Subjects
DNA-Binding Proteins/metabolism , Endosomal Sorting Complexes Required for Transport/metabolism , HIV Infections/metabolism , HIV Infections/virology , HIV-1/physiology , Transcription Factors/metabolism , Virus Replication , gag Gene Products, Human Immunodeficiency Virus/metabolism , Adult , Amino Acid Motifs , Cells, Cultured , DNA-Binding Proteins/genetics , Endosomal Sorting Complexes Required for Transport/genetics , Female , HIV Infections/genetics , HIV-1/chemistry , HIV-1/genetics , Host-Pathogen Interactions , Humans , Longitudinal Studies , Male , Middle Aged , Protein Interaction Maps , Transcription Factors/genetics , gag Gene Products, Human Immunodeficiency Virus/chemistry , gag Gene Products, Human Immunodeficiency Virus/genetics
2.
Clin Cancer Res ; 23(18): 5648-5656, 2017 Sep 15.
Article in English | MEDLINE | ID: mdl-28536309

ABSTRACT

Purpose: Tumor-derived cell-free DNA (cfDNA) in plasma can be used for molecular testing and provides an attractive alternative to tumor tissue. Commonly used PCR-based technologies can test for only a limited number of alterations at a time. Therefore, novel ultrasensitive technologies capable of testing for a broad spectrum of molecular alterations are needed to further personalized cancer therapy. Experimental Design: We developed a highly sensitive ultradeep next-generation sequencing (NGS) assay using reagents from TruSeqNano library preparation and NexteraRapid Capture target enrichment kits to generate plasma cfDNA sequencing libraries for mutational analysis in 61 cancer-related genes using common bioinformatics tools. The results were retrospectively compared with molecular testing of archival primary or metastatic tumor tissue obtained at different points of clinical care. Results: In a study of 55 patients with advanced cancer, the ultradeep NGS assay detected 82% (complete detection) to 87% (complete and partial detection) of the aberrations identified in discordantly collected corresponding archival tumor tissue. Patients with a low variant allele frequency (VAF) of mutant cfDNA survived longer than those with a high VAF (P = 0.018). In patients undergoing systemic therapy, radiological response was positively associated with changes in cfDNA VAF (P = 0.02), and compared with unchanged/increased mutant cfDNA VAF, decreased cfDNA VAF was associated with longer time to treatment failure (TTF; P = 0.03). Conclusions: The ultradeep NGS assay has good sensitivity compared with conventional clinical mutation testing of archival specimens. A high VAF in mutant cfDNA corresponded with shorter survival. Changes in VAF of mutated cfDNA were associated with TTF. Clin Cancer Res; 23(18); 5648-56. ©2017 AACR.
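
The VAF figures in this abstract rest on simple read-count arithmetic. The following is a minimal sketch, assuming VAF is computed as variant-supporting reads divided by total reads at a locus; the function names and the two-way trend label are illustrative and are not taken from the paper.

```python
def variant_allele_frequency(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads at a locus that carry the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("alt_reads must lie between 0 and total_reads")
    return alt_reads / total_reads


def vaf_trend(baseline_vaf: float, followup_vaf: float) -> str:
    """Group a change in VAF into the two categories named in the abstract."""
    # Hypothetical rule: any drop counts as "decreased"; a real analysis
    # would likely apply a threshold to absorb sampling noise.
    return "decreased" if followup_vaf < baseline_vaf else "unchanged/increased"
```

For example, 25 variant reads out of 1000 give a VAF of 0.025, and a follow-up sample at 0.01 would be labelled "decreased".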


Subjects
Biomarkers, Tumor , Circulating Tumor DNA , High-Throughput Nucleotide Sequencing , Neoplasms/diagnosis , Neoplasms/genetics , Adult , Aged , Aged, 80 and over , Female , Genetic Testing , High-Throughput Nucleotide Sequencing/methods , High-Throughput Nucleotide Sequencing/standards , Humans , Male , Middle Aged , Mutation , Neoplasms/mortality , Prognosis , Reproducibility of Results , Sensitivity and Specificity
3.
BMC Bioinformatics ; 16: 17, 2015 Jan 28.
Article in English | MEDLINE | ID: mdl-25626454

ABSTRACT

BACKGROUND: Next-generation sequencing (NGS) is rapidly becoming common practice in clinical diagnostics and cancer research. In addition to the detection of single nucleotide variants (SNVs), information on copy number variants (CNVs) is of great interest. Several algorithms exist to detect CNVs by analyzing whole genome sequencing data or data from samples enriched by hybridization-capture. PCR-enriched amplicon-sequencing data have special characteristics that have been taken into account by only one publicly available algorithm so far. RESULTS: We describe a new algorithm named quandico to detect copy number differences based on NGS data generated following PCR-enrichment. A weighted t-test statistic was applied to calculate probabilities (p-values) of copy number changes. We assessed the performance of the method using sequencing reads generated from reference DNA with known CNVs, and we were able to detect these variants with 98.6% sensitivity and 98.5% specificity, which is significantly better than another recently described method for amplicon sequencing. The source code (R-package) of quandico is licensed under the GPLv3 and is available at https://github.com/reineckef/quandico. CONCLUSION: We demonstrated that our new algorithm is suitable to call copy number changes using data from PCR-enriched samples with high sensitivity and specificity even for single copy differences.
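
The abstract mentions a weighted t-test statistic but does not define it. The sketch below shows one common formulation of a weighted one-sample t-statistic, applied to hypothetical per-amplicon log2 coverage ratios with per-amplicon weights. It is an illustration under those assumptions, not quandico's actual implementation, and it stops at the statistic rather than computing a p-value.

```python
import math
from typing import Sequence


def weighted_t_statistic(values: Sequence[float],
                         weights: Sequence[float],
                         mu0: float = 0.0) -> float:
    """Weighted one-sample t-statistic for H0: weighted mean == mu0."""
    if len(values) != len(weights) or len(values) < 2:
        raise ValueError("need at least two values with matching weights")
    w_sum = sum(weights)
    w_sq_sum = sum(w * w for w in weights)
    mean = sum(w * x for w, x in zip(weights, values)) / w_sum
    # Unbiased weighted variance (reliability weights).
    var = sum(w * (x - mean) ** 2 for w, x in zip(weights, values)) / (
        w_sum - w_sq_sum / w_sum)
    # Effective sample size shrinks when weights are unequal.
    n_eff = w_sum ** 2 / w_sq_sum
    se = math.sqrt(var / n_eff)
    if se == 0:
        raise ValueError("zero variance: t-statistic is undefined")
    return (mean - mu0) / se
```

With equal weights this reduces to the ordinary one-sample t-statistic, which gives a quick sanity check on the formula.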


Subjects
Algorithms , High-Throughput Nucleotide Sequencing/methods , Polymerase Chain Reaction/methods , Sequence Analysis, DNA/methods , Case-Control Studies , DNA Copy Number Variations , Humans , Sensitivity and Specificity
4.
BMC Genomics ; 15: 1073, 2014 Dec 05.
Article in English | MEDLINE | ID: mdl-25480444

ABSTRACT

BACKGROUND: Analysis of targeted amplicon sequencing data presents some unique challenges in comparison to the analysis of random fragment sequencing data. Whereas reads from randomly fragmented DNA have arbitrary start positions, the reads from amplicon sequencing have fixed start positions that coincide with the amplicon boundaries. As a result, any variants near the amplicon boundaries can cause misalignments of multiple reads that can ultimately lead to false-positive or false-negative variant calls. RESULTS: We show that amplicon boundaries are variant calling blind spots where the variant calls are highly inaccurate. We propose that an effective strategy to avoid these blind spots is to incorporate the primer bases in obtaining read alignments and post-processing of the alignments, thereby effectively moving these blind spots into the primer binding regions (which are not used for variant calling). Targeted sequencing data analysis pipelines can provide better variant calling accuracy when primer bases are retained and sequenced. CONCLUSIONS: Read bases beyond the variant site are necessary for analysis of amplicon sequencing data. Enzymatic primer digestion, if used in the target enrichment process, should leave at least a few primer bases to ensure that these bases are available during data analysis. The primer bases should only be removed immediately before the variant calling step to ensure that the variants can be called irrespective of where they occur within the amplicon insert region.
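
The recommendation to keep primer bases through alignment and drop them only before variant calling can be pictured with interval arithmetic. This is a minimal sketch, assuming half-open 0-based coordinates; the `Amplicon` record and `callable_region` helper are hypothetical names, not part of any pipeline described in the paper.

```python
from typing import NamedTuple, Optional, Tuple


class Amplicon(NamedTuple):
    chrom: str
    start: int         # first base of the forward primer
    insert_start: int  # first base after the forward primer
    insert_end: int    # first base of the reverse primer (exclusive insert end)
    end: int           # one past the last base of the reverse primer


def callable_region(read_start: int, read_end: int,
                    amplicon: Amplicon) -> Optional[Tuple[int, int]]:
    """Part of an already aligned read that lies in the amplicon insert.

    The read is aligned with its primer bases intact; only at this late
    step are the primer-derived positions excluded from variant calling.
    """
    lo = max(read_start, amplicon.insert_start)
    hi = min(read_end, amplicon.insert_end)
    return (lo, hi) if lo < hi else None
```

A read spanning 100 to 200 on an amplicon whose insert runs from 120 to 220 would contribute positions 120 to 200 to variant calling.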


Subjects
Computational Biology/methods , High-Throughput Nucleotide Sequencing , Sequence Analysis, DNA/methods , Computer Simulation , DNA Primers , Polymerase Chain Reaction/methods , Reproducibility of Results
5.
Microbiome ; 2: 31, 2014.
Article in English | MEDLINE | ID: mdl-25228989

ABSTRACT

BACKGROUND: Sample storage conditions, extraction methods, PCR primers, and parameters are major factors that affect metagenomics analysis based on microbial 16S rRNA gene sequencing. Most published studies were limited to the comparison of only one or two types of these factors. Systematic multi-factor explorations are needed to evaluate the conditions that may impact the validity of a microbiome analysis. This study aimed to improve methodological options to facilitate the best technical approaches in the design of a microbiome study. Three readily available mock bacterial community materials and two commercial extraction techniques, Qiagen DNeasy and MO BIO PowerSoil DNA purification methods, were used to assess procedures for 16S ribosomal DNA amplification and pyrosequencing-based analysis. Primers were chosen for 16S rDNA quantitative PCR and amplification of region V3 to V1. Swabs spiked with mock bacterial community cells and clinical oropharyngeal swabs were incubated at -80°C, -20°C, 4°C, or 37°C for 4 weeks, then extracted with the two methods, and subjected to pyrosequencing and taxonomic and statistical analyses to investigate microbiome profile stability. RESULTS: The bacterial compositions for the mock community DNA samples determined in this study were consistent with the projected levels and agreed with the literature. The quantitation accuracy of abundances for several genera was improved with changes made to the standard Human Microbiome Project (HMP) procedure. The data for the samples purified with DNeasy and PowerSoil methods were statistically distinct; however, both results were reproducible and in good agreement with each other. The temperature effect on storage stability was investigated by using mock community cells and showed that the microbial community profiles were altered with the increase in incubation temperature. However, this phenomenon was not detected when clinical oropharyngeal swabs were used in the experiment. CONCLUSIONS: Mock community materials originating from the HMP study are valuable controls in developing 16S metagenomics analysis procedures. Long-term exposure to a high temperature may introduce variation into analysis for oropharyngeal swabs, suggesting that samples should be stored at 4°C or lower. The observed variations due to sample storage temperature are in a similar range as the intrapersonal variability among different clinical oropharyngeal swab samples.

6.
BMC Genomics ; 15: 244, 2014 Mar 28.
Article in English | MEDLINE | ID: mdl-24678773

ABSTRACT

BACKGROUND: High-throughput sequencing is rapidly becoming common practice in clinical diagnosis and cancer research. Many algorithms have been developed for somatic single nucleotide variant (SNV) detection in matched tumor-normal DNA sequencing. Although numerous studies have compared the performance of various algorithms on exome data, there has not yet been a systematic evaluation using PCR-enriched amplicon data with a range of variant allele fractions. The recently developed gold standard variant set for the reference individual NA12878 by the NIST-led "Genome in a Bottle" Consortium (NIST-GIAB) provides a good resource to evaluate admixtures with various SNV fractions. RESULTS: Using the NIST-GIAB gold standard, we compared the performance of five popular somatic SNV calling algorithms (GATK UnifiedGenotyper followed by simple subtraction, MuTect, Strelka, SomaticSniper and VarScan2) for matched tumor-normal amplicon and exome sequencing data. CONCLUSIONS: We demonstrated that the five commonly used somatic SNV calling methods are applicable to both targeted amplicon and exome sequencing data. However, the sensitivities of these methods vary based on the allelic fraction of the mutation in the tumor sample. Our analysis can assist researchers in choosing a somatic SNV calling method suitable for their specific needs.
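
Of the five approaches compared, "simple subtraction" is the easiest to picture: call variants separately in tumor and normal, then discard tumor calls that also appear in the normal. The sketch below illustrates only that subtraction step on already-called variants; the tuple layout (chromosome, position, reference allele, alternate allele) is an assumption, and no caller is invoked.

```python
from typing import Iterable, List, Tuple

Variant = Tuple[str, int, str, str]  # (chrom, pos, ref, alt), assumed layout


def simple_subtraction(tumor_calls: Iterable[Variant],
                       normal_calls: Iterable[Variant]) -> List[Variant]:
    """Keep tumor variants that were not also called in the matched normal."""
    germline = set(normal_calls)
    # Preserve the order of the tumor call set.
    return [v for v in tumor_calls if v not in germline]
```

Because the comparison is exact, a variant present in the normal at too low a fraction to be called would survive the subtraction, which is one reason sensitivity and specificity can depend on allelic fraction.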


Subjects
Computational Biology/methods , Exome , High-Throughput Nucleotide Sequencing , Mutation , Software , Algorithms , Databases, Nucleic Acid , Genomics/methods , Humans , Point Mutation , ROC Curve , Sensitivity and Specificity
7.
Nucleic Acids Res ; 40(16): e127, 2012 Sep.
Article in English | MEDLINE | ID: mdl-22584625

ABSTRACT

Accurate estimation of expression levels from RNA-Seq data entails precise mapping of the sequence reads to a reference genome. Because the standard reference genome contains only one allele at any given locus, reads overlapping polymorphic loci that carry a non-reference allele are at least one mismatch away from the reference and, hence, are less likely to be mapped. This bias in read mapping leads to inaccurate estimates of allele-specific expression (ASE). To address this read-mapping bias, we propose the construction of an enhanced reference genome that includes the alternative alleles at known polymorphic loci. We show that mapping to this enhanced reference reduced the read-mapping biases, leading to more reliable estimates of ASE. Experiments on simulated data show that the proposed strategy reduced the number of loci with mapping bias by ≥ 63% when compared with a previous approach that relies on masking the polymorphic loci and by ≥ 18% when compared with the standard approach that uses an unaltered reference. When we applied our strategy to actual RNA-Seq data, we found that it mapped up to 15% more reads than the previous approaches and identified many seemingly incorrect inferences made by them.
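
The abstract does not say how the enhanced reference is assembled. One plausible construction, shown here purely as a sketch, is to add a short extra sequence per known SNP that carries the alternative allele with some flanking reference bases, so that reads with the non-reference allele have a perfect-match target. The function name, the naming scheme for the extra sequences, and the choice of flank length are all assumptions.

```python
from typing import Dict, Iterable, Tuple

Snp = Tuple[int, str, str]  # (0-based position, reference base, alternative base)


def alt_allele_windows(reference: str, snps: Iterable[Snp],
                       flank: int) -> Dict[str, str]:
    """Build one alternative-allele window per SNP.

    Each window is the reference sequence within `flank` bases of the SNP,
    with the SNP position replaced by the alternative allele. Nearby SNPs
    are treated independently in this simplified sketch.
    """
    windows = {}
    for pos, ref, alt in snps:
        if reference[pos] != ref:
            raise ValueError(f"reference base mismatch at position {pos}")
        start = max(0, pos - flank)
        end = min(len(reference), pos + 1 + flank)
        windows[f"alt_{pos}_{ref}>{alt}"] = (
            reference[start:pos] + alt + reference[pos + 1:end])
    return windows
```

In practice the flank would be chosen relative to read length, roughly one read length minus one on each side, so that any read overlapping the SNP fits inside the window.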


Subjects
Alleles , Chromosome Mapping/methods , Gene Expression Profiling , Sequence Analysis, RNA/methods , Chromosome Mapping/standards , Genetic Loci , Genome, Human , High-Throughput Nucleotide Sequencing , Humans , Polymorphism, Single Nucleotide , Reference Standards
8.
PLoS One ; 6(3): e17469, 2011 Mar 07.
Article in English | MEDLINE | ID: mdl-21408217

ABSTRACT

BACKGROUND: The annotation of genomes from next-generation sequencing platforms needs to be rapid, high-throughput, and fully integrated and automated. Although a few Web-based annotation services have recently become available, they may not be the best solution for researchers that need to annotate a large number of genomes, possibly including proprietary data, and store them locally for further analysis. To address this need, we developed a standalone software application, the Annotation of microbial Genome Sequences (AGeS) system, which incorporates publicly available and in-house-developed bioinformatics tools and databases, many of which are parallelized for high-throughput performance. METHODOLOGY: The AGeS system supports three main capabilities. The first is the storage of input contig sequences and the resulting annotation data in a central, customized database. The second is the annotation of microbial genomes using an integrated software pipeline, which first analyzes contigs from high-throughput sequencing by locating genomic regions that code for proteins, RNA, and other genomic elements through the Do-It-Yourself Annotation (DIYA) framework. The identified protein-coding regions are then functionally annotated using the in-house-developed Pipeline for Protein Annotation (PIPA). The third capability is the visualization of annotated sequences using GBrowse. To date, we have implemented these capabilities for bacterial genomes. AGeS was evaluated by comparing its genome annotations with those provided by three other methods. Our results indicate that the software tools integrated into AGeS provide annotations that are in general agreement with those provided by the compared methods. This is demonstrated by a >94% overlap in the number of identified genes, a significant number of identical annotated features, and a >90% agreement in enzyme function predictions.


Subjects
Genome, Bacterial/genetics , Molecular Sequence Annotation/methods , Software , Base Sequence , Genes, Bacterial/genetics , Reproducibility of Results
9.
Article in English | MEDLINE | ID: mdl-18989047

ABSTRACT

The incomplete perfect phylogeny (IPP) problem and the incomplete perfect phylogeny haplotyping (IPPH) problem deal with constructing a phylogeny for a given set of haplotypes or genotypes with missing entries. The earlier approaches for both of these problems dealt with restricted versions of the problems, where the root is either available or can be trivially re-constructed from the data, or certain assumptions were made about the data. In this paper, we deal with the unrestricted versions of the problems, where the root of the phylogeny is neither available nor trivially recoverable from the data. Both IPP and IPPH problems have previously been proven to be NP-complete. Here, we present efficient enumerative algorithms that can handle practical instances of the problem. Empirical analysis on simulated data shows that the algorithms perform very well both in terms of speed and in terms of accuracy of the recovered data.


Subjects
Algorithms , Biological Evolution , Chromosome Mapping/methods , Evolution, Molecular , Haplotypes/genetics , Phylogeny , Polymorphism, Single Nucleotide/genetics , Sequence Analysis, DNA/methods
10.
Bioinformatics ; 22(14): e514-22, 2006 Jul 15.
Article in English | MEDLINE | ID: mdl-16873515

ABSTRACT

MOTIVATION: We explore the problem of constructing near-perfect phylogenies on bi-allelic haplotypes, where the deviation from perfect phylogeny is entirely due to homoplasy events. We present polynomial-time algorithms for restricted versions of the problem. We show that these algorithms can be extended to genotype data, in which case the problem is called the near-perfect phylogeny haplotyping (NPPH) problem. We present a near-optimal algorithm for the H1-NPPH problem, which is to determine if a given set of genotypes admits a phylogeny with a single homoplasy event. The time-complexity of our algorithm for the H1-NPPH problem is $O(m^{2}(n + m))$, where $n$ is the number of genotypes and $m$ is the number of SNP sites. This is a significant improvement over the earlier $O(n^{4})$ algorithm. We also introduce generalized versions of the problem. The H(1,q)-NPPH problem is to determine if a given set of genotypes admits a phylogeny with $q$ homoplasy events, so that all the homoplasy events occur in a single site. We present an $O(m^{q+1}(n + m))$ algorithm for the H(1,q)-NPPH problem. RESULTS: We present results on simulated data, which demonstrate that the accuracy of our algorithm for the H1-NPPH problem is comparable to that of the existing methods, while being orders of magnitude faster. AVAILABILITY: The implementation of our algorithm for the H1-NPPH problem is available upon request.
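
For context on what "deviation from perfect phylogeny" means: a complete set of bi-allelic haplotypes admits a perfect phylogeny (with the root unknown) exactly when no pair of sites shows all four allele combinations 00, 01, 10 and 11, the classical four-gamete test. The sketch below implements only that test and reports the conflicting site pairs; it is not the near-perfect phylogeny algorithm from the paper, and the string encoding of haplotypes is an assumption.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def conflicting_site_pairs(haplotypes: Sequence[str]) -> List[Tuple[int, int]]:
    """Return site pairs that fail the four-gamete test.

    Each haplotype is a string over {'0', '1'} of equal length with no
    missing entries. An empty result means the haplotypes admit a perfect
    phylogeny; any reported pair needs at least one homoplasy event.
    """
    if not haplotypes:
        return []
    n_sites = len(haplotypes[0])
    if any(len(h) != n_sites for h in haplotypes):
        raise ValueError("all haplotypes must have the same length")
    conflicts = []
    for i, j in combinations(range(n_sites), 2):
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            conflicts.append((i, j))
    return conflicts
```

This check examines every pair of the m sites across n haplotypes, so it runs in O(n m^2) time.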


Subjects
Biological Evolution , Chromosome Mapping/methods , DNA Mutational Analysis/methods , Linkage Disequilibrium/genetics , Polymorphism, Single Nucleotide/genetics , Sequence Alignment/methods , Sequence Analysis, DNA/methods , Algorithms , Base Sequence , Genome, Human/genetics , Haplotypes/genetics , Humans , Molecular Sequence Data , Phylogeny
11.
Article in English | MEDLINE | ID: mdl-16452805

ABSTRACT

Codon optimization enhances the efficiency of DNA expression vectors used in DNA vaccination and gene therapy by increasing protein expression. Additionally, certain nucleotide motifs have experimentally been shown to be immuno-stimulatory, while certain others are immuno-suppressive. In this paper, we present algorithms to locate a given set of immuno-modulatory motifs in the DNA expression vectors corresponding to a given amino acid sequence and maximize or minimize the number and the context of the immuno-modulatory motifs in the DNA expression vectors. The main contribution is to use multiple pattern matching algorithms to synthesize a DNA sequence for a given amino acid sequence and a graph theoretic approach for finding the longest weighted path in a directed graph that will maximize or minimize certain motifs. This is achieved using $O(n^{2})$ time, where $n$ is the length of the amino acid sequence. Based on this, we develop a software tool.
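
The longest-weighted-path idea can be illustrated on a much simpler problem than the one the paper solves. In the sketch below, each amino acid position is a layer of synonymous codons, an edge between consecutive codons is weighted by the number of new motif occurrences it creates, and dynamic programming finds the best path. It handles a single short motif (default "CG", at most 4 bases so that it spans no more than two codons) and is an assumption-laden illustration, not the paper's algorithm or its multiple pattern matching step.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def count_motif(seq: str, motif: str) -> int:
    """Count (possibly overlapping) occurrences of motif in seq."""
    return sum(seq.startswith(motif, i) for i in range(len(seq) - len(motif) + 1))


def optimize_motif(protein: str, motif: str = "CG", maximize: bool = True):
    """Pick synonymous codons that maximize or minimize motif occurrences.

    Returns (dna_sequence, motif_count). The layered graph of codon
    choices is acyclic, so a left-to-right pass finds the best path.
    """
    if not protein or not 1 <= len(motif) <= 4:
        raise ValueError("need a non-empty protein and a motif of 1 to 4 bases")
    sign = 1 if maximize else -1
    # best[codon] = (signed score, DNA so far) for the best path ending in codon
    best = {c: (sign * count_motif(c, motif), c) for c in SYNONYMS[protein[0]]}
    for aa in protein[1:]:
        layer = {}
        for codon in SYNONYMS[aa]:
            candidates = []
            for prev, (score, dna) in best.items():
                # Occurrences that end inside the new codon.
                gain = count_motif(prev + codon, motif) - count_motif(prev, motif)
                candidates.append((score + sign * gain, dna + codon))
            layer[codon] = max(candidates, key=lambda t: t[0])
        best = layer
    score, dna = max(best.values(), key=lambda t: t[0])
    return dna, sign * score
```

For the dipeptide "AR", maximizing CpG content selects a GCG codon followed by a CGN codon (two CpG sites), while minimizing can reach zero. This simplified pass is linear in the protein length for a fixed codon table, whereas the paper's more general formulation is reported as quadratic.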


Subjects
Algorithms , Codon/genetics , CpG Islands/genetics , Genetic Engineering/methods , Genetic Vectors/genetics , Pattern Recognition, Automated/methods , Sequence Analysis, DNA/methods , Amino Acid Motifs , Artificial Intelligence , Gene Expression/genetics , Software , Vaccines, DNA/genetics