Results 1 - 20 of 46
1.
Genome Res ; 33(7): 1133-1144, 2023 07.
Article in English | MEDLINE | ID: mdl-37217250

ABSTRACT

The assay for transposase-accessible chromatin with sequencing (ATAC-seq) is a common assay to identify chromatin accessible regions by using a Tn5 transposase that can access, cut, and ligate adapters to DNA fragments for subsequent amplification and sequencing. These sequenced regions are quantified and tested for enrichment in a process referred to as "peak calling." Most unsupervised peak calling methods are based on simple statistical models and suffer from elevated false positive rates. Newly developed supervised deep learning methods can be successful, but they rely on high quality labeled data for training, which can be difficult to obtain. Moreover, though biological replicates are recognized to be important, there are no established approaches for using replicates in the deep learning tools, and the approaches available for traditional methods either cannot be applied to ATAC-seq, where control samples may be unavailable, or are post hoc and do not capitalize on potentially complex, but reproducible signal in the read enrichment data. Here, we propose a novel peak caller that uses unsupervised contrastive learning to extract shared signals from multiple replicates. Raw coverage data are encoded to obtain low-dimensional embeddings and optimized to minimize a contrastive loss over biological replicates. These embeddings are passed to another contrastive loss for learning and predicting peaks and decoded to denoised data under an autoencoder loss. We compared our replicative contrastive learner (RCL) method with other existing methods on ATAC-seq data, using annotations from ChromHMM genomic labels and transcription factor ChIP-seq as noisy truth. RCL consistently achieved the best performance.


Subject(s)
Chromatin Immunoprecipitation Sequencing; High-Throughput Nucleotide Sequencing; Sequence Analysis, DNA/methods; High-Throughput Nucleotide Sequencing/methods; Chromatin/genetics; DNA/genetics
2.
Plant Physiol ; 195(3): 2234-2255, 2024 Jun 28.
Article in English | MEDLINE | ID: mdl-38537616

ABSTRACT

The hydrophobic cuticle is the first line of defense between aerial portions of plants and the external environment. On maize (Zea mays L.) silks, the cuticular cutin matrix is infused with cuticular waxes, consisting of a homologous series of very long-chain fatty acids (VLCFAs), aldehydes, and hydrocarbons. Together with VLC fatty-acyl-CoAs (VLCFA-CoAs), these metabolites serve as precursors, intermediates, and end-products of the cuticular wax biosynthetic pathway. To deconvolute the potentially confounding impacts of the change in silk microenvironment and silk development on this pathway, we profiled cuticular waxes on the silks of the inbreds B73 and Mo17, and their reciprocal hybrids. Multivariate interrogation of these metabolite abundance data demonstrates that VLCFA-CoAs and total free VLCFAs are positively correlated with the cuticular wax metabolome, and this metabolome is primarily affected by changes in the silk microenvironment and plant genotype. Moreover, the genotype effect on the pathway explains the increased accumulation of cuticular hydrocarbons with a concomitant reduction in cuticular VLCFA accumulation on B73 silks, suggesting that the conversion of VLCFA-CoAs to hydrocarbons is more effective in B73 than Mo17. Statistical modeling of the ratios between cuticular hydrocarbons and cuticular VLCFAs reveals a significant role of precursor chain length in determining this ratio. This study establishes the complexity of the product-precursor relationships within the silk cuticular wax-producing network by dissecting both the impact of genotype and the allocation of VLCFA-CoA precursors to different biological processes and demonstrates that longer chain VLCFA-CoAs are preferentially utilized for hydrocarbon biosynthesis.


Subject(s)
Fatty Acids; Hydrocarbons; Waxes; Zea mays; Zea mays/metabolism; Zea mays/genetics; Waxes/metabolism; Hydrocarbons/metabolism; Fatty Acids/metabolism; Genotype; Metabolome; Plant Epidermis/metabolism; Biosynthetic Pathways
3.
Bioinformatics ; 39(1)2023 01 01.
Article in English | MEDLINE | ID: mdl-36610988

ABSTRACT

MOTIVATION: Amplicon sequencing is widely applied to explore heterogeneity and rare variants in genetic populations. Resolving true biological variants and quantifying their abundance is crucial for downstream analyses, but measured abundances are distorted by stochasticity and bias in amplification, plus errors during polymerase chain reaction (PCR) and sequencing. One solution attaches unique molecular identifiers (UMIs) to sample sequences before amplification. Counting UMIs instead of sequences provides unbiased estimates of abundance. While modern methods improve over naïve counting by UMI identity, most do not account for UMI reuse or collision, and they do not adequately model PCR and sequencing errors in the UMIs and sample sequences. RESULTS: We introduce Deduplication and Abundance estimation with UMIs (DAUMI), a probabilistic framework to detect true biological amplicon sequences and accurately estimate their deduplicated abundance. DAUMI recognizes UMI collision, even on highly similar sequences, and detects and corrects most PCR and sequencing errors in the UMI and sampled sequences. DAUMI performs better on simulated and real data compared to other UMI-aware clustering methods. AVAILABILITY AND IMPLEMENTATION: Source code is available at https://github.com/DormanLab/AmpliCI. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
High-Throughput Nucleotide Sequencing; Software; High-Throughput Nucleotide Sequencing/methods; Sequence Analysis, DNA/methods; Polymerase Chain Reaction; Cluster Analysis
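
The abstract of entry 3 contrasts its method with naive counting by UMI identity. As a rough illustration of that baseline only (not of DAUMI itself), the sketch below counts the distinct UMIs observed with each sequence. The function name `count_by_umi` and the toy reads are hypothetical, and the sketch ignores UMI collisions and errors in UMIs or sequences, which are the problems the abstract says need modeling.

```python
from collections import defaultdict


def count_by_umi(reads):
    """Naive deduplication: the abundance of a sequence is the number
    of distinct UMIs observed with it. Assumes error-free UMIs and
    sequences, which real data do not satisfy."""
    umis_per_sequence = defaultdict(set)
    for umi, sequence in reads:
        umis_per_sequence[sequence].add(umi)
    return {seq: len(umis) for seq, umis in umis_per_sequence.items()}


# Toy (UMI, sequence) pairs; the second read is a PCR duplicate of the first.
reads = [
    ("AAGT", "ACGTAC"),
    ("AAGT", "ACGTAC"),
    ("CCTA", "ACGTAC"),
    ("GGAT", "ACGTTC"),
]
abundance = count_by_umi(reads)
print(abundance)
```

In this toy input the duplicated read collapses to a single molecule, so the first sequence is counted twice and the second once. A single sequencing error in a UMI would inflate these counts, which is the weakness the abstract attributes to naive counting.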
4.
Bioinformatics ; 39(1)2023 01 01.
Article in English | MEDLINE | ID: mdl-36367243

ABSTRACT

MOTIVATION: Genotyping by sequencing is a powerful tool for investigating genetic variation in plants, but many economically important plants are allopolyploids, where homoeologous similarity obscures the subgenomic origin of reads and confounds allelic and homoeologous SNPs. Recent polyploid genotyping methods use allelic frequencies, rate of heterozygosity, parental cross or other information to resolve read assignment, but good subgenomic references offer the most direct information. The typical strategy aligns reads to the joint reference, performs diploid genotyping within each subgenome, and filters the results, but persistent read misassignment results in an excess of false heterozygous calls. RESULTS: We introduce the Comprehensive Allopolyploid Genotyper (CAPG), which formulates an explicit likelihood to weight read alignments against both subgenomic references and genotype individual allopolyploids from whole-genome resequencing data. We demonstrate CAPG in allotetraploids, where it performs better than Genome Analysis Toolkit's HaplotypeCaller applied to reads aligned to the combined subgenomic references. AVAILABILITY AND IMPLEMENTATION: Code and tutorials are available at https://github.com/Kkulkarni1/CAPG.git. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Genotyping Techniques; Software; Genotype; Genotyping Techniques/methods; Sequence Analysis, DNA; Heterozygote; Alleles; High-Throughput Nucleotide Sequencing/methods
5.
Bioinformatics ; 39(5)2023 05 04.
Article in English | MEDLINE | ID: mdl-37115636

ABSTRACT

MOTIVATION: Allostery enables changes to the dynamic behavior of a protein at distant positions induced by binding. Here, we present APOP, a new allosteric pocket prediction method, which perturbs the pockets formed in the structure by stiffening pairwise interactions in the elastic network across the pocket, to emulate ligand binding. Ranking the pockets based on the shifts in the global mode frequencies, as well as their mean local hydrophobicities, leads to high prediction success when tested on a dataset of allosteric proteins, composed of both monomers and multimeric assemblages. RESULTS: Out of the 104 test cases, APOP predicts known allosteric pockets for 92 within the top 3 rank out of multiple pockets available in the protein. In addition, we demonstrate that APOP can also find new alternative allosteric pockets in proteins. Particularly interesting findings are the discovery of previously overlooked large pockets located in the centers of many protein biological assemblages; binding of ligands at these sites would likely be particularly effective in changing the protein's global dynamics. AVAILABILITY AND IMPLEMENTATION: APOP is freely available as an open-source code (https://github.com/Ambuj-UF/APOP) and as a web server at https://apop.bb.iastate.edu/.


Subject(s)
Proteins; Software; Proteins/chemistry; Ligands; Protein Binding; Binding Sites; Protein Conformation; Allosteric Site
6.
Bioinformatics ; 38(10): 2727-2733, 2022 05 13.
Article in English | MEDLINE | ID: mdl-35561187

ABSTRACT

SUMMARY: A new dynamic community identifier (DCI) is presented that relies upon protein residue dynamic cross-correlations generated by Gaussian elastic network models to identify those residue clusters exhibiting motions within a protein. A number of examples of communities are shown for diverse proteins, including GPCRs. It is a tool that can immediately simplify and clarify the most essential functional moving parts of any given protein. Proteins usually can be subdivided into groups of residues that move as communities. These are usually densely packed local sub-structures, but in some cases can be physically distant residues identified to be within the same community. The set of these communities for each protein are the moving parts. The ways in which these are organized overall can aid in understanding many aspects of functional dynamics and allostery. DCI enables a more direct understanding of functions including enzyme activity, action across membranes and changes in the community structure from mutations or ligand binding. The DCI server is freely available on a web site (https://dci.bb.iastate.edu/). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Grain Proteins; Motion; Normal Distribution; Protein Conformation; Proteins/chemistry
7.
Bioinformatics ; 36(21): 5151-5158, 2021 01 29.
Article in English | MEDLINE | ID: mdl-32697845

ABSTRACT

MOTIVATION: Next-generation amplicon sequencing is a powerful tool for investigating microbial communities. A main challenge is to distinguish true biological variants from errors caused by amplification and sequencing. In traditional analyses, such errors are eliminated by clustering reads within a sequence similarity threshold, usually 97%, and constructing operational taxonomic units, but the arbitrary threshold leads to low resolution and high false-positive rates. Recently developed 'denoising' methods have proven able to resolve single-nucleotide amplicon variants, but they still miss low-frequency sequences, especially those near more frequent sequences, because they ignore the sequencing quality information. RESULTS: We introduce AmpliCI, a reference-free, model-based method for rapidly resolving the number, abundance and identity of error-free sequences in massive Illumina amplicon datasets. AmpliCI considers the quality information and allows the data, not an arbitrary threshold or an external database, to drive conclusions. AmpliCI estimates a finite mixture model, using a greedy strategy to gradually select error-free sequences and approximately maximize the likelihood. AmpliCI has better performance than three popular denoising methods, with acceptable computation time and memory usage. AVAILABILITY AND IMPLEMENTATION: Source code is available at https://github.com/DormanLab/AmpliCI. SUPPLEMENTARY INFORMATION: Supplementary material is available at Bioinformatics online.


Subject(s)
Algorithms; Microbiota; Cluster Analysis; High-Throughput Nucleotide Sequencing; Sequence Analysis, DNA; Software
8.
PLoS Comput Biol ; 17(4): e1008890, 2021 04.
Article in English | MEDLINE | ID: mdl-33798202

ABSTRACT

Protein-protein interaction networks are one of the most effective representations of cellular behavior. In order to build these models, high-throughput techniques are required. Next-generation interaction screening (NGIS) protocols that combine yeast two-hybrid (Y2H) with deep sequencing are promising approaches to generate interactome networks in any organism. However, challenges remain in mining reliable information from these screens, which limits their broader implementation. Here, we present a computational framework, designated Y2H-SCORES, for analyzing high-throughput Y2H screens. Y2H-SCORES considers key aspects of NGIS experimental design and important characteristics of the resulting data that distinguish it from RNA-seq expression datasets. Three quantitative ranking scores were implemented to identify interacting partners, comprising: 1) significant enrichment under selection for positive interactions, 2) degree of interaction specificity among multi-bait comparisons, and 3) selection of in-frame interactors. Using simulation and an empirical dataset, we provide a quantitative assessment to predict interacting partners under a wide range of experimental scenarios, facilitating independent confirmation by one-to-one bait-prey tests. Simulation of Y2H-NGIS enabled us to identify conditions that maximize detection of true interactors, which can be achieved with protocols such as prey library normalization, maintenance of larger culture volumes and replication of experimental treatments. Y2H-SCORES can be implemented in different yeast-based interaction screenings, with performance equivalent or superior to that of existing methods. Proof-of-concept was demonstrated by discovery and validation of novel interactions between the barley nucleotide-binding leucine-rich repeat (NLR) immune receptor MLA6, and fourteen proteins, including those that function in signaling, transcriptional regulation, and intracellular trafficking.


Subject(s)
Plant Proteins/metabolism; Protein Interaction Maps; Receptors, Immunologic/metabolism; Two-Hybrid System Techniques; Datasets as Topic; Proof of Concept Study
9.
Retrovirology ; 14(1): 40, 2017 Aug 22.
Article in English | MEDLINE | ID: mdl-28830558

ABSTRACT

BACKGROUND: Rev-like proteins are post-transcriptional regulatory proteins found in several retrovirus genera, including lentiviruses, betaretroviruses, and deltaretroviruses. These essential proteins mediate the nuclear export of incompletely spliced viral RNA, and act by tethering viral pre-mRNA to the host CRM1 nuclear export machinery. Although all Rev-like proteins are functionally homologous, they share less than 30% sequence identity. In the present study, we computationally assessed the extent of structural homology among retroviral Rev-like proteins within a phylogenetic framework. RESULTS: We undertook a comprehensive analysis of overall protein domain architecture and predicted secondary structural features for representative members of the Rev-like family of proteins. Similar patterns of α-helical domains were identified for Rev-like proteins within each genus, with the exception of deltaretroviruses, which were devoid of α-helices. Coiled-coil oligomerization motifs were also identified for most Rev-like proteins, with the notable exceptions of HIV-1, the deltaretroviruses, and some small ruminant lentiviruses. In Rev proteins of primate lentiviruses, the presence of predicted coiled-coil motifs segregated within specific primate lineages: HIV-1 descended from SIVs that lacked predicted coiled-coils in Rev whereas HIV-2 descended from SIVs that contained predicted coiled-coils in Rev. Phylogenetic ancestral reconstruction of coiled-coils for all Rev-like proteins predicted a single origin for the coiled-coil motif, followed by three losses of the predicted signal. The absence of a coiled-coil signal in HIV-1 was associated with replacement of canonical polar residues with non-canonical hydrophobic residues. However, hydrophobic residues were retained in the key 'a' and 'd' positions, and the α-helical region of HIV-1 Rev oligomerization domain could be modeled as a helical wheel with two predicted interaction interfaces. Moreover, the predicted interfaces mapped to the dimerization and oligomerization interfaces in HIV-1 Rev crystal structures. Helical wheel projections of other retroviral Rev-like proteins, including endogenous sequences, revealed similar interaction interfaces that could mediate oligomerization. CONCLUSIONS: Sequence-based computational analyses of Rev-like proteins, together with helical wheel projections of oligomerization domains, reveal a conserved homogeneous structural basis for oligomerization by retroviral Rev-like proteins.


Subject(s)
Gene Products, rev/chemistry; Gene Products, rev/metabolism; Models, Molecular; Retroviridae/chemistry; Retroviridae/metabolism; Amino Acid Sequence; Dimerization; Genetic Variation; Phylogeny; Protein Structure, Secondary; Retroviridae Proteins/chemistry; Retroviridae Proteins/metabolism; Sequence Homology, Amino Acid
10.
J Gen Virol ; 98(8): 2001-2010, 2017 Aug.
Article in English | MEDLINE | ID: mdl-28758634

ABSTRACT

Transmission of influenza A virus (IAV) from humans to swine occurs with relative frequency and is a critical contributor to swine IAV diversity. Subsequent to the introduction of these human seasonal lineages, there is often reassortment with endemic viruses and antigenic drift. To address whether particular genome constellations contributed to viral persistence following the introduction of the 2009 H1N1 human pandemic virus to swine in the USA, we collated and analysed 616 whole genomes of swine H1 isolates. For each gene, sequences were aligned, the best-known maximum likelihood phylogeny was inferred, and each virus was assigned a clade based upon its evolutionary history. A time-scaled Bayesian approach was implemented for the haemagglutinin (HA) gene to determine the patterns of genetic diversity over time. From these analyses, we observed an increase in genome diversity across all H1 lineages and clades, with the H1-γ and H1-δ1 genetic clades containing the greatest number of unique genome patterns. We documented 74 genome patterns from 2009 to 2016, of which 3 genome patterns were consistently detected at a significantly higher level than others across the entire time period. Eight genome patterns increased significantly, while five genome patterns were shown to decline in detection over time. Viruses with genome patterns identified as persisting in the US swine population may possess a greater capacity to infect and transmit in swine. This study highlights the emerging genetic diversity of US swine IAV from 2009 to 2016, with implications for swine and public health and vaccine control efforts.


Subject(s)
Genome, Viral; Influenza A Virus, H1N1 Subtype/isolation & purification; Orthomyxoviridae Infections/veterinary; Swine Diseases/virology; Animals; Evolution, Molecular; Genomics; Genotype; Influenza A Virus, H1N1 Subtype/classification; Influenza A Virus, H1N1 Subtype/genetics; Orthomyxoviridae Infections/virology; Phylogeny; RNA, Viral/genetics; Swine; United States
11.
Virol J ; 14(1): 110, 2017 06 12.
Article in English | MEDLINE | ID: mdl-28606155

ABSTRACT

BACKGROUND: We previously reported the discovery of a novel, putative flavivirus designated T'Ho virus in Culex quinquefasciatus mosquitoes in the Yucatan Peninsula of Mexico. A 1358-nt region of the NS5 gene was amplified and sequenced but an isolate was not recovered. RESULTS: The complete genome of T'Ho virus was sequenced using a combination of unbiased high-throughput sequencing, 5' and 3' rapid amplification of cDNA ends, reverse transcription-polymerase chain reaction and Sanger sequencing. The genome contains a single open reading frame of 10,284 nt which is flanked by 5' and 3' untranslated regions of 97 and 556-nt, respectively. Genome sequence alignments revealed that T'Ho virus is most closely related to Rocio virus (67.4% nucleotide identity) and Ilheus virus (65.9%), both of which belong to the Ntaya group, followed by other Ntaya group viruses (58.8-63.3%) and Japanese encephalitis group viruses (62.0-63.7%). Phylogenetic inference is in agreement with these findings. CONCLUSIONS: This study furthers our understanding of flavivirus genetics, phylogeny and diagnostics. Because the two closest known relatives of T'Ho virus are human pathogens, T'Ho virus could be an unrecognized cause of human disease. It is therefore important that future studies investigate the public health significance of this virus.


Subject(s)
Flavivirus/genetics; Sequence Analysis, DNA; Whole Genome Sequencing; Animals; Cluster Analysis; Culex; Flavivirus/isolation & purification; Mexico; Open Reading Frames; Phylogeny; Sequence Homology, Nucleic Acid
12.
PLoS One ; 18(8): e0290473, 2023.
Article in English | MEDLINE | ID: mdl-37616210

ABSTRACT

Understanding the microbial genomic contributors to antimicrobial resistance (AMR) is essential for early detection of emerging AMR infections, a pressing global health threat in human and veterinary medicine. Here we used whole genome sequencing and antibiotic susceptibility test data from 980 disease-causing Escherichia coli isolated from companion and farm animals to model AMR genotypes and phenotypes for 24 antibiotics. We determined the strength of genotype-to-phenotype relationships for 197 AMR genes with elastic net logistic regression. Model predictors were designed to evaluate different potential modes of AMR genotype translation into resistance phenotypes. Our results show a model that considers the presence of individual AMR genes and total number of AMR genes present from a set of genes known to confer resistance was able to accurately predict isolate resistance on average (mean F1 score = 98.0%, SD = 2.3%, mean accuracy = 98.2%, SD = 2.7%). However, fitted models sometimes varied for antibiotics in the same class and for the same antibiotic across animal hosts, suggesting heterogeneity in the genetic determinants of AMR. We conclude that an interpretable AMR prediction model can be used to accurately predict resistance phenotypes across multiple host species and reveal testable hypotheses about how the mechanism of resistance may vary across antibiotics within the same class and across animal hosts for the same antibiotic.


Subject(s)
Anti-Bacterial Agents; Livestock; Animals; Humans; Anti-Bacterial Agents/pharmacology; Pets; Drug Resistance, Bacterial/genetics; Escherichia coli/genetics
13.
Nat Commun ; 14(1): 7668, 2023 Nov 23.
Article in English | MEDLINE | ID: mdl-37996457

ABSTRACT

Uncovering the mechanisms regulating hematopoietic specification not only would overcome current limitations related to hematopoietic stem and progenitor cell (HSPC) transplantation, but also advance cellular immunotherapies. However, generating functional human induced pluripotent stem cell (hiPSC)-derived HSPCs and their derivatives has been elusive, necessitating a better understanding of the developmental mechanisms that trigger HSPC specification. Here, we reveal that early activation of the Nod1-Ripk2-NF-kB inflammatory pathway in endothelial cells (ECs) primes them to switch fate towards definitive hemogenic endothelium, a pre-requisite to specify HSPCs. Our genetic and chemical embryonic models show that HSPCs fail to specify in the absence of Nod1 and its downstream kinase Ripk2 due to a failure in hemogenic endothelial (HE) programming, and that small Rho GTPases coordinate the activation of this pathway. Manipulation of NOD1 in a human system of definitive hematopoietic differentiation indicates functional conservation. This work establishes the RAC1-NOD1-RIPK2-NF-kB axis as a critical intrinsic inductor that primes ECs prior to HE fate switch and HSPC specification. Manipulation of this pathway could help derive a competent HE amenable to specify functional patient-specific HSPCs and their derivatives for the treatment of blood disorders.


Subject(s)
Hemangioblasts; Induced Pluripotent Stem Cells; Monomeric GTP-Binding Proteins; Humans; Cell Differentiation; Hematopoiesis/physiology; Hematopoietic Stem Cells/metabolism; Induced Pluripotent Stem Cells/metabolism; Monomeric GTP-Binding Proteins/metabolism; NF-kappa B/metabolism; rho GTP-Binding Proteins/genetics; rho GTP-Binding Proteins/metabolism
14.
J Virol ; 85(19): 10421-4, 2011 Oct.
Article in English | MEDLINE | ID: mdl-21752904

ABSTRACT

Two variants of equine infectious anemia virus (EIAV) that differed in sensitivity to broadly neutralizing antibody were tested in direct competition assays. No differences were observed in the growth curves and relative fitness scores of EIAVs of principal neutralizing domain variants of groups 1 (EIAV(PND-1)) and 5 (EIAV(PND-5)), respectively; however, the neutralization-resistant EIAV(PND-5) variant was less infectious in single-round replication assays. Infectious center assays indicated similar rates of cell-to-cell spread, which was approximately 1,000-fold more efficient than cell-free infectivity. These data indicate that efficient cell-to-cell spread can overcome the decreased infectivity that may accompany immune escape and should be considered in studies assessing the relative levels of fitness among lentivirus variants, including HIV-1.


Subject(s)
Antibodies, Neutralizing/immunology; Infectious Anemia Virus, Equine/growth & development; Infectious Anemia Virus, Equine/immunology; Mutation; Animals; Antibodies, Viral/immunology; Cell Line; Infectious Anemia Virus, Equine/genetics; Infectious Anemia Virus, Equine/pathogenicity; Neutralization Tests; Virulence
15.
Arch Virol ; 157(6): 1205-9, 2012 Jun.
Article in English | MEDLINE | ID: mdl-22411100

ABSTRACT

We previously reported the isolation of South River virus (SORV) from a pool of mosquitoes collected in the Yucatan Peninsula of Mexico (Farfan-Ale et al. in Vector Borne Zoonotic Dis 10:777-783, 5). The isolate (designated SORV-252) was identified as SORV after a 197-nucleotide region of its small RNA genome segment was sequenced. In the present study, the complete small and medium RNA genome segments and part of the large RNA genome segment of SORV-252 were sequenced and shown to have 92%, 85% and 90% nucleotide sequence identity, respectively, to the homologous regions of the prototype SORV isolate (NJO-94F). To determine the antigenic relationship between SORV-252 and NJO-94F, cross-plaque reduction neutralization tests (PRNTs) were performed using sera from mice inoculated with these viruses. SORV-252 and NJO-94F were distinguishable in the cross-neutralization assays; there was a twofold difference in the PRNT titers in one direction and a fourfold difference in the other direction, suggesting that SORV-252 represents a novel subtype of SORV. Additionally, SORV-252 and NJO-94F have distinct plaque morphologies in African green monkey kidney (Vero) cells. In conclusion, we provide evidence that a novel subtype of SORV is present in the Yucatan Peninsula of Mexico.


Subject(s)
Bunyaviridae/classification; Bunyaviridae/isolation & purification; Culicidae/virology; Animals; Antibodies, Viral/immunology; Bunyaviridae/genetics; Bunyaviridae/immunology; Chlorocebus aethiops; Genome, Viral; Mice; Molecular Sequence Data; Neutralization Tests; Phylogeny; Vero Cells
16.
Arch Virol ; 157(6): 1199-204, 2012 Jun.
Article in English | MEDLINE | ID: mdl-22407405

ABSTRACT

We determined the complete nucleotide sequences of the small (S) and medium (M) RNA segments of an orthobunyavirus isolated from mosquitoes in the Yucatan Peninsula of Mexico. A 528-nt region of the large (L) RNA segment was also sequenced. The S RNA segment has greatest nucleotide identity to the homologous region of Cache Valley virus (CVV; 98%) followed by Potosi virus (POTV; 89%) and Northway virus (86%). The M RNA segment has 96% nucleotide identity to the homologous region of POTV, and less than 74% nucleotide identity to the homologous regions of all other orthobunyaviruses for which M segment sequence data are available. The L RNA segment has greatest nucleotide identity to the homologous region of POTV (98%) followed by CVV (82%) and Tensaw virus (77%). These data indicate that the virus, tentatively named Cholul virus (CHLV), is a novel reassortant that acquired its S RNA segment from CVV and its M and L RNA segments from POTV. Phylogenetic data support this conclusion.


Subject(s)
Bunyamwera virus/classification; Bunyamwera virus/genetics; Bunyamwera virus/isolation & purification; Phylogeny; Reassortant Viruses/classification; Amino Acid Sequence; Animals; Base Sequence; Culicidae/virology; Mexico; Molecular Sequence Data; Reassortant Viruses/genetics; Reassortant Viruses/isolation & purification; Recombination, Genetic; Sequence Homology; Viral Proteins/genetics
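
Entry 16 bases its reassortment argument on percent nucleotide identity between homologous regions. The sketch below shows one common way such a figure can be computed from a pairwise alignment. It is an illustration only: the abstract does not say which software or gap convention the authors used, so the function name `percent_identity`, the choice to skip gapped columns, and the toy sequences are all assumptions.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over ungapped columns of a pairwise alignment.

    Both inputs must already be aligned (equal length, '-' for gaps).
    Skipping gapped columns is one convention among several.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments: one mismatch in ten ungapped columns.
identity = percent_identity("ACGTACGTAC", "ACGTTCGTAC")
print(f"{identity:.1f}% identity")
```

Counting gapped columns as mismatches instead would lower the reported value, so identities from different studies are only comparable when the convention is the same.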
17.
Virus Genes ; 45(1): 176-80, 2012 Aug.
Article in English | MEDLINE | ID: mdl-22467180

ABSTRACT

Nucleotide sequencing was performed on part of the medium and large genome segments of 17 Cache Valley virus (CVV) isolates from the Yucatan Peninsula of Mexico. Alignment of these sequences to all other sequences in the GenBank database revealed that they have greatest nucleotide identity (97-98%) with the equivalent regions of Tlacotalpan virus (TLAV), which is considered to be a variety of CVV. Next, cross-plaque reduction neutralization tests (PRNTs) were performed using sera from mice that had been inoculated with a representative isolate from the Yucatan Peninsula (CVV-478) or the prototype TLAV isolate (61-D-240). The PRNT titers exhibited a twofold difference in one direction and no difference in the other direction, suggesting that CVV-478 and 61-D-240 belong to the same CVV subtype. In conclusion, we demonstrate that the CVV isolates from the Yucatan Peninsula of Mexico are genetically and antigenically similar to the prototype TLAV isolate.


Subject(s)
Aedes/virology; Bunyamwera virus/genetics; Bunyamwera virus/immunology; Animals; Bunyamwera virus/classification; Bunyamwera virus/isolation & purification; Female; Immune Sera/immunology; Mexico; Mice; Mice, Inbred BALB C; Neutralization Tests; Phylogeny; Sequence Analysis, DNA; Viral Plaque Assay
18.
J Bacteriol ; 193(19): 5450-64, 2011 Oct.
Article in English | MEDLINE | ID: mdl-21784931

ABSTRACT

Xanthomonas is a large genus of bacteria that collectively cause disease on more than 300 plant species. The broad host range of the genus contrasts with stringent host and tissue specificity for individual species and pathovars. Whole-genome sequences of Xanthomonas campestris pv. raphani strain 756C and X. oryzae pv. oryzicola strain BLS256, pathogens that infect the mesophyll tissue of the leading models for plant biology, Arabidopsis thaliana and rice, respectively, were determined and provided insight into the genetic determinants of host and tissue specificity. Comparisons were made with genomes of closely related strains that infect the vascular tissue of the same hosts and across a larger collection of complete Xanthomonas genomes. The results suggest a model in which complex sets of adaptations at the level of gene content account for host specificity and subtler adaptations at the level of amino acid or noncoding regulatory nucleotide sequence determine tissue specificity.


Subject(s)
Genome, Bacterial/genetics , Xanthomonas/genetics , Arabidopsis/microbiology , Molecular Sequence Data , Oryza/microbiology , Xanthomonas/physiology
19.
BMC Bioinformatics ; 12 Suppl 1: S52, 2011 Feb 15.
Article in English | MEDLINE | ID: mdl-21342585

ABSTRACT

BACKGROUND: High-throughput short read sequencing is revolutionizing genomics and systems biology research by enabling cost-effective deep coverage sequencing of genomes and transcriptomes. Error detection and correction are crucial to many short read sequencing applications including de novo genome sequencing, genome resequencing, and digital gene expression analysis. Short read error detection is typically carried out by counting the observed frequencies of kmers in reads and validating those with frequencies exceeding a threshold. In case of genomes with high repeat content, an erroneous kmer may be frequently observed if it has few nucleotide differences with valid kmers with multiple occurrences in the genome. Error detection and correction were mostly applied to genomes with low repeat content and this remains a challenging problem for genomes with high repeat content. RESULTS: We develop a statistical model and a computational method for error detection and correction in the presence of genomic repeats. We propose a method to infer genomic frequencies of kmers from their observed frequencies by analyzing the misread relationships among observed kmers. We also propose a method to estimate the threshold useful for validating kmers whose estimated genomic frequency exceeds the threshold. We demonstrate that superior error detection is achieved using these methods. Furthermore, we break away from the common assumption of uniformly distributed errors within a read, and provide a framework to model position-dependent error occurrence frequencies common to many short read platforms. Lastly, we achieve better error correction in genomes with high repeat content. AVAILABILITY: The software is implemented in C++ and is freely available under GNU GPL3 license and Boost Software V1.0 license at "http://aluru-sun.ece.iastate.edu/doku.php?id=redeem".
CONCLUSIONS: We introduce a statistical framework to model sequencing errors in next-generation reads, which led to promising results in detecting and correcting errors for genomes with high repeat content.
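
The baseline approach described in the BACKGROUND above (count observed kmer frequencies, then validate those exceeding a threshold) can be illustrated with a minimal Python sketch. This is not the authors' statistical model or their C++ software; the function names, the toy reads, and the threshold value are assumptions chosen only for illustration.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every length-k substring (kmer) observed across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def flag_suspect_kmers(counts, threshold):
    """Return kmers whose observed frequency does not exceed the threshold."""
    return {kmer for kmer, n in counts.items() if n <= threshold}


# Toy data (hypothetical): three identical reads and one read with a
# single substitution (T -> A at position 3).
reads = ["ACGTACGT", "ACGTACGT", "ACGTACGT", "ACGAACGT"]
counts = count_kmers(reads, k=4)
suspect = flag_suspect_kmers(counts, threshold=1)
```

In this toy input the four kmers that overlap the substituted base are each seen once and are flagged, while `ACGT` is seen seven times. As the abstract notes, a fixed threshold of this kind may fail in repeat-rich genomes, where an erroneous kmer can be observed often enough to pass.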


Subject(s)
Computational Biology/methods , Genomics/methods , High-Throughput Nucleotide Sequencing/methods , Models, Statistical , Software , Algorithms , Likelihood Functions
20.
Bioinformatics ; 26(20): 2526-33, 2010 Oct 15.
Article in English | MEDLINE | ID: mdl-20834037

ABSTRACT

MOTIVATION: Error correction is critical to the success of next-generation sequencing applications, such as resequencing and de novo genome sequencing. It is especially important for high-throughput short-read sequencing, where reads are much shorter and more abundant, and errors more frequent than in traditional Sanger sequencing. Processing massive numbers of short reads with existing error correction methods is both compute and memory intensive, yet the results are far from satisfactory when applied to real datasets. RESULTS: We present a novel approach, termed Reptile, for error correction in short-read data from next-generation sequencing. Reptile works with the spectrum of k-mers from the input reads, and corrects errors by simultaneously examining: (i) Hamming distance-based correction possibilities for potentially erroneous k-mers; and (ii) neighboring k-mers from the same read for correct contextual information. By not needing to store input data, Reptile has the favorable property that it can handle data that does not fit in main memory. In addition to sequence data, Reptile can make use of available quality score information. Our experiments show that Reptile outperforms previous methods in the percentage of errors removed from the data and the accuracy in true base assignment. In addition, a significant reduction in run time and memory usage has been achieved compared with previous methods, making it more practical for short-read error correction when sampling larger genomes. AVAILABILITY: Reptile is implemented in C++ and is available through the link: http://aluru-sun.ece.iastate.edu/doku.php?id=software CONTACT: aluru@iastate.edu.
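
The idea in point (i) above, Hamming distance-based correction possibilities drawn from the k-mer spectrum, can be sketched in a few lines of Python. This is a simplified illustration, not Reptile's actual algorithm or API; the function names, the `min_count` and `max_dist` parameters, and the toy spectrum are all assumptions.

```python
from collections import Counter


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def correction_candidates(kmer, counts, min_count, max_dist=1):
    """List well-supported k-mers within max_dist substitutions of kmer."""
    return sorted(
        other
        for other, n in counts.items()
        if n >= min_count
        and other != kmer
        and len(other) == len(kmer)
        and hamming(kmer, other) <= max_dist
    )


# Toy k-mer spectrum (hypothetical counts): "ACGA" is rare and therefore
# treated as potentially erroneous.
counts = Counter({"ACGT": 7, "CGTA": 3, "ACGA": 1, "TCGA": 4})
candidates = correction_candidates("ACGA", counts, min_count=3)
```

Here the rare k-mer `ACGA` has two well-supported neighbors at Hamming distance 1 (`ACGT` and `TCGA`), so distance alone leaves the correction ambiguous. This is the kind of case where contextual information from neighboring k-mers in the same read, point (ii) in the abstract, would be needed to choose between candidates.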


Subject(s)
Genomics/methods , Sequence Analysis, DNA/methods , Software , Algorithms