1.
Proc Natl Acad Sci U S A ; 118(23)2021 06 08.
Article in English | MEDLINE | ID: mdl-34074772

ABSTRACT

Bacteriophages (phages) have evolved efficient means to take over the machinery of the bacterial host. The molecular tools at their disposal may be applied to manipulate bacteria and to divert molecular pathways at will. Here, we describe a bacterial growth inhibitor, gene product T5.015, encoded by the T5 phage. High-throughput sequencing of genomic DNA of bacterial mutants, resistant to this inhibitor, revealed disruptive mutations in the Escherichia coli ung gene, suggesting that growth inhibition mediated by T5.015 depends on the uracil-excision activity of Ung. We validated that growth inhibition is abrogated in the absence of ung and confirmed physical binding of Ung by T5.015. In addition, biochemical assays with T5.015 and Ung indicated that T5.015 mediates endonucleolytic activity at abasic sites generated by the base-excision activity of Ung. Importantly, the growth inhibition resulting from the endonucleolytic activity is manifested by DNA replication and cell division arrest. We speculate that the phage uses this protein to selectively cause cleavage of the host DNA, which possesses more misincorporated uracils than that of the phage. This protein may also enhance phage utilization of the available resources in the infected cell, since halting replication saves nucleotides, and stopping cell division maintains both daughters of a dividing cell.


Subjects
Bacteriophages/genetics , Bacteriophages/physiology , DNA/metabolism , Deoxyuracil Nucleotides/metabolism , Cell Cycle Checkpoints , Cell Division , Endonucleases , Escherichia coli/genetics , High-Throughput Nucleotide Sequencing , Mutation , Uracil/metabolism
2.
Mol Biol Evol ; 39(11)2022 11 03.
Article in English | MEDLINE | ID: mdl-36282896

ABSTRACT

The inference of genome rearrangement events has been extensively studied, as they play a major role in molecular evolution. However, probabilistic evolutionary models that explicitly imitate the evolutionary dynamics of such events, as well as methods to infer model parameters, are yet to be fully utilized. Here, we developed a probabilistic approach to infer genome rearrangement rate parameters using an Approximate Bayesian Computation (ABC) framework. We developed two genome rearrangement models, a basic model, which accounts for genomic changes in gene order, and a more sophisticated one which also accounts for changes in chromosome number. We characterized the ABC inference accuracy using simulations and applied our methodology to both prokaryotic and eukaryotic empirical datasets. Knowledge of genome-rearrangement rates can help elucidate their role in evolution as well as help simulate genomes with evolutionary dynamics that reflect empirical genomes.
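
The abstract above names Approximate Bayesian Computation but does not give the algorithm, so the following is only a minimal rejection-ABC sketch under loud assumptions. The simulator (Poisson-distributed event counts per branch), the flat prior on the rate, the single summary statistic and every function name are hypothetical stand-ins, not the authors' models or code.

```python
import math
import random

def sample_poisson(mean, rng):
    """Draw one Poisson variate with Knuth's multiplication method."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def simulate_summary(rate, branch_lengths, rng):
    """Toy simulator (assumed): mean number of rearrangement events per branch."""
    counts = [sample_poisson(rate * t, rng) for t in branch_lengths]
    return sum(counts) / len(counts)

def abc_rejection(observed, branch_lengths, n_draws, tolerance, rng):
    """Keep prior draws whose simulated summary lies close to the observed one."""
    accepted = []
    for _ in range(n_draws):
        rate = rng.uniform(0.0, 10.0)  # flat prior on the rate (assumed)
        simulated = simulate_summary(rate, branch_lengths, rng)
        if abs(simulated - observed) <= tolerance:
            accepted.append(rate)
    return accepted

rng = random.Random(0)
branch_lengths = [1.0] * 50  # hypothetical branch lengths
true_rate = 3.0              # rate used to generate the pretend "empirical" data
observed = simulate_summary(true_rate, branch_lengths, rng)
posterior = abc_rejection(observed, branch_lengths, 5000, 0.2, rng)
estimate = sum(posterior) / len(posterior)
print(f"accepted {len(posterior)} draws, posterior mean rate = {estimate:.2f}")
```

The accepted draws approximate the posterior of the rate given the summary statistic. A realistic application would replace the toy simulator with a genome-rearrangement simulator and use several summary statistics rather than one.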


Subjects
Molecular Evolution , Genome , Bayes Theorem , Computer Simulation , Genomics
3.
Bioinformatics ; 38(8): 2341-2343, 2022 04 12.
Article in English | MEDLINE | ID: mdl-35157036

ABSTRACT

MOTIVATION: Type-III secretion systems are utilized by many Gram-negative bacteria to inject type-3 effectors (T3Es) to eukaryotic cells. These effectors manipulate host processes for the benefit of the bacteria and thus promote disease. They can also function as host-specificity determinants through their recognition as avirulence proteins that elicit immune response. Identifying the full effector repertoire within a set of bacterial genomes is of great importance to develop appropriate treatments against the associated pathogens. RESULTS: We present Effectidor, a user-friendly web server that harnesses several machine-learning techniques to predict T3Es within bacterial genomes. We compared the performance of Effectidor to other available tools for the same task on three pathogenic bacteria. Effectidor outperformed these tools in terms of classification accuracy (area under the precision-recall curve above 0.98 in all cases). AVAILABILITY AND IMPLEMENTATION: Effectidor is available at: https://effectidor.tau.ac.il, and the source code is available at: https://github.com/naamawagner/Effectidor. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Bacterial Proteins , Type III Secretion Systems , Type III Secretion Systems/metabolism , Bacterial Proteins/metabolism , Software , Machine Learning , Gram-Negative Bacteria/metabolism
4.
Mol Biol Evol ; 38(12): 5769-5781, 2021 12 09.
Article in English | MEDLINE | ID: mdl-34469521

ABSTRACT

Insertions and deletions (indels) are common molecular evolutionary events. However, probabilistic models for indel evolution are under-developed due to their computational complexity. Here, we introduce several improvements to indel modeling: 1) While previous models for indel evolution assumed that the rates and length distributions of insertions and deletions are equal, here we propose a richer model that explicitly distinguishes between the two; 2) we introduce numerous summary statistics that allow approximate Bayesian computation-based parameter estimation; 3) we develop a method to correct for biases introduced by alignment programs, when inferring indel parameters from empirical data sets; and 4) using a model-selection scheme, we test whether the richer model better fits biological data compared with the simpler model. Our analyses suggest that both our inference scheme and the model-selection procedure achieve high accuracy on simulated data. We further demonstrate that our proposed richer model better fits a large number of empirical data sets and that, for the majority of these data sets, the deletion rate is higher than the insertion rate.


Subjects
Molecular Evolution , INDEL Mutation , Bayes Theorem , Statistical Models , Phylogeny
5.
PLoS Comput Biol ; 17(1): e1008607, 2021 01.
Article in English | MEDLINE | ID: mdl-33493161

ABSTRACT

MOTIVATION: A comprehensive characterization of the humoral response towards a specific antigen requires quantification of the B-cell receptor repertoire by next-generation sequencing (BCR-Seq), as well as the analysis of serum antibodies against this antigen, using proteomics. The proteomic analysis is challenging since it necessitates the mapping of antigen-specific peptides to individual B-cell clones. RESULTS: The PASA web server provides a robust computational platform for the analysis and integration of data obtained from proteomics of serum antibodies. PASA maps peptides derived from antibodies raised against a specific antigen to corresponding antibody sequences. It then analyzes and integrates proteomics and BCR-Seq data, thus providing a comprehensive characterization of the humoral response. The PASA web server is freely available at https://pasa.tau.ac.il and open to all users without a login requirement.


Subjects
Antibodies , Internet , Proteomics/methods , Software , Animals , Antibodies/blood , Antibodies/immunology , Antigens/immunology , B-Lymphocytes/immunology , Protein Databases , High-Throughput Nucleotide Sequencing , Humans , Mice
6.
Mol Biol Evol ; 37(11): 3338-3352, 2020 11 01.
Article in English | MEDLINE | ID: mdl-32585030

ABSTRACT

Statistical criteria have long been the standard for selecting the best model for phylogenetic reconstruction and downstream statistical inference. Although model selection is regarded as a fundamental step in phylogenetics, existing methods for this task consume computational resources for long processing time, they are not always feasible, and sometimes depend on preliminary assumptions which do not hold for sequence data. Moreover, although these methods are dedicated to revealing the processes that underlie the sequence data, they do not always produce the most accurate trees. Notably, phylogeny reconstruction consists of two related tasks, topology reconstruction and branch-length estimation. It was previously shown that in many cases the most complex model, GTR+I+G, leads to topologies that are as accurate as using existing model selection criteria, but overestimates branch lengths. Here, we present ModelTeller, a computational methodology for phylogenetic model selection, devised within the machine-learning framework, optimized to predict the most accurate nucleotide substitution model for branch-length estimation. We demonstrate that ModelTeller leads to more accurate branch-length inference than current model selection criteria on data sets simulated under realistic processes. ModelTeller relies on a readily implemented machine-learning model and thus the prediction according to features extracted from the sequence data results in a substantial decrease in running time compared with existing strategies. By harnessing the machine-learning framework, we distinguish between features that mostly contribute to branch-length optimization, concerning the extent of sequence divergence, and features that are related to estimates of the model parameters that are important for the selection made by current criteria.


Subjects
Machine Learning , Genetic Models , Phylogeny
7.
Nucleic Acids Res ; 47(W1): W88-W92, 2019 07 02.
Article in English | MEDLINE | ID: mdl-31114912

ABSTRACT

Large-scale mining and analysis of bacterial datasets contribute to the comprehensive characterization of complex microbial dynamics within a microbiome and among different bacterial strains, e.g., during disease outbreaks. The study of large-scale bacterial evolutionary dynamics poses many challenges. These include data-mining steps, such as gene annotation, ortholog detection, sequence alignment and phylogeny reconstruction. These steps require the use of multiple bioinformatics tools and ad-hoc programming scripts, making the entire process cumbersome, tedious and error-prone due to manual handling. This motivated us to develop the M1CR0B1AL1Z3R web server, a 'one-stop shop' for conducting microbial genomics data analyses via a simple graphical user interface. Some of the features implemented in M1CR0B1AL1Z3R are: (i) extracting putative open reading frames and comparative genomics analysis of gene content; (ii) extracting orthologous sets and analyzing their size distribution; (iii) analyzing gene presence-absence patterns; (iv) reconstructing a phylogenetic tree based on the extracted orthologous set; (v) inferring GC-content variation among lineages. M1CR0B1AL1Z3R facilitates the mining and analysis of dozens of bacterial genomes using advanced techniques, with the click of a button. M1CR0B1AL1Z3R is freely available at https://microbializer.tau.ac.il/.


Subjects
Bacterial Genome/genetics , Genomics , Software , Computational Biology , Internet , Phylogeny , Sequence Alignment/methods , User-Computer Interface
8.
PLoS Genet ; 12(9): e1006280, 2016 09.
Article in English | MEDLINE | ID: mdl-27618184

ABSTRACT

The use of whole-genome phylogenetic analysis has revolutionized our understanding of the evolution and spread of many important bacterial pathogens due to the high resolution view it provides. However, the majority of such analyses do not consider the potential role of accessory genes when inferring evolutionary trajectories. Moreover, the recently discovered importance of the switching of gene regulatory elements suggests that an exhaustive analysis, combining information from core and accessory genes with regulatory elements could provide unparalleled detail of the evolution of a bacterial population. Here we demonstrate this principle by applying it to a worldwide multi-host sample of the important pathogenic E. coli lineage ST131. Our approach reveals the existence of multiple circulating subtypes of the major drug-resistant clade of ST131 and provides the first ever population level evidence of core genome substitutions in gene regulatory regions associated with the acquisition and maintenance of different accessory genome elements.


Subjects
Microbial Drug Resistance/genetics , Escherichia coli Infections/drug therapy , Escherichia coli/genetics , Molecular Evolution , Escherichia coli/pathogenicity , Escherichia coli Infections/genetics , Bacterial Genome/drug effects , Humans , Phylogeny , Nucleic Acid Regulatory Sequences/genetics , DNA Sequence Analysis
9.
Nat Biomed Eng ; 2024 Oct 01.
Article in English | MEDLINE | ID: mdl-39354052

ABSTRACT

The application of machine learning to tasks involving volumetric biomedical imaging is constrained by the limited availability of annotated datasets of three-dimensional (3D) scans for model training. Here we report a deep-learning model pre-trained on 2D scans (for which annotated data are relatively abundant) that accurately predicts disease-risk factors from 3D medical-scan modalities. The model, which we named SLIViT (for 'slice integration by vision transformer'), preprocesses a given volumetric scan into 2D images, extracts their feature map and integrates it into a single prediction. We evaluated the model in eight different learning tasks, including classification and regression for six datasets involving four volumetric imaging modalities (computed tomography, magnetic resonance imaging, optical coherence tomography and ultrasound). SLIViT consistently outperformed domain-specific state-of-the-art models and was typically as accurate as clinical specialists who had spent considerable time manually annotating the analysed scans. Automating diagnosis tasks involving volumetric scans may save valuable clinician hours, reduce data acquisition costs and duration, and help expedite medical research and clinical applications.

10.
PLOS Digit Health ; 2(2): e0000106, 2023 Feb.
Article in English | MEDLINE | ID: mdl-36812608

ABSTRACT

Age-related Macular Degeneration (AMD) is a major cause of irreversible vision loss in individuals over 55 years old in the United States. One of the late-stage manifestations of AMD, and a major cause of vision loss, is the development of exudative macular neovascularization (MNV). Optical Coherence Tomography (OCT) is the gold standard to identify fluid at different levels within the retina. The presence of fluid is considered the hallmark to define the presence of disease activity. Anti-vascular endothelial growth factor (anti-VEGF) injections can be used to treat exudative MNV. However, given the limitations of anti-VEGF treatment, such as the burdensome need for frequent visits and repeated injections to sustain efficacy, limited durability of the treatment, and poor or no response, there is a great interest in detecting early biomarkers associated with a higher risk for AMD progression to exudative forms in order to optimize the design of early intervention clinical trials. The annotation of structural biomarkers on OCT B-scans is a laborious, complex and time-consuming process, and discrepancies between human graders can introduce variability into this assessment. To address this issue, a deep-learning model (SLIVER-net) was proposed, which could identify AMD biomarkers on structural OCT volumes with high precision and without human supervision. However, the validation was performed on a small dataset, and the true predictive power of these detected biomarkers in the context of a large cohort has not been evaluated. In this retrospective cohort study, we perform the largest-scale validation of these biomarkers to date. We also assess how these features combined with other electronic health record (EHR) data (demographics, comorbidities, etc.) affect and/or improve the prediction performance relative to known factors.
Our hypothesis is that these biomarkers can be identified by a machine learning algorithm without human supervision, in a way that they preserve their predictive nature. The way we test this hypothesis is by building several machine learning models utilizing these machine-read biomarkers and assessing their added predictive power. We found that not only can we show that the machine-read OCT B-scan biomarkers are predictive of AMD progression, we also observe that our proposed combined OCT and EHR data-based algorithm outperforms the state-of-the-art solution in clinically relevant metrics and provides actionable information which has the potential to improve patient care. In addition, it provides a framework for automated large-scale processing of OCT volumes, making it possible to analyze vast archives without human supervision.

11.
Microbiol Spectr ; 11(6): e0169723, 2023 Dec 12.
Article in English | MEDLINE | ID: mdl-37888989

ABSTRACT

IMPORTANCE: We have identified a novel phage-encoded inhibitor of the major cytoskeletal protein in bacterial division, FtsZ. The inhibition is shown to confer T5 bacteriophage with a growth advantage in dividing hosts. Our studies demonstrate a strategy in bacteriophages to maximize their progeny number by inhibiting escape of one of the daughter cells of an infected bacterium. They further emphasize that FtsZ is a natural target for bacterial growth inhibition.


Subjects
Bacteriophages , Cell Division , Bacteriophages/physiology , Bacteria , Cytoskeletal Proteins , Bacterial Proteins/genetics
12.
Res Sq ; 2023 Nov 21.
Article in English | MEDLINE | ID: mdl-38045283

ABSTRACT

We present SLIViT, a deep-learning framework that accurately measures disease-related risk factors in volumetric biomedical imaging, such as magnetic resonance imaging (MRI) scans, optical coherence tomography (OCT) scans, and ultrasound videos. To evaluate SLIViT, we applied it to five different datasets of these three different data modalities tackling seven learning tasks (including both classification and regression) and found that it consistently and significantly outperforms domain-specific state-of-the-art models, typically improving performance (ROC AUC or correlation) by 0.1-0.4. Notably, compared to existing approaches, SLIViT can be applied even when only a small number of annotated training samples is available, which is often a constraint in medical applications. When trained on fewer than 700 annotated volumes, SLIViT obtained accuracy comparable to trained clinical specialists while reducing annotation time by a factor of 5,000, demonstrating its utility to automate and expedite ongoing research and other practical clinical scenarios.

13.
Front Microbiol ; 13: 840308, 2022.
Article in English | MEDLINE | ID: mdl-35495725

ABSTRACT

The type VI secretion system (T6SS) present in many Gram-negative bacteria is a contact-dependent apparatus that can directly deliver secreted effectors or toxins into diverse neighboring cellular targets including both prokaryotic and eukaryotic organisms. Recent reverse genetics studies with T6 core gene loci have indicated the importance of functional T6SS toward overall competitive fitness in various pathogenic Xanthomonas spp. To understand the contribution of T6SS toward ecology and evolution of Xanthomonas spp., we explored the distribution of the three distinguishable T6SS clusters, i3*, i3***, and i4, in approximately 1,740 Xanthomonas genomes, along with their conservation, genetic organization, and their evolutionary patterns in this genus. Screening genomes for core genes of each T6 cluster indicated that 40% of the sequenced strains possess two T6 clusters, with combinations of i3*** and i3* or i3*** and i4. A few strains of Xanthomonas citri, Xanthomonas phaseoli, and Xanthomonas cissicola were the exception, possessing a unique combination of i3* and i4. The findings also indicated clade-specific distribution of T6SS clusters. Phylogenetic analysis demonstrated that T6SS clusters i3* and i3*** were probably acquired by the ancestor of the genus Xanthomonas, followed by gain or loss of individual clusters upon diversification into subsequent clades. T6 i4 cluster has been acquired in recent independent events by group 2 xanthomonads followed by its spread via horizontal dissemination across distinct clades across groups 1 and 2 xanthomonads. We also noted reshuffling of the entire core T6 loci, as well as T6SS spike complex components, hcp and vgrG, among different species. Our findings indicate that gain or loss events of specific T6SS clusters across Xanthomonas phylogeny have not been random.

14.
J Mol Biol ; 433(15): 167071, 2021 07 23.
Article in English | MEDLINE | ID: mdl-34052285

ABSTRACT

Antibodies provide a comprehensive record of the encounters with threats and insults to the immune system. The ability to examine the repertoire of antibodies in serum and discover those that best represent "discriminating features" characteristic of various clinical situations, is potentially very useful. Recently, phage display technologies combined with Next-Generation Sequencing (NGS) produced a powerful experimental methodology, coined "Deep-Panning", in which the spectrum of serum antibodies is probed. In order to extract meaningful biological insights from the tens of millions of affinity-selected peptides generated by Deep-Panning, advanced bioinformatics algorithms are a must. In this study, we describe Motifier, a computational pipeline comprised of a set of algorithms that systematically generates discriminatory peptide motifs based on the affinity-selected peptides identified by Deep-Panning. These motifs are shown to effectively characterize antibody binding activities and through the implementation of machine-learning protocols are shown to accurately classify complex antibody mixtures representing various biological conditions.


Subjects
Antibodies/chemistry , Computational Biology/methods , Algorithms , Amino Acid Motifs , High-Throughput Nucleotide Sequencing , Humans , Machine Learning , Peptide Library
15.
mSystems ; 6(1)2021 Feb 02.
Article in English | MEDLINE | ID: mdl-33531410

ABSTRACT

Degradation of intracellular proteins in Gram-negative bacteria regulates various cellular processes and serves as a quality control mechanism by eliminating damaged proteins. To understand what causes the proteolytic machinery of the cell to degrade some proteins while sparing others, we employed a quantitative pulsed-SILAC (stable isotope labeling with amino acids in cell culture) method followed by mass spectrometry analysis to determine the half-lives for the proteome of exponentially growing Escherichia coli, under standard conditions. We developed a likelihood-based statistical test to find actively degraded proteins and identified dozens of fast-degrading novel proteins. Finally, we used structural, physicochemical, and protein-protein interaction network descriptors to train a machine learning classifier to discriminate fast-degrading proteins from the rest of the proteome, achieving an area under the receiver operating characteristic curve (AUC) of 0.72. IMPORTANCE: Bacteria use protein degradation to control proliferation, dispose of misfolded proteins, and adapt to physiological and environmental shifts, but the factors that dictate which proteins are prone to degradation are mostly unknown. In this study, we have used a combined computational-experimental approach to explore protein degradation in E. coli. We discovered that the proteome of E. coli is composed of three protein populations that are distinct in terms of stability and functionality, and we show that fast-degrading proteins can be identified using a combination of various protein properties. Our findings expand the understanding of protein degradation in bacteria and have implications for protein engineering. Moreover, as rapidly degraded proteins may play an important role in pathogenesis, our findings may help to identify new potential antibacterial drug targets.
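
For readers unfamiliar with the AUC metric quoted in the abstract above, the sketch below computes it directly from its pairwise definition: the fraction of (positive, negative) pairs in which the positive example receives the higher score. The scores and labels are made-up illustrative values, and `roc_auc` is a hypothetical helper name, not something from the study.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))

# Hypothetical classifier scores; label 1 marks a fast-degrading protein.
scores = [0.9, 0.4, 0.5, 0.1]
labels = [1, 1, 0, 0]
auc = roc_auc(scores, labels)
print(auc)  # 3 of the 4 positive/negative pairs are ordered correctly -> 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which places the reported 0.72 between the two.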

16.
EMBO Mol Med ; 12(11): e13171, 2020 11 06.
Article in English | MEDLINE | ID: mdl-33073919

ABSTRACT

The rapid spread of SARS-CoV-2 and its threat to health systems worldwide have led governments to take acute actions to enforce social distancing. Previous studies used complex epidemiological models to quantify the effect of lockdown policies on infection rates. However, these rely on prior assumptions or on official regulations. Here, we use country-specific reports of daily mobility from people's cellular usage to model social distancing. Our data-driven model enabled the extraction of lockdown characteristics which were crossed with observed mortality rates to show that: (i) the time at which social distancing was initiated is highly correlated with the number of deaths, r² = 0.64, while the lockdown strictness or its duration is not as informative; (ii) a delay of 7.49 days in initiating social distancing would double the number of deaths; and (iii) the immediate response has a prolonged effect on COVID-19 death toll.
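
The doubling figure in point (ii) can be turned into simple arithmetic. The sketch below assumes that the relationship is exponential in the delay, so that each further 7.49 days doubles deaths again. That extrapolation is an assumption made here for illustration and is not stated in the abstract. The function names are hypothetical.

```python
import math

DOUBLING_DELAY_DAYS = 7.49  # delay reported in the abstract to double deaths

def death_multiplier(delay_days, doubling_delay=DOUBLING_DELAY_DAYS):
    """Factor by which deaths grow for a given delay (exponential model, assumed)."""
    return 2.0 ** (delay_days / doubling_delay)

def implied_daily_growth(doubling_delay=DOUBLING_DELAY_DAYS):
    """Per-day relative increase implied by the doubling delay."""
    return math.exp(math.log(2.0) / doubling_delay) - 1.0

for delay in (0.0, 3.0, 7.49, 14.98):
    print(f"delay {delay:5.2f} d -> deaths x {death_multiplier(delay):.2f}")
print(f"implied growth per day of delay: {implied_daily_growth():.1%}")
```

Under this assumed model, each day of delay corresponds to roughly a 9.7% increase in the eventual death toll.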


Subjects
COVID-19/pathology , Quarantine , COVID-19/epidemiology , COVID-19/mortality , COVID-19/virology , Humans , Pandemics , Physical Distancing , SARS-CoV-2/isolation & purification , Survival Rate , Time Factors
17.
Front Immunol ; 11: 619896, 2020.
Article in English | MEDLINE | ID: mdl-33643301

ABSTRACT

The presence of pathogen-specific antibodies in an individual's blood-sample is used as an indication of previous exposure and infection to that specific pathogen (e.g., virus or bacterium). Measurement of the diagnostic antibodies is routinely achieved using solid phase immuno-assays such as ELISA tests and western blots. Here, we describe a sero-diagnostic approach based on phage-display of epitope arrays we term "Domain-Scan". We harness Next-generation sequencing (NGS) to measure the serum binding to dozens of epitopes derived from HIV-1 and HCV simultaneously. The distinction of healthy individuals from those infected with either HIV-1 or HCV, is modeled as a machine-learning classification problem, in which each determinant ("domain") is considered as a feature, and its NGS read-out provides values that correspond to the level of determinant-specific antibodies in the sample. We show that following training of a machine-learning model on labeled examples, we can very accurately classify unlabeled samples and pinpoint the domains that contribute most to the classification. Our experimental/computational Domain-Scan approach is general and can be adapted to other pathogens as long as sufficient training samples are provided.


Subjects
Communicable Diseases/diagnosis , HIV Antibodies/blood , HIV Core Protein p24/immunology , HIV Envelope Protein gp160/immunology , HIV Infections/diagnosis , Hepatitis C Antibodies/blood , Hepatitis C Antigens/immunology , Hepatitis C/diagnosis , Machine Learning , Peptide Library , Serologic Tests/methods , AIDS Serodiagnosis/methods , Amino Acid Sequence , Antigen-Antibody Reactions , Base Sequence , Taxonomic DNA Barcoding , Recombinant DNA/immunology , Epitopes/genetics , Epitopes/immunology , Genetic Vectors , HIV Core Protein p24/genetics , Hepatitis C Antigens/genetics , High-Throughput Nucleotide Sequencing , Humans , Oligonucleotides/genetics , Oligonucleotides/immunology , Peptide Fragments/genetics , Peptide Fragments/immunology , Polymerase Chain Reaction/methods
19.
Front Immunol ; 9: 1686, 2018.
Article in English | MEDLINE | ID: mdl-30105017

ABSTRACT

Reproducible and robust data on antibody repertoires are invaluable for basic and applied immunology. Next-generation sequencing (NGS) of antibody variable regions has emerged as a powerful tool in systems immunology, providing quantitative molecular information on antibody polyclonal composition. However, major computational challenges exist when analyzing antibody sequences, from error handling to hypermutation profiles and clonal expansion analyses. In this work, we developed the ASAP (A webserver for Immunoglobulin-Seq Analysis Pipeline) webserver (https://asap.tau.ac.il). The input to ASAP is a paired-end sequence dataset from one or more replicates, with or without unique molecular identifiers. These datasets can be derived from NGS of human or murine antibody variable regions. ASAP first filters and annotates the sequence reads using public or user-provided germline sequence information. The ASAP webserver next performs various calculations, including somatic hypermutation level, CDR3 lengths, V(D)J family assignments, and V(D)J combination distribution. These analyses are repeated for each replicate. ASAP provides additional information by analyzing the commonalities and differences between the repeats ("joint" analysis). For example, ASAP examines the shared variable regions and their frequency in each replicate to determine which sequences are less likely to be a result of a sample preparation derived and/or sequencing errors. Moreover, ASAP clusters the data to clones and reports the identity and prevalence of top ranking clones (clonal expansion analysis). ASAP further provides the distribution of synonymous and non-synonymous mutations within the V genes somatic hypermutations. Finally, ASAP provides means to process the data for proteomic analysis of serum/secreted antibodies by generating a variable region database for liquid chromatography high resolution tandem mass spectrometry (LC-MS/MS) interpretation. 
ASAP is user-friendly, free, and open to all users, with no login requirement. ASAP is applicable for researchers interested in basic questions related to B cell development and differentiation, as well as applied researchers who are interested in vaccine development and monoclonal antibody engineering. By virtue of its user-friendliness, ASAP opens the antibody analysis field to non-expert users who seek to boost their research with immune repertoire analysis.


Subjects
Computational Biology/methods , Immunoglobulins/genetics , DNA Sequence Analysis , Software , Web Browser , Amino Acid Sequence , Animals , High-Throughput Nucleotide Sequencing , Humans , Immunoglobulins/chemistry , V(D)J Recombination
20.
Genome Biol Evol ; 7(12): 3226-38, 2015 Nov 03.
Article in English | MEDLINE | ID: mdl-26537226

ABSTRACT

In this study, we present a novel methodology to infer indel parameters from multiple sequence alignments (MSAs) based on simulations. Our algorithm searches for the set of evolutionary parameters describing indel dynamics which best fits a given input MSA. In each step of the search, we use parametric bootstraps and the Mahalanobis distance to estimate how well a proposed set of parameters fits input data. Using simulations, we demonstrate that our methodology can accurately infer the indel parameters for a large variety of plausible settings. Moreover, using our methodology, we show that indel parameters substantially vary between three genomic data sets: Mammals, bacteria, and retroviruses. Finally, we demonstrate how our methodology can be used to simulate MSAs based on indel parameters inferred from real data sets.
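
The scoring step described in the abstract above (parametric bootstraps plus the Mahalanobis distance) can be sketched as follows. This is a toy illustration under stated assumptions: the simulator, the choice of two summary statistics (indel count and mean indel length) and all names are hypothetical and are not the authors' implementation.

```python
import math
import random

def simulate_stats(indel_rate, mean_length, n_sites, rng):
    """Toy simulator (assumed): returns (number of indels, mean indel length)."""
    n_indels = sum(rng.random() < indel_rate for _ in range(n_sites))
    p = 1.0 / mean_length  # geometric lengths on 1, 2, ... with the requested mean
    lengths = [1 + int(math.log(1.0 - rng.random()) / math.log(1.0 - p))
               for _ in range(max(n_indels, 1))]
    return (float(n_indels), sum(lengths) / len(lengths))

def mahalanobis_2d(point, samples):
    """Mahalanobis distance of `point` from the cloud of 2-D `samples`."""
    n = len(samples)
    mx = sum(s[0] for s in samples) / n
    my = sum(s[1] for s in samples) / n
    sxx = sum((s[0] - mx) ** 2 for s in samples) / (n - 1)
    syy = sum((s[1] - my) ** 2 for s in samples) / (n - 1)
    sxy = sum((s[0] - mx) * (s[1] - my) for s in samples) / (n - 1)
    det = sxx * syy - sxy * sxy
    dx, dy = point[0] - mx, point[1] - my
    # d^2 = [dx dy] * inverse(covariance) * [dx dy]^T, written out for 2x2.
    d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d2)

def fit(params, observed, n_boot, rng):
    """Score a proposed parameter set: a smaller distance means a better fit."""
    sims = [simulate_stats(*params, n_sites=1000, rng=rng) for _ in range(n_boot)]
    return mahalanobis_2d(observed, sims)

rng = random.Random(1)
observed = simulate_stats(0.05, 3.0, 1000, rng)  # stands in for statistics of a real MSA
good = fit((0.05, 3.0), observed, 200, rng)      # parameters that generated the data
bad = fit((0.10, 6.0), observed, 200, rng)       # deliberately wrong parameters
print(f"distance at generating parameters: {good:.2f}, at wrong parameters: {bad:.2f}")
```

A search procedure would call `fit` repeatedly and move toward parameter sets with smaller distances. The Mahalanobis distance is useful here because it accounts for the different scales of the summary statistics and for their correlation.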


Subjects
INDEL Mutation , Sequence Alignment/methods , Software , Animals , Bacterial Genome , Viral Genome , Mammals