Results 1 - 20 of 38
1.
BMC Bioinformatics ; 19(1): 226, 2018 Jun 15.
Article in English | MEDLINE | ID: mdl-29902968

ABSTRACT

BACKGROUND: Third generation sequencing technologies generate long reads that exhibit high error rates, in particular for insertions and deletions, which are usually the most difficult errors to cope with. The only exact algorithm capable of aligning sequences with insertions and deletions is a dynamic programming algorithm. RESULTS: In this note, for the sake of efficiency, we consider dynamic programming in a band. We show how to choose the band width as a function of the long reads' error rates, thus obtaining an [Formula: see text] algorithm in space and time. We also propose a procedure to decide whether this algorithm, when applied to semi-global alignments, provides the optimal score. CONCLUSIONS: We suggest that dynamic programming in a band is well suited to the problem of aligning long reads with one another and can be used as a core component of methods for obtaining a consensus sequence from the long reads alone. The function implementing the dynamic programming algorithm in a band is available, as a standalone program, at: https://forgemia.inra.fr/jean-francois.gibrat/BAND_DYN_PROG.git.


Subjects
Algorithms , Computational Biology/methods , High-Throughput Nucleotide Sequencing/methods , Programming Languages , DNA Sequence Analysis/methods , Software , Human Genome , Humans
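
The idea of dynamic programming in a band, as described in the abstract above, can be illustrated with a minimal sketch. It is not the authors' program: it computes a global edit distance with unit costs (an assumption; the abstract discusses semi-global alignment), and the function name `banded_edit_distance` and the `band` parameter are hypothetical. Only cells with |i - j| <= band are filled, so time and memory grow with the band width rather than with the full matrix.

```python
from typing import Optional


def banded_edit_distance(a: str, b: str, band: int) -> Optional[int]:
    """Edit distance computed only over cells with |i - j| <= band.

    Returns None when the final cell lies outside the band. The value
    returned is an upper bound on the true edit distance; it is exact
    whenever an optimal alignment path stays inside the band.
    """
    n, m = len(a), len(b)
    if abs(n - m) > band:
        return None
    inf = float("inf")
    # Row 0: reaching column j costs j insertions; keep in-band columns only.
    prev = {j: j for j in range(0, min(m, band) + 1)}
    for i in range(1, n + 1):
        lo, hi = max(0, i - band), min(m, i + band)
        cur = {}
        for j in range(lo, hi + 1):
            if j == 0:
                cur[j] = i  # column 0: i deletions
                continue
            # Cells outside the band are treated as unreachable (infinite cost).
            substitute = prev.get(j - 1, inf) + (a[i - 1] != b[j - 1])
            delete = prev.get(j, inf) + 1
            insert = cur.get(j - 1, inf) + 1
            cur[j] = min(substitute, delete, insert)
        prev = cur
    return prev[m]


print(banded_edit_distance("kitten", "sitting", 2))
```

Storing each row as a dictionary keyed by column keeps the sketch short; a production version would more likely use fixed-width arrays offset by the band position.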
2.
Bioinformatics ; 32(7): 1083-4, 2016 Apr 01.
Article in English | MEDLINE | ID: mdl-26607491

ABSTRACT

MOTIVATION: High-throughput sequencing technologies provide access to an increasing number of bacterial genomes. Today, many analyses involve the comparison of biological properties among many strains of a given species, or among species of a particular genus. Tools that can help the microbiologist with these tasks become increasingly important. RESULTS: Insyght is a comparative visualization tool whose core features combine a synchronized navigation across genomic data of multiple organisms with a versatile interoperability between complementary views. In this work, we have greatly increased the scope of the Insyght public dataset by including 2688 complete bacterial genomes available in Ensembl thus vastly improving its phylogenetic coverage. We also report the development of a virtual machine that allows users to easily set up and customize their own local Insyght server. AVAILABILITY AND IMPLEMENTATION: http://genome.jouy.inra.fr/Insyght CONTACT: Thomas.Lacroix@jouy.inra.fr.


Subjects
Computer Graphics , Bacterial Genome , Phylogeny , Genomics , High-Throughput Nucleotide Sequencing , Internet , Software
3.
Nucleic Acids Res ; 42(21), 2014 Dec 01.
Article in English | MEDLINE | ID: mdl-25249626

ABSTRACT

High-throughput techniques have considerably increased the potential of comparative genomics whilst simultaneously posing many new challenges. One of those challenges involves efficiently mining the large amount of data produced and exploring the landscape of both conserved and idiosyncratic genomic regions across multiple genomes. Domains of application of these analyses are diverse: identification of evolutionary events, inference of gene functions, detection of niche-specific genes or phylogenetic profiling. Insyght is a comparative genomic visualization tool that combines three complementary displays: (i) a table for thoroughly browsing amongst homologues, (ii) a comparator of orthologue functional annotations and (iii) a genomic organization view designed to improve the legibility of rearrangements and distinctive loci. The latter display combines symbolic and proportional graphical paradigms. Synchronized navigation across multiple species and interoperability between the views are core features of Insyght. A gene filter mechanism is provided that helps the user to build a biologically relevant gene set according to multiple criteria such as presence/absence of homologues and/or various annotations. We illustrate the use of Insyght with scenarios. Currently, only Bacteria and Archaea are supported. A public instance is available at http://genome.jouy.inra.fr/Insyght. The tool is freely downloadable for private data set analysis.


Subjects
Data Mining/methods , Bacterial Genes , Genomics/methods , Molecular Sequence Annotation , Synteny , Computer Graphics , Archaeal Genes , Nucleic Acid Sequence Homology , Software
4.
Gut ; 63(10): 1566-77, 2014 Oct.
Article in English | MEDLINE | ID: mdl-24436141

ABSTRACT

OBJECTIVE: No Crohn's disease (CD) molecular marker has advanced to clinical use, and independent lines of evidence support a central role of the gut microbial community in CD. Here we explore the feasibility of extracting bacterial protein signals relevant to CD, by interrogating myriads of intestinal bacterial proteomes from a small number of patients and healthy controls. DESIGN: We first developed and validated a workflow, including extraction of microbial communities, two-dimensional difference gel electrophoresis (2D-DIGE) and LC-MS/MS, to discover protein signals from CD-associated gut microbial communities. Then we used selected reaction monitoring (SRM) to confirm a set of candidates. In parallel, we used 16S rRNA gene sequencing for an integrated analysis of gut ecosystem structure and functions. RESULTS: Our 2D-DIGE-based discovery approach revealed an imbalance of intestinal bacterial functions in CD. Many proteins, largely derived from Bacteroides species, were over-represented, while under-represented proteins were mostly from Firmicutes and some Prevotella members. Most overabundant proteins could be confirmed using SRM. They correspond to functions allowing opportunistic pathogens to colonise the mucus layers, breach the host barriers and invade the mucosae, which could be further aggravated by decreased host-derived pancreatic zymogen granule membrane protein GP2 in CD patients. Moreover, although the abundance of most protein groups reflected that of related bacterial populations, we found a specific independent regulation of bacteria-derived cell envelope proteins. CONCLUSIONS: This study provides the first evidence that quantifiable bacterial protein signals are associated with CD, which can have a profound impact on future molecular diagnosis.


Subjects
Bacterial Proteins/metabolism , Biomarkers/metabolism , Crohn Disease/microbiology , Intestines/microbiology , Adult , Bacteria/genetics , Bacteria/isolation & purification , Liquid Chromatography , Cross-Sectional Studies , Two-Dimensional Gel Electrophoresis , Female , Humans , Male , 16S Ribosomal RNA/genetics , Protein Sequence Analysis , Tandem Mass Spectrometry
5.
BMC Evol Biol ; 13: 154, 2013 Jul 17.
Article in English | MEDLINE | ID: mdl-23865988

ABSTRACT

BACKGROUND: Birnaviruses form a distinct family of double-stranded RNA viruses infecting animals as different as vertebrates, mollusks, insects and rotifers. With such a wide host range, they constitute a good model for studying adaptation to the host. Additionally, several lines of evidence link birnaviruses to positive strand RNA viruses and suggest that phylogenetic analyses may provide clues about this transition. RESULTS: We characterized the genome of a birnavirus from the rotifer Brachionus plicatilis. We used X-ray structures of RNA-dependent RNA polymerases and capsid proteins to build multiple structure alignments, from which we derived reliable multiple sequence alignments, and we employed "advanced" phylogenetic methods to study the evolutionary relationships between some positive strand and double-stranded RNA viruses. We showed that the rotifer birnavirus genome exhibited an organization remarkably similar to other birnaviruses. As this host was phylogenetically very distant from the other known species targeted by birnaviruses, we revisited the evolutionary pathways within the Birnaviridae family using phylogenetic reconstruction methods. We also applied a number of phylogenetic approaches based on structurally conserved domains/regions of the capsid and RNA-dependent RNA polymerase proteins to study the evolutionary relationships between birnaviruses, other double-stranded RNA viruses and positive strand RNA viruses. CONCLUSIONS: We show that there is a good correlation between the phylogeny of the birnaviruses and that of their hosts at the phylum level using the RNA-dependent RNA polymerase (genomic segment B) on the one hand and a concatenation of the capsid protein, protease and ribonucleoprotein (genomic segment A) on the other hand. This correlation tends to vanish within phyla. The use of advanced phylogenetic methods and robust structure-based multiple sequence alignments allowed us to obtain a more accurate picture (in terms of probability of the tree topologies) of the evolutionary affinities between double-stranded RNA and positive strand RNA viruses. In particular, we were able to show that there is good statistical support for the claims that dsRNA viruses are not monophyletic and that viruses with permuted RdRps belong to a common evolutionary lineage, as previously proposed by other groups. We also propose a tree topology with good statistical support describing the evolutionary relationships between the Picornaviridae, Caliciviridae, Flaviviridae families and a group including the Alphatetraviridae, Nodaviridae, Permutotetraviridae, Birnaviridae, and Cystoviridae families.


Subjects
Molecular Evolution , RNA Viruses/genetics , Rotifera/virology , Amino Acid Sequence , Animals , Viral Genome , Host Specificity , Phylogeny , RNA Viruses/classification , RNA Viruses/physiology , RNA Viruses/radiation effects , Double-Stranded RNA/genetics , Rotifera/classification , Sequence Alignment , Viral Proteins/chemistry , Viral Proteins/genetics
6.
Bioinformatics ; 28(7): 1040-1, 2012 Apr 01.
Article in English | MEDLINE | ID: mdl-22345617

ABSTRACT

SUMMARY: The DOMIRE web server implements a novel, automatic, protein structural domain assignment procedure based on 3D substructures of the query protein which are also found within structures of a non-redundant protein database. These common 3D substructures are transformed into a co-occurrence matrix that offers a global view of the protein domain organization. Three different algorithms are employed to define structural domain boundaries from this co-occurrence matrix. For each query, a list of structural neighbors and their alignments are provided. DOMIRE, by displaying the protein structural domain organization, can be a useful tool for defining protein common cores and for unravelling the evolutionary relationship between different proteins. AVAILABILITY: http://genome.jouy.inra.fr/domire CONTACT: jean.garnier@jouy.inra.fr.


Subjects
Internet , Protein Tertiary Structure , Proteins/chemistry , Software , Algorithms , Protein Databases , Sequence Alignment
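
The co-occurrence matrix mentioned in the DOMIRE abstract above can be pictured with a small sketch. This is an illustration under stated assumptions, not the server's code: each common 3D substructure is assumed to be given as a collection of 0-based residue indices of the query protein, and the function name `cooccurrence_matrix` is hypothetical. Entry (i, j) counts how many substructures contain both residue i and residue j, so residues of the same domain are expected to share high counts.

```python
def cooccurrence_matrix(n_residues, substructures):
    """Count, for every residue pair, the substructures containing both.

    substructures: iterable of iterables of 0-based residue indices
    (an assumed input format, chosen for illustration).
    """
    matrix = [[0] * n_residues for _ in range(n_residues)]
    for substructure in substructures:
        residues = sorted(set(substructure))  # ignore repeated indices
        for i in residues:
            for j in residues:
                matrix[i][j] += 1
    return matrix


# Toy query of 4 residues covered by three substructures.
toy = cooccurrence_matrix(4, [range(0, 2), range(0, 3), [3]])
for row in toy:
    print(row)
```

The abstract states that three different algorithms derive domain boundaries from this matrix; they are not specified there and are not reproduced here.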
7.
J Bacteriol ; 194(3): 738-9, 2012 Feb.
Article in English | MEDLINE | ID: mdl-22247534

ABSTRACT

Corynebacterium casei is one of the most prevalent species present on the surfaces of smear-ripened cheeses, where it contributes to the production of the desired organoleptic properties. Here, we report the draft genome sequence of Corynebacterium casei UCMA 3821 to provide insights into its physiology.


Subjects
Cheese/microbiology , Corynebacterium/genetics , Bacterial Genome , Base Sequence , Corynebacterium/isolation & purification , Molecular Sequence Data
8.
J Bacteriol ; 194(18): 5141-2, 2012 Sep.
Article in English | MEDLINE | ID: mdl-22933766

ABSTRACT

Staphylococcus equorum subsp. equorum is a member of the coagulase-negative staphylococcus group and is frequently isolated from fermented food products and from food-processing environments. It contributes to the formation of aroma compounds during the ripening of fermented foods, especially cheeses and sausages. Here, we report the draft genome sequence of Staphylococcus equorum subsp. equorum Mu2 to provide insights into its physiology and compare it with other Staphylococcus species.


Subjects
Bacterial DNA/chemistry , Bacterial DNA/genetics , Bacterial Genome , DNA Sequence Analysis , Staphylococcus/genetics , Cheese/microbiology , Molecular Sequence Data , Staphylococcus/isolation & purification
9.
J Bacteriol ; 194(9): 2385-6, 2012 May.
Article in English | MEDLINE | ID: mdl-22493197

ABSTRACT

Salmonella enterica subsp. enterica serotype Senftenberg is an emerging serotype in poultry production which has been found to persist in animals and the farm environment. We report the genome sequence and annotation of the SS209 strain of S. Senftenberg, isolated from a hatchery, which was identified as persistent in broiler chickens.


Subjects
Bacterial Genome , Salmonella enterica/classification , Salmonella enterica/genetics , Bacterial Chromosomes , Bacterial DNA/genetics , Bacterial Gene Expression Regulation , Molecular Sequence Data
10.
J Bacteriol ; 194(9): 2387-8, 2012 May.
Article in English | MEDLINE | ID: mdl-22493198

ABSTRACT

Salmonella enterica subsp. enterica serotype Enteritidis is one of the major causes of gastroenteritis in humans due to consumption of poultry derivatives. Here we report the whole-genome sequence and annotation, including the virulence plasmid, of S. Enteritidis LA5, which is a chicken isolate used by numerous laboratories in virulence studies.


Subjects
Bacterial Genome , Salmonella enterica/classification , Salmonella enterica/genetics , Bacterial Chromosomes , Bacterial DNA/genetics , Bacterial Gene Expression Regulation , Molecular Sequence Data
11.
J Bacteriol ; 193(19): 5581-2, 2011 Oct.
Article in English | MEDLINE | ID: mdl-21914889

ABSTRACT

Streptococcus thermophilus is a dairy species commonly used in the manufacture of cheese and yogurt. Here, we report the complete sequence of S. thermophilus strain JIM8232, which was isolated from milk and produces a yellow pigment, an atypical trait for this bacterium.


Subjects
Bacterial Genome/genetics , Streptococcus thermophilus/genetics , Animals , Coloring Agents , Milk/microbiology , Molecular Sequence Data
12.
J Bacteriol ; 193(18): 5041-2, 2011 Sep.
Article in English | MEDLINE | ID: mdl-21742894

ABSTRACT

Streptococcus salivarius is a commensal species commonly found in the human oral cavity and digestive tract, although it is also associated with human infections such as meningitis, endocarditis, and bacteremia. Here, we report the complete sequence of S. salivarius strain CCHSS3, isolated from human blood.


Subjects
Bacterial DNA/chemistry , Bacterial DNA/genetics , Bacterial Genome , DNA Sequence Analysis , Streptococcus/genetics , Blood/microbiology , Humans , Molecular Sequence Data , Sepsis/microbiology , Streptococcal Infections/microbiology , Streptococcus/isolation & purification
13.
J Bacteriol ; 193(18): 5024-5, 2011 Sep.
Article in English | MEDLINE | ID: mdl-21742871

ABSTRACT

The commensal bacterium Streptococcus salivarius is a prevalent species of the human oropharyngeal tract with an important role in oral ecology. Here, we report the complete 2.2-Mb genome sequence and annotation of strain JIM8777, which was recently isolated from the oral cavity of a healthy, dentate infant.


Subjects
Bacterial DNA/chemistry , Bacterial DNA/genetics , Bacterial Genome , DNA Sequence Analysis , Streptococcus/genetics , Humans , Infant , Molecular Sequence Data , Mouth/microbiology , Streptococcus/isolation & purification
14.
Proteins ; 79(3): 853-66, 2011 Mar.
Article in English | MEDLINE | ID: mdl-21287617

ABSTRACT

Domains are basic units of protein structure and essential for exploring protein fold space and structure evolution. With the structural genomics initiative, the number of protein structures in the Protein Databank (PDB) is increasing dramatically and domain assignments need to be done automatically. Most existing structural domain assignment programs define domains using the compactness of the domains and/or the number and strength of intra-domain versus inter-domain contacts. Here we present a different approach based on the recurrence of locally similar structural pieces (LSSPs) found by one-against-all structure comparisons with a dataset of 6373 protein chains from the PDB. Residues of the query protein are clustered using LSSPs via three different procedures to define domains. This approach gives results that are comparable to several existing programs that use geometrical and other structural information explicitly. Remarkably, most of the proteins that contribute the LSSPs defining a domain do not themselves contain the domain of interest. This study shows that domains can be defined by a collection of relatively small locally similar structural pieces containing, on average, four secondary structure elements. In addition, it indicates that domains are indeed made of recurrent small structural pieces that are used to build protein structures of many different folds as suggested by recent studies.


Subjects
Proteins/chemistry , Protein Conformation
15.
J Bacteriol ; 192(10): 2647-8, 2010 May.
Article in English | MEDLINE | ID: mdl-20348264

ABSTRACT

The entire genome of Lactobacillus casei BL23, a strain with probiotic properties, has been sequenced. The genomes of BL23 and the industrially used probiotic strain Shirota YIT 9029 (Yakult) seem to be very similar.


Subjects
Bacterial Genome/genetics , Lacticaseibacillus casei/genetics , Molecular Sequence Data
16.
BMC Bioinformatics ; 11: 4, 2010 Jan 04.
Article in English | MEDLINE | ID: mdl-20047649

ABSTRACT

BACKGROUND: Sequence comparisons make use of a one-letter representation for amino acids, the necessary quantitative information being supplied by the substitution matrices. This paper deals with the problem of finding a representation that provides a comprehensive description of amino acid intrinsic properties consistent with the substitution matrices. RESULTS: We present a Euclidean vector representation of the amino acids, obtained by the singular value decomposition of the substitution matrices. The substitution matrix entries correspond to the dot product of amino acid vectors. We apply this vector encoding to the study of the relative importance of various amino acid physicochemical properties upon the substitution matrices. We also characterize and compare the PAM and BLOSUM series substitution matrices. CONCLUSIONS: This vector encoding introduces a Euclidean metric in the amino acid space, consistent with substitution matrices. Such a numerical description of the amino acids is useful when their intrinsic properties are needed, for instance when building sequence profiles or finding consensus sequences with machine learning algorithms such as support vector machines and neural networks.


Subjects
Amino Acid Substitution , Amino Acids/chemistry , Computational Biology/methods , Amino Acid Sequence , Protein Databases , Sequence Alignment , Protein Sequence Analysis/methods
17.
Nat Biotechnol ; 25(7): 763-9, 2007 Jul.
Article in English | MEDLINE | ID: mdl-17592475

ABSTRACT

We report here the complete genome sequence of the virulent strain JIP02/86 (ATCC 49511) of Flavobacterium psychrophilum, a widely distributed pathogen of wild and cultured salmonid fish. The genome consists of a 2,861,988-base pair (bp) circular chromosome with 2,432 predicted protein-coding genes. Among these predicted proteins, stress response mediators, gliding motility proteins, adhesins and many putative secreted proteases are probably involved in colonization, invasion and destruction of the host tissues. The genome sequence provides the basis for explaining the relationships of the pathogen to the host and opens new perspectives for the development of more efficient disease control strategies. It also allows for a better understanding of the physiology and evolution of a significant representative of the family Flavobacteriaceae, whose members are associated with an interesting diversity of lifestyles and habitats.


Subjects
Biotechnology/methods , Fishes/microbiology , Flavobacterium/metabolism , Bacterial Genome , Animals , Biofilms , Cell Adhesion , Cell Membrane/metabolism , Flavobacteriaceae Infections/metabolism , Genome , Biological Models , Open Reading Frames , Parasites
18.
Appl Environ Microbiol ; 75(19): 6406-9, 2009 Oct.
Article in English | MEDLINE | ID: mdl-19648361

ABSTRACT

Multilocus sequence typing with nine selected genes is shown to be a promising new tool for accurate identifications of Brevibacteriaceae at the species level. A developed microarray also allows intraspecific diversity investigations of Brevibacterium aurantiacum showing that 13% to 15% of the genes of strain ATCC 9174 were absent or divergent in strain BL2 or ATCC 9175.


Subjects
Bacterial Typing Techniques/methods , Brevibacterium/classification , Brevibacterium/isolation & purification , Comparative Genomic Hybridization/methods , DNA Fingerprinting/methods , DNA Sequence Analysis/methods , Base Sequence , Brevibacterium/genetics , Cluster Analysis , Genetic Variation , Genotype , Molecular Sequence Data , Oligonucleotide Array Sequence Analysis/methods , Phylogeny , Sequence Alignment
19.
Biophys Physicobiol ; 16: 444-451, 2019.
Article in English | MEDLINE | ID: mdl-31984196

ABSTRACT

This paper presents a preliminary work consisting of two contributions. The first one is the design of a very efficient algorithm based on an "Overlap-Layout-Consensus" (OLC) graph to assemble the long reads provided by 3rd generation technologies. The second concerns the analysis of this graph using algebraic topology concepts to determine, in advance, whether the assembly of the genome will be straightforward, i.e., whether it will lead to a pseudo-Hamiltonian path or cycle, or whether the results will need to be scrutinized. In the latter case, it will be necessary to look for "loops" in the OLC assembly graph caused by unresolved repeated genomic regions, and then try to untie the "knots" created by these regions.
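
To make the "Overlap-Layout-Consensus" idea concrete, here is a toy sketch of the overlap and layout steps. It is not the efficient algorithm the abstract refers to: it assumes error-free reads and exact suffix-prefix overlaps, it compares all read pairs naively, and the names `overlap`, `overlap_graph` and `greedy_layout` are hypothetical. The layout step follows a single path that visits every read once, and returns None when the graph does not offer such a straightforward path.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a that is a prefix of b (>= min_len), else 0."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def overlap_graph(reads, min_len):
    """Directed graph as a dict: (i, j) -> overlap length between reads i and j."""
    graph = {}
    for i, a in enumerate(reads):
        for j, b in enumerate(reads):
            if i != j:
                k = overlap(a, b, min_len)
                if k:
                    graph[(i, j)] = k
    return graph


def greedy_layout(reads, min_len):
    """Follow the best overlap from the unique read with no incoming edge.

    Returns the merged sequence, or None when no simple path through
    all reads is found (for example because of repeats or gaps).
    """
    graph = overlap_graph(reads, min_len)
    targets = {j for (_, j) in graph}
    starts = [i for i in range(len(reads)) if i not in targets]
    if len(starts) != 1:
        return None
    path, seen = [starts[0]], {starts[0]}
    contig = reads[starts[0]]
    while len(path) < len(reads):
        current = path[-1]
        candidates = [(k, j) for (i, j), k in graph.items()
                      if i == current and j not in seen]
        if not candidates:
            return None
        k, j = max(candidates)  # longest overlap first
        path.append(j)
        seen.add(j)
        contig += reads[j][k:]  # append only the non-overlapping tail
    return contig


print(greedy_layout(["AGGCTA", "ACGTTGC", "TGCAAGG"], min_len=3))
```

Real long reads carry many insertion and deletion errors, so overlaps would have to be detected by approximate alignment rather than exact string matching.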

20.
BMC Bioinformatics ; 9: 6, 2008 Jan 07.
Article in English | MEDLINE | ID: mdl-18179702

ABSTRACT

BACKGROUND: Recent approaches for predicting the three-dimensional (3D) structure of proteins, such as de novo or fold recognition methods, mostly rely on simplified energy potential functions and a reduced representation of the polypeptide chain. These simplifications facilitate the exploration of the protein conformational space but do not fully capture the subtle relationship that exists between the amino acid sequence and its native structure. It has been proposed that physics-based energy functions together with techniques for sampling the conformational space, e.g., Monte Carlo or molecular dynamics (MD) simulations, are better suited to the task of modelling proteins at higher resolutions than those of models obtained with the former type of methods. In this study we monitor different protein structural properties along MD trajectories to discriminate correct from erroneous models. These models are based on the sequence-structure alignments provided by our fold recognition method, FROST. We define correct models as being built from alignments of sequences with structures similar to their native structures and erroneous models from alignments of sequences with structures unrelated to their native structures. RESULTS: For three test sequences whose native structures belong to the all-alpha, all-beta and alpha/beta classes we built a set of models intended to cover the whole spectrum: from a perfect model, i.e., the native structure, to a very poor model, i.e., a random alignment of the test sequence with a structure belonging to another structural class, including several intermediate models based on fold recognition alignments. We submitted these models to 11 ns of MD simulations at three different temperatures. We monitored along the corresponding trajectories the mean of the Root-Mean-Square deviations (RMSd) with respect to the initial conformation, the RMSd fluctuations, the number of conformation clusters, the evolution of secondary structures and the surface area of residues. None of these criteria alone is 100% efficient in discriminating correct from erroneous models. The mean RMSd, RMSd fluctuations, secondary structure and clustering of conformations show some false positives whereas the residue surface area criterion shows false negatives. However, if we consider these criteria in combination, it is straightforward to discriminate the two types of models. CONCLUSION: The ability to discriminate correct from erroneous models allows us to improve the specificity and sensitivity of our fold recognition method for a number of ambiguous cases.


Subjects
Chemical Models , Molecular Models , Proteins/chemistry , Proteins/ultrastructure , Sequence Alignment/methods , Protein Sequence Analysis/methods , Amino Acid Sequence , Binding Sites , Computer Simulation , Kinetics , Molecular Sequence Data , Protein Binding , Protein Conformation , Protein Folding , Reproducibility of Results , Sensitivity and Specificity
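
Two of the criteria listed in the abstract above, the mean RMSd with respect to the initial conformation and the RMSd fluctuations, can be sketched as follows. This is an illustration only: it assumes every frame has already been superposed onto the first one (no fitting is performed), takes the fluctuation to be the population standard deviation of the RMSd values, and uses the hypothetical names `rmsd` and `rmsd_profile`.

```python
import math
from statistics import mean, pstdev


def rmsd(reference, frame):
    """Root-mean-square deviation between two equally sized lists of (x, y, z)."""
    squared = sum((a - b) ** 2
                  for p, q in zip(reference, frame)
                  for a, b in zip(p, q))
    return math.sqrt(squared / len(reference))


def rmsd_profile(trajectory):
    """Mean and fluctuation of the RMSd of each frame versus the first frame."""
    reference = trajectory[0]
    values = [rmsd(reference, frame) for frame in trajectory[1:]]
    return mean(values), pstdev(values)


# Toy trajectory: two atoms, initial frame plus two displaced frames.
trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)],  # every atom moved by 1
    [(0.0, 3.0, 0.0), (1.0, 3.0, 0.0)],  # every atom moved by 3
]
print(rmsd_profile(trajectory))
```

As the abstract notes, neither value is sufficient on its own; they are meant to be combined with the other monitored criteria.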