1.
Viruses ; 15(2)2023 02 13.
Article in English | MEDLINE | ID: mdl-36851733

ABSTRACT

Profile hidden Markov models (HMMs) are a powerful way of modeling biological sequence diversity and constitute a very sensitive approach to detecting divergent sequences. Here, we report the development of protocols for the rational design of profile HMMs. These methods were implemented on TABAJARA, a program that can be used to either detect all biological sequences of a group or discriminate specific groups of sequences. By calculating position-specific information scores along a multiple sequence alignment, TABAJARA automatically identifies the most informative sequence motifs and uses them to construct profile HMMs. As a proof-of-principle, we applied TABAJARA to generate profile HMMs for the detection and classification of two viral groups presenting different evolutionary rates: bacteriophages of the Microviridae family and viruses of the Flavivirus genus. We obtained conserved models for the generic detection of any Microviridae or Flavivirus sequence, and profile HMMs that can specifically discriminate Microviridae subfamilies or Flavivirus species. In another application, we constructed Cas1 endonuclease-derived profile HMMs that can discriminate CRISPRs and casposons, two evolutionarily related transposable elements. We believe that the protocols described here, and implemented on TABAJARA, constitute a generic toolbox for generating profile HMMs for the highly sensitive and specific detection of sequence classes.


Subject(s)
Bacteriophages , Microviridae , Bacteriophages/genetics , Biodiversity , Biological Evolution , Clustered Regularly Interspaced Short Palindromic Repeats , Markov Chains
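
The abstract of record 1 mentions position-specific information scores computed along a multiple sequence alignment. The sketch below illustrates one common way to obtain such scores (Shannon information content per column, scaled by column occupancy) and to pick out contiguous high-scoring regions. It is only an illustration under stated assumptions: the function names, the occupancy scaling, the window size, and the threshold are hypothetical and are not taken from TABAJARA.

```python
import math
from collections import Counter


def column_information(alignment, alphabet_size=20):
    """Per-column information content (bits) of an equal-length alignment.

    Assumption: information = log2(alphabet_size) - Shannon entropy,
    scaled by the fraction of non-gap residues in the column.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    max_entropy = math.log2(alphabet_size)
    scores = []
    for col in range(length):
        residues = [seq[col] for seq in alignment if seq[col] != "-"]
        if not residues:
            scores.append(0.0)  # all-gap column carries no information
            continue
        total = len(residues)
        counts = Counter(residues)
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        occupancy = total / len(alignment)
        scores.append((max_entropy - entropy) * occupancy)
    return scores


def informative_regions(scores, window, min_mean):
    """Merge overlapping windows whose mean score is at least min_mean.

    Returns half-open (start, end) column intervals.
    """
    regions = []
    for start in range(len(scores) - window + 1):
        mean = sum(scores[start:start + window]) / window
        if mean >= min_mean:
            end = start + window
            if regions and start <= regions[-1][1]:
                regions[-1] = (regions[-1][0], end)  # extend previous region
            else:
                regions.append((start, end))
    return regions


if __name__ == "__main__":
    toy_alignment = ["ACDEF", "ACDKL", "ACDGH", "ACDWY"]  # hypothetical protein MSA
    toy_scores = column_information(toy_alignment)
    print(informative_regions(toy_scores, window=3, min_mean=4.0))
```

In a real workflow the selected column intervals would be cut out of the alignment and passed to a profile HMM builder; that step is omitted here.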
2.
Viruses ; 13(1)2020 12 23.
Article in English | MEDLINE | ID: mdl-33374584

ABSTRACT

Hematophagous insects act as the major reservoirs of infectious agents due to their intimate contact with a large variety of vertebrate hosts. Lutzomyia longipalpis is the main vector of Leishmania chagasi in the New World, but its role as a host of viruses is poorly understood. In this work, Lu. longipalpis RNA libraries were subjected to progressive assembly using viral profile HMMs as seeds. A sequence phylogenetically related to fungal viruses of the genus Mitovirus was identified and this novel virus was named Lul-MV-1. The 2697-base genome presents a single gene coding for an RNA-directed RNA polymerase with an organellar genetic code. To determine the possible host of Lul-MV-1, we analyzed the molecular characteristics of the viral genome. Dinucleotide composition and codon usage showed profiles similar to mitochondrial DNA of invertebrate hosts. Also, the virus-derived small RNA profile was consistent with the activation of the siRNA pathway, with size distribution and 5' base enrichment analogous to those observed in viruses of sand flies, reinforcing Lu. longipalpis as a putative host. Finally, RT-PCR of different insect pools and sequences of public Lu. longipalpis RNA libraries confirmed the high prevalence of Lul-MV-1. This is the first report of a mitovirus infecting an insect host.


Subject(s)
Genome, Viral , Host Microbial Interactions , Orthoreovirus/genetics , Psychodidae/classification , Psychodidae/virology , Animals , Codon , Codon Usage , Gene Amplification , Genomics/methods , High-Throughput Nucleotide Sequencing , Markov Chains , Phylogeny , Prevalence , RNA Interference , RNA Viruses/genetics , RNA, Small Interfering/genetics
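
Record 2 compares the dinucleotide composition of the viral genome with that of candidate hosts. A frequently used summary of dinucleotide composition is the odds ratio of each dinucleotide's observed frequency to the product of its component base frequencies. The sketch below computes that ratio for a single sequence; it is an assumption that a measure of this kind is relevant here, since the abstract does not state which statistic the authors used, and the function name is hypothetical.

```python
from collections import Counter


def dinucleotide_odds(seq):
    """Odds ratio f(XY) / (f(X) * f(Y)) for each dinucleotide observed in seq.

    Assumes an unambiguous nucleotide sequence (no N or gap characters).
    Values near 1 mean the dinucleotide occurs about as often as expected
    from base composition alone.
    """
    seq = seq.upper()
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two bases")
    base_counts = Counter(seq)
    pair_counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    n_bases = len(seq)
    n_pairs = len(seq) - 1
    odds = {}
    for pair, count in pair_counts.items():
        f_first = base_counts[pair[0]] / n_bases
        f_second = base_counts[pair[1]] / n_bases
        odds[pair] = (count / n_pairs) / (f_first * f_second)
    return odds


if __name__ == "__main__":
    for pair, ratio in sorted(dinucleotide_odds("ACACAC").items()):
        print(pair, round(ratio, 2))
```

Profiles computed this way for a virus and for several candidate host genomes could then be compared with any distance measure; that comparison is left out of the sketch.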
3.
RNA ; 26(5): 581-594, 2020 05.
Article in English | MEDLINE | ID: mdl-31996404

ABSTRACT

Endogenous viral elements (EVEs) are found in many eukaryotic genomes. Despite considerable knowledge about genomic elements such as transposons (TEs) and retroviruses, we still lack information about nonretroviral EVEs. Aedes aegypti mosquitoes have a highly repetitive genome that is covered with EVEs. Here, we identified 129 nonretroviral EVEs in the AaegL5 version of the A. aegypti genome. These EVEs were significantly associated with TEs and preferentially located in repeat-rich clusters within intergenic regions. Genome-wide transcriptome analysis showed that most EVEs generated transcripts, although only around 1.4% were sense RNAs. The majority of EVE transcription was antisense and correlated with the generation of EVE-derived small RNAs. A single genomic cluster of EVEs located in a 143 kb repetitive region in chromosome 2 contributed 42% of antisense transcription and 45% of small RNAs derived from viral elements. This region was enriched for TE-EVE hybrids organized in the same coding strand. These generated a single long antisense transcript that correlated with the generation of phased primary PIWI-interacting RNAs (piRNAs). The putative promoter of this region had a conserved binding site for the transcription factor Cubitus interruptus, a key regulator of the flamenco locus in Drosophila melanogaster. Here, we have identified a single unidirectional piRNA cluster in the A. aegypti genome that is the major source of EVE transcription fueling the generation of antisense small RNAs in mosquitoes. We propose that this region is a flamenco-like locus in A. aegypti due to its relatedness to the major unidirectional piRNA cluster in Drosophila melanogaster.


Subject(s)
Aedes/genetics , Genome, Insect/genetics , RNA, Small Interfering/genetics , Retroelements/genetics , Animals , Binding Sites/genetics , Cadherins/genetics , Culicidae/genetics , DNA-Binding Proteins/genetics , Drosophila Proteins/genetics , Drosophila melanogaster/genetics , Homeodomain Proteins/genetics , Promoter Regions, Genetic , Transcription Factors/genetics
4.
Viruses ; 10(5)2018 May 14.
Article in English | MEDLINE | ID: mdl-29757994

ABSTRACT

The Second Annual Meeting of the European Virus Bioinformatics Center (EVBC), held in Utrecht, Netherlands, focused on computational approaches in virology, with topics including (but not limited to) virus discovery, diagnostics, (meta-)genomics, modeling, epidemiology, molecular structure, evolution, and viral ecology. The goals of the Second Annual Meeting were threefold: (i) to bring together virologists and bioinformaticians from across the academic, industrial, professional, and training sectors to share best practice; (ii) to provide a meaningful and interactive scientific environment to promote discussion and collaboration between students, postdoctoral fellows, and both new and established investigators; (iii) to inspire and suggest new research directions and questions. Approximately 120 researchers from around the world attended the Second Annual Meeting of the EVBC this year, including 15 renowned international speakers. This report presents an overview of new developments and novel research findings that emerged during the meeting.


Subject(s)
Computational Biology , Virology , Congresses as Topic , DNA Viruses , Ecology , Genomics , Humans , Societies, Scientific , Software
5.
Genome Announc ; 4(6)2016 Nov 17.
Article in English | MEDLINE | ID: mdl-27856581

ABSTRACT

Herein, we report a draft genome sequence of the endophytic Curtobacterium sp. strain ER1/6, which was isolated from a surface-sterilized Citrus sinensis branch and has shown the capability to control phytopathogens. Functional annotation of the ~3.4-Mb genome revealed 3,100 protein-coding genes, with many products related to known ecological and biotechnological aspects of this bacterium.

6.
Front Microbiol ; 7: 269, 2016.
Article in English | MEDLINE | ID: mdl-26973638

ABSTRACT

This work reports the development of GenSeed-HMM, a program that implements seed-driven progressive assembly, an approach to reconstruct specific sequences from unassembled data, starting from short nucleotide or protein seed sequences or profile Hidden Markov Models (HMM). The program can use any one of a number of sequence assemblers. Assembly is performed in multiple steps and relatively few reads are used in each cycle, consequently the program demands low computational resources. As a proof-of-concept and to demonstrate the power of HMM-driven progressive assemblies, GenSeed-HMM was applied to metagenomic datasets in the search for diverse ssDNA bacteriophages from the recently described Alpavirinae subfamily. Profile HMMs were built using Alpavirinae-specific regions from multiple sequence alignments (MSA) using either the viral protein 1 (VP1; major capsid protein) or VP4 (genome replication initiation protein). These profile HMMs were used by GenSeed-HMM (running Newbler assembler) as seeds to reconstruct viral genomes from sequencing datasets of human fecal samples. All contigs obtained were annotated and taxonomically classified using similarity searches and phylogenetic analyses. The most specific profile HMM seed enabled the reconstruction of 45 partial or complete Alpavirinae genomic sequences. A comparison with conventional (global) assembly of the same original dataset, using Newbler in a standalone execution, revealed that GenSeed-HMM outperformed global genomic assembly in several metrics employed. This approach is capable of detecting organisms that have not been used in the construction of the profile HMM, which opens up the possibility of diagnosing novel viruses, without previous specific information, constituting a de novo diagnosis. Additional applications include, but are not limited to, the specific assembly of extrachromosomal elements such as plastid and mitochondrial genomes from metagenomic data. 
Profile HMM seeds can also be used to reconstruct specific protein coding genes for gene diversity studies, and to determine all possible gene variants present in a metagenomic sample. Such surveys could be useful to detect the emergence of drug-resistance variants in sensitive environments such as hospitals and animal production facilities, where antibiotics are regularly used. Finally, GenSeed-HMM can be used as an adjunct for gap closure on assembly finishing projects, by using multiple contig ends as anchored seeds.

7.
Genome Biol Evol ; 8(1): 94-108, 2015 Nov 27.
Article in English | MEDLINE | ID: mdl-26615220

ABSTRACT

The alphabaculovirus Anticarsia gemmatalis multiple nucleopolyhedrovirus (AgMNPV) is the world's most successful viral bioinsecticide. Through the 1980s and 1990s, this virus was extensively used for biological control of populations of Anticarsia gemmatalis (Velvetbean caterpillar) in soybean crops. During this period, genetic studies identified several variable loci in the AgMNPV; however, most of them were not characterized at the sequence level. In this study we report a full genome comparison among 17 wild-type isolates of AgMNPV. We found the pangenome of this virus to contain at least 167 hypothetical genes, 151 of which are shared by all genomes. The gene bro-a that might be involved in host specificity and carrying transporter is absent in some genomes, and new hypothetical genes were observed. Among these genes there is a unique rnf12-like gene, probably implicated in ubiquitination. Events of gene fission and fusion are common, as four genes have been observed as single or split open reading frames. Gains and losses of genomic fragments (from 20 to 900 bp) are observed within tandem repeats, such as in eight direct repeats and four homologous regions. Most AgMNPV genes present low nucleotide diversity, and variable genes are mainly located in a locus known to evolve through homologous recombination. The evolution of AgMNPV is mainly driven by small indels, substitutions, gain and loss of nucleotide stretches or entire coding sequences. These variations may cause relevant phenotypic alterations, which probably affect the infectivity of AgMNPV. This work provides novel information on genomic evolution of the AgMNPV in particular and of baculoviruses in general.


Subject(s)
Baculoviridae/genetics , Genome, Viral , Lepidoptera/virology , Animals , Base Sequence , Genomic Instability , Molecular Sequence Data , Open Reading Frames , Polymorphism, Genetic , Recombination, Genetic , Ubiquitins/genetics , Viral Proteins/genetics
8.
Genome Res ; 24(10): 1676-85, 2014 Oct.
Article in English | MEDLINE | ID: mdl-25015382

ABSTRACT

Global production of chickens has trebled in the past two decades and they are now the most important source of dietary animal protein worldwide. Chickens are subject to many infectious diseases that reduce their performance and productivity. Coccidiosis, caused by apicomplexan protozoa of the genus Eimeria, is one of the most important poultry diseases. Understanding the biology of Eimeria parasites underpins development of new drugs and vaccines needed to improve global food security. We have produced annotated genome sequences of all seven species of Eimeria that infect domestic chickens, which reveal the full extent of previously described repeat-rich and repeat-poor regions and show that these parasites possess the most repeat-rich proteomes ever described. Furthermore, while no other apicomplexan has been found to possess retrotransposons, Eimeria is home to a family of chromoviruses. Analysis of Eimeria genes involved in basic biology and host-parasite interaction highlights adaptations to a relatively simple developmental life cycle and a complex array of co-expressed surface proteins involved in host cell binding.


Subject(s)
Eimeria/genetics , Genome, Protozoan , Protozoan Proteins/genetics , Animals , Cell Line , Chickens , Chromosome Mapping , Coccidiosis/parasitology , Coccidiosis/veterinary , Eimeria/classification , Gene Expression Profiling , Phylogeny , Poultry Diseases/parasitology , Proteome , Synteny
9.
Adv Parasitol ; 83: 93-171, 2013.
Article in English | MEDLINE | ID: mdl-23876872

ABSTRACT

Coccidiosis is a widespread and economically significant disease of livestock caused by protozoan parasites of the genus Eimeria. This disease is worldwide in occurrence and costs the animal agricultural industry many millions of dollars to control. In recent years, the modern tools of molecular biology, biochemistry, cell biology and immunology have been used to expand greatly our knowledge of these parasites and the disease they cause. Such studies are essential if we are to develop new means for the control of coccidiosis. In this chapter, selective aspects of the biology of these organisms, with emphasis on recent research in poultry, are reviewed. Topics considered include taxonomy, systematics, genetics, genomics, transcriptomics, proteomics, transfection, oocyst biogenesis, host cell invasion, immunobiology, diagnostics and control.


Subject(s)
Coccidiosis/veterinary , Eimeria/pathogenicity , Parasitology/trends , Poultry Diseases/diagnosis , Poultry Diseases/epidemiology , Veterinary Medicine/trends , Animals , Coccidiosis/diagnosis , Coccidiosis/epidemiology , Coccidiosis/prevention & control , Communicable Disease Control/methods , Eimeria/classification , Eimeria/genetics , Eimeria/physiology , Poultry , Poultry Diseases/parasitology , Poultry Diseases/prevention & control
10.
Database (Oxford) ; 2013: bat006, 2013.
Article in English | MEDLINE | ID: mdl-23411718

ABSTRACT

Parasites of the genus Eimeria infect a wide range of vertebrate hosts, including chickens. We have recently reported a comparative analysis of the transcriptomes of Eimeria acervulina, Eimeria maxima and Eimeria tenella, integrating ORESTES data produced by our group and publicly available Expressed Sequence Tags (ESTs). All cDNA reads have been assembled, and the reconstructed transcripts have been submitted to a comprehensive functional annotation pipeline. Additional studies included orthology assignment across apicomplexan parasites and clustering analyses of gene expression profiles among different developmental stages of the parasites. To make all this body of information publicly available, we constructed the Eimeria Transcript Database (EimeriaTDB), a web repository that provides access to sequence data, annotation and comparative analyses. Here, we describe the web interface, available sequence data sets and query tools implemented on the site. The main goal of this work is to offer a public repository of sequence and functional annotation data of reconstructed transcripts of parasites of the genus Eimeria. We believe that EimeriaTDB will represent a valuable and complementary resource for the Eimeria scientific community and for those researchers interested in comparative genomics of apicomplexan parasites. Database URL: http://www.coccidia.icb.usp.br/eimeriatdb/


Subject(s)
Databases, Genetic , Eimeria/genetics , Molecular Sequence Annotation , Parasites/genetics , Animals , Gene Expression Profiling , Internet , RNA, Messenger/genetics , RNA, Messenger/metabolism , Search Engine , Statistics as Topic
11.
Int J Parasitol ; 42(1): 39-48, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22142560

ABSTRACT

Coccidiosis of the domestic fowl is a worldwide disease caused by seven species of protozoan parasites of the genus Eimeria. The genome of the model species, Eimeria tenella, presents a complexity of 55-60MB distributed in 14 chromosomes. Relatively few studies have been undertaken to unravel the complexity of the transcriptome of Eimeria parasites. We report here the generation of more than 45,000 open reading frame expressed sequence tag (ORESTES) cDNA reads of E. tenella, Eimeria maxima and Eimeria acervulina, covering several developmental stages: unsporulated oocysts, sporoblastic oocysts, sporulated oocysts, sporozoites and second generation merozoites. All reads were assembled to constitute gene indices and submitted to a comprehensive functional annotation pipeline. In the case of E. tenella, we also incorporated publicly available ESTs to generate an integrated body of information. Orthology analyses have identified genes conserved across different apicomplexan parasites, as well as genes restricted to the genus Eimeria. Digital expression profiles obtained from ORESTES/EST countings, submitted to clustering analyses, revealed a high conservation pattern across the three Eimeria spp. Distance trees showed that unsporulated and sporoblastic oocysts constitute a distinct clade in all species, with sporulated oocysts forming a more external branch. This latter stage also shows a close relationship with sporozoites, whereas first and second generation merozoites are more closely related to each other than to sporozoites. The profiles were unambiguously associated with the distinct developmental stages and strongly correlated with the order of the stages in the parasite life cycle. Finally, we present The Eimeria Transcript Database (http://www.coccidia.icb.usp.br/eimeriatdb), a website that provides open access to all sequencing data, annotation and comparative analysis. 
We expect this repository to represent a useful resource to the Eimeria scientific community, helping to define potential candidates for the development of new strategies to control coccidiosis of the domestic fowl.


Subject(s)
Eimeria/genetics , Gene Expression Profiling , Poultry/parasitology , Animals , Cluster Analysis , Eimeria/growth & development , Eimeria/isolation & purification , Expressed Sequence Tags , Molecular Sequence Data , Sequence Analysis, DNA
12.
Vet Parasitol ; 176(2-3): 275-80, 2011 Mar 10.
Article in English | MEDLINE | ID: mdl-21111537

ABSTRACT

Coccidioses are the major parasitic diseases in poultry and other domestic animals, including the domestic rabbit (Oryctolagus cuniculus). Eleven distinct Eimeria species have been identified in this host, but no PCR-based method has been developed so far for unequivocal species differentiation. In this work, we describe the development of molecular diagnostic assays that allow for the detection and discrimination of the 11 Eimeria species that infect rabbits. We determined the nucleotide sequences of the ITS1 ribosomal DNAs and designed species-specific primers for each species. We performed specificity tests of the assays using heterologous sets of primers and DNA samples, and no cross-specific bands were observed. We obtained a detection limit varying from 500 fg to 1 pg, which corresponds approximately to 0.8-1.7 sporulated oocysts, respectively. The test reported here showed good reproducibility and presented a consistent sensitivity with three different brands of amplification enzymes. These novel diagnostic assays will permit population surveys to be performed with high sensitivity and specificity, thus contributing to a better understanding of the epidemiology of this important group of coccidian parasites.


Subject(s)
Coccidiosis/veterinary , Eimeria/classification , Eimeria/genetics , Polymerase Chain Reaction/veterinary , Rabbits , Animals , Coccidiosis/parasitology , DNA, Intergenic/genetics , Polymerase Chain Reaction/methods , Species Specificity
13.
Bioinformatics ; 24(15): 1676-80, 2008 Aug 01.
Article in English | MEDLINE | ID: mdl-18544546

ABSTRACT

MOTIVATION: DNA assembly programs classically perform an all-against-all comparison of reads to identify overlaps, followed by a multiple sequence alignment and generation of a consensus sequence. If the aim is to assemble a particular segment, instead of a whole genome or transcriptome, a target-specific assembly is a more sensible approach. GenSeed is a Perl program that implements a seed-driven recursive assembly consisting of cycles comprising a similarity search, read selection and assembly. The iterative process results in a progressive extension of the original seed sequence. GenSeed was tested and validated on many applications, including the reconstruction of nuclear genes or segments, full-length transcripts, and extrachromosomal genomes. The robustness of the method was confirmed through the use of a variety of DNA and protein seeds, including short sequences derived from SAGE and proteome projects. AVAILABILITY: GenSeed is available under the GNU General Public License at http://www.coccidia.icb.usp.br/genseed/


Subject(s)
Algorithms , DNA/genetics , Database Management Systems , Databases, Genetic , Proteome/chemistry , Proteome/genetics , Sequence Analysis/methods , Amino Acid Sequence , Base Sequence , Molecular Sequence Data
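
Record 13 describes seed-driven assembly as repeated cycles of similarity search, read selection and assembly that progressively extend a seed. The toy sketch below mimics that loop in one direction only. It is a deliberate simplification and not GenSeed itself: exact suffix-prefix matching stands in for both the similarity search and the third-party assembler, and all names and parameters (`extend_right`, `seed_driven_assembly`, `min_overlap`, `max_cycles`) are hypothetical.

```python
def extend_right(contig, reads, min_overlap):
    """One cycle: find reads whose prefix matches the contig's 3' end
    (the stand-in for similarity search and read selection) and return
    the longest resulting extension (the stand-in for assembly)."""
    best = contig
    for read in reads:
        # The read must contribute at least one new base, hence len(read) - 1.
        max_overlap = min(len(contig), len(read) - 1)
        for overlap in range(max_overlap, min_overlap - 1, -1):
            if contig.endswith(read[:overlap]):
                candidate = contig + read[overlap:]
                if len(candidate) > len(best):
                    best = candidate
                break  # longest overlap for this read found
    return best


def seed_driven_assembly(seed, reads, min_overlap=4, max_cycles=100):
    """Repeat extension cycles until the contig stops growing."""
    contig = seed
    for _ in range(max_cycles):
        extended = extend_right(contig, reads, min_overlap)
        if extended == contig:
            break  # no read extends the contig any further
        contig = extended
    return contig


if __name__ == "__main__":
    target = "ATGGCGTACGTTAGCCTAGGATCCA"  # hypothetical target sequence
    toy_reads = [target[4:14], target[8:18], target[12:22], target[15:25]]
    print(seed_driven_assembly(target[:8], toy_reads))
```

Because only a handful of reads overlap the contig end in each cycle, the work per cycle stays small, which is the property the abstract attributes to the target-specific approach. A realistic implementation would also extend leftwards, tolerate mismatches, and handle reverse-complement reads.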
14.
Genome Res ; 17(3): 311-9, 2007 Mar.
Article in English | MEDLINE | ID: mdl-17284678

ABSTRACT

Eimeria tenella is an intracellular protozoan parasite that infects the intestinal tracts of domestic fowl and causes coccidiosis, a serious and sometimes lethal enteritis. Eimeria falls in the same phylum (Apicomplexa) as several human and animal parasites such as Cryptosporidium, Toxoplasma, and the malaria parasite, Plasmodium. Here we report the sequencing and analysis of the first chromosome of E. tenella, a chromosome believed to carry loci associated with drug resistance and known to differ between virulent and attenuated strains of the parasite. The chromosome--which appears to be representative of the genome--is gene-dense and rich in simple-sequence repeats, many of which appear to give rise to repetitive amino acid tracts in the predicted proteins. Most striking is the segmentation of the chromosome into repeat-rich regions peppered with transposon-like elements and telomere-like repeats, alternating with repeat-free regions. Predicted genes differ in character between the two types of segment, and the repeat-rich regions appear to be associated with strain-to-strain variation.


Subject(s)
Chromosome Structures/genetics , Eimeria tenella/genetics , Genes, Protozoan/genetics , Animals , Base Sequence , Chromosome Mapping , Computational Biology , Minisatellite Repeats/genetics , Molecular Sequence Data , Polymorphism, Restriction Fragment Length , Sequence Analysis, DNA
15.
Bioinformatics ; 22(3): 361-2, 2006 Feb 01.
Article in English | MEDLINE | ID: mdl-16332714

ABSTRACT

TRAP, the Tandem Repeats Analysis Program, is a Perl program that provides a unified set of analyses for the selection, classification, quantification and automated annotation of tandemly repeated sequences. TRAP uses the results of the Tandem Repeats Finder program to perform a global analysis of the satellite content of DNA sequences, permitting researchers to easily assess the tandem repeat content for both individual sequences and whole genomes. The results can be generated in convenient formats such as HTML and comma-separated values. TRAP can also be used to automatically generate annotation data in the format of feature table and GFF files.


Subject(s)
Algorithms , DNA/genetics , Documentation/methods , Sequence Analysis, DNA/methods , Software , Tandem Repeat Sequences/genetics , User-Computer Interface , Artificial Intelligence , DNA/classification , Databases, Genetic , Pattern Recognition, Automated/methods
16.
Microbes Infect ; 7(11-12): 1184-95, 2005.
Article in English | MEDLINE | ID: mdl-15951215

ABSTRACT

Proteins containing tandemly repetitive sequences are present in several immunodominant protein antigens in pathogenic protozoan parasites. The tandemly repetitive Trypanosoma cruzi B13 protein is recognized by IgG antibodies from 98% of Chagas' disease patients. Little is known about the molecular mechanisms that lead to the immunodominance of the repeated sequences, and there is limited information on T cell epitopes in such repetitive antigens. We finely characterized the T cell recognition of the tandemly repetitive, degenerate B13 protein by T cell lines, clones and PBMC from Chagas' disease cardiomyopathy (CCC), asymptomatic T. cruzi infected (ASY) and non-infected individuals (N). PBMC proliferative responses to recombinant B13 protein were restricted to individuals bearing HLA-DQA1*0501(DQ7), -DR1, and -DR2; B13 peptides bound to the same HLA molecules in binding assays. The HLA-DQ7-restricted minimal T cell epitope [FGQAAAG(D/E)KP] was identified with an overlapping combinatorial peptide library including all B13 sequence variants in T. cruzi Y strain B13 protein; the underlined small residues GQA were the major HLA contact residues. Among natural B13 15-mer variant peptides, molecular modeling showed that several variant positions were solvent (TCR)-exposed, and substitutions at exposed positions abolished recognition. While natural B13 variant peptide S15.9 seems to be the immunodominant epitope for Chagas' disease patients, S15.4 was preferentially recognized by CCC rather than ASY patients, which may be pathogenically relevant. This is the first thorough characterization of T cell epitopes of a tandemly repetitive protozoan antigen and may suggest a role for T cell help in the immunodominance of protozoan repetitive antigens.


Subject(s)
Antigens, Protozoan/immunology , Epitopes, T-Lymphocyte/immunology , Immunodominant Epitopes , Protozoan Proteins/immunology , Trypanosoma cruzi/immunology , Amino Acid Sequence , Animals , Antibodies, Protozoan/blood , Antigens, Protozoan/chemistry , Chagas Disease/immunology , HLA-DQ Antigens/genetics , HLA-DQ Antigens/metabolism , HLA-DQ alpha-Chains , HLA-DR1 Antigen/genetics , HLA-DR1 Antigen/metabolism , HLA-DR2 Antigen/genetics , HLA-DR2 Antigen/metabolism , Humans , Models, Molecular , Molecular Sequence Data , Protein Binding , Protozoan Proteins/chemistry
17.
Bioinformatics ; 21(12): 2812-3, 2005 Jun 15.
Article in English | MEDLINE | ID: mdl-15814554

ABSTRACT

UNLABELLED: EGene is a generic, flexible and modular pipeline generation system that makes pipeline construction a modular job. EGene allows for third-party programs to be used and integrated according to the needs of distinct projects and without any previous programming or formal language experience being required. EGene comes with CoEd, a visual tool to facilitate pipeline construction and documentation. A series of components to build pipelines for sequence processing is provided. AVAILABILITY: http://www.lbm.fmvz.usp.br/egene/ CONTACT: alan@ime.usp.br; argruber@usp.br SUPPLEMENTARY INFORMATION: http://www.lbm.fmvz.usp.br/egene/


Subject(s)
Chromosome Mapping/methods , Computer Graphics , Database Management Systems , Databases, Nucleic Acid , Sequence Analysis, DNA/methods , Software , User-Computer Interface , Information Storage and Retrieval/methods , Systems Integration
18.
FEMS Microbiol Lett ; 238(1): 183-8, 2004 Sep 01.
Article in English | MEDLINE | ID: mdl-15336420

ABSTRACT

This study reports the development and characterization of 151 sequence characterized amplified region (SCAR) markers for the seven Eimeria species that infect the domestic fowl. From this set, 84 markers are species-specific and 67 present partial specificity. The complete nucleotide sequence was derived for all markers, revealing the presence of micro- and minisatellite repetitive units in 22 SCARs, with up to five distinct repeat units being observed per marker. Only 15 markers showed significant hits in similarity searches against public sequence databases, thus confirming their anonymous and non-coding character. Finally, a relational database of the markers (the Eimeria SCARdb) was developed and made available on the Internet, providing a valuable resource of SCAR markers that can be useful for molecular diagnosis, and also for epizootiological, genetic variability and genome mapping studies.


Subject(s)
DNA, Protozoan/chemistry , Databases, Nucleic Acid , Eimeria/genetics , Eimeria/isolation & purification , Genetic Markers , Poultry/microbiology , Animals , Blotting, Southern , Coccidiosis/parasitology , Coccidiosis/veterinary , Computational Biology , DNA, Protozoan/isolation & purification , Microsatellite Repeats , Minisatellite Repeats , Molecular Sequence Data , Poultry Diseases/parasitology , Random Amplified Polymorphic DNA Technique , Sequence Analysis, DNA
19.
Genome Res ; 14(7): 1413-23, 2004 Jul.
Article in English | MEDLINE | ID: mdl-15197164

ABSTRACT

We report the results of a transcript finishing initiative, undertaken for the purpose of identifying and characterizing novel human transcripts, in which RT-PCR was used to bridge gaps between paired EST clusters, mapped against the genomic sequence. Each pair of EST clusters selected for experimental validation was designated a transcript finishing unit (TFU). A total of 489 TFUs were selected for validation, and an overall efficiency of 43.1% was achieved. We generated a total of 59,975 bp of transcribed sequences organized into 432 exons, contributing to the definition of the structure of 211 human transcripts. The structure of several transcripts reported here was confirmed during the course of this project, through the generation of their corresponding full-length cDNA sequences. Nevertheless, for 21% of the validated TFUs, a full-length cDNA sequence is not yet available in public databases, and the structure of 69.2% of these TFUs was not correctly predicted by computer programs. The TF strategy provides a significant contribution to the definition of the complete catalog of human genes and transcripts, because it appears to be particularly useful for identification of low abundance transcripts expressed in a restricted set of tissues as well as for the delineation of gene boundaries and alternatively spliced isoforms.


Subject(s)
Software , Transcription, Genetic/genetics , Alternative Splicing/genetics , Cell Line , Cell Line, Tumor , Computational Biology/methods , Computational Biology/statistics & numerical data , Consensus Sequence/genetics , DNA, Neoplasm , Databases, Genetic/classification , Expressed Sequence Tags , Genes/genetics , Genome, Human , HeLa Cells/pathology , Humans , Molecular Sequence Data , Open Reading Frames/genetics , Software Design , Software Validation , U937 Cells/pathology