Results 1 - 16 of 16
1.
Curr Protoc ; 4(6): e1065, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38857087

ABSTRACT

The European Bioinformatics Institute (EMBL-EBI)'s Job Dispatcher framework provides access to a wide range of core databases and analysis tools that are of key importance in bioinformatics. As well as providing web interfaces to these resources, web services are available using REST and SOAP protocols that enable programmatic access and allow their integration into other applications and analytical workflows and pipelines. This article describes the various options available to researchers and bioinformaticians who would like to use our resources via the web interface employing RESTful web services clients provided in Perl, Python, and Java or who would like to use Docker containers to integrate the resources into analysis pipelines and workflows. © 2024 The Authors. Current Protocols published by Wiley Periodicals LLC.

Basic Protocol 1: Retrieving data from EMBL-EBI using Dbfetch via the web interface
Alternate Protocol 1: Retrieving data from EMBL-EBI using WSDbfetch via the REST interface
Alternate Protocol 2: Retrieving data from EMBL-EBI using Dbfetch via RESTful web services with Python client
Support Protocol 1: Installing Python REST web services clients
Basic Protocol 2: Sequence similarity search using FASTA search via the web interface
Alternate Protocol 3: Sequence similarity search using FASTA via RESTful web services with Perl client
Support Protocol 2: Installing Perl REST web services clients
Basic Protocol 3: Sequence similarity search using NCBI BLAST+ RESTful web services with Python client
Basic Protocol 4: Sequence similarity search using HMMER3 phmmer REST web services with Perl client and Docker
Support Protocol 3: Installing Docker and running the EMBL-EBI client container
Basic Protocol 5: Protein functional analysis using InterProScan 5 RESTful web services with the Python client and Docker
Alternate Protocol 4: Protein functional analysis using InterProScan 5 RESTful web services with the Java client
Support Protocol 4: Installing Java web services clients
Basic Protocol 6: Multiple sequence alignment using Clustal Omega via web interface
Alternate Protocol 5: Multiple sequence alignment using Clustal Omega with Perl client and Docker
Support Protocol 5: Exploring the RESTful API with OpenAPI User Interface.


Subjects
Internet; Software; Computational Biology/methods; User-Computer Interface
2.
Nucleic Acids Res ; 52(W1): W521-W525, 2024 Jul 05.
Article in English | MEDLINE | ID: mdl-38597606

ABSTRACT

The EMBL-EBI Job Dispatcher sequence analysis tools framework (https://www.ebi.ac.uk/jdispatcher) enables the scientific community to perform a diverse range of sequence analyses using popular bioinformatics applications. Free access to the tools and required sequence datasets is provided through user-friendly web applications, as well as via RESTful and SOAP-based APIs. These are integrated into popular EMBL-EBI resources such as UniProt, InterPro, ENA and Ensembl Genomes. This paper overviews recent improvements to Job Dispatcher, including its brand new website and documentation, enhanced visualisations, improved job management, and a rising trend of user reliance on the service from low- and middle-income regions.


Subjects
Software; Internet; Sequence Analysis/methods; Computational Biology/methods; Databases, Genetic; Humans
3.
Nucleic Acids Res ; 50(W1): W276-W279, 2022 Jul 05.
Article in English | MEDLINE | ID: mdl-35412617

ABSTRACT

The EMBL-EBI search and sequence analysis tools frameworks provide integrated access to EMBL-EBI's data resources and core bioinformatics analytical tools. EBI Search (https://www.ebi.ac.uk/ebisearch) provides a full-text search engine across nearly 5 billion entries, while the Job Dispatcher tools framework (https://www.ebi.ac.uk/services) enables the scientific community to perform a diverse range of sequence analysis using popular bioinformatics applications. Both allow users to interact through user-friendly web applications, as well as via RESTful and SOAP-based APIs. Here, we describe recent improvements to these services and updates made to accommodate the increasing data requirements during the COVID-19 pandemic.


Subjects
Sequence Analysis; Software; Humans; Computational Biology; COVID-19/epidemiology; Internet; Pandemics; Sequence Alignment
4.
Curr Protoc Bioinformatics ; 66(1): e74, 2019 Jun.
Article in English | MEDLINE | ID: mdl-31039604

ABSTRACT

The European Bioinformatics Institute (EMBL-EBI) provides access to a wide range of core databases and analysis tools that are of key importance in bioinformatics. As well as providing web interfaces to these resources, web services are available using REST and SOAP protocols that enable programmatic access and allow their integration into other applications and analytical workflows and pipelines. This article describes the various options available to researchers and bioinformaticians who would like to use our resources via the web interface employing RESTful web service clients provided in Perl, Python, and Java, or would like to use Docker containers to integrate the resources into analysis pipelines and workflows. © 2019 by John Wiley & Sons, Inc.


Subjects
Databases, Genetic; Internet; Amino Acid Sequence; Knowledge Bases; Phylogeny; Sequence Alignment; Software; User-Computer Interface
5.
Nucleic Acids Res ; 47(W1): W636-W641, 2019 Jul 02.
Article in English | MEDLINE | ID: mdl-30976793

ABSTRACT

The EMBL-EBI provides free access to popular bioinformatics sequence analysis applications as well as to a full-featured text search engine with powerful cross-referencing and data retrieval capabilities. Access to these services is provided via user-friendly web interfaces and via established RESTful and SOAP Web Services APIs (https://www.ebi.ac.uk/seqdb/confluence/display/JDSAT/EMBL-EBI+Web+Services+APIs+-+Data+Retrieval). Both systems have been developed with the same core principles that allow them to integrate an ever-increasing volume of biological data, making them an integral part of many popular data resources provided at the EMBL-EBI. Here, we describe the latest improvements made to the frameworks which enhance the interconnectivity between public EMBL-EBI resources and ultimately enhance biological data discoverability, accessibility, interoperability and reusability.


Subjects
Sequence Analysis; Software; Databases, Nucleic Acid; Databases, Protein; Sequence Alignment; Sequence Analysis, Protein
6.
Lancet ; 385(9975): 1305-14, 2015 Apr 04.
Article in English | MEDLINE | ID: mdl-25529582

ABSTRACT

BACKGROUND: Human genome sequencing has transformed our understanding of genomic variation and its relevance to health and disease, and is now starting to enter clinical practice for the diagnosis of rare diseases. The question of whether and how some categories of genomic findings should be shared with individual research participants is currently a topic of international debate, and development of robust analytical workflows to identify and communicate clinically relevant variants is paramount.
METHODS: The Deciphering Developmental Disorders (DDD) study has developed a UK-wide patient recruitment network involving over 180 clinicians across all 24 regional genetics services, and has performed genome-wide microarray and whole exome sequencing on children with undiagnosed developmental disorders and their parents. After data analysis, pertinent genomic variants were returned to individual research participants via their local clinical genetics team.
FINDINGS: Around 80,000 genomic variants were identified from exome sequencing and microarray analysis in each individual, of which on average 400 were rare and predicted to be protein altering. By focusing only on de novo and segregating variants in known developmental disorder genes, we achieved a diagnostic yield of 27% among 1133 previously investigated yet undiagnosed children with developmental disorders, whilst minimising incidental findings. In families with developmentally normal parents, whole exome sequencing of the child and both parents resulted in a 10-fold reduction in the number of potential causal variants that needed clinical evaluation compared to sequencing only the child. Most diagnostic variants identified in known genes were novel and not present in current databases of known disease variation.
INTERPRETATION: Implementation of a robust translational genomics workflow is achievable within a large-scale rare disease research study to allow feedback of potentially diagnostic findings to clinicians and research participants. Systematic recording of relevant clinical data, curation of a gene-phenotype knowledge base, and development of clinical decision support software are needed in addition to automated exclusion of almost all variants, which is crucial for scalable prioritisation and review of possible diagnostic variants. However, the resource requirements of development and maintenance of a clinical reporting system within a research setting are substantial.
FUNDING: Health Innovation Challenge Fund, a parallel funding partnership between the Wellcome Trust and the UK Department of Health.


Subjects
Developmental Disabilities/diagnosis; Genome, Human/genetics; Adolescent; Child; Child, Preschool; Developmental Disabilities/genetics; Female; Genetic Variation/genetics; Genome-Wide Association Study/methods; Heterozygote; Humans; Incidental Findings; Infant; Infant, Newborn; Information Dissemination; Male; Phenotype; Specimen Handling
7.
Nucleic Acids Res ; 40(Database issue): D98-108, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22116062

ABSTRACT

GeneDB (http://www.genedb.org) is a genome database for prokaryotic and eukaryotic pathogens and closely related organisms. The resource provides a portal to genome sequence and annotation data, which is primarily generated by the Pathogen Genomics group at the Wellcome Trust Sanger Institute. It combines data from completed and ongoing genome projects with curated annotation, which is readily accessible from a web based resource. The development of the database in recent years has focused on providing database-driven annotation tools and pipelines, as well as catering for increasingly frequent assembly updates. The website has been significantly redesigned to take advantage of current web technologies, and improve usability. The current release stores 41 data sets, of which 17 are manually curated and maintained by biologists, who review and incorporate data from the scientific literature, as well as other sources. GeneDB is primarily a production and annotation database for the genomes of predominantly pathogenic organisms.


Subjects
Databases, Genetic; Genomics; Molecular Sequence Annotation; Animals; Arthropods/genetics; Genome, Bacterial; Genome, Helminth; Genome, Protozoan; Internet; Vocabulary, Controlled
8.
Nucleic Acids Res ; 38(Database issue): D457-62, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19843604

ABSTRACT

TriTrypDB (http://tritrypdb.org) is an integrated database providing access to genome-scale datasets for kinetoplastid parasites, and supporting a variety of complex queries driven by research and development needs. TriTrypDB is a collaborative project, utilizing the GUS/WDK computational infrastructure developed by the Eukaryotic Pathogen Bioinformatics Resource Center (EuPathDB.org) to integrate genome annotation and analyses from GeneDB and elsewhere with a wide variety of functional genomics datasets made available by members of the global research community, often pre-publication. Currently, TriTrypDB integrates datasets from Leishmania braziliensis, L. infantum, L. major, L. tarentolae, Trypanosoma brucei and T. cruzi. Users may examine individual genes or chromosomal spans in their genomic context, including syntenic alignments with other kinetoplastid organisms. Data within TriTrypDB can be interrogated utilizing a sophisticated search strategy system that enables a user to construct complex queries combining multiple data types. All search strategies are stored, allowing future access and integrated searches. 'User Comments' may be added to any gene page, enhancing available annotation; such comments become immediately searchable via the text search, and are forwarded to curators for incorporation into the reference annotation when appropriate.


Subjects
Computational Biology/methods; Databases, Genetic; Databases, Nucleic Acid; Leishmania/genetics; Trypanosoma/genetics; Animals; Computational Biology/trends; Databases, Protein; Genome, Protozoan; Information Storage and Retrieval/methods; Internet; Protein Structure, Tertiary; Protozoan Proteins/genetics; Software; User-Computer Interface
9.
Genome Res ; 19(12): 2231-44, 2009 Dec.
Article in English | MEDLINE | ID: mdl-19745113

ABSTRACT

Candida dubliniensis is the closest known relative of Candida albicans, the most pathogenic yeast species in humans. However, despite both species sharing many phenotypic characteristics, including the ability to form true hyphae, C. dubliniensis is a significantly less virulent and less versatile pathogen. Therefore, to identify C. albicans-specific genes that may be responsible for an increased capacity to cause disease, we have sequenced the C. dubliniensis genome and compared it with the known C. albicans genome sequence. Although the two genome sequences are highly similar and synteny is conserved throughout, 168 species-specific genes are identified, including some encoding known hyphal-specific virulence factors, such as the aspartyl proteinases Sap4 and Sap5 and the proposed invasin Als3. Among the 115 pseudogenes confirmed in C. dubliniensis are orthologs of several filamentous growth regulator (FGR) genes that also have suspected roles in pathogenesis. However, the principal differences in genomic repertoire concern expansion of the TLO gene family of putative transcription factors and the IFA family of putative transmembrane proteins in C. albicans, which represent novel candidate virulence-associated factors. The results suggest that the recent evolutionary histories of C. albicans and C. dubliniensis are quite different. While gene families instrumental in pathogenesis have been elaborated in C. albicans, C. dubliniensis has lost genomic capacity and key pathogenic functions. This could explain why C. albicans is a more potent pathogen in humans than C. dubliniensis.


Subjects
Candida albicans; Candida; Fungal Proteins; Genome, Fungal; Genomics; Virulence Factors; Candida/classification; Candida/genetics; Candida/pathogenicity; Candida albicans/genetics; Candida albicans/pathogenicity; Fungal Proteins/genetics; Fungal Proteins/metabolism; Gene Order; Humans; Hyphae/genetics; Hyphae/metabolism; Membrane Proteins/genetics; Membrane Proteins/metabolism; Molecular Sequence Data; Phylogeny; Sequence Analysis, DNA; Species Specificity; Synteny; Transcription Factors/genetics; Transcription Factors/metabolism; Virulence; Virulence Factors/genetics; Virulence Factors/metabolism
10.
Nature ; 460(7253): 352-8, 2009 Jul 16.
Article in English | MEDLINE | ID: mdl-19606141

ABSTRACT

Schistosoma mansoni is responsible for the neglected tropical disease schistosomiasis that affects 210 million people in 76 countries. Here we present analysis of the 363 megabase nuclear genome of the blood fluke. It encodes at least 11,809 genes, with an unusual intron size distribution, and new families of micro-exon genes that undergo frequent alternative splicing. As the first sequenced flatworm, and a representative of the Lophotrochozoa, it offers insights into early events in the evolution of the animals, including the development of a body pattern with bilateral symmetry, and the development of tissues into organs. Our analysis has been informed by the need to find new drug targets. The deficits in lipid metabolism that make schistosomes dependent on the host are revealed, and the identification of membrane receptors, ion channels and more than 300 proteases provide new insights into the biology of the life cycle and new targets. Bioinformatics approaches have identified metabolic chokepoints, and a chemogenomic screen has pinpointed schistosome proteins for which existing drugs may be active. The information generated provides an invaluable resource for the research community to develop much needed new control tools for the treatment and eradication of this important and neglected disease.


Subjects
Genome, Helminth/genetics; Schistosoma mansoni/genetics; Animals; Biological Evolution; Exons/genetics; Genes, Helminth/genetics; Host-Parasite Interactions/genetics; Introns/genetics; Molecular Sequence Data; Physical Chromosome Mapping; Schistosoma mansoni/drug effects; Schistosoma mansoni/embryology; Schistosoma mansoni/physiology; Schistosomiasis mansoni/drug therapy; Schistosomiasis mansoni/parasitology
11.
Bioinformatics ; 24(23): 2672-6, 2008 Dec 01.
Article in English | MEDLINE | ID: mdl-18845581

ABSTRACT

MOTIVATION: Artemis and Artemis Comparison Tool (ACT) have become mainstream tools for viewing and annotating sequence data, particularly for microbial genomes. Since its first release, Artemis has been continuously developed and supported with additional functionality for editing and analysing sequences based on feedback from an active user community of laboratory biologists and professional annotators. Nevertheless, its utility has been somewhat restricted by its limitation to reading and writing from flat files. Therefore, a new version of Artemis has been developed, which reads from and writes to a relational database schema, and allows users to annotate more complex, often large and fragmented, genome sequences.
RESULTS: Artemis and ACT have now been extended to read and write directly to the Generic Model Organism Database (GMOD, http://www.gmod.org) Chado relational database schema. In addition, a Gene Builder tool has been developed to provide structured forms and tables to edit coordinates of gene models and edit functional annotation, based on standard ontologies, controlled vocabularies and free text.
AVAILABILITY: Artemis and ACT are freely available (under a GPL licence) for download (for MacOSX, UNIX and Windows) at the Wellcome Trust Sanger Institute web sites: http://www.sanger.ac.uk/Software/Artemis/ http://www.sanger.ac.uk/Software/ACT/


Subjects
Databases, Genetic; Genomics; Software; Databases, Nucleic Acid
12.
Nat Genet ; 39(7): 839-47, 2007 Jul.
Article in English | MEDLINE | ID: mdl-17572675

ABSTRACT

Leishmania parasites cause a broad spectrum of clinical disease. Here we report the sequencing of the genomes of two species of Leishmania: Leishmania infantum and Leishmania braziliensis. The comparison of these sequences with the published genome of Leishmania major reveals marked conservation of synteny and identifies only approximately 200 genes with a differential distribution between the three species. L. braziliensis, contrary to Leishmania species examined so far, possesses components of a putative RNA-mediated interference pathway, telomere-associated transposable elements and spliced leader-associated SLACS retrotransposons. We show that pseudogene formation and gene loss are the principal forces shaping the different genomes. Genes that are differentially distributed between the species encode proteins implicated in host-pathogen interactions and parasite survival in the macrophage.


Subjects
Genome; Genomics; Leishmania/genetics; Leishmaniasis/parasitology; Amino Acid Sequence; Animals; Humans; Leishmania braziliensis/genetics; Leishmania infantum/genetics; Leishmania major/genetics; Leishmaniasis, Cutaneous/parasitology; Leishmaniasis, Visceral/parasitology; Molecular Sequence Data
13.
Genome Res ; 17(3): 311-9, 2007 Mar.
Article in English | MEDLINE | ID: mdl-17284678

ABSTRACT

Eimeria tenella is an intracellular protozoan parasite that infects the intestinal tracts of domestic fowl and causes coccidiosis, a serious and sometimes lethal enteritis. Eimeria falls in the same phylum (Apicomplexa) as several human and animal parasites such as Cryptosporidium, Toxoplasma, and the malaria parasite, Plasmodium. Here we report the sequencing and analysis of the first chromosome of E. tenella, a chromosome believed to carry loci associated with drug resistance and known to differ between virulent and attenuated strains of the parasite. The chromosome--which appears to be representative of the genome--is gene-dense and rich in simple-sequence repeats, many of which appear to give rise to repetitive amino acid tracts in the predicted proteins. Most striking is the segmentation of the chromosome into repeat-rich regions peppered with transposon-like elements and telomere-like repeats, alternating with repeat-free regions. Predicted genes differ in character between the two types of segment, and the repeat-rich regions appear to be associated with strain-to-strain variation.


Subjects
Chromosome Structures/genetics; Eimeria tenella/genetics; Genes, Protozoan/genetics; Animals; Base Sequence; Chromosome Mapping; Computational Biology; Minisatellite Repeats/genetics; Molecular Sequence Data; Polymorphism, Restriction Fragment Length; Sequence Analysis, DNA
14.
Science ; 309(5731): 131-3, 2005 Jul 01.
Article in English | MEDLINE | ID: mdl-15994557

ABSTRACT

Theileria annulata and T. parva are closely related protozoan parasites that cause lymphoproliferative diseases of cattle. We sequenced the genome of T. annulata and compared it with that of T. parva to understand the mechanisms underlying transformation and tropism. Despite high conservation of gene sequences and synteny, the analysis reveals unequally expanded gene families and species-specific genes. We also identify divergent families of putative secreted polypeptides that may reduce immune recognition, candidate regulators of host-cell transformation, and a Theileria-specific protein domain [frequently associated in Theileria (FAINT)] present in a large number of secreted proteins.


Subjects
Genome, Protozoan; Protozoan Proteins/genetics; Theileria annulata/genetics; Theileria parva/genetics; Amino Acid Motifs; Animals; Cattle; Cell Proliferation; Chromosome Mapping; Chromosomes/genetics; Conserved Sequence; Genes, Protozoan; Life Cycle Stages; Lipid Metabolism; Lymphocytes/cytology; Lymphocytes/parasitology; Molecular Sequence Data; Multigene Family; Phylogeny; Protein Sorting Signals/genetics; Protein Structure, Tertiary; Proteome; Protozoan Proteins/chemistry; Protozoan Proteins/physiology; Sequence Analysis, DNA; Species Specificity; Synteny; Telomere/genetics; Theileria annulata/growth & development; Theileria annulata/immunology; Theileria annulata/pathogenicity; Theileria parva/growth & development; Theileria parva/immunology; Theileria parva/pathogenicity
15.
Int J Parasitol ; 35(5): 481-93, 2005 Apr 30.
Article in English | MEDLINE | ID: mdl-15826641

ABSTRACT

Centralisation of tools for analysis of genomic data is paramount in ensuring that research is always carried out on the latest currently available data. As such, World Wide Web sites providing a range of online analyses and displays of data can play a crucial role in guaranteeing consistency of in silico work. In this respect, the protozoan parasite research community is served by several resources, either focussing on data and tools for one species or taking a broader view and providing tools for analysis of data from many species, thereby facilitating comparative studies. In this paper, we give a broad overview of the online resources available. We then focus on the GeneDB project, detailing the features and tools currently available through it. Finally, we discuss data curation and its importance in keeping genomic data 'relevant' to the research community.


Subjects
Databases, Genetic; Genome, Protozoan; Genomics; Animals; Computational Biology; Information Storage and Retrieval; Online Systems
16.
Nucleic Acids Res ; 32(Database issue): D339-43, 2004 Jan 01.
Article in English | MEDLINE | ID: mdl-14681429

ABSTRACT

GeneDB (http://www.genedb.org/) is a genome database for prokaryotic and eukaryotic organisms. The resource provides a portal through which data generated by the Pathogen Sequencing Unit at the Wellcome Trust Sanger Institute and other collaborating sequencing centres can be made publicly available. It combines data from finished and ongoing genome and expressed sequence tag (EST) projects with curated annotation, that can be searched, sorted and downloaded, using a single web based resource. The current release stores 11 datasets of which six are curated and maintained by biologists, who review and incorporate information from the scientific literature, public databases and the respective research communities.


Subjects
Databases, Genetic; Eukaryotic Cells; Genome; Prokaryotic Cells; Animals; Computational Biology; Expressed Sequence Tags; Genomics; Information Storage and Retrieval; Internet