Results 1 - 20 of 40
1.
PLoS Biol ; 18(1): e3000583, 2020 01.
Article in English | MEDLINE | ID: mdl-31971940

ABSTRACT

We present Knowledge Engine for Genomics (KnowEnG), a free-to-use computational system for analysis of genomics data sets, designed to accelerate biomedical discovery. It includes tools for popular bioinformatics tasks such as gene prioritization, sample clustering, gene set analysis, and expression signature analysis. The system specializes in "knowledge-guided" data mining and machine learning algorithms, in which user-provided data are analyzed in light of prior information about genes, aggregated from numerous knowledge bases and encoded in a massive "Knowledge Network." KnowEnG adheres to "FAIR" principles (findable, accessible, interoperable, and reusable): its tools are easily portable to diverse computing environments, run on the cloud for scalable and cost-effective execution, and are interoperable with other computing platforms. The analysis tools are made available through multiple access modes, including a web portal with specialized visualization modules. We demonstrate the KnowEnG system's potential value in democratization of advanced tools for the modern genomics era through several case studies that use its tools to recreate and expand upon the published analysis of cancer data sets.


Subject(s)
Algorithms , Cloud Computing , Data Mining/methods , Genomics/methods , Software , Cluster Analysis , Computational Biology/methods , Data Analysis , Datasets as Topic , High-Throughput Nucleotide Sequencing/methods , Humans , Knowledge , Machine Learning , Metabolomics/methods
2.
Genome Res ; 26(2): 271-7, 2016 Feb.
Article in English | MEDLINE | ID: mdl-26627985

ABSTRACT

The application of genomics technologies to medicine and biomedical research is increasing in popularity, made possible by new high-throughput genotyping and sequencing technologies and improved data analysis capabilities. Some of the greatest genetic diversity among humans, animals, plants, and microbiota occurs in Africa, yet genomic research outputs from the continent are limited. The Human Heredity and Health in Africa (H3Africa) initiative was established to drive the development of genomic research for human health in Africa, and through recognition of the critical role of bioinformatics in this process, spurred the establishment of H3ABioNet, a pan-African bioinformatics network for H3Africa. The limitations in bioinformatics capacity on the continent have been a major contributory factor to the lack of notable outputs in high-throughput biology research. Although pockets of high-quality bioinformatics teams have existed previously, the majority of research institutions lack experienced faculty who can train and supervise bioinformatics students. H3ABioNet aims to address this dire need, specifically in the area of human genetics and genomics, but knock-on effects are ensuring this extends to other areas of bioinformatics. Here, we describe the emergence of genomics research and the development of bioinformatics in Africa through H3ABioNet.


Subject(s)
Black People/genetics , Health Promotion , Africa , Computational Biology , Computer Systems , Genetic Variation , Medical Genetics , Genomics , Humans
3.
BMC Bioinformatics ; 19(1): 457, 2018 Nov 29.
Article in English | MEDLINE | ID: mdl-30486782

ABSTRACT

BACKGROUND: The Pan-African bioinformatics network, H3ABioNet, comprises 27 research institutions in 17 African countries. H3ABioNet is part of the Human Heredity and Health in Africa program (H3Africa), an African-led research consortium funded by the US National Institutes of Health and the UK Wellcome Trust, aimed at using genomics to study and improve the health of Africans. A key role of H3ABioNet is to support H3Africa projects by building bioinformatics infrastructure such as portable and reproducible bioinformatics workflows for use on heterogeneous African computing environments. Processing and analysis of genomic data is an example of a big data application requiring complex interdependent data analysis workflows. Such bioinformatics workflows take the primary and secondary input data through several computationally intensive processing steps using different software packages, where some of the outputs form inputs for other steps. Implementing scalable, reproducible, portable and easy-to-use workflows is particularly challenging. RESULTS: H3ABioNet has built four workflows to support (1) the calling of variants from high-throughput sequencing data; (2) the analysis of microbial populations from 16S rDNA sequence data; (3) genotyping and genome-wide association studies; and (4) single nucleotide polymorphism imputation. A week-long hackathon was organized in August 2016 with participants from six African bioinformatics groups, and US and European collaborators. Two of the workflows are built using the Common Workflow Language framework (CWL) and two using Nextflow. All the workflows are containerized for improved portability and reproducibility using Docker, and are publicly available for use by members of the H3Africa consortium and the international research community. CONCLUSION: The H3ABioNet workflows have been implemented with a view to offering ease of use for the end user and high levels of reproducibility and portability, all while following modern state-of-the-art bioinformatics data processing protocols. The H3ABioNet workflows will service the H3Africa consortium projects and are currently in use. All four workflows are also publicly available for research scientists worldwide to use and adapt for their respective needs. The H3ABioNet workflows will help develop bioinformatics capacity and assist genomics research within Africa and serve to increase the scientific output of H3Africa and its Pan-African Bioinformatics Network.


Subject(s)
Computational Biology/methods , Genomics/methods , Africa , Humans , Reproducibility of Results
4.
PLoS Comput Biol ; 13(6): e1005419, 2017 Jun.
Article in English | MEDLINE | ID: mdl-28570565

ABSTRACT

The H3ABioNet pan-African bioinformatics network, which is funded to support the Human Heredity and Health in Africa (H3Africa) program, has developed node-assessment exercises to gauge the ability of its participating research and service groups to analyze typical genome-wide datasets being generated by H3Africa research groups. We describe a framework for the assessment of computational genomics analysis skills, which includes standard operating procedures, training and test datasets, and a process for administering the exercise. We present the experiences of 3 research groups that have taken the exercise and the impact on their ability to manage complex projects. Finally, we discuss the reasons why many H3ABioNet nodes have declined so far to participate and potential strategies to encourage them to do so.


Subject(s)
Black People/genetics , Genetic Databases , Genomics/methods , Database Management Systems , Developing Countries , Humans , Nigeria , South Africa
6.
BMC Genomics ; 13: 241, 2012 Jun 15.
Article in English | MEDLINE | ID: mdl-22702538

ABSTRACT

BACKGROUND: Genotypes obtained with commercial SNP arrays have been extensively used in many large case-control or population-based cohorts for SNP-based genome-wide association studies for a multitude of traits. Yet, these genotypes capture only a small fraction of the variance of the studied traits. Genomic structural variants (GSV) such as Copy Number Variation (CNV) may account for part of the missing heritability, but their comprehensive detection requires either next-generation arrays or sequencing. Sophisticated algorithms that infer CNVs by combining the intensities from SNP-probes for the two alleles can already be used to extract a partial view of such GSV from existing data sets. RESULTS: Here we present several advances to facilitate the latter approach. First, we introduce a novel CNV detection method based on a Gaussian Mixture Model. Second, we propose a new algorithm, PCA merge, for combining copy-number profiles from many individuals into consensus regions. We applied both our new methods as well as existing ones to data from 5612 individuals from the CoLaus study who were genotyped on Affymetrix 500K arrays. We developed a number of procedures in order to evaluate the performance of the different methods. This includes comparison with previously published CNVs as well as using a replication sample of 239 individuals, genotyped with Illumina 550K arrays. We also established a new evaluation procedure that employs the fact that related individuals are expected to share their CNVs more frequently than randomly selected individuals. The ability to detect both rare and common CNVs provides a valuable resource that will facilitate association studies exploring potential phenotypic associations with CNVs. CONCLUSION: Our new methodologies for CNV detection and their evaluation will help in extracting additional information from the large amount of SNP-genotyping data on various cohorts and use this to explore structural variants and their impact on complex traits.
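
The abstract above does not spell out how its Gaussian Mixture Model is built, so the following is only a generic sketch of the idea: fit a one-dimensional mixture to per-probe log2 intensity ratios with expectation-maximization, then assign each probe to the most likely copy-number component. The function names, the starting means (loss, neutral, gain) and the toy intensity values are assumptions made for illustration, not the authors' method or data.

```python
import math

def log_density(x, weight, mean, var):
    """Log of weight * Normal(x | mean, var)."""
    return (math.log(weight) - 0.5 * math.log(2 * math.pi * var)
            - (x - mean) ** 2 / (2 * var))

def fit_gmm_1d(values, init_means, n_iter=50, init_var=0.05, min_var=1e-4):
    """Fit a 1-D Gaussian mixture by EM, starting from assumed component means."""
    k = len(init_means)
    means = list(init_means)
    variances = [init_var] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each value,
        # computed in log space to avoid underflow.
        resp = []
        for x in values:
            logs = [log_density(x, w, m, v)
                    for w, m, v in zip(weights, means, variances)]
            top = max(logs)
            dens = [math.exp(value - top) for value in logs]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(k):
            n_j = sum(r[j] for r in resp)
            if n_j < 1e-9:
                continue  # leave an empty component unchanged
            weights[j] = n_j / len(values)
            means[j] = sum(r[j] * x for r, x in zip(resp, values)) / n_j
            var = sum(r[j] * (x - means[j]) ** 2
                      for r, x in zip(resp, values)) / n_j
            variances[j] = max(var, min_var)  # floor keeps the density finite
    return weights, means, variances

def call_state(x, weights, means, variances):
    """Index of the most likely component (0 = loss, 1 = neutral, 2 = gain here)."""
    return max(range(len(means)),
               key=lambda j: log_density(x, weights[j], means[j], variances[j]))

# Toy log2 ratios (hypothetical values, not from the CoLaus data).
ratios = [-1.05, -0.95, -1.0, -0.02, 0.0, 0.03, -0.01, 0.01, 0.55, 0.6, 0.62]
weights, means, variances = fit_gmm_1d(ratios, init_means=[-1.0, 0.0, 0.58])
states = [call_state(x, weights, means, variances) for x in ratios]
```

A real CNV caller would additionally smooth calls along the genome and handle noisy or missing probes; the sketch only shows the mixture-fitting step.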


Subject(s)
DNA Copy Number Variations/genetics , Human Genome , Single Nucleotide Polymorphism , Algorithms , Cohort Studies , Genome-Wide Association Study , Genotype , Humans , Normal Distribution , Oligonucleotide Array Sequence Analysis , Principal Component Analysis
7.
Front Genet ; 13: 769919, 2022.
Article in English | MEDLINE | ID: mdl-35571023

ABSTRACT

Genomics policy development involves assessing a wide range of issues extending from specimen collection and data sharing to whether and how to utilize advanced technologies in clinical practice and public health initiatives. A survey was conducted among African scientists and stakeholders with an interest in genomic medicine, seeking to evaluate: 1) Their knowledge and understanding of the field. 2) The institutional environment and infrastructure available to them. 3) The state and awareness of the field in their country. 4) Their perception of potential barriers to implementation of precision medicine. We discuss how the information gathered in the survey could instruct the policies of African institutions seeking to implement precision, and more specifically, genomic medicine approaches in their health care systems in the following areas: 1) Prioritization of infrastructures. 2) Need for translational research. 3) Information dissemination to potential users. 4) Training programs for specialized personnel. 5) Engaging political stakeholders and the public. A checklist with key requirements to assess readiness for implementation of genomic medicine programs is provided to guide the process from scientific discovery to clinical application.

8.
Proc Natl Acad Sci U S A ; 105(51): 20422-7, 2008 Dec 23.
Article in English | MEDLINE | ID: mdl-19088187

ABSTRACT

Cancer/Testis (CT) genes, normally expressed in germ line cells but also activated in a wide range of cancer types, often encode antigens that are immunogenic in cancer patients, and present potential for use as biomarkers and targets for immunotherapy. Using multiple in silico gene expression analysis technologies, including twice the number of expressed sequence tags used in previous studies, we have performed a comprehensive genome-wide survey of expression for a set of 153 previously described CT genes in normal and cancer expression libraries. We find that although they are generally highly expressed in testis, these genes exhibit heterogeneous gene expression profiles, allowing their classification into testis-restricted (39), testis/brain-restricted (14), and a testis-selective (85) group of genes that show additional expression in somatic tissues. The chromosomal distribution of these genes confirmed the previously observed dominance of X chromosome location, with CT-X genes being significantly more testis-restricted than non-X CT. Applying this core classification in a genome-wide survey we identified >30 CT candidate genes; 3 of them, PEPP-2, OTOA, and AKAP4, were confirmed as testis-restricted or testis-selective using RT-PCR, with variable expression frequencies observed in a panel of cancer cell lines. Our classification provides an objective ranking for potential CT genes, which is useful in guiding further identification and characterization of these potentially important diagnostic and therapeutic targets.


Subject(s)
Gene Expression Profiling/methods , Human Genome , Testicular Neoplasms/genetics , Testis , A Kinase Anchor Proteins , Tumor Cell Line , Human Chromosomes , Human X Chromosomes , Computational Biology , GPI-Linked Proteins , Genomics/methods , Homeodomain Proteins/genetics , Humans , Male , Membrane Proteins/genetics
9.
Nucleic Acids Res ; 35(Web Server issue): W433-7, 2007 Jul.
Article in English | MEDLINE | ID: mdl-17545200

ABSTRACT

The MyHits web site (http://myhits.isb-sib.ch) is an integrated service dedicated to the analysis of protein sequences. Since its first description in 2004, both the user interface and the back end of the server were improved. A number of tools (e.g. MAFFT, Jacop, Dotlet, Jalview, ESTScan) were added or updated to improve the usability of the service. The MySQL schema and its associated API were revamped and the database engine (HitKeeper) was separated from the web interface. This paper summarizes the current status of the server, with an emphasis on the new services.


Subject(s)
Computational Biology/methods , Tertiary Protein Structure , Protein Sequence Analysis , Software , Computer Graphics , Protein Databases , Internet , Programming Languages , Sequence Alignment , Systems Integration , User-Computer Interface
10.
Cancer Immun ; 8: 11, 2008 Jun 27.
Article in English | MEDLINE | ID: mdl-18581998

ABSTRACT

Despite the high prevalence of colon cancer in the world and the great interest in targeted anti-cancer therapy, only a few tumor-specific gene products have been identified that could serve as targets for the immunological treatment of colorectal cancers. The aim of our study was therefore to identify frequently expressed colon cancer-specific antigens. We performed a large-scale analysis of genes expressed in normal colon and colon cancer tissues isolated from colorectal cancer patients using massively parallel signature sequencing (MPSS). Candidates were additionally subjected to experimental evaluation by semi-quantitative RT-PCR on a cohort of colorectal cancer patients. From a pool of more than 6000 genes identified unambiguously in the analysis, we found 2124 genes that were selectively expressed in colon cancer tissue and 147 genes that were differentially expressed to a significant degree between normal and cancer cells. Differential expression of many genes was confirmed by RT-PCR on a cohort of patients. Despite the fact that deregulated genes were involved in many different cellular pathways, we found that genes expressed in the extracellular space were significantly over-represented in colorectal cancer. Strikingly, we identified a transcript from a chromosome X-linked member of the human endogenous retrovirus (HERV) H family that was frequently and selectively expressed in colon cancer but not in normal tissues. Our data suggest that this sequence should be considered as a target of immunological interventions against colorectal cancer.


Subject(s)
Neoplasm Antigens/genetics , Tumor Biomarkers/genetics , Colorectal Neoplasms/genetics , Gene Expression Profiling , Neoplastic Gene Expression Regulation , Neoplasm Antigens/analysis , Tumor Biomarkers/analysis , Colorectal Neoplasms/immunology , Colorectal Neoplasms/metabolism , Down-Regulation , Endogenous Retroviruses/genetics , Humans
11.
AAS Open Res ; 1: 9, 2018.
Article in English | MEDLINE | ID: mdl-32382696

ABSTRACT

The need for portable and reproducible genomics analysis pipelines is growing globally as well as in Africa, especially with the growth of collaborative projects like the Human Heredity and Health in Africa Consortium (H3Africa). The Pan-African H3Africa Bioinformatics Network (H3ABioNet) recognized the need for portable, reproducible pipelines adapted to heterogeneous compute environments, and for the nurturing of technical expertise in workflow languages and containerization technologies. To address this need, in 2016 H3ABioNet arranged its first Cloud Computing and Reproducible Workflows Hackathon, with the purpose of building key genomics analysis pipelines able to run on heterogeneous computing environments and meeting the needs of H3Africa research projects. This paper describes the preparations for this hackathon and reflects upon the lessons learned about its impact on building the technical and scientific expertise of African researchers. The workflows developed were made publicly available in GitHub repositories and deposited as container images on quay.io.

12.
BMC Genomics ; 8: 398, 2007 Oct 31.
Article in English | MEDLINE | ID: mdl-17973996

ABSTRACT

BACKGROUND: The comparison of complete genomes has revealed surprisingly large numbers of conserved non-protein-coding (CNC) DNA regions. However, the biological function of CNC remains elusive. CNC differ in two aspects from conserved protein-coding regions. They are not conserved across phylum boundaries, and they do not contain readily detectable sub-domains. Here we characterize the persistence length and time of CNC and conserved protein-coding regions in the vertebrate and insect lineages. RESULTS: The persistence length is the length of a genome region over which a certain level of sequence identity is consistently maintained. The persistence time is the evolutionary period during which a conserved region evolves under the same selective constraints. Our main findings are: (i) Insect genomes contain 1.60 times less conserved information than vertebrates; (ii) Vertebrate CNC have a higher persistence length than conserved coding regions or insect CNC; (iii) CNC have shorter persistence times as compared to conserved coding regions in both lineages. CONCLUSION: Higher persistence length of vertebrate CNC indicates that the conserved information in vertebrates and insects is organized in functional elements of different lengths. These findings might be related to the higher morphological complexity of vertebrates and give clues about the structure of active CNC elements. Shorter persistence time might explain the previously puzzling observations of highly conserved CNC within each phylum, and of a lack of conservation between phyla. It suggests that CNC divergence might be a key factor in vertebrate evolution. Further evolutionary studies will help to relate individual CNC to specific developmental processes.
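
The abstract defines persistence length only in words, so the sketch below is one possible reading rather than the authors' estimator: slide a fixed window along a pairwise alignment and report the longest stretch of columns in which every window stays at or above an identity threshold. The window size, the threshold and the function name are assumptions chosen for illustration.

```python
def persistence_length(seq_a, seq_b, window=10, min_identity=0.8):
    """Longest run of alignment columns over which every sliding window
    keeps at least `min_identity` identical positions.

    `seq_a` and `seq_b` are two rows of a pairwise alignment (same length,
    '-' for gaps). Gap columns count as mismatches.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    matches = [a == b and a != "-" for a, b in zip(seq_a, seq_b)]
    best = 0
    run = 0  # number of consecutive windows that pass the threshold
    for start in range(len(matches) - window + 1):
        identity = sum(matches[start:start + window]) / window
        if identity >= min_identity:
            run += 1
            # `run` consecutive windows cover run + window - 1 columns
            best = max(best, run + window - 1)
        else:
            run = 0
    return best

# Toy alignment: six matching columns, three mismatches, three matches.
length = persistence_length("AAAAAAAAAAAA", "AAAAAATTTAAA",
                            window=4, min_identity=0.75)
```

With these toy inputs the conserved stretch extends one column into the mismatched block, because a window with three matches out of four still meets the 0.75 threshold.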


Subject(s)
Intergenic DNA/genetics , Molecular Evolution , Genome/genetics , Vertebrates/genetics , Animals , Conserved Sequence , Drosophila/genetics , Insect Genome/genetics , Humans , Time Factors
13.
BMC Genomics ; 8: 129, 2007 May 23.
Article in English | MEDLINE | ID: mdl-17521433

ABSTRACT

BACKGROUND: Cancer/testis (CT) genes are normally expressed only in germ cells, but can be activated in the cancer state. This unusual property, together with the finding that many CT proteins elicit an antigenic response in cancer patients, has established a role for this class of genes as targets in immunotherapy regimes. Many families of CT genes have been identified in the human genome, but their biological function for the most part remains unclear. While it has been shown that some CT genes are under diversifying selection, this question has not been addressed before for the class as a whole. RESULTS: To shed more light on this interesting group of genes, we exploited the generation of a draft chimpanzee (Pan troglodytes) genomic sequence to examine CT genes in an organism that is closely related to human, and generated a high-quality, manually curated set of human:chimpanzee CT gene alignments. We find that the chimpanzee genome contains homologues to most of the human CT families, and that the genes are located on the same chromosome and at a similar copy number to those in human. Comparison of putative human:chimpanzee orthologues indicates that CT genes located on chromosome X are diverging faster and are undergoing stronger diversifying selection than those on the autosomes or than a set of control genes on either chromosome X or autosomes. CONCLUSION: Given their high level of diversifying selection, we suggest that CT genes are primarily responsible for the observed rapid evolution of protein-coding genes on the X chromosome.


Subject(s)
Human X Chromosomes/genetics , Neoplasm Genes/genetics , Animals , Molecular Evolution , Expressed Sequence Tags , Female , Neoplastic Gene Expression Regulation , Human Genome , Humans , Immunotherapy , Male , Pan troglodytes , Polymerase Chain Reaction , Sequence Alignment , Testis
14.
BMC Genomics ; 7: 176, 2006 Jul 12.
Article in English | MEDLINE | ID: mdl-16836751

ABSTRACT

BACKGROUND: Cleavage of messenger RNA (mRNA) precursors is an essential step in mRNA maturation. The signal recognized by the cleavage enzyme complex has been characterized as an A-rich region upstream of the cleavage site containing a motif with consensus AAUAAA, followed by a U- or UG-rich region downstream of the cleavage site. RESULTS: We studied these signals using exhaustive databases of cleavage sites obtained from aligning raw expressed sequence tag (EST) sequences to genomic sequences in Homo sapiens and Drosophila melanogaster. These data show that the polyadenylation signal is highly conserved in human and fly. In addition, de novo motif searches generated a refined description of the U-rich downstream sequence element (DSE), which shows more divergence between the two species. These refined motifs are applied, within a Hidden Markov Model (HMM) framework, to predict mRNA cleavage sites. CONCLUSION: We demonstrate that the DSE is a specific motif in both human and Drosophila. These findings shed light on the sequence correlates of a highly conserved biological process, and improve in silico prediction of 3' mRNA cleavage and polyadenylation sites.
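
The paper predicts cleavage sites with a Hidden Markov Model, which the abstract does not detail. As a much simpler stand-in, the sketch below only illustrates the two signals the abstract names: it scans for the AAUAAA hexamer and keeps hits whose downstream region is U-rich. The fixed gap between hexamer and putative cleavage site, the downstream window length and the U-fraction threshold are assumptions picked for illustration, not parameters from the study.

```python
def candidate_cleavage_sites(rna, gap=15, dse_len=20, min_u=0.5):
    """Return (hexamer_start, putative_cleavage_pos, u_fraction) tuples.

    A hit requires an AAUAAA hexamer and, `gap` nucleotides after it,
    a window of `dse_len` nucleotides whose U content is at least `min_u`.
    DNA input is accepted: T is converted to U.
    """
    rna = rna.upper().replace("T", "U")
    hits = []
    start = rna.find("AAUAAA")
    while start != -1:
        cleavage = start + 6 + gap          # assumed fixed spacing
        downstream = rna[cleavage:cleavage + dse_len]
        if len(downstream) == dse_len:      # skip hits too close to the 3' end
            u_fraction = downstream.count("U") / dse_len
            if u_fraction >= min_u:
                hits.append((start, cleavage, u_fraction))
        start = rna.find("AAUAAA", start + 1)
    return hits

# Toy sequence (hypothetical): hexamer, 15-nt spacer, U/GU-rich stretch.
toy = "GGGG" + "AAUAAA" + "C" * 15 + "UGUUUGUUUU" + "GCGC"
sites = candidate_cleavage_sites(toy, gap=15, dse_len=10, min_u=0.5)
```

A probabilistic model such as the HMM used in the paper would score variable spacing and degenerate motifs instead of relying on hard thresholds.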


Subject(s)
Drosophila melanogaster/genetics , Poly A/genetics , Polyadenylation/genetics , 3' Untranslated Regions/genetics , Animals , Base Composition/genetics , Base Sequence , Expressed Sequence Tags , Humans , Genetic Models , Post-Transcriptional RNA Processing , Messenger RNA/genetics
15.
Breast Cancer Res ; 8(5): R56, 2006.
Article in English | MEDLINE | ID: mdl-17014703

ABSTRACT

INTRODUCTION: Diverse microarray and sequencing technologies have been widely used to characterise the molecular changes in malignant epithelial cells in breast cancers. Such gene expression studies to identify markers and targets in tumour cells are, however, compromised by the cellular heterogeneity of solid breast tumours and by the lack of appropriate counterparts representing normal breast epithelial cells. METHODS: Malignant neoplastic epithelial cells from primary breast cancers and luminal and myoepithelial cells isolated from normal human breast tissue were isolated by immunomagnetic separation methods. Pools of RNA from highly enriched preparations of these cell types were subjected to expression profiling using massively parallel signature sequencing (MPSS) and four different genome-wide microarray platforms. Functionally related transcripts of the differential tumour epithelial transcriptome were used for gene set enrichment analysis to identify enrichment of luminal and myoepithelial type genes. Clinical pathological validation of a small number of genes was performed on tissue microarrays. RESULTS: MPSS identified 6,553 differentially expressed genes between the pool of normal luminal cells and that of primary tumours substantially enriched for epithelial cells, of which 98% were represented and 60% were confirmed by microarray profiling. Significant expression level changes between these two samples detected only by microarray technology were shown by 4,149 transcripts, resulting in a combined differential tumour epithelial transcriptome of 8,051 genes. Microarray gene signatures identified a comprehensive list of 907 and 955 transcripts whose expression differed between luminal epithelial cells and myoepithelial cells, respectively. Functional annotation and gene set enrichment analysis highlighted a group of genes related to skeletal development that were associated with the myoepithelial/basal cells and upregulated in the tumour sample. One of the most highly overexpressed genes in this category, that encoding periostin, was analysed immunohistochemically on breast cancer tissue microarrays and its expression in neoplastic cells correlated with poor outcome in a cohort of poor-prognosis estrogen receptor-positive tumours. CONCLUSION: Using highly enriched cell populations in combination with multiplatform gene expression profiling studies, a comprehensive analysis of molecular changes between the normal and malignant breast tissue was established. This study provides a basis for the identification of novel and potentially important targets for diagnosis, prognosis and therapy in breast cancer.


Subject(s)
Breast Neoplasms/genetics , Cell Adhesion Molecules/genetics , Gene Expression Profiling , Oligonucleotide Array Sequence Analysis , Tumor Biomarkers/analysis , Breast , Cultured Cells , Epithelial Cells , Female , Humans , Prognosis , Genetic Transcription , Cultured Tumor Cells
16.
Nucleic Acids Res ; 31(13): 3782-3, 2003 Jul 01.
Article in English | MEDLINE | ID: mdl-12824417

ABSTRACT

EMBnet is a consortium of collaborating bioinformatics groups located mainly within Europe (http://www.embnet.org). Each member country is represented by a 'node', a group responsible for the maintenance of local services for their users (e.g. education, training, software, database distribution, technical support, helpdesk). Among these services a web portal with links and access to locally developed and maintained software is essential and different for each node. Our web portal targets biomedical scientists in Switzerland and elsewhere, offering them access to a collection of important sequence analysis tools mirrored from other sites or developed locally. We describe here the Swiss EMBnet node web site (http://www.ch.embnet.org), which presents a number of original services not available anywhere else.


Subject(s)
Protein Sequence Analysis/methods , Software , Protein Databases , Europe , Internet , Sequence Alignment
17.
Nucleic Acids Res ; 32(Web Server issue): W332-5, 2004 Jul 01.
Article in English | MEDLINE | ID: mdl-15215405

ABSTRACT

The MyHits web server (http://myhits.isb-sib.ch) is a new integrated service dedicated to the annotation of protein sequences and to the analysis of their domains and signatures. Guest users can use the system anonymously, with full access to (i) standard bioinformatics programs (e.g. PSI-BLAST, ClustalW, T-Coffee, Jalview); (ii) a large number of protein sequence databases, including standard (Swiss-Prot, TrEMBL) and locally developed databases (splice variants); (iii) databases of protein motifs (Prosite, Interpro); (iv) a precomputed list of matches ('hits') between the sequence and motif databases. All databases are updated on a weekly basis and the hit list is kept up to date incrementally. The MyHits server also includes a new collection of tools to generate graphical representations of pairwise and multiple sequence alignments including their annotated features. Free registration enables users to upload their own sequences and motifs to private databases. These are then made available through the same web interface and the same set of analytical tools. Registered users can manage their own sequences and annotations using only web tools and freeze their data in their private database for publication purposes.


Subject(s)
Tertiary Protein Structure , Protein Sequence Analysis , Software , Computer Graphics , Protein Databases , Internet , Sequence Alignment , Systems Integration , User-Computer Interface
18.
Nucleic Acids Res ; 32(Database issue): D509-11, 2004 Jan 01.
Article in English | MEDLINE | ID: mdl-14681469

ABSTRACT

We previously introduced two new protein databases (trEST and trGEN) of hypothetical protein sequences predicted from EST and HTG sequences, respectively. Here, we present the updates made on these two databases plus a new database (trome), which uses alignments of EST data to HTG or full genomes to generate virtual transcripts and coding sequences. This new database is of higher quality and since it contains the information in a much denser format it is of much smaller size. These new databases are in a Swiss-Prot-like format and are updated on a weekly basis (trEST and trGEN) or every 3 months (trome). They can be downloaded by anonymous ftp from ftp://ftp.isrec.isb-sib.ch/pub/databases.


Subject(s)
Genetic Databases , Expressed Sequence Tags , Proteins/chemistry , Proteins/genetics , Animals , Computational Biology , Exons , Genomics , Humans , Information Storage and Retrieval , Internet , Genetic Transcription
19.
Cancer Immun ; 5: 9, 2005 Jul 07.
Article in English | MEDLINE | ID: mdl-15999985

ABSTRACT

Transcripts with ESTs derived exclusively or predominantly from testis, and not from other normal tissues, are likely to be products of genes with testis-restricted expression, and are thus potential cancer/testis (CT) antigen genes. A list of 371 genes with such characteristics was compiled by analyzing publicly available EST databases. RT-PCR analysis of normal and tumor tissues was performed to validate an initial selection of 20 of these genes. Several new CT and CT-like genes were identified. One of these, CT46/HORMAD1, is expressed strongly in testis and weakly in placenta; the highest level of expression in other tissues is <1% of testicular expression. The CT46/HORMAD1 gene was expressed in 31% (34/109) of the carcinomas examined, with 11% (12/109) showing expression levels >10% of the testicular level of expression. CT46/HORMAD1 is a single-copy gene on chromosome 1q21.3, encoding a putative protein of 394 aa. Conserved protein domain analysis identified a HORMA domain involved in chromatin binding. The CT46/HORMAD1 protein was found to be homologous to the prototype HORMA domain-containing protein, Hop1, a yeast meiosis-specific protein, as well as to asy1, a meiotic synaptic mutant protein in Arabidopsis thaliana.


Subject(s)
Neoplasm Antigens/genetics , Meiosis/immunology , Testis/immunology , Amino Acid Sequence , Animals , Neoplasm Antigens/chemistry , Base Sequence , Humans , Male , Molecular Sequence Data , Reverse Transcriptase Polymerase Chain Reaction
20.
Gene ; 310: 49-57, 2003 May 22.
Article in English | MEDLINE | ID: mdl-12801632

ABSTRACT

We applied a systematic bioinformatics approach, followed by careful manual inspection and experimental validation, to identify additional expressed sequences located at the Hereditary Prostate Cancer Region (HPC1) between D1S2818 and D1S1642 on chromosome 1q25. All transcripts already described for the 1q25 region were identified and we were able to define 11 additional expressed sequences within this region (three full-length cDNA clone sequences and eight ESTs), increasing the total gene count in this region by 38%. Five out of the 11 expressed sequences identified were shown to be expressed in prostate tissue and thus represent novel disease gene candidates for the HPC1 region. Here, we report a detailed characterization of these five novel disease gene candidates, their expression pattern in various tissues, their genomic organization and functional annotation. Two candidates (RGSL1 and RGSL2) correspond to novel members of the RGS family, which is involved in the regulation of G-protein signaling. RGSL1 and RGSL2 expression was detected by real-time polymerase chain reaction in normal prostate tissue, but could not be detected in prostate tumor cell lines, suggesting they might have a role in prostate cancer.


Subject(s)
Human Chromosome Pair 1/genetics , Prostatic Neoplasms/genetics , Proteins/genetics , RGS Proteins/genetics , Chromosome Mapping , Complementary DNA/chemistry , Complementary DNA/genetics , Nucleic Acid Databases , Expressed Sequence Tags , Humans , Male , Microsatellite Repeats , Molecular Sequence Data , Prostatic Neoplasms/pathology , DNA Sequence Analysis , Genetic Transcription/genetics , Cultured Tumor Cells