1.
Pathology ; 56(5): 633-642, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38719771

ABSTRACT

Prostate and breast cancer incidence rates have been on the rise in Japan, emphasising the need for precise histopathological diagnosis to determine patient prognosis and guide treatment decisions. However, existing diagnostic methods face numerous challenges and are susceptible to inconsistencies between observers. To tackle these issues, artificial intelligence (AI) algorithms have been developed to aid in the diagnosis of prostate and breast cancer. This study focuses on validating the performance of two such algorithms, Galen Prostate and Galen Breast, in a Japanese cohort, with a particular focus on the grading accuracy and the ability to differentiate between invasive and non-invasive tumours. The research entailed a retrospective examination of 100 consecutive prostate and 100 consecutive breast biopsy cases obtained from a Japanese institution. Our findings demonstrated that the AI algorithms showed accurate cancer detection, with AUCs of 0.969 and 0.997 for the Galen Prostate and Galen Breast, respectively. The Galen Prostate was able to detect a higher Gleason score in four adenocarcinoma cases and detect a previously unreported cancer. The two algorithms successfully identified relevant pathological features, such as perineural invasions and lymphovascular invasions. Although further improvements are required to accurately differentiate rare cancer subtypes, these findings highlight the potential of these algorithms to enhance the precision and efficiency of prostate and breast cancer diagnosis in Japan. Furthermore, this validation paves the way for broader adoption of these algorithms as decision support tools within the Asian population.


Subject(s)
Algorithms; Artificial Intelligence; Breast Neoplasms; Neoplasm Grading; Prostatic Neoplasms; Humans; Retrospective Studies; Male; Breast Neoplasms/diagnosis; Breast Neoplasms/pathology; Prostatic Neoplasms/diagnosis; Prostatic Neoplasms/pathology; Female; Japan; Aged; Middle Aged; Aged, 80 and over; Adult; Adenocarcinoma/diagnosis; Adenocarcinoma/pathology; Cohort Studies; East Asian People
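
The AUC figures quoted in the abstract above can be read as the probability that a randomly chosen cancer case receives a higher algorithm score than a randomly chosen benign case. The following is a minimal sketch of that rank-based definition. The function name `roc_auc` and the toy labels and scores are assumptions made for illustration; they are not data or code from the study.

```python
def roc_auc(labels, scores):
    """Rank-based AUC: fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up case-level scores (illustrative only).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.95, 0.80, 0.40, 0.60, 0.20, 0.10]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

The pairwise loop is quadratic in the number of cases, which is acceptable for cohorts of a few hundred biopsies; a sort-based computation would be preferable for larger sets.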
2.
NPJ Breast Cancer ; 8(1): 129, 2022 Dec 06.
Article in English | MEDLINE | ID: mdl-36473870

ABSTRACT

Breast cancer is the most common malignant disease worldwide, with over 2.26 million new cases in 2020. Its diagnosis is determined by a histological review of breast biopsy specimens, which can be labor-intensive, subjective, and error-prone. Artificial Intelligence (AI)-based tools can support cancer detection and classification in breast biopsies ensuring rapid, accurate, and objective diagnosis. We present here the development, external clinical validation, and deployment in routine use of an AI-based quality control solution for breast biopsy review. The underlying AI algorithm is trained to identify 51 different types of clinical and morphological features, and it achieves very high accuracy in a large, multi-site validation study. Specifically, the area under the receiver operating characteristic curves (AUC) for the detection of invasive carcinoma and of ductal carcinoma in situ (DCIS) are 0.99 (specificity and sensitivity of 93.57 and 95.51%, respectively) and 0.98 (specificity and sensitivity of 93.79 and 93.20% respectively), respectively. The AI algorithm differentiates well between subtypes of invasive and different grades of in situ carcinomas with an AUC of 0.97 for invasive ductal carcinoma (IDC) vs. invasive lobular carcinoma (ILC) and AUC of 0.92 for DCIS high grade vs. low grade/atypical ductal hyperplasia, respectively, as well as accurately identifies stromal tumor-infiltrating lymphocytes (TILs) with an AUC of 0.965. Deployment of this AI solution as a real-time quality control solution in clinical routine leads to the identification of cancers initially missed by the reviewing pathologist, demonstrating both clinical utility and accuracy in real-world clinical application.
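
The sensitivity and specificity values reported alongside each AUC in this abstract depend on a decision threshold applied to the algorithm's score. The sketch below shows how the two quantities follow from a thresholded score. The threshold, the function name `sensitivity_specificity` and the toy data are assumptions for illustration; the abstract does not state how the operating point was chosen.

```python
def sensitivity_specificity(labels, scores, threshold):
    """Return (sensitivity, specificity) when scores >= threshold are called positive."""
    tp = fn = tn = fp = 0
    for y, s in zip(labels, scores):
        called_positive = s >= threshold
        if y == 1 and called_positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif called_positive:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


# Made-up scores (illustrative only).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.4, 0.2, 0.1]
sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)
print(sens, spec)
```

Lowering the threshold trades specificity for sensitivity, which is why a single AUC can correspond to many different sensitivity and specificity pairs.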

3.
Lancet Digit Health ; 2(8): e407-e416, 2020 Aug.
Article in English | MEDLINE | ID: mdl-33328045

ABSTRACT

BACKGROUND: There is high demand to develop computer-assisted diagnostic tools to evaluate prostate core needle biopsies (CNBs), but little clinical validation and a lack of clinical deployment of such tools. We report here on a blinded clinical validation study and deployment of an artificial intelligence (AI)-based algorithm in a pathology laboratory for routine clinical use to aid prostate diagnosis. METHODS: An AI-based algorithm was developed using haematoxylin and eosin (H&E)-stained slides of prostate CNBs digitised with a Philips scanner, which were divided into training (1 357 480 image patches from 549 H&E-stained slides) and internal test (2501 H&E-stained slides) datasets. The algorithm provided slide-level scores for probability of cancer, Gleason score 7-10 (vs Gleason score 6 or atypical small acinar proliferation [ASAP]), Gleason pattern 5, and perineural invasion and calculation of cancer percentage present in CNB material. The algorithm was subsequently validated on an external dataset of 100 consecutive cases (1627 H&E-stained slides) digitised on an Aperio AT2 scanner. In addition, the AI tool was implemented in a pathology laboratory within routine clinical workflow as a second read system to review all prostate CNBs. Algorithm performance was assessed with area under the receiver operating characteristic curve (AUC), specificity, and sensitivity, as well as Pearson's correlation coefficient (Pearson's r) for cancer percentage. FINDINGS: The algorithm achieved an AUC of 0·997 (95% CI 0·995 to 0·998) for cancer detection in the internal test set and 0·991 (0·979 to 1·00) in the external validation set. The AUC for distinguishing between a low-grade (Gleason score 6 or ASAP) and high-grade (Gleason score 7-10) cancer diagnosis was 0·941 (0·905 to 0·977) and the AUC for detecting Gleason pattern 5 was 0·971 (0·943 to 0·998) in the external validation set. 
Cancer percentage calculated by pathologists and the algorithm showed good agreement (r=0·882, 95% CI 0·834 to 0·915; p<0·0001) with a mean bias of -4·14% (-6·36 to -1·91). The algorithm achieved an AUC of 0·957 (0·930 to 0·985) for perineural invasion. In routine practice, the algorithm was used to assess 11 429 H&E-stained slides pertaining to 941 cases leading to 90 Gleason score 7-10 alerts and 560 cancer alerts. 51 (9%) cancer alerts led to additional cuts or stains being ordered, two (4%) of which led to a third opinion request. We report on the first case of missed cancer that was detected by the algorithm. INTERPRETATION: This study reports the successful development, external clinical validation, and deployment in clinical practice of an AI-based algorithm to accurately detect, grade, and evaluate clinically relevant findings in digitised slides of prostate CNBs. FUNDING: Ibex Medical Analytics.


Subject(s)
Artificial Intelligence; Biopsy, Large-Core Needle; Image Interpretation, Computer-Assisted; Neoplasm Grading; Prostate/pathology; Prostatic Neoplasms/diagnosis; Adult; Aged; Aged, 80 and over; Algorithms; Area Under Curve; Data Analysis; Humans; Male; Microscopy; Middle Aged; Pathologists; Pathology, Clinical/methods; Prostatic Neoplasms/pathology; ROC Curve
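
The agreement between pathologist and algorithm estimates of cancer percentage is summarised in the abstract by Pearson's r and a mean bias. The sketch below computes both for paired measurements. The sign convention (algorithm minus pathologist) is an assumption, since the abstract does not state the direction of the difference, and the paired values are made up for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mean_bias(reference, estimate):
    """Mean of (estimate - reference); assumed sign convention."""
    return sum(e - r for r, e in zip(reference, estimate)) / len(reference)


# Made-up cancer percentages per case (illustrative only).
pathologist = [10, 20, 30, 40]
algorithm = [8, 18, 27, 37]
r = pearson_r(pathologist, algorithm)
bias = mean_bias(pathologist, algorithm)
print(round(r, 4), bias)
```

A high r together with a non-zero bias, as in this toy example, indicates that the two readers rank cases similarly while one of them reads systematically lower.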
4.
J Mol Biol ; 431(13): 2398-2406, 2019 Jun 14.
Article in English | MEDLINE | ID: mdl-31100387

ABSTRACT

Genome-wide analysis of cellular transcriptomes using RNA-seq or expression arrays is a mainstay of current biological and biomedical research. EXPANDER (EXPression ANalyzer and DisplayER) is a comprehensive software package for analysis of expression data, with built-in support for 18 different organisms. It is designed as a "one-stop shop" platform for transcriptomic analysis, allowing for execution of all analysis steps starting with a gene expression data matrix. Analyses offered include low-level preprocessing and normalization, differential expression analysis, clustering, bi-clustering, supervised grouping, high-level functional and pathway enrichment tests, and network and motif analyses. A variety of options is offered for each step, using established algorithms, including many developed and published by our laboratory. EXPANDER has been continuously developed since 2003 and to date has over 18,000 downloads and 540 citations. One of the innovations in the recent version is support for combined analysis of gene expression and ChIP-seq data to enhance the inference of transcriptional networks and their functional interpretation. EXPANDER implements cutting-edge algorithms and makes them accessible to users through a user-friendly interface and intuitive visualizations. It is freely available to users at http://acgt.cs.tau.ac.il/expander/.


Subject(s)
Computational Biology/methods; Gene Expression Profiling/methods; Animals; Cluster Analysis; Gene Expression Regulation; Humans; Internet; Oligonucleotide Array Sequence Analysis; Sequence Analysis, RNA; Software
5.
PLoS One ; 7(9): e46145, 2012.
Article in English | MEDLINE | ID: mdl-23029415

ABSTRACT

The new technology of protein binding microarrays (PBMs) allows simultaneous measurement of the binding intensities of a transcription factor to tens of thousands of synthetic double-stranded DNA probes, covering all possible 10-mers. A key computational challenge is inferring the binding motif from these data. We present a systematic comparison of four methods developed specifically for reconstructing a binding site motif represented as a positional weight matrix from PBM data. The reconstructed motifs were evaluated in terms of three criteria: concordance with reference motifs from the literature and ability to predict in vivo and in vitro bindings. The evaluation encompassed over 200 transcription factors and some 300 assays. The results show a tradeoff between how the methods perform according to the different criteria, and a dichotomy of method types. Algorithms that construct motifs with low information content predict PBM probe ranking more faithfully, while methods that produce highly informative motifs match reference motifs better. Interestingly, in predicting high-affinity binding, all methods give far poorer results for in vivo assays compared to in vitro assays.


Subject(s)
Algorithms; DNA Probes/metabolism; Protein Array Analysis/methods; Transcription Factors/metabolism; Animals; Base Sequence; Binding Sites; DNA Probes/chemistry; Mice; Protein Binding
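
The abstract above contrasts motifs of low and high information content and their use in ranking PBM probes. As a rough illustration of both notions, the sketch below computes the information content of a positional weight matrix and scores a probe by its best-matching window. The uniform background, the probability floor, the log-odds scoring rule and the toy matrix are all assumptions chosen for illustration; none of them is taken from the four methods the paper compares.

```python
import math


def information_content(pwm):
    """Total information content in bits, against a uniform background
    (2 bits minus the Shannon entropy of each column)."""
    total = 0.0
    for column in pwm:
        entropy = -sum(p * math.log2(p) for p in column.values() if p > 0)
        total += 2.0 - entropy
    return total


def log_odds(pwm, kmer, background=0.25, floor=1e-3):
    """Sum of per-position log2(p / background); zero probabilities are floored."""
    return sum(
        math.log2(max(column.get(base, 0.0), floor) / background)
        for column, base in zip(pwm, kmer)
    )


def best_window_score(pwm, probe):
    """Score a probe by its highest-scoring window of motif width."""
    width = len(pwm)
    if len(probe) < width:
        raise ValueError("probe is shorter than the motif")
    return max(
        log_odds(pwm, probe[i:i + width])
        for i in range(len(probe) - width + 1)
    )


# Toy 3-column matrix (illustrative only).
toy_pwm = [
    {"A": 1.0, "C": 0.0, "G": 0.0, "T": 0.0},
    {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    {"A": 0.5, "C": 0.5, "G": 0.0, "T": 0.0},
]
ic = information_content(toy_pwm)
best = best_window_score(toy_pwm, "GGACA")
print(ic, best)
```

Sorting probes by `best_window_score` gives a predicted ranking that could be compared with measured binding intensities; only the forward strand is scanned here.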
6.
Genome Res ; 22(1): 76-83, 2012 Jan.
Article in English | MEDLINE | ID: mdl-21930893

ABSTRACT

In this study we report on a novel pair of cis-regulatory motifs in promoter sequences of the nematode Caenorhabditis elegans. The motif pair exhibits extraordinary genomic traits: the order and the orientation of the two motifs are highly specific, and the distance between them is almost always one of two frequent distances. In contrast, the sequence between the motifs is variable across occurrences. Thus, the motif pair constitutes a nearly combinatorial sequence configuration. We further show that this module is conserved among, and unique to, the entire Caenorhabditis genus. Analysis of several gene expression data sets suggests that this motif pair may function in germline development, oogenesis, and early embryogenesis. Finally, we verify that the motifs are indeed functional cis-regulatory elements using reporter constructs in transgenic C. elegans.


Subject(s)
Caenorhabditis elegans/metabolism; Gene Expression Regulation/physiology; Germ Cells/physiology; Oogenesis/physiology; Quantitative Trait Loci/physiology; Response Elements/physiology; Animals; Caenorhabditis elegans/genetics
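
The observation that the two motifs occur in a fixed order and at one of two preferred distances can be checked by tabulating spacer lengths across promoters. The sketch below does this for exact string matches on one strand. The motif strings, the `max_gap` cutoff and the toy promoters are hypothetical placeholders, not the motifs or sequences from the study, and orientation on the reverse strand is not handled.

```python
from collections import Counter


def find_all(seq, motif):
    """Start positions of all (possibly overlapping) exact matches."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


def spacer_lengths(promoters, motif_a, motif_b, max_gap=50):
    """For each motif_a occurrence, record the spacer length to the nearest
    downstream motif_b occurrence within max_gap bases."""
    counts = Counter()
    for seq in promoters:
        b_positions = find_all(seq, motif_b)
        for a_start in find_all(seq, motif_a):
            a_end = a_start + len(motif_a)
            gaps = [b - a_end for b in b_positions if 0 <= b - a_end <= max_gap]
            if gaps:
                counts[min(gaps)] += 1
    return counts


# Hypothetical motifs and toy promoters (illustrative only).
promoters = ["AAGATAAGTTTCCGTAA", "GATAAGACGTACCCGT", "CGATAAGAAACCGT"]
counts = spacer_lengths(promoters, "GATAAG", "CCGT")
print(counts)
```

A distribution concentrated on one or two spacer lengths, with variable spacer sequence, would correspond to the configuration the abstract describes.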
7.
Genome Biol ; 12(6): R61, 2011 Jun 29.
Article in English | MEDLINE | ID: mdl-21714908

ABSTRACT

BACKGROUND: Chromosomal aneuploidy, that is to say the gain or loss of chromosomes, is the most common abnormality in cancer. While certain aberrations, most commonly translocations, are known to be strongly associated with specific cancers and contribute to their formation, most aberrations appear to be non-specific and arbitrary, and do not have a clear effect. The understanding of chromosomal aneuploidy and its role in tumorigenesis is a fundamental open problem in cancer biology. RESULTS: We report on a systematic study of the characteristics of chromosomal aberrations in cancers, using over 15,000 karyotypes and 62 cancer classes in the Mitelman Database. Remarkably, we discovered a very high co-occurrence rate of chromosome gains with other chromosome gains, and of losses with losses. Gains and losses rarely show significant co-occurrence. This finding was consistent across cancer classes and was confirmed on an independent comparative genomic hybridization dataset of cancer samples. The results of our analysis are available for further investigation via an accompanying website. CONCLUSIONS: The broad generality and the intricate characteristics of the dichotomy of aneuploidy, ranging across numerous tumor classes, are revealed here rigorously for the first time using statistical analyses of large-scale datasets. Our finding suggests that aneuploid cancer cells may use extra chromosome gain or loss events to restore a balance in their altered protein ratios, needed for maintaining their cellular fitness.


Subject(s)
Aneuploidy; Chromosome Aberrations; Karyotype; Neoplasms/genetics; Cluster Analysis; Data Mining; Humans; Internet; Neoplasms/classification; User-Computer Interface
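
The co-occurrence pattern reported above (gains with gains, losses with losses) can be illustrated by comparing how often two aberrations appear in the same karyotype with how often they would if they were independent. The sketch below computes that observed-to-expected ratio. The ratio itself, the encoding of karyotypes as sets of strings such as "+7", and the toy data are assumptions for illustration; the abstract does not specify the statistic the authors used.

```python
def cooccurrence_ratio(karyotypes, event_a, event_b):
    """Observed joint count divided by the count expected under independence.
    Values above 1 suggest co-occurrence, values below 1 suggest avoidance."""
    n = len(karyotypes)
    n_a = sum(event_a in k for k in karyotypes)
    n_b = sum(event_b in k for k in karyotypes)
    n_ab = sum(event_a in k and event_b in k for k in karyotypes)
    expected = n_a * n_b / n
    if expected == 0:
        return float("nan")
    return n_ab / expected


# Made-up karyotypes: each is a set of whole-chromosome gains (+) and losses (-).
karyotypes = [
    {"+7", "+8"},
    {"+7", "+8"},
    {"-13", "-17"},
    {"-13", "-17"},
    {"+7", "+8", "-13"},
    {"-17"},
    {"+8"},
    {"-13"},
]
gain_gain = cooccurrence_ratio(karyotypes, "+7", "+8")
gain_loss = cooccurrence_ratio(karyotypes, "+7", "-17")
print(gain_gain, gain_loss)
```

On a real dataset a significance test and correction for multiple comparisons would be needed before drawing conclusions from such ratios.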
8.
Mol Oncol ; 5(4): 336-48, 2011 Aug.
Article in English | MEDLINE | ID: mdl-21795128

ABSTRACT

The cellular response to DNA damage is vital for maintaining genomic stability and preventing undue cell death or cancer formation. The DNA damage response (DDR), most robustly mobilized by double-strand breaks (DSBs), rapidly activates an extensive signaling network that affects numerous cellular systems, leading to cell survival or programmed cell death. A major component of the DDR is the widespread modulation of gene expression. We analyzed together six datasets that probed transcriptional responses to ionizing radiation (IR) - our novel experimental data and 5 published datasets - to elucidate the scope of this response and identify its gene targets. According to the mRNA expression profiles we recorded from 5 cancerous and non-cancerous human cell lines after exposure to 5 Gy of IR, most of the responses were cell line-specific. Computational analysis identified significant enrichment for p53 target genes and cell cycle-related pathways among groups of up-regulated and down-regulated genes, respectively. Computational promoter analysis of the six datasets disclosed that a statistically significant number of the induced genes contained p53 binding site signatures. p53-mediated regulation had previously been documented for subsets of these gene groups, making our lists a source of novel potential p53 targets. Real-time qPCR and chromatin immunoprecipitation (ChIP) assays validated the IR-induced p53-dependent induction and p53 binding to the respective promoters of 11 selected genes. Our results demonstrate the power of a combined computational and experimental approach to identify new transcriptional targets in the DNA damage response network.


Subject(s)
DNA Damage/radiation effects; Gene Expression Regulation/radiation effects; Radiation, Ionizing; Transcription, Genetic/radiation effects; Tumor Suppressor Protein p53/metabolism; Cell Line, Tumor; Databases, Genetic; Gene Expression Profiling; Gene Regulatory Networks; Humans; Meta-Analysis as Topic; Oligonucleotide Array Sequence Analysis; Promoter Regions, Genetic; Signal Transduction/physiology; Tumor Suppressor Protein p53/genetics
9.
Nat Protoc ; 5(2): 303-22, 2010 Feb.
Article in English | MEDLINE | ID: mdl-20134430

ABSTRACT

A major challenge in the analysis of gene expression microarray data is to extract meaningful biological knowledge out of the huge volume of raw data. Expander (EXPression ANalyzer and DisplayER) is an integrated software platform for the analysis of gene expression data, which is freely available for academic use. It is designed to support all the stages of microarray data analysis, from raw data normalization to inference of transcriptional regulatory networks. The microarray analysis described in this protocol starts with importing the data into Expander 5.0 and is followed by normalization and filtering. Then, clustering and network-based analyses are performed. The gene groups identified are tested for enrichment in function (based on Gene Ontology), co-regulation (using transcription factor and microRNA target predictions) or co-location. The results of each analysis step can be visualized in a number of ways. The complete protocol can be executed in approximately 1 h.


Subject(s)
Gene Expression Regulation; Oligonucleotide Array Sequence Analysis; Algorithms; Animals; Chromosome Mapping; Escherichia coli/genetics; Gene Expression; Genetic Techniques; Humans; Mice; MicroRNAs/genetics; Multigene Family/genetics; Plants/genetics; Promoter Regions, Genetic; Rats; Saccharomyces cerevisiae/genetics; Software; Transcription Factors/genetics
10.
Nucleic Acids Res ; 37(5): 1566-79, 2009 Apr.
Article in English | MEDLINE | ID: mdl-19151090

ABSTRACT

A major goal of systems biology is the characterization of transcription factors and microRNAs (miRNAs) and the transcriptional programs they regulate. We present Allegro, a method for de-novo discovery of cis-regulatory transcriptional programs through joint analysis of genome-wide expression data and promoter or 3' UTR sequences. The algorithm uses a novel log-likelihood-based, non-parametric model to describe the expression pattern shared by a group of co-regulated genes. We show that Allegro is more accurate and sensitive than existing techniques, and can simultaneously analyze multiple expression datasets with more than 100 conditions. We apply Allegro to datasets from several species and report on the transcriptional modules it uncovers. Our analysis reveals a novel motif over-represented in the promoters of genes highly expressed in murine oocytes, and several new motifs related to fly development. Finally, using stem-cell expression profiles, we identify three miRNA families with pivotal roles in human embryogenesis.


Subject(s)
3' Untranslated Regions/chemistry; Algorithms; Gene Expression Profiling; Gene Expression Regulation; Promoter Regions, Genetic; Animals; Cell Cycle/genetics; Humans; Mice; MicroRNAs/metabolism; Mitogen-Activated Protein Kinases/metabolism; Saccharomyces cerevisiae/genetics; Saccharomyces cerevisiae/metabolism; Saccharomyces cerevisiae Proteins/metabolism; Sequence Analysis, DNA; Sequence Analysis, RNA; Software; Stem Cells/metabolism; Transcription Factors/metabolism
11.
Genome Res ; 18(7): 1180-9, 2008 Jul.
Article in English | MEDLINE | ID: mdl-18411406

ABSTRACT

We present a threefold contribution to the computational task of motif discovery, a key component in the effort of delineating the regulatory map of a genome: (1) We constructed a comprehensive large-scale, publicly-available compendium of transcription factor and microRNA target gene sets derived from diverse high-throughput experiments in several metazoans. We used the compendium as a benchmark for motif discovery tools. (2) We developed Amadeus, a highly efficient, user-friendly software platform for genome-scale detection of novel motifs, applicable to a wide range of motif discovery tasks. Amadeus improves upon extant tools in terms of accuracy, running time, output information, and ease of use and is the only program that attained a high success rate on the metazoan compendium. (3) We demonstrate that by searching for motifs based on their genome-wide localization or chromosomal distributions (without using a predefined target set), Amadeus uncovers diverse known phenomena, as well as novel regulatory motifs.


Subject(s)
Amino Acid Motifs/genetics; MicroRNAs/genetics; Sequence Analysis, Protein; Sequence Analysis, RNA; Software; Transcription Factors/genetics; Algorithms; Animals; Binding Sites/genetics; Computational Biology/methods; Humans; Mice; Protein Structure, Tertiary; Sequence Alignment; Transcription Factors/metabolism
12.
BMC Genomics ; 8: 394, 2007 Oct 29.
Article in English | MEDLINE | ID: mdl-17967192

ABSTRACT

BACKGROUND: The innate immune system is the first line of defense mechanisms protecting the host from invading pathogens such as bacteria and viruses. The innate immunity responses are triggered by recognition of prototypical pathogen components by cellular receptors. Prominent among these pathogen sensors are Toll-like receptors (TLRs). We sought global delineation of transcriptional networks induced by TLRs, analyzing four genome-wide expression datasets in mouse and human macrophages stimulated with pathogen-mimetic agents that engage various TLRs. RESULTS: Combining computational analysis of expression profiles and cis-regulatory promoter sequences, we dissected the TLR-induced transcriptional program into two major components: the first is universally activated by all examined TLRs, and the second is specific to activated TLR3 and TLR4. Our results point to NF-kappaB and ISRE-binding transcription factors as the key regulators of the universal and the TLR3/4-specific responses, respectively, and identify novel putative positive and negative feedback loops in these transcriptional programs. Analysis of the kinetics of the induced network showed that while NF-kappaB regulates mainly an early-induced and sustained response, the ISRE element functions primarily in the induction of a delayed wave. We further demonstrate that co-occurrence of the NF-kappaB and ISRE elements in the same promoter endows its targets with enhanced responsiveness. CONCLUSION: Our results enhance system-level understanding of the networks induced by TLRs and demonstrate the power of genomics approaches to delineate intricate transcriptional webs in mammalian systems. Such systems-level knowledge of the TLR network can be useful for designing ways to pharmacologically manipulate the activity of the innate immunity in pathological conditions in which either enhancement or repression of this branch of the immune system is desired.


Subject(s)
Genome; Toll-Like Receptors/genetics; Transcription, Genetic; Animals; Humans; Immunity, Innate/genetics; Mice; Oligonucleotide Array Sequence Analysis
13.
Methods Mol Biol ; 402: 221-44, 2007.
Article in English | MEDLINE | ID: mdl-17951798

ABSTRACT

A polymerase chain reaction (PCR) primer sequence is called degenerate if some of its positions have several possible bases. The degeneracy of the primer is the number of unique sequence combinations it contains. We study the problem of designing a pair of primers with prescribed degeneracy that match a maximum number of given input sequences. Such problems occur, for example, when studying a family of genes that is known only in part or is known in a related species. We discuss the complexity of several versions of the problem and give approximation algorithms for one simplified variant. On the basis of these algorithms, we developed a program called HYDEN for designing highly degenerate primers for a set of genomic sequences. We describe HYDEN, and report on its success in several applications for identifying olfactory receptor genes in mammals.


Subject(s)
DNA Primers/chemistry; Genome; Polymerase Chain Reaction; Sequence Analysis, DNA; Smell/genetics; Software; Animals; DNA Primers/genetics; Mammals/genetics
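
The definition of degeneracy given in the abstract above (the number of unique sequence combinations a primer contains) is the product of the number of bases allowed at each position. The sketch below computes it for primers written in the standard IUPAC nucleotide code and, for small cases, enumerates the concrete sequences. The function names are illustrative and unrelated to the HYDEN program itself.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    result = 1
    for symbol in primer.upper():
        result *= len(IUPAC[symbol])
    return result


def expand(primer):
    """Enumerate every concrete sequence; only practical for low degeneracy."""
    choices = [IUPAC[symbol] for symbol in primer.upper()]
    return ["".join(bases) for bases in product(*choices)]


d = degeneracy("ACRYN")
variants = expand("AR")
print(d, variants)
```

Because degeneracy multiplies across positions, it grows exponentially with the number of ambiguous positions, which is why enumeration is avoided for highly degenerate primers.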
14.
Cell Cycle ; 4(12): 1788-97, 2005 Dec.
Article in English | MEDLINE | ID: mdl-16294034

ABSTRACT

Transcriptional regulation is a major tier in the periodic engine that mobilizes cell cycle progression. The availability of complete genome sequences of multiple organisms holds promise for significantly improving the specificity of computational identification of functional elements. Here, we applied a comparative genomics analysis to decipher transcriptional regulatory elements that control cell cycle phasing. We analyzed genome-wide promoter sequences from 12 organisms, including worm, fly, fish, rodents and human, and identified conserved transcriptional modules that determine the expression of genes in specific cell cycle phases. We demonstrate that a canonical E2F signal encodes for expression highly specific to the G1/S phase, and that a cis-regulatory module comprising CHR-NF-Y elements dictates expression that is restricted to the G2 and G2/M phases. B-Myb binding site signatures occur in many of the CHR-NF-Y target genes, suggesting a specific role for this triplet in the regulation of the cell cycle transcriptional program. Remarkably, E2F signals are conserved in promoters of G1/S genes in all organisms from worm to human. The CHR-NF-Y module is conserved in promoters of G2/M regulated genes in all analyzed vertebrates. Our results reveal novel modules that determine specific cell cycle phasing, and identify their respective putative target genes with remarkably high specificity.


Subject(s)
Cell Cycle/genetics; Genomics; Regulatory Elements, Transcriptional/genetics; Animals; Base Sequence; Binding Sites/genetics; DNA Footprinting; E2F Transcription Factors/metabolism; Gene Expression; Humans; Models, Genetic; Molecular Sequence Data; Phylogeny; Promoter Regions, Genetic/genetics; Transcription Factors/metabolism
15.
BMC Bioinformatics ; 6: 232, 2005 Sep 21.
Article in English | MEDLINE | ID: mdl-16176576

ABSTRACT

BACKGROUND: Gene expression microarrays are a prominent experimental tool in functional genomics that has opened the opportunity for gaining global, systems-level understanding of transcriptional networks. Experiments that apply this technology typically generate overwhelming volumes of data, unprecedented in biological research. Therefore the task of mining meaningful biological knowledge out of the raw data is a major challenge in bioinformatics. Of special need are integrative packages that provide biologist users with an advanced yet easy-to-use set of algorithms, together covering the whole range of steps in microarray data analysis. RESULTS: Here we present the EXPANDER 2.0 (EXPression ANalyzer and DisplayER) software package. EXPANDER 2.0 is an integrative package for the analysis of gene expression data, designed as a 'one-stop shop' tool that implements various data analysis algorithms ranging from the initial steps of normalization and filtering, through clustering and biclustering, to high-level functional enrichment analysis that points to biological processes that are active in the examined conditions, and to promoter cis-regulatory elements analysis that elucidates transcription factors that control the observed transcriptional response. EXPANDER is available with pre-compiled functional Gene Ontology (GO) and promoter sequence-derived data files for yeast, worm, fly, rat, mouse and human, supporting high-level analysis applied to data obtained from these six organisms. CONCLUSION: EXPANDER's integrated capabilities and its built-in support of multiple organisms make it a very powerful tool for analysis of microarray data. The package is freely available for academic users at http://www.cs.tau.ac.il/~rshamir/expander.


Subject(s)
Algorithms; Data Interpretation, Statistical; Oligonucleotide Array Sequence Analysis/methods; Software; Cluster Analysis; Oligonucleotide Array Sequence Analysis/instrumentation
16.
Genome Biol ; 6(5): R43, 2005.
Article in English | MEDLINE | ID: mdl-15892871

ABSTRACT

BACKGROUND: Gene-expression microarrays and RNA interference (RNAi) are among the most prominent techniques in functional genomics. The combination of the two holds promise for systematic, large-scale dissection of transcriptional networks. Recent studies, however, raise the concern that nonspecific responses to small interfering RNAs (siRNAs) might obscure the consequences of silencing the gene of interest, throwing into question the ability of this experimental strategy to achieve precise network dissections. RESULTS: We used microarrays and RNAi to dissect a transcriptional network induced by DNA damage in a human cellular system. We recorded expression profiles with and without exposure of the cells to a radiomimetic drug that induces DNA double-strand breaks (DSBs). Profiles were measured in control cells and in cells knocked down for the Rel-A subunit of NFkappaB and for p53, two pivotal stress-induced transcription factors, and for the protein kinase ATM, the major transducer of the cellular responses to DSBs. We observed that NFkappaB and p53 mediated most of the damage-induced gene activation; that they controlled the activation of largely disjoint sets of genes; and that ATM was required for the activation of both pathways. Applying computational promoter analysis, we demonstrated that the dissection of the network into ATM/NFkappaB and ATM/p53-mediated arms was highly accurate. CONCLUSIONS: Our results demonstrate that the combined experimental strategy of expression arrays and RNAi is indeed a powerful method for the dissection of complex transcriptional networks, and that computational promoter analysis can provide a strong complementary means for assessing the accuracy of this dissection.


Subject(s)
Computational Biology/methods; DNA Damage/genetics; Gene Expression Profiling; Promoter Regions, Genetic/genetics; RNA Interference; Sequence Analysis, DNA/methods; Transcriptional Activation; Ataxia Telangiectasia Mutated Proteins; Cell Cycle Proteins/genetics; Cells, Cultured; Cluster Analysis; DNA-Binding Proteins/genetics; Genes, p53; Humans; Microarray Analysis; Mutagenesis, Site-Directed; Protein Serine-Threonine Kinases/genetics; Tumor Suppressor Proteins/genetics; Zinostatin/pharmacology; NF-kappaB-Inducing Kinase
17.
J Comput Biol ; 12(4): 431-56, 2005 May.
Article in English | MEDLINE | ID: mdl-15882141

ABSTRACT

A PCR primer sequence is called degenerate if some of its positions have several possible bases. The degeneracy of the primer is the number of unique sequence combinations it contains. We study the problem of designing a pair of primers with prescribed degeneracy that match a maximum number of given input sequences. Such problems occur when studying a family of genes that is known only in part, or is known in a related species. We prove that various simplified versions of the problem are hard, show the polynomiality of some restricted cases, and develop approximation algorithms for one variant. Based on these algorithms, we implemented a program called HYDEN for designing highly degenerate primers for a set of genomic sequences. We report on the success of the program in several applications, one of which is an experimental scheme for identifying all human olfactory receptor (OR) genes. In that project, HYDEN was used to design primers with degeneracies up to 10^10 that amplified with high specificity many novel genes of that family, tripling the number of OR genes known at the time.


Subject(s)
Computational Biology/methods , DNA Primers/chemical synthesis , Software , Algorithms , Animals , Dogs , Drug Design , Humans , Polymerase Chain Reaction/methods
18.
Nucleic Acids Res ; 32(17): 4955-61, 2004.
Article in English | MEDLINE | ID: mdl-15388797

ABSTRACT

The development of powerful experimental strategies for functional genomics and accompanying computational tools has brought major advances in the delineation of transcriptional networks in organisms ranging from yeast to human. Regulation of transcription of eukaryotic genes is to a large extent combinatorial. Here, we used an in silico approach to identify transcription factors (TFs) that form recurring regulatory modules with c-Myc, a protein encoded by an oncogene that is frequently dysregulated in human malignancies. A recent study identified, on a genomic scale, human genes whose promoters are bound by c-Myc and its heterodimer partner Max in Burkitt's lymphoma cells. Using computational methods, we identified nine TFs whose binding-site signatures are highly overrepresented in this promoter set of c-Myc targets, pointing to possible functional links between these TFs and c-Myc. Binding sites of most of these TFs are also enriched on the set of mouse homolog promoters, suggesting functional conservation. Among the enriched TFs, there are several regulators known to control cell cycle progression. Another TF in this set, EGR-1, is rapidly activated by numerous stress challenges and plays a central role in angiogenesis. Experimental investigation confirmed that c-Myc and EGR-1 bind together on several target promoters. The approach applied here is general and demonstrates how computational analysis of functional genomics experiments can identify novel modules in complex networks of transcriptional regulation.


Subject(s)
Computational Biology , Genomics , Proto-Oncogene Proteins c-myc/metabolism , Transcription Factors/metabolism , Animals , Basic Helix-Loop-Helix Leucine Zipper Transcription Factors , Basic-Leucine Zipper Transcription Factors , Binding Sites , Conserved Sequence , DNA-Binding Proteins/metabolism , Early Growth Response Protein 1 , Humans , Immediate-Early Proteins/metabolism , Mice , Promoter Regions, Genetic
19.
Genomics ; 83(3): 361-72, 2004 Mar.
Article in English | MEDLINE | ID: mdl-14962662

ABSTRACT

We identified 971 olfactory receptor (OR) genes in the dog genome, estimated to constitute approximately 80% of the canine OR repertoire. This was achieved by directed genomic DNA cloning of olfactory sequence tags as well as by mining the Celera canine genome sequences. The dog OR subgenome is estimated to have 12% pseudogenes, suggesting a functional repertoire similar to that of mouse and considerably larger than for humans. No novel OR families were discovered, but as many as 34 gene subfamilies were unique to the dog. "Fish-like" Class I ancient ORs constituted 18% of the repertoire, significantly more than in human and mouse. A set of 122 dog-human-mouse ortholog triplets was identified, with a relatively high fraction of Class I ORs. The elucidation of a large portion of the canine olfactory receptor gene superfamily, with some dog-specific attributes, may help us understand the unique chemosensory capacities of this species.


Subject(s)
Dogs/genetics , Genome , Receptors, Odorant/genetics , Animals , Evolution, Molecular , Humans , Male , Mice , Phylogeny , Pseudogenes , Sequence Analysis, Protein , Species Specificity
20.
Genome Res ; 13(5): 773-80, 2003 May.
Article in English | MEDLINE | ID: mdl-12727897

ABSTRACT

Dissection of regulatory networks that control gene transcription is one of the greatest challenges of functional genomics. Using human genomic sequences, models for binding sites of known transcription factors, and gene expression data, we demonstrate that the reverse engineering approach, which infers regulatory mechanisms from gene expression patterns, can reveal transcriptional networks in human cells. To date, such methodologies were successfully demonstrated only in prokaryotes and lower eukaryotes. We developed computational methods for identifying putative binding sites of transcription factors and for evaluating the statistical significance of their prevalence in a given set of promoters. Focusing on transcriptional mechanisms that control cell cycle progression, our computational analyses revealed eight transcription factors whose binding sites are significantly overrepresented in promoters of genes whose expression is cell-cycle-dependent. The enrichment of some of these factors is specific to certain phases of the cell cycle. In addition, several pairs of these transcription factors show a significant co-occurrence rate in cell-cycle-regulated promoters. Each such pair indicates functional cooperation between its members in regulating the transcriptional program associated with cell cycle progression. The methods presented here are general and can be applied to the analysis of transcriptional networks controlling any biological process.
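One common way to evaluate whether a binding site is overrepresented in a promoter set is a one-sided hypergeometric test. The abstract does not state which statistic the authors used, so the sketch below is an illustrative assumption, not a description of their method; the function name and its parameters are hypothetical.

```python
from math import comb


def enrichment_pvalue(total: int, total_hits: int,
                      subset: int, subset_hits: int) -> float:
    """One-sided hypergeometric tail probability P(X >= subset_hits).

    total       -- number of promoters in the background set
    total_hits  -- background promoters containing the binding site
    subset      -- number of promoters in the set of interest
    subset_hits -- promoters in that set containing the binding site
    """
    upper = min(subset, total_hits)
    # Count the ways to draw `subset` promoters with at least
    # `subset_hits` site-containing ones, then normalize.
    tail = sum(
        comb(total_hits, k) * comb(total - total_hits, subset - k)
        for k in range(subset_hits, upper + 1)
    )
    return tail / comb(total, subset)
```

A small p-value suggests the site occurs in the promoter set more often than expected when drawing promoters at random from the background. When many transcription factors are tested, as in a genome-wide screen, the p-values would also need a multiple-testing correction, which this sketch omits.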


Subject(s)
Cell Cycle/genetics , Gene Expression Regulation/genetics , Genome, Human , Transcription, Genetic/genetics , Binding Sites/genetics , Cell Cycle Proteins/biosynthesis , Computational Biology/methods , Computational Biology/statistics & numerical data , DNA-Binding Proteins/genetics , Databases, Genetic/statistics & numerical data , E2F Transcription Factors , Gene Expression Profiling/methods , Gene Expression Profiling/statistics & numerical data , Genes, cdc , Humans , Promoter Regions, Genetic/genetics , Transcription Factors/genetics