1.
Hum Genome Var; 8(1): 44, 2021 Dec 10.
Article in English | MEDLINE | ID: mdl-34887386

ABSTRACT

To reveal gene-environment interactions underlying common diseases and estimate the risk for common diseases, the Tohoku Medical Megabank (TMM) project has conducted prospective cohort studies and genomic and multiomics analyses. To establish an integrated biobank, we developed an integrated database called "dbTMM" that incorporates both the individual cohort/clinical data and the genome/multiomics data of 157,191 participants in the Tohoku Medical Megabank project. To our knowledge, dbTMM is the first database to store individual whole-genome data on a variant-by-variant basis as well as cohort/clinical data for over one hundred thousand participants in a prospective cohort study. dbTMM enables us to stratify our cohort by both genome-wide genetic factors and environmental factors, and it provides a research and development platform that enables prospective analysis of large-scale data from genome cohorts.

2.
Alzheimers Res Ther; 13(1): 92, 2021 May 03.
Article in English | MEDLINE | ID: mdl-33941241

ABSTRACT

BACKGROUND: Identifying novel therapeutic targets is crucial for the successful development of drugs. However, the cost of identifying therapeutic targets experimentally is high, and only approximately 400 genes are targets for FDA-approved drugs. As a result, powerful computational tools that can identify potential novel therapeutic targets are needed. Fortunately, the human protein-protein interaction network (PIN) could be a useful resource to achieve this objective. METHODS: In this study, we developed a deep learning-based computational framework that extracts low-dimensional representations of high-dimensional PIN data. Our computational framework uses latent features and state-of-the-art machine learning techniques to infer potential drug target genes. RESULTS: We applied our computational framework to prioritize novel putative target genes for Alzheimer's disease and successfully identified key genes that may serve as novel therapeutic targets (e.g., DLG4, EGFR, RAC1, SYK, PTK2B, SOCS1). Furthermore, based on these putative targets, we could infer repositionable candidate compounds for the disease (e.g., tamoxifen, bosutinib, and dasatinib). CONCLUSIONS: Our deep learning-based computational framework could be a powerful tool to efficiently prioritize new therapeutic targets and enhance the drug repositioning strategy.


Subjects
Alzheimer Disease; Pharmaceutical Preparations; Alzheimer Disease/drug therapy; Alzheimer Disease/genetics; Artificial Intelligence; Drug Repositioning; Humans; Machine Learning
3.
Hum Genet; 138(4): 389-409, 2019 Apr.
Article in English | MEDLINE | ID: mdl-30887117

ABSTRACT

Incidence rates of Mendelian diseases vary among ethnic groups, and frequencies of variant types of causative genes also vary among human populations. In this study, we examined to what extent we can predict population frequencies of recessive disorders from genomic data, and explored better strategies for variant interpretation and classification. We used a whole-genome reference panel from 3552 general Japanese individuals constructed by the Tohoku Medical Megabank Organization (ToMMo). Focusing on 32 genes for 17 congenital metabolic disorders included in newborn screening (NBS) in Japan, we identified reported and predicted pathogenic variants through variant annotation, interpretation, and multiple ways of classifications. The estimated carrier frequencies were compared with those from the Japanese NBS data based on 1,949,987 newborns from a previous study. The estimated carrier frequency based on genomic data with a recent guideline of variant interpretation for the PAH gene, in which defects cause hyperphenylalaninemia (HPA) and phenylketonuria (PKU), provided a closer estimate to that by the observed incidence than the other methods. In contrast, the estimated carrier frequencies for SLC25A13, which causes citrin deficiency, were much higher compared with the incidence rate. The results varied greatly among the 11 NBS diseases with single responsible genes; the possible reasons for departures from the carrier frequencies by reported incidence rates were discussed. Of note, (1) the number of pathogenic variants increases by including additional lines of evidence, (2) common variants with mild effects also contribute to the actual frequency of patients, and (3) penetrance of each variant remains unclear.
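
The comparison between carrier frequencies estimated from a reference panel and those implied by newborn-screening incidence can be illustrated with Hardy-Weinberg arithmetic. The following is a minimal sketch, not the study's pipeline: it assumes random mating, full penetrance and a single responsible gene, and the function names and input numbers are hypothetical.

```python
import math

def carrier_frequency_from_incidence(incidence):
    """Carrier frequency 2pq implied by the incidence q^2 of a recessive disorder."""
    q = math.sqrt(incidence)            # combined pathogenic allele frequency
    return 2.0 * q * (1.0 - q)

def carrier_frequency_from_alleles(allele_frequencies):
    """Carrier frequency 2pq from per-variant pathogenic allele frequencies.

    Assumption: the variants sit on different haplotypes, so frequencies add.
    """
    q = sum(allele_frequencies)
    return 2.0 * q * (1.0 - q)

# Hypothetical inputs, not values taken from the study.
observed = carrier_frequency_from_incidence(1 / 40000)
estimated = carrier_frequency_from_alleles([0.003, 0.002])
print(f"from incidence: {observed:.5f}, from panel: {estimated:.5f}")
```

When the panel-based estimate is much higher than the incidence-based one, incomplete penetrance or over-called pathogenic variants are among the explanations the abstract points to.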


Subjects
Genetic Diseases, Inborn/diagnosis; Genetic Diseases, Inborn/genetics; Infant, Newborn, Diseases/diagnosis; Infant, Newborn, Diseases/genetics; Neonatal Screening/methods; Asian People/genetics; Asian People/statistics & numerical data; Cohort Studies; Female; Gene Frequency; Genetic Diseases, Inborn/epidemiology; Genome-Wide Association Study/standards; Heterozygote; Humans; Incidence; Infant, Newborn; Infant, Newborn, Diseases/epidemiology; Japan/epidemiology; Male; Reference Standards
4.
Nucleic Acids Res; 47(D1): D859-D866, 2019 Jan 08.
Article in English | MEDLINE | ID: mdl-30371824

ABSTRACT

Understanding anatomical structures and biological functions based on gene expression is critical in a systemic approach to address the complexity of the mammalian brain, where >25 000 genes are expressed in a precise manner. Co-expressed genes are thought to regulate cell type- or region-specific brain functions. Thus, well-designed data acquisition and visualization systems for profiling combinatorial gene expression in relation to anatomical structures are crucial. To this purpose, using our techniques of microtomy-based gene expression measurements and WebGL-based visualization programs, we mapped spatial expression densities of genome-wide transcripts to the 3D coordinates of mouse brains at four post-natal stages, and built a database, ViBrism DB (http://vibrism.neuroinf.jp/). With the DB platform, users can access a total of 172 022 expression maps of transcripts, including coding, non-coding and lncRNAs in the whole context of 3D magnetic resonance (MR) images. Co-expression of transcripts is represented in the image space and in topological network graphs. In situ hybridization images and anatomical area maps are browsable in the same space of 3D expression maps using a new browser-based 2D/3D viewer, BAH viewer. Created images are shareable using URLs, including scene-setting parameters. The DB has multiple links and is expandable by community activity.


Subjects
Brain/diagnostic imaging; Databases, Genetic; Gene Expression/genetics; Gene Regulatory Networks/genetics; Animals; Brain/anatomy & histology; Imaging, Three-Dimensional/classification; Mice; Software
5.
Stud Health Technol Inform; 216: 1057, 2015.
Article in English | MEDLINE | ID: mdl-26262356

ABSTRACT

The Tohoku Medical Megabank project is a national project to revitalize the area of the Tohoku region struck by the Great East Japan Earthquake, and it has conducted a large-scale prospective genome-cohort study. Alongside the prospective genome-cohort study, we have developed an integrated database and knowledge base, which will be a key database for realizing personalized prevention and medicine.


Subjects
Databases, Genetic; Electronic Health Records/organization & administration; Genetic Predisposition to Disease/genetics; Medical Record Linkage/methods; Precision Medicine/methods; Preventive Medicine/organization & administration; Cohort Studies; Database Management Systems/organization & administration; Datasets as Topic; Genomics/organization & administration; Japan; Natural Language Processing; Systems Integration; User-Computer Interface
6.
Sci Rep; 4: 6969, 2014 Nov 10.
Article in English | MEDLINE | ID: mdl-25382412

ABSTRACT

Using a recently invented technique for gene expression mapping in the whole-anatomy context, termed transcriptome tomography, we have generated a dataset of 36,000 maps of overall gene expression in the adult mouse brain. Here, using an informatics approach, we identified a broad co-expression network that follows an inverse power law and is rich in functional interaction and gene-ontology terms. Our framework for the integrated analysis of expression maps and graphs of co-expression networks revealed that groups of combinatorially expressed genes, which regulate cell differentiation during development, were present in the adult brain, and each of these groups was associated with a discrete cell type. These groups included non-coding genes of unknown function. We found that these genes specifically linked developmentally conserved groups in the network. A previously unrecognized robust expression pattern covering the whole brain was related to the molecular anatomy of key biological processes occurring in particular areas.


Subjects
Brain/metabolism; Gene Expression Profiling; Gene Regulatory Networks; Transcriptome; Animals; Brain/anatomy & histology; Computational Biology/methods; Homeodomain Proteins/genetics; Male; Mice; Organ Specificity/genetics; Transcription Factors/genetics
7.
BMC Med Genet; 15: 65, 2014 Jun 05.
Article in English | MEDLINE | ID: mdl-24903457

ABSTRACT

BACKGROUND: Genome-wide association studies have identified many genetic loci associated with blood pressure (BP). Genetic effects on BP can be altered by environmental exposures via multiple biological pathways. In particular, obesity is an important environmental risk factor that can have a considerable effect on BP, and it may interact with genetic factors. Given that, we aimed to test whether genetic factors and obesity may jointly influence BP. METHODS: We performed meta-analyses of genome-wide association data for systolic blood pressure (SBP) and diastolic blood pressure (DBP) that included analyses of interaction between single nucleotide polymorphisms (SNPs) and the obesity-related anthropometric measures, body mass index (BMI), height, weight, and waist/hip ratio (WHR) in East Asians (n = 12,030). RESULTS: We found that rs13390641 on 2q12.1 demonstrated significant association with SBP when the interaction between SNPs and BMI was considered (P < 5 × 10^-8). The gene located nearest to rs13390641, TMEM182, encodes transmembrane protein 182. In stratified analyses, the effect of rs13390641 on BP was much stronger in obese individuals (BMI ≥ 30) than in non-obese individuals, and the effect of BMI on BP was strongest in individuals homozygous for the A allele of rs13390641. CONCLUSIONS: Our analyses that included interactions between SNPs and environmental factors identified a genetic variant associated with BP that was overlooked in standard analyses in which only genetic factors were included. This result also revealed a potential mechanism that integrates genetic factors and obesity-related traits in the development of high BP.
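
A SNP × BMI interaction of the kind tested here is commonly modeled by adding a product term to a linear regression. The sketch below is an illustration under stated assumptions, not the meta-analysis procedure of the study: it assumes additive genotype coding (0, 1 or 2 copies of one allele), fits ordinary least squares in plain Python, and uses synthetic noise-free data; all names are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_interaction_model(genotypes, bmis, sbps):
    """Least-squares fit of SBP = b0 + b1*G + b2*BMI + b3*G*BMI."""
    rows = [[1.0, g, bmi, g * bmi] for g, bmi in zip(genotypes, bmis)]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, sbps)) for i in range(k)]
    return solve_linear_system(xtx, xty)               # normal equations

# Synthetic data generated from known coefficients (no noise).
genotypes = [0, 1, 2, 0, 1, 2, 0, 1, 2]
bmis = [20.0, 20.0, 20.0, 25.0, 25.0, 25.0, 32.0, 32.0, 32.0]
sbps = [100.0 + 1.0 * g + 0.8 * b + 0.3 * g * b for g, b in zip(genotypes, bmis)]
coefficients = fit_interaction_model(genotypes, bmis, sbps)
print([round(c, 3) for c in coefficients])
```

A non-zero coefficient on the product term means the per-allele effect on SBP changes with BMI, which is the pattern the stratified analyses describe.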


Subjects
Asian People/genetics; Blood Pressure/genetics; Chromosomes, Human, Pair 2; Gene-Environment Interaction; Genetic Variation; Genome-Wide Association Study; Body Mass Index; Genetic Association Studies; Genotype; Humans; Obesity/genetics; Obesity/physiopathology; Phenotype; Polymorphism, Single Nucleotide
8.
PLoS One; 7(9): e46385, 2012.
Article in English | MEDLINE | ID: mdl-23050023

ABSTRACT

BACKGROUND/OBJECTIVE: In Japanese populations, we performed a replication study of genetic loci previously identified in European-descent populations as being associated with lipid levels and risk of coronary artery disease (CAD). METHODS: We genotyped 48 single nucleotide polymorphisms (SNPs) from 22 candidate loci that had previously been identified by genome-wide association (GWA) meta-analyses for low-density lipoprotein cholesterol (LDL-C), high-density lipoprotein cholesterol (HDL-C), and/or triglycerides in Europeans. We selected 22 loci with 2 parallel tracks from 95 reported loci: 16 significant loci (p<1 × 10^-30 in Europeans) and 6 other loci including those with suggestive evidence of lipid associations in 1292 GWA-scanned Japanese samples. Genotyping was done in 4990 general population samples, and 1347 CAD cases and 1337 controls. For 9 SNPs, we further examined CAD associations in an additional panel of 3052 CAD cases and 6335 controls. PRINCIPAL FINDINGS: Significant lipid associations (one-tailed p<0.05) were replicated for 18 of 22 loci in Japanese samples, with significant inter-ethnic heterogeneity at 4 loci (APOB, APOE-C1, CETP, and APOA5) and allelic heterogeneity. The strongest associations were detected at APOE rs7412 for LDL-C (p=1.3 × 10^-41), CETP rs3764261 for HDL-C (p=5.2 × 10^-24), and APOA5 rs662799 for triglycerides (p=5.8 × 10^-54). CAD association was replicated and/or verified for 4 loci: SORT1 rs611917 (p=1.7 × 10^-8), APOA5 rs662799 (p=0.0014), LDLR rs1433099 (p=2.1 × 10^-7), and APOE rs7412 (p=6.1 × 10^-13). CONCLUSIONS: Our results confirm that most of the tested lipid loci are associated with lipid traits in the Japanese, further indicating that in genetic susceptibility to lipid levels and CAD, the related metabolic pathways are largely common across the populations, while causal variants at individual loci can be population-specific.


Subjects
Coronary Artery Disease/blood; Coronary Artery Disease/genetics; Apolipoprotein A-V; Apolipoproteins A/genetics; Apolipoproteins B/genetics; Apolipoproteins E/genetics; Asian People; Cholesterol Ester Transfer Proteins/genetics; Cholesterol, HDL/blood; Cholesterol, LDL/blood; Female; Genome-Wide Association Study; Genotype; Humans; Male; Polymorphism, Single Nucleotide/genetics
9.
PLoS One; 7(9): e45373, 2012.
Article in English | MEDLINE | ID: mdl-23028969

ABSTRACT

Increased information on the encoded mammalian genome is expected to facilitate an integrated understanding of complex anatomical structure and function based on the knowledge of gene products. Determination of gene expression-anatomy associations is crucial for this understanding. To elicit the association in three-dimensional (3D) space, we introduce a novel technique for comprehensive mapping of endogenous gene expression into a web-accessible standard space: Transcriptome Tomography. The technique is based on the conjugation of sequential tissue-block sectioning, all fractions of which are used for molecular measurements of gene expression densities, and block-face imaging, which is used for 3D reconstruction of the fractions. To generate a 3D map, tissues are serially sectioned in each of three orthogonal planes and the expression density data are mapped using a tomographic technique. This rapid and unbiased mapping technique, using a relatively small number of original data points, allows researchers to create their own expression maps in the broad anatomical context of the space. In the first instance we generated a dataset of 36,000 maps, reconstructed from data of 61 fractions measured with microarray, covering the whole mouse brain (ViBrism: http://vibrism.riken.jp/3dviewer/ex/index.html) in one month. After computational estimation of the mapping accuracy we validated the dataset against existing data with respect to the expression location and density. To demonstrate the relevance of the framework, we showed disease-related expression of the Huntington's disease gene and Bdnf. Our tomographic approach is applicable to analysis of any biological molecules derived from frozen tissues, organs and whole embryos, and the maps are spatially isotropic and well suited to analysis in the standard space (e.g. Waxholm Space for brain-atlas databases). This will facilitate research creating and using open standards for a molecular-based understanding of complex structures, and will contribute to new insights into a broad range of biological and medical questions.


Subjects
Brain/metabolism; Transcriptome/genetics; Animals; Gene Expression Profiling; Huntington Disease; Imaging, Three-Dimensional; Male; Mice; Mice, Inbred C57BL
10.
Cancer Genomics Proteomics; 9(2): 67-75, 2012.
Article in English | MEDLINE | ID: mdl-22399497

ABSTRACT

AIM: To uncover silenced genes in colorectal cancer (CRC). MATERIALS AND METHODS: An oligonucleotide microarray was used to find changes in gene expression in five CRC cell lines before and after 5-aza-2'-deoxycytidine treatment. Up-regulated genes were integrated with the expression profiles of matched colorectal tissue samples. Methylation-specific polymerase chain reaction and real-time quantitative reverse transcription polymerase chain reaction were used to further analyze candidates using 15 CRC cell lines and 23 paired samples. RESULTS: After applying study selection criteria to 68 genes obtained from integrated arrays, we identified 16 genes; apoptosis-stimulating of p53 protein 1 (ASPP1) and scavenger receptor class A, member 5 (SCARA5) were selected for further analysis. Methylation was identified only for SCARA5, in 20% of the cell lines and in 17% of the tumor samples. Reduced expression of SCARA5 was observed in CRC cell lines and in tumor samples compared to normal tissue (p<0.001 and p=0.001, respectively). CONCLUSION: Genome-wide screening identifies genes potentially affected by methylation in CRC. SCARA5 may have a role in tumorigenesis in CRC.


Subjects
Azacitidine/analogs & derivatives; Colorectal Neoplasms/genetics; DNA Methylation/drug effects; Epigenesis, Genetic/drug effects; Gene Expression Profiling; Adult; Aged; Aged, 80 and over; Azacitidine/pharmacology; Cell Line, Tumor; Cluster Analysis; DNA Modification Methylases/antagonists & inhibitors; Decitabine; Epigenomics; Female; Gene Expression Regulation, Neoplastic/drug effects; Genes, p16; Humans; Male; Middle Aged; Promoter Regions, Genetic; Scavenger Receptors, Class A/genetics; Tumor Suppressor Protein p14ARF/genetics
11.
BMC Genomics; 11 Suppl 4: S19, 2010 Dec 02.
Article in English | MEDLINE | ID: mdl-21143802

ABSTRACT

BACKGROUND: A variety of information relating the genome to the pathological findings in disease will yield a wealth of clues for discovering new functions and roles of genes and pathways, and for future medicine. In addition to molecular information such as gene expression and genome copy number, detailed clinical information is essential for such systematic omics analysis. RESULTS: In order to provide a basic platform to realize a future medicine based on the integration of molecular and clinico-pathological information of disease, we have developed an integrated clinical omics database (iCOD) in which comprehensive disease information of the patients is collected, including not only molecular omics data such as CGH (Comparative Genomic Hybridization) and gene expression profiles but also comprehensive clinical information such as clinical manifestations, medical images (CT, X-ray, ultrasound, etc.), laboratory tests, drug histories, pathological findings and even lifestyle/environmental information. The iCOD is developed to combine the molecular and clinico-pathological information of the patients to provide a holistic understanding of the disease. Furthermore, we developed several kinds of integrated view maps of disease in the iCOD, such as a three-layer-linked disease map, a disease pathway map, and a pathome-genome map, which summarize the comprehensive patient data to show the interrelation between the molecular omics data and clinico-pathological findings and to support estimation of the disease pathways. CONCLUSIONS: With these utilities, our iCOD aims to help provide the omics basis of the disease as well as to promote the pathway-directed disease view. The iCOD database is available online, containing 140 patient cases of hepatocellular carcinoma, with raw data of each case as a supplemental data set to download. The iCOD and supplemental data can be accessed at http://omics.tmd.ac.jp/icod_pub_eng.


Subjects
Computational Biology; Disease/genetics; Genomics; Carcinoma, Hepatocellular/pathology; Databases, Genetic; Gene Expression Profiling/methods; Genome; Humans; Internet; Liver Neoplasms/pathology
12.
Cancer Inform; 9: 147-61, 2010 Jul 29.
Article in English | MEDLINE | ID: mdl-20706620

ABSTRACT

BACKGROUND: Colorectal cancer (CRC) is one of the most frequently occurring cancers in Japan, and thus a wide range of methods have been deployed to study the molecular mechanisms of CRC. In this study, we performed a comprehensive analysis of CRC, incorporating copy number aberration (CNA) and gene expression data. For the last four years, we have been collecting data from CRC cases and organizing the information as an "omics" study by integrating many kinds of analysis into a single comprehensive investigation. In our previous studies, we had experienced difficulty in finding genes related to CRC, as we observed higher noise levels in the expression data than in the data for other cancers. Because chromosomal aberrations are often observed in CRC, here, we have performed a combination of CNA analysis and expression analysis in order to identify some new genes responsible for CRC. This study was performed as part of the Clinical Omics Database Project at Tokyo Medical and Dental University. The purpose of this study was to investigate the mechanism of genetic instability in CRC by this combination of expression analysis and CNA, and to establish a new method for the diagnosis and treatment of CRC. MATERIALS AND METHODS: Comprehensive gene expression analysis was performed on 79 CRC cases using an Affymetrix Gene Chip, and comprehensive CNA analysis was performed using an Affymetrix DNA Sty array. To avoid the contamination of cancer tissue with normal cells, laser micro-dissection was performed before DNA/RNA extraction. Data analysis was performed using original software written in the R language. RESULT: We observed a high percentage of CNA in colorectal cancer, including copy number gains at 7, 8q, 13 and 20q, and copy number losses at 8p, 17p and 18. Gene expression analysis provided many candidates for CRC-related genes, but their association with CRC did not reach the level of statistical significance. The combination of CNA and gene expression analysis, together with the clinical information, suggested UGT2B28, LOC440995, CXCL6, SULT1B1, RALBP1, TYMS, RAB12, RNMT, ARHGDIB, S100A2, ABHD2, OIT3 and ABHD12 as genes that are possibly associated with CRC. Some of these genes have already been reported as being related to CRC. TYMS has been reported as being associated with resistance to the anti-cancer drug 5-fluorouracil, and we observed a copy number increase for this gene. RALBP1, ARHGDIB and S100A2 have been reported as oncogenes, and we observed copy number increases in each. ARHGDIB has been reported as a metastasis-related gene, and our data also showed copy number increases of this gene in cases with metastasis. CONCLUSION: The combination of CNA analysis and gene expression analysis was a more effective method for finding genes associated with the clinicopathological classification of CRC than either analysis alone. Using this combination of methods, we were able to detect genes that have already been associated with CRC. We also identified additional candidate genes that may be new markers or targets for this form of cancer.

13.
Biochem Biophys Res Commun; 368(1): 43-9, 2008 Mar 28.
Article in English | MEDLINE | ID: mdl-18211820

ABSTRACT

Inactivation of the serotonin transporter (HTT), either pharmacologically in the neonate or genetically, increases the risk for depression in adulthood, whereas pharmacological inhibition of HTT ameliorates symptoms in depressed patients. The differing role of HTT function during early development and in adult brain plasticity in causing or reversing depression remains an unexplained paradox. To address this, we profiled the gene expression of adult Htt knockout (Htt KO) mice and HTT inhibitor-treated mice. Inverted profile changes between the two experimental conditions were seen in 30 genes. Consistent results of the upstream regulatory element search and the co-localization search of these genes indicated that the regulation may be executed by Pax5, Pax7 and Gata3, which are known to be involved in the survival, proliferation, and migration of serotonergic neurons in the developing brain; these factors are thought to keep functioning in the adult brain to regulate downstream genes related to the serotonin system.


Subjects
Gene Expression Regulation/genetics; Serotonin Plasma Membrane Transport Proteins/metabolism; Animals; Gene Expression Profiling; Mice; Mice, Knockout; Oligonucleotide Array Sequence Analysis; RNA, Messenger/genetics; Serotonin Plasma Membrane Transport Proteins/deficiency; Serotonin Plasma Membrane Transport Proteins/genetics
14.
CSH Protoc; 2008: pdb.prot4937, 2008 Feb 01.
Article in English | MEDLINE | ID: mdl-21356768

ABSTRACT

INTRODUCTION: In terms of cost per measurement, the use of DNA microarrays for comprehensive and quantitative expression measurements is vastly superior to other methods such as Northern blotting or quantitative reverse transcriptase polymerase chain reaction (QRT-PCR). However, the output values of DNA microarrays are not always highly reliable or accurate compared with other techniques, and the output data sometimes consist of measurements of relative expression (treated sample vs. untreated) rather than absolute expression values as desired. In effect, some measurements from some laboratories do not represent absolute expression values (such as the number of transcripts) and as such are experimentally deficient. This protocol addresses one problem in some microarray data: the absence of accurate measurements. Spot reliability evaluation score for DNA microarrays (SRED) offers a reliability value for each spot in the microarray. SRED does not require an entire microarray to assess the reliability, but rather analyzes the reliability of individual spots of the microarray. The calculation of a reliability index can be used for different microarray systems, which facilitates the analysis of multiple microarray data sets from different experimental platforms.

15.
CSH Protoc; 2008: pdb.prot4938, 2008 Feb 01.
Article in English | MEDLINE | ID: mdl-21356769

ABSTRACT

INTRODUCTION: In terms of cost per measurement, the use of DNA microarrays for comprehensive and quantitative expression measurements is vastly superior to other methods such as Northern blotting or quantitative reverse transcriptase polymerase chain reaction (QRT-PCR). However, the output values of DNA microarrays are not always highly reliable or accurate compared with other techniques, and the output data sometimes consist of measurements of relative expression (treated sample vs. untreated) rather than absolute expression values as desired. In effect, some measurements from some laboratories do not represent absolute expression values (such as the number of transcripts) and as such are experimentally deficient. To address the problem that some microarray data sets fail to reflect the number of mRNA molecules sufficiently in a given sample (i.e., fail to provide absolute expression levels), additional methods are required. The procedure described here provides a new method for converting microarray data to absolute expression values with the use of external data such as expressed sequence tags (ESTs) and cap analysis of gene expression (CAGE) tags.

16.
BMC Bioinformatics; 8: 161, 2007 May 21.
Article in English | MEDLINE | ID: mdl-17517134

ABSTRACT

BACKGROUND: Recent analyses have suggested that many genes possess multiple transcription start sites (TSSs) that are differentially utilized in different tissues and cell lines. We have identified a huge number of TSSs mapped onto the mouse genome using the cap analysis of gene expression (CAGE) method. The standard hierarchical clustering algorithm, which gives us easily understandable graphical tree images, has difficulties in processing such huge amounts of TSS data and a better method to calculate and display the results is needed. RESULTS: We use a combination of hierarchical and non-hierarchical clustering to cluster expression profiles of TSSs based on a large amount of CAGE data to profit from the best of both methods. We processed the genome-wide expression data, including 159,075 TSSs derived from 127 RNA samples of various organs of mouse, and succeeded in categorizing them into 70-100 clusters. The clusters exhibited intriguing biological features: a cluster supergroup with a ubiquitous expression profile, tissue-specific patterns, a distinct distribution of non-coding RNA and functional TSS groups. CONCLUSION: Our approach succeeded in greatly reducing the calculation cost, and is an appropriate solution for analyzing large-scale TSS usage data.
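
One common way to combine non-hierarchical and hierarchical clustering is to partition the profiles with k-means first and then build the hierarchy on the resulting centroids only. The sketch below shows that arrangement in plain Python; the abstract does not state the exact order or algorithms used, so this is an assumption, and the toy profiles, function names and deterministic seeding are hypothetical.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_vector(vectors):
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def k_means(profiles, k, iterations=20):
    """Non-hierarchical step: partition the profiles into k groups."""
    n = len(profiles)
    # Deterministic seeding with evenly spaced profiles (illustrative choice).
    centroids = [list(profiles[(i * n) // k]) for i in range(k)]
    labels = [0] * n
    for _ in range(iterations):
        labels = [min(range(k), key=lambda j: squared_distance(p, centroids[j]))
                  for p in profiles]
        for j in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == j]
            if members:                  # keep the old centroid if a group empties
                centroids[j] = mean_vector(members)
    return centroids, labels

def agglomerate(centroids, target):
    """Hierarchical step: merge the closest groups of centroids until `target` remain."""
    groups = [[i] for i in range(len(centroids))]
    while len(groups) > target:
        best = None
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                d = squared_distance(
                    mean_vector([centroids[i] for i in groups[a]]),
                    mean_vector([centroids[i] for i in groups[b]]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = groups.pop(b)           # b > a, so index a stays valid
        groups[a].extend(merged)
    return groups

def two_stage_clustering(profiles, k, target):
    centroids, labels = k_means(profiles, k)
    groups = agglomerate(centroids, target)
    group_of = {c: g for g, members in enumerate(groups) for c in members}
    return [group_of[lab] for lab in labels]

# Toy expression profiles over three tissues (hypothetical values).
profiles = [[9.0, 1.0, 1.0], [8.0, 2.0, 1.0], [9.0, 1.0, 2.0], [8.0, 1.0, 1.0],
            [1.0, 1.0, 9.0], [2.0, 1.0, 8.0], [1.0, 2.0, 9.0], [1.0, 1.0, 8.0]]
final_labels = two_stage_clustering(profiles, k=4, target=2)
print(final_labels)
```

The pairwise-distance work of the hierarchical step then scales with the number of centroids rather than the number of TSSs, which is one way such a combination can reduce calculation cost.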


Subjects
Chromosome Mapping/methods; Expressed Sequence Tags; Gene Expression Profiling/methods; Multigene Family/genetics; Oligonucleotide Array Sequence Analysis/methods; Sequence Analysis, DNA/methods; Transcription Factors/genetics; Algorithms; Animals; Mice
17.
Nucleic Acids Res; 34(13): e97, 2006 Aug 08.
Article in English | MEDLINE | ID: mdl-16896013

ABSTRACT

We have developed a simple, rapid and scalable RecA-mediated method for identifying novel alternatively spliced full-length cDNA candidates. This method is based on the principle that RecA proteins carry radioisotope-labeled probe DNAs to their homologous sequences, forming triplexes. The resulting complex is easily detected by its mobility difference on electrophoresis. We applied this exon profiling method to four selected mouse genes as a feasibility study. To design probes for detection, the information on known exonic regions was extracted from the public database RefSeq. For potentially transcribed novel exonic regions, an RNA mapping experiment using an Affymetrix tiling array was performed. As a result, we were able to identify alternative splice variants of Thioredoxin domain containing 5, Interleukin 1 beta, Interleukin 1 family 6 and glutamine-rich hypothetical protein. In addition, full-length sequencing demonstrated that our method could profile exon structures with >90% accuracy. This reliable method allows us to effectively screen novel splice variants from a huge set of cDNA clones.


Subjects
Alternative Splicing; Exons; Nucleic Acid Hybridization/methods; Rec A Recombinases; Animals; DNA/chemistry; DNA, Complementary; Electrophoresis, Agar Gel; Gene Expression Profiling; Interleukin-1/genetics; Mice; Oligonucleotide Probes/chemistry
18.
Nat Genet; 38(6): 626-35, 2006 Jun.
Article in English | MEDLINE | ID: mdl-16645617

ABSTRACT

Mammalian promoters can be separated into two classes, conserved TATA box-enriched promoters, which initiate at a well-defined site, and more plastic, broad and evolvable CpG-rich promoters. We have sequenced tags corresponding to several hundred thousand transcription start sites (TSSs) in the mouse and human genomes, allowing precise analysis of the sequence architecture and evolution of distinct promoter classes. Different tissues and families of genes differentially use distinct types of promoters. Our tagging methods allow quantitative analysis of promoter usage in different tissues and show that differentially regulated alternative TSSs are a common feature in protein-coding genes and commonly generate alternative N termini. Among the TSSs, we identified new start sites associated with the majority of exons and with 3' UTRs. These data permit genome-scale identification of tissue-specific promoters and analysis of the cis-acting elements associated with them.


Subjects
Evolution, Molecular; Promoter Regions, Genetic; 3' Untranslated Regions; Animals; Base Sequence; DNA; Genome; Proteome; TATA Box
19.
Gene; 350(2): 149-60, 2005 May 09.
Article in English | MEDLINE | ID: mdl-15788151

ABSTRACT

We developed a reliability index named SRED (Spot Reliability Evaluation Score for DNA microarrays) that represents the probability that the calibrated gene expression level from a DNA microarray would be less than a factor of 2 different from that of quantitative real-time polymerase chain reaction assays whose dynamic quantification range is treated statistically to be similar to that of the DNA microarray. To define the SRED score, two parameters, the reproducibility of measurement value and the relative expression value were selected from nine candidate parameters. The SRED score supplies the probability that the expression level in each spot of a microarray is less than a certain-fold different compared to other expression profiling data, such as QRT-PCR. This score was applied to approximately 1,500,000 points of the expression profile in the RIKEN Expression Array Database.
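
The quantity that SRED targets, the probability that a microarray value lies within a factor of 2 of the QRT-PCR value, can be illustrated with an empirical fraction over paired measurements. This is only a sketch of the target quantity under that reading: it is not the SRED model itself, which is built from the reproducibility and relative-expression parameters, and the function names and data are hypothetical.

```python
import math

def within_fold(array_value, pcr_value, fold=2.0):
    """True if two positive expression values differ by less than `fold`."""
    return abs(math.log2(array_value / pcr_value)) < math.log2(fold)

def fraction_within_fold(pairs, fold=2.0):
    """Empirical share of (microarray, QRT-PCR) pairs agreeing within `fold`."""
    hits = sum(1 for array_value, pcr_value in pairs
               if within_fold(array_value, pcr_value, fold))
    return hits / len(pairs)

# Hypothetical calibrated expression values for four spots.
pairs = [(100.0, 120.0), (50.0, 20.0), (8.0, 15.0), (300.0, 900.0)]
reliability = fraction_within_fold(pairs)
print(reliability)
```

Working on the log2 scale treats over- and under-estimation symmetrically, so a spot reading half the QRT-PCR value is judged the same as one reading double.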


Subjects
Gene Expression Profiling/methods; Oligonucleotide Array Sequence Analysis/standards; Algorithms; Animals; Brain/metabolism; Cell Line, Tumor; Cerebellum/metabolism; DNA-Binding Proteins/genetics; Embryo, Mammalian/metabolism; Hepatocyte Nuclear Factor 3-beta; Mice; Mice, Inbred C57BL; Nuclear Proteins/genetics; RNA/genetics; RNA/metabolism; Reproducibility of Results; Reverse Transcriptase Polymerase Chain Reaction/methods; Reverse Transcriptase Polymerase Chain Reaction/standards; Time Factors; Transcription Factors/genetics; Transfection
20.
FEBS Lett; 559(1-3): 22-6, 2004 Feb 13.
Article in English | MEDLINE | ID: mdl-14960301

ABSTRACT

The RIKEN expression array database (READ) provides comprehensive gene expression data for the mouse, which were obtained as relative values from microarray double-staining experiments with E17.5 mRNA as common reference. To assign absolute expression values for mouse transcripts within READ, we applied the E17.5 reference sample to CAGE (cap analysis of gene expression) and expressed sequence tag (EST) high-throughput tag sequencing. Newly assigned values within the READ database were validated by comparison to expression data from serial analysis of gene expression, CAGE and EST experiments. These experiments confirmed the great significance of the absolute expression values within the improved READ database. The new Absolute READ database on absolute expression data is available under.
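
The idea of anchoring relative microarray values to an absolute scale through a common reference can be sketched as a simple rescaling: if a transcript's abundance in the reference sample is known from tag counts, a sample-to-reference ratio converts to an absolute value by multiplication. This is an assumed simplification, not the procedure documented for READ; the function names, the tags-per-million unit and the numbers are hypothetical.

```python
def tags_per_million(tag_count, total_tags):
    """Abundance of a transcript in the reference sample from tag sequencing."""
    return tag_count * 1_000_000 / total_tags

def absolute_expression(sample_to_reference_ratio, reference_tpm):
    """Rescale a relative microarray value by the reference abundance."""
    return sample_to_reference_ratio * reference_tpm

# Hypothetical transcript: 50 tags out of 2,000,000 in the reference sample,
# and a four-fold higher microarray signal in the tissue of interest.
reference_tpm = tags_per_million(tag_count=50, total_tags=2_000_000)
tissue_level = absolute_expression(4.0, reference_tpm)
print(reference_tpm, tissue_level)
```

Because every array shares the same reference, values rescaled this way become comparable across transcripts as well as across samples, which ratios alone do not allow.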


Subjects
Databases, Nucleic Acid; Gene Expression Profiling/standards; Mice/genetics; RNA, Messenger/analysis; Animals; Databases, Nucleic Acid/standards; Expressed Sequence Tags; Gene Expression Profiling/methods; Mice, Inbred C57BL; Oligonucleotide Array Sequence Analysis; Organ Specificity; RNA Caps