ABSTRACT
Risk prediction for heart failure (HF) using machine learning methods (MLM) has not yet reached a level suitable for practical use in clinical settings. This study aimed to create a new risk prediction model for HF with a minimum number of predictor variables using MLM. We used two datasets of hospitalized HF patients: retrospective data for creating the model and prospectively registered data for model validation. Critical clinical events (CCEs) were defined as death or left ventricular (LV) assist device implantation within 1 year of the discharge date. We randomly divided the retrospective data into training and testing datasets and created a risk prediction model based on the training dataset (MLM-risk model). The prediction model was validated using both the testing dataset and the prospectively registered data. Finally, we compared its predictive power with that of published conventional risk models. Among the patients with HF (n = 987), CCEs occurred in 142 patients. In the testing dataset, the MLM-risk model showed substantial predictive power (AUC = 0.87). We generated the model using 15 variables. Our MLM-risk model showed superior predictive power in the prospective study compared to conventional risk models such as the Seattle Heart Failure Model (c-statistic: 0.86 vs. 0.68, p < 0.05). Notably, a model with only five input variables had predictive power for CCEs comparable to that of the 15-variable model. Using an MLM, this study developed and validated a model with a minimized number of variables that predicts mortality in patients with HF more accurately than the existing risk scores.
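For readers unfamiliar with the AUC / c-statistic reported above, the following is a minimal sketch of how such a value can be computed for a binary outcome. The risk scores are hypothetical and `c_statistic` is an illustrative helper, not the authors' code; the abstract does not describe their implementation.

```python
from itertools import product


def c_statistic(event_scores, non_event_scores):
    """Share of (event, non-event) pairs in which the patient with the
    event received the higher predicted risk; ties count as half."""
    pairs = list(product(event_scores, non_event_scores))
    concordant = sum(
        1.0 if e > n else 0.5 if e == n else 0.0
        for e, n in pairs
    )
    return concordant / len(pairs)


# Hypothetical predicted risks (NOT data from the study).
events = [0.9, 0.8, 0.4]            # patients who had a CCE
non_events = [0.7, 0.3, 0.2, 0.1]   # patients who did not

auc = c_statistic(events, non_events)
print(f"c-statistic = {auc:.3f}")
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of patients with and without events.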
Subject(s)
Artificial Intelligence , Heart Failure , Humans , Retrospective Studies , Prospective Studies , Prognosis , Heart Failure/diagnosis , Heart Failure/therapy , Algorithms
ABSTRACT
Amyotrophic lateral sclerosis (ALS) is a devastating neurological disorder characterized by the degeneration of motor neurons and typically results in death within 3-5 years from onset. Familial ALS (FALS) comprises 5%-10% of ALS cases, and the identification of genes associated with FALS is indispensable to elucidating the molecular pathogenesis. We identified a Japanese family affected by late-onset, autosomal-dominant ALS in which mutations in genes known to be associated with FALS were excluded. Whole-genome sequencing and parametric linkage analysis under the assumption of an autosomal-dominant mode of inheritance with incomplete penetrance revealed the mutation c.2780G>A (p.Arg927Gln) in ERBB4. An extensive mutational analysis revealed the same mutation in a Canadian individual with familial ALS and a de novo mutation, c.3823C>T (p.Arg1275Trp), in a Japanese simplex case. These amino acid substitutions involve amino acids highly conserved among species, are predicted to be probably damaging, and are located within a tyrosine kinase domain (p.Arg927Gln) or a C-terminal domain (p.Arg1275Trp), both of which mediate essential functions of ErbB4 as a receptor tyrosine kinase. Functional analysis revealed that these mutations led to reduced autophosphorylation of ErbB4 upon neuregulin-1 (NRG-1) stimulation. Clinical presentations of the individuals with mutations were characterized by the involvement of both upper and lower motor neurons, a lack of obvious cognitive dysfunction, and relatively slow progression. This study indicates that disruption of the neuregulin-ErbB4 pathway is involved in the pathogenesis of ALS and potentially paves the way for the development of innovative therapeutic strategies, such as using NRGs or their agonists to upregulate ErbB4 functions.
Subject(s)
Amyotrophic Lateral Sclerosis/genetics , ErbB Receptors/genetics , Mutation , Neuregulins/genetics , Aged , Amino Acid Sequence , Amino Acid Substitution , Amyotrophic Lateral Sclerosis/pathology , Asian People/genetics , Canada , DNA Mutational Analysis , ErbB Receptors/metabolism , Female , Genome-Wide Association Study , Humans , Male , Middle Aged , Molecular Sequence Data , Motor Neurons/metabolism , Motor Neurons/pathology , Neuregulins/metabolism , Pedigree , Phosphorylation , Receptor, ErbB-4 , Sequence Analysis, DNA , Signal Transduction
ABSTRACT
Hereditary motor and sensory neuropathy with proximal dominant involvement (HMSN-P) is an autosomal-dominant neurodegenerative disorder characterized by widespread fasciculations, proximal-predominant muscle weakness, and atrophy followed by distal sensory involvement. To date, large families affected by HMSN-P have been reported from two different regions in Japan. Linkage and haplotype analyses of two previously reported families and two new families with the use of high-density SNP arrays further defined the minimum candidate region of 3.3 Mb in chromosomal region 3q12. Exome sequencing showed an identical c.854C>T (p.Pro285Leu) mutation in the TRK-fused gene (TFG) in the four families. Detailed haplotype analysis suggested two independent origins of the mutation. Pathological studies of an autopsied patient revealed TFG- and ubiquitin-immunopositive cytoplasmic inclusions in the spinal and cortical motor neurons. Fragmentation of the Golgi apparatus, a frequent finding in amyotrophic lateral sclerosis, was also observed in the motor neurons with inclusion bodies. Moreover, TAR DNA-binding protein 43 kDa (TDP-43)-positive cytoplasmic inclusions were also demonstrated. In cultured cells expressing mutant TFG, cytoplasmic aggregation of TDP-43 was demonstrated. These findings indicate that formation of TFG-containing cytoplasmic inclusions and concomitant mislocalization of TDP-43 underlie motor neuron degeneration in HMSN-P. Pathological overlap of proteinopathies involving TFG and TDP-43 highlights a new pathway leading to motor neuron degeneration.
Subject(s)
Chromosomes, Human, Pair 3/genetics , Genetic Predisposition to Disease/genetics , Hereditary Sensory and Motor Neuropathy/genetics , Proteins/genetics , Base Sequence , DNA-Binding Proteins/genetics , Exome/genetics , Genetic Linkage , Golgi Apparatus/pathology , Haplotypes/genetics , Hereditary Sensory and Motor Neuropathy/pathology , Humans , Inclusion Bodies/pathology , Japan , Molecular Sequence Data , Motor Neurons/pathology , Pedigree , Point Mutation/genetics , Polymorphism, Single Nucleotide/genetics , Sequence Analysis, DNA
ABSTRACT
Memory CD4(+) T cells are central regulators of both humoral and cellular immune responses. T cell differentiation results in specific changes in chromatin structure and DNA methylation of cytokine genes. Although the methylation status of a limited number of gene loci in T cells has been examined, the genome-wide DNA methylation status of memory CD4(+) T cells remains unexplored. To further elucidate the molecular signature of memory T cells, we conducted methylome and transcriptome analyses of memory CD4(+) T cells generated using T cells from TCR-transgenic mice. The resulting genome-wide DNA methylation profile revealed 1144 differentially methylated regions (DMRs) across the murine genome during the process of T cell differentiation, 552 of which were associated with gene loci. Interestingly, the majority of these DMRs were located in introns. These DMRs included genes such as CXCR6, Tbox21, Chsy1, and Cish, which are associated with cytokine production, homing to bone marrow, and immune responses. Methylation changes in memory T cells exposed to specific Ag appeared to regulate enhancer activity rather than promoter activity of immunologically relevant genes. In addition, methylation profiles differed between memory T cell subsets, demonstrating a link between T cell methylation status and T cell differentiation. By comparing DMRs between naive and Ag-specific memory T cells, this study provides new insights into the functional status of memory T cells.
Subject(s)
CD4-Positive T-Lymphocytes/immunology , CD4-Positive T-Lymphocytes/metabolism , DNA Methylation/genetics , Epitopes, T-Lymphocyte/metabolism , Immunologic Memory/genetics , Animals , CD4-Positive T-Lymphocytes/cytology , Cell Differentiation/genetics , Cell Differentiation/immunology , Epitopes, T-Lymphocyte/immunology , Mice , Mice, Inbred BALB C , Mice, Knockout , Mice, Transgenic , Transcriptome
ABSTRACT
Teleosts comprise more than half of all vertebrate species and have adapted to a variety of marine and freshwater habitats. Their genome evolution and diversification are important subjects for the understanding of vertebrate evolution. Although draft genome sequences of two pufferfishes have been published, analysis of more fish genomes is desirable. Here we report a high-quality draft genome sequence of a small egg-laying freshwater teleost, medaka (Oryzias latipes). Medaka is native to East Asia and an excellent model system for a wide range of biology, including ecotoxicology, carcinogenesis, sex determination and developmental genetics. In the assembled medaka genome (700 megabases), which is less than half of the zebrafish genome, we predicted 20,141 genes, including approximately 2,900 new genes, using 5'-end serial analysis of gene expression tag information. We found single nucleotide polymorphisms (SNPs) at an average rate of 3.42% between the two inbred strains derived from two regional populations; this is the highest SNP rate seen in any vertebrate species. Analyses based on the dense SNP information show a strict genetic separation of 4 million years (Myr) between the two populations, and suggest that differential selective pressures acted on specific gene categories. Four-way comparisons with the human, pufferfish (Tetraodon), zebrafish and medaka genomes revealed that eight major interchromosomal rearrangements took place in a remarkably short period of approximately 50 Myr after the whole-genome duplication event in the teleost ancestor and afterwards, intriguingly, the medaka genome preserved its ancestral karyotype for more than 300 Myr.
Subject(s)
Evolution, Molecular , Genome/genetics , Oryzias/genetics , Animals , China , Chromosomes/genetics , Fish Proteins/genetics , Genomics , Humans , Japan , Oryzias/classification , Phylogeny , Polymorphism, Single Nucleotide/genetics , Sequence Homology, Nucleic Acid , Species Specificity , Taiwan , Time Factors
ABSTRACT
War is an extreme form of collective human behaviour characterized by coordinated violence. We show that this nature of war is substantiated in the temporal patterns of conflict occurrence, which obey a power law. The focal metric is the interconflict interval (ICI), the interval between the end of a conflict in a dyad (i.e. a pair of states) and the start of the subsequent conflict in the same dyad. Using elaborate statistical tests, we confirmed that ICI samples compiled from the history of interstate conflicts from 1816 to 2014 followed a power-law distribution. We then demonstrate that the power-law properties of ICIs can be explained by a hypothetical model assuming an information-theoretic formulation of the Clausewitz thesis on war: the use of force is a means of interstate communication. Our findings help us to understand the nature of wars between regular states, the significance of which has increased since the Russian invasion of Ukraine in 2022.
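As an illustration of what fitting a power law to interval data can look like, the sketch below applies the standard maximum-likelihood estimator for the exponent of a continuous power law to synthetic intervals. The abstract does not say which tests the authors used, so the estimator, the function names, and the synthetic data are all assumptions made for illustration.

```python
import math
import random


def power_law_alpha_mle(samples, x_min):
    """Maximum-likelihood exponent of a continuous power law
    p(x) ~ x**(-alpha), fitted to the samples with x >= x_min."""
    tail = [x for x in samples if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    return 1.0 + len(tail) / log_sum


def sample_power_law(alpha, x_min, n, rng):
    """Draw n power-law samples by inverse-transform sampling."""
    return [
        x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
        for _ in range(n)
    ]


# Synthetic "interconflict intervals" (NOT the historical data).
rng = random.Random(0)
synthetic_intervals = sample_power_law(alpha=2.5, x_min=1.0, n=20000, rng=rng)

alpha_hat = power_law_alpha_mle(synthetic_intervals, x_min=1.0)
print(f"estimated exponent: {alpha_hat:.2f} (true value 2.5)")
```

A point estimate of the exponent is only one part of such an analysis. Establishing that data follow a power law also requires a goodness-of-fit test and comparison against alternative heavy-tailed distributions.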
ABSTRACT
Genetic and phenotypic heterogeneity is considerable in adult-onset leukoencephalopathy, making comprehensive mutational analysis of the candidate genes by conventional methods too laborious. We applied exome sequencing to conduct a comprehensive mutational analysis of genes for autosomal dominant leukoencephalopathies. Genomic DNA samples from four patients of three families with autosomal dominantly inherited adult-onset leukodystrophy were subjected to exome sequencing. On the basis of the results, 21 patients with adult-onset sporadic leukodystrophy and one patient with pathologically proven hereditary diffuse leukoencephalopathy with spheroids (HDLS) were additionally screened for CSF1R mutations. Exome sequencing identified heterozygous CSF1R mutations (p.I794T and p.R777W) in two families. I794T has recently been reported as a causative mutation for HDLS, and R777W is a novel mutation. Although mutational analysis of CSF1R in 21 sporadic cases revealed no mutations, another novel CSF1R mutation, p.C653Y, was identified in one patient with autopsy-proven HDLS. These variants were located in the PTK domain, where the causative mutations cluster. Functional prediction of the mutant CSF1R as well as cross-species conservation of the affected amino acids supports the notion that these variants are pathogenic for HDLS. Exome sequencing is useful for a comprehensive mutational analysis of causative genes for hereditary leukoencephalopathies, and CSF1R should be considered a candidate gene for patients with autosomal dominant leukoencephalopathies.
Subject(s)
Genetic Predisposition to Disease , Leukoencephalopathies/genetics , Receptor, Macrophage Colony-Stimulating Factor/genetics , Adult , Aged , DNA Mutational Analysis , Exome/genetics , Family , Female , Genetic Variation , Humans , Male , Middle Aged , Mutation , Pedigree , Sequence Analysis, DNA
ABSTRACT
Posterior column ataxia with retinitis pigmentosa (PCARP) is an autosomal recessive neurodegenerative disorder characterized by retinitis pigmentosa and sensory ataxia. Previous studies of PCARP in two families showed a linkage to 1q31-q32. However, detailed investigations on the clinical presentations as well as molecular genetics of PCARP have been limited. Here, we describe a Japanese consanguineous family with PCARP. Two affected siblings suffered from childhood-onset retinitis pigmentosa and slowly progressive sensory ataxia. They also showed mild mental retardation, which has not been described in patients with PCARP. Parametric linkage analysis using high-density single nucleotide polymorphism arrays supported a linkage to the same locus. Target capture and high-throughput sequencing technologies revealed a novel homozygous c.1477G>C (G493R) mutation in FLVCR1, which cosegregated with the disease. A recent study has identified three independent mutations in FLVCR1 in the original and other families. Our results further confirmed that PCARP is caused by mutations in FLVCR1.
Subject(s)
Asian People/genetics , Ataxia/genetics , Family , Membrane Transport Proteins/genetics , Mutation, Missense , Receptors, Virus/genetics , Retinitis Pigmentosa/genetics , Adult , Amino Acid Sequence , Base Sequence , Female , Genetic Linkage , Humans , Male , Mutation, Missense/physiology , Pedigree
ABSTRACT
MachiBase (http://machibase.gi.k.u-tokyo.ac.jp/) provides a comprehensive and freely accessible resource regarding Drosophila melanogaster 5'-end mRNA transcription at different developmental states, supporting studies on the variabilities of promoter transcriptional activities and gene-expression profiles in the fruitfly. The data were generated in conjunction with the recently developed high-throughput genome sequencer Illumina/Solexa using a newly developed 5'-end mRNA collection method.
Subject(s)
5' Untranslated Regions , Databases, Genetic , Drosophila melanogaster/genetics , Transcription, Genetic , Animals , Drosophila melanogaster/embryology , Drosophila melanogaster/growth & development , Female , Gene Expression Profiling , Male , Sequence Tagged Sites , Transcription Initiation Site
ABSTRACT
The advent of high-throughput DNA sequencers has increased the pace of collecting enormous amounts of genomic information, yielding billions of nucleotides on a weekly basis. This advance represents an improvement of two orders of magnitude over traditional Sanger sequencers in terms of the number of nucleotides per unit time, allowing even small groups of researchers to obtain huge volumes of genomic data over a fairly short period. Consequently, a pressing need exists for the development of personalized genome browsers for analyzing these immense amounts of locally stored data. The UTGB (University of Tokyo Genome Browser) Toolkit is designed to meet three major requirements for personalization of genome browsers: easy installation of the system with minimal effort, browsing of locally stored data, and rapid interactive design of web interfaces tailored to individual needs. The UTGB Toolkit is licensed under an open source license. AVAILABILITY: The software is freely available at http://utgenome.org/.
Subject(s)
Genome , Genomics/methods , Sequence Analysis, DNA/methods , Software , Databases, Genetic , Internet , User-Computer Interface
ABSTRACT
Medaka (Oryzias latipes) is a small egg-laying freshwater teleost native to East Asia that has become an excellent model system for developmental genetics and evolutionary biology. The draft medaka genome sequence (700 Mb) was reported in June 2007, and its substantial genomic resources have been opened to the public through the University of Tokyo Genome Browser Medaka (UTGB/medaka) database. This database provides basic genomic information, such as predicted genes, expressed sequence tags (ESTs), guanine/cytosine (GC) content, repeats and comparative genomics, as well as unique data resources including (i) 2,473 genetic markers and experimentally confirmed PCR primers that amplify these markers, (ii) 142,414 bacterial artificial chromosome (BAC) and 217,344 fosmid end sequences that amount to 15.0- and 11.1-fold clone coverage of the entire genome, respectively, and were used for draft genome assembly, (iii) 16,519,460 single nucleotide polymorphisms (SNPs) and 2,859,905 insertions/deletions detected between two medaka inbred strain genomes and (iv) 841,235 5'-end serial analysis of gene expression (SAGE) tags that identified 344,266 transcription start sites on the genome. UTGB/medaka is available at: http://medaka.utgenome.org/.
Subject(s)
Databases, Genetic , Genomics , Oryzias/genetics , Animals , Chromosomes, Artificial, Bacterial , Gene Expression , Genetic Markers , Genetic Variation , Internet , Plasmids/genetics , Polymorphism, Single Nucleotide , Transcription Initiation Site , User-Computer Interface
ABSTRACT
OBJECTIVE: Glucocerebrosidase gene (GBA) variants that cause Gaucher disease are associated with Parkinson disease (PD) and dementia with Lewy bodies (DLB). To investigate the role of GBA variants in multiple system atrophy (MSA), we analyzed GBA variants in a large case-control series. METHODS: We sequenced coding regions and flanking splice sites of GBA in 969 MSA patients (574 Japanese, 223 European, and 172 North American) and 1509 control subjects (900 Japanese, 315 European, and 294 North American). We focused solely on Gaucher-disease-causing GBA variants. RESULTS: In the Japanese series, we found nine carriers among the MSA patients (1.65%) and eight carriers among the control subjects (0.89%). In the European series, we found three carriers among the MSA patients (1.35%) and two carriers among the control subjects (0.63%). In the North American series, we found five carriers among the MSA patients (2.91%) and one carrier among the control subjects (0.34%). Subjecting each series to a Mantel-Haenszel analysis yielded a pooled odds ratio (OR) of 2.44 (95% confidence interval [CI], 1.14-5.21) and a P-value of 0.029 without evidence of significant heterogeneity. Logistic regression analysis yielded similar results, with an adjusted OR of 2.43 (95% CI 1.15-5.37) and a P-value of 0.022. Subtype analysis showed that Gaucher-disease-causing GBA variants are significantly associated with MSA cerebellar subtype (MSA-C) patients (P = 7.3 × 10⁻³). INTERPRETATION: The findings indicate that, as in PD and DLB, Gaucher-disease-causing GBA variants are associated with MSA.
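The pooled odds ratio reported above can be approximately reproduced from the carrier counts in the abstract using the textbook Mantel-Haenszel estimator, as sketched below. The sketch assumes the full cohort sizes are the per-series denominators, which the abstract does not state explicitly, and `mantel_haenszel_or` is an illustrative helper rather than the authors' code.

```python
# Per series: (case carriers, total cases, control carriers, total controls).
# Counts are read from the abstract; using the full cohort sizes as
# denominators is an assumption.
strata = {
    "Japanese": (9, 574, 8, 900),
    "European": (3, 223, 2, 315),
    "North American": (5, 172, 1, 294),
}


def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio across 2x2 tables:
    sum(a*d/n) / sum(b*c/n) over the strata."""
    numerator = 0.0
    denominator = 0.0
    for case_carriers, n_cases, control_carriers, n_controls in strata.values():
        a = case_carriers                  # carriers among cases
        b = n_cases - case_carriers        # non-carriers among cases
        c = control_carriers               # carriers among controls
        d = n_controls - control_carriers  # non-carriers among controls
        n = n_cases + n_controls
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


pooled_or = mantel_haenszel_or(strata)
print(f"Mantel-Haenszel pooled OR = {pooled_or:.2f}")
```

Under these assumptions the estimate comes out close to the reported pooled OR of 2.44. The confidence interval and P-value need additional variance formulas that are not shown here.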
ABSTRACT
Massively parallel, tag-based sequencing systems, such as the SOLiD system, hold the promise of revolutionizing the study of whole genome gene expression due to the number of data points that can be generated in a simple and cost-effective manner. We describe the development of a 5'-end transcriptome workflow for the SOLiD system and demonstrate the advantages in sensitivity and dynamic range offered by this tag-based application over traditional approaches for the study of whole genome gene expression. 5'-end transcriptome analysis was used to study whole genome gene expression within a colon cancer cell line, HT-29, treated with the DNA methyltransferase inhibitor 5-aza-2'-deoxycytidine (5Aza). More than 20 million 25-base 5'-end tags were obtained from untreated and 5Aza-treated cells and matched to sequences within the human genome. Seventy-three percent of the mapped unique tags were associated with RefSeq cDNA sequences, corresponding to approximately 14,000 different protein-coding genes in this single cell type. The level of expression of these genes ranged from 0.02 to 4,704 transcripts per cell. The sensitivity of a single sequence run of the SOLiD platform was 100- to 1,000-fold greater than that observed from 5'-end SAGE data generated from the analysis of 70,000 tags obtained by Sanger sequencing. The high-resolution 5'-end gene expression profiling presented in this study will not only provide novel insight into the transcriptional machinery but should also serve as a basis for a better understanding of cell biology.
Subject(s)
5' Untranslated Regions , Gene Expression Profiling/instrumentation , Gene Expression , Sequence Analysis, DNA/instrumentation , 5' Untranslated Regions/genetics , Cell Cycle/physiology , Cell Line, Tumor , Exons , Gene Expression Profiling/methods , Gene Library , Humans , Introns , Molecular Sequence Data , Oligonucleotide Array Sequence Analysis , Reproducibility of Results , Sequence Analysis, DNA/methods
ABSTRACT
Might DNA sequence variation reflect germline genetic activity and underlying chromatin structure? We investigated this question using medaka (Japanese killifish, Oryzias latipes), by comparing the genomic sequences of two strains (Hd-rR and HNI) and by mapping approximately 37.3 million nucleosome cores from Hd-rR blastulae and 11,654 representative transcription start sites from six embryonic stages. We observed a distinctive approximately 200-base pair (bp) periodic pattern of genetic variation downstream of transcription start sites; the rate of insertions and deletions longer than 1 bp peaked at positions of approximately +200, +400, and +600 bp, whereas the point mutation rate showed corresponding valleys. This approximately 200-bp periodicity was correlated with the chromatin structure, with nucleosome occupancy minimized at positions 0, +200, +400, and +600 bp. These data exemplify the potential for genetic activity (transcription) and chromatin structure to contribute to molding the DNA sequence on an evolutionary time scale.