Results 1 - 20 of 43
1.
Sci Adv ; 10(16): eadi8419, 2024 Apr 19.
Article in English | MEDLINE | ID: mdl-38630824

ABSTRACT

We generated the Japanese Encyclopedia of Whole-Genome/Exome Sequencing Library (JEWEL), a high-depth whole-genome sequencing dataset comprising 3256 individuals from across Japan. Analysis of JEWEL revealed genetic characteristics of the Japanese population that were not discernible using microarray data. First, rare variant-based analysis revealed an unprecedented fine-scale genetic structure. Together with population genetics analysis, the present-day Japanese can be decomposed into three ancestral components. Second, we identified unreported loss-of-function (LoF) variants and observed that for specific genes, LoF variants appeared to be restricted to a more limited set of transcripts than would be expected by chance, with PTPRD as a notable example. Third, we identified 44 archaic segments linked to complex traits, including a Denisovan-derived segment at NKX6-1 associated with type 2 diabetes. Most of these segments are specific to East Asians. Fourth, we identified candidate genetic loci under recent natural selection. Overall, our work provided insights into genetic characteristics of the Japanese population.


Subject(s)
Type 2 Diabetes Mellitus, Humans, Japan, Genetic Selection, Whole Genome Sequencing, Exome
2.
Hum Genome Var ; 11(1): 18, 2024 Apr 17.
Article in English | MEDLINE | ID: mdl-38632226

ABSTRACT

Short- and long-read sequencing technologies are routinely used to detect DNA variants, including SNVs, indels, and structural variations (SVs). However, the differences in the quality and quantity of variants detected between short- and long-read data are not fully understood. In this study, we comprehensively evaluated the variant calling performance of short- and long-read-based SNV, indel, and SV detection algorithms (6 for SNVs, 12 for indels, and 13 for SVs) using a novel evaluation framework incorporating manual visual inspection. The results showed that indel-insertion calls greater than 10 bp were poorly detected by short-read-based detection algorithms compared to long-read-based algorithms; however, the recall and precision of SNV and indel-deletion detection were similar between short- and long-read data. The recall of SV detection with short-read-based algorithms was significantly lower in repetitive regions, especially for small- to intermediate-sized SVs, than that detected with long-read-based algorithms. In contrast, the recall and precision of SV detection in nonrepetitive regions were similar between short- and long-read data. These findings suggest the need for refined strategies, such as incorporating multiple variant detection algorithms, to generate a more complete set of variants using short-read data.
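
The recall and precision figures discussed in this abstract can be illustrated with a minimal Python sketch. It assumes variants are compared by exact match on a (chromosome, position, ref, alt) key; the function name and the example variants are hypothetical and are not taken from the study, whose evaluation framework also involved manual visual inspection.

```python
def precision_recall(called, truth):
    """Return (precision, recall) of a call set against a truth set.

    Assumption: a call counts as a true positive only if it matches a
    truth variant exactly; real benchmarks often allow tolerant matching.
    """
    called, truth = set(called), set(truth)
    true_positives = len(called & truth)
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical variants keyed as (chromosome, position, ref, alt).
truth = {
    ("chr1", 100, "A", "G"),
    ("chr1", 250, "C", "T"),
    ("chr2", 40, "G", "GTTA"),
    ("chr2", 90, "T", "C"),
}
short_read_calls = {
    ("chr1", 100, "A", "G"),
    ("chr1", 250, "C", "T"),
    ("chr2", 75, "A", "C"),  # not in the truth set: a false positive
}

precision, recall = precision_recall(short_read_calls, truth)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case two of the three calls are correct and two of the four true variants are found, so a missed insertion lowers recall without affecting precision.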

3.
Cell Genom ; 3(6): 100328, 2023 Jun 14.
Article in English | MEDLINE | ID: mdl-37388916

ABSTRACT

Genomic structural variation (SV) affects genetic and phenotypic characteristics in diverse organisms, but the lack of reliable methods to detect SV has hindered genetic analysis. We developed a computational algorithm (MOPline) that includes missing call recovery combined with high-confidence SV call selection and genotyping using short-read whole-genome sequencing (WGS) data. Using 3,672 high-coverage WGS datasets, MOPline stably detected ∼16,000 SVs per individual, which is ∼1.7-3.3-fold higher than in previous large-scale projects, while exhibiting a comparable level of statistical quality metrics. We imputed SVs from 181,622 Japanese individuals for 42 diseases and 60 quantitative traits. A genome-wide association study with the imputed SVs revealed 41 top-ranked or nearly top-ranked genome-wide significant SVs, including 8 exonic SVs with 5 novel associations and enriched mobile element insertions. This study demonstrates that short-read WGS data can be used to identify rare and common SVs associated with a variety of traits.

4.
Hum Mol Genet ; 32(12): 2046-2054, 2023 06 05.
Article in English | MEDLINE | ID: mdl-36905328

ABSTRACT

Von Hippel-Lindau (VHL) disease is an autosomal dominant, inherited syndrome with variants in the VHL gene, causing predisposition to multi-organ neoplasms with vessel abnormality. Germline variants in VHL can be detected in 80-90% of patients clinically diagnosed with VHL disease. Here, we summarize the results of genetic tests for 206 Japanese VHL families, and elucidate the molecular mechanisms of VHL disease, especially in variant-negative unsolved cases. Of the 206 families, genetic diagnosis was positive in 175 families (85%), including 134 families (65%) diagnosed by exon sequencing (15 novel variants) and 41 (20%) diagnosed by multiplex ligation-dependent probe amplification (MLPA) (one novel variant). The deleterious variants were significantly enriched in VHL disease Type 1. Interestingly, five synonymous or non-synonymous variants within exon 2 caused exon 2 skipping, which is the first report of exon 2 skipping caused by several missense variants. Whole-genome and targeted deep sequencing of 22 unsolved cases with no variant identified revealed three cases with VHL mosaicism (variant allele frequency: 2.5-22%), one with a mobile element insertion in the VHL promoter region, and two with a pathogenic variant of BAP1 or SDHB. The variants associated with VHL disease are heterogeneous, and more accurate genetic diagnosis of VHL disease requires comprehensive genome and DNA/RNA analyses to detect VHL mosaicism, complex structural variants and variants in other related genes.


Subject(s)
Von Hippel-Lindau Disease, Humans, Von Hippel-Lindau Disease/genetics, Von Hippel-Lindau Disease/diagnosis, Japan, DNA Mutational Analysis, Von Hippel-Lindau Tumor Suppressor Protein/genetics, Genomics, Pedigree
5.
Plant Direct ; 5(10): e352, 2021 Oct.
Article in English | MEDLINE | ID: mdl-34646975

ABSTRACT

Wild plants are often tolerant to biotic and abiotic stresses in their natural environments, whereas domesticated plants such as crops frequently lack such resilience. This difference is thought to be due to the high levels of genome heterozygosity in wild plant populations and the low levels of heterozygosity in domesticated crop species. In this study, common vetch (Vicia sativa) was used as a model to examine this hypothesis. The common vetch genome (2n = 14) was estimated as 1.8 Gb in size. Genome sequencing produced a reference assembly that spanned 1.5 Gb, from which 31,146 genes were predicted. Using this sequence as a reference, 24,118 single nucleotide polymorphisms were discovered in 1243 plants from 12 natural common vetch populations in Japan. Common vetch genomes exhibited high heterozygosity at the population level, with lower levels of heterozygosity observed at specific genome regions. Such patterns of heterozygosity are thought to be essential for adaptation to different environments. The resources generated in this study will provide insights into de novo domestication of wild plants and agricultural enhancement.

6.
Am J Med Genet A ; 185(5): 1468-1480, 2021 05.
Article in English | MEDLINE | ID: mdl-33624935

ABSTRACT

Intellectual disability (ID) is characterized by significant limitations in both intellectual functioning and adaptive behaviors, originating before the age of 18 years. However, the genetic etiologies of ID are still incompletely elucidated due to the wide range of clinical and genetic heterogeneity. Whole genome sequencing (WGS) has been applied as a single-step clinical diagnostic tool for ID because it detects genetic variations with a wide range of resolution from single nucleotide variants (SNVs) to structural variants (SVs). To explore the causative genes for ID, we employed WGS in 45 patients from 44 unrelated Japanese families and performed a stepwise screening approach focusing on the coding variants in the genes. Here, we report 12 pathogenic and likely pathogenic variants: seven heterozygous variants of ADNP, SATB2, ANKRD11, PTEN, TCF4, SPAST, and KCNA2, three hemizygous variants of SMS, SLC6A8, and IQSEC2, and one homozygous variant in AGTPBP1. Of these, four were considered novel. Furthermore, a novel 76 kb deletion containing exons 1 and 2 in DYRK1A was identified. We confirmed the clinical and genetic heterogeneity and high frequency of de novo causative variants (8/12, 66.7%). This is the first report of WGS analysis in Japanese patients with ID. Our results would provide insight into the correlation between novel variants and expanded phenotypes of the disease.


Subject(s)
Genetic Predisposition to Disease, Intellectual Disability/genetics, Protein Serine-Threonine Kinases/genetics, Protein-Tyrosine Kinases/genetics, Adolescent, Genetic Heterogeneity, Human Genome/genetics, Heterozygote, Homeodomain Proteins/genetics, Homozygote, Humans, Intellectual Disability/epidemiology, Intellectual Disability/pathology, Japan/epidemiology, Male, Whole Genome Sequencing, Dyrk Kinases
7.
PLoS Genet ; 16(8): e1008915, 2020 08.
Article in English | MEDLINE | ID: mdl-32776928

ABSTRACT

Sequences homologous to human herpesvirus 6 (HHV-6) are integrated within the nuclear genome of about 1% of humans, but it is not clear how this came about. It is also uncertain whether integrated HHV-6 can reactivate into an infectious virus. HHV-6 integrates into telomeres, and this has recently been associated with polymorphisms affecting MOV10L1. MOV10L1 is located on the subtelomere of chromosome 22q (chr22q) and is required to make PIWI-interacting RNAs (piRNAs). As piRNAs block germline integration of transposons, piRNA-mediated repression of HHV-6 integration has been proposed to explain this association. In vitro, recombination of the HHV-6 genome along its terminal direct repeats (DRs) leads to excision from the telomere and viral reactivation, but the expected "solo-DR scar" has not been described in vivo. Here we screened for integrated HHV-6 in 7,485 Japanese subjects using whole-genome sequencing (WGS). Integrated HHV-6 was associated with polymorphisms on chr22q. However, in contrast to prior work, we find that the reported MOV10L1 polymorphism is physically linked to an ancient endogenous HHV-6A variant integrated into the telomere of chr22q in East Asians. Unexpectedly, an HHV-6B variant has also endogenized in chr22q; two endogenous HHV-6 variants at this locus thus account for 72% of all integrated HHV-6 in Japan. We also report human genomes carrying only one portion of the HHV-6B genome, a solo-DR, supporting in vivo excision and possible viral reactivation. Together these results explain the recently-reported association between integrated HHV-6 and MOV10L1/piRNAs, suggest potential exaptation of HHV-6 in its coevolution with human chr22q, and clarify the evolution and risk of reactivation of the only intact (non-retro)viral genome known to be present in human germlines.


Subject(s)
Human Genome, Human Herpesvirus 6/genetics, Virus Integration, Asian People/genetics, Human Chromosome Pair 22/genetics, Molecular Evolution, Germ-Line Mutation, Humans, Single Nucleotide Polymorphism, Small Interfering RNA/genetics
8.
Genome Biol ; 21(1): 119, 2020 05 18.
Article in English | MEDLINE | ID: mdl-32423416

ABSTRACT

Recent advances in long-read sequencing solve inaccuracies in alternative transcript identification of full-length transcripts in short-read RNA-Seq data, which encourages the development of methods for isoform-centered functional analysis. Here, we present tappAS, the first framework to enable a comprehensive Functional Iso-Transcriptomics (FIT) analysis, which is effective at revealing the functional impact of context-specific post-transcriptional regulation. tappAS uses isoform-resolved annotation of coding and non-coding functional domains, motifs, and sites, in combination with novel analysis methods to interrogate different aspects of the functional readout of transcript variants and isoform regulation. tappAS software and documentation are available at https://app.tappas.org.


Subject(s)
Alternative Splicing, Gene Expression Profiling/methods, Protein Isoforms/metabolism, Software, Animals, Mice, Oligodendrocyte Precursor Cells/metabolism, Polyadenylation
9.
JCO Precis Oncol ; 4: 183-191, 2020 Nov.
Article in English | MEDLINE | ID: mdl-35050733

ABSTRACT

PURPOSE: We investigated the prevalence and spectrum of pathogenic germline variants in patients with early-onset colorectal cancer (CRC), breast cancer (BC), and prostate cancer (PCA) in the Japanese population. We also identified pathogenic variants in other cancer risk genes, giving consideration to future multigene testing panels for this population. METHODS: We performed whole-genome sequencing for 1,037 Japanese individuals, including patients with early-onset CRC (n = 196), BC (n = 237), and PCA (n = 215) and controls (n = 389). We screened for pathogenic variants, including single nucleotide variants and copy number variants, among well-established first-tier cancer genes for each cancer type and examined an expanded second-tier panel including cancer-predisposing genes from the Cancer Gene Census. RESULTS: Proportions of patients with germline pathogenic variants differed by cancer subgroup, with the highest in BC (14.8%), followed by CRC (9.2%) and PCA (3.7%). In contrast, 2 of 389 control subjects (0.5%) carried a germline pathogenic variant. In comparison with controls, the proportion of patients with pathogenic variants in the second-tier panel was increased significantly for PCA (3.7% to 11.6%, P = 2.96 × 10⁻⁴), but not for CRC or BC, after multitesting adjustment. In patients with PCA, DNA repair pathway genes in the extended panel often contained pathogenic variants (P = .011). CONCLUSION: Our analyses support the clinical usefulness of established cancer gene panels in the Japanese population for 3 major cancer types. Additional genes, especially those involved in DNA repair, might be considered for developing multipanel testing in Japanese patients with early-onset PCA.

10.
Article in English | MEDLINE | ID: mdl-31444167

ABSTRACT

Intellectual disability (ID) is a clinically and genetically heterogeneous developmental brain disorder. The present study describes two male siblings, aged 7 yr and 1 yr, with severe ID, spastic quadriplegia, nystagmus, and brain atrophy with acquired microcephaly. We used exome sequencing to identify the causative gene in the patients and identified a hemizygous missense variant, c.1282T>A (p.W428R), in the p21-activated serine/threonine kinase 3 gene (PAK3), which is associated with X-linked ID. p.W428R is located within the highly conserved kinase domain and was predicted to induce loss of enzymatic function by three mutation prediction tools (SIFT, PolyPhen-2, and MutationTaster). In addition, this variant has not been reported in public databases (as of the middle of December 2018) or in the data from 3275 individuals of the Japanese general population analyzed using high-depth whole-genome sequencing. To date, only 13 point mutations and deletions in PAK3 in ID have been reported. The literature review illustrated a phenotypic spectrum of PAK3 pathogenic variants, and our cases represented the most severe form of the PAK3-associated phenotypes. This is the first report of a PAK3 pathogenic variant in Japanese patients with X-linked ID.


Subject(s)
X-Linked Intellectual Disability/genetics, p21-Activated Kinases/genetics, Child, Developmental Disabilities/genetics, Exome, X-Linked Genes/genetics, Genetic Association Studies, Humans, Infant, Intellectual Disability/genetics, Intellectual Disability/metabolism, Japan, Male, X-Linked Intellectual Disability/metabolism, Microcephaly/genetics, Mutation, Missense Mutation/genetics, Pedigree, Phenotype, Siblings, Exome Sequencing/methods, p21-Activated Kinases/metabolism
11.
Genome Biol ; 20(1): 117, 2019 06 03.
Article in English | MEDLINE | ID: mdl-31159850

ABSTRACT

BACKGROUND: Structural variations (SVs) or copy number variations (CNVs) greatly impact the functions of the genes encoded in the genome and are responsible for diverse human diseases. Although a number of existing SV detection algorithms can detect many types of SVs using whole genome sequencing (WGS) data, no single algorithm can call every type of SVs with high precision and high recall. RESULTS: We comprehensively evaluate the performance of 69 existing SV detection algorithms using multiple simulated and real WGS datasets. The results highlight a subset of algorithms that accurately call SVs depending on specific types and size ranges of the SVs and that accurately determine breakpoints, sizes, and genotypes of the SVs. We enumerate potential good algorithms for each SV category, among which GRIDSS, Lumpy, SVseq2, SoftSV, Manta, and Wham are better algorithms in deletion or duplication categories. To improve the accuracy of SV calling, we systematically evaluate the accuracy of overlapping calls between possible combinations of algorithms for every type and size range of SVs. The results demonstrate that both the precision and recall for overlapping calls vary depending on the combinations of specific algorithms rather than the combinations of methods used in the algorithms. CONCLUSION: These results suggest that careful selection of the algorithms for each type and size range of SVs is required for accurate calling of SVs. The selection of specific pairs of algorithms for overlapping calls promises to effectively improve the SV detection accuracy.


Subject(s)
Genomic Structural Variation, Genomics/methods, Whole Genome Sequencing, Algorithms, Humans
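
The idea of taking overlapping calls between a pair of SV detection algorithms, as described in the abstract above, can be sketched in Python. This is an illustration under stated assumptions, not the study's evaluation code: the 50% reciprocal-overlap threshold is a common convention assumed here, and the `SV` record, function names, and example calls are hypothetical.

```python
from typing import NamedTuple


class SV(NamedTuple):
    chrom: str
    start: int
    end: int
    svtype: str  # e.g. "DEL" or "DUP"


def reciprocal_overlap(a: SV, b: SV) -> float:
    """Fraction of the longer call covered by the shared interval.

    Returns 0.0 when the calls differ in chromosome or type, or do not
    overlap. Taking the minimum of both fractions makes it reciprocal.
    """
    if a.chrom != b.chrom or a.svtype != b.svtype:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    return min(shared / (a.end - a.start), shared / (b.end - b.start))


def overlapping_calls(calls_a, calls_b, min_overlap=0.5):
    """Keep calls from caller A that are supported by any call from caller B."""
    return [
        a for a in calls_a
        if any(reciprocal_overlap(a, b) >= min_overlap for b in calls_b)
    ]


# Hypothetical output of two callers on the same sample.
caller_a = [SV("chr1", 1000, 2000, "DEL"), SV("chr1", 5000, 5600, "DUP"), SV("chr2", 300, 900, "DEL")]
caller_b = [SV("chr1", 1100, 2050, "DEL"), SV("chr1", 5000, 5100, "DUP")]

consensus = overlapping_calls(caller_a, caller_b)
print(consensus)
```

Only the chr1 deletion survives: the duplication calls share just 100 bp of a 600 bp call, and the chr2 deletion has no counterpart. Requiring agreement in this way typically trades recall for precision, which is why the choice of algorithm pair matters.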
12.
Hum Genome Var ; 6: 1, 2019.
Article in English | MEDLINE | ID: mdl-30534410

ABSTRACT

Dandy-Walker malformation (DWM) is a rare congenital malformation defined by hypoplasia of the cerebellar vermis and cystic dilatation of the fourth ventricle. The oligophrenin-1 gene (OPHN1) is mutated in X-linked intellectual disability with or without cerebellar hypoplasia. Here, we report a Japanese DWM patient carrying a novel intragenic 13.5-kb deletion in OPHN1 spanning exons 11-15. This is the first report of an OPHN1 deletion in a Japanese patient with DWM.

13.
BMC Genomics ; 17: 370, 2016 05 18.
Article in English | MEDLINE | ID: mdl-27194050

ABSTRACT

BACKGROUND: Magnaporthe oryzae (anamorph Pyricularia oryzae) is the causal agent of blast disease of Poaceae crops and their wild relatives. To understand the genetic mechanisms that drive host specialization of M. oryzae, we carried out whole genome resequencing of four M. oryzae isolates from rice (Oryza sativa), one from foxtail millet (Setaria italica), three from wild foxtail millet S. viridis, and one isolate each from finger millet (Eleusine coracana), wheat (Triticum aestivum) and oat (Avena sativa), in addition to an isolate of a sister species, M. grisea, that infects the wild grass Digitaria sanguinalis. RESULTS: Whole genome sequence comparison confirmed that M. oryzae Oryza and Setaria isolates form a monophyletic group that is close to another monophyletic group consisting of isolates from Triticum and Avena. This supports previous phylogenetic analysis based on a small number of genes and molecular markers. When comparing the host-specific subgroups, 1.2-3.5% of genes showed presence/absence polymorphisms and 0-6.5% showed an excess of non-synonymous substitutions. Most of these genes encoded proteins whose functional domains are present in multiple copies in each genome. Therefore, the deleterious effects of these mutations could potentially be compensated by functional redundancy. Unlike the accumulation of non-synonymous nucleotide substitutions, gene loss appeared to be independent of divergence time. Interestingly, the loss and gain of genes in pathogens from the Oryza- and Setaria-infecting lineages occurred more frequently than in those infecting Triticum and Avena, even though the genetic distance between the Oryza and Setaria lineages was smaller than that between the Triticum and Avena lineages. In addition, genes showing gain/loss and nucleotide polymorphisms are linked to transposable elements, highlighting the relationship between genome position and gene evolution in this pathogen species.
CONCLUSION: Our comparative genomics analyses of host-specific M. oryzae isolates revealed gain and loss of genes as a major evolutionary mechanism driving specialization to Oryza and Setaria. Transposable elements appear to facilitate gene evolution possibly by enhancing chromosomal rearrangements and other forms of genetic variation.


Subject(s)
DNA Transposable Elements, Fungal Genes, Genetic Variation, Host-Pathogen Interactions, Magnaporthe/genetics, Chromosome Mapping, Fungal Chromosomes, Molecular Evolution, Fungal Genome, Genomics/methods, Magnaporthe/classification, Mutation, Phylogeny
15.
DNA Res ; 23(2): 171-80, 2016 Apr.
Article in English | MEDLINE | ID: mdl-26975196

ABSTRACT

Zoysia is a warm-season turfgrass, which comprises 11 allotetraploid species (2n = 4x = 40), each possessing different morphological and physiological traits. To characterize the genetic systems of Zoysia plants and to analyse their structural and functional differences in individual species and accessions, we sequenced the genomes of Zoysia species using HiSeq and MiSeq platforms. As a reference sequence of Zoysia species, we generated a high-quality draft sequence of the genome of Z. japonica accession 'Nagirizaki' (334 Mb) in which 59,271 protein-coding genes were predicted. In parallel, draft genome sequences of Z. matrella 'Wakaba' and Z. pacifica 'Zanpa' were also generated for comparative analyses. To investigate the genetic diversity among the Zoysia species, genome sequence reads of three additional accessions, Z. japonica 'Kyoto', Z. japonica 'Miyagi' and Z. matrella 'Chiba Fair Green', were accumulated, and aligned against the reference genome of 'Nagirizaki' along with those from 'Wakaba' and 'Zanpa'. As a result, we detected 7,424,163 single-nucleotide polymorphisms and 852,488 short indels among these species. The information obtained in this study will be valuable for basic studies on zoysiagrass evolution and genetics as well as for the breeding of zoysiagrasses, and is made available in the 'Zoysia Genome Database' at http://zoysia.kazusa.or.jp.


Subject(s)
Genetic Variation, Plant Genome, Poaceae/genetics, DNA Sequence Analysis, Base Sequence
16.
Bioinformatics ; 31(23): 3733-41, 2015 Dec 01.
Article in English | MEDLINE | ID: mdl-26261222

ABSTRACT

MOTIVATION: Genome assemblies generated with next-generation sequencing (NGS) reads usually contain a number of gaps. Several tools have recently been developed to close the gaps in these assemblies with NGS reads. Although these gap-closing tools efficiently close the gaps, they entail a high rate of misassembly at gap-closing sites. RESULTS: We have found that the assembly error rates caused by these tools are 20-500-fold higher than the rate of errors introduced into contigs by de novo assemblers. We here describe GMcloser, a tool that accurately closes these gaps with a preassembled contig set or a long read set (i.e., error-corrected PacBio reads). GMcloser uses likelihood-based classifiers calculated from the alignment statistics between scaffolds, contigs and paired-end reads to correctly assign contigs or long reads to gap regions of scaffolds, thereby achieving accurate and efficient gap closure. We demonstrate with sequencing data from various organisms that the gap-closing accuracy of GMcloser is 3-100-fold higher than those of other available tools, with similar efficiency. AVAILABILITY AND IMPLEMENTATION: GMcloser and an accompanying tool (GMvalue) for evaluating the assembly and correcting misassemblies except SNPs and short indels in the assembly are available at https://sourceforge.net/projects/gmcloser/. CONTACT: shunichi.kosugi@riken.jp. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Contig Mapping/methods, High-Throughput Nucleotide Sequencing/methods, Sequence Alignment, DNA Sequence Analysis/methods, Software, Likelihood Functions
18.
Nat Genet ; 47(4): 405-9, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25751626

ABSTRACT

In Batesian mimicry, animals avoid predation by resembling distasteful models. In the swallowtail butterfly Papilio polytes, only mimetic-form females resemble the unpalatable butterfly Pachliopta aristolochiae. A recent report showed that a single gene, doublesex (dsx), controls this mimicry; however, the detailed molecular mechanisms remain unclear. Here we determined two whole-genome sequences of P. polytes and a related species, Papilio xuthus, identifying a single ∼130-kb autosomal inversion, including dsx, between mimetic (H-type) and non-mimetic (h-type) chromosomes in P. polytes. This inversion is associated with the mimicry-related locus H, as identified by linkage mapping. Knockdown experiments demonstrated that female-specific dsx isoforms expressed from the inverted H allele (dsx(H)) induce mimetic coloration patterns and simultaneously repress non-mimetic patterns. In contrast, dsx(h) does not alter mimetic patterns. We propose that dsx(H) switches the coloration of predetermined wing patterns and that female-limited polymorphism is tightly maintained by chromosomal inversion.


Subject(s)
Biological Adaptation/genetics, Butterflies/anatomy & histology, Butterflies/genetics, Animal Wings/anatomy & histology, Animals, Base Sequence, Escape Reaction, Female, Food Chain, Insect Genome, Molecular Sequence Data, Phylogeny, Sex Factors
19.
PLoS Comput Biol ; 10(9): e1003841, 2014 Sep.
Article in English | MEDLINE | ID: mdl-25233087

ABSTRACT

The nuclear export of proteins is regulated largely through the exportin/CRM1 pathway, which involves the specific recognition of leucine-rich nuclear export signals (NESs) in the cargo proteins, and modulates nuclear-cytoplasmic protein shuttling by antagonizing the nuclear import activity mediated by importins and the nuclear import signal (NLS). Although the prediction of NESs can help to define proteins that undergo regulated nuclear export, current methods of predicting NESs, including computational tools and consensus-sequence-based searches, have limited accuracy, especially in terms of their specificity. We found that each residue within an NES largely contributes independently and additively to the entire nuclear export activity. We created activity-based profiles of all classes of NESs with a comprehensive mutational analysis in mammalian cells. The profiles highlight a number of specific activity-affecting residues not only at the conserved hydrophobic positions but also in the linker and flanking regions. We then developed a computational tool, NESmapper, to predict NESs by using profiles that had been further optimized by training and combining the amino acid properties of the NES-flanking regions. This tool successfully reduced the considerable number of false positives, and the overall prediction accuracy was higher than that of other methods, including NESsential and Wregex. This profile-based prediction strategy is a reliable way to identify functional protein motifs. NESmapper is available at http://sourceforge.net/projects/nesmapper.


Subject(s)
Computational Biology/methods, Leucine/chemistry, Nuclear Export Signals/genetics, Protein Sequence Analysis/methods, Cell Nucleus Active Transport/genetics, Cell Nucleus Active Transport/physiology, Amino Acid Sequence, Animals, DNA Mutational Analysis, Leucine/genetics, Mice, Molecular Sequence Data, NIH 3T3 Cells, Nuclear Export Signals/physiology, ROC Curve, Software
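
The abstract above reports that each residue contributes largely independently and additively to export activity. That premise lends itself to profile-based scoring, sketched below in Python. This is a toy illustration only: the four-position profile, its scores, the default penalty, and the function names are all hypothetical, and it does not reproduce NESmapper's actual profiles, training, or flanking-region model.

```python
# Hypothetical per-position residue scores for a 4-residue toy motif.
TOY_PROFILE = [
    {"L": 2.0, "I": 1.5, "V": 1.0},
    {"E": 0.5, "D": 0.5},
    {"L": 2.0, "F": 1.5, "M": 1.0},
    {"L": 2.0, "I": 1.5},
]
DEFAULT_SCORE = -1.0  # assumed penalty for residues absent from the profile


def window_score(window, profile=TOY_PROFILE):
    """Additive score: sum the independent contribution of each position."""
    return sum(
        position.get(residue, DEFAULT_SCORE)
        for position, residue in zip(profile, window)
    )


def best_window(sequence, profile=TOY_PROFILE):
    """Slide the profile along the sequence; return (score, start) of the best hit."""
    width = len(profile)
    scored = [
        (window_score(sequence[i:i + width], profile), i)
        for i in range(len(sequence) - width + 1)
    ]
    return max(scored)  # raises ValueError if the sequence is shorter than the profile


score, start = best_window("MKLELIGS")
print(score, start)
```

Because the score is a plain sum over positions, a single substitution changes the total by exactly the difference between two profile entries, which is what makes an activity-based profile measurable by mutational analysis.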
20.
DNA Res ; 21(2): 169-81, 2014.
Article in English | MEDLINE | ID: mdl-24282021

ABSTRACT

Cultivated strawberry (Fragaria x ananassa) is octoploid and shows allogamous behaviour. The present study aims at dissecting this octoploid genome through comparison with its wild relatives, F. iinumae, F. nipponica, F. nubicola, and F. orientalis, by de novo whole-genome sequencing on Illumina and Roche 454 platforms. The total length of the assembled Illumina genome sequences obtained was 698 Mb for F. x ananassa, and ∼200 Mb each for the four wild species. Subsequently, a virtual reference genome termed FANhybrid_r1.2 was constructed by integrating the sequences of the four homoeologous subgenomes of F. x ananassa, from which heterozygous regions in the Roche 454 and Illumina genome sequences were eliminated. The total length of FANhybrid_r1.2 thus created was 173.2 Mb with an N50 length of 5137 bp. The Illumina-assembled genome sequences of F. x ananassa and the four wild species were then mapped onto the reference genome, along with the previously published F. vesca genome sequence, to establish the subgenomic structure of F. x ananassa. The strategy adopted in this study has turned out to be successful in dissecting the genome of octoploid F. x ananassa and appears promising when applied to the analysis of other polyploid plant species.


Subject(s)
Fragaria/genetics, Plant Genome, Microsatellite Repeats, Phylogeny, Polyploidy, DNA Sequence Analysis
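
The N50 length quoted in the abstract above is a standard assembly contiguity statistic: the length of the sequence at which the running total, taken from longest to shortest, first reaches half of the assembly size. A minimal Python sketch follows; the function name and the contig lengths are hypothetical and unrelated to the FANhybrid_r1.2 data.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no sequence lengths given")


# Hypothetical contig lengths in bp (total 20,000 bp, so the target is 10,000 bp).
contig_lengths = [8000, 5000, 4000, 2000, 1000]
print(n50(contig_lengths))
```

Here the longest contig alone covers 8,000 bp, and adding the second brings the running total past the halfway mark, so the N50 is the second contig's length.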