Results 1 - 20 of 57
1.
Genome Res ; 33(3): 435-447, 2023 03.
Article in English | MEDLINE | ID: mdl-37307504

ABSTRACT

Tandem repeats (TRs) are one of the largest sources of polymorphism, and their length is associated with gene regulation. Although previous studies reported several tandem repeats regulating gene splicing in cis (spl-TRs), no large-scale study has been conducted. In this study, we established a genome-wide catalog of 9537 spl-TRs with a total of 58,290 significant TR-splicing associations across 49 tissues (false discovery rate 5%) by using Genotype-Tissue Expression (GTEx) Project data. Regression models explaining splicing variation by using spl-TRs and other flanking variants suggest that at least some of the spl-TRs directly modulate splicing. In our catalog, two spl-TRs are known loci for repeat expansion diseases, spinocerebellar ataxia 6 (SCA6) and 12 (SCA12). Splicing alterations by these spl-TRs were compatible with those observed in SCA6 and SCA12. Thus, our comprehensive spl-TR catalog may help elucidate the pathomechanism of genetic diseases.


Subject(s)
Genetic Engineering , RNA Splicing , Humans , Polymorphism, Genetic , Tandem Repeat Sequences
2.
Genomics ; 116(5): 110894, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39019410

ABSTRACT

Technologies for detecting structural variation (SV) have advanced with the advent of long-read sequencing, which enables the validation of SVs at the nucleotide level. Optical genome mapping (OGM), a technology based on physical mapping, can also provide comprehensive SV analysis. We applied long-read whole genome sequencing (LRWGS) to accurately reconstruct breakpoint (BP) segments in a patient with complex chromosome 6q rearrangements that remained elusive by conventional karyotyping. Although all BPs were precisely identified by LRWGS, there were two possible ways to construct the BP segments in terms of their order and orientation. Thus, we also used OGM analysis. Notably, OGM recognized entire inversions exceeding 500 kb in size, which LRWGS could not characterize. Consequently, here we successfully unveil the full genomic structure of this complex chromosomal 6q rearrangement and cryptic SVs through combined long-molecule genomic analyses, showcasing how LRWGS and OGM can complement each other in SV analysis.


Subject(s)
Chromosomes, Human, Pair 6 , Humans , Chromosomes, Human, Pair 6/genetics , Genomics/methods , Whole Genome Sequencing/methods , Male , Genomic Structural Variation , Chromosome Mapping/methods , Chromosome Breakpoints
3.
J Hum Genet ; 69(3-4): 153-157, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38216729

ABSTRACT

Aromatic L-amino acid decarboxylase (AADC) deficiency is an autosomal recessive neurotransmitter disorder caused by pathogenic DOPA decarboxylase (DDC) variants. We previously reported Japanese siblings with AADC deficiency, which was confirmed by the lack of enzyme activity; however, only a heterozygous missense variant was detected. We therefore performed targeted long-read sequencing by adaptive sampling to identify any missing variants. Haplotype phasing and variant calling identified a novel deep intronic variant (c.714+255C>A), which was predicted to potentially activate a noncanonical splice acceptor site. Minigene assay revealed that wild-type and c.714+255C>A alleles had different impacts on splicing. Three transcripts, including the canonical transcript, were detected from the wild-type allele, but only the noncanonical cryptic exon was produced from the variant allele, indicating that c.714+255C>A was pathogenic. Targeted long-read sequencing may be used to detect hidden pathogenic variants in unresolved autosomal recessive cases with only one disclosed hit variant.


Subject(s)
Amino Acid Metabolism, Inborn Errors , Aromatic-L-Amino-Acid Decarboxylases/deficiency , Dopa Decarboxylase , Humans , Dopa Decarboxylase/genetics , Amino Acid Metabolism, Inborn Errors/genetics , Introns , Mutation, Missense
4.
J Hum Genet ; 69(2): 85-90, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38030753

ABSTRACT

Ubiquitin-specific protease 8 (USP8) is a deubiquitinating enzyme involved in deubiquitinating the enhanced epidermal growth factor receptor for escape from degradation. Somatic variants at a hotspot in USP8 are a cause of Cushing's disease, and a de novo germline USP8 variant at this hotspot has been described only once previously, in a girl with Cushing's disease and developmental delay. In this study, we investigated an exome-negative patient with severe developmental delay, dysmorphic features, and multiorgan dysfunction by long-read sequencing, and identified a 22-kb de novo germline deletion within USP8 (chr15:50469966-50491995 [GRCh38]). The deletion involved the variant hotspot, one rhodanese domain, and two SH3 binding motifs, and was presumed to be generated through nonallelic homologous recombination through Alu elements. Thus, the patient may have perturbation of the endosomal sorting system and mitochondrial autophagy through the USP8 defect. This is the second reported case of a germline variant in USP8.


Subject(s)
Pituitary ACTH Hypersecretion , Female , Humans , Endopeptidases/genetics , Endosomal Sorting Complexes Required for Transport/genetics , Endosomal Sorting Complexes Required for Transport/metabolism , Germ Cells/metabolism , Germ-Line Mutation/genetics , Pituitary ACTH Hypersecretion/metabolism , Ubiquitin Thiolesterase/genetics , Ubiquitin Thiolesterase/metabolism
5.
J Hum Genet ; 2024 Oct 16.
Article in English | MEDLINE | ID: mdl-39414989

ABSTRACT

CEP55 encodes centrosomal protein 55 kDa, which plays a crucial role in mitosis, particularly cytokinesis. Biallelic CEP55 variants cause MARCH syndrome (multinucleated neurons, anhydramnios, renal dysplasia, cerebellar hypoplasia and hydranencephaly). Here, we describe a Japanese family with two affected siblings harboring novel compound heterozygous CEP55 variants, NM_001127182:c.[1357C>T];[1358G>A] p.[(Arg453Cys)];[(Arg453His)]. Both presented clinically with typical lethal MARCH syndrome. Although a combination of missense and nonsense variants has been reported previously, this is the first report of biallelic missense CEP55 variants. These variants biallelically affected the same amino acid, Arg453, in the last 40 amino acids of CEP55. These residues are functionally important for CEP55 localization to the midbody during cell division, and may be associated with severe clinical outcomes. More cases of pathogenic CEP55 variants are needed to establish the genotype-phenotype correlation.

6.
J Hum Genet ; 69(2): 69-77, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38012394

ABSTRACT

SLC5A6 encodes the sodium-dependent multivitamin transporter, a transmembrane protein that takes up biotin, pantothenic acid, and lipoic acid. Biallelic SLC5A6 variants cause sodium-dependent multivitamin transporter deficiency (SMVTD) and childhood-onset biotin-responsive peripheral motor neuropathy (COMNB), which both respond well to replacement therapy with the above three nutrients. SMVTD usually presents with various symptoms in multiple organs, such as gastrointestinal hemorrhage, brain atrophy, and global developmental delay, at birth or in infancy. Without nutrient replacement therapy, SMVTD can be lethal in early childhood. COMNB is clinically milder and has a later onset than SMVTD, at approximately 10 years of age. COMNB symptoms are mostly limited to peripheral motor neuropathy. Here we report three patients from one Japanese family harboring novel compound heterozygous missense variants in SLC5A6, namely NM_021095.4:c.[221C>T];[642G>C] p.[(Ser74Phe)];[(Gln214His)]. Both variants were predicted to be deleterious through multiple lines of evidence, including amino acid conservation, in silico predictions of pathogenicity, and protein structure considerations. Drosophila analysis also showed c.221C>T to be pathogenic. All three patients had congenital brain cysts on neonatal cranial imaging, but no other morphological abnormalities. They also had a mild motor developmental delay that almost completely resolved without treatment. In terms of severity, their phenotypes were intermediate between SMVTD and COMNB. From these findings we propose a new SLC5A6-related disorder, spontaneously remitting developmental delay with brain cysts (SRDDBC), whose phenotypic severity is between that of SMVTD and COMNB. Further clinical and genetic evidence is needed to support our suggestion.


Subject(s)
Cysts , Symporters , Child, Preschool , Humans , Infant, Newborn , Biotin/genetics , Biotin/metabolism , Phenotype , Sodium/metabolism , Symporters/genetics , Symporters/metabolism
7.
J Hum Genet ; 69(3-4): 163-167, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38228874

ABSTRACT

The gene for ATP binding cassette subfamily A member 2 (ABCA2) is located at chromosome 9q34.3. Biallelic ABCA2 variants lead to intellectual developmental disorder with poor growth and with or without seizures or ataxia (IDPOGSA). In this study, we identified novel compound heterozygous ABCA2 variants (NM_001606.5:c.[5300-17C>A];[6379C>T]) by whole exome sequencing in a 28-year-old Korean female patient with intellectual disability. These variants included intronic and nonsense variants of paternal and maternal origin, respectively, and are absent from gnomAD. SpliceAI predicted that the intron variant creates a cryptic acceptor site. Reverse transcription-PCR using RNA extracted from a lymphoblastoid cell line of the patient confirmed two aberrant transcripts. Her clinical features are compatible with those of IDPOGSA.


Subject(s)
Intellectual Disability , Humans , Female , Adult , Intellectual Disability/genetics , Mutation , Family , Syndrome , Ataxia/genetics
8.
Article in English | MEDLINE | ID: mdl-38816190

ABSTRACT

BACKGROUND: Although pure GAA expansion is considered pathogenic in SCA27B, non-GAA repeat motif is mostly mixed into longer repeat sequences. This study aimed to unravel the complete sequencing of FGF14 repeat expansion to elucidate its repeat motifs and pathogenicity. METHODS: We screened FGF14 repeat expansion in a Japanese cohort of 460 molecularly undiagnosed adult-onset cerebellar ataxia patients and 1022 controls, together with 92 non-Japanese controls, and performed nanopore sequencing of FGF14 repeat expansion. RESULTS: In the Japanese population, the GCA motif was predominantly observed as the non-GAA motif, whereas the GGA motif was frequently detected in non-Japanese controls. The 5'-common flanking variant was observed in all Japanese GAA repeat alleles within normal length, demonstrating its meiotic stability against repeat expansion. In both patients and controls, pure GAA repeat was up to 400 units in length, whereas non-pathogenic GAA-GCA repeat was larger, up to 900 units, but they evolved from different haplotypes, as rs534066520, located just upstream of the repeat sequence, completely discriminated them. Both (GAA)≥250 and (GAA)≥200 were enriched in patients, whereas (GAA-GCA)≥200 was similarly observed in patients and controls, suggesting the pathogenic threshold of (GAA)≥200 for cerebellar ataxia. We identified 14 patients with SCA27B (3.0%), but their single-nucleotide polymorphism genotype indicated different founder alleles between Japanese and Caucasians. The low prevalence of SCA27B in Japanese may be due to the lower allele frequency of (GAA)≥250 in the Japanese population than in Caucasians (0.15% vs 0.32%-1.26%). CONCLUSIONS: FGF14 repeat expansion has unique features of pathogenicity and allelic origin, as revealed by a single ethnic study.
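The distinction this abstract draws between pure GAA repeats and mixed GAA-GCA repeats, together with the suggested (GAA)≥200 threshold, can be illustrated with a small sketch. This is not the authors' pipeline: the function names `motif_composition` and `describe_repeat` are hypothetical, the input is assumed to be an already-extracted, in-frame repeat tract, and the labels are descriptive only and carry no clinical meaning.

```python
from collections import Counter

def motif_composition(repeat_seq, unit_len=3):
    """Count each in-frame repeat unit in an extracted repeat tract (hypothetical helper)."""
    units = [
        repeat_seq[i:i + unit_len]
        for i in range(0, len(repeat_seq) - unit_len + 1, unit_len)
    ]
    return Counter(units)

def describe_repeat(repeat_seq, threshold=200):
    """Label a tract as pure GAA (above or below the threshold) or mixed.

    The default threshold of 200 units follows the value suggested in the
    abstract; everything else here is an illustrative assumption.
    """
    counts = motif_composition(repeat_seq)
    total = sum(counts.values())
    if total == 0:
        return "empty"
    if set(counts) == {"GAA"}:
        if total >= threshold:
            return "pure GAA, at or above threshold"
        return "pure GAA, below threshold"
    return "mixed motifs"
```

For example, `describe_repeat("GAA" * 250)` falls in the first category, while a tract such as `"GAA" * 150 + "GCA" * 100` is reported as mixed regardless of its length.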

9.
J Hum Genet ; 68(5): 363-367, 2023 May.
Article in English | MEDLINE | ID: mdl-36631501

ABSTRACT

TNNI2 at 11p15.5 encodes troponin I2, fast skeletal type, which is a member of the troponin I gene family and a component of the troponin complex. Distal arthrogryposis (DA) is characterized by congenital limb contractures without primary neurological or muscular effects. DA is inherited in an autosomal dominant fashion and is clinically and genetically heterogeneous. Exome sequencing identified a causative variant in TNNI2 [NM_003282.4:c.532T>C p.(Phe178Leu)] in a Japanese girl with typical DA2b. Interestingly, the familial study using Sanger sequencing suggested a mosaic variant in her healthy father. Subsequent targeted amplicon-based deep sequencing detected the TNNI2 variant with variant allele frequencies of 9.4-17.7% in genomic DNA derived from peripheral blood leukocytes, saliva, hair, and nails in the father. We confirmed that the disease-causing TNNI2 variant in the proband was inherited from her asymptomatic father, who carried it as a somatic mosaic variant. Our case demonstrates that careful clinical and genetic evaluation is required in DA.


Subject(s)
Arthrogryposis , Humans , Female , Male , Arthrogryposis/genetics , Mosaicism , Troponin I/genetics , Sarcomeres , Pedigree , Fathers
10.
J Hum Genet ; 68(4): 247-253, 2023 Apr.
Article in English | MEDLINE | ID: mdl-36509868

ABSTRACT

Pontocerebellar hypoplasia (PCH) is currently classified into 16 subgroups. Using mostly next-generation sequencing, pathogenic variants have been identified in as many as 24 PCH-associated genes. PCH type 8 (PCH8) is a rare heterogeneous disorder. Its clinical presentation includes severe developmental delay, increased muscle tone, microcephaly, and magnetic resonance imaging (MRI) abnormalities such as reduced cerebral white matter, a thin corpus callosum, and brainstem and cerebellar hypoplasia. To date, only two variants in the CHMP1A gene (MIM: 164010), NM_002768.5:c.88C>T (p.Glu30*) and c.28-13G>A, have been identified homozygously in seven patients with PCH8 from four families (MIM: 614961). CHMP1A is a subunit of the endosomal sorting complex required for transport III (ESCRT-III), which regulates the formation and release of extracellular vesicles. Biallelic CHMP1A loss of function impairs the ESCRT-III-mediated release of extracellular vesicles, which causes impaired progenitor proliferation in the developing brain. Herein, we report a patient with PCH8 who had a homozygous CHMP1A variant, c.122delA (p.Asn41Metfs*2), which arose from segmental uniparental disomy. Although our patient had similar MRI findings to those of previously reported patients, with no progression, we report some novel neurological and developmental findings that expand our knowledge of the clinical consequences associated with CHMP1A variants.


Subject(s)
Cerebellar Diseases , Microcephaly , Humans , Uniparental Disomy/genetics , Cerebellar Diseases/genetics , Microcephaly/diagnostic imaging , Microcephaly/genetics , Microcephaly/complications , Endosomal Sorting Complexes Required for Transport/genetics , Vesicular Transport Proteins/genetics
11.
J Hum Genet ; 68(12): 875-878, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37592133

ABSTRACT

Benign adult familial myoclonic epilepsy type 1 (BAFME1) is an autosomal dominant, adult-onset neurological disease caused by SAMD12 repeat expansion. In BAFME1, anticipation, such as the earlier onset of tremor and/or seizures in the next generation, has been reported. This could be explained by intergenerational repeat instability, leading to larger expansions in successive generations. We report a four-generation BAFME1-affected family with anticipation. Using Nanopore long-read sequencing, detailed information regarding the sizes, configurations, and compositions of the expanded SAMD12 repeats across generations was obtained. Unexpectedly, a grandmother-mother-daughter triad showed similar repeat structures with only slight repeat expansions, despite a highly variable age of onset of seizures (range: 52-14 years old), implying a complex relationship between the SAMD12 repeat expansion sequence and anticipation. This study suggests that factor(s) other than repeat expansion could modify anticipation in BAFME1.


Subject(s)
Epilepsies, Myoclonic , Humans , Epilepsies, Myoclonic/genetics , Pedigree , Seizures
12.
J Hum Genet ; 68(10): 689-697, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37308565

ABSTRACT

Hereditary spastic paraplegias (HSPs) are a heterogeneous group of neurodegenerative disorders characterized by progressive spasticity and weakness in the lower extremities. To date, a total of 88 types of SPG are known. To diagnose HSP, multiple technologies, including microarray, direct sequencing, multiplex ligation-dependent probe amplification, and short-read next-generation sequencing, are often chosen based on the frequency of HSP subtypes. Exome sequencing (ES) is commonly used. We used ES to analyze ten cases of HSP from eight families. We identified pathogenic variants in three cases (from three different families); however, we were unable to determine the cause of the other seven cases using ES. We therefore applied long-read sequencing to the seven undetermined HSP cases (from five families). We detected intragenic deletions within the SPAST gene in four families, and a deletion within PSEN1 in the remaining family. The size of the deletion ranged from 4.7 to 12.5 kb and involved 1-7 exons. All deletions were entirely included in one long read. We retrospectively performed an ES-based copy number variation analysis focusing on pathogenic deletions, but were not able to accurately detect these deletions. This study demonstrated the efficiency of long-read sequencing in detecting intragenic pathogenic deletions in ES-negative HSP patients.


Subject(s)
Adenosine Triphosphatases , Spastic Paraplegia, Hereditary , Humans , Adenosine Triphosphatases/genetics , Exome/genetics , Mutation , DNA Copy Number Variations , Retrospective Studies , Spastin/genetics , Spastic Paraplegia, Hereditary/diagnosis , Spastic Paraplegia, Hereditary/genetics , Paraplegia/genetics
13.
Genomics ; 114(5): 110469, 2022 09.
Article in English | MEDLINE | ID: mdl-36041634

ABSTRACT

We report two patients with autosomal dominant neuronal intranuclear inclusion disease (NIID) harboring the biallelic GGC repeat expansion in NOTCH2NLC to uncover the impact of repeat expansion zygosity on the clinical phenotype. The zygosity of the entire NOTCH2NLC GGC repeat expansion and DNA methylation were comprehensively evaluated using fluorescent amplicon length PCR (AL-PCR), Southern blotting and targeted long-read sequencing, and detailed genetic/epigenetic and clinical features were described. In AL-PCR, we could not recognize the wild-type allele in either patient. Targeted long-read sequencing revealed that one patient harbored a homozygous repeat expansion. The other patient harbored compound heterozygous repeat expansions. The GGC repeats and the nearest CpG island were hypomethylated in all expanded alleles in both patients. Both patients harboring the biallelic GGC repeat expansion showed a typical dementia-dominant NIID phenotype. In conclusion, the biallelic GGC repeat expansion in two typical NIID patients indicated that NOTCH2NLC-related diseases could be completely dominant.


Subject(s)
Intranuclear Inclusion Bodies , Neurodegenerative Diseases , Receptor, Notch2/metabolism , Humans , Intranuclear Inclusion Bodies/genetics , Neurodegenerative Diseases/genetics , Phenotype
14.
Genomics ; 114(5): 110468, 2022 09.
Article in English | MEDLINE | ID: mdl-36041635

ABSTRACT

Recent studies suggest that transcript isoforms significantly overlap (approximately 60%) between brain tissue and Epstein-Barr virus-transformed lymphoblastoid cell lines (LCLs). Interestingly, 14 cohesion-related genes with variants that cause Cornelia de Lange Syndrome (CdLS) are highly expressed in the brain and LCLs. In this context, we first performed RNA sequencing of LCLs from 22 solved (with pathogenic variants) and 19 unsolved (with no confirmed variants) CdLS cases. Next, an RNA sequencing pipeline was developed using solved cases with two different methods: short variant analysis (for single-nucleotide and indel variants) and aberrant splicing detection analysis. Our pipeline was then applied to the 19 unsolved cases, and four pathogenic variants in NIPBL (one in-frame deletion and three intronic variants) were newly identified. Two of the three intronic variants were located at Alu elements in deep-intronic regions, creating cryptic exons. RNA sequencing with LCLs was useful for identifying hidden variants in exome-negative cases.


Subject(s)
De Lange Syndrome , Epstein-Barr Virus Infections , Cell Cycle Proteins/genetics , De Lange Syndrome/diagnosis , De Lange Syndrome/genetics , De Lange Syndrome/pathology , Herpesvirus 4, Human/genetics , Humans , Nucleotides , Phenotype , Protein Isoforms/genetics , Sequence Analysis, RNA
15.
Pharmacogenomics J ; 19(2): 136-146, 2019 04.
Article in English | MEDLINE | ID: mdl-29352165

ABSTRACT

Human leukocyte antigen (HLA) is a gene complex known for its exceptional diversity across populations, importance in organ and blood stem cell transplantation, and associations of specific alleles with various diseases. We constructed a Japanese reference panel of class I HLA genes (ToMMo HLA panel), comprising a distinct set of HLA-A, HLA-B, HLA-C, and HLA-H alleles, by single-molecule, real-time (SMRT) sequencing of 208 individuals included in the 1070 whole-genome Japanese reference panel (1KJPN). For high-quality allele reconstruction, we developed a novel pipeline, Primer-Separation Assembly and Refinement Pipeline (PSARP), in which the SMRT sequencing and additional short-read data were used. The panel consisted of 139 alleles, which were all extended from known IPD-IMGT/HLA sequences, contained 40 with novel variants, and captured more than 96.5% of allelic diversity in 1KJPN. These newly available sequences would be important resources for research and clinical applications including high-resolution HLA typing, genetic association studies, and analyses of cis-regulatory elements.


Subject(s)
Genetic Variation , Genome, Human/genetics , High-Throughput Nucleotide Sequencing , Histocompatibility Antigens Class I/genetics , Alleles , Genotype , Histocompatibility Testing , Humans , Japan , Sequence Analysis, DNA
17.
BMC Genomics ; 17(1): 991, 2016 12 03.
Article in English | MEDLINE | ID: mdl-27912743

ABSTRACT

BACKGROUND: In the estimation of repeat numbers in a short tandem repeat (STR) region from high-throughput sequencing data, two types of strategies are mainly taken: a strategy based on counting repeat patterns included in sequence reads spanning the region and a strategy based on estimating the difference between the actual insert size and the insert size inferred from paired-end reads. The quality of sequence alignment is crucial, especially in the former approach, although usual alignment methods have difficulty in STR regions due to insertions and deletions caused by variation in repeat numbers. RESULTS: We proposed a new dynamic programming-based realignment method named STR-realigner that considers repeat patterns in STR regions as prior knowledge. By allowing the size change of repeat patterns with low penalty in STR regions, accurate realignment is expected. For the performance evaluation, publicly available STR variant calling tools were applied to three types of aligned reads: synthetically generated sequencing reads aligned with BWA-MEM, those realigned with STR-realigner, those realigned with ReviSTER, and those realigned with GATK IndelRealigner. From the comparison of root mean squared errors between estimated and true STR region size, the results for the dataset realigned with STR-realigner are better than those for other cases. For real data analysis, we used a real sequencing dataset from Illumina HiSeq 2000 for a parent-offspring trio. RepeatSeq and lobSTR were applied to the sequence reads for these individuals aligned with BWA-MEM, those realigned with STR-realigner, ReviSTER, and GATK IndelRealigner. STR-realigner shows the best performance in terms of consistency of the size of estimated STR regions in Mendelian inheritance. Root mean squared error values were also calculated from the comparison of these estimated results with STR region sizes obtained from high coverage PacBio sequencing data, and the results from the sequencing data realigned with STR-realigner showed the lowest (best) root mean squared error value. CONCLUSIONS: The effectiveness of the proposed realignment method for STR regions was verified from the comparison with existing methods on both simulation datasets and a real whole genome sequencing dataset.


Subject(s)
Microsatellite Repeats , Sequence Alignment/methods , Software , Algorithms , Computational Biology/methods , Genomics/methods , High-Throughput Nucleotide Sequencing , Reproducibility of Results , Sequence Analysis, DNA/methods
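The two evaluation criteria named in the STR-realigner abstract above, root mean squared error against true STR region sizes and Mendelian consistency in a trio, can be sketched as follows. This is a minimal illustration only, not code from STR-realigner: the function names and the representation of a genotype as a pair of allele sizes are assumptions made for the example.

```python
import math

def rmse(estimated, truth):
    """Root mean squared error between estimated and true STR region sizes."""
    if len(estimated) != len(truth) or not estimated:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(e - t) ** 2 for e, t in zip(estimated, truth)]
    return math.sqrt(sum(squared) / len(squared))

def mendelian_consistent(child, father, mother):
    """Check that one child allele can come from each parent.

    Each genotype is assumed to be a pair of allele sizes, e.g. (10, 12).
    """
    a, b = child
    return (a in father and b in mother) or (b in father and a in mother)
```

A caller would compute `rmse` once per alignment strategy over the same loci and compare the values, and count the fraction of loci for which `mendelian_consistent` holds in the trio.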
18.
BMC Genomics ; 17(1): 745, 2016 Sep 21.
Article in English | MEDLINE | ID: mdl-27654840

ABSTRACT

BACKGROUND: Genome-wide association studies have revealed associations between single-nucleotide polymorphisms (SNPs) and phenotypes such as disease symptoms and drug tolerance. To address the small sample size for rare variants, association studies tend to group gene or pathway level variants and evaluate the effect on the set of variants. One such strategy, known as the sequence kernel association test (SKAT), is a widely used collapsing method. However, the reported p-values from SKAT tend to be biased because the asymptotic property of the statistic is used to calculate the p-value. Although this bias can be corrected by applying permutation procedures for the test statistics, the computational cost of obtaining p-values with high resolution is prohibitive. RESULTS: To address this problem, we devise an adaptive SKAT procedure termed AP-SKAT that efficiently classifies significant SNP sets and ranks them according to the permuted p-values. Our procedure adaptively stops the permutation test when the significance level is outside some confidence interval of the estimated p-value for a binomial distribution. To evaluate the performance, we first compare the power and sample size calculation and the type I error rates estimate of SKAT, SKAT-O, and the proposed procedure using genotype data in the SKAT R package and from the 1000 Genomes Project. Through computational experiments using whole genome sequencing and SNP array data, we show that our proposed procedure is highly efficient and has comparable accuracy to the standard procedure. CONCLUSIONS: For several types of genetic data, the developed procedure could achieve competitive power and sample size under small and large sample size conditions with controlling considerable type I error rates, and estimate p-values of significant SNP sets that are consistent with those estimated by the standard permutation test within a realistic time. This demonstrates that the procedure is sufficiently powerful for recent whole genome sequencing and SNP array data with increasing numbers of phenotypes. Additionally, this procedure can be used in other association tests by employing alternative methods to calculate the statistics.
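The stopping rule described in the abstract above (halt the permutation test once the significance level falls outside a confidence interval of the estimated p-value) can be sketched generically. The following is not AP-SKAT itself: the function name, the batch size, and the use of a normal-approximation interval for the binomial proportion are all assumptions chosen for illustration, and `permuted_stat` stands in for whatever routine recomputes the test statistic on permuted data.

```python
import math
import random

def adaptive_permutation_pvalue(observed, permuted_stat, alpha=0.05, z=3.0,
                                batch=100, max_perm=100_000, rng=None):
    """Estimate a permutation p-value, stopping early when alpha is clearly
    above or below the estimate.

    permuted_stat(rng) must return the statistic for one random permutation.
    The interval is p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n), a normal
    approximation to the binomial (an assumption of this sketch).
    """
    rng = rng or random.Random(0)
    exceed = 0   # permutations with a statistic at least as extreme
    n = 0        # permutations drawn so far
    p_hat = 1.0
    while n < max_perm:
        for _ in range(batch):
            if permuted_stat(rng) >= observed:
                exceed += 1
        n += batch
        p_hat = (exceed + 1) / (n + 1)  # add-one estimate avoids p = 0
        half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
        if alpha < p_hat - half_width or alpha > p_hat + half_width:
            break  # alpha lies outside the interval: the decision is stable
    return p_hat, n
```

Clearly non-significant sets stop after the first batch, so the permutation budget is spent mainly on sets whose p-value is close to the significance level.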

19.
Genomics ; 106(5): 265-7, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26387926

ABSTRACT

DNA sequencers that can conduct real-time sequencing from a single polymerase molecule are known as third-generation sequencers. Third-generation sequencers enable sequencing of reads that are several kilobases long. However, the raw data generated from third-generation sequencers are known to be error-prone. Because of sequencing errors, it is difficult to identify which genes are homologous to the reads obtained using third-generation sequencers. In this study, a new homology search algorithm, PAFFT, is developed. This method is an extension of the MAFFT algorithm, which was used for multiple alignments. PAFFT detects global homology rather than local homology so that homologous regions can be detected even when the error rate of sequencing is high. PAFFT will boost application of third-generation sequencers.


Subject(s)
Algorithms , Sequence Analysis, DNA/methods , Sequence Homology, Nucleic Acid
20.
Genomics ; 102(1): 35-7, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23542167

ABSTRACT

Next-generation sequencing platforms generate short (50-150bp) reads that can be mapped onto the reference genome. Repetitive sequences in the genome, because of the presence of similar or identical sequences, cause mapping errors in the case of the short reads. By filtering short reads with repeats, mapping will be improved. I developed RF. RF is a new method that filters short reads with tandem repeats. A scoring scheme was developed that assigned higher scores to regions with tandem repeats and lower scores to regions without tandem repeats. In this study, RF was applied to filter out short reads with repeats, before short reads were mapped onto the same genomic contig by using a short read-mapping program. The result suggests RF improved the proportion of correctly mapped short reads on filtering the repeats. RF is a useful tool for reducing mapping errors of short reads onto reference genomes.


Subject(s)
Chromosome Mapping , Computational Biology , High-Throughput Nucleotide Sequencing/methods , Tandem Repeat Sequences/genetics , Algorithms , Genome, Human , Humans , Sequence Analysis, DNA , Software
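The RF abstract above describes a scoring scheme that gives higher scores to reads with tandem repeats and filters such reads before mapping. The abstract does not specify the scoring function, so the sketch below uses a stand-in: the best fraction of bases that match the base one period earlier, over short periods. The function names, the period range, and the threshold are all hypothetical.

```python
def tandem_repeat_score(read, max_period=6):
    """Score in [0, 1]: the best fraction of positions i where read[i] equals
    read[i - k], over periods k = 1..max_period (an illustrative stand-in,
    not the RF scoring scheme)."""
    best = 0.0
    for k in range(1, max_period + 1):
        if len(read) <= k:
            break
        matches = sum(1 for i in range(k, len(read)) if read[i] == read[i - k])
        best = max(best, matches / (len(read) - k))
    return best

def filter_reads(reads, threshold=0.8):
    """Keep only reads whose repeat score is below the threshold."""
    return [r for r in reads if tandem_repeat_score(r) < threshold]
```

With these assumptions, a read made entirely of CAG units scores 1.0 and is removed, while a read without periodic structure is kept and passed on to the mapper.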