Your browser doesn't support javascript.
loading
Show: 20 | 50 | 100
Results 1 - 2 de 2
Filter
Add more filters











Database
Language
Publication year range
1.
Nat Commun ; 14(1): 5530, 2023 09 14.
Article in English | MEDLINE | ID: mdl-37709751

ABSTRACT

Markedly expanded tandem repeats (TRs) have been correlated with ~60 diseases. TR diversity has been considered a clue toward understanding missing heritability. However, haplotype-resolved long TRs remain mostly hidden or blacked out because their complex structures (TRs composed of various units and minisatellites containing >10-bp units) make them difficult to determine accurately with existing methods. Here, using a high-precision algorithm to determine complex TR structures from long, accurate reads of PacBio HiFi, an investigation of 270 Japanese control samples yields several genome-wide findings. Approximately 322,000 TRs are difficult to impute from the surrounding single-nucleotide variants. Greater genetic divergence of TR loci is significantly correlated with more events of younger replication slippage. Complex TRs are more abundant than single-unit TRs, and a tendency for complex TRs to consist of <10-bp units and single-unit TRs to be minisatellites is statistically significant at loci with ≥500-bp TRs. Of note, 8909 loci with extended TRs (>100b longer than the mode) contain several known disease-associated TRs and are considered candidates for association with disorders. Overall, complex TRs and minisatellites are found to be abundant and diverse, even in genetically small Japanese populations, yielding insights into the landscape of long TRs.


Subject(s)
Genome, Human , Tandem Repeat Sequences , Humans , Genome, Human/genetics , Minisatellite Repeats , Algorithms , Genetic Drift
2.
Bioinformatics ; 39(4)2023 04 03.
Article in English | MEDLINE | ID: mdl-37039842

ABSTRACT

MOTIVATION: Over the past 30 years, extended tandem repeats (TRs) have been correlated with ∼60 diseases with high odds ratios, and most known TRs consist of single repeat units. However, in the last few years, mosaic TRs composed of different units have been found to be associated with several brain disorders by long-read sequencing techniques. Mosaic TRs are difficult-to-characterize sequence configurations that are usually confirmed by manual inspection. Widely used tools are not designed to solve the mosaic TR problem and often fail to properly decompose mosaic TRs. RESULTS: We propose an efficient algorithm that can decompose mosaic TRs in the input string with high sensitivity. Using synthetic benchmark data, we demonstrate that our program named uTR outperforms TRF and RepeatMasker in terms of prediction accuracy, this is especially true when mosaic TRs are more complex, and uTR is faster than TRF and RepeatMasker in most cases. AVAILABILITY AND IMPLEMENTATION: The software program uTR that implements the proposed algorithm is available at https://github.com/morisUtokyo/uTR.


Subject(s)
Software , Tandem Repeat Sequences , Sequence Analysis, DNA/methods , Algorithms , High-Throughput Nucleotide Sequencing
SELECTION OF CITATIONS
SEARCH DETAIL