Results 1 - 12 of 12
1.
medRxiv ; 2024 Mar 18.
Article in English | MEDLINE | ID: mdl-38562723

ABSTRACT

Comprehending the mechanism behind human diseases with an established heritable component represents the forefront of personalized medicine. Nevertheless, numerous medically important genes are inaccurately represented in short-read sequencing data analysis due to their complexity and repetitiveness or the so-called 'dark regions' of the human genome. The advent of PacBio as a long-read platform has provided new insights, yet HiFi whole-genome sequencing (WGS) cost remains frequently prohibitive. We introduce a targeted sequencing and analysis framework, Twist Alliance Dark Genes Panel (TADGP), designed to offer phased variants across 389 medically important yet complex autosomal genes. We highlight TADGP accuracy across eleven control samples and compare it to WGS. This demonstrates that TADGP achieves variant calling accuracy comparable to HiFi-WGS data at a fraction of the cost, thus enabling scalability and broad applicability for studying rare diseases or complementing previously sequenced samples to gain insights into these complex genes. TADGP revealed several candidate variants across all cases and provided insight into LPA diversity when tested on samples from rare disease and cardiovascular disease cohorts. In both cohorts, we identified novel variants affecting individual disease-associated genes (e.g., IKZF1, KCNE1). Nevertheless, the annotation of the variants across these 389 medically important genes remains challenging due to their underrepresentation in ClinVar and gnomAD. Consequently, we also offer an annotation resource to enhance the evaluation and prioritization of these variants. Overall, we demonstrate that TADGP offers a cost-efficient and scalable approach to routinely assess the clinically relevant dark regions of the human genome.

2.
Am J Hum Genet ; 110(2): 240-250, 2023 02 02.
Article in English | MEDLINE | ID: mdl-36669496

ABSTRACT

Spinal muscular atrophy, a leading cause of early infant death, is caused by bi-allelic mutations of SMN1. Sequence analysis of SMN1 is challenging due to high sequence similarity with its paralog SMN2. Both genes have variable copy numbers across populations. Furthermore, without pedigree information, it is currently not possible to identify silent carriers (2+0) with two copies of SMN1 on one chromosome and zero copies on the other. We developed Paraphase, an informatics method that identifies full-length SMN1 and SMN2 haplotypes, determines the gene copy numbers, and calls phased variants using long-read PacBio HiFi data. The SMN1 and SMN2 copy-number calls by Paraphase are highly concordant with orthogonal methods (99.2% for SMN1 and 100% for SMN2). We applied Paraphase to 438 samples across 5 ethnic populations to conduct a population-wide haplotype analysis of these highly homologous genes. We identified major SMN1 and SMN2 haplogroups and characterized their co-segregation through pedigree-based analyses. We identified two SMN1 haplotypes that form a common two-copy SMN1 allele in African populations. Testing positive for these two haplotypes in an individual with two copies of SMN1 gives a silent carrier risk of 88.5%, which is significantly higher than the currently used marker (1.7%-3.0%). Extending beyond simple copy-number testing, Paraphase can detect pathogenic variants and enable potential haplotype-based screening of silent carriers through statistical phasing of haplotypes into alleles. Future analysis of larger population data will allow identification of more diverse haplotypes and genetic markers for silent carriers.


Subject(s)
Spinal Muscular Atrophy , Infant , Humans , Spinal Muscular Atrophy/genetics , Spinal Muscular Atrophy/diagnosis , Mutation , Gene Dosage , Pedigree , Sequence Analysis , Survival of Motor Neuron 1 Protein/genetics , Survival of Motor Neuron 2 Protein/genetics
3.
Genome Res ; 33(1): 61-70, 2023 01.
Article in English | MEDLINE | ID: mdl-36657977

ABSTRACT

High-throughput sequencing provides sufficient means for determining genotypes of clinically important pharmacogenes that can be used to tailor medical decisions to individual patients. However, pharmacogene genotyping, also known as star-allele calling, is a challenging problem that requires accurate copy number calling, structural variation identification, variant calling, and phasing within each pharmacogene copy present in the sample. Here we introduce Aldy 4, a fast and efficient tool for genotyping pharmacogenes that uses combinatorial optimization for accurate star-allele calling across different sequencing technologies. Aldy 4 adds support for long reads and uses a novel phasing model and improved copy number and variant calling models. We compare Aldy 4 against the current state-of-the-art star-allele callers on a large and diverse set of samples and genes sequenced by various sequencing technologies, such as whole-genome and targeted Illumina sequencing, barcoded 10x Genomics, and Pacific Biosciences (PacBio) HiFi. We show that Aldy 4 is the most accurate star-allele caller with near-perfect accuracy in all evaluated contexts, and hope that Aldy remains an invaluable tool in the clinical toolbox even with the advent of long-read sequencing technologies.


Subject(s)
Pharmacogenetics , Single Nucleotide Polymorphism , Humans , Alleles , Genotype , Genomics , High-Throughput Nucleotide Sequencing , DNA Sequence Analysis
4.
Hum Mutat ; 43(11): 1557-1566, 2022 11.
Article in English | MEDLINE | ID: mdl-36057977

ABSTRACT

To determine the phase of NUDT15 sequence variants for more comprehensive star (*) allele diplotyping, we developed a novel long-read single-molecule real-time HiFi amplicon sequencing method. A 10.5 kb NUDT15 amplicon assay was validated using reference material positive controls and additional samples for specimen type and blinded accuracy assessment. Triplicate NUDT15 HiFi sequencing of two reference material samples had nonreference genotype concordances of >99.9%, indicating that the assay is robust. Notably, short-read genome sequencing of a subset of samples was unable to determine the phase of star (*) allele-defining NUDT15 variants, resulting in ambiguous diplotype results. In contrast, long-read HiFi sequencing phased all variants across the NUDT15 amplicons, including a *2/*9 diplotype that previously was characterized as *1/*2 in the 1000 Genomes Project v3 data set. Assay throughput was also tested using 8.5 kb amplicons from 100 Ashkenazi Jewish individuals, which identified a novel NUDT15 *1 suballele (c.-121G>A) and a rare likely deleterious coding variant (p.Pro129Arg). Both novel alleles were Sanger confirmed and assigned as *1.007 and *20, respectively, by the PharmVar Consortium. Taken together, NUDT15 HiFi amplicon sequencing is an innovative method for phased full-gene characterization and novel allele discovery, which could improve NUDT15 pharmacogenomic testing and subsequent phenotype prediction.


Subject(s)
Pharmacogenetics , Alleles , Genotype , Haplotypes , Humans , DNA Sequence Analysis/methods
5.
Int J Mol Sci ; 22(5)2021 Mar 05.
Article in English | MEDLINE | ID: mdl-33807660

ABSTRACT

Myotonic dystrophy type 1 (DM1) is the most complex and variable trinucleotide repeat disorder caused by an unstable CTG repeat expansion, reaching up to 4000 CTG in the most severe cases. The genetic and clinical variability of DM1 depends on the sex and age of the transmitting parent, but also on the CTG repeat number, the presence of repeat interruptions, and/or the degree of somatic instability. Currently, it is difficult to simultaneously and accurately determine these contributing factors in DM1 patients due to the limitations of gold standard methods used in molecular diagnostics and research laboratories. Our study showed the efficiency of the latest PacBio long-read sequencing technology to sequence large CTG trinucleotide repeats, detect multiple and single repeat interruptions, and estimate the levels of somatic mosaicism in DM1 patients carrying complex CTG repeat expansions inaccessible to most methods. Using this innovative approach, we revealed the existence of de novo CCG interruptions associated with CTG stabilization/contraction across generations in a new DM1 family. We also demonstrated that our method is suitable to sequence the DM1 locus and measure somatic mosaicism in DM1 families carrying more than 1000 pure CTG repeats. Better characterization of expanded alleles in DM1 patients can significantly improve prognosis and genetic counseling, not only in DM1 but also for other tandem DNA repeat disorders.


Subject(s)
High-Throughput Nucleotide Sequencing , Mosaicism , Myotonic Dystrophy/genetics , Trinucleotide Repeat Expansion , Adult , Female , Humans , Male , Middle Aged
6.
Brain ; 144(4): 1082-1088, 2021 05 07.
Article in English | MEDLINE | ID: mdl-33889947

ABSTRACT

To examine the length of a hexanucleotide expansion in C9orf72, which represents the most frequent genetic cause of frontotemporal lobar degeneration and motor neuron disease, we employed a targeted amplification-free long-read sequencing technology: No-Amp sequencing. In our cross-sectional study, we assessed cerebellar tissue from 28 well-characterized C9orf72 expansion carriers. We obtained 3507 on-target circular consensus sequencing reads, of which 814 bridged the C9orf72 repeat expansion (23%). Importantly, we observed a significant correlation between expansion sizes obtained using No-Amp sequencing and Southern blotting (P = 5.0 × 10⁻⁴). Interestingly, we also detected a significant survival advantage for individuals with smaller expansions (P = 0.004). Additionally, we uncovered that smaller expansions were significantly associated with higher levels of C9orf72 transcripts containing intron 1b (P = 0.003), poly(GP) proteins (P = 1.3 × 10⁻⁵), and poly(GA) proteins (P = 0.005). Thorough examination of the composition of the expansion revealed that its GC content was extremely high (median: 100%) and that it was mainly composed of GGGGCC repeats (median: 96%), suggesting that expanded C9orf72 repeats are quite pure. Taken together, our findings demonstrate that No-Amp sequencing is a powerful tool that enables the discovery of relevant clinicopathological associations, highlighting the important role played by the cerebellar size of the expanded repeat in C9orf72-linked diseases.


Subject(s)
C9orf72 Protein/genetics , Neurodegenerative Diseases/genetics , DNA Sequence Analysis/methods , Aged , Cerebellum/metabolism , Cross-Sectional Studies , DNA Repeat Expansion/genetics , Female , Humans , Male , Middle Aged
7.
Front Immunol ; 9: 2294, 2018.
Article in English | MEDLINE | ID: mdl-30337930

ABSTRACT

Although NGS technologies fuel advances in high-throughput HLA genotyping methods for identification and classification of HLA genes to assist with precision medicine efforts in disease and transplantation, the efficiency of these methods is impeded by the absence of adequately characterized high-frequency HLA allele reference sequence databases for the highly polymorphic HLA gene system. Here, we report on producing a comprehensive collection of full-length HLA allele sequences for eight classical HLA loci found in the Japanese population. We augmented the second-generation short-read data generated by the Ion Torrent technology with long amplicon-spanning consensus reads delivered by the third-generation SMRT sequencing method to create reference-grade high-quality sequences of HLA class I and II gene alleles resolved at the genomic coding and non-coding level. Forty-six DNAs were obtained from a reference set used previously to establish the HLA allele frequency data in Japanese subjects. The samples included alleles with a collective allele frequency in the Japanese population of more than 99.2%. The HLA loci were independently amplified by long-range PCR using previously designed HLA-locus specific primers and subsequently sequenced using SMRT and Ion PGM sequencers. The mapped long and short reads were used to produce a reference library of consensus HLA allelic sequences with the help of the reference-aware software tool LAA for SMRT Sequencing. A total of 253 distinct alleles were determined for 46 healthy subjects. Of them, 137 were novel alleles: 101 SNVs and/or indels and 36 extended alleles at a partial or full-length level. Comparing the HLA sequences from the perspective of nucleotide diversity revealed that HLA-DRB1 was the most divergent among the eight HLA genes, and that the HLA-DPB1 gene sequences diverged into two distinct groups, DP2 and DP5, with evidence of independent polymorphisms generated in exon 2. We also identified two specific intronic variations in HLA-DRB1 that might be involved in rheumatoid arthritis. In conclusion, full-length HLA allele sequencing by third-generation and second-generation technologies has provided polymorphic gene reference sequences at a genomic allelic resolution including allelic variations assigned up to the field-4 level for a stronger foundation in precision medicine and HLA-related disease and transplantation studies.


Subject(s)
Computational Biology/methods , MHC Class II Genes , MHC Class I Genes , High-Throughput Nucleotide Sequencing , DNA Sequence Analysis , Software , Adult , Aged , Aged 80 and over , Alleles , Rheumatoid Arthritis/genetics , Female , Gene Frequency , Genetic Association Studies , Genetic Predisposition to Disease , Genomics/methods , Genotype , Genotyping Techniques , Humans , Male , Middle Aged , Phylogeny , Genetic Polymorphism
9.
Genome Res ; 25(1): 129-41, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25236617

ABSTRACT

Burkholderia pseudomallei (Bp) is the causative agent of the infectious disease melioidosis. To investigate population diversity, recombination, and horizontal gene transfer in closely related Bp isolates, we performed whole-genome sequencing (WGS) on 106 clinical, animal, and environmental strains from a restricted Asian locale. Whole-genome phylogenies resolved multiple genomic clades of Bp, largely congruent with multilocus sequence typing (MLST). We discovered widespread recombination in the Bp core genome, involving hundreds of regions associated with multiple haplotypes. Highly recombinant regions exhibited functional enrichments that may contribute to virulence. We observed clade-specific patterns of recombination and accessory gene exchange, and provide evidence that this is likely due to ongoing recombination between clade members. Reciprocally, interclade exchanges were rarely observed, suggesting mechanisms restricting gene flow between clades. Interrogation of accessory elements revealed that each clade harbored a distinct complement of restriction-modification (RM) systems, predicted to cause clade-specific patterns of DNA methylation. Using methylome sequencing, we confirmed that representative strains from separate clades indeed exhibit distinct methylation profiles. Finally, using an E. coli system, we demonstrate that Bp RM systems can inhibit uptake of non-self DNA. Our data suggest that RM systems borne on mobile elements, besides preventing foreign DNA invasion, may also contribute to limiting exchanges of genetic material between individuals of the same species. Genomic clades may thus represent functional units of genetic isolation in Bp, modulating intraspecies genetic diversity.


Subject(s)
Burkholderia pseudomallei/genetics , Genetic Epigenesis , Bacterial Genome , Genetic Recombination , Transcriptome , Animals , DNA Primers , Bacterial DNA/genetics , Escherichia coli/genetics , Female , Gene Deletion , Genetic Association Studies , Genomics , Haplotypes , Humans , Melioidosis/microbiology , Mice , Inbred BALB C Mice , Multilocus Sequence Typing , Phylogeny , Single Nucleotide Polymorphism , DNA Sequence Analysis
10.
PLoS One ; 7(11): e48995, 2012.
Article in English | MEDLINE | ID: mdl-23185288

ABSTRACT

Direct analysis of unassembled genomic data could greatly increase the power of short-read DNA sequencing technologies and allow comparative genomics of organisms without a completed reference available. Here, we compare 174 chloroplasts by analyzing the taxonomic distribution of short kmers across genomes [1]. We then assemble de novo contigs centered on informative variation. The localized de novo contigs can be separated into two major classes: tip = unique to a single genome and group = shared by a subset of genomes. Prior to assembly, we found that ~18% of the chloroplast was duplicated in the inverted repeat (IR) region across a four-fold difference in genome sizes, from a highly reduced parasitic orchid [2] to a massive algal chloroplast [3], including gnetophytes [4] and cycads [5]. The conservation of this ratio between single copy and duplicated sequence was basal among green plants, independent of photosynthesis and mechanism of genome size change, and different in gymnosperms and lower plants. Major lineages in the angiosperm clade differed in the pattern of shared kmers and de novo contigs. For example, parasitic plants demonstrated an expected accelerated overall rate of evolution, while the hemi-parasitic genomes contained a great deal more novel sequence than holo-parasitic plants, suggesting different mechanisms at different stages of genomic contraction. Additionally, the legumes are diverging more quickly and in different ways than other major families. Small duplicated fragments of the rrn23 genes were deeply conserved among seed plants, including among several species without the IR regions, indicating a crucial functional role of this duplication. Localized de novo assembly of informative kmers greatly reduces the complexity of large comparative analyses by confining the analysis to a small partition of data and genomes relevant to the specific question, allowing direct analysis of next-gen sequence data from previously unstudied genomes and rapid discovery of informative candidate regions.


Subject(s)
Chloroplasts/genetics , Plant Genome/genetics , Genomics , Conserved Sequence/genetics , Contig Mapping , Genome Size , Plants/classification , Plants/genetics , Genetic Polymorphism , Reference Standards
12.
J Comp Neurol ; 463(4): 360-71, 2003 Sep 01.
Article in English | MEDLINE | ID: mdl-12836172

ABSTRACT

The advance of knowledge of the thalamic reticular nucleus and its connections has been reviewed and Max Cowan's contributions to this knowledge and to the methods used for studying the nucleus have been summarized. Whereas 50 years ago the nucleus was seen as a diffusely organized cell group closely related to the brain stem reticular formation, it can now be seen as a complex, tightly organized entity that has a significant inhibitory, modulatory action on the thalamic relay to cortex. The nucleus is under the control, on the one hand, of topographically organized afferents from the cerebral cortex and the thalamus, and on the other of more diffuse afferents from brain stem, basal forebrain, and other regions. Whereas the second group of afferents can be expected to have global actions on thalamocortical transmission, relevant for overall attentive state, the former group will have local actions, modulating transmission through the thalamus to cortex with highly specific local effects. Since it appears that all areas of cortex and all parts of the thalamus are linked directly to the reticular nucleus, it now becomes important to define how the several pathways that pass through the thalamus relate to each other in their reticular connections.


Subject(s)
Thalamus/anatomy & histology , Animals , Cerebral Cortex/anatomy & histology , Neural Pathways/anatomy & histology , Synapses