Results 1 - 20 of 143
1.
Am J Hum Genet ; 110(9): 1454-1469, 2023 09 07.
Article in English | MEDLINE | ID: mdl-37595579

ABSTRACT

Short-read genome sequencing (GS) holds the promise of becoming the primary diagnostic approach for the assessment of autism spectrum disorder (ASD) and fetal structural anomalies (FSAs). However, few studies have comprehensively evaluated its performance against current standard-of-care diagnostic tests: karyotype, chromosomal microarray (CMA), and exome sequencing (ES). To assess the clinical utility of GS, we compared its diagnostic yield against these three tests in 1,612 quartet families including an individual with ASD and in 295 prenatal families. Our GS analytic framework identified a diagnostic variant in 7.8% of ASD probands, almost 2-fold more than CMA (4.3%) and 3-fold more than ES (2.7%). However, when we systematically captured copy-number variants (CNVs) from the exome data, the diagnostic yield of ES (7.4%) was brought much closer to, but did not surpass, GS. Similarly, we estimated that GS could achieve an overall diagnostic yield of 46.1% in unselected FSAs, representing a 17.2% increased yield over karyotype, 14.1% over CMA, and 4.1% over ES with CNV calling or 36.1% increase without CNV discovery. Overall, GS provided an added diagnostic yield of 0.4% and 0.8% beyond the combination of all three standard-of-care tests in ASD and FSAs, respectively. This corresponded to nine GS unique diagnostic variants, including sequence variants in exons not captured by ES, structural variants (SVs) inaccessible to existing standard-of-care tests, and SVs where the resolution of GS changed variant classification. Overall, this large-scale evaluation demonstrated that GS significantly outperforms each individual standard-of-care test while also outperforming the combination of all three tests, thus warranting consideration as the first-tier diagnostic approach for the assessment of ASD and FSAs.


Subjects
Autism Spectrum Disorder; Female; Pregnancy; Humans; Autism Spectrum Disorder/diagnosis; Autism Spectrum Disorder/genetics; Pregnancy Trimester, First; Ultrasonography, Prenatal; Chromosome Mapping; Exome
2.
Am J Hum Genet ; 110(2): 300-313, 2023 02 02.
Article in English | MEDLINE | ID: mdl-36706759

ABSTRACT

While extensively studied in clinical cohorts, the phenotypic consequences of 22q11.2 copy-number variants (CNVs) in the general population remain understudied. To address this gap, we performed a phenome-wide association scan in 405,324 unrelated UK Biobank (UKBB) participants by using CNV calls from genotyping arrays. We mapped 236 Human Phenotype Ontology terms linked to any of the 90 genes encompassed by the region to 170 UKBB traits and assessed the association between these traits and the copy-number state of 504 genotyping array probes in the region. We found significant associations for eight continuous and nine binary traits associated under different models (duplication-only, deletion-only, U-shape, and mirror models). The causal effect of the expression level of 22q11.2 genes on associated traits was assessed through transcriptome-wide Mendelian randomization (TWMR), revealing that increased expression of ARVCF increased BMI. Similarly, increased DGCR6 expression causally reduced mean platelet volume, in line with the corresponding CNV effect. Furthermore, cross-trait multivariable Mendelian randomization (MVMR) suggested a predominant role of genuine (horizontal) pleiotropy in the CNV region. Our findings show that within the general population, 22q11.2 CNVs are associated with traits previously linked to genes in the region, and duplications and deletions act upon traits in different fashions. We also showed that gain or loss of distinct segments within 22q11.2 may impact a trait under different association models. Our results have provided new insights to help further the understanding of the complex 22q11.2 region.


Subjects
DNA Copy Number Variations; Phenomics; Humans; DNA Copy Number Variations/genetics; Phenotype; Chromosomes, Human, Pair 22
3.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-38851298

ABSTRACT

Deletion is a crucial type of genomic structural variation and is associated with numerous genetic diseases. The advent of third-generation sequencing technology has facilitated the analysis of complex genomic structures and the elucidation of the mechanisms underlying phenotypic changes and disease onset due to genomic variants. Importantly, it has introduced innovative perspectives for deletion variant calling. Here we propose a method named Dual Attention Structural Variation (DASV) to analyze deletion structural variations in sequencing data. DASV converts gene alignment information into images and integrates them with genomic sequencing data through a dual attention mechanism. Subsequently, it employs a multi-scale network to precisely identify deletion regions. In comparisons with four widely used genome structural variation calling tools (cuteSV, SVIM, Sniffles and PBSV), DASV consistently achieves a balance between precision and recall, enhancing the F1 score across various datasets. The source code is available at https://github.com/deconvolution-w/DASV.


Subjects
High-Throughput Nucleotide Sequencing; Software; Humans; High-Throughput Nucleotide Sequencing/methods; Sequence Deletion; Sequence Analysis, DNA/methods; Algorithms; Genomics/methods; Computational Biology/methods
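
Note on the metrics named in record 3: the abstract reports precision, recall and F1 score but gives no counts. The following is a minimal sketch of how the three quantities relate, assuming calls have already been labelled as true positives, false positives and false negatives against a truth set. The function name and the counts are hypothetical and are not taken from the DASV paper or its code.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) from TP, FP and FN counts."""
    # Precision: fraction of reported calls that are correct.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall (sensitivity): fraction of true variants that were reported.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for one caller on one dataset (illustration only).
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
print(f"precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```

Because F1 is a harmonic mean, a caller with very high precision but low recall (or the reverse) scores lower than one that balances the two, which is the sense in which the abstract speaks of "a balance between precision and recall".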
4.
Trends Genet ; 38(6): 572-586, 2022 06.
Article in English | MEDLINE | ID: mdl-34906378

ABSTRACT

The development of new sequencing platforms, technologies, and bioinformatics tools in the past decade fostered key discoveries in human genomics. Among the most recent sequencing technologies, nanopore sequencing (NS) has caught the interest of researchers for its intriguing potential and flexibility. This up-to-date review highlights the recent application of NS in the hematology field, focusing on progress and challenges of the technological approaches employed for the identification of pathologic alterations. The molecular and analytic pipelines developed for the analysis of the whole genome, target regions, and transcriptomics provide evidence of the unparalleled amount of information that could be retrieved by an innovative approach based on long-read sequencing.


Subjects
Hematology; Nanopore Sequencing; Genome, Human; High-Throughput Nucleotide Sequencing; Humans; Sequence Analysis, DNA
5.
Am J Hum Genet ; 109(2): 195-209, 2022 02 03.
Article in English | MEDLINE | ID: mdl-35032432

ABSTRACT

Whole-genome sequencing resolves many clinical cases where standard diagnostic methods have failed. However, at least half of these cases remain unresolved after whole-genome sequencing. Structural variants (SVs; genomic variants larger than 50 base pairs) of uncertain significance are the genetic cause of a portion of these unresolved cases. As sequencing methods using long or linked reads become more accessible and SV detection algorithms improve, clinicians and researchers are gaining access to thousands of reliable SVs of unknown disease relevance. Methods to predict the pathogenicity of these SVs are required to realize the full diagnostic potential of long-read sequencing. To address this emerging need, we developed StrVCTVRE to distinguish pathogenic SVs from benign SVs that overlap exons. In a random forest classifier, we integrated features that capture gene importance, coding region, conservation, expression, and exon structure. We found that features such as expression and conservation are important but are absent from SV classification guidelines. We leveraged multiple resources to construct a size-matched training set of rare, putatively benign and pathogenic SVs. StrVCTVRE performs accurately across a wide SV size range on independent test sets, which will allow clinicians and researchers to eliminate about half of SVs from consideration while retaining a 90% sensitivity. We anticipate clinicians and researchers will use StrVCTVRE to prioritize SVs in probands where no SV is immediately compelling, empowering deeper investigation into novel SVs to resolve cases and understand new mechanisms of disease. StrVCTVRE runs rapidly and is publicly available.


Subjects
Algorithms; Genome, Human; Genomic Structural Variation; Software; Supervised Machine Learning; Datasets as Topic; Exons; Genomics/methods; Humans; ROC Curve; Whole Genome Sequencing/statistics & numerical data
6.
Brief Bioinform ; 24(4)2023 07 20.
Article in English | MEDLINE | ID: mdl-37200087

ABSTRACT

Structural variant (SV) detection is essential for genomic studies, and long-read sequencing technologies have advanced our capacity to detect SVs directly from reads or de novo assemblies, known as the read-based and assembly-based strategies. However, to date, no independent studies have compared and benchmarked the two strategies. Here, on the basis of SVs detected by 20 read-based and eight assembly-based detection pipelines from six datasets of the HG002 genome, we investigated the factors that influence the two strategies and assessed their performance with well-curated SVs. We found that up to 80% of the SVs could be detected by both strategies among different long-read datasets, whereas variant type, size, and breakpoint detected by the read-based strategy were greatly affected by aligners. For the high-confidence insertions and deletions at non-tandem repeat regions, a remarkable subset of them (82% in assembly-based calls and 93% in read-based calls), accounting for around 4000 SVs, could be captured by both reads and assemblies. However, discordance between the two strategies was largely caused by complex SVs and inversions, which resulted from inconsistent alignment of reads and assemblies at these loci. Finally, benchmarking with SVs at medically relevant genes, the recall of the read-based strategy reached 77% on 5X coverage data, whereas the assembly-based strategy required 20X coverage data to achieve similar performance. Therefore, integrating SVs from reads and assemblies is suggested for general-purpose detection because of inconsistently detected complex SVs and inversions, whereas the assembly-based strategy is optional for applications with limited resources.


Subjects
Benchmarking; Genome, Human; Humans; Sequence Analysis; Genomics/methods; Sequence Analysis, DNA/methods; High-Throughput Nucleotide Sequencing/methods
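
Note on the concordance figures in record 6: the abstract reports the share of calls found by both strategies (for example 82% of assembly-based and 93% of read-based calls) without stating how two calls were matched. The sketch below shows one common way such a shared fraction could be computed, assuming calls match when they have the same chromosome and type and a reciprocal overlap of at least 50%. That criterion, the `SV` type, the function names and the coordinates are all assumptions for illustration, not the method of the paper, and the example is limited to deletions because insertions have no reference span to overlap.

```python
from typing import NamedTuple


class SV(NamedTuple):
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # exclusive
    svtype: str  # e.g. "DEL"


def reciprocal_overlap(a: SV, b: SV) -> float:
    """Smaller of (overlap / len(a)) and (overlap / len(b)); 0.0 if no match is possible."""
    if a.chrom != b.chrom or a.svtype != b.svtype:
        return 0.0
    overlap = min(a.end, b.end) - max(a.start, b.start)
    if overlap <= 0:
        return 0.0
    return min(overlap / (a.end - a.start), overlap / (b.end - b.start))


def shared_fraction(calls: list[SV], other: list[SV], min_ro: float = 0.5) -> float:
    """Fraction of `calls` that have at least one matching call in `other`."""
    if not calls:
        return 0.0
    shared = sum(
        1 for c in calls
        if any(reciprocal_overlap(c, o) >= min_ro for o in other)
    )
    return shared / len(calls)


# Hypothetical call sets (illustration only).
read_based = [
    SV("chr1", 100, 200, "DEL"),
    SV("chr1", 1000, 1500, "DEL"),
    SV("chr2", 50, 150, "DEL"),
]
assembly_based = [
    SV("chr1", 110, 210, "DEL"),
    SV("chr1", 1400, 2000, "DEL"),
]

print(f"read-based calls also in assemblies: {shared_fraction(read_based, assembly_based):.2f}")
print(f"assembly-based calls also in reads:  {shared_fraction(assembly_based, read_based):.2f}")
```

The shared fraction is not symmetric: the same set of matched pairs is divided by a different total in each direction, which is why the abstract can report two different percentages for one comparison.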
7.
Mol Biol Evol ; 40(8)2023 08 03.
Article in English | MEDLINE | ID: mdl-37565562

ABSTRACT

During the origin of great apes about 14 million years ago, a series of phenotypic innovations emerged, such as the increased body size, the enlarged brain volume, the improved cognitive skill, and the diversified diet. Yet, the genomic basis of these evolutionary changes remains unclear. Utilizing the high-quality genome assemblies of great apes (including human), gibbon, and macaque, we conducted comparative genome analyses and identified 15,885 great ape-specific structural variants (GSSVs), including eight coding GSSVs resulting in the creation of novel proteins (e.g., ACAN and CMYA5). Functional annotations of the GSSV-related genes revealed the enrichment of genes involved in development and morphogenesis, especially neurogenesis and neural network formation, suggesting the potential role of GSSVs in shaping the great ape-shared traits. Further dissection of the brain-related GSSVs shows great ape-specific changes of enhancer activities and gene expression in the brain, involving a group of GSSV-regulated genes (such as NOL3) that potentially contribute to the altered brain development and function in great apes. The presented data highlight the evolutionary role of structural variants in the phenotypic innovations during the origin of the great ape lineage.


Subjects
Hominidae; Animals; Humans; Hominidae/genetics; Biological Evolution; Genome; Genomics; Phenotype
8.
Ann Hum Genet ; 88(2): 113-125, 2024 03.
Article in English | MEDLINE | ID: mdl-37807935

ABSTRACT

INTRODUCTION: Next-generation sequencing technology has greatly reduced the cost and time required for sequencing a genome. An approach that is rapidly being adopted as an alternative method for CNV analysis is low-pass whole genome sequencing (LP-WGS). Here, we evaluated the performance of LP-WGS to detect copy number variants (CNVs) in clinical cytogenetics. MATERIALS AND METHODS: DNA samples with known CNVs detected by chromosomal microarray analyses (CMA) were selected for comparison and used as positive controls; our panel included 44 DNA samples (12 prenatal and 32 postnatal), comprising a total of 55 chromosome imbalances. The selected cases were chosen to provide a wide range of clinically relevant CNVs, the vast majority being associated with intellectual disability or recognizable syndromes. The chromosome imbalances ranged in size from 75 kb to 90.3 Mb, including aneuploidies and two cases of mosaicism. RESULTS: All CNVs were successfully detected by LP-WGS, showing a high level of consistency and robust performance of the sequencing method. Notably, the sizes of chromosome imbalances detected by CMA and LP-WGS were compatible between the two different platforms, which indicates that the resolution and sensitivity of the LP-WGS approach are at least similar to those provided by CMA. DISCUSSION: Our data show the potential use of LP-WGS to detect CNVs in clinical diagnosis and confirm the method as an alternative for the detection of chromosome imbalances. The diagnostic effectiveness and feasibility of LP-WGS, in this technical validation study, were evidenced by a clinically representative dataset of CNVs that allowed a systematic assessment of the detection power and the accuracy of the sequencing approach. Further, since the software used in this study is commercially available, the method can easily be tested and implemented in a routine diagnostic setting.


Subjects
Aneuploidy; DNA Copy Number Variations; Pregnancy; Female; Humans; Cost-Benefit Analysis; Whole Genome Sequencing/methods; DNA
9.
Brief Bioinform ; 23(4)2022 07 18.
Article in English | MEDLINE | ID: mdl-35753701

ABSTRACT

Advances in whole-genome sequencing (WGS) promise to enable accurate and comprehensive structural variant (SV) discovery. Dissecting SVs from WGS data presents a substantial number of challenges, and a plethora of SV detection methods have been developed. Currently, evidence that investigators can use to select appropriate SV detection tools is lacking. In this article, we have evaluated the performance of SV detection tools on mouse and human WGS data using a comprehensive polymerase chain reaction-confirmed gold standard set of SVs and the Genome in a Bottle variant set, respectively. In contrast to the previous benchmarking studies, our gold standard dataset included a complete set of SVs, allowing us to report both precision and sensitivity rates of the SV detection methods. Our study investigates the ability of the methods to detect deletions, thus providing an optimistic estimate of SV detection performance, as the SV detection methods that fail to detect deletions are likely to miss more complex SVs. We found that SV detection tools varied widely in their performance, with several methods providing a good balance between sensitivity and precision. Additionally, we have determined the SV callers best suited for low- and ultralow-pass sequencing data as well as for different deletion length categories.


Subjects
Benchmarking; Genome, Human; Animals; High-Throughput Nucleotide Sequencing/methods; Humans; Mice; Whole Genome Sequencing/methods
10.
J Transl Med ; 22(1): 65, 2024 01 16.
Article in English | MEDLINE | ID: mdl-38229122

ABSTRACT

BACKGROUND: Accurate clinical structural variant (SV) calling is essential for cancer target identification and diagnosis but has been historically challenging due to the lack of ground truth for clinical specimens. Meanwhile, reduced clinical-testing cost is the key to widespread clinical utility. METHODS: We analyzed massive data from tumor samples of 476 patients and developed a computational framework for accurate and cost-effective detection of clinically relevant SVs. In addition, standard materials and classical experiments including immunohistochemistry and/or fluorescence in situ hybridization were used to validate the developed computational framework. RESULTS: We systematically evaluated the common algorithms for SV detection and established an expert-reviewed SV call set of 1,303 tumor-specific SVs with high-evidence levels. Moreover, we developed a random-forest-based decision model to improve the true-positive rate of SV calls. To independently validate the tailored 'two-step' strategy, we utilized standard materials and classical experiments. The accuracy of the model was over 90% (92-99.78%) for all types of data. CONCLUSION: Our study provides a valuable resource and an actionable guide to improve cancer-specific SV detection accuracy and clinical applicability.


Subjects
Genomics; Neoplasms; Humans; Benchmarking; Cost-Benefit Analysis; In Situ Hybridization, Fluorescence; Neoplasms/diagnosis; Neoplasms/genetics; Genome, Human; High-Throughput Nucleotide Sequencing
11.
Am J Med Genet A ; : e63802, 2024 Jun 25.
Article in English | MEDLINE | ID: mdl-38924610

ABSTRACT

Low-pass whole genome sequencing (LP-WGS) has been applied as an alternative method to detect copy number variants (CNVs) in the clinical setting. Compared with chromosomal microarray analysis (CMA), the sequencing-based approach provides a similar resolution of CNV detection at a lower cost. In this study, we assessed the efficiency and reliability of LP-WGS as a more affordable alternative to CMA. A total of 1363 patients with unexplained neurodevelopmental delay/intellectual disability, autism spectrum disorders, and/or multiple congenital anomalies were enrolled. Those patients were referred from 15 nonprofit organizations and university centers located in different states in Brazil. The analysis of LP-WGS at 1x coverage (>50 kb) revealed a positive testing result in 22% of the cases (304/1363), of which 219 and 85 correspond to pathogenic/likely pathogenic (P/LP) CNVs and variants of uncertain significance (VUS), respectively. The 16% (219/1363) diagnostic yield observed in our cohort is comparable to the 15%-20% reported for CMA in the literature. The use of commercial software, as demonstrated in this study, simplifies the implementation of the test in clinical settings. Particularly for countries like Brazil, where the cost of CMA presents a substantial barrier to most of the population, LP-WGS emerges as a cost-effective alternative for investigating copy number changes in cytogenetics.
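
Note on the percentages in record 11: the rates quoted in the abstract follow directly from the counts it gives (1363 patients, 219 P/LP findings, 85 VUS findings). The short sketch below reproduces that arithmetic. The variable and function names are illustrative and do not come from the paper.

```python
def percent(count: int, total: int) -> float:
    """Return count as a percentage of total."""
    return 100.0 * count / total


# Counts as stated in the abstract.
total_patients = 1363
plp_cases = 219   # pathogenic / likely pathogenic CNVs
vus_cases = 85    # variants of uncertain significance

# A "positive testing result" in the abstract covers both categories.
positive_cases = plp_cases + vus_cases

print(f"positive results: {positive_cases}/{total_patients} = {percent(positive_cases, total_patients):.1f}%")
print(f"diagnostic yield: {plp_cases}/{total_patients} = {percent(plp_cases, total_patients):.1f}%")
print(f"VUS rate:         {vus_cases}/{total_patients} = {percent(vus_cases, total_patients):.1f}%")
```

Only the P/LP count enters the diagnostic yield, which is why the 16% yield is lower than the 22% positive rate.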

12.
Int J Mol Sci ; 25(5)2024 Feb 25.
Article in English | MEDLINE | ID: mdl-38473917

ABSTRACT

Ocular malformations (OMs) arise from early defects during embryonic eye development. Despite the identification of over 100 genes linked to this heterogeneous group of disorders, the genetic cause remains unknown for half of the individuals following Whole-Exome Sequencing. Diagnostic procedures are further hampered by the difficulty of studying samples from clinically relevant tissue, which is one of the main obstacles in OMs. Whole-Genome Sequencing (WGS) to screen for non-coding regions and structural variants may unveil new diagnoses for OM individuals. In this study, we report a patient exhibiting a syndromic OM with a de novo 3.15 Mb inversion in the 6p25 region identified by WGS. This balanced structural variant was located 100 kb away from the FOXC1 gene, previously associated with ocular defects in the literature. We hypothesized that the inversion disrupts the topologically associating domain of FOXC1 and impairs the expression of the gene. Using a new type of sample to study transcripts, we were able to show that the patient presented monoallelic expression of FOXC1 in conjunctival cells, consistent with the abolition of the expression of the inverted allele. This report underscores the importance of investigating structural variants, even in non-coding regions, in individuals affected by ocular malformations.


Subjects
Eye Abnormalities; Microphthalmos; Humans; Transcription Factors/genetics; Microphthalmos/genetics; Anterior Eye Segment/abnormalities; Eye Abnormalities/genetics; Alleles; Forkhead Transcription Factors/genetics; Mutation
13.
BMC Bioinformatics ; 24(1): 352, 2023 Sep 20.
Article in English | MEDLINE | ID: mdl-37730581

ABSTRACT

We published a paper in BMC Bioinformatics comprehensively evaluating the performance of structural variation (SV) calling with long-read SV detection methods based on simulated error-prone long-read data under various sequencing settings. Recently, C.Y.T. et al. wrote a correspondence claiming that the performance of NanoVar was underestimated in our benchmarking and listed some errors in our previous manuscripts. To clarify these matters, we reproduced our previous benchmarking results and carried out a series of parallel experiments on both the newly generated simulated datasets and the ones provided by C.Y.T. et al. The robust benchmark results indicate that NanoVar has unstable performance on simulated data produced from different versions of VISOR, while other tools do not exhibit this phenomenon. Furthermore, the errors proposed by C.Y.T. et al. were due to them using another version of VISOR and Sniffles, which caused many changes in usage and results compared to the versions applied in our previous work. We hope that this commentary proves the validity of our previous publication, clarifies and eliminates the misunderstanding about the commands and results in our benchmarking. Furthermore, we welcome more experts and scholars in the scientific community to pay attention to our research and help us better optimize these valuable works.


Subjects
Benchmarking; Writing
14.
BMC Bioinformatics ; 24(1): 119, 2023 Mar 28.
Article in English | MEDLINE | ID: mdl-36977976

ABSTRACT

BACKGROUND: Genomic structural variant detection is a significant and challenging issue in genome analysis. The existing long-read based structural variant detection methods still have room for improvement in detecting multi-type structural variants. RESULTS: In this paper, we propose a method called cnnLSV to obtain detection results with higher quality by eliminating false positives in the detection results merged from the callsets of existing methods. We design an encoding strategy for four types of structural variants to represent long-read alignment information around structural variants as images, input the images into a constructed convolutional neural network to train a filter model, and load the trained model to remove the false positives to improve the detection performance. We also eliminate mislabeled training samples in the model training phase by using the principal component analysis algorithm and the unsupervised clustering algorithm k-means. Experimental results on both simulated and real datasets show that our proposed method outperforms existing methods overall in detecting insertions, deletions, inversions, and duplications. The program of cnnLSV is available at https://github.com/mhuidong/cnnLSV. CONCLUSIONS: The proposed cnnLSV can detect structural variants by using long-read alignment information and a convolutional neural network to achieve overall higher performance, and effectively eliminate incorrectly labeled samples by using the principal component analysis and k-means algorithms in the model training stage.


Subjects
High-Throughput Nucleotide Sequencing; Software; High-Throughput Nucleotide Sequencing/methods; Algorithms; Genome; Neural Networks, Computer
15.
Plant J ; 110(6): 1536-1550, 2022 06.
Article in English | MEDLINE | ID: mdl-35514123

ABSTRACT

Tomato has undergone extensive selection during domestication. Recent progress has shown that genomic structural variants (SVs) have contributed to gene expression dynamics during tomato domestication, resulting in changes of important traits. Here, we performed comprehensive analyses of small RNAs (sRNAs) from nine representative tomato accessions. We demonstrate that SVs substantially contribute to the dynamic expression of the three major classes of plant sRNAs: microRNAs (miRNAs), phased secondary short interfering RNAs (phasiRNAs), and 24-nucleotide heterochromatic siRNAs (hc-siRNAs). Changes in the abundance of phasiRNAs and 24-nucleotide hc-siRNAs likely contribute to the alteration of mRNA gene expression in cis during tomato domestication, particularly for genes associated with biotic and abiotic stress tolerance. We also observe that miRNA expression dynamics are associated with imprecise processing, alternative miRNA-miRNA* selections, and SVs. SVs mainly affect the expression of less-conserved miRNAs that do not have established regulatory functions or low-abundance members in highly expressed miRNA families. Our data highlight different selection pressures on miRNAs compared to phasiRNAs and 24-nucleotide hc-siRNAs. Our findings provide insights into plant sRNA evolution as well as SV-based gene regulation during crop domestication. Furthermore, our dataset provides a rich resource for mining the sRNA regulatory network in tomato.


Subjects
MicroRNAs; Solanum lycopersicum; Domestication; Gene Expression Regulation, Plant/genetics; Genomic Structural Variation; Solanum lycopersicum/genetics; Solanum lycopersicum/metabolism; MicroRNAs/genetics; MicroRNAs/metabolism; Nucleotides; RNA, Plant/genetics; RNA, Small Interfering/genetics; Transcriptome/genetics
16.
BMC Genomics ; 24(1): 469, 2023 Aug 21.
Article in English | MEDLINE | ID: mdl-37605126

ABSTRACT

BACKGROUND: All cancers harbor somatic mutations in their genomes. In principle, mutations affecting between one and fifty base pairs are generally classified as small mutational events. Conversely, large mutational events affect more than fifty base pairs, and, in most cases, they encompass copy-number and structural variants affecting many thousands of base pairs. Prior studies have demonstrated that examining patterns of somatic mutations can be leveraged to provide both biological and clinical insights, thus resulting in an extensive repertoire of tools for evaluating small mutational events. Recently, classification schemas for examining large-scale mutational events have emerged and shown their utility across the spectrum of human cancers. However, there has been no computationally efficient bioinformatics tool that allows visualizing and exploring these large-scale mutational events. RESULTS: Here, we present a new version of SigProfilerMatrixGenerator that now delivers integrated capabilities for examining large mutational events. The tool provides support for examining copy-number variants and structural variants under two previously developed classification schemas and it supports data from numerous algorithms and data modalities. SigProfilerMatrixGenerator is written in Python with an R wrapper package provided for users who prefer working in an R environment. CONCLUSIONS: The new version of SigProfilerMatrixGenerator provides the first standardized bioinformatics tool for optimized exploration and visualization of two previously developed classification schemas for copy number and structural variants. The tool is freely available at https://github.com/AlexandrovLab/SigProfilerMatrixGenerator with extensive documentation at https://osf.io/s93d5/wiki/home/.


Subjects
Algorithms; Computational Biology; Humans; Mutation
17.
Mol Genet Genomics ; 298(3): 735-754, 2023 May.
Article in English | MEDLINE | ID: mdl-37017807

ABSTRACT

Trichoderma atroviride and Trichoderma harzianum are widely used as commercial biocontrol agents against plant diseases. Recently, T. harzianum IOC-3844 (Th3844) and T. harzianum CBMAI-0179 (Th0179) demonstrated great potential in the enzymatic conversion of lignocellulose into fermentable sugars. Herein, we performed whole-genome sequencing and assembly of the Th3844 and Th0179 strains. To assess the genetic diversity within the genus Trichoderma, the results of both strains were compared with strains of T. atroviride CBMAI-00020 (Ta0020) and T. reesei CBMAI-0711 (Tr0711). The sequencing coverage value of all genomes evaluated in this study was higher than that of previously reported genomes for the same species of Trichoderma. The resulting assembly revealed total lengths of 40 Mb (Th3844), 39 Mb (Th0179), 36 Mb (Ta0020), and 32 Mb (Tr0711). A genome-wide phylogenetic analysis provided details on the relationships of the newly sequenced species with other Trichoderma species. Structural variants revealed genomic rearrangements among Th3844, Th0179, Ta0020, and Tr0711 relative to the T. reesei QM6a reference genome and showed the functional effects of such variants. In conclusion, the findings presented herein allow the visualization of genetic diversity in the evaluated strains and offer opportunities to explore such fungal genomes in future biotechnological and industrial applications.


Subjects
Trichoderma; Phylogeny; Trichoderma/genetics; Genomics
18.
Brief Bioinform ; 22(3)2021 05 20.
Article in English | MEDLINE | ID: mdl-32379294

ABSTRACT

Somatic structural variants (SVs), which are variants that typically impact >50 nucleotides, play a significant role in cancer development and evolution but are notoriously more difficult to detect than small variants from short-read next-generation sequencing (NGS) data. This is due to a combination of challenges attributed to the purity of tumour samples, tumour heterogeneity, limitations of short-read information from NGS and sequence alignment ambiguities. In spite of active development of SV detection tools (callers) over the past few years, each method has inherent advantages and limitations. In this review, we highlight some of the important factors affecting somatic SV detection and compare the performance of seven commonly used SV callers. In particular, we focus on the extent of change in sensitivity and precision for detecting different SV types and size ranges from samples with differing variant allele frequencies and sequencing depths of coverage. We highlight the reasons why some SV callers perform well in some settings but not others, allowing our evaluation findings to be extended beyond the seven SV callers examined in this paper. As the importance of large SVs becomes increasingly recognized in cancer genomics, this paper provides a timely review of some of the most impactful factors influencing somatic SV detection that should be considered when choosing SV callers.


Subjects
High-Throughput Nucleotide Sequencing/methods; Neoplasms/genetics; Gene Frequency; Genetic Variation; Humans; Neoplasms/pathology; Sequence Analysis, DNA/methods
19.
Brief Bioinform ; 22(4)2021 07 20.
Article in English | MEDLINE | ID: mdl-33378767

ABSTRACT

Short-read whole genome sequencing has become widely used to detect structural variants in human genetic studies and clinical practice. However, accurate detection of structural variants is a challenging task. In particular, existing structural variant detection approaches produce a large proportion of incorrect calls, so effective structural variant filtering approaches are urgently needed. In this study, we propose a novel deep learning-based approach, DeepSVFilter, for filtering structural variants in short-read whole genome sequencing data. DeepSVFilter encodes structural variant signals in the read alignments as images and adopts transfer learning with pre-trained convolutional neural networks as the classification models, which are trained on well-characterized samples with known high-confidence structural variants. We use two well-characterized samples to demonstrate DeepSVFilter's performance and its filtering effect coupled with commonly used structural variant detection approaches. The software DeepSVFilter is implemented in Python and is freely available at https://github.com/yongzhuang/DeepSVFilter.


Subjects
Deep Learning; Genome, Human; High-Throughput Nucleotide Sequencing; Software; Whole Genome Sequencing; Humans
20.
Clin Genet ; 104(3): 390-392, 2023 09.
Article in English | MEDLINE | ID: mdl-37157895

ABSTRACT

We describe a patient from the 100,000 Genomes Project with a complex de novo structural variant within KMT2E leading to O'Donnell-Luria-Rodan syndrome. This case expands the mutational spectrum for this syndrome and highlights the importance of revisiting unsolved cases using better SV prioritisation tools and updated gene panels.


Subjects
Chromosome Mapping; Humans; Female; Base Sequence; Mutation