Results 1 - 20 of 153
1.
Am J Hum Genet ; 2024 Sep 23.
Article in English | MEDLINE | ID: mdl-39332408

ABSTRACT

Whereas 16p11.2 BP4-5 copy-number variants (CNVs) represent one of the most pleiotropic etiologies of genomic syndromes in both clinical and population cohorts, the mechanisms leading to such pleiotropy remain understudied. Identifying 73 deletion and 89 duplication carrier individuals among unrelated White British UK Biobank participants, we performed a phenome-wide association study (PheWAS) between the region's copy number and 117 complex traits and diseases, mimicking four dosage models. Forty-six phenotypes (39%) were affected by 16p11.2 BP4-5 CNVs, with the deletion-only, mirror, U-shape, and duplication-only models being the best fit for 30, 10, 4, and 2 phenotypes, respectively, aligning with the stronger deleteriousness of the deletion. Upon individually adjusting CNV effects for either body mass index (BMI), height, or educational attainment (EA), we found that sixteen testable deletion-driven associations, primarily with cardiovascular and metabolic traits, were BMI dependent, with EA playing a more subtle role and no association depending on height. Bidirectional Mendelian randomization supported that 13 out of these 16 associations were secondary consequences of the CNV's impact on BMI. For the 23 traits that remained significantly associated upon individual adjustment for mediators, matched-control analyses found that 10 phenotypes, including musculoskeletal traits, liver enzymes, fluid intelligence, platelet count, and pneumonia and acute kidney injury risk, remained associated under strict Bonferroni correction, with 10 additional nominally significant associations. These results paint a complex picture of 16p11.2 BP4-5's pleiotropic pattern that involves direct effects on multiple physiological systems and indirect co-morbidities consequential to the CNV's impact on BMI and EA, acting through trait-specific dosage mechanisms.
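
The four dosage models named in this abstract (mirror, U-shape, deletion-only, duplication-only) can be illustrated with a small encoding table. This is only a sketch under stated assumptions: the copy-number coding (1 = deletion, 2 = copy-neutral, 3 = duplication), the numeric predictor values, and the choice to exclude carriers of the opposite CNV in the "-only" models are illustrative and are not taken from the abstract.

```python
from typing import Optional

# Assumed copy-number coding: 1 = deletion carrier, 2 = copy-neutral, 3 = duplication carrier.
# Each model maps a copy-number state to the predictor used in the association test.
# None marks individuals assumed to be excluded from that model (an illustrative choice).
DOSAGE_MODELS = {
    "mirror": {1: -1, 2: 0, 3: 1},             # deletion and duplication have opposite effects
    "u_shape": {1: 1, 2: 0, 3: 1},             # deletion and duplication shift the trait the same way
    "deletion_only": {1: 1, 2: 0, 3: None},    # only the deletion is tested
    "duplication_only": {1: None, 2: 0, 3: 1}, # only the duplication is tested
}


def encode(copy_number: int, model: str) -> Optional[int]:
    """Return the predictor value for one individual under the given dosage model."""
    return DOSAGE_MODELS[model][copy_number]
```

Under this coding, a trait that best fits the mirror model changes in opposite directions for deletion and duplication carriers, whereas a U-shape trait changes in the same direction for both.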

2.
Am J Hum Genet ; 2024 Sep 23.
Article in English | MEDLINE | ID: mdl-39332410

ABSTRACT

Recurrent genomic rearrangements at 16p11.2 BP4-5 represent one of the most common causes of genomic disorders. Originally associated with increased risk for autism spectrum disorder, schizophrenia, and intellectual disability, as well as adiposity and head circumference, these CNVs have since been associated with a plethora of phenotypic alterations, albeit with high variability in expressivity and incomplete penetrance. Here, we comprehensively review the pleiotropy associated with 16p11.2 BP4-5 rearrangements to shine light on its full phenotypic spectrum. Illustrating this phenotypic heterogeneity, we expose many parallels between findings gathered from clinical versus population-based cohorts, which often point to the same physiological systems, and emphasize the role of the CNV beyond neuropsychiatric and anthropometric traits. Revealing the complex and variable clinical manifestations of this CNV is crucial for accurate diagnosis and personalized treatment strategies for carrier individuals. Furthermore, we discuss areas of research that will be key to identifying factors contributing to phenotypic heterogeneity and gaining mechanistic insights into the molecular pathways underlying observed associations, while demonstrating how diversity in affected individuals, cohorts, experimental models, and analytical approaches can catalyze discoveries.

3.
Am J Hum Genet ; 110(9): 1454-1469, 2023 09 07.
Article in English | MEDLINE | ID: mdl-37595579

ABSTRACT

Short-read genome sequencing (GS) holds the promise of becoming the primary diagnostic approach for the assessment of autism spectrum disorder (ASD) and fetal structural anomalies (FSAs). However, few studies have comprehensively evaluated its performance against current standard-of-care diagnostic tests: karyotype, chromosomal microarray (CMA), and exome sequencing (ES). To assess the clinical utility of GS, we compared its diagnostic yield against these three tests in 1,612 quartet families including an individual with ASD and in 295 prenatal families. Our GS analytic framework identified a diagnostic variant in 7.8% of ASD probands, almost 2-fold more than CMA (4.3%) and 3-fold more than ES (2.7%). However, when we systematically captured copy-number variants (CNVs) from the exome data, the diagnostic yield of ES (7.4%) was brought much closer to, but did not surpass, GS. Similarly, we estimated that GS could achieve an overall diagnostic yield of 46.1% in unselected FSAs, representing a 17.2% increased yield over karyotype, 14.1% over CMA, and 4.1% over ES with CNV calling or 36.1% increase without CNV discovery. Overall, GS provided an added diagnostic yield of 0.4% and 0.8% beyond the combination of all three standard-of-care tests in ASD and FSAs, respectively. This corresponded to nine GS unique diagnostic variants, including sequence variants in exons not captured by ES, structural variants (SVs) inaccessible to existing standard-of-care tests, and SVs where the resolution of GS changed variant classification. Overall, this large-scale evaluation demonstrated that GS significantly outperforms each individual standard-of-care test while also outperforming the combination of all three tests, thus warranting consideration as the first-tier diagnostic approach for the assessment of ASD and FSAs.


Subjects
Autism Spectrum Disorder, Female, Pregnancy, Humans, Autism Spectrum Disorder/diagnosis, Autism Spectrum Disorder/genetics, First Pregnancy Trimester, Prenatal Ultrasonography, Chromosome Mapping, Exome
4.
Am J Hum Genet ; 110(2): 300-313, 2023 02 02.
Article in English | MEDLINE | ID: mdl-36706759

ABSTRACT

While extensively studied in clinical cohorts, the phenotypic consequences of 22q11.2 copy-number variants (CNVs) in the general population remain understudied. To address this gap, we performed a phenome-wide association scan in 405,324 unrelated UK Biobank (UKBB) participants using CNV calls from genotyping arrays. We mapped 236 Human Phenotype Ontology terms linked to any of the 90 genes encompassed by the region to 170 UKBB traits and assessed the association between these traits and the copy-number state of 504 genotyping array probes in the region. We found significant associations for eight continuous and nine binary traits associated under different models (duplication-only, deletion-only, U-shape, and mirror models). The causal effect of the expression level of 22q11.2 genes on associated traits was assessed through transcriptome-wide Mendelian randomization (TWMR), revealing that increased expression of ARVCF increased BMI. Similarly, increased DGCR6 expression causally reduced mean platelet volume, in line with the corresponding CNV effect. Furthermore, cross-trait multivariable Mendelian randomization (MVMR) suggested a predominant role of genuine (horizontal) pleiotropy in the CNV region. Our findings show that within the general population, 22q11.2 CNVs are associated with traits previously linked to genes in the region, and duplications and deletions act upon traits in different fashions. We also showed that gain or loss of distinct segments within 22q11.2 may impact a trait under different association models. Our results have provided new insights to help further the understanding of the complex 22q11.2 region.


Subjects
DNA Copy Number Variations, Phenomics, Humans, DNA Copy Number Variations/genetics, Phenotype, Human Chromosome Pair 22
5.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-38851298

ABSTRACT

Deletion is a crucial type of genomic structural variation and is associated with numerous genetic diseases. The advent of third-generation sequencing technology has facilitated the analysis of complex genomic structures and the elucidation of the mechanisms underlying phenotypic changes and disease onset due to genomic variants. Importantly, it has introduced innovative perspectives for deletion variant calling. Here we propose a method named Dual Attention Structural Variation (DASV) to analyze deletion structural variations in sequencing data. DASV converts gene alignment information into images and integrates them with genomic sequencing data through a dual attention mechanism. Subsequently, it employs a multi-scale network to precisely identify deletion regions. Compared with four widely used genome structural variation calling tools (cuteSV, SVIM, Sniffles, and PBSV), DASV consistently achieves a balance between precision and recall, enhancing the F1 score across various datasets. The source code is available at https://github.com/deconvolution-w/DASV.


Subjects
High-Throughput Nucleotide Sequencing, Software, Humans, High-Throughput Nucleotide Sequencing/methods, Sequence Deletion, DNA Sequence Analysis/methods, Algorithms, Genomics/methods, Computational Biology/methods
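
The F1 score mentioned in this abstract is the harmonic mean of precision and recall. A minimal sketch of that standard formula follows. The function name and the zero-division convention are illustrative choices, and nothing here reproduces the DASV code or its reported results.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        # Convention chosen for this sketch: F1 is 0 when both inputs are 0.
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Because the harmonic mean is pulled toward the smaller input, a caller with high precision but low recall (or the reverse) scores lower than one that balances the two.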
6.
Trends Genet ; 38(6): 572-586, 2022 06.
Article in English | MEDLINE | ID: mdl-34906378

ABSTRACT

The development of new sequencing platforms, technologies, and bioinformatics tools in the past decade fostered key discoveries in human genomics. Among the most recent sequencing technologies, nanopore sequencing (NS) has caught the interest of researchers for its intriguing potential and flexibility. This up-to-date review highlights the recent application of NS in the hematology field, focusing on progress and challenges of the technological approaches employed for the identification of pathologic alterations. The molecular and analytic pipelines developed for the analysis of the whole genome, target regions, and transcriptomics provide evidence of the unparalleled amount of information that could be retrieved by an innovative approach based on long-read sequencing.


Subjects
Hematology, Nanopore Sequencing, Human Genome, High-Throughput Nucleotide Sequencing, Humans, DNA Sequence Analysis
7.
Am J Hum Genet ; 109(2): 195-209, 2022 02 03.
Article in English | MEDLINE | ID: mdl-35032432

ABSTRACT

Whole-genome sequencing resolves many clinical cases where standard diagnostic methods have failed. However, at least half of these cases remain unresolved after whole-genome sequencing. Structural variants (SVs; genomic variants larger than 50 base pairs) of uncertain significance are the genetic cause of a portion of these unresolved cases. As sequencing methods using long or linked reads become more accessible and SV detection algorithms improve, clinicians and researchers are gaining access to thousands of reliable SVs of unknown disease relevance. Methods to predict the pathogenicity of these SVs are required to realize the full diagnostic potential of long-read sequencing. To address this emerging need, we developed StrVCTVRE to distinguish pathogenic SVs from benign SVs that overlap exons. In a random forest classifier, we integrated features that capture gene importance, coding region, conservation, expression, and exon structure. We found that features such as expression and conservation are important but are absent from SV classification guidelines. We leveraged multiple resources to construct a size-matched training set of rare, putatively benign and pathogenic SVs. StrVCTVRE performs accurately across a wide SV size range on independent test sets, which will allow clinicians and researchers to eliminate about half of SVs from consideration while retaining a 90% sensitivity. We anticipate clinicians and researchers will use StrVCTVRE to prioritize SVs in probands where no SV is immediately compelling, empowering deeper investigation into novel SVs to resolve cases and understand new mechanisms of disease. StrVCTVRE runs rapidly and is publicly available.


Subjects
Algorithms, Human Genome, Genomic Structural Variation, Software, Supervised Machine Learning, Datasets as Topic, Exons, Genomics/methods, Humans, ROC Curve, Whole Genome Sequencing/statistics & numerical data
8.
Brief Bioinform ; 24(4)2023 07 20.
Article in English | MEDLINE | ID: mdl-37200087

ABSTRACT

Structural variant (SV) detection is essential for genomic studies, and long-read sequencing technologies have advanced our capacity to detect SVs directly from reads or de novo assemblies, known as the read-based and assembly-based strategies, respectively. However, to date, no independent studies have compared and benchmarked the two strategies. Here, on the basis of SVs detected by 20 read-based and eight assembly-based detection pipelines from six datasets of the HG002 genome, we investigated the factors that influence the two strategies and assessed their performance with well-curated SVs. We found that up to 80% of the SVs could be detected by both strategies among different long-read datasets, whereas variant type, size, and breakpoint detected by the read-based strategy were greatly affected by aligners. For the high-confidence insertions and deletions at non-tandem repeat regions, a remarkable subset of them (82% in assembly-based calls and 93% in read-based calls), accounting for around 4000 SVs, could be captured by both reads and assemblies. However, discordance between the two strategies was largely caused by complex SVs and inversions, which resulted from inconsistent alignment of reads and assemblies at these loci. Finally, benchmarking with SVs at medically relevant genes, the recall of the read-based strategy reached 77% on 5X coverage data, whereas the assembly-based strategy required 20X coverage data to achieve similar performance. Therefore, integrating SVs from read and assembly is suggested for general-purpose detection because of inconsistently detected complex SVs and inversions, whereas assembly-based strategy is optional for applications with limited resources.


Subjects
Benchmarking, Human Genome, Humans, Sequence Analysis, Genomics/methods, DNA Sequence Analysis/methods, High-Throughput Nucleotide Sequencing/methods
9.
Mol Biol Evol ; 40(8)2023 08 03.
Article in English | MEDLINE | ID: mdl-37565562

ABSTRACT

During the origin of great apes about 14 million years ago, a series of phenotypic innovations emerged, such as the increased body size, the enlarged brain volume, the improved cognitive skill, and the diversified diet. Yet, the genomic basis of these evolutionary changes remains unclear. Utilizing the high-quality genome assemblies of great apes (including human), gibbon, and macaque, we conducted comparative genome analyses and identified 15,885 great ape-specific structural variants (GSSVs), including eight coding GSSVs resulting in the creation of novel proteins (e.g., ACAN and CMYA5). Functional annotations of the GSSV-related genes revealed the enrichment of genes involved in development and morphogenesis, especially neurogenesis and neural network formation, suggesting the potential role of GSSVs in shaping the great ape-shared traits. Further dissection of the brain-related GSSVs shows great ape-specific changes of enhancer activities and gene expression in the brain, involving a group of GSSV-regulated genes (such as NOL3) that potentially contribute to the altered brain development and function in great apes. The presented data highlight the evolutionary role of structural variants in the phenotypic innovations during the origin of the great ape lineage.


Subjects
Hominidae, Animals, Humans, Hominidae/genetics, Biological Evolution, Genome, Genomics, Phenotype
10.
Ann Hum Genet ; 88(2): 113-125, 2024 03.
Article in English | MEDLINE | ID: mdl-37807935

ABSTRACT

INTRODUCTION: Next generation sequencing technology has greatly reduced the cost and time required for sequencing a genome. An approach that is rapidly being adopted as an alternative method for CNV analysis is low-pass whole genome sequencing (LP-WGS). Here, we evaluated the performance of LP-WGS to detect copy number variants (CNVs) in clinical cytogenetics. MATERIALS AND METHODS: DNA samples with known CNVs detected by chromosomal microarray analyses (CMA) were selected for comparison and used as positive controls; our panel included 44 DNA samples (12 prenatal and 32 postnatal), comprising a total of 55 chromosome imbalances. The selected cases were chosen to provide a wide range of clinically relevant CNVs, the vast majority being associated with intellectual disability or recognizable syndromes. The chromosome imbalances ranged in size from 75 kb to 90.3 Mb, including aneuploidies and two cases of mosaicism. RESULTS: All CNVs were successfully detected by LP-WGS, showing a high level of consistency and robust performance of the sequencing method. Notably, the sizes of chromosome imbalances detected by CMA and LP-WGS were compatible between the two different platforms, which indicates that the resolution and sensitivity of the LP-WGS approach are at least similar to those provided by CMA. DISCUSSION: Our data show the potential use of LP-WGS to detect CNVs in clinical diagnosis and confirm the method as an alternative for chromosome imbalance detection. The diagnostic effectiveness and feasibility of LP-WGS, in this technical validation study, were evidenced by a clinically representative dataset of CNVs that allowed a systematic assessment of the detection power and the accuracy of the sequencing approach. Further, since the software used in this study is commercially available, the method can easily be tested and implemented in a routine diagnostic setting.


Subjects
Aneuploidy, DNA Copy Number Variations, Pregnancy, Female, Humans, Cost-Benefit Analysis, Whole Genome Sequencing/methods, DNA
11.
Brief Bioinform ; 23(4)2022 07 18.
Article in English | MEDLINE | ID: mdl-35753701

ABSTRACT

Advances in whole-genome sequencing (WGS) promise to enable the accurate and comprehensive structural variant (SV) discovery. Dissecting SVs from WGS data presents a substantial number of challenges and a plethora of SV detection methods have been developed. Currently, evidence that investigators can use to select appropriate SV detection tools is lacking. In this article, we have evaluated the performance of SV detection tools on mouse and human WGS data using a comprehensive polymerase chain reaction-confirmed gold standard set of SVs and the genome-in-a-bottle variant set, respectively. In contrast to the previous benchmarking studies, our gold standard dataset included a complete set of SVs allowing us to report both precision and sensitivity rates of the SV detection methods. Our study investigates the ability of the methods to detect deletions, thus providing an optimistic estimate of SV detection performance as the SV detection methods that fail to detect deletions are likely to miss more complex SVs. We found that SV detection tools varied widely in their performance, with several methods providing a good balance between sensitivity and precision. Additionally, we have determined the SV callers best suited for low- and ultralow-pass sequencing data as well as for different deletion length categories.


Subjects
Benchmarking, Human Genome, Animals, High-Throughput Nucleotide Sequencing/methods, Humans, Mice, Whole Genome Sequencing/methods
12.
J Transl Med ; 22(1): 65, 2024 01 16.
Article in English | MEDLINE | ID: mdl-38229122

ABSTRACT

BACKGROUND: Accurate clinical structural variant (SV) calling is essential for cancer target identification and diagnosis but has been historically challenging due to the lack of ground truth for clinical specimens. Meanwhile, reduced clinical-testing cost is the key to the widespread clinical utility. METHODS: We analyzed massive data from tumor samples of 476 patients and developed a computational framework for accurate and cost-effective detection of clinically-relevant SVs. In addition, standard materials and classical experiments including immunohistochemistry and/or fluorescence in situ hybridization were used to validate the developed computational framework. RESULTS: We systematically evaluated the common algorithms for SV detection and established an expert-reviewed SV call set of 1,303 tumor-specific SVs with high-evidence levels. Moreover, we developed a random-forest-based decision model to improve the true positive of SVs. To independently validate the tailored 'two-step' strategy, we utilized standard materials and classical experiments. The accuracy of the model was over 90% (92-99.78%) for all types of data. CONCLUSION: Our study provides a valuable resource and an actionable guide to improve cancer-specific SV detection accuracy and clinical applicability.


Subjects
Genomics, Neoplasms, Humans, Benchmarking, Cost-Benefit Analysis, Fluorescence In Situ Hybridization, Neoplasms/diagnosis, Neoplasms/genetics, Human Genome, High-Throughput Nucleotide Sequencing
13.
New Phytol ; 243(4): 1490-1505, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39021210

ABSTRACT

Grapevine downy mildew, caused by the oomycete Plasmopara viticola (P. viticola, Berk. & M. A. Curtis; Berl. & De Toni), is a global threat to Eurasian wine grapes Vitis vinifera. Although resistant grapevine varieties are becoming more accessible, P. viticola populations are rapidly evolving to overcome these resistances. We aimed to uncover avirulence genes related to Rpv3.1-mediated grapevine resistance. We sequenced the genomes and characterized the development of 136 P. viticola strains on resistant and sensitive grapevine cultivars. A genome-wide association study was conducted to identify genomic variations associated with resistance-breaking phenotypes. We identified a genomic region associated with the breakdown of Rpv3.1 grapevine resistance (avrRpv3.1 locus). A diploid-aware reassembly of the P. viticola INRA-Pv221 genome revealed structural variations in this locus, including a 30 kbp deletion. Virulent P. viticola strains displayed multiple deletions on both haplotypes at the avrRpv3.1 locus. These deletions involve two paralog genes coding for proteins with 800-900 amino acids and signal peptides. These proteins exhibited a structure featuring LWY-fold structural modules, common among oomycete effectors. When transiently expressed, these proteins induced cell death in grapevines carrying Rpv3.1 resistance, confirming their avirulence nature. This discovery sheds light on the genetic mechanisms enabling P. viticola to adapt to grapevine resistance, laying a foundation for developing strategies to manage this destructive crop pathogen.


Subjects
Disease Resistance, Plant Diseases, Vitis, Vitis/genetics, Vitis/microbiology, Plant Diseases/microbiology, Plant Diseases/genetics, Plant Diseases/immunology, Disease Resistance/genetics, Oomycetes/pathogenicity, Genome-Wide Association Study, Sequence Deletion, Plant Genes, Haplotypes/genetics, Gene Deletion, Phenotype
14.
Am J Med Genet A ; : e63802, 2024 Jun 25.
Article in English | MEDLINE | ID: mdl-38924610

ABSTRACT

Low-pass whole genome sequencing (LP-WGS) has been applied as an alternative method to detect copy number variants (CNVs) in the clinical setting. Compared with chromosomal microarray analysis (CMA), the sequencing-based approach provides a similar resolution of CNV detection at a lower cost. In this study, we assessed the efficiency and reliability of LP-WGS as a more affordable alternative to CMA. A total of 1363 patients with unexplained neurodevelopmental delay/intellectual disability, autism spectrum disorders, and/or multiple congenital anomalies were enrolled. Those patients were referred from 15 nonprofit organizations and university centers located in different states in Brazil. The analysis of LP-WGS at 1x coverage (>50kb) revealed a positive testing result in 22% of the cases (304/1363), in which 219 and 85 correspond to pathogenic/likely pathogenic (P/LP) CNVs and variants of uncertain significance (VUS), respectively. The 16% (219/1363) diagnostic yield observed in our cohort is comparable to the 15%-20% reported for CMA in the literature. The use of commercial software, as demonstrated in this study, simplifies the implementation of the test in clinical settings. Particularly for countries like Brazil, where the cost of CMA presents a substantial barrier to most of the population, LP-WGS emerges as a cost-effective alternative for investigating copy number changes in cytogenetics.
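
The percentages in this abstract follow directly from the reported counts. The short sketch below recomputes them. The variable and function names are illustrative, and only the three counts (1363, 219, 85) come from the abstract.

```python
def percent(count: int, total: int) -> float:
    """Share of `count` in `total`, expressed as a percentage."""
    return 100.0 * count / total


# Counts reported in the abstract.
cohort_size = 1363   # enrolled patients
p_lp_cases = 219     # pathogenic/likely pathogenic CNVs
vus_cases = 85       # variants of uncertain significance

# A positive result is either a P/LP CNV or a VUS.
positive_cases = p_lp_cases + vus_cases

positive_rate = percent(positive_cases, cohort_size)   # rounds to 22
diagnostic_yield = percent(p_lp_cases, cohort_size)    # rounds to 16
```

The diagnostic yield counts only P/LP findings, which is why it is lower than the overall positive rate.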

15.
Int J Mol Sci ; 25(5)2024 Feb 25.
Article in English | MEDLINE | ID: mdl-38473917

ABSTRACT

Ocular malformations (OMs) arise from early defects during embryonic eye development. Despite the identification of over 100 genes linked to this heterogeneous group of disorders, the genetic cause remains unknown for half of the individuals following Whole-Exome Sequencing. Diagnostic procedures are further hampered by the difficulty of studying samples from clinically relevant tissue, which is one of the main obstacles in OMs. Whole-Genome Sequencing (WGS) to screen for non-coding regions and structural variants may unveil new diagnoses for OM individuals. In this study, we report a patient exhibiting a syndromic OM with a de novo 3.15 Mb inversion in the 6p25 region identified by WGS. This balanced structural variant was located 100 kb away from the FOXC1 gene, previously associated with ocular defects in the literature. We hypothesized that the inversion disrupts the topologically associating domain of FOXC1 and impairs the expression of the gene. Using a new type of sample to study transcripts, we were able to show that the patient presented monoallelic expression of FOXC1 in conjunctival cells, consistent with the abolition of the expression of the inverted allele. This report underscores the importance of investigating structural variants, even in non-coding regions, in individuals affected by ocular malformations.


Subjects
Eye Abnormalities, Microphthalmos, Humans, Transcription Factors/genetics, Microphthalmos/genetics, Anterior Eye Segment/abnormalities, Eye Abnormalities/genetics, Alleles, Forkhead Transcription Factors/genetics, Mutation
16.
Int J Mol Sci ; 25(15)2024 Jul 30.
Article in English | MEDLINE | ID: mdl-39125883

ABSTRACT

Bardet-Biedl syndrome (BBS) is a rare recessive multisystem disorder characterized by retinitis pigmentosa, obesity, postaxial polydactyly, cognitive deficits, and genitourinary defects. BBS is clinically variable and genetically heterogeneous, with 26 genes identified to contribute to the disorder when mutated, the majority encoding proteins playing a role in primary cilium biogenesis, intraflagellar transport, and ciliary trafficking. Here, we report on an 18-year-old boy with features including severe photophobia and central vision loss since childhood, hexadactyly of the right foot and a supernumerary nipple, which were suggestive of BBS. Genetic analyses using targeted resequencing and exome sequencing failed to provide a conclusive genetic diagnosis. Whole-genome sequencing (WGS) allowed us to identify compound heterozygosity for a missense variant and a large intragenic deletion encompassing exon 12 in BBS9 as underlying the condition. We assessed the functional impact of the identified variants and demonstrated that they impair BBS9 function, with significant consequences for primary cilium formation and morphology. Overall, this study further highlights the usefulness of WGS in the diagnostic workflow of rare diseases to reach a definitive diagnosis. This report also remarks on a requirement for functional validation analyses to more effectively classify variants that are identified in the frame of the diagnostic workflow.


Subjects
Bardet-Biedl Syndrome, Whole Genome Sequencing, Bardet-Biedl Syndrome/genetics, Bardet-Biedl Syndrome/diagnosis, Humans, Male, Adolescent, Cilia/pathology, Cilia/genetics, Cytoskeletal Proteins
17.
BMC Bioinformatics ; 24(1): 352, 2023 Sep 20.
Article in English | MEDLINE | ID: mdl-37730581

ABSTRACT

We published a paper in BMC Bioinformatics comprehensively evaluating the performance of structural variation (SV) calling with long-read SV detection methods based on simulated error-prone long-read data under various sequencing settings. Recently, C.Y.T. et al. wrote a correspondence claiming that the performance of NanoVar was underestimated in our benchmarking and listed some errors in our previous manuscripts. To clarify these matters, we reproduced our previous benchmarking results and carried out a series of parallel experiments on both the newly generated simulated datasets and the ones provided by C.Y.T. et al. The robust benchmark results indicate that NanoVar has unstable performance on simulated data produced from different versions of VISOR, while other tools do not exhibit this phenomenon. Furthermore, the errors proposed by C.Y.T. et al. were due to them using another version of VISOR and Sniffles, which caused many changes in usage and results compared to the versions applied in our previous work. We hope that this commentary proves the validity of our previous publication, clarifies and eliminates the misunderstanding about the commands and results in our benchmarking. Furthermore, we welcome more experts and scholars in the scientific community to pay attention to our research and help us better optimize these valuable works.


Subjects
Benchmarking, Writing
18.
BMC Bioinformatics ; 24(1): 119, 2023 Mar 28.
Article in English | MEDLINE | ID: mdl-36977976

ABSTRACT

BACKGROUND: Genomic structural variant detection is a significant and challenging issue in genome analysis. The existing long-read based structural variant detection methods still have room for improvement in detecting multi-type structural variants. RESULTS: In this paper, we propose a method called cnnLSV to obtain detection results with higher quality by eliminating false positives in the detection results merged from the callsets of existing methods. We design an encoding strategy for four types of structural variants to encode long-read alignment information around structural variants as images, input the images into a constructed convolutional neural network to train a filter model, and load the trained model to remove the false positives to improve the detection performance. We also eliminate mislabeled training samples in the model training phase by using the principal component analysis algorithm and the unsupervised clustering algorithm k-means. Experimental results on both simulated and real datasets show that our proposed method outperforms existing methods overall in detecting insertions, deletions, inversions, and duplications. The program of cnnLSV is available at https://github.com/mhuidong/cnnLSV. CONCLUSIONS: The proposed cnnLSV can detect structural variants by using long-read alignment information and a convolutional neural network to achieve overall higher performance, and effectively eliminate incorrectly labeled samples by using the principal component analysis and k-means algorithms in the model training stage.


Subjects
High-Throughput Nucleotide Sequencing, Software, High-Throughput Nucleotide Sequencing/methods, Algorithms, Genome, Computer Neural Networks
19.
Plant J ; 110(6): 1536-1550, 2022 06.
Article in English | MEDLINE | ID: mdl-35514123

ABSTRACT

Tomato has undergone extensive selections during domestication. Recent progress has shown that genomic structural variants (SVs) have contributed to gene expression dynamics during tomato domestication, resulting in changes of important traits. Here, we performed comprehensive analyses of small RNAs (sRNAs) from nine representative tomato accessions. We demonstrate that SVs substantially contribute to the dynamic expression of the three major classes of plant sRNAs: microRNAs (miRNAs), phased secondary short interfering RNAs (phasiRNAs), and 24-nucleotide heterochromatic siRNAs (hc-siRNAs). Changes in the abundance of phasiRNAs and 24-nucleotide hc-siRNAs likely contribute to the alteration of mRNA gene expression in cis during tomato domestication, particularly for genes associated with biotic and abiotic stress tolerance. We also observe that miRNA expression dynamics are associated with imprecise processing, alternative miRNA-miRNA* selections, and SVs. SVs mainly affect the expression of less-conserved miRNAs that do not have established regulatory functions or low abundant members in highly expressed miRNA families. Our data highlight different selection pressures on miRNAs compared to phasiRNAs and 24-nucleotide hc-siRNAs. Our findings provide insights into plant sRNA evolution as well as SV-based gene regulation during crop domestication. Furthermore, our dataset provides a rich resource for mining the sRNA regulatory network in tomato.


Subjects
MicroRNAs, Solanum lycopersicum, Domestication, Plant Gene Expression Regulation/genetics, Genomic Structural Variation, Solanum lycopersicum/genetics, Solanum lycopersicum/metabolism, MicroRNAs/genetics, MicroRNAs/metabolism, Nucleotides, Plant RNA/genetics, Small Interfering RNA/genetics, Transcriptome/genetics
20.
BMC Genomics ; 24(1): 469, 2023 Aug 21.
Article in English | MEDLINE | ID: mdl-37605126

ABSTRACT

BACKGROUND: All cancers harbor somatic mutations in their genomes. In principle, mutations affecting between one and fifty base pairs are generally classified as small mutational events. Conversely, large mutational events affect more than fifty base pairs, and, in most cases, they encompass copy-number and structural variants affecting many thousands of base pairs. Prior studies have demonstrated that examining patterns of somatic mutations can be leveraged to provide both biological and clinical insights, thus, resulting in an extensive repertoire of tools for evaluating small mutational events. Recently, classification schemas for examining large-scale mutational events have emerged and shown their utility across the spectrum of human cancers. However, there has been no computationally efficient bioinformatics tool that allows visualizing and exploring these large-scale mutational events. RESULTS: Here, we present a new version of SigProfilerMatrixGenerator that now delivers integrated capabilities for examining large mutational events. The tool provides support for examining copy-number variants and structural variants under two previously developed classification schemas and it supports data from numerous algorithms and data modalities. SigProfilerMatrixGenerator is written in Python with an R wrapper package provided for users that prefer working in an R environment. CONCLUSIONS: The new version of SigProfilerMatrixGenerator provides the first standardized bioinformatics tool for optimized exploration and visualization of two previously developed classification schemas for copy number and structural variants. The tool is freely available at https://github.com/AlexandrovLab/SigProfilerMatrixGenerator with an extensive documentation at https://osf.io/s93d5/wiki/home/ .


Subjects
Algorithms, Computational Biology, Humans, Mutation
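
The size convention stated in this abstract (small mutational events affect between one and fifty base pairs, large events affect more than fifty) can be written as a one-line rule. The sketch below is illustrative only: the function name and constant are hypothetical and are not part of SigProfilerMatrixGenerator.

```python
# Threshold taken from the abstract: events of 1 to 50 bp are "small", above 50 bp "large".
SMALL_EVENT_MAX_BP = 50


def classify_event(length_bp: int) -> str:
    """Label a mutational event as 'small' or 'large' by the number of affected base pairs."""
    if length_bp < 1:
        raise ValueError("an event must affect at least one base pair")
    return "small" if length_bp <= SMALL_EVENT_MAX_BP else "large"
```

Copy-number and structural variants that span many thousands of base pairs fall in the "large" class under this rule.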