ABSTRACT
MOTIVATION: Molecular quantitative trait locus (QTL) mapping has proven to be a powerful approach for prioritizing genetic regulatory variants and causal genes identified by genome-wide association studies. Recently, this success has been extended to circular RNA (circRNA), a group of RNAs with the potential to serve as markers for diagnosis and prognosis, or as therapeutic targets, in various human diseases. However, a well-developed computational pipeline for circRNA QTL (circQTL) discovery is still lacking. RESULTS: We introduce an integrative method for circQTL mapping and implement it as an automated pipeline based on Nextflow, named cscQTL. The proposed method has two main advantages. First, cscQTL improves specificity by systematically combining the outputs of multiple circRNA calling algorithms to obtain highly confident circRNA annotations. Second, cscQTL improves sensitivity by accurately quantifying circRNA expression with the help of pseudo references. Compared to the single-method approach, cscQTL detects 20%-100% more circQTLs and recovers all circQTLs that are highly supported by the single-method approach. We apply cscQTL to a dataset of human T cells and discover genetic variants that control the expression of 55 circRNAs. By colocalization tests, we further identify circBACH2 and circYY1AP1 as potential candidates for immune disease regulation. AVAILABILITY AND IMPLEMENTATION: cscQTL is freely available at: https://github.com/datngu/cscQTL and https://doi.org/10.5281/zenodo.7851982.
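The consensus idea in this abstract, combining the outputs of several circRNA callers to keep only well-supported annotations, can be sketched as follows. This is a minimal illustration and not the cscQTL implementation: the abstract does not state the combination rule, so the `min_tools` threshold, the caller names, and the tuple representation of a back-splice junction are all assumptions.

```python
from collections import Counter


def consensus_circrnas(calls_by_tool, min_tools=2):
    """Keep junctions reported by at least `min_tools` callers (hypothetical rule)."""
    support = Counter()
    for junctions in calls_by_tool.values():
        # set() guards against one caller listing the same junction twice
        support.update(set(junctions))
    return {bsj for bsj, n in support.items() if n >= min_tools}


# Hypothetical calls; each junction is (chromosome, start, end, strand).
calls = {
    "caller_a": [("chr6", 100, 900, "-"), ("chr1", 50, 400, "+")],
    "caller_b": [("chr6", 100, 900, "-")],
    "caller_c": [("chr1", 50, 400, "+"), ("chr2", 10, 80, "+")],
}
confident = consensus_circrnas(calls, min_tools=2)
print(sorted(confident))
```

Raising `min_tools` trades sensitivity for specificity: with three callers, requiring all three to agree leaves no junction in this toy input.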
Subjects
Quantitative Trait Loci, Circular RNA, Humans, Genome-Wide Association Study/methods, RNA/genetics, T-Lymphocytes
ABSTRACT
Purpose of the study: Alzheimer's disease (AD) is the most common type of dementia and its prevalence is rapidly increasing worldwide. Early-onset Alzheimer's disease (EOAD) comprises patients with an age of onset earlier than 65 years and is known to be associated with genetic mutations. In this study, we report the first genetic analysis of Vietnamese patients with EOAD. Materials and methods: We analyzed targeted sequencing data obtained from a cohort of 51 Vietnamese EOAD patients to identify pathogenic variants in 29 well-characterized neurodegenerative genes. Results: We identified four missense mutations in the APP/PSEN1 genes in six individuals, accounting for 11.8% of all tested cases. Three of these mutations were previously reported as pathogenic, and one mutation in the APP gene was newly identified and might be specific to Vietnamese patients. Our study also found eight individuals homozygous for the APOE ε4 allele, the main genetic risk factor for late-onset AD. Conclusions: Our findings show that the mutation rate in APP/PSEN genes in Vietnamese EOAD patients is consistent with that in other ethnic groups. Although further functional studies are required to validate the pathogenicity of the newly identified mutation, our study demonstrates the necessity of genetic screening for EOAD patients as well as additional genetic data collection in the Vietnamese population.
Subjects
Alzheimer Disease, Humans, Aged, Alzheimer Disease/epidemiology, Alzheimer Disease/genetics, Presenilin-1/genetics, Amyloid beta-Protein Precursor/genetics, Genetic Testing, Mutation/genetics, Asian People/genetics, Age of Onset
ABSTRACT
OBJECTIVES: In this study, we developed a deep learning pipeline that detects large vessel occlusion (LVO) and predicts functional outcome based on computed tomography angiography (CTA) images to improve the management of LVO patients. METHODS: A series identifier selected 8650 LVO-protocol studies from 2015 to 2019 at Rhode Island Hospital with an identified thin axial series, which served as the data pool. Data were annotated into 2 classes: 1021 LVOs and 7629 normal. The Inception-V1 I3D architecture was applied for LVO detection. For outcome prediction, 323 patients undergoing thrombectomy were selected. A 3D convolutional neural network (CNN) was used for outcome prediction (30-day mRS) with CTA volumes and embedded pre-treatment variables as inputs. RESULTS: For the LVO-detection model, CTAs from 8650 patients (median age 68 years, interquartile range (IQR): 58-81; 3934 females) were analyzed. The cross-validated AUC for LVO vs. not was 0.74 (95% CI: 0.72-0.75). For the mRS classification model, CTAs from 323 patients (median age 75 years, IQR: 63-84; 164 females) were analyzed. The algorithm achieved a test AUC of 0.82 (95% CI: 0.79-0.84), a sensitivity of 89%, and a specificity of 66%. The two models were then integrated with hospital infrastructure, where CTA was collected in real time and processed by the model. If LVO was detected, interventionists were notified and provided with predicted clinical outcome information. CONCLUSION: 3D CNNs based on CTA were effective in detecting LVO and predicting the short-term prognosis of LVO mechanical thrombectomy. The end-to-end AI platform allows users to receive immediate prognosis prediction and facilitates clinical workflow.
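For readers less familiar with the reported metrics, the sketch below shows how sensitivity and specificity are computed from binary labels and predictions. The function name and the toy labels are illustrative assumptions; they are not data or code from the study.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded as 0/1."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1  # positive case correctly flagged
        elif truth == 1:
            fn += 1  # positive case missed
        elif pred == 0:
            tn += 1  # negative case correctly cleared
        else:
            fp += 1  # negative case wrongly flagged
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


# Toy example (made-up labels): 3 positives, 4 negatives.
sens, spec = sensitivity_specificity([1, 1, 1, 0, 0, 0, 0], [1, 1, 0, 0, 0, 1, 0])
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```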
Subjects
Brain Ischemia, Stroke, Female, Humans, Aged, Artificial Intelligence, Thrombectomy/adverse effects, Computed Tomography Angiography/methods, Middle Cerebral Artery, Retrospective Studies
ABSTRACT
BACKGROUND: Circular RNA (circRNA) is an emerging class of RNA molecules attracting researchers due to its potential to serve as markers for diagnosis and prognosis, or as therapeutic targets, in cancer, cardiovascular, and autoimmune diseases. Current methods for detecting circRNA from RNA sequencing (RNA-seq) focus mostly on improving the mapping quality of reads supporting the back-splicing junction (BSJ) of a circRNA to eliminate false positives (FPs). We show that mapping information alone often cannot predict whether a BSJ-supporting read is derived from a true circRNA, thus increasing the rate of FP circRNAs. RESULTS: We have developed Circall, a novel method for circRNA detection from RNA-seq. Circall controls FPs using a robust multidimensional local false discovery rate method based on the length and expression of circRNAs. It is computationally highly efficient because it uses a quasi-mapping algorithm for fast and accurate RNA read alignments. We applied Circall to two simulated datasets and three experimental datasets of human cell lines. The results show that Circall achieves high sensitivity and precision in the simulated data. In the experimental datasets, it performs well against current leading methods. Circall is also substantially faster than the other methods, particularly for large datasets. CONCLUSIONS: With its improved circRNA detection performance and shorter computation time, Circall facilitates the analysis of circRNAs in large numbers of samples. Circall is implemented in C++ and R, and is available at https://www.meb.ki.se/sites/biostatwiki/circall and https://github.com/datngu/Circall.
Subjects
Circular RNA, RNA, Humans, RNA/genetics, RNA Splicing, RNA-Seq, RNA Sequence Analysis
ABSTRACT
In this paper, we provide an in-depth assessment of the Bjøntegaard Delta. We construct a large data set of video compression performance comparisons using a diverse set of metrics including PSNR, VMAF, bitrate, and processing energies. These metrics are evaluated for visual data types such as classic perspective video, 360° video, point clouds, and screen content. As compression technology, we consider multiple hybrid video codecs as well as state-of-the-art neural-network-based compression methods. Using additional supporting points in between the standard points defined by parameters such as the quantization parameter, we assess the interpolation error of the Bjøntegaard-Delta (BD) calculus and its impact on the final BD value. From the analysis, we find that the BD calculus is most accurate in the standard application of rate-distortion comparisons, with mean errors below 0.5 percentage points. For other applications and special cases, e.g., VMAF quality, energy considerations, or inter-codec comparisons, the errors are higher (up to 5 percentage points), but can be halved by using a higher number of supporting points. Finally, we derive recommendations on how to use the BD calculus such that the validity of the resulting BD values is maximized. The main recommendations are as follows: First, relative curve differences should be plotted and analyzed. Second, the logarithmic domain should be used for saturating metrics such as SSIM and VMAF. Third, BD values below a certain threshold indicated by the subset error should not be used to draw recommendations. Fourth, using two supporting points is sufficient to obtain rough performance estimates.
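As background for the BD calculus assessed in this abstract, the sketch below computes a BD-rate value in its widely used classic form: each rate-distortion curve is interpolated by a cubic polynomial of log10(bitrate) over quality, and the mean difference over the common quality range is converted to a percentage. This is an assumption about the variant, since the abstract does not say which interpolation the authors recommend. The function names and toy numbers are illustrative, and the sketch is restricted to exactly four supporting points per curve.

```python
import math


def lagrange_eval(xs, ys, x):
    """Evaluate the polynomial through the points (xs, ys) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def mean_on_interval(xs, ys, lo, hi):
    """Mean of the cubic interpolant on [lo, hi]; Simpson's rule is exact for cubics."""
    mid = 0.5 * (lo + hi)
    f_lo, f_mid, f_hi = (lagrange_eval(xs, ys, x) for x in (lo, mid, hi))
    return (f_lo + 4.0 * f_mid + f_hi) / 6.0


def bd_rate(rates_ref, quality_ref, rates_test, quality_test):
    """Average bitrate difference in percent (negative means the test codec saves rate)."""
    if not all(len(v) == 4 for v in (rates_ref, quality_ref, rates_test, quality_test)):
        raise ValueError("this sketch expects exactly four supporting points per curve")
    log_ref = [math.log10(r) for r in rates_ref]
    log_test = [math.log10(r) for r in rates_test]
    # Compare only over the quality range covered by both curves.
    lo = max(min(quality_ref), min(quality_test))
    hi = min(max(quality_ref), max(quality_test))
    if hi <= lo:
        raise ValueError("quality ranges do not overlap")
    diff = mean_on_interval(quality_test, log_test, lo, hi) - mean_on_interval(quality_ref, log_ref, lo, hi)
    return (10.0 ** diff - 1.0) * 100.0


# Toy curves (made-up numbers): the test codec needs 10% less rate at equal PSNR.
psnr = [32.0, 35.0, 38.0, 41.0]
ref_rates = [1000.0, 2000.0, 4000.0, 8000.0]
test_rates = [0.9 * r for r in ref_rates]
print(round(bd_rate(ref_rates, psnr, test_rates, psnr), 3))
```

Because the interpolation runs in the log-rate domain, a constant relative rate saving at every quality level maps to the same BD-rate value, which makes the toy case easy to check by hand.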
ABSTRACT
Our brain employs mechanisms to adapt to changing visual conditions. In addition to natural changes in our physiology and in the environment, our brain is also capable of adapting to "unnatural" changes, such as inverted visual inputs generated by inverting prisms. In this study, we examined the brain's capability to adapt to hyperspaces. We generated stimuli with four spatial dimensions in virtual reality and tested the ability to distinguish between rigid and non-rigid motion. We found that observers are able to differentiate rigid and non-rigid motion of hypercubes (4D) with a performance comparable to that obtained using cubes (3D). Moreover, observers' performance improved when they were provided with a more immersive 3D experience and remained robust against increasing shape variations. At this juncture, we characterize our findings as "3 1/2 D perception" since, while we show the ability to extract and use 4D information, we do not yet have evidence of a complete phenomenal 4D experience.
ABSTRACT
Despite the overwhelming use of next-generation sequencing technologies, microarray-based genotyping combined with the imputation of untyped variants remains a cost-effective means to interrogate genetic variation across the human genome. This technology is widely used in genome-wide association studies (GWAS) at biobank scale and, more recently, in polygenic score (PGS) analysis to predict and stratify disease risk. Over the last decade, human genotyping arrays have undergone tremendous growth in both number and content, making a comprehensive evaluation of their performance increasingly important. Here, we performed a comprehensive performance assessment of 23 available human genotyping arrays in 6 ancestry groups using diverse public and in-house datasets. The analyses focus on estimating the performance of derived imputation (in terms of accuracy and coverage) and PGS (in terms of concordance with PGS estimated from whole-genome sequencing data) in three different traits and diseases. We found that the arrays with a higher number of SNPs are not necessarily the ones with higher imputation performance, but arrays that are well optimized for the targeted population could provide very good imputation performance. In addition, PGS estimated from imputed SNP array data is highly correlated with PGS estimated from whole-genome sequencing data in most cases. When optimal arrays are used, the correlations of PGS between the two types of data are higher than 0.97, but interestingly, arrays with high density can result in lower PGS performance. Our results suggest the importance of properly selecting a suitable genotyping array for PGS applications. Finally, we developed a web tool that provides interactive analyses of tag SNP contents and imputation performance based on population and genomic regions of interest. This study can serve as a practical guide for researchers designing genotyping array-based studies. The tool is available at: https://genome.vinbigdata.org/tools/saa/.
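The concordance measure mentioned in this abstract, a correlation between per-individual PGS from imputed array data and from whole-genome sequencing, can be illustrated with a short sketch. The abstract does not specify the correlation type, so the use of Pearson correlation here is an assumption, and the scores are made-up numbers rather than study data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-individual scores from the two data types.
pgs_wgs = [0.12, -0.40, 0.95, 0.33, -1.10]
pgs_array = [0.10, -0.35, 0.90, 0.40, -1.05]
print(round(pearson(pgs_wgs, pgs_array), 3))
```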
Subjects
Human Genome, Genome-Wide Association Study, Humans, Genotype, Single Nucleotide Polymorphism, High-Throughput Nucleotide Sequencing/methods
ABSTRACT
Targeted therapy with tyrosine kinase inhibitors (TKI) provides survival benefits to a majority of patients with non-small cell lung cancer (NSCLC). However, resistance to TKI almost always develops after treatment. Although genetic and epigenetic alterations have each been shown to drive resistance to TKI in cell line models, clinical evidence for their contribution to the acquisition of resistance remains limited. Here, we employed liquid biopsy for simultaneous analysis of genetic and epigenetic changes in 122 Vietnamese NSCLC patients undergoing TKI therapy and displaying acquired resistance. We detected multiple profiles of resistance mutations in 51 patients (41.8%). Of those, cases with genetic alterations in EGFR, particularly EGFR amplification (n = 6), showed pronounced genome instability and genome-wide hypomethylation. Interestingly, the level of hypomethylation was associated with the duration of response to TKI treatment. We also detected hypermethylation in regulatory regions of Homeobox genes, which are known to be involved in tumor differentiation. In contrast, such changes were not observed in cases with MET (n = 4) and HER2 (n = 4) amplification. Thus, our study shows that liquid biopsy could provide important insights into the heterogeneity of TKI resistance mechanisms in NSCLC patients, providing essential information for the prediction of resistance and the selection of subsequent treatment.