1.
Genome Res ; 33(5): 750-762, 2023 May.
Article in English | MEDLINE | ID: mdl-37308294

ABSTRACT

For most biological and medical applications of single-cell transcriptomics, an integrative study of multiple heterogeneous single-cell RNA sequencing (scRNA-seq) data sets is crucial. However, present approaches are unable to integrate diverse data sets from various biological conditions effectively because of the confounding effects of biological and technical differences. We introduce single-cell integration (scInt), an integration method based on accurate, robust cell-cell similarity construction and unified contrastive biological variation learning from multiple scRNA-seq data sets. scInt provides a flexible and effective approach to transfer knowledge from the already integrated reference to the query. We show that scInt outperforms 10 other cutting-edge approaches using both simulated and real data sets, particularly in the case of complex experimental designs. Application of scInt to mouse developing tracheal epithelial data shows its ability to integrate development trajectories from different developmental stages. Furthermore, scInt successfully identifies functionally distinct condition-specific cell subpopulations in single-cell heterogeneous samples from a variety of biological conditions.


Subject(s)
Single-Cell Analysis , Single-Cell Gene Expression Analysis , Animals , Mice , Single-Cell Analysis/methods , Gene Expression Profiling/methods , Exome Sequencing , Sequence Analysis, RNA/methods
2.
J Virol ; 96(18): e0073922, 2022 09 28.
Article in English | MEDLINE | ID: mdl-36094314

ABSTRACT

Epstein-Barr virus (EBV) persists in human cells as episomes. EBV episomes are chromatinized and their 3D conformation varies greatly in cells expressing different latency genes. We used HiChIP, an assay which combines genome-wide chromatin conformation capture followed by deep sequencing (Hi-C) and chromatin immunoprecipitation (ChIP), to interrogate the EBV episome 3D conformation in different cancer cell lines. In an EBV-transformed lymphoblastoid cell line (LCL) GM12878 expressing type III EBV latency genes, abundant genomic interactions were identified by H3K27ac HiChIP. A strong enhancer was located near the BILF2 gene and looped to multiple genes around BALFs loci. Perturbation of the BILF2 enhancer by CRISPR interference (CRISPRi) and CRISPR activation (CRISPRa) altered the expression of BILF2 enhancer-linked genes, including BARF0 and BALF2, suggesting that this enhancer regulates the expression of linked genes. H3K27ac ChIP followed by deep sequencing (ChIP-seq) identified several strong EBV enhancers in T/NK (natural killer) lymphoma cells that express type II EBV latency genes. Extensive intragenomic interactions were also found which linked enhancers to target genes. A strong enhancer at BILF2 also looped to the BALF loci. CRISPRi also validated the functional connection between BILF2 enhancer and BARF1 gene. In contrast, H3K27ac HiChIP found significantly fewer intragenomic interactions in type I EBV latency gene-expressing primary effusion lymphoma (PEL) cell lines. These data provided new insight into the regulation of EBV latency gene expression in different EBV-associated tumors. IMPORTANCE EBV is the first human DNA tumor virus identified, discovered over 50 years ago. EBV causes ~200,000 cases of various cancers each year. EBV-encoded oncogenes, noncoding RNAs, and microRNAs (miRNAs) can promote cell growth and survival and suppress senescence. Regulation of EBV gene expression is very complex. 
The viral C promoter regulates the expression of all EBV nuclear antigens (EBNAs), some of which are very far away from the C promoter. Another way by which the virus activates remote gene expression is through DNA looping. In this study, we describe the viral genome looping patterns in various EBV-associated cancer cell lines and identify important EBV enhancers in these cells. This study also identified novel opportunities to perturb and eventually control EBV gene expression in these cancer cells.


Subject(s)
Epstein-Barr Virus Infections , Herpesvirus 4, Human , Plasmids , Virus Latency , Cell Line, Tumor , Enhancer Elements, Genetic/genetics , Epstein-Barr Virus Infections/genetics , Epstein-Barr Virus Infections/virology , Epstein-Barr Virus Nuclear Antigens/genetics , Herpesvirus 4, Human/genetics , Humans , MicroRNAs/metabolism , Neoplasms/virology , Plasmids/chemistry , Plasmids/genetics , Plasmids/metabolism , Viral Proteins/genetics , Virus Latency/genetics
3.
PLoS Comput Biol ; 18(1): e1009770, 2022 Jan.
Article in English | MEDLINE | ID: mdl-34986151

ABSTRACT

[This corrects the article DOI: 10.1371/journal.pcbi.1009118.].

4.
Mol Cancer ; 21(1): 74, 2022 03 12.
Article in English | MEDLINE | ID: mdl-35279145

ABSTRACT

BACKGROUND: Epithelial-to-mesenchymal transition (EMT) is a process linked to metastasis and drug resistance with non-coding RNAs (ncRNAs) playing pivotal roles. We previously showed that miR-100 and miR-125b, embedded within the third intron of the ncRNA host gene MIR100HG, confer resistance to cetuximab, an anti-epidermal growth factor receptor (EGFR) monoclonal antibody, in colorectal cancer (CRC). However, whether the MIR100HG transcript itself has a role in cetuximab resistance or EMT is unknown. METHODS: The correlation between MIR100HG and EMT was analyzed by curating public CRC data repositories. The biological roles of MIR100HG in EMT, metastasis and cetuximab resistance in CRC were determined both in vitro and in vivo. The expression patterns of MIR100HG, hnRNPA2B1 and TCF7L2 in CRC specimens from patients who progressed on cetuximab and patients with metastatic disease were analyzed by RNAscope and immunohistochemical staining. RESULTS: The expression of MIR100HG was strongly correlated with EMT markers and acted as a positive regulator of EMT. MIR100HG sustained cetuximab resistance and facilitated invasion and metastasis in CRC cells both in vitro and in vivo. hnRNPA2B1 was identified as a binding partner of MIR100HG. Mechanistically, MIR100HG maintained mRNA stability of TCF7L2, a major transcriptional coactivator of the Wnt/β-catenin signaling, by interacting with hnRNPA2B1. hnRNPA2B1 recognized the N6-methyladenosine (m6A) site of TCF7L2 mRNA in the presence of MIR100HG. TCF7L2, in turn, activated MIR100HG transcription, forming a feed forward regulatory loop. The MIR100HG/hnRNPA2B1/TCF7L2 axis was augmented in specimens from CRC patients who either developed local or distant metastasis or had disease progression that was associated with cetuximab resistance. CONCLUSIONS: MIR100HG and hnRNPA2B1 interact to control the transcriptional activity of Wnt signaling in CRC via regulation of TCF7L2 mRNA stability. 
Our findings identified MIR100HG as a potent EMT inducer in CRC that may contribute to cetuximab resistance and metastasis by activation of a MIR100HG/hnRNPA2B1/TCF7L2 feedback loop.


Subject(s)
Colorectal Neoplasms , Heterogeneous-Nuclear Ribonucleoprotein Group A-B , MicroRNAs , RNA, Long Noncoding , Cell Line, Tumor , Cell Movement/genetics , Cetuximab/genetics , Cetuximab/metabolism , Colorectal Neoplasms/pathology , Epithelial-Mesenchymal Transition/genetics , Gene Expression Regulation, Neoplastic , Heterogeneous-Nuclear Ribonucleoprotein Group A-B/genetics , Humans , MicroRNAs/genetics , MicroRNAs/metabolism , RNA, Long Noncoding/genetics , RNA, Long Noncoding/metabolism , RNA, Messenger/genetics , Transcription Factor 7-Like 2 Protein/genetics , Transcription Factor 7-Like 2 Protein/metabolism , Wnt Signaling Pathway/genetics
5.
PLoS Comput Biol ; 17(6): e1009118, 2021 06.
Article in English | MEDLINE | ID: mdl-34138847

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) technologies measure gene expression at single-cell resolution and provide a tool for exploring cell heterogeneity and cell types. Because of the low number of mRNA copies extracted per cell, scRNA-seq data exhibit a large number of dropouts, which hinders downstream analysis of the scRNA-seq data. We propose a statistical method, SDImpute (Single-cell RNA-seq Dropout Imputation), to implement block imputation for dropout events in scRNA-seq data. SDImpute automatically identifies the dropout events based on the gene expression levels and the variations of gene expression across similar cells and similar genes, and it implements block imputation for dropouts by utilizing gene expression unaffected by dropouts from similar cells. In the experiments, the results on the simulated datasets and real datasets suggest that SDImpute is an effective tool to recover the data and preserve the heterogeneity of gene expression across cells. Compared with the state-of-the-art imputation methods, SDImpute improves the accuracy of the downstream analysis including clustering, visualization, and differential expression analysis.


Subject(s)
RNA-Seq/statistics & numerical data , Single-Cell Analysis/statistics & numerical data , Software , Animals , Cluster Analysis , Computational Biology , Computer Simulation , Data Interpretation, Statistical , Data Visualization , Databases, Nucleic Acid/statistics & numerical data , Gene Expression Profiling/statistics & numerical data , Genetic Techniques/statistics & numerical data , Humans , RNA, Messenger/genetics , RNA, Messenger/isolation & purification
6.
Genomics ; 113(2): 456-462, 2021 03.
Article in English | MEDLINE | ID: mdl-33383142

ABSTRACT

The T-cell receptor (TCR) is crucial in T cell-mediated virus clearance. To date, TCR bias has been observed in various diseases. However, studies on the TCR repertoire of COVID-19 patients are lacking. Here, we used single-cell V(D)J sequencing to conduct comparative analyses of TCR repertoire between 12 COVID-19 patients and 6 healthy controls, as well as other virus-infected samples. We observed distinct T cell clonal expansion in COVID-19. Further analysis of VJ gene combination revealed 6 VJ pairs significantly increased, while 139 pairs significantly decreased in COVID-19 patients. When considering the VJ combination of α and β chains at the same time, the combination with the highest frequency in COVID-19 was TRAV12-2-J27-TRBV7-9-J2-3. Besides, preferential usage of V and J gene segments was also observed in samples infected by different viruses. Our study provides novel insights on TCR in COVID-19, which contribute to our understanding of the immune response induced by SARS-CoV-2.


Subject(s)
COVID-19/genetics , High-Throughput Nucleotide Sequencing , Receptors, Antigen, T-Cell/genetics , SARS-CoV-2 , Single-Cell Analysis , COVID-19/immunology , Female , Humans , Male , T-Lymphocytes/immunology
7.
BMC Bioinformatics ; 21(Suppl 16): 540, 2020 Dec 16.
Article in English | MEDLINE | ID: mdl-33323107

ABSTRACT

BACKGROUND: Single-cell RNA sequencing can be used to determine cell types, which is beneficial to the medical field, especially in the many recent studies on COVID-19. Generally, single-cell RNA data analysis pipelines include data normalization, dimensionality reduction, and unsupervised clustering. However, different normalization and dimensionality reduction methods significantly affect the results of clustering and cell type enrichment analysis. The choice of preprocessing path is crucial in scRNA-Seq data mining, because a proper preprocessing path can extract more important information from complex raw data and lead to more accurate clustering results. RESULTS: We proposed a method called NDRindex (Normalization and Dimensionality Reduction index) to evaluate the data quality of the outputs of normalization and dimensionality reduction methods. The method includes a function to calculate the degree of data aggregation, which is the key to measuring data quality before clustering. For the five single-cell RNA sequence datasets we tested, the results demonstrated the efficacy and accuracy of our index. CONCLUSIONS: The method we introduce fills a gap in the selection of preprocessing paths, and the results demonstrate its effectiveness and accuracy. Our research provides useful indicators for the evaluation of RNA-Seq data.


Subject(s)
Computational Biology/methods , Databases, Nucleic Acid/classification , Databases, Nucleic Acid/standards , RNA-Seq/methods , COVID-19/virology , Cluster Analysis , Humans , SARS-CoV-2/genetics
8.
BMC Genomics ; 21(1): 149, 2020 Feb 11.
Article in English | MEDLINE | ID: mdl-32046631

ABSTRACT

BACKGROUND: With the rapid development of high-throughput sequencing technologies, many datasets on the same biological subject are generated. A meta-analysis is an approach that combines results from different studies on the same topic. The random-effects model in a meta-analysis enables the modeling of differences between studies by incorporating the between-study variance. RESULTS: This paper proposes a moments estimator of the between-study variance that represents the across-study variation. A new random-effects method (DSLD2), which involves two-step estimation starting with the DSL estimate and the [Formula: see text] in the second step, is presented. The DSLD2 method is compared with 6 other meta-analysis methods based on effect sizes across 8 aspects under three hypothesis settings. The results show that DSLD2 is a suitable method for identifying differentially expressed genes under the first hypothesis. The DSLD2 method is also applied to Alzheimer's microarray datasets. The differentially expressed genes detected by the DSLD2 method are significantly enriched in neurological diseases. CONCLUSIONS: The results from both simulations and an application show that DSLD2 is a suitable method for detecting differentially expressed genes under the first hypothesis.
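
As a rough illustration of the random-effects idea described in this abstract, the sketch below computes a moments estimate of the between-study variance and the resulting pooled effect. It assumes that "DSL" refers to the classic DerSimonian-Laird estimator; the second step of DSLD2 is elided in the abstract and is not reproduced here. The function name and the input numbers are hypothetical.

```python
import math

def dersimonian_laird(effects, variances):
    """Pool per-study effect sizes with a DerSimonian-Laird random-effects model.

    effects   -- per-study effect sizes (hypothetical inputs)
    variances -- per-study within-study variances
    Returns (pooled_effect, standard_error, tau2).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]                 # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Moments estimate of the between-study variance, truncated at zero
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]   # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2

if __name__ == "__main__":
    print(dersimonian_laird([0.0, 1.0], [0.01, 0.01]))
```

When the studies agree exactly, the estimated between-study variance is zero and the result reduces to the fixed-effect estimate.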


Subject(s)
Gene Expression Profiling/methods , Alzheimer Disease/genetics , Data Interpretation, Statistical , Humans , Likelihood Functions , Meta-Analysis as Topic , Models, Statistical , Monte Carlo Method , ROC Curve
9.
BMC Bioinformatics ; 20(Suppl 18): 573, 2019 Nov 25.
Article in English | MEDLINE | ID: mdl-31760933

ABSTRACT

BACKGROUND: When conducting multiple sequence alignment (MSA), it is essential to use the substitution score of pairwise alignment. To compute adaptive scores for alignment, researchers usually use a hidden Markov model (HMM) or probabilistic consistency methods such as the partition function. Recent studies show that optimizing the parameters of the hidden Markov model, as well as integrating the hidden Markov model with the partition function, can raise the accuracy of alignment. The combination of the partition function and an optimized HMM, which could further improve the alignment's accuracy, was, however, ignored by these studies. RESULTS: A novel algorithm for MSA called ProbPFP is presented in this paper. It integrates an HMM optimized by particle swarm optimization (PSO) with the partition function. The PSO algorithm was applied to optimize the HMM's parameters. After that, the posterior probability obtained by the HMM was combined with the one obtained by the partition function to calculate an integrated substitution score for alignment. In order to evaluate the effectiveness of ProbPFP, we compared it with 13 outstanding or classic MSA methods. The results demonstrate that the alignments obtained by ProbPFP got the maximum mean TC scores and mean SP scores on two benchmark datasets, SABmark and OXBench, and the second highest mean TC scores and mean SP scores on the benchmark dataset BAliBASE. ProbPFP is also compared with 4 other outstanding methods, by reconstructing the phylogenetic trees for six protein families extracted from the database TreeFam, based on the alignments obtained by these 5 methods. The result indicates that the reference trees are closer to the phylogenetic trees reconstructed from the alignments obtained by ProbPFP than to those from the other methods. CONCLUSIONS: We propose a new multiple sequence alignment method combining optimized HMM and partition function in this paper. 
The results show that this method can substantially improve alignment accuracy.


Subject(s)
Computational Biology/methods , Proteins/genetics , Sequence Alignment/methods , Algorithms , Animals , Humans , Markov Chains , Multigene Family , Phylogeny , Proteins/chemistry , Software
10.
BMC Bioinformatics ; 20(Suppl 25): 691, 2019 Dec 24.
Article in English | MEDLINE | ID: mdl-31874619

ABSTRACT

BACKGROUND: The association between the BIN1 rs744373 variant and Alzheimer's disease (AD) has been identified by genome-wide association studies (GWASs) as well as candidate gene studies in Caucasian populations. But in East Asian populations, both positive and negative results have been reported by association studies. Considering the smaller sample sizes of the studies in East Asians, we believe that the results did not have enough statistical power. RESULTS: We conducted a meta-analysis with 71,168 samples (22,395 AD cases and 48,773 controls, from 37 studies of 19 articles). Based on the additive model, we observed significant genetic heterogeneities in pooled populations as well as Caucasians and East Asians. We identified a significant association between rs744373 polymorphism and AD in pooled populations (P = 5 × 10⁻⁷, odds ratio (OR) = 1.12, and 95% confidence interval (CI) 1.07-1.17) and in Caucasian populations (P = 3.38 × 10⁻⁸, OR = 1.16, 95% CI 1.10-1.22). But in the East Asian populations, the association was not identified (P = 0.393, OR = 1.057, and 95% CI 0.95-1.15). Besides, the regression analysis suggested no significant publication bias. The results for sensitivity analysis as well as meta-analysis under the dominant model and recessive model remained consistent, which demonstrated the reliability of our finding. CONCLUSIONS: The large-scale meta-analysis highlighted the significant association between rs744373 polymorphism and AD risk in Caucasian populations but not in the East Asian populations.
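
The relationship between an odds ratio, its 95% confidence interval and the reported P value can be sketched as follows. This is only an illustration under a normal approximation on the log-odds scale; the function name is hypothetical, and because the abstract reports rounded values, the recovered P value will only approximate the published one.

```python
import math

def p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate the SE, z statistic and two-sided P value of a log odds ratio.

    Assumes the confidence interval is symmetric on the log scale
    (a Wald-type 95% interval by default).
    """
    log_or = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = log_or / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p

if __name__ == "__main__":
    # Pooled-population values quoted in the abstract above
    print(p_from_or_ci(1.12, 1.07, 1.17))
```

Applied to the pooled estimate quoted in the abstract, the sketch yields a P value of the same order of magnitude as the one reported.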


Subject(s)
Adaptor Proteins, Signal Transducing/genetics , Alzheimer Disease/genetics , Nuclear Proteins/genetics , Tumor Suppressor Proteins/genetics , Asian People/genetics , Genetic Heterogeneity , Genome-Wide Association Study , Humans , Polymorphism, Genetic , Reproducibility of Results , White People/genetics
11.
Bioinformatics ; 34(15): 2657-2658, 2018 08 01.
Article in English | MEDLINE | ID: mdl-29566144

ABSTRACT

Motivation: With the development of biotechnology, DNA methylation data have shown exponential growth. Epigenome-wide association studies (EWAS) provide a systematic approach to uncovering epigenetic variants underlying common diseases/phenotypes. But EWAS software has lagged behind that for genome-wide association studies (GWAS). To meet the requirements of users, we developed a convenient and useful software package, EWAS2.0. Results: EWAS2.0 can analyze EWAS data and identify the association between epigenetic variations and disease/phenotype. On the basis of EWAS1.0, we have added more distinctive features. EWAS2.0 software was developed based on our 'population epigenetic framework' and can perform: (i) epigenome-wide single marker association study; (ii) epigenome-wide methylation haplotype (meplotype) association study and (iii) epigenome-wide association meta-analysis. Users can use EWAS2.0 to execute chi-square test, t-test, linear regression analysis, logistic regression analysis, identify the association between epi-alleles, identify the methylation disequilibrium (MD) blocks, calculate the MD coefficient, the frequency of meplotype and Pearson's correlation coefficients and carry out meta-analysis and so on. Finally, we expect EWAS2.0 to become popular software and be widely used in epigenome-wide association studies in the future. Availability and implementation: The EWAS software is freely available at http://www.ewas.org.cn or http://www.bioapp.org/ewas.


Subject(s)
DNA Methylation , Epigenomics/methods , Genome-Wide Association Study/methods , Software , Epigenesis, Genetic , Phenotype
12.
Entropy (Basel) ; 21(3), 2019 Mar 04.
Article in English | MEDLINE | ID: mdl-33266957

ABSTRACT

The advancement of high-throughput RNA sequencing has enabled profound discoveries in biology, ranging from the study of differentially expressed genes to the identification of different genomic phenotypes across multiple conditions. However, the lack of biological replicates and low-expression data are still obstacles to measuring differentially expressed genes effectively. We present an algorithm based on a differential entropy-like function (DEF) to test for differential expression across time-course data or multi-sample data with few biological replicates. Compared with limma, edgeR, DESeq2, and baySeq, DEF maintains equivalent or better performance on the real data of two conditions. Moreover, DEF is well suited for predicting the genes that show the greatest differences across multiple conditions such as time-course data and identifies various biologically relevant genes.

13.
BMC Med Genet ; 19(1): 38, 2018 03 07.
Article in English | MEDLINE | ID: mdl-29514658

ABSTRACT

BACKGROUND: Large-scale association studies have found a significant association between type 2 diabetes mellitus (T2DM) and transcription factor 7-like 2 (TCF7L2) polymorphism rs7903146. However, the quality of data varies greatly, as the studies report inconsistent results in different populations. Hence, we performed this meta-analysis to give a more convincing result. METHODS: The articles, published from January 1st, 2000 to April 1st, 2017, were identified by searching PubMed and Google Scholar. A total of 56628 participants (34232 cases and 22396 controls) were included in the meta-analysis. A total of 28 studies were divided into 4 subgroups: Caucasian (10 studies), East Asian (5 studies), South Asian (5 studies) and Others (8 studies). All data analyses were performed with the R package meta. RESULTS: A significant association was observed by using the dominant model (OR = 1.41, CI = 1.36-1.47, p < 0.0001), recessive model (OR = 1.58, CI = 1.48-1.69, p < 0.0001), additive model (CT vs CC) (OR = 1.34, CI = 1.28-1.39, p < 0.0001), additive model (TT vs CC) (OR = 1.81, CI = 1.69-1.94, p < 0.0001) and allele model (OR = 1.35, CI = 1.31-1.39, p < 0.0001). CONCLUSION: The meta-analysis suggested that rs7903146 was significantly associated with T2DM in Caucasian, East Asian, South Asian and other ethnicities.
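
The genetic models named in this abstract differ only in how genotype counts are collapsed into a 2×2 table before the odds ratio is computed. The sketch below illustrates this for a single study, assuming T is the risk allele as the model labels suggest. The genotype counts are invented for illustration, the function names are hypothetical, and this is not the authors' analysis, which used the R package meta.

```python
def odds_ratio(exp_cases, unexp_cases, exp_controls, unexp_controls):
    """Odds ratio of a 2x2 table (exposed vs unexposed, cases vs controls)."""
    return (exp_cases * unexp_controls) / (unexp_cases * exp_controls)

def genetic_model_ors(cases, controls):
    """Per-study odds ratios under common genetic models.

    cases, controls -- dicts of genotype counts keyed by 'CC', 'CT', 'TT'.
    """
    return {
        # Dominant: carriers of at least one T allele vs CC
        "dominant": odds_ratio(cases["CT"] + cases["TT"], cases["CC"],
                               controls["CT"] + controls["TT"], controls["CC"]),
        # Recessive: TT vs carriers of at least one C allele
        "recessive": odds_ratio(cases["TT"], cases["CC"] + cases["CT"],
                                controls["TT"], controls["CC"] + controls["CT"]),
        # Genotype contrasts (labeled "additive" in the abstract)
        "CT_vs_CC": odds_ratio(cases["CT"], cases["CC"],
                               controls["CT"], controls["CC"]),
        "TT_vs_CC": odds_ratio(cases["TT"], cases["CC"],
                               controls["TT"], controls["CC"]),
        # Allele model: count T and C alleles (two per individual)
        "allele": odds_ratio(2 * cases["TT"] + cases["CT"],
                             2 * cases["CC"] + cases["CT"],
                             2 * controls["TT"] + controls["CT"],
                             2 * controls["CC"] + controls["CT"]),
    }

if __name__ == "__main__":
    # Hypothetical genotype counts, not taken from any study in the meta-analysis
    demo_cases = {"CC": 100, "CT": 100, "TT": 50}
    demo_controls = {"CC": 150, "CT": 80, "TT": 20}
    print(genetic_model_ors(demo_cases, demo_controls))
```

In a meta-analysis, per-study odds ratios of this kind would then be pooled on the log scale across studies.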


Subject(s)
Diabetes Mellitus, Type 2/diagnosis , Diabetes Mellitus, Type 2/genetics , Polymorphism, Single Nucleotide , Transcription Factor 7-Like 2 Protein/genetics , Alleles , Asian People/genetics , Databases, Factual , Genetic Predisposition to Disease , Humans , Publication Bias , White People/genetics
14.
BMC Med Genet ; 19(Suppl 1): 215, 2018 12 31.
Article in English | MEDLINE | ID: mdl-30598082

ABSTRACT

BACKGROUND: Alzheimer's disease (AD) and Parkinson's disease (PD) are the two most common neurodegenerative diseases in the elderly. Recent studies found that α-synuclein has a key role in AD. Although many clinical and pathological features are shared between AD and PD, the genetic association between them remains unclear, especially whether α-synuclein in PD genetically alters AD risk. RESULTS: We did not obtain any significant result (OR = 0.918, 95% CI: 0.782-1.076, P = 0.291) in the Mendelian randomization (MR) analysis between PD and AD risk. In the MR analysis between α-synuclein in PD and AD risk, we extracted only rs356182 as the instrumental variable (IV) through a strict screening process. The result indicated a significant association based on the inverse-variance weighted (IVW) method (OR = 0.638, 95% CI: 0.485-0.838, P = 1.20E-03). In order to examine the robustness of the IVW method, we used three other complementary analytical methods and also obtained consistent results. CONCLUSION: The overall PD genetic risk factors did not predict AD risk, but the α-synuclein susceptibility genetic variants in PD reduce the AD risk. We believe that our findings may help to understand the association between them, which may be useful for future genetic studies for both diseases.
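
A minimal sketch of the inverse-variance weighted (IVW) estimator used in summary-data Mendelian randomization is given below. It assumes a fixed-effect IVW estimate built from per-variant exposure and outcome effects, with the uncertainty in the exposure effects ignored (a common first-order simplification). The function name and the numbers are hypothetical and are not taken from the study.

```python
import math

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW causal estimate from summary statistics.

    beta_exposure -- per-variant effects on the exposure
    beta_outcome  -- per-variant effects on the outcome (e.g. log odds ratios)
    se_outcome    -- standard errors of the outcome effects
    Returns (estimate, standard_error). With a single variant this reduces
    to the Wald ratio beta_outcome / beta_exposure.
    """
    num = sum(bx * by / s ** 2
              for bx, by, s in zip(beta_exposure, beta_outcome, se_outcome))
    den = sum(bx ** 2 / s ** 2
              for bx, s in zip(beta_exposure, se_outcome))
    return num / den, math.sqrt(1.0 / den)

if __name__ == "__main__":
    # Hypothetical single-instrument example (illustrative numbers only)
    est, se = ivw_estimate([0.25], [-0.05], [0.02])
    print(est, se, math.exp(est))  # exp(est) is the causal odds ratio
```

With only one instrument, as in the α-synuclein analysis described above, the IVW estimate and the Wald ratio coincide.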


Subject(s)
Alzheimer Disease/genetics , Nerve Tissue Proteins/genetics , Parkinson Disease/genetics , Polymorphism, Single Nucleotide , alpha-Synuclein/genetics , Aged , Alleles , Alzheimer Disease/diagnosis , Alzheimer Disease/physiopathology , Female , Gene Expression , Gene Frequency , Genome-Wide Association Study , Humans , Male , Mendelian Randomization Analysis , Odds Ratio , Parkinson Disease/diagnosis , Parkinson Disease/physiopathology , Risk Factors
15.
BMC Bioinformatics ; 18(1): 270, 2017 May 23.
Article in English | MEDLINE | ID: mdl-28535748

ABSTRACT

BACKGROUND: The development of single-cell RNA sequencing has enabled profound discoveries in biology, ranging from the dissection of the composition of complex tissues to the identification of novel cell types and dynamics in some specialized cellular environments. However, the large-scale generation of single-cell RNA-seq (scRNA-seq) data collected at multiple time points remains a challenge to the effective measurement of gene expression patterns in transcriptome analysis. RESULTS: We present an algorithm based on the Dynamic Time Warping score (DTWscore) combined with time-series data, that enables the detection of gene expression changes across scRNA-seq samples and recovery of potential cell types from complex mixtures of multiple cell types. CONCLUSIONS: The DTWscore successfully classifies cells of different types with the most highly variable genes from time-series scRNA-seq data. The study was confined to methods that are implemented and available within the R framework. Sample datasets and R packages are available at https://github.com/xiaoxiaoxier/DTWscore.
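
For readers unfamiliar with dynamic time warping, the sketch below shows the classic dynamic-programming DTW distance between two expression time series. It illustrates only the underlying alignment idea; how DTWscore turns such distances into a gene-level score is not described in the abstract and is not reproduced here. The function name and the example series are hypothetical.

```python
def dtw_distance(x, y):
    """Classic dynamic time warping distance between two numeric sequences.

    Uses absolute difference as the local cost and allows match,
    insertion and deletion steps.
    """
    n, m = len(x), len(y)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning x[:i] with y[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(x[i - 1] - y[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # step in x only
                                     cost[i][j - 1],      # step in y only
                                     cost[i - 1][j - 1])  # step in both
    return cost[n][m]

if __name__ == "__main__":
    # Two hypothetical expression profiles with the same shape but different timing
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

Because warping absorbs differences in timing, two profiles with the same shape but shifted or stretched time points receive a small distance, which is the motivation for using DTW on time-series expression data.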


Subject(s)
Algorithms , Gene Expression Profiling/methods , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Statistics as Topic , Cluster Analysis , Computer Simulation , Gene Expression Regulation , Humans , Muscle, Skeletal/cytology , Myoblasts/metabolism , RNA/genetics , RNA/metabolism , ROC Curve , Time Factors
16.
BMC Genomics ; 18(Suppl 1): 1043, 2017 01 25.
Article in English | MEDLINE | ID: mdl-28198675

ABSTRACT

BACKGROUND: Identifying the genes associated with human diseases is crucial for disease diagnosis and drug design. Computational approaches, especially network-based approaches, have recently been developed to identify disease-related genes effectively from existing biomedical networks. Meanwhile, advances in biotechnology enable researchers to produce multi-omics data, enriching our understanding of human diseases and revealing the complex relationships between genes and diseases. However, none of the existing computational approaches is able to integrate the huge amount of omics data into a weighted integrated network and utilize it to enhance disease-related gene discovery. RESULTS: We propose a new network-based disease gene prediction method called SLN-SRW (Simplified Laplacian Normalization-Supervised Random Walk) to generate and model the edge weights of a new biomedical network that integrates biomedical data from heterogeneous sources, thereby enhancing disease-related gene discovery. CONCLUSIONS: The experimental results show that SLN-SRW significantly improves the performance of disease gene prediction on both the real and the synthetic data sets.


Subject(s)
Computational Biology/methods , Gene Regulatory Networks , Genetic Association Studies , Genetic Predisposition to Disease , Algorithms , Databases, Genetic , Gene Ontology , Humans , ROC Curve , Reproducibility of Results , Workflow
17.
Neurol Sci ; 38(7): 1255-1262, 2017 Jul.
Article in English | MEDLINE | ID: mdl-28429084

ABSTRACT

In 2006, a candidate gene study reported the death-associated protein kinase 1 (DAPK1) rs4878104 variant to be significantly associated with Alzheimer's disease (AD) risk. However, subsequent studies showed inconsistent association results. Here, we conducted an updated analysis to investigate the potential association between rs4878104 and AD using a total of 60,751 samples (20,161 AD cases and 40,590 controls). In the pooled population, the results based on the allele and genotype genetic models show that the rs4878104 variant is not significantly associated with AD risk. Interestingly, in subgroup analysis we identified the rs4878104 variant to be significantly associated with AD risk in American and Chinese populations. Using multiple large-scale expression quantitative trait loci datasets, we further found that the rs4878104 T allele was significantly associated with increased DAPK1 expression in the European population. These findings suggest that rs4878104 may contribute to AD susceptibility by modifying DAPK1 expression in the European population.


Subject(s)
Alzheimer Disease/genetics , Death-Associated Protein Kinases/genetics , Genetic Predisposition to Disease/genetics , Polymorphism, Single Nucleotide/genetics , Alleles , Apoptosis Regulatory Proteins/genetics , Asian People/genetics , Female , Humans , Male , Risk , White People/genetics
18.
Nucleic Acids Res ; 43(Database issue): D193-6, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25399422

ABSTRACT

Long non-coding RNAs (lncRNAs) have emerged as critical regulators of genes at epigenetic, transcriptional and post-transcriptional levels, yet what genes are regulated by a specific lncRNA remains to be characterized. To assess the effects of the lncRNA on gene expression, an increasing number of researchers profiled the genome-wide or individual gene expression level change after knocking down or overexpressing the lncRNA. Herein, we describe a curated database named LncRNA2Target, which stores lncRNA-to-target genes and is publicly accessible at http://www.lncrna2target.org. A gene was considered as a target of a lncRNA if it is differentially expressed after the lncRNA knockdown or overexpression. LncRNA2Target provides a web interface through which its users can search for the targets of a particular lncRNA or for the lncRNAs that target a particular gene. Both search types are performed either by browsing a provided catalog of lncRNA names or by inserting lncRNA/target gene IDs/names in a search box.


Subject(s)
Databases, Nucleic Acid , RNA, Long Noncoding/metabolism , Gene Expression Profiling , Gene Expression Regulation , Gene Knockdown Techniques , Internet , RNA, Long Noncoding/antagonists & inhibitors , RNA, Long Noncoding/genetics