1.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-38975895

ABSTRACT

Spatial transcriptomics provides valuable insights into gene expression within the native tissue context, effectively merging molecular data with spatial information to uncover intricate cellular relationships and tissue organizations. In this context, deciphering cellular spatial domains becomes essential for revealing complex cellular dynamics and tissue structures. However, current methods encounter challenges in seamlessly integrating gene expression data with spatial information, resulting in less informative representations of spots and suboptimal accuracy in spatial domain identification. We introduce stCluster, a novel method that integrates graph contrastive learning with multi-task learning to refine informative representations for spatial transcriptomic data, consequently improving spatial domain identification. stCluster first leverages graph contrastive learning technology to obtain discriminative representations capable of recognizing spatially coherent patterns. Through jointly optimizing multiple tasks, stCluster further fine-tunes the representations to be able to capture complex relationships between gene expression and spatial organization. Benchmarked against six state-of-the-art methods, the experimental results reveal its proficiency in accurately identifying complex spatial domains across various datasets and platforms, spanning tissue, organ, and embryo levels. Moreover, stCluster can effectively denoise the spatial gene expression patterns and enhance the spatial trajectory inference. The source code of stCluster is freely available at https://github.com/hannshu/stCluster.


Subjects
Gene Expression Profiling , Transcriptome , Gene Expression Profiling/methods , Computational Biology/methods , Algorithms , Humans , Animals , Software , Machine Learning
2.
Bioinformatics ; 40(2)2024 02 01.
Article in English | MEDLINE | ID: mdl-38290765

ABSTRACT

SUMMARY: Single-cell multi-omics technologies provide a unique platform for characterizing cell states and reconstructing developmental process by simultaneously quantifying and integrating molecular signatures across various modalities, including genome, transcriptome, epigenome, and other omics layers. However, there is still an urgent unmet need for novel computational tools in this nascent field, which are critical for both effective and efficient interrogation of functionality across different omics modalities. Scbean represents a user-friendly Python library, designed to seamlessly incorporate a diverse array of models for the examination of single-cell data, encompassing both paired and unpaired multi-omics data. The library offers uniform and straightforward interfaces for tasks, such as dimensionality reduction, batch effect elimination, cell label transfer from well-annotated scRNA-seq data to scATAC-seq data, and the identification of spatially variable genes. Moreover, Scbean's models are engineered to harness the computational power of GPU acceleration through Tensorflow, rendering them capable of effortlessly handling datasets comprising millions of cells. AVAILABILITY AND IMPLEMENTATION: Scbean is released on the Python Package Index (PyPI) (https://pypi.org/project/scbean/) and GitHub (https://github.com/jhu99/scbean) under the MIT license. The documentation and example code can be found at https://scbean.readthedocs.io/en/latest/.


Subjects
Multiomics , Software , Genome , Transcriptome , Single-Cell Analysis , Data Analysis
3.
Brief Bioinform ; 23(1)2022 01 17.
Article in English | MEDLINE | ID: mdl-34585247

ABSTRACT

Single-cell technologies provide new ways to profile the transcriptomic landscape, chromatin accessibility, and spatial expression patterns in heterogeneous tissues at single-cell resolution. With the enormous number of single-cell datasets being generated, a key analytic challenge is to integrate these datasets to gain biological insights into cellular compositions. Here, we developed a domain-adversarial and variational approximation, DAVAE, which can integrate multiple single-cell datasets across samples, technologies and modalities with a single strategy. DAVAE can also integrate paired data of ATAC and transcriptome profiles that are measured simultaneously from the same cell. With a mini-batch stochastic gradient descent strategy, it scales to large datasets and can be accelerated by GPUs. Results on seven real data integration applications demonstrated the effectiveness and scalability of DAVAE in batch-effect removal, transfer learning and cell-type prediction for multiple single-cell datasets across samples, technologies and modalities. Availability: DAVAE has been implemented in the toolkit package "scbean" in the PyPI repository, and the source code is also freely accessible at https://github.com/jhu99/scbean. All data and source code for reproducing the results of this paper are accessible at https://github.com/jhu99/davae_paper.


Subjects
Single-Cell Analysis , Software , Algorithms , Chromatin , Transcriptome
4.
Bioinformatics ; 39(1)2023 01 01.
Article in English | MEDLINE | ID: mdl-36622018

ABSTRACT

MOTIVATION: Single-cell multimodal assays allow us to simultaneously measure two different molecular features of the same cell, enabling new insights into cellular heterogeneity, cell development and diseases. However, most existing methods suffer from inaccurate dimensionality reduction for the joint-modality data, hindering their discovery of novel or rare cell subpopulations. RESULTS: Here, we present VIMCCA, a computational framework based on variational-assisted multi-view canonical correlation analysis to integrate paired multimodal single-cell data. Our statistical model uses a common latent variable to interpret the common source of variances in two different data modalities. Our approach jointly learns an inference model and two modality-specific non-linear models by leveraging variational inference and deep learning. We perform VIMCCA and compare it with 10 existing state-of-the-art algorithms on four paired multi-modal datasets sequenced by different protocols. Results demonstrate that VIMCCA facilitates integrating various types of joint-modality data, thus leading to more reliable and accurate downstream analysis. VIMCCA improves our ability to identify novel or rare cell subtypes compared to existing widely used methods. Besides, it can also facilitate inferring cell lineage based on joint-modality profiles. AVAILABILITY AND IMPLEMENTATION: The VIMCCA algorithm has been implemented in our toolkit package scbean (≥0.5.0), and its code has been archived at https://github.com/jhu99/scbean under MIT license. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms , Statistical Models , Cell Differentiation , Cell Lineage
5.
Nucleic Acids Res ; 50(4): e21, 2022 02 28.
Article in English | MEDLINE | ID: mdl-34871454

ABSTRACT

Data alignment is one of the first key steps in single-cell analysis for integrating multiple datasets and performing joint analysis across studies. Data alignment is challenging in extremely large datasets, however, as the majority of current single-cell data alignment methods are not computationally efficient. Here, we present VIPCCA, a computational framework based on non-linear canonical correlation analysis for effective and scalable single-cell data alignment. VIPCCA leverages both deep learning for effective single-cell data modeling and variational inference for scalable computation, thus enabling powerful data alignment across multiple samples, multiple data platforms, and multiple data types. VIPCCA is accurate for a range of alignment tasks, including alignment between single-cell RNA-seq and ATAC-seq datasets, and can easily accommodate millions of cells, thereby providing researchers with unique opportunities to tackle challenges emerging from large-scale single-cell atlases.


Subjects
Canonical Correlation Analysis , Single-Cell Analysis
6.
Circulation ; 145(24): 1749-1760, 2022 06 14.
Article in English | MEDLINE | ID: mdl-35450432

ABSTRACT

BACKGROUND: Short-term exposure to ambient air pollution has been linked with daily hospitalization and mortality from acute coronary syndrome (ACS); however, the associations of subdaily (hourly) levels of criteria air pollutants with the onset of ACS and its subtypes have rarely been evaluated. METHODS: We conducted a time-stratified case-crossover study among 1 292 880 patients with ACS from 2239 hospitals in 318 Chinese cities between January 1, 2015, and September 30, 2020. Hourly concentrations of fine particulate matter (PM2.5), coarse particulate matter (PM2.5-10), nitrogen dioxide (NO2), sulfur dioxide (SO2), carbon monoxide (CO), and ozone (O3) were collected. Hourly onset data of ACS and its subtypes, including ST-segment-elevation myocardial infarction, non-ST-segment-elevation myocardial infarction, and unstable angina, were also obtained. Conditional logistic regressions combined with polynomial distributed lag models were applied. RESULTS: Acute exposures to PM2.5, NO2, SO2, and CO were each associated with the onset of ACS and its subtypes. These associations were strongest in the concurrent hour of exposure and were attenuated thereafter, with the weakest effects observed after 15 to 29 hours. There were no apparent thresholds in the concentration-response curves. An interquartile range increase in concentrations of PM2.5 (36.0 µg/m3), NO2 (29.0 µg/m3), SO2 (9.0 µg/m3), and CO (0.6 mg/m3) over the 0 to 24 hours before onset was significantly associated with 1.32%, 3.89%, 0.67%, and 1.55% higher risks of ACS onset, respectively. For a given pollutant, the associations were comparable in magnitude across different subtypes of ACS. NO2 showed the strongest associations with all 3 subtypes, followed by PM2.5, CO, and SO2. Greater magnitude of associations was observed among patients older than 65 years and in the cold season. Null associations of exposure to either PM2.5-10 or O3 with ACS onset were observed. 
CONCLUSIONS: The results suggest that transient exposure to the air pollutants PM2.5, NO2, SO2, or CO, but not PM2.5-10 or O3, may trigger the onset of ACS, even at concentrations below the World Health Organization air quality guidelines.
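Percentages of this kind are typically obtained by rescaling a log-odds coefficient from the conditional logistic model to an interquartile-range increase. The following is a minimal Python sketch of that conversion only. The function name `percent_change_per_iqr` is illustrative, and the coefficient is hypothetical: it is back-calculated so that the arithmetic reproduces the 1.32% reported for PM2.5, and is not a value taken from the paper.

```python
import math


def percent_change_per_iqr(beta_per_unit: float, iqr: float) -> float:
    """Percent change in odds for an IQR increase, given a log-odds slope per unit."""
    return (math.exp(beta_per_unit * iqr) - 1.0) * 100.0


# Hypothetical slope per 1 µg/m3 of PM2.5 (an assumption, not from the paper),
# chosen so that an IQR of 36.0 µg/m3 corresponds to a 1.32% increase.
beta_pm25 = math.log(1.0132) / 36.0
print(round(percent_change_per_iqr(beta_pm25, 36.0), 2))
```

Because the rescaling is exponential in the coefficient, a doubling of the IQR does not exactly double the percentage, although the difference is small for effects of this size.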


Subjects
Acute Coronary Syndrome , Air Pollutants , Air Pollution , Environmental Exposure , Acute Coronary Syndrome/epidemiology , Air Pollutants/analysis , Air Pollutants/toxicity , Air Pollution/adverse effects , Air Pollution/analysis , Carbon Monoxide/analysis , Carbon Monoxide/toxicity , China/epidemiology , Cities/epidemiology , Cross-Over Studies , Environmental Exposure/adverse effects , Environmental Exposure/analysis , Humans , Nitrogen Dioxide/analysis , Nitrogen Dioxide/toxicity , Ozone/analysis , Ozone/toxicity , Particulate Matter/analysis , Particulate Matter/toxicity , Sulfur Dioxide/analysis , Sulfur Dioxide/toxicity , Time Factors
7.
CMAJ ; 195(17): E601-E611, 2023 05 01.
Article in English | MEDLINE | ID: mdl-37127306

ABSTRACT

BACKGROUND: Few studies have explored the relationship between air pollution and arrhythmia onset at the hourly level. We aimed to examine the association of exposure to air pollution with the onset of acute symptomatic arrhythmia at an hourly level. METHODS: We conducted a nationwide, time-stratified, case-crossover study in China between 2015 and 2021. We obtained hourly information on the onset of symptomatic arrhythmia (including atrial fibrillation, atrial flutter, atrial and ventricular premature beats and supraventricular tachycardia) from the Chinese Cardiovascular Association Database - Chest Pain Center (including 2025 certified hospitals in 322 cities). We obtained data on hourly concentrations of 6 air pollutants from the nearest monitors, including fine particles (PM2.5), coarse particles (PM2.5-10), nitrogen dioxide (NO2), sulfur dioxide (SO2), carbon monoxide (CO) and ozone. For each patient, we matched the case period to 3 or 4 control periods during the same hour, day of week, month and year. We used conditional logistic regression models to analyze the data. RESULTS: We included a total of 190 115 patients with acute onset of symptomatic arrhythmia. Air pollution was associated with increased risk of onset of symptomatic arrhythmia within the first few hours of exposure; this risk attenuated substantially after 24 hours. An interquartile range increase in PM2.5, NO2, SO2 and CO in the first 24 hours after exposure (i.e., lag period 0-24 h) was associated with significantly higher odds of atrial fibrillation (1.7%-3.4%), atrial flutter (8.1%-11.4%) and supraventricular tachycardia (3.4%-8.9%). Exposure to PM2.5-10 was associated with significantly higher odds of atrial flutter (8.7%) and supraventricular tachycardia (5.4%), and exposure to ozone was associated with higher odds of supraventricular tachycardia (3.4%). The exposure-response relationships were approximately linear, without discernible concentration thresholds. 
INTERPRETATION: Exposure to air pollution was associated with the onset of symptomatic arrhythmia shortly after exposure. This finding highlights the importance of further reducing air pollution and taking prompt protective measures for susceptible populations during periods of elevated levels of air pollutants.
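The control-period matching described in the methods (same hour, day of week, month and year as the case period) can be sketched in a few lines of Python. This is a hedged illustration of time-stratified referent selection as it is commonly described, not the authors' code; the function name `control_periods` and the example dates are assumptions.

```python
from datetime import datetime, timedelta


def control_periods(onset: datetime) -> list[datetime]:
    """Return times sharing the onset's hour, weekday, month and year (onset excluded)."""
    week = timedelta(days=7)

    # Step back in whole weeks to the first matching weekday of the month.
    current = onset
    while (current - week).month == onset.month:
        current -= week

    # Walk forward week by week, keeping every date still inside the month.
    controls = []
    while current.month == onset.month:
        if current != onset:
            controls.append(current)
        current += week
    return controls


# Hypothetical onset: Wednesday 17 March 2021 at 14:00.
for control in control_periods(datetime(2021, 3, 17, 14)):
    print(control.isoformat())
```

A month contains either four or five occurrences of any weekday, which is why each case ends up with 3 or 4 control periods.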


Subjects
Air Pollutants , Air Pollution , Atrial Fibrillation , Atrial Flutter , Ozone , Humans , Cross-Over Studies , Atrial Fibrillation/chemically induced , Cities , Atrial Flutter/chemically induced , Nitrogen Dioxide , Particulate Matter/adverse effects , Particulate Matter/analysis , Air Pollution/adverse effects , Air Pollutants/adverse effects , Air Pollutants/analysis , Ozone/analysis , China , Environmental Exposure/adverse effects
8.
BMC Bioinformatics ; 22(1): 5, 2021 Jan 06.
Article in English | MEDLINE | ID: mdl-33407064

ABSTRACT

BACKGROUND: Single-cell RNA sequencing (scRNA-seq) enables many in-depth transcriptomic analyses at single-cell resolution. It is already widely used for exploring the dynamic development process of life, studying gene regulation mechanisms, and discovering new cell types. However, the low RNA capture rate, which causes highly sparse expression with dropout, makes downstream analyses difficult. RESULTS: We propose a new method, SCC, to impute the dropouts of scRNA-seq data. Experimental results show that SCC gives competitive results compared to two existing methods while showing superiority in reducing the intra-class distance of cells and improving the clustering accuracy in both simulated and real data. CONCLUSIONS: SCC is an effective tool to resolve the dropout noise in scRNA-seq data. The code is freely accessible at https://github.com/nwpuzhengyan/SCC .


Subjects
Gene Expression Profiling/methods , Small Cytoplasmic RNA/genetics , Single-Cell Analysis/methods , Gene Expression Regulation/genetics , Genomics/methods , Genetic Models
9.
BMC Cardiovasc Disord ; 21(1): 376, 2021 08 04.
Article in English | MEDLINE | ID: mdl-34348647

ABSTRACT

BACKGROUND: H type hypertension is defined as homocysteine (Hcy) ≥ 10 µmol/L in combination with primary hypertension. Studies have demonstrated that hyperhomocysteinemia (HHcy) in hypertensive patients worsens the outcome of cardio-cerebral events. This study aimed to investigate the current prevalence of H type hypertension and determine its risk factors in order to find intervention targets for H type hypertensive patients. METHODS: We conducted a cross-sectional study using a cluster sampling design in Shanghai, China, from July 2019 to April 2020. 23,652 patients with primary hypertension were enrolled in this study. Their medical information was recorded, and Hcy concentrations and methylenetetrahydrofolate reductase (MTHFR) C677T polymorphisms were measured. RESULTS: In total, 22,731 of 23,652 patients were recorded. The mean age was 68.9 ± 8.6 y and 43% were men. 80.0% of the enrolled patients had H type hypertension. The frequency of allele T was 40.9%, and the proportions of the CC, CT, and TT genotypes were 36.1%, 46.0%, and 17.9%, respectively. Compared with the TT genotype, the plasma Hcy concentration levels were lower in patients with the CC/CT genotype (18.96 ± 13.48 µmol/L vs. 13.62 ± 5.20/14.28 ± 5.36, F = 75.04, p < 0.01). The risk for H type hypertension was higher in elderly people. Men had ~ 5.55-fold odds of H type hypertension compared with women. Patients with the CT genotype and the TT genotype had ~ 1.36- and ~ 2.76-fold odds of H type hypertension compared with those with the CC genotype, respectively. Smoking and diabetes were not significantly associated with H type hypertension. CONCLUSIONS: The prevalence of H type hypertension in patients with primary hypertension was 80.0%, which was higher than the 75% found in a prior report in China. Age, gender, and MTHFR C677T polymorphisms, rather than smoking and diabetes, were independently associated with H type hypertension.
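The reported T allele frequency follows directly from the genotype proportions, since each TT carrier contributes two T alleles and each CT carrier one. A short Python check of that arithmetic is given below; the function name `allele_frequency` is illustrative and not from the paper.

```python
def allele_frequency(p_hom_ref: float, p_het: float, p_hom_alt: float) -> float:
    """Frequency of the alternate allele from the three genotype proportions."""
    total = p_hom_ref + p_het + p_hom_alt
    # Homozygotes carry two copies of the allele, heterozygotes carry one.
    return (p_hom_alt + 0.5 * p_het) / total


# Genotype proportions reported in the abstract: CC 36.1%, CT 46.0%, TT 17.9%.
t_freq = allele_frequency(0.361, 0.460, 0.179)
print(f"{t_freq:.3f}")  # 0.409
```

The result, 40.9%, agrees with the allele frequency stated in the results.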


Subjects
Genotype , Homocysteine/blood , Hypertension/blood , Hypertension/epidemiology , Methylenetetrahydrofolate Reductase (NADPH2)/genetics , Adult , Aged , Aged 80 and Over , China/epidemiology , Cross-Sectional Studies , Female , Humans , Hyperhomocysteinemia/complications , Hypertension/genetics , Male , Middle Aged , Genetic Polymorphism , Prevalence , Risk Factors
10.
Environ Res ; 194: 110655, 2021 03.
Article in English | MEDLINE | ID: mdl-33358871

ABSTRACT

BACKGROUND: The impacts of temperature variability on cardiac autonomic function remain unclear. OBJECTIVE: To explore the short-term associations between daily temperature variability and parameters of heart rate variability (HRV). METHODS: This is a repeated-measure study among 78 eligible participants in Shanghai, China. We defined temperature variability as diurnal temperature range (DTR), the standard-deviation of temperature (SDT) and temperature variability (TV). We evaluated 3 frequency-domain HRV parameters (VLF, LF and HF) and 4 time-domain parameters (SDNN, SDANN, rMSSD and pNN50). We used linear mixed-effect models to analyze the data after controlling for environmental and individual confounders. RESULTS: Temperature variability was significantly associated with decreased HRV, especially on the concurrent day. The exposure-response relationships were almost inversely linear for most parameters. Every one interquartile range (IQR) increase of DTR was associated with a decrease of 3.92% for VLF, 6.99% for LF, 5.88% for HF, 3.94% for rMSSD and 1.30% for pNN50. Each IQR increase of SDT was associated with a decline of 6.48% for LF, 5.91% for HF, 4.26% for rMSSD and 1.87% for pNN50. Every IQR increase of SDT was associated with a decrease of 4.39% for VLF, 7.67% for LF, 6.52% for HF, 3.22% for SDNN, 2.98% for SDANN, 4.05% for rMSSD, and 1.41% for pNN50. The decrements in HRV associated with temperature variability were more prominent in females. CONCLUSION: Temperature variability on the concurrent day could significantly decrease cardiac autonomic function, especially in females.


Subjects
Autonomic Nervous System , Heart , China , Female , Heart Rate , Humans , Temperature
11.
Ecotoxicol Environ Saf ; 208: 111726, 2021 Jan 15.
Article in English | MEDLINE | ID: mdl-33396057

ABSTRACT

BACKGROUND: It remains unclear which size of particles has the strongest effects on heart rate variability (HRV). OBJECTIVE: To explore the association between HRV parameters and daily variations of size-fractionated particle number concentrations (PNCs). METHODS: We conducted a longitudinal repeated-measure study among 78 participants with a 24-h continuous ambulatory Holter electrocardiographic recorder in Shanghai, China, from January 2015 to June 2019. Linear mixed-effects models were employed to evaluate the changes of HRV parameters associated with PNCs of 7 size ranges from 0.01 to 10 µm after controlling for environmental and individual confounders. RESULTS: On the concurrent day, decreased HRV parameters were associated with increased PNCs of 0.01-0.3 µm, and smaller particles showed greater effects. For an interquartile range increase in ultrafine particles (UFP, those < 0.1 µm, 2453 particles/cm3), the declines in very-low-frequency power, low-frequency power, high-frequency power, standard deviation of normal R-R intervals, root mean square of the successive differences between R-R intervals and percentage of adjacent normal R-R intervals with a difference ≥ 50 ms were 5.06% [95% confidence interval (CI): 2.09%, 7.94%], 7.65% (95%CI: 2.73%, 12.32%), 9.49% (95%CI: 4.64%, 14.09%), 5.10% (95%CI: 2.21%, 7.91%), 8.09% (95%CI: 4.39%, 11.65%) and 24.98% (95%CI: 14.70%, 34.02%), respectively. These results were robust to the adjustment of criteria air pollutants, temperature at different lags, and the status of heart medication. CONCLUSIONS: Particles less than 0.3 µm (especially UFP) may dominate the acute effects of particulate air pollution on cardiac autonomic dysfunction.
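Three of the time-domain parameters named above (standard deviation of R-R intervals, root mean square of successive differences, and percentage of adjacent intervals differing by ≥ 50 ms) have simple definitions that can be sketched in Python. This is a minimal illustration, not the study's processing pipeline: the function name and the R-R series are hypothetical, the sample standard deviation is an assumed choice, and the ≥ 50 ms threshold follows the wording of the abstract.

```python
import math
import statistics


def time_domain_hrv(rr_ms: list[float]) -> dict[str, float]:
    """Compute SDNN, rMSSD and pNN50 from a series of normal R-R intervals in ms."""
    # Differences between each pair of adjacent intervals.
    diffs = [later - earlier for earlier, later in zip(rr_ms, rr_ms[1:])]
    return {
        # Sample standard deviation of all intervals (assumed convention).
        "SDNN": statistics.stdev(rr_ms),
        # Root mean square of the successive differences.
        "rMSSD": math.sqrt(sum(d * d for d in diffs) / len(diffs)),
        # Percentage of adjacent pairs differing by at least 50 ms.
        "pNN50": 100.0 * sum(abs(d) >= 50 for d in diffs) / len(diffs),
    }


# Hypothetical R-R intervals in milliseconds.
print(time_domain_hrv([800, 850, 790, 810, 900]))
```

Real recordings would first need artifact and ectopic-beat removal so that only normal-to-normal intervals enter the calculation; that step is outside this sketch.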


Subjects
Air Pollutants/analysis , Air Pollution/statistics & numerical data , Environmental Exposure/statistics & numerical data , Particulate Matter/analysis , Air Pollution/analysis , China , Female , Heart Diseases , Heart Rate/drug effects , Humans , Male , Middle Aged , Particle Size , Temperature
12.
BMC Bioinformatics ; 21(Suppl 13): 385, 2020 Sep 17.
Article in English | MEDLINE | ID: mdl-32938373

ABSTRACT

BACKGROUND: Network alignment is an efficient computational framework for the prediction of protein function and phylogenetic relationships in systems biology. However, most existing alignment methods focus on aligning PPI networks based on a static network model, although these networks are dynamic in real-world systems. The dynamic characteristic of PPI networks is essential for understanding evolution and regulation mechanisms at the molecular level, and there is still much room to improve the alignment quality in dynamic networks. RESULTS: In this paper, we propose a novel alignment algorithm, Twadn, to align dynamic PPI networks based on a time warping strategy. We compare Twadn with the existing dynamic network alignment algorithms DynaMAGNA++ and DynaWAVE, using the area under the receiver operating characteristic curve and the area under the precision-recall curve as evaluation indicators. The experimental results show that Twadn is superior to DynaMAGNA++ and DynaWAVE. In addition, we use the protein interaction network of Drosophila to compare Twadn with the static network alignment algorithm NetCoffee2, and the experimental results show that Twadn is able to capture timing information that NetCoffee2 does not. CONCLUSIONS: Twadn is a versatile and efficient alignment tool that can be applied to dynamic networks. Hopefully, its application can benefit the research community in the fields of molecular function and evolution.


Subjects
Algorithms , Computational Biology/methods , Drosophila/metabolism , Protein Interaction Maps/genetics , Proteins/metabolism , Animals , Humans
13.
BMC Med ; 18(1): 312, 2020 11 10.
Article in English | MEDLINE | ID: mdl-33167994

ABSTRACT

BACKGROUND: Recently, the association between inflammatory bowel disease (including ulcerative colitis and Crohn's disease) and BMD has attracted great interest in the research community. However, the results of the published epidemiological observational studies on the relationship between inflammatory bowel disease and BMD are still inconclusive. Here, we performed a two-sample Mendelian randomization analysis to investigate the causal link between inflammatory bowel disease and level of BMD using publicly available GWAS summary statistics. METHODS: A series of quality control steps were taken in our analysis to select eligible instrumental SNPs which were strongly associated with exposure. To make the conclusions more robust and reliable, we utilized several robust analytical methods (inverse-variance weighting, MR-PRESSO method, mode-based estimate method, weighted median, MR-Egger regression, and MR.RAPS method) that are based on different assumptions of two-sample MR analysis. The MR-Egger intercept test, Cochran's Q test, and "leave-one-out" sensitivity analysis were performed to evaluate the horizontal pleiotropy, heterogeneities, and stability of these genetic variants on BMD. Outlier variants identified by the MR-PRESSO outlier test were removed step-by-step to reduce heterogeneity and the effect of horizontal pleiotropy. RESULTS: Our two-sample Mendelian randomization analysis with two groups of exposure GWAS summary statistics and four groups of outcome GWAS summary statistics suggested a definitively causal effect of genetically predicted ulcerative colitis on TB-BMD and FA-BMD but not on FN-BMD or LS-BMD (after Bonferroni correction), and we determined a causal effect of Crohn's disease only on FN-BMD and not on the others, which was somewhat inconsistent with many published observational studies. 
The causal effect of inflammatory bowel disease on TB-BMD was significant and robust, but its effect on FA-BMD, FN-BMD, and LS-BMD was not, which might result from the cumulative effect of ulcerative colitis and Crohn's disease on BMDs. CONCLUSIONS: Our Mendelian randomization analysis supported the causal effect of ulcerative colitis on TB-BMD and FA-BMD. For Crohn's disease, only a definitively causal effect on decreased FN-BMD was observed. An updated MR analysis is warranted to confirm our findings when a more advanced method that gives less biased estimates and better precision, or GWAS summary data with more ulcerative colitis and Crohn's disease patients, becomes available.
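Of the estimators listed in the methods, inverse-variance weighting is the simplest to write down: each SNP gives a Wald ratio (outcome effect divided by exposure effect), and the ratios are averaged with weights proportional to their precision. The Python sketch below shows the standard fixed-effect form under first-order weights. It is an illustration only, not the authors' pipeline; the function name and the summary statistics are hypothetical.

```python
import math


def ivw_estimate(
    beta_exposure: list[float],
    beta_outcome: list[float],
    se_outcome: list[float],
) -> tuple[float, float]:
    """Fixed-effect inverse-variance weighted causal estimate and its standard error."""
    # Per-SNP Wald ratios: effect on outcome per unit effect on exposure.
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    # First-order weights: inverse variance of each ratio, ignoring exposure-side error.
    weights = [(bx / se) ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    total_weight = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total_weight
    return estimate, math.sqrt(1.0 / total_weight)


# Hypothetical summary statistics for three instrumental SNPs.
estimate, std_error = ivw_estimate(
    beta_exposure=[0.10, 0.20, 0.15],
    beta_outcome=[0.05, 0.10, 0.06],
    se_outcome=[0.01, 0.02, 0.015],
)
print(f"IVW estimate = {estimate:.3f} (SE {std_error:.3f})")
```

This estimator assumes that every instrument is valid, which is why analyses of this kind pair it with methods such as MR-Egger and the weighted median that relax that assumption.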


Subjects
Inflammatory Bowel Diseases/epidemiology , Inflammatory Bowel Diseases/pathology , Bone Density , Humans , Mendelian Randomization Analysis/methods , Research Design
14.
BMC Bioinformatics ; 20(Suppl 18): 569, 2019 Nov 25.
Article in English | MEDLINE | ID: mdl-31760932

ABSTRACT

BACKGROUND: There is evidence to suggest that lncRNAs are associated with distinct and diverse biological processes. The dysfunction or mutation of lncRNAs is implicated in a wide range of diseases. An accurate computational model can benefit the diagnosis of diseases and help us gain a better understanding of the molecular mechanism. Although many related algorithms have been proposed, there is still much room to improve their accuracy. RESULTS: We developed a novel algorithm, BiWalkLDA, to predict disease-related lncRNAs in three real datasets, which have 528 lncRNAs, 545 diseases and 1216 interactions in total. To compare performance with other algorithms, the leave-one-out validation test was performed for BiWalkLDA and three other existing algorithms, SIMCLDA, LDAP and LRLSLDA. Additional tests were carefully designed to analyze the effects of the parameters α, β, l and r, which could help users select the best values of these parameters in their own applications. In a case study of prostate cancer, eight of the top ten disease-related lncRNAs reported by BiWalkLDA were previously confirmed in the literature. CONCLUSIONS: In this paper, we develop an algorithm, BiWalkLDA, to predict lncRNA-disease associations by using bi-random walks. It constructs a lncRNA-disease network by integrating interaction profile and gene ontology information, and it addresses the cold-start problem by using neighbors' interaction profile information. Bi-random walks were then applied to three real biological datasets. Results show that our method outperforms other algorithms in predicting lncRNA-disease associations in terms of both accuracy and specificity. AVAILABILITY: https://github.com/screamer/BiwalkLDA.


Subjects
Computational Biology/methods , Disease/genetics , Long Noncoding RNA/genetics , Algorithms , Computer Simulation , Gene Ontology , Humans , Software
15.
BMC Bioinformatics ; 20(Suppl 7): 200, 2019 May 01.
Article in English | MEDLINE | ID: mdl-31074373

ABSTRACT

BACKGROUND: Transcription factors (TFs) play important roles in the regulation of gene expression. They can activate or block transcription of downstream genes by binding to specific genomic sequences. Therefore, motif discovery of these binding preference patterns is of central significance in understanding molecular regulation mechanisms. Many algorithms have been proposed for the identification of transcription factor binding sites. However, it remains a challenging problem. RESULTS: Here, we propose a novel motif discovery algorithm based on support vector machine (MD-SVM) to learn a discriminative model for TF binding sites. MD-SVM first obtains a position weight matrix (PWM) from a set of training datasets. Then it translates the MD problem into a computational framework of multiple instance learning (MIL). It was applied to several real biological datasets. Results show that our algorithm outperforms MI-SVM in terms of both accuracy and specificity. CONCLUSIONS: In this paper, we modeled the TF motif discovery problem as a MIL optimization problem. The SVM algorithm was adapted to discriminate positive and negative bags of instances. Compared to other SVM-based algorithms, MD-SVM shows its superiority over its competitors in terms of ROC AUC. Hopefully, it could be of benefit to the research community in understanding the molecular functions of DNA functional elements and transcription factors.
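For readers unfamiliar with the position weight matrix mentioned above, the Python sketch below builds one from a handful of aligned binding sites using the textbook log-odds construction. It is not the MD-SVM procedure, whose details are not given in the abstract; the function names, the pseudocount, the uniform background and the example sites are all assumptions made for illustration.

```python
import math


def position_weight_matrix(
    sites: list[str], pseudocount: float = 1.0, background: float = 0.25
) -> list[dict[str, float]]:
    """Build a log-odds PWM (one dict per position) from equal-length aligned sites."""
    pwm = []
    for pos in range(len(sites[0])):
        column = [site[pos] for site in sites]
        row = {}
        for base in "ACGT":
            # Smoothed base frequency at this position.
            freq = (column.count(base) + pseudocount) / (len(sites) + 4 * pseudocount)
            # Log-odds against a uniform background (assumed).
            row[base] = math.log2(freq / background)
        pwm.append(row)
    return pwm


def score(pwm: list[dict[str, float]], sequence: str) -> float:
    """Sum the per-position log-odds for a candidate site of the same length."""
    return sum(pwm[i][base] for i, base in enumerate(sequence))


# Hypothetical aligned binding sites.
pwm = position_weight_matrix(["ACGT", "ACGA", "ACGT", "TCGT"])
print(round(score(pwm, "ACGT"), 3), round(score(pwm, "TTTT"), 3))
```

Candidate sites that resemble the training sites receive higher scores, which is the property a discriminative method can then exploit.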


Subjects
Algorithms , Nucleotide Motifs , Support Vector Machine , Transcription Factors/metabolism , Binding Sites , Humans , Protein Binding
16.
BMC Genomics ; 20(Suppl 13): 932, 2019 Dec 27.
Article in English | MEDLINE | ID: mdl-31881842

ABSTRACT

Proteins play essential roles in almost all life processes. The prediction of protein function is of significance for the understanding of molecular function and evolution. Network alignment provides a fast and effective framework to automatically identify functionally conserved proteins in a systematic way. However, due to the fast growing genomic data, interactions and annotation data, there is an increasing demand for more accurate and efficient tools to deal with multiple PPI networks. Here, we present a novel global alignment algorithm NetCoffee2 based on graph feature vectors to discover functionally conserved proteins and predict function for unknown proteins. To test the algorithm performance, NetCoffee2 and three other notable algorithms were applied on eight real biological datasets. Functional analyses were performed to evaluate the biological quality of these alignments. Results show that NetCoffee2 is superior to existing algorithms IsoRankN, NetCoffee and multiMAGNA++ in terms of both coverage and consistency. The binary and source code are freely available under the GNU GPL v3 license at https://github.com/screamer/NetCoffee2.


Subjects
Algorithms , Proteins/metabolism , Animals , Arabidopsis/metabolism , Arabidopsis Proteins/chemistry , Arabidopsis Proteins/metabolism , Drosophila/metabolism , Drosophila Proteins/chemistry , Drosophila Proteins/metabolism , Entropy , Humans , Mice , Protein Interaction Maps , Proteins/chemistry , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/chemistry , Saccharomyces cerevisiae Proteins/metabolism
17.
BMC Bioinformatics ; 19(1): 422, 2018 Nov 12.
Article in English | MEDLINE | ID: mdl-30419809

ABSTRACT

BACKGROUND: The discovery of functionally conserved proteins is a difficult and important task in systems biology. Global network alignment provides a systematic framework to search for these proteins across multiple protein-protein interaction (PPI) networks. Although many web servers for network alignment exist, none allows users to perform global multiple network alignment tasks on their own test datasets. RESULTS: Here, we developed a web server, WebNetCoffee, based on the NetCoffee algorithm, to search for a global network alignment across multiple networks. To build a series of online test datasets, we manually collected 218,339 proteins, 4,009,541 interactions and many other associated protein annotations from several public databases. All these datasets and alignment results are available for download, allowing users to perform algorithm comparison and downstream analyses. CONCLUSION: WebNetCoffee provides a versatile, interactive and user-friendly interface for running alignment tasks on both online datasets and users' test datasets, managing submitted jobs and visualizing the alignment results through a web browser. Our web server also supports graphical visualization of induced subnetworks for a given protein and its neighborhood. To the best of our knowledge, it is the first web server that supports global alignment of multiple PPI networks. AVAILABILITY: http://www.nwpu-bioinformatics.com/WebNetCoffee.


Subjects
Computational Biology/methods , Protein Interaction Mapping/methods , Humans
18.
Article in English | MEDLINE | ID: mdl-28416555

ABSTRACT

Tuberculosis (TB) continues to be one of the most common bacterial infectious diseases and is the leading cause of death in many parts of the world. A major limitation of TB therapy is slow killing of the infecting organism, increasing the risk for the development of a tolerance phenotype and drug resistance. Studies indicate that Mycobacterium tuberculosis takes several days to be killed upon treatment with lethal concentrations of antibiotics both in vitro and in vivo. To investigate how metabolic remodeling can enable transient bacterial survival during exposure to bactericidal concentrations of compounds, M. tuberculosis strain H37Rv was exposed to twice the MIC of isoniazid, rifampin, moxifloxacin, mefloquine, or bedaquiline for 24 h, 48 h, 4 days, and 6 days, and the bacterial proteomic response was analyzed using quantitative shotgun mass spectrometry. Numerous sets of de novo bacterial proteins were identified over the 6-day treatment. Network analysis and comparisons between the drug treatment groups revealed several shared sets of predominant proteins and enzymes simultaneously belonging to a number of diverse pathways. Overexpression of some of these proteins in the nonpathogenic Mycobacterium smegmatis extended bacterial survival upon exposure to bactericidal concentrations of antimicrobials, and inactivation of some proteins in M. tuberculosis prevented the pathogen from escaping fast killing both in vitro and in macrophages. Our biology-driven approach identified promising bacterial metabolic pathways and enzymes that might be targeted by novel drugs to reduce the length of tuberculosis therapy.


Subjects
Antitubercular Agents/pharmacology , Mycobacterium tuberculosis/drug effects , Proteomics/methods , Diarylquinolines/pharmacology , Fluoroquinolones/pharmacology , Isoniazid/pharmacology , Mefloquine/pharmacology , Moxifloxacin , Proteome/metabolism , Rifampin/pharmacology
19.
Acta Biochim Biophys Sin (Shanghai) ; 49(3): 270-276, 2017 Mar 01.
Article in English | MEDLINE | ID: mdl-28159958

ABSTRACT

The cardiac sodium channel plays a key role in the fast depolarization and maintenance of impulse conduction in cardiomyocytes. Mutations of the SCN5A gene can lead to many types of arrhythmias. A 14-year-old boy with a paternal family history of sudden unexpected nocturnal death was admitted to hospital with recurrent syncope. A cardiac channelopathy was suspected, and a search was made for a pathogenic ion channel mutation. The proband manifested sinus node dysfunction, ventricular tachycardia, and cardiac conduction disturbance involving the atrioventricular node and His bundle. The proband and his mother underwent whole exome sequencing. A heterozygous in-frame deletion, N1380del, on exon 23 of the SCN5A gene, located at a highly conserved pore residue in domain III (S5-S6), was revealed in the proband. The mutation was assessed in other family members by Sanger sequencing. The proband's living uncle and two sisters were asymptomatic mutation carriers with different degrees of cardiac conduction disturbance. Functional analysis was conducted using whole-cell patch clamping in HEK293T cells transfected with wild-type or mutant channels. The HEK293T cells transfected with plasmid pcDNA3.1-N1380del-SCN5A had no detectable sodium current. Overall, the N1380del mutation of the SCN5A gene leads to loss of function of the sodium channel. N1380del is a pathogenic mutation that can cause cardiac conduction defect and ventricular tachycardia.


Subjects
Cardiac Conduction System Disease/genetics , Mutation , NAV1.5 Voltage-Gated Sodium Channel/genetics , Ventricular Tachycardia/genetics , Adolescent , Cardiac Conduction System Disease/pathology , Cardiac Conduction System Disease/therapy , Exons , Humans , Male , Phenotype , Prognosis , Ventricular Tachycardia/pathology , Ventricular Tachycardia/therapy
20.
Molecules ; 22(12)2017 Dec 10.
Article in English | MEDLINE | ID: mdl-29232861

ABSTRACT

Network motifs are patterns that occur significantly more frequently in complex networks than in random networks. They have been considered fundamental building blocks of complex networks. Therefore, the detection of network motifs in transcriptional regulation networks is a crucial step in understanding the mechanism of transcriptional regulation and network evolution. The search for network motifs is similar to solving subgraph searching problems, which have been proven to be NP-complete. To quickly and effectively count subgraphs of a large biological network, we propose a novel graph canonization algorithm based on resolving sets. This method has been implemented in a command line interface (CLI) program, sgip, using the SeqAn library. Compared with Babai's algorithm, this approach has a tighter complexity bound, o(exp(n log 2 n + 4 log n)), on strongly regular graphs. Results on several simulated datasets and transcriptional regulation networks indicate that sgip outperforms nauty on many graph cases. The source code of sgip is freely accessible at https://github.com/seqan/seqan/tree/master/apps/sgip and the binary code at http://packages.seqan.de/sgip/.
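
The definition of a network motif given in this abstract (a pattern that occurs more often in the real network than in random networks) can be illustrated with a small brute-force sketch. This is not the canonization algorithm implemented in sgip and does not use SeqAn; it only counts one well-known three-node pattern, the feed-forward loop, and compares the count against degree-preserving randomizations. The function names (`count_ffl`, `rewire`, `motif_zscore`), the edge-swap null model, and the toy edge list are assumptions made for illustration.

```python
import random
from statistics import mean, pstdev


def count_ffl(edges):
    """Count feed-forward loops: a->b, b->c and a->c with a, b, c distinct."""
    succ = {}
    for u, v in edges:
        succ.setdefault(u, set()).add(v)
    count = 0
    for a, outs in succ.items():
        for b in outs:
            if b == a:
                continue  # ignore self-loops
            for c in succ.get(b, ()):
                if c != a and c != b and c in outs:
                    count += 1
    return count


def rewire(edges, n_swaps, rng):
    """Degree-preserving null model (illustrative): swap the targets of two
    random edges, skipping swaps that would create self-loops or duplicates."""
    edge_list = list(edges)
    edge_set = set(edge_list)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[i], edge_list[j]
        if a == d or c == b or (a, d) in edge_set or (c, b) in edge_set:
            continue
        edge_set -= {(a, b), (c, d)}
        edge_set |= {(a, d), (c, b)}
        edge_list[i], edge_list[j] = (a, d), (c, b)
    return edge_list


def motif_zscore(edges, n_random=200, seed=0):
    """Return (observed count, mean count in null model, z-score or None)."""
    rng = random.Random(seed)
    observed = count_ffl(edges)
    null = [count_ffl(rewire(edges, 10 * len(edges), rng))
            for _ in range(n_random)]
    mu, sigma = mean(null), pstdev(null)
    z = (observed - mu) / sigma if sigma > 0 else None
    return observed, mu, z


# Hypothetical toy regulatory network (regulator -> target).
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("B", "D")]

if __name__ == "__main__":
    print(motif_zscore(toy_edges))
```

A pattern is usually called a motif only when its z-score is clearly positive. On a graph as small as the toy example, the null model has too few distinct rewirings for the score to be meaningful, so the sketch is intended to show the procedure rather than a result.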


Subjects
Gene Expression Regulation , Gene Regulatory Networks , Algorithms , Internet , Software