Results 1 - 9 of 9
1.
Entropy (Basel) ; 23(8)2021 Aug 17.
Article in English | MEDLINE | ID: mdl-34441199

ABSTRACT

Time series classification (TSC) is a significant problem in data mining with applications in many domains. The primary approach is to mine distinguishing features. One promising family of methods is based on the morphological structure of time series; these algorithms are interpretable and accurate. However, existing structural feature-based algorithms, such as time series forest (TSF) and shapelet-based methods, traverse all features through many random combinations, which means that a lot of training time and computing resources are spent filtering out meaningless features, and important distinguishing information can be ignored. To overcome this problem, we propose a perceptual features-based framework for TSC. We are inspired by how humans observe time series: usually only a few essential points need to be remembered for a time series. Although a complex time series has a lot of detail, a small number of data points is enough to describe the shape of the entire sample. First, we use improved perceptually important points (PIPs) to extract key points and use them as the basis for time series segmentation, obtaining a combination of interval-level and point-level features. Second, we propose a framework to explore the effects of perceptual structural features combined with decision trees (DT), random forests (RF), and gradient boosting decision trees (GBDT) on TSC. The experimental results on the UCR datasets show that our work achieves leading accuracy, which is instructive for follow-up research.
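To make the key-point idea concrete, here is a minimal sketch of the classic PIP selection by vertical distance. It is not the "improved" PIP variant the abstract refers to, whose details are not given here; the function name and the choice of vertical distance are assumptions made for illustration.

```python
def perceptually_important_points(series, k):
    """Return the sorted indices of up to k PIPs (classic vertical-distance variant).

    Assumption: this is the textbook PIP procedure, not the paper's improved one.
    """
    n = len(series)
    if n < 2 or k < 2:
        raise ValueError("need at least two samples and k >= 2")
    k = min(k, n)
    pips = [0, n - 1]  # the two endpoints are always kept
    while len(pips) < k:
        best_idx, best_dist = None, -1.0
        # Scan every gap between adjacent PIPs for the point farthest
        # (vertically) from the chord joining the gap's endpoints.
        for left, right in zip(pips, pips[1:]):
            for i in range(left + 1, right):
                t = (i - left) / (right - left)
                chord = series[left] + t * (series[right] - series[left])
                dist = abs(series[i] - chord)
                if dist > best_dist:
                    best_idx, best_dist = i, dist
        if best_idx is None:  # no interior points left
            break
        pips.append(best_idx)
        pips.sort()
    return pips
```

The returned indices can serve as segment boundaries, so that interval-level features (for example, slope or mean per segment) and point-level features (the PIP values themselves) are computed from a handful of points rather than from every random interval.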

2.
Entropy (Basel) ; 23(4)2021 Apr 09.
Article in English | MEDLINE | ID: mdl-33918679

ABSTRACT

Predicting stock trends is a major challenge. Accidental factors often lead to sharp short-term fluctuations in stock markets that deviate from the underlying trend. These short-term price fluctuations are noisy, which hinders the prediction of stock trends. Therefore, we used discrete wavelet transform (DWT)-based denoising on stock data. Denoising helped us eliminate the influence of short-term random events on the continuous trend of a stock. The denoised data were smoother and showed more stable trend characteristics. Extreme learning machine (ELM) is an effective training algorithm for fully connected single-hidden-layer feedforward neural networks (SLFNs), with the advantages of fast convergence, unique results, and no convergence to a local minimum. Therefore, this paper proposed combining ELM with DWT-based denoising to predict stock trends. The proposed method was used to predict the trend of 400 stocks in China. The prediction results support the efficacy of DWT-based denoising for stock trends, and the method showed excellent performance compared to 12 machine learning algorithms (e.g., recurrent neural network (RNN) and long short-term memory (LSTM)).
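As a rough illustration of DWT-based denoising, the sketch below applies a single-level Haar transform, soft-thresholds the detail coefficients, and inverts the transform. The abstract does not state the wavelet family, decomposition depth, or thresholding rule, so all three choices here (and the function name) are assumptions.

```python
import math


def haar_denoise(signal, threshold):
    """One-level Haar DWT with soft thresholding of the detail coefficients.

    Assumption: Haar wavelet, one level, soft threshold; the paper's actual
    settings are not given in the abstract.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    # Forward transform: pairwise scaled sums (approximation) and
    # differences (detail).
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    # Soft thresholding shrinks small detail coefficients to zero.
    detail = [math.copysign(max(abs(d) - threshold, 0.0), d) for d in detail]
    # Inverse transform.
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out
```

With `threshold=0` the input is reconstructed unchanged; a large threshold replaces each pair of samples with its mean, which is the smoothing effect that suppresses short-term fluctuations.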

3.
Entropy (Basel) ; 22(1)2020 Jan 18.
Article in English | MEDLINE | ID: mdl-33285894

ABSTRACT

Event-based social networks (EBSNs) are widely used to create online social groups and organize offline events for users. Activeness and loyalty are crucial characteristics of these online social groups because they determine the growth or inactiveness of a group in a specific time frame. However, there is little research on these concepts to clarify the existence of groups in event-based social networks. In this paper, we study the problem of group activeness and user loyalty to provide a novel insight into online social networks. First, we analyze the structure of EBSNs and generate features from the crawled datasets. Second, we define the concepts of group activeness and user loyalty based on a series of time windows, and propose a method to measure group activeness. In the proposed method, we first compute the ratio of the number of events between two consecutive time windows. We then develop an association matrix to assign an activeness label to each group after several consecutive time windows. Similarly, we measure user loyalty in terms of attended events gathered in time windows and treat loyalty as a contributing feature of group activeness. Finally, three well-known machine learning techniques are used to verify the activeness label and to generate features for each group. As a result, we also find a small group of highly correlated features that yields higher accuracy than the full feature set.
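The ratio step described above can be sketched as follows. The association matrix used in the paper is not specified in the abstract, so `label_activeness` is a hypothetical stand-in (a simple majority rule over the ratios), and both function names and the threshold are assumptions for illustration only.

```python
def window_event_ratios(event_counts):
    """Ratio of event counts between each pair of consecutive time windows.

    Returns None where the earlier window had no events (ratio undefined).
    """
    ratios = []
    for prev, curr in zip(event_counts, event_counts[1:]):
        ratios.append(curr / prev if prev > 0 else None)
    return ratios


def label_activeness(ratios, threshold=1.0):
    """Hypothetical labeling rule, NOT the paper's association matrix:
    a group is 'active' if at least half of its defined ratios reach
    the threshold (i.e. the event count did not shrink)."""
    defined = [r for r in ratios if r is not None]
    growing = sum(1 for r in defined if r >= threshold)
    return "active" if defined and 2 * growing >= len(defined) else "inactive"
```

For example, a group with event counts `[4, 8, 2, 0, 3]` over five windows yields the ratios `[2.0, 0.25, 0.0, None]`.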

4.
Entropy (Basel) ; 22(10)2020 Oct 15.
Article in English | MEDLINE | ID: mdl-33286931

ABSTRACT

Time series prediction has been widely applied in the finance industry, in applications such as stock market price and commodity price forecasting. Machine learning methods have been widely used in financial time series prediction in recent years. How financial time series data are labeled determines the prediction accuracy of machine learning models and, subsequently, final investment returns, which makes labeling a hot topic. Existing labeling methods for financial time series mainly label data by comparing the current data with those of a short time period in the future. However, financial time series data are typically non-linear with obvious short-term randomness. Therefore, these labeling methods do not capture the continuous trend features of financial time series data, leading to a difference between their labeling results and real market trends. In this paper, a new labeling method called "continuous trend labeling" is proposed to address this problem. In the feature preprocessing stage, this paper proposes a new method that avoids the look-ahead bias of traditional data standardization or normalization processes. Then, a detailed logical explanation is given, continuous trend labeling is defined, and an automatic labeling algorithm is given to extract the continuous trend features of financial time series data. Experiments on the Shanghai Composite Index, the Shenzhen Component Index, and some Chinese stocks showed that our labeling method performs much better than existing labeling methods in terms of classification accuracy and other classification evaluation metrics. The results also showed that deep learning models such as LSTM and GRU are more suitable for predicting financial time series data.
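Look-ahead bias arises when a value at time t is standardized with statistics computed over the whole series, including the future. The abstract does not describe the paper's preprocessing method, so the sketch below shows only one common way to avoid the bias: an expanding-window z-score that uses strictly past observations. The function name and the `min_history` parameter are assumptions.

```python
import statistics


def causal_zscores(values, min_history=2):
    """Standardize each value using only observations that precede it.

    Assumption: an expanding-window z-score, shown as a generic remedy for
    look-ahead bias, not as the paper's method.
    """
    out = []
    for t, x in enumerate(values):
        history = values[:t]  # strictly past data; excludes x itself
        if len(history) < min_history:
            out.append(None)  # not enough history to estimate statistics
            continue
        mean = statistics.mean(history)
        std = statistics.pstdev(history)
        out.append((x - mean) / std if std > 0 else 0.0)
    return out
```

Because each output depends only on earlier inputs, changing any future value leaves all earlier standardized values unchanged, which is the property a whole-series normalization lacks.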

5.
Gut ; 64(1): 156-67, 2015 Jan.
Article in English | MEDLINE | ID: mdl-24572141

ABSTRACT

OBJECTIVE: Liver tumour-initiating cells (T-ICs) are critical for hepatocarcinogenesis. However, the underlying mechanism regulating the function of liver T-ICs remains unclear. METHODS: Tissue microarrays containing 242 hepatocellular carcinoma (HCC) samples were used for prognostic analysis. Magnetically activated cell sorting was used to isolate epithelial cell adhesion molecule (EPCAM)-positive cells. The gene expression changes caused by miR-429 were determined by arrays. Co-immunoprecipitation was used to study interactions among retinoblastoma protein (RB1), Rb binding protein 4 (RBBP4) and E2F transcription factor 1 (E2F1). The DNA methylation status in CpG islands was detected by quantitative methylation analysis. miRNAs in microvesicles were isolated by a syringe filter system. RESULTS: The significant prognostic factor miR-429 was upregulated in HCC tissues and also in primary liver T-ICs isolated from clinical samples. The enrichment of miR-429 in EPCAM+ T-ICs contributed to hepatocyte self-renewal, malignant proliferation, chemoresistance and tumorigenicity. A novel functional axis involving miR-429, RBBP4, E2F1 and POU class 5 homeobox 1 (POU5F1 or OCT4) governing the regulation of liver EPCAM+ T-ICs was established in vitro and in vivo. The molecular mechanism regulating miR-429 expression, involving four abnormal hypomethylated sites upstream of the miR-200b/miR-200a/miR-429 cluster, was first defined in both EPCAM+ liver T-ICs and very early-stage HCC tissues. miR-429 secreted by high-expressing cells has the potential to become a proactive signalling molecule to mediate intercellular communication. CONCLUSIONS: Epigenetic modification of miR-429 can manipulate liver T-ICs by targeting the RBBP4/E2F1/OCT4 axis. This miRNA might be targeted to inactivate T-ICs, thus providing a novel strategy for HCC prevention and treatment.


Subjects
Carcinoma, Hepatocellular/genetics; Carcinoma, Hepatocellular/pathology; Epigenesis, Genetic; Liver Neoplasms/genetics; Liver Neoplasms/pathology; MicroRNAs/genetics; Neoplastic Stem Cells; Retinoblastoma Protein/physiology; Humans; Prognosis; Tumor Cells, Cultured
6.
BMC Genomics ; 16: 1022, 2015 Dec 01.
Article in English | MEDLINE | ID: mdl-26626453

ABSTRACT

BACKGROUND: One major goal of large-scale cancer omics studies is to identify molecular subtypes for more accurate cancer diagnoses and treatments. To deal with high-dimensional cancer multi-omics data, a promising strategy is to find an effective low-dimensional subspace of the original data and then cluster cancer samples in the reduced subspace. However, due to data-type diversity and big data volume, few methods can integratively and efficiently find the principal low-dimensional manifold of the high-dimensional cancer multi-omics data. RESULTS: In this study, we proposed a novel low-rank approximation based integrative probabilistic model to quickly find the shared principal subspace across multiple data types; the convexity of the low-rank regularized likelihood function of the probabilistic model ensures efficient and stable model fitting. Candidate molecular subtypes can be identified by unsupervised clustering of hundreds of cancer samples in the reduced low-dimensional subspace. On testing datasets, our method LRAcluster (low-rank approximation based multi-omics data clustering) runs much faster with better clustering performance than the existing method. Then, we applied LRAcluster to large-scale cancer multi-omics data from TCGA. The pan-cancer analysis results show that cancers of different tissue origins are generally grouped as independent clusters, except squamous-like carcinomas, while the single cancer type analysis suggests that the omics data have different subtyping abilities for different cancer types. CONCLUSIONS: LRAcluster is a very useful method for fast dimension reduction and unsupervised clustering of large-scale multi-omics data. LRAcluster is implemented in R and freely available via http://bioinfo.au.tsinghua.edu.cn/software/lracluster/ .


Subjects
Computational Biology/methods; Epigenomics/methods; Genomics/methods; Neoplasms/genetics; Algorithms; Cluster Analysis; Datasets as Topic; Humans
7.
Chem Commun (Camb) ; (11): 1284-5, 2003 Jun 07.
Article in English | MEDLINE | ID: mdl-12809232

ABSTRACT

The one-dimensional double helicate [NH4][Mo2O4Gd(H2O)6(L-C4H2O6)2]·4H2O (1), which was synthesized by the reaction of GdCl3, L-tartaric acid and ammonium molybdate in acidified aqueous solution, is built up from two left-handed single-helical chains, linked further by eight-coordinated Gd(III) centres into an enantiopure left-handed double-helical configuration, in which each helix is formed by L-tartrate-bridged six-coordinated Mo(VI) atoms.

8.
PLoS One ; 8(9): e74275, 2013.
Article in English | MEDLINE | ID: mdl-24040221

ABSTRACT

DNA methylation plays a vital role in many essential biological processes and human diseases. The Illumina Infinium HumanMethylation450 BeadChip is a recently developed platform for studying genome-wide DNA methylation state at more than 480,000 CpG sites and a few CHG sites with high data quality. To analyze the data from this promising platform, we developed FastDMA, which can be used to identify significantly differentially methylated probes. Besides single-probe analysis, FastDMA can also perform region-based analysis to identify differentially methylated regions (DMRs). A unified statistical model, analysis of covariance (ANCOVA), is used for all the analyses in FastDMA. We apply FastDMA to three large-scale DNA methylation datasets from The Cancer Genome Atlas (TCGA) and find many differentially methylated genomic sites in different types of cancer. On the testing datasets, FastDMA shows much higher computational efficiency than current tools. FastDMA can benefit the data analysis of large-scale DNA methylation studies with an integrative pipeline and high computational efficiency. The software is freely available via http://bioinfo.au.tsinghua.edu.cn/software/fastdma/.


Subjects
Adenocarcinoma/genetics; Breast Neoplasms/genetics; Carcinoma/genetics; Lung Neoplasms/genetics; Models, Statistical; Prostatic Neoplasms/genetics; Software; Adenocarcinoma of Lung; Atlases as Topic; CpG Islands; DNA Methylation; Female; Genome, Human; Humans; Male; Oligonucleotide Array Sequence Analysis
9.
Quant Biol ; 1(3): 201-208, 2013 Sep.
Article in English | MEDLINE | ID: mdl-26085954

ABSTRACT

Cancer stem cell (CSC) theory suggests a cell-lineage structure in tumor cells in which CSCs are capable of giving rise to the other non-stem cancer cells (NSCCs) but not vice versa. However, an alternative scenario of bidirectional interconversions between CSCs and NSCCs was proposed very recently. Here we present a general population model of cancer cells that integrates conventional cell divisions with direct conversions between different cell states: not only can CSCs differentiate into NSCCs by asymmetric cell division, but NSCCs can also dedifferentiate into CSCs by cell state conversion. Our theoretical model is validated by applying it to recent experimental data. We also find that the transient increase in CSC proportion initiated from the purified NSCC subpopulation cannot be well predicted by the conventional CSC model, in which the conversion from NSCCs to CSCs is forbidden, implying that cell state conversion is required, especially for the transient dynamics. The theoretical analysis also gives the condition under which our general model can be equivalently reduced to a simple Markov chain with only cell state transitions while keeping the same cell proportion dynamics.
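The reduced Markov-chain picture can be illustrated with a two-state, discrete-time sketch in which `a` is the per-step probability of a CSC becoming an NSCC and `b` the probability of the reverse conversion. Both rates and the function name are hypothetical; the abstract gives no parameter values, and the paper's general model also includes cell division, which this sketch omits.

```python
def csc_proportion_trajectory(p0, a, b, steps):
    """CSC proportion over time in a two-state Markov chain.

    p0: initial CSC proportion
    a:  hypothetical per-step probability CSC -> NSCC
    b:  hypothetical per-step probability NSCC -> CSC
    """
    trajectory = [p0]
    p = p0
    for _ in range(steps):
        # CSCs that stay CSCs, plus NSCCs that convert back to CSCs.
        p = p * (1 - a) + (1 - p) * b
        trajectory.append(p)
    return trajectory
```

Starting from a purified NSCC population (`p0 = 0`), the CSC proportion rises only if `b > 0` and then approaches the stationary value `b / (a + b)`; with `b = 0`, as in the conventional one-way model, it stays at zero, which matches the abstract's point that the transient increase requires NSCC-to-CSC conversion.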
