Results 1 - 20 of 46
1.
Neural Netw ; 176: 106341, 2024 Apr 25.
Article in English | MEDLINE | ID: mdl-38692189

ABSTRACT

The strong learning ability of deep learning helps us comprehend the real physical world, making learning to simulate complicated particle systems a promising endeavour in both academia and industry. However, the complex laws of the physical world pose significant challenges to learning-based simulations, such as the varying spatial dependencies between interacting particles and the varying temporal dependencies between particle system states at different time stamps, which dominate particles' interacting behavior and the physical systems' evolution patterns. Existing learning-based methods fail to fully account for these complexities, making them unable to yield satisfactory simulations. To better comprehend the complex physical laws, we propose a novel model, Graph Networks with Spatial-Temporal neural Ordinary Differential Equations (GNSTODE), that characterizes the varying spatial and temporal dependencies in particle systems using a unified end-to-end framework. Through training with real-world particle-particle interaction observations, GNSTODE can simulate any possible particle system with high precision. We empirically evaluate GNSTODE's simulation performance on two real-world particle systems, Gravity and Coulomb, with varying levels of spatial and temporal dependencies. The results show that GNSTODE yields better simulations than state-of-the-art methods, showing that GNSTODE can serve as an effective tool for particle simulation in real-world applications. Our code is made available at https://github.com/Guangsi-Shi/AI-for-physics-GNSTODE.

2.
Article in English | MEDLINE | ID: mdl-38743540

ABSTRACT

Conversational recommender systems (CRSs) utilize natural language interactions and dialog history to infer user preferences and provide accurate recommendations. Due to the limited conversation context and background knowledge, existing CRSs rely on external sources such as knowledge graphs (KGs) to enrich the context and model entities based on their interrelations. However, these methods ignore the rich intrinsic information within entities. To address this, we introduce the knowledge-enhanced entity representation learning (KERL) framework, which leverages both the KG and a pretrained language model (PLM) to improve the semantic understanding of entities for CRS. In our KERL framework, entity textual descriptions are encoded via a PLM, while a KG helps reinforce the representation of these entities. We also employ positional encoding to effectively capture the temporal information of entities in a conversation. The enhanced entity representation is then used to develop a recommender component that fuses both entity and contextual representations for more informed recommendations, as well as a dialog component that generates informative entity-related information in the response text. A high-quality KG with aligned entity descriptions is constructed to facilitate this study, namely, the Wiki Movie Knowledge Graph (WikiMKG). The experimental results show that KERL achieves state-of-the-art results in both recommendation and response generation tasks. Our code is publicly available at the link: https://github.com/icedpanda/KERL.

3.
IEEE Trans Cybern ; PP, 2024 May 21.
Article in English | MEDLINE | ID: mdl-38771679

ABSTRACT

Temporal knowledge graphs (TKGs) are receiving increased attention due to their time-dependent properties and the evolving nature of knowledge over time. TKGs typically contain complex geometric structures, such as hierarchical, ring, and chain structures, which can often be mixed together. However, embedding TKGs into Euclidean space, as is typically done with TKG completion (TKGC) models, presents a challenge when dealing with high-dimensional nonlinear data and complex geometric structures. To address this issue, we propose a novel TKGC model called multicurvature adaptive embedding (MADE). MADE models TKGs in multicurvature spaces, including flat Euclidean space (zero curvature), hyperbolic space (negative curvature), and hyperspherical space (positive curvature), to handle multiple geometric structures. We assign different weights to different curvature spaces in a data-driven manner to strengthen the ideal curvature spaces for modeling and weaken the inappropriate ones. Additionally, we introduce the quadruplet distributor (QD) to assist the information interaction in each geometric space. Ultimately, we develop an innovative temporal regularization to enhance the smoothness of timestamp embeddings by strengthening the correlation of neighboring timestamps. Experimental results show that MADE outperforms the existing state-of-the-art TKGC models.
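
The multicurvature idea in the abstract above can be illustrated with a minimal sketch: measure the distance between two embeddings in flat, hyperbolic, and hyperspherical space, then mix the three distances with softmax weights. This is not the MADE implementation. The function names, the choice of the Poincaré ball (curvature -1) and the unit sphere (curvature +1), and the weight logits are all assumptions made for illustration; in the paper the weights are learned from data.

```python
import math


def euclidean_distance(u, v):
    """Straight-line distance in flat (zero-curvature) space."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def poincare_distance(u, v):
    """Geodesic distance in the Poincare ball (curvature -1).

    Both points must lie strictly inside the unit ball.
    """
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    sq_u = sum(a * a for a in u)
    sq_v = sum(b * b for b in v)
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v)))


def spherical_distance(u, v):
    """Great-circle distance on the unit sphere (curvature +1).

    Inputs are normalized first, so any nonzero vectors are accepted.
    """
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cos_angle = sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos_angle)))


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mixed_distance(u, v, logits):
    """Weighted sum of the three distances.

    `logits` is a hypothetical stand-in for learned per-space weights,
    ordered as (Euclidean, hyperbolic, spherical).
    """
    weights = softmax(logits)
    distances = [
        euclidean_distance(u, v),
        poincare_distance(u, v),
        spherical_distance(u, v),
    ]
    return sum(w * d for w, d in zip(weights, distances))
```

A large logit for one space makes the mixed distance approach that space's distance, which mirrors the abstract's description of strengthening suitable curvature spaces and weakening unsuitable ones.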

4.
Article in English | MEDLINE | ID: mdl-38598381

ABSTRACT

Self-supervised learning (SSL) has recently achieved impressive performance on various time series tasks. The most prominent advantage of SSL is that it reduces the dependence on labeled data. Based on the pre-training and fine-tuning strategy, even a small amount of labeled data can achieve high performance. Compared with many published self-supervised surveys on computer vision and natural language processing, a comprehensive survey for time series SSL is still missing. To fill this gap, we review current state-of-the-art SSL methods for time series data in this article. To this end, we first comprehensively review existing surveys related to SSL and time series, and then provide a new taxonomy of existing time series SSL methods by summarizing them from three perspectives: generative-based, contrastive-based, and adversarial-based. These methods are further divided into ten subcategories with detailed reviews and discussions about their key intuitions, main frameworks, advantages and disadvantages. To facilitate the experiments and validation of time series SSL methods, we also summarize datasets commonly used in time series forecasting, classification, anomaly detection, and clustering tasks. Finally, we present the future directions of SSL for time series analysis.

5.
Neural Netw ; 174: 106219, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38442489

ABSTRACT

Extrapolating future events based on historical information in temporal knowledge graphs (TKGs) holds significant research value and practical applications. In this field, the methods currently utilized can be classified as either embedding-based or logical rule-based. Embedding-based methods depend on learned entity and relation embeddings for prediction, but they suffer from the lack of interpretability due to the opaque reasoning process. On the other hand, logical rule-based methods face scalability challenges as they heavily rely on predefined logical rules. To overcome these limitations, we propose a hybrid model that combines embedding-based and logical rule-based methods to capture deep causal logic. Our model, called the Inductive Reasoning Model based on Interpretable Logical Rule (ILR-IR), aims to provide interpretable insights while effectively predicting future events in TKGs. ILR-IR delves into historical information, extracting valuable insights from logical rules embedded within relations and interaction preferences between entities. By considering both logical rules and interaction preferences, ILR-IR offers a comprehensive perspective for predicting future events. In addition, we propose the incorporation of a one-class augmented matching loss during optimization, which serves to enhance performance of the model during training. We evaluate ILR-IR on multiple datasets, including ICEWS14, ICEWS0515, and ICEWS18. Experimental results demonstrate that ILR-IR outperforms state-of-the-art baselines, showcasing its superior performance in TKG extrapolation reasoning. Moreover, ILR-IR demonstrates remarkable generalization capabilities, even when applied to related datasets that share a common relation vocabulary. This suggests that our proposed model exhibits robust zero-shot reasoning abilities. For interested parties, we have made our code publicly available at https://github.com/mxadorable/ILR-IR.


Subject(s)
Automated Pattern Recognition , Problem Solving , Learning , Psychological Generalization , Knowledge
6.
Neural Netw ; 172: 106151, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38301339

ABSTRACT

Representation learning on temporal interaction graphs (TIG) aims to model complex networks with the dynamic evolution of interactions on a wide range of web and social graph applications. However, most existing works on TIG either (a) rely on node embeddings that are updated discretely, only when an interaction occurs, and thus fail to capture the continuous evolution of node embedding trajectories, or (b) overlook the rich temporal patterns hidden in the ever-changing graph data, which presumably leads to sub-optimal models. In this paper, we propose a two-module framework named ConTIG, a novel representation learning method on TIG that captures the continuous dynamic evolution of node embedding trajectories. With two essential modules, our model exploits three factors in dynamic networks: the latest interaction, neighbor features, and inherent characteristics. In the first update module, we employ a continuous inference block to learn the nodes' state trajectories from time-adjacent interaction patterns using ordinary differential equations. In the second transform module, we introduce a self-attention mechanism to predict future node embeddings by aggregating historical temporal interaction information. Experimental results demonstrate the superiority of ConTIG on temporal link prediction, temporal node recommendation, and dynamic node classification tasks on four datasets compared with a range of state-of-the-art baselines, especially for long-interval interaction prediction.


Subject(s)
Machine Learning
7.
Article in English | MEDLINE | ID: mdl-38190667

ABSTRACT

Origins of replication sites (ORIs) are crucial genomic regions where DNA replication initiation takes place, playing pivotal roles in fundamental biological processes like cell division, gene expression regulation, and DNA integrity. Accurate identification of ORIs is essential for comprehending cell replication, gene expression, and mutation-related diseases. However, experimental approaches for ORI identification are often expensive and time-consuming, leading to the growing popularity of computational methods. In this study, we present PLANNER (DeeP LeArNiNg prEdictor for ORI), a novel approach for species-specific and cell-specific prediction of eukaryotic ORIs. PLANNER uses the multi-scale ktuple sequences as input and employs the DNABERT pre-training model with transfer learning and ensemble learning strategies to train accurate predictive models. Extensive empirical test results demonstrate that PLANNER achieved superior predictive performance compared to state-of-the-art approaches, including iOri-Euk, Stack-ORI, and ORI-Deep, within specific cell types and across different cell types. Furthermore, by incorporating an interpretable analysis mechanism, we provide insights into the learned patterns, facilitating the mapping from discovering important sequential determinants to comprehensively analysing their biological functions. To facilitate the widespread utilisation of PLANNER, we developed an online webserver and local stand-alone software, available at http://planner.unimelb-biotools.cloud.edu.au/ and https://github.com/CongWang3/PLANNER, respectively.
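
The "multi-scale ktuple sequences" mentioned in the abstract above can be pictured as overlapping k-mers extracted at several values of k. The sketch below shows that tokenization only. The function names and the default scales (3, 4, 5, 6, the k values for which pretrained DNABERT models were released) are assumptions for illustration and are not taken from the PLANNER source code.

```python
def kmer_tokens(sequence, k):
    """Split a DNA sequence into overlapping k-mers with stride 1."""
    sequence = sequence.upper()
    # A sequence shorter than k yields an empty list.
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def multiscale_kmers(sequence, ks=(3, 4, 5, 6)):
    """Tokenize one sequence at several k values.

    Returns a dict mapping each k to its list of k-mer tokens. The default
    scales are an assumption, not a documented PLANNER setting.
    """
    return {k: kmer_tokens(sequence, k) for k in ks}
```

For example, `kmer_tokens("ATGCA", 3)` gives the three tokens `ATG`, `TGC`, and `GCA`; each scale would then be fed to a model pretrained on tokens of that length.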

8.
Int J Hematol ; 119(2): 119-129, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38147275

ABSTRACT

Adult B-cell acute lymphoblastic leukemia (B-ALL) prognosis remains unsatisfactory, and searching for new therapeutic targets is crucial for improving patient prognosis. Sperm-associated antigen 6 (SPAG6), a member of the cancer-testis antigen family, plays an important role in tumors, especially hematologic tumors; however, it is unknown whether SPAG6 plays a role in adult B-ALL. In this study, we demonstrated for the first time that SPAG6 expression was up-regulated in the bone marrow of adult B-ALL patients compared to healthy donors, and expression was significantly reduced in patients who achieved complete remission (CR) after treatment. In addition, patients with high SPAG6 expression were older (≥ 35 years; P = 0.015), had elevated white blood cell counts (WBC > 30 × 10^9/L; P = 0.021), and had a low rate of CR (P = 0.036). We explored the effect of SPAG6 on cell function by lentiviral transfection of the adult B-ALL cell lines BALL-1 and NALM-6, and discovered that knocking down SPAG6 significantly inhibited cell proliferation and promoted apoptosis. We identified that SPAG6 knockdown might regulate cell proliferation and apoptosis via the transforming growth factor-β (TGF-β)/Smad signaling pathway.


Subject(s)
Precursor Cell Lymphoblastic Leukemia-Lymphoma , Transforming Growth Factor beta , Male , Adult , Humans , Signal Transduction , Apoptosis/genetics , Cell Proliferation , Microtubule Proteins/metabolism
9.
Article in English | MEDLINE | ID: mdl-37962997

ABSTRACT

Multivariate time-series anomaly detection is critically important in many applications, including retail, transportation, power grid, and water treatment plants. Existing approaches for this problem mostly employ either statistical models, which cannot capture nonlinear relations well, or conventional deep learning (DL) models, e.g., convolutional neural network (CNN) and long short-term memory (LSTM), that do not explicitly learn the pairwise correlations among variables. To overcome these limitations, we propose a novel method, correlation-aware spatial-temporal graph learning (termed CST-GL), for time-series anomaly detection. CST-GL explicitly captures the pairwise correlations via a correlation learning (MTCL) module, based on which a spatial-temporal graph neural network (STGNN) can be developed. Then, by employing a graph convolution network (GCN) that exploits one- and multihop neighbor information, our STGNN component can encode rich spatial information from complex pairwise dependencies between variables. With a temporal module that consists of dilated convolutional functions, the STGNN can further capture long-range dependence over time. A novel anomaly scoring component is further integrated into CST-GL to estimate the degree of an anomaly in a purely unsupervised manner. Experimental results demonstrate that CST-GL can detect and diagnose anomalies effectively in general settings as well as enable early detection across different time delays. Our code is available at https://github.com/huankoh/CST-GL.

11.
Article in English | MEDLINE | ID: mdl-37695949

ABSTRACT

Graph neural networks (GNNs) have shown great ability in modeling graphs; however, their performance degrades significantly when there are noisy edges connecting nodes from different classes. To alleviate the negative effect of noisy edges on neighborhood aggregation, some recent GNNs propose to predict the label agreement between node pairs within a single network. However, predicting the label agreement of edges across different networks has not been investigated yet. Our work makes the pioneering attempt to study a novel problem of cross-network homophilous and heterophilous edge classification (CNHHEC) and proposes a novel domain-adaptive graph attention-supervised network (DGASN) to effectively tackle the CNHHEC problem. First, DGASN adopts a multihead graph attention network (GAT) as the GNN encoder, which jointly trains node embeddings and edge embeddings via the node classification and edge classification losses. As a result, label-discriminative embeddings can be obtained to distinguish homophilous edges from heterophilous edges. In addition, DGASN applies direct supervision on graph attention learning based on the observed edge labels from the source network, thus lowering the negative effects of heterophilous edges while enlarging the positive effects of homophilous edges during neighborhood aggregation. To facilitate knowledge transfer across networks, DGASN employs adversarial domain adaptation to mitigate domain divergence. Extensive experiments on real-world benchmark datasets demonstrate that the proposed DGASN achieves state-of-the-art performance in CNHHEC.

12.
Cancer Sci ; 114(11): 4445-4458, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37681349

ABSTRACT

Sperm-associated antigen 6 (SPAG6) has been identified as an oncogene or tumor suppressor in various types of human cancer. However, the role of SPAG6 in BCR::ABL1 negative myeloproliferative neoplasms (MPNs) remains unclear. Herein, we found that SPAG6 was upregulated at the mRNA level in primary MPN cells and MPN-derived leukemia cell lines. The SPAG6 protein was primarily located in the cytoplasm around the nucleus and positively correlated with β-tubulin expression. In vitro, forced expression of SPAG6 increased cell clone formation and promoted G1 to S cell cycle progression. Downregulation of SPAG6 promoted apoptosis, reduced G1 to S phase transition, and impaired cell proliferation and cytokine release accompanied by downregulated signal transducer and activator of transcription 1 (STAT1) expression. Furthermore, the inhibitory effect of interferon-α (INF-α) on the primary MPN cells with high SPAG6 expression was decreased. Downregulation of SPAG6 enhanced STAT1 induction, thus enhancing the proapoptotic and cell cycle arrest effects of INF-α both in vitro and in vivo. Finally, a decrease in SPAG6 protein expression was noted when the STAT1 signaling was blocked. Chromatin immunoprecipitation assays indicated that STAT1 protein could bind to the SPAG6 promoter, while the dual-luciferase reporter assay indicated that STAT1 could promote the expression of SPAG6. Our results substantiate the relationship between upregulated SPAG6, increased STAT1, and reduced sensitivity to INF-α response in MPN.


Subject(s)
Interferon-alpha , Neoplasms , Humans , Interferon-alpha/pharmacology , Interferon-alpha/genetics , Proteins/metabolism , Signal Transduction/genetics , Tumor Suppressor Genes , Genetic Promoter Regions , STAT1 Transcription Factor/genetics , STAT1 Transcription Factor/metabolism , Neoplasms/genetics , Microtubule Proteins/genetics , Microtubule Proteins/metabolism
13.
Neural Netw ; 166: 105-126, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37487409

ABSTRACT

In recent years, neural systems have demonstrated highly effective learning ability and superior perception intelligence. However, they have been found to lack effective reasoning and cognitive ability. On the other hand, symbolic systems exhibit exceptional cognitive intelligence but suffer from poor learning capabilities when compared to neural systems. Recognizing the advantages and disadvantages of both methodologies, an ideal solution emerges: combining neural systems and symbolic systems to create neural-symbolic learning systems that possess powerful perception and cognition. The purpose of this paper is to survey the advancements in neural-symbolic learning systems from four distinct perspectives: challenges, methods, applications, and future directions. By doing so, this research aims to propel this emerging field forward, offering researchers a comprehensive and holistic overview. This overview will not only highlight the current state-of-the-art but also identify promising avenues for future research.


Subject(s)
Learning , Computer Neural Networks , Artificial Intelligence , Cognition , Problem Solving
14.
Article in English | MEDLINE | ID: mdl-37440376

ABSTRACT

Contrastive learning (CL) is a prominent technique for self-supervised representation learning, which aims to contrast semantically similar (i.e., positive) and dissimilar (i.e., negative) pairs of examples under different augmented views. Recently, CL has provided unprecedented potential for learning expressive graph representations without external supervision. In graph CL, the negative nodes are typically uniformly sampled from augmented views to formulate the contrastive objective. However, this uniform negative sampling strategy limits the expressive power of contrastive models. To be specific, not all the negative nodes can provide sufficiently meaningful knowledge for effective contrastive representation learning. In addition, the negative nodes that are semantically similar to the anchor are undesirably repelled from it, leading to degraded model performance. To address these limitations, in this article, we devise an adaptive sampling strategy termed "AdaS." The proposed AdaS framework can be trained to adaptively encode the importance of different negative nodes, so as to encourage learning from the most informative graph nodes. Meanwhile, an auxiliary polarization regularizer is proposed to suppress the adverse impacts of the false negatives and enhance the discrimination ability of AdaS. The experimental results on a variety of real-world datasets firmly verify the effectiveness of our AdaS in improving the performance of graph CL.
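
The effect of re-weighting negatives, as described in the abstract above, can be seen in a small sketch of an InfoNCE-style loss in which each negative carries its own importance weight. This is a generic illustration, not the AdaS objective: the function names, the cosine similarity, the temperature value, and the fixed weights are assumptions, whereas AdaS learns how much each negative matters.

```python
import math


def cosine(u, v):
    """Cosine similarity of two nonzero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def weighted_info_nce(anchor, positive, negatives, weights, temperature=0.5):
    """InfoNCE-style loss with one weight per negative sample.

    With all weights equal to 1 this reduces to the usual uniform-negative
    form. The weights here are hypothetical fixed inputs, not learned values.
    """
    pos_term = math.exp(cosine(anchor, positive) / temperature)
    neg_term = sum(
        w * math.exp(cosine(anchor, neg) / temperature)
        for w, neg in zip(weights, negatives)
    )
    return -math.log(pos_term / (pos_term + neg_term))
```

Lowering the weight of a negative that is nearly identical to the anchor (a likely false negative) reduces the loss, so that sample is repelled less strongly than under uniform sampling.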

15.
Neural Netw ; 165: 596-610, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37364470

ABSTRACT

Although graph representation learning has been studied extensively in static graph settings, dynamic graphs are less investigated in this context. This paper proposes a novel integrated variational framework called DYnamic mixture Variational Graph Recurrent Neural Networks (DyVGRNN), which consists of extra latent random variables in structural and temporal modelling. Our proposed framework comprises an integration of Variational Graph Auto-Encoder (VGAE) and Graph Recurrent Neural Network (GRNN) by exploiting a novel attention mechanism. The Gaussian Mixture Model (GMM) and the VGAE framework are combined in DyVGRNN to model the multimodal nature of data, which enhances performance. To consider the significance of time steps, our proposed method incorporates an attention-based module. The experimental results demonstrate that our method greatly outperforms state-of-the-art dynamic graph representation learning methods in terms of link prediction and clustering.


Subject(s)
Learning , Computer Neural Networks , Cluster Analysis , Normal Distribution
16.
Comput Biol Med ; 163: 107155, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37356289

ABSTRACT

The genome of Mycobacterium tuberculosis contains a relatively high percentage (10%) of genes that are poorly characterised because of their highly repetitive nature and high GC content. Some of these genes encode proteins of the PE/PPE family, which are thought to be involved in host-pathogen interactions, virulence, and disease pathogenicity. Members of this family are genetically divergent and challenging to both identify and classify using conventional computational tools. Thus, advanced in silico methods are needed to efficiently identify proteins of this family for subsequent functional annotation. In this study, we developed the first deep learning-based approach, termed Digerati, for the rapid and accurate identification of PE and PPE family proteins. Digerati was built upon a multipath parallel hybrid deep learning framework, which combines multi-layer convolutional neural networks with bidirectional long short-term memory and a self-attention module to effectively learn the higher-order feature representations of PE/PPE proteins. Empirical studies demonstrated that Digerati achieved a significantly better performance (∼18-20%) than alignment-based approaches, including BLASTP, PHMMER, and HHsuite, in both prediction accuracy and speed. Digerati is anticipated to facilitate community-wide efforts to conduct high-throughput identification and analysis of PE/PPE family members. The webserver and source codes of Digerati are publicly available at http://web.unimelb-bioinfortools.cloud.edu.au/Digerati/.


Subject(s)
Deep Learning , Mycobacterium tuberculosis , Mycobacterium tuberculosis/genetics , Mycobacterium tuberculosis/metabolism , Bacterial Proteins/genetics , Virulence/genetics
17.
Neural Netw ; 164: 439-454, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37182346

ABSTRACT

Cross-network node classification (CNNC), which aims to classify nodes in a label-deficient target network by transferring the knowledge from a source network with abundant labels, has drawn increasing attention recently. To address CNNC, we propose a domain-adaptive message passing graph neural network (DM-GNN), which integrates graph neural network (GNN) with conditional adversarial domain adaptation. DM-GNN is capable of learning informative representations for node classification that are also transferable across networks. Firstly, a GNN encoder is constructed by dual feature extractors to separate ego-embedding learning from neighbor-embedding learning so as to jointly capture commonality and discrimination between connected nodes. Secondly, a label propagation node classifier is proposed to refine each node's label prediction by combining its own prediction and its neighbors' prediction. In addition, a label-aware propagation scheme is devised for the labeled source network to promote intra-class propagation while avoiding inter-class propagation, thus yielding label-discriminative source embeddings. Thirdly, conditional adversarial domain adaptation is performed to take the neighborhood-refined class-label information into account during adversarial domain adaptation, so that the class-conditional distributions across networks can be better matched. Comparisons with eleven state-of-the-art methods demonstrate the effectiveness of the proposed DM-GNN.


Subject(s)
Knowledge , Computer Neural Networks
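
The label propagation step described in the DM-GNN abstract above (refining each node's prediction by combining it with its neighbors' predictions) can be sketched as a convex combination of a node's own class probabilities and the mean of its neighbors' probabilities. This is only an illustration: the function name, the mixing coefficient `alpha`, and the mean aggregation are assumptions and do not reproduce the DM-GNN classifier.

```python
def refine_predictions(probs, neighbors, alpha=0.5):
    """Blend each node's class probabilities with its neighbors' average.

    probs:     dict mapping node -> list of class probabilities
    neighbors: dict mapping node -> list of neighboring nodes
    alpha:     hypothetical weight on the node's own prediction
    """
    refined = {}
    for node, own in probs.items():
        nbrs = neighbors.get(node, [])
        if not nbrs:
            # Nothing to propagate from; keep the node's own prediction.
            refined[node] = list(own)
            continue
        mean_nbr = [
            sum(probs[j][c] for j in nbrs) / len(nbrs)
            for c in range(len(own))
        ]
        refined[node] = [
            alpha * o + (1.0 - alpha) * m for o, m in zip(own, mean_nbr)
        ]
    return refined
```

Because each output is a convex combination of probability vectors, the refined predictions still sum to one per node.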
18.
Brief Bioinform ; 24(3), 2023 May 19.
Article in English | MEDLINE | ID: mdl-37150785

ABSTRACT

A-to-I editing is the most prevalent RNA editing event, which refers to the change of adenosine (A) bases to inosine (I) bases in double-stranded RNAs. Several studies have revealed that A-to-I editing can regulate cellular processes and is associated with various human diseases. Therefore, accurate identification of A-to-I editing sites is crucial for understanding RNA-level (i.e. transcriptional) modifications and their potential roles in molecular functions. To date, various computational approaches for A-to-I editing site identification have been developed; however, their performance is still unsatisfactory and needs further improvement. In this study, we developed a novel stacked-ensemble learning model, ATTIC (A-To-I ediTing predICtor), to accurately identify A-to-I editing sites across three species, including Homo sapiens, Mus musculus and Drosophila melanogaster. We first comprehensively evaluated 37 RNA sequence-derived features combined with 14 popular machine learning algorithms. Then, we selected the optimal base models to build a series of stacked ensemble models. The final ATTIC framework was developed based on the optimal models improved by the feature selection strategy for specific species. Extensive cross-validation and independent tests illustrate that ATTIC outperforms state-of-the-art tools for predicting A-to-I editing sites. We also developed a web server for ATTIC, which is publicly available at http://web.unimelb-bioinfortools.cloud.edu.au/ATTIC/. We anticipate that ATTIC can be utilized as a useful tool to accelerate the identification of A-to-I RNA editing events and help characterize their roles in post-transcriptional regulation.


Subject(s)
Drosophila melanogaster , RNA Editing , Animals , Mice , Humans , Drosophila melanogaster/genetics , Drosophila melanogaster/metabolism , RNA/genetics , Adenosine/genetics , Adenosine/metabolism , Inosine/genetics , Inosine/metabolism
19.
IEEE Trans Neural Netw Learn Syst ; 34(2): 1089-1096, 2023 Feb.
Article in English | MEDLINE | ID: mdl-34437071

ABSTRACT

The non-Euclidean nature of graph structures poses interesting challenges when deep learning methods are applied. Graph convolutional networks (GCNs) can be regarded as one of the successful approaches to classification tasks on graph data, although the structure of this approach limits its performance. In this work, a novel representation learning approach is introduced based on spectral convolutions on graph-structured data in a semisupervised learning setting. Our proposed method, COnvOlving cLiques (COOL), is constructed as a neighborhood aggregation approach for learning node representations using established GCN architectures. This approach relies on aggregating local information by finding maximal cliques. Unlike existing graph neural networks, which follow a traditional neighborhood averaging scheme, COOL allows for aggregation of densely connected neighboring nodes of potentially differing locality. This leads to substantial improvements on multiple transductive node classification tasks.
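
Clique-based aggregation, as outlined in the abstract above, can be illustrated in two steps: enumerate the maximal cliques of a graph, then build each node's representation from the cliques it belongs to. The sketch below uses the basic Bron-Kerbosch algorithm (without pivoting) and averages the mean feature vector of every maximal clique containing a node. The function names and this particular averaging rule are assumptions for illustration; the abstract does not specify COOL's exact aggregation.

```python
def maximal_cliques(adj):
    """Enumerate maximal cliques with basic Bron-Kerbosch (no pivoting).

    adj maps each node to the set of its neighbors (undirected, no self-loops).
    """
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(sorted(current))
            return
        for v in list(candidates):
            expand(current | {v}, candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(adj), set())
    return sorted(cliques)


def clique_aggregate(adj, features):
    """Average the per-clique mean features over each node's maximal cliques.

    This aggregation rule is a hypothetical stand-in, not the COOL layer.
    """
    cliques = maximal_cliques(adj)
    dim = len(next(iter(features.values())))
    clique_means = [
        [sum(features[m][d] for m in clique) / len(clique) for d in range(dim)]
        for clique in cliques
    ]
    aggregated = {}
    for node in adj:
        own = [mean for clique, mean in zip(cliques, clique_means) if node in clique]
        aggregated[node] = [sum(m[d] for m in own) / len(own) for d in range(dim)]
    return aggregated
```

On a triangle 0-1-2 with a pendant edge 2-3, node 2 belongs to both maximal cliques, so its representation blends the dense triangle and the sparse edge, while nodes 0 and 1 see only the triangle.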

20.
IEEE Trans Neural Netw Learn Syst ; 34(11): 9102-9115, 2023 Nov.
Article in English | MEDLINE | ID: mdl-35320107

ABSTRACT

Many e-commerce platforms, such as AliExpress, run major promotion campaigns regularly. Before such a promotion, it is important to predict potential best sellers and their respective sales volumes so that the platform can arrange its supply chains and logistics accordingly. For items with a sufficiently long sales history, accurate sales forecasts can be achieved through traditional statistical forecasting techniques. Accurately predicting the sales volume of a new item, however, is rather challenging with existing methods; time series models tend to overfit due to the very limited historical sales records of the new item, whereas models that do not utilize historical information often fail to make accurate predictions, due to the lack of strong indicators of sales volume among the item's basic attributes. This article presents the solution deployed at Alibaba in 2019, which had been used in production to prepare for its annual "Double 11" promotion event, whose total sales amount exceeded U.S. $38 billion in a single day. The main idea of the proposed solution is to predict the sales volume of each new item through its connections with older products with sufficiently long sales history. In other words, our solution considers the cross-selling effects between different products, which have been largely neglected in previous methods. Specifically, the proposed solution first constructs an item graph, in which each new item is connected to relevant older items. Then, a novel multitask graph convolutional neural network (GCN) is trained by a multiobjective optimization-based gradient surgery technique to predict the expected sales volumes of new items. The designs of both the item graph and the GCN exploit the fact that we only need to perform accurate sales forecasts for potential best-selling items in a major promotion, which helps reduce computational overhead. Extensive experiments on both proprietary AliExpress data and a public dataset demonstrate that the proposed solution achieves consistent performance gains compared to existing methods for sales forecasting.
