Results 1 - 8 of 8
1.
Data Brief ; 52: 110034, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38282916

ABSTRACT

Low-resource languages, like Malay, face the threat of extinction when linguistic resources become scarce. This paper addresses the scarcity issue by contributing to the inventory of low-resource languages, specifically focusing on Malay-English, known as Manglish. Manglish speakers are primarily located in Malaysia, Indonesia, Brunei, and Singapore. As global adoption of second languages and social media usage increases, language code-switching, such as Spanglish and Chinglish, becomes more prevalent; in the case of Malay-English, this phenomenon is termed Manglish. To enhance the status of the Malay language and support its transition out of the low-resource category, a unique text corpus with binary annotations for biological gender and anonymized author identities is presented. This bi-annotated dataset offers valuable applications in various fields, including the investigation of cyberbullying, combating gender bias, and providing targeted recommendations for gender-specific products; it can be used with either of the annotations or their composite. The dataset comprises posts from 50 Malaysian public figures, equally split between biological males and females. It contains a total of 709,012 raw X (formerly Twitter) posts, with a relatively balanced distribution of 53.72% from biological female authors and 46.28% from biological male authors; the Twitter API was used to scrape the posts. After pre-processing, the total was reduced to 650,409 posts, widening the gap between the genders to 56.88% for biological females and 43.12% for biological males. This dataset is a valuable resource for researchers in Malay-English code-switching Natural Language Processing (NLP) and can be used to train or enhance existing and future Manglish language transformers.
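As a quick sanity check on the distribution figures above, the reported percentages and totals can be reproduced with a few lines of arithmetic. The per-gender counts below are derived from the stated totals and shares, not read from the dataset itself:

```python
# Reconstructing approximate per-gender post counts from the reported
# totals and percentages (derived figures, not taken from the dataset).
RAW_TOTAL = 709_012      # raw X (formerly Twitter) posts
CLEAN_TOTAL = 650_409    # posts remaining after pre-processing

raw_female = round(RAW_TOTAL * 0.5372)      # 53.72% biological female
clean_female = round(CLEAN_TOTAL * 0.5688)  # 56.88% biological female

def share(part, total):
    """Percentage of `total` made up by `part`, to two decimal places."""
    return round(100 * part / total, 2)

print(share(raw_female, RAW_TOTAL))               # 53.72
print(share(RAW_TOTAL - raw_female, RAW_TOTAL))   # 46.28
print(share(clean_female, CLEAN_TOTAL))           # 56.88
```

Rounding the derived counts back to percentages recovers the figures quoted in the abstract, confirming the totals and shares are mutually consistent.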

2.
BMC Bioinformatics ; 25(1): 23, 2024 Jan 12.
Article in English | MEDLINE | ID: mdl-38216898

ABSTRACT

BACKGROUND: With the exponential growth of high-throughput technologies, multiple pathway analysis methods have been proposed to estimate pathway activities from gene expression profiles. These pathway activity inference methods fall into two main categories: non-Topology-Based (non-TB) and Pathway Topology-Based (PTB) methods. Although some review and survey articles have discussed the topic from different aspects, there is a lack of systematic assessment and comparison of the robustness of these approaches. RESULTS: This study therefore presents a comprehensive robustness evaluation of seven widely used pathway activity inference methods on six cancer datasets, based on two assessments. The first assessment investigates the robustness of the pathway activities produced by each method, while the second assesses the robustness of the risk-active pathways and genes predicted by these methods. The mean reproducibility power and the total number of identified informative pathways and genes were evaluated. In the first assessment, the mean reproducibility power of the methods generally decreased as the number of selected pathways increased. Entropy-based Directed Random Walk (e-DRW) distinctly outperformed the other methods, exhibiting the greatest reproducibility power across all cancer datasets. The second assessment, on the other hand, shows that no method provides satisfactory results across datasets. CONCLUSION: Nevertheless, PTB methods generally perform better than non-TB methods in producing greater reproducibility power and identifying potential cancer markers.
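The study's exact reproducibility metric is not reproduced here, but the general idea can be illustrated: reproducibility power is commonly quantified as the overlap between the top-ranked pathways selected from independent splits of the same dataset. A minimal sketch, with hypothetical pathway names:

```python
# Illustrative sketch (not the paper's exact metric): quantify
# reproducibility as the fraction of top-k pathways shared by the
# rankings produced from two independent splits of a dataset.

def reproducibility_power(ranking_a, ranking_b, k):
    """Fraction of the top-k pathways common to both rankings."""
    top_a = set(ranking_a[:k])
    top_b = set(ranking_b[:k])
    return len(top_a & top_b) / k

# Hypothetical pathway rankings from two data splits.
split_a = ["p53", "mapk", "wnt", "jak_stat", "apoptosis"]
split_b = ["mapk", "p53", "hedgehog", "wnt", "cell_cycle"]

print(reproducibility_power(split_a, split_b, 3))  # 2 of 3 pathways shared
```

A method whose top pathways barely change across splits scores near 1.0; unstable methods score near 0, which is the sense in which reproducibility power decreases as more pathways are selected.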


Subjects
Neoplasms, Humans, Reproducibility of Results, Neoplasms/genetics, Entropy, Gene Expression
3.
PLoS One ; 18(11): e0293742, 2023.
Article in English | MEDLINE | ID: mdl-37917752

ABSTRACT

Refactoring, a widely adopted technique, has proven effective in facilitating and reducing maintenance activities and costs. Nonetheless, the effects of applying refactoring techniques on software quality are inconsistent and contradictory, leading to conflicting evidence on their overall benefit. Consequently, software developers face challenges in leveraging these techniques to improve software quality. Moreover, the absence of a categorization model hampers developers' ability to decide on the most suitable refactoring techniques for specific design goals. Thus, this study proposes a novel refactoring categorization model that categorizes techniques based on their measurable impacts on internal quality attributes. First, the refactoring techniques most commonly used by software practitioners were identified. Next, an experimental study was conducted using five case studies to measure the impacts of these techniques on internal quality attributes, followed by a multi-case analysis of the effects across the case studies. The proposed model, developed from these results, categorizes refactoring techniques into green, yellow, and red categories. Acting as a guideline, it helps developers understand the effect of each refactoring technique on quality attributes, allowing them to select appropriate techniques to improve specific attributes. Compared with existing studies, the model offers a more granular categorization (green, yellow, and red) and a wider scope (ten refactoring techniques and eleven internal quality attributes). Such granularity equips developers with an in-depth understanding of each technique's impact and fosters informed decision-making, explicitly highlighting areas of strength and concern for each refactoring technique. As a result, the model simplifies the decision-making process, saving the time and effort that would otherwise be spent weighing the benefits and drawbacks of various techniques, and has the potential to reduce maintenance activities and associated costs.
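The green/yellow/red idea can be sketched in a few lines: a technique's category follows from how many internal quality attributes it improves versus worsens. The thresholds and attribute names below are illustrative assumptions, not the paper's actual categorization rules:

```python
# Hypothetical sketch of a green/yellow/red categorization: label a
# refactoring technique by whether its measured impacts on internal
# quality attributes are uniformly positive, mixed, or mostly negative.
# Thresholds and attribute names are illustrative only.

def categorize(impacts):
    """impacts: dict mapping attribute name -> measured change
    (+1 improved, 0 no effect, -1 worsened)."""
    improved = sum(1 for v in impacts.values() if v > 0)
    worsened = sum(1 for v in impacts.values() if v < 0)
    if worsened == 0 and improved > 0:
        return "green"   # improves attributes with no downside
    if improved >= worsened:
        return "yellow"  # mixed impact: benefits at least match harms
    return "red"         # worsens more attributes than it improves

# Hypothetical measurements for an Extract Method refactoring.
extract_method = {"coupling": +1, "cohesion": +1, "size": -1}
print(categorize(extract_method))  # yellow
```

The point of such a rule is that the category is computed from measurements rather than asserted, so developers can trace why a technique landed in a given band.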


Subjects
Quality Improvement, Software
5.
MethodsX ; 10: 102124, 2023.
Article in English | MEDLINE | ID: mdl-36974325

ABSTRACT

Using data analytics to properly extract insights aligned with an enterprise's strategic goals is crucial for business sustainability, as is developing the most fitting context as a knowledge graph that answers related business questions and queries at scale. Data analytics is an integral part of smart manufacturing, monitoring production processes and identifying opportunities for automated operations and improved manufacturing performance. This paper reviews and investigates the best development practices to follow for industrial enterprise knowledge-graph development that supports smart manufacturing, in the following aspects:•Decisions for intelligent business processes, data collection from multiple sources, a competitive-advantage graph ontology, ensuring data quality, improved data analytics, human-friendly interaction, and rapid, scalable enterprise architectures.•Successful digital-transformation adoption for smart manufacturing as enterprise knowledge-graph development, with the capability to be transformed into a data fabric supporting the scalability of smart manufacturing processes in industrial enterprises.
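To make the knowledge-graph idea concrete, here is a minimal sketch of manufacturing facts stored as subject-predicate-object triples and queried to answer a simple business question. All entity and relation names are hypothetical:

```python
# Minimal in-memory triple store sketching the knowledge-graph idea.
# Entities ("line_1", "plant_kl", ...) and relations are hypothetical.

triples = [
    ("line_1", "produces", "widget_a"),
    ("line_2", "produces", "widget_b"),
    ("widget_a", "requires", "steel"),
    ("widget_b", "requires", "aluminium"),
    ("line_1", "located_in", "plant_kl"),
]

def query(subject=None, predicate=None, obj=None):
    """Return all triples matching the pattern (None acts as a wildcard)."""
    return [
        (s, p, o) for s, p, o in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]

# Business question: which raw material does plant_kl's line depend on?
line = query(predicate="located_in", obj="plant_kl")[0][0]
product = query(subject=line, predicate="produces")[0][2]
print(query(subject=product, predicate="requires"))
```

Chaining pattern queries like this, from plant to line to product to material, is exactly the kind of multi-hop business question a production knowledge graph is meant to answer at scale.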

6.
MethodsX ; 9: 101920, 2022.
Article in English | MEDLINE | ID: mdl-36420313

ABSTRACT

To achieve the maximum return on investment from the adoption of Digital Twins in manufacturing, organizations should be fully aware of the challenges that limit widespread adoption, as well as the opportunities that may create real added value for their businesses at the operational and strategic-management levels. In this context, the most influential factors for successful adoption must be determined clearly, even at the early planning stages, to ensure a highly effective digital-transformation journey for business sustainability. The benefits of successful planning and adoption of the industrial digital twin are significant in terms of optimized processes, reduced costs and operational downtime, and flexibility in product design and process adaptation to satisfy future market demands. The main purpose of this paper is to propose an adoption model of the digital twin for optimized products and production processes. The methodology of the proposed modelling can be considered unique in the following aspects:•Determining the expected added value of adopting a digital twin for the manufacturing business according to the business's operational criticality, budget, and size.•Allowing process optimization at three levels: plant (factory) physical layout, machines' operational fault tolerance, and final products' design and quality.•Allowing strategic-planning achievement for sustainable production, products, and future demands.

7.
PLoS One ; 13(3): e0193951, 2018.
Article in English | MEDLINE | ID: mdl-29565982

ABSTRACT

Although Radio Frequency Identification (RFID) is poised to displace barcodes, security vulnerabilities pose serious challenges for global adoption of RFID technology. Specifically, RFID tags are prone to basic cloning and counterfeiting attacks. Successful cloning of RFID tags in many commercial applications can lead to serious problems such as financial losses, brand damage, and risks to public health and safety. With many industries, such as pharmaceuticals, deploying RFID technology across a variety of products, it is important to tackle the RFID tag cloning problem and improve the resistance of RFID systems. To this end, we propose an approach for detecting cloned RFID tags with high detection accuracy and minimal overhead, thus overcoming practical challenges in existing approaches. The proposed approach is based on the consistency of dual hash collisions and a modified count-min sketch vector. We evaluated the approach through extensive experiments and compared it with existing baseline approaches in terms of execution time and detection accuracy under varying RFID tag cloning ratios. The results show that the proposed approach outperforms the baselines in cloned RFID tag detection accuracy.
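The paper's modified count-min sketch is not specified here, but the underlying data structure can be illustrated with a plain count-min sketch: each tag read increments one counter per hash row, and the minimum across rows estimates how often the tag was read. A tag whose estimate exceeds its expected read count is a cloning suspect. A minimal sketch, assuming standard count-min behavior rather than the paper's variant:

```python
# Plain count-min sketch used to over-count tag reads; a tag read more
# often than expected (e.g., by a clone answering alongside the genuine
# tag) is flagged. This illustrates the data structure only, not the
# paper's modified variant or its dual-hash consistency check.
import hashlib

class CountMinSketch:
    def __init__(self, width=64, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _positions(self, key):
        # One independent hash position per row, derived from SHA-256.
        for row in range(self.depth):
            digest = hashlib.sha256(f"{row}:{key}".encode()).hexdigest()
            yield row, int(digest, 16) % self.width

    def add(self, key):
        for row, col in self._positions(key):
            self.table[row][col] += 1

    def estimate(self, key):
        # Minimum across rows: never under-counts, rarely over-counts.
        return min(self.table[row][col] for row, col in self._positions(key))

sketch = CountMinSketch()
for _ in range(3):          # same tag ID observed three times
    sketch.add("TAG-0001")

expected_reads = 1          # e.g., one legitimate read per checkpoint
print(sketch.estimate("TAG-0001") > expected_reads)  # True: cloning suspect
```

The sketch keeps memory constant regardless of how many distinct tag IDs pass the reader, which is why count-min structures suit the minimal-overhead goal stated above.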


Subjects
Radio Frequency Identification Device/methods, Computer Security, Confidentiality, Electronic Data Processing/methods, Humans
8.
Sensors (Basel) ; 11(10): 9863-77, 2011.
Article in English | MEDLINE | ID: mdl-22163730

ABSTRACT

Radio frequency identification (RFID) systems are emerging as the primary object identification mechanism, especially in supply chain management. However, RFID naturally generates a large number of duplicate readings. Removing these duplicates from the RFID data stream is paramount, as they contribute no new information to the system and waste system resources. Existing approaches to this problem cannot fulfill the real-time demands of processing massive RFID data streams. We propose a data filtering approach that efficiently detects and removes duplicate readings from RFID data streams. Experimental results show that the proposed approach offers a significant improvement over existing approaches.
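One common formulation of RFID duplicate filtering, sketched below, treats a reading as a duplicate when the same tag ID reappears within a sliding time window; the window length and stream contents are illustrative assumptions, not the paper's method:

```python
# Sliding-window duplicate filtering sketch for an RFID read stream:
# drop a reading if the same tag ID was already seen within `window`
# time units. Window length and stream are illustrative.

def filter_duplicates(readings, window):
    """readings: iterable of (timestamp, tag_id) in time order.
    Yields only readings not repeated within `window`."""
    last_seen = {}
    for ts, tag in readings:
        if tag not in last_seen or ts - last_seen[tag] > window:
            yield ts, tag
        last_seen[tag] = ts  # refresh even on drops: suppresses bursts

stream = [(0, "A"), (1, "A"), (2, "B"), (9, "A"), (10, "B")]
print(list(filter_duplicates(stream, window=5)))
# [(0, 'A'), (2, 'B'), (9, 'A'), (10, 'B')]
```

Each tag costs one dictionary entry and each reading one lookup, so the filter runs in constant time per reading, the property a real-time stream filter needs.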


Subjects
Database Management Systems, Radio Frequency Identification Device/methods, Statistics as Topic/methods, Algorithms, Time Factors