Results 1 - 10 of 10
1.
Heliyon ; 10(9): e30413, 2024 May 15.
Article in English | MEDLINE | ID: mdl-38707296

ABSTRACT

To comprehend the genuine reading habits and preferences of diverse user cohorts and furnish tailored reading recommendations, this study introduces an English text reading recommendation model designed specifically for long-tail users. This model integrates collaborative filtering algorithms with the FastText classification method. Initially, the integrated collaborative filtering algorithm is explicated, followed by the calculation of the user's interest distribution across various types of English texts, achieved through an enhanced Ebbinghaus forgetting curve and analysis of user reading behaviors. Subsequently, an intelligent English text reading recommendation is generated by amalgamating collaborative filtering algorithms with association rule-based recommendation algorithms. Through optimization of the recommendation generation process, the model's recommendation accuracy is enhanced, thereby augmenting the performance and user satisfaction of the recommendation system. Finally, a comparative analysis is conducted with respect to the Top-N algorithm model, matrix factorization-based algorithm model, and FastText classification model, illustrating the superior recommendation accuracy and F-Measure value of the proposed model. The study findings indicate that when the recommendation list contains 10, 30, 50, and 70 texts, the recommendation accuracy of the proposed algorithm model is 0.75, 0.79, 0.8, and 0.74, respectively, outperforming other algorithms. Furthermore, as the number of texts increases, the F-Measure of all four models gradually improves, with the final F-Measure of the proposed model reaching 0.81. Notably, the F-Measure of the English text reading recommendation model proposed in this study significantly surpasses that of the other three recommendation methods. 
Demonstrating commendable performance in recall rate, root mean square error, normalized cumulative gain, precision, and accuracy, the model adeptly reflects user reading interests, thereby enhancing the accuracy of text recommendations and the overall system performance. The study findings offer crucial insights and guidance for enhancing the accuracy and overall efficacy of English text recommendation systems.
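
As a rough illustration of two ideas in this abstract (not the authors' implementation), the sketch below weights reading events by a forgetting curve to obtain an interest distribution, and computes the F-measure of a Top-N recommendation list. The abstract does not give the form of its enhanced Ebbinghaus curve, so the basic exponential retention R = exp(-t/S) is used as a stand-in. The names `interest_distribution`, `f_measure_at_n`, and `strength` are hypothetical.

```python
import math
from collections import defaultdict


def interest_distribution(events, now, strength=5.0):
    """Share of interest per text category, with older reads decayed.

    events   -- list of (category, time_read) pairs, times in days
    strength -- assumed memory-strength parameter S in R = exp(-t / S)
    """
    weights = defaultdict(float)
    for category, time_read in events:
        elapsed = now - time_read
        weights[category] += math.exp(-elapsed / strength)
    total = sum(weights.values())
    return {category: w / total for category, w in weights.items()}


def f_measure_at_n(recommended, relevant, n):
    """Precision, recall, and F-measure of the first n recommendations."""
    top = recommended[:n]
    if not top or not relevant:
        return 0.0, 0.0, 0.0
    hits = len(set(top) & set(relevant))
    precision = hits / len(top)
    recall = hits / len(relevant)
    if hits == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    events = [("news", 0.0), ("news", 9.0), ("fiction", 10.0)]
    print(interest_distribution(events, now=10.0))
    print(f_measure_at_n(["a", "b", "c", "d"], {"a", "c", "e"}, n=4))
```

With the exponential stand-in, a single recent read can outweigh several older reads of another category, which is the behavior a forgetting-curve weighting is meant to capture.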

2.
Diagnostics (Basel) ; 14(5)2024 Mar 05.
Article in English | MEDLINE | ID: mdl-38473017

ABSTRACT

The critical success index (CSI) is an established metric used in meteorology to verify the accuracy of weather forecasts. It is defined as the ratio of hits to the sum of hits, false alarms, and misses. Translationally, CSI has gained popularity as a unitary outcome measure in various clinical situations where large numbers of true negatives may influence the interpretation of other, more traditional, outcome measures, such as specificity (Spec) and negative predictive value (NPV), or when unified interpretation of positive predictive value (PPV) and sensitivity (Sens) is needed. The derivation of CSI from measures including PPV has prompted questions as to whether and how CSI values may vary with disease prevalence (P), just as PPV estimates are dependent on P, and hence whether CSI values are generalizable between studies with differing prevalences. As no detailed study of the relation of CSI to prevalence has been undertaken hitherto, the dataset of a previously published test accuracy study of a cognitive screening instrument was interrogated to address this question. Three different methods were used to examine the change in CSI across a range of prevalences, using both the Bayes formula and equations directly relating CSI to Sens, PPV, P, and the test threshold (Q). These approaches showed that, as expected, CSI does vary with prevalence, but the dependence differs according to the method of calculation that is adopted. Bayesian rescaling of both Sens and PPV generates a concave curve, suggesting that CSI will be maximal at a particular prevalence, which may vary according to the particular dataset.
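
The definition above can be sketched in a few lines. This is an illustrative sketch, not the study's code: it computes CSI from counts, from Sens and PPV via the identity 1/CSI = 1/PPV + 1/Sens - 1, and shows one simple way to vary prevalence, namely recomputing PPV with the Bayes formula while holding Sens and Spec fixed. The function names and example values are assumptions, and this simple approach is not necessarily any of the three methods used in the study.

```python
def csi_from_counts(hits, false_alarms, misses):
    """Critical success index: hits / (hits + false alarms + misses)."""
    return hits / (hits + false_alarms + misses)


def csi_from_sens_ppv(sens, ppv):
    """Equivalent form, from 1/CSI = 1/PPV + 1/Sens - 1."""
    return 1.0 / (1.0 / ppv + 1.0 / sens - 1.0)


def ppv_from_bayes(sens, spec, prevalence):
    """Bayes formula for PPV at a given prevalence."""
    true_pos = sens * prevalence
    false_pos = (1.0 - spec) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Illustrative values only: Sens = 0.8, Spec = 0.9.
    for prevalence in (0.1, 0.3, 0.5):
        ppv = ppv_from_bayes(0.8, 0.9, prevalence)
        print(prevalence, round(csi_from_sens_ppv(0.8, ppv), 3))
```

True negatives appear nowhere in the count-based form, which is why CSI is attractive when true negatives are very numerous.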

3.
J Imaging ; 8(4)2022 Mar 23.
Article in English | MEDLINE | ID: mdl-35448212

ABSTRACT

To establish an optimal model for photo aesthetic assessment, this paper introduces an internal metric called the disentanglement-measure (D-measure), which reflects the degree of disentanglement of the final-layer fully connected (FC) nodes of a convolutional neural network (CNN). By combining the F-measure with the D-measure to obtain an FD measure, an algorithm is proposed for determining the optimal model from the many photo score prediction models generated by CNN-based repetitively self-revised learning (RSRL). Furthermore, the aesthetic features of the model regarding the first fixation perspective (FFP) and the assessment interest region (AIR) are defined by means of the feature maps so as to analyze their consistency with human aesthetics. The experimental results show that the proposed method helps improve the efficiency of determining the optimal model. Moreover, mapping the FFP and AIR of the models onto the image is useful for understanding the internal properties of these models related to human aesthetics and for validating the external performance of the aesthetic assessment.

4.
J Proteome Res ; 18(7): 2931-2939, 2019 07 05.
Article in English | MEDLINE | ID: mdl-31136183

ABSTRACT

Cellular respiration provides direct energy substances for living organisms. Electron storage and transportation should be completed through electron transport chains during the cellular respiration process. Thus, identifying electron transport proteins is an important research task. In protein identification, selection of the feature extraction method and classification algorithm has a direct bearing on classification. The distance-based Top-n-gram method, which was proposed based on the frequency profile and considered evolutionary information, was used in this study for feature extraction. The Max-Relevance-Max-Distance algorithm was adopted for feature selection. The first 4D features that greatly influenced the classification result were selected to form the feature data set. Finally, the random forest algorithm was used to identify electron transport proteins. Under the 10-fold cross-validation of the model constructed in this study, sensitivity, specificity, and accuracy rates surpassed 85%, 80%, and 82%, respectively. In the testing set, F-measure, AUC value, and accuracy exceeded 74%, 95%, and 86%, respectively. These experimental results indicated that the classification model built in this study is an effective tool in identifying electron transport proteins.


Subject(s)
Algorithms , Carrier Proteins/analysis , Electron Transport Chain Complex Proteins/analysis , Electron Transport , Classification , Models, Chemical , Sensitivity and Specificity
5.
Artif Intell Med ; 80: 1-10, 2017 07.
Article in English | MEDLINE | ID: mdl-28709745

ABSTRACT

We propose a method to discover sleep patterns via clustering of sound events recorded during sleep. The proposed method extends the conventional self-organizing map algorithm by kernelization and sequence-based technologies to obtain a fine-grained map that visualizes the distribution and changes of sleep-related events. We introduced features widely applied in sound processing and popular kernel functions to the proposed method to evaluate and compare performance. The proposed method provides a new aspect of sleep monitoring because the results demonstrate that sound events can be directly correlated to an individual's sleep patterns. In addition, by visualizing the transition of cluster dynamics, sleep-related sound events were found to relate to the various stages of sleep. Therefore, these results empirically warrant future study into the assessment of personal sleep quality using sound data.


Subject(s)
Algorithms , Polysomnography , Sleep , Cluster Analysis , Humans , Sound
6.
Percept Mot Skills ; 124(5): 961-973, 2017 Oct.
Article in English | MEDLINE | ID: mdl-28649923

ABSTRACT

Abnormal prosody is often evident in the voice intonations of individuals with autism spectrum disorders. We compared a machine-learning-based voice analysis with human hearing judgments made by 10 speech therapists for classifying children with autism spectrum disorders (n = 30) and typical development (n = 51). Using stimuli limited to single-word utterances, machine-learning-based voice analysis was superior to speech therapist judgments. There was a significantly higher true-positive than false-negative rate for machine-learning-based voice analysis but not for speech therapists. Results are discussed in terms of some artificiality of clinician judgments based on single-word utterances, and the objectivity machine-learning-based voice analysis adds to judging abnormal prosody.


Subject(s)
Autism Spectrum Disorder/physiopathology , Child Development/physiology , Machine Learning , Speech/physiology , Child , Child, Preschool , Female , Humans , Male , Psycholinguistics
7.
Ecol Evol ; 6(1): 337-48, 2016 Jan.
Article in English | MEDLINE | ID: mdl-26811797

ABSTRACT

Presence-only data present challenges for selecting thresholds to transform species distribution modeling results into binary outputs. In this article, we compare two recently published threshold selection methods (maxSSS and maxFpb) and examine the effectiveness of the threshold-based prevalence estimation approach. Six virtual species with varying prevalence were simulated within a real landscape in southeastern Australia. Presence-only models were built with DOMAIN, generalized linear model, Maxent, and Random Forest. Thresholds were selected with the two methods, maxSSS and maxFpb, using four presence-only datasets with different ratios of the number of known presences to the number of random points (KP-RP ratio). Sensitivity, specificity, true skill statistic, and F measure were used to evaluate the performance of the results. Species prevalence was estimated as the ratio of the number of predicted presences to the total number of points in the evaluation dataset. Thresholds selected with maxFpb varied as the KP-RP ratio of the threshold selection datasets changed. Datasets with a KP-RP ratio around 1 generally produced better results than datasets with ratios distant from 1. Results produced by maxFpb had specificity too low for very common species using Random Forest and Maxent models. In contrast, maxSSS produced consistent results whichever dataset was used. The estimation of prevalence was almost always biased, and the bias was very large for DOMAIN and Random Forest predictions. We conclude that maxFpb is affected by the KP-RP ratio of the threshold selection datasets, but maxSSS is almost unaffected by this ratio. Unbiased estimates of prevalence are difficult to obtain using the threshold-based approach.
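
A minimal sketch of the maxSSS idea, that is, choosing the threshold that maximizes the sum of sensitivity and specificity, together with the threshold-based prevalence estimate described above. It is illustrative only: labels are treated as known presences (1) and absences (0) for simplicity, whereas the study works with presences and random points, and all function names and data are hypothetical.

```python
def sens_spec(scores, labels, threshold):
    """Sensitivity and specificity when score >= threshold means 'present'."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted = score >= threshold
        if predicted and label:
            tp += 1
        elif predicted:
            fp += 1
        elif label:
            fn += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in labels")
    return tp / (tp + fn), tn / (tn + fp)


def max_sss_threshold(scores, labels):
    """Threshold maximizing sensitivity + specificity (lowest one on ties)."""
    best_threshold, best_sum = None, -1.0
    for threshold in sorted(set(scores)):
        sens, spec = sens_spec(scores, labels, threshold)
        if sens + spec > best_sum:
            best_threshold, best_sum = threshold, sens + spec
    return best_threshold, best_sum


def estimated_prevalence(scores, threshold):
    """Ratio of predicted presences to the total number of points."""
    return sum(score >= threshold for score in scores) / len(scores)


if __name__ == "__main__":
    scores = [0.1, 0.2, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
    labels = [0, 0, 0, 1, 0, 1, 1, 1]
    threshold, sss = max_sss_threshold(scores, labels)
    print(threshold, sss - 1.0)  # second value is the true skill statistic
    print(estimated_prevalence(scores, threshold))
```

The true skill statistic is sensitivity + specificity - 1, so maximizing the sum also maximizes TSS.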

8.
FEBS Open Bio ; 5: 877-84, 2015.
Article in English | MEDLINE | ID: mdl-26649272

ABSTRACT

MicroRNAs (miRNAs) are small, non-coding RNA molecules that regulate gene expression in almost all plants and animals. They play an important role in key processes, such as proliferation, apoptosis, and pathogen-host interactions. Nevertheless, the mechanisms by which miRNAs act are not fully understood. The first step toward unraveling the function of a particular miRNA is the identification of its direct targets. This step has proven to be quite challenging in animals, primarily because of incomplete complementarity between miRNAs and target mRNAs. In recent years, the use of machine-learning techniques has greatly improved the prediction of miRNA targets, avoiding the need for costly and time-consuming experiments to identify miRNA targets. Among the most important machine-learning algorithms are decision trees, which classify data based on extracted rules. In the present work, we used a genetic algorithm in combination with a C4.5 decision tree for prediction of miRNA targets. We applied our proposed method to a validated human dataset. We achieved nearly 93.9% classification accuracy, which could be related to the selection of the best rules.

9.
Gene ; 533(1): 94-9, 2014 Jan 01.
Article in English | MEDLINE | ID: mdl-24120395

ABSTRACT

Long intergenic non-coding RNAs (lincRNAs) are a new type of non-coding RNA and are closely related to the occurrence and development of diseases. In previous studies, most lincRNAs have been identified through next-generation sequencing. Because lincRNAs exhibit tissue-specific expression, the reproducibility of lincRNA discovery across studies is very poor. In this study, without using lincRNA expression, we used sequence, structural, and protein-coding potential features as candidate features to construct a classifier that can distinguish lincRNAs from non-lincRNAs. The GA-SVM algorithm was applied to extract the optimized feature subset. Compared with several other feature subsets, the five-fold cross-validation results showed that this optimized feature subset exhibited the best performance for the identification of human lincRNAs. Moreover, the LincRNA Classifier based on Selected Features (linc-SF) was constructed with a support vector machine (SVM) based on the optimized feature subset. The performance of this classifier was further evaluated by predicting lincRNAs from two independent lincRNA sets. Because the recognition rates for the two lincRNA sets were 100% and 99.8%, linc-SF was found to be effective for the prediction of human lincRNAs.


Subject(s)
Algorithms , Computational Biology , RNA, Untranslated/genetics , Humans , Reproducibility of Results , Support Vector Machine
10.
J Pathol Inform ; 4(Suppl): S11, 2013.
Article in English | MEDLINE | ID: mdl-23766933

ABSTRACT

AIMS: A methodology for quantitative comparison of histological stains based on their classification and clustering performance, which may facilitate the choice of histological stains for automatic pattern and image analysis. BACKGROUND: Machine learning and image analysis are becoming increasingly important in pathology applications for automatic analysis of histological tissue samples. Pathologists rely on multiple, contrasting stains to analyze tissue samples, but histological stains are developed for visual analysis and are not always ideal for automatic analysis. MATERIALS AND METHODS: Thirteen different histological stains were used to stain adjacent prostate tissue sections from radical prostatectomies. We evaluate the stains for both supervised and unsupervised classification of stain/tissue combinations. For supervised classification we measure the error rate of nonlinear support vector machines, and for unsupervised classification we use the Rand index and the F-measure to assess the clustering results of a Gaussian mixture model based on expectation-maximization. Finally, we investigate class separability measures based on scatter criteria. RESULTS: A methodology for quantitative evaluation of histological stains in terms of their classification and clustering efficacy that aims at improving segmentation and color decomposition. We demonstrate that for a specific tissue type, certain stains perform consistently better than others according to objective error criteria. CONCLUSIONS: The choice of histological stain for automatic analysis must be based on its classification and clustering performance, which are indicators of the performance of automatic segmentation of tissue into morphological components, which in turn may be the basis for diagnosis.
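
The two clustering scores named in this abstract can be sketched by counting pairs of samples. This is an illustrative sketch, not the authors' code. Several definitions of a clustering F-measure exist, and the abstract does not say which one was used, so the pair-counting variant below is an assumption. Function names and example labels are hypothetical.

```python
from itertools import combinations


def pair_counts(truth, predicted):
    """Count sample pairs by agreement between reference and clustering."""
    tp = fp = fn = tn = 0
    for i, j in combinations(range(len(truth)), 2):
        same_truth = truth[i] == truth[j]
        same_pred = predicted[i] == predicted[j]
        if same_truth and same_pred:
            tp += 1  # together in both
        elif same_pred:
            fp += 1  # clustered together, but different classes
        elif same_truth:
            fn += 1  # same class, but split by the clustering
        else:
            tn += 1  # apart in both
    return tp, fp, fn, tn


def rand_index(truth, predicted):
    """Fraction of pairs on which reference and clustering agree."""
    tp, fp, fn, tn = pair_counts(truth, predicted)
    return (tp + tn) / (tp + fp + fn + tn)


def pairwise_f_measure(truth, predicted):
    """Harmonic mean of pair-counting precision and recall."""
    tp, fp, fn, _ = pair_counts(truth, predicted)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    truth = [0, 0, 0, 1, 1, 1]
    predicted = [0, 0, 1, 1, 1, 1]
    print(rand_index(truth, predicted), pairwise_f_measure(truth, predicted))
```

Both scores compare groupings rather than label names, so relabeling the clusters leaves them unchanged.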
