ABSTRACT
Although the selection of functional genes has been studied for years, traditional feature selection methods are often ineffective at capturing potentially specific genes. First, typical methods seek features (genes) that are relevant to the class but irrelevant to each other; however, the features that offer rich discriminative information are more likely to be complementary ones. Second, almost all existing methods assess feature relations in pairs, which yields an inaccurate local estimate and lacks global exploration. In this paper, we introduce the multi-variable Area Under the receiver operating characteristic Curve (AUC) to globally evaluate the complementarity among features by employing the Area Above the receiver operating characteristic Curve (AAC). With AAC, the class-relevant information newly provided by a candidate feature and that preserved by the selected features can be obtained beyond pairwise computation. Furthermore, we propose an AAC-based feature selection algorithm, named Multi-variable AUC-based Combined Features Complementarity, to screen discriminative complementary feature combinations. Extensive experiments on public datasets demonstrate the effectiveness of the proposed approach. In addition, we provide a prostate cancer gene set and discuss its potential biological significance from a machine learning perspective and on the basis of existing biomedical findings on some individual genes.
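The idea of scoring a feature subset as a whole, rather than in pairs, can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not specify how the multi-variable AUC is computed, so the combiner below (a signed sum of features) and the names `combined_score`, `aac`, and `greedy_select` are assumptions made only for illustration. The one fixed relation is geometric: the area above the ROC curve is 1 minus the area under it.

```python
def auc(scores, labels):
    """AUC via the Mann-Whitney pairwise count; ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def combined_score(X, labels, subset):
    """Assumed combiner: sum the chosen features, each oriented so that
    higher values favor class 1 (sign taken from its single-feature AUC)."""
    scores = [0.0] * len(X)
    for j in subset:
        col = [row[j] for row in X]
        sign = 1.0 if auc(col, labels) >= 0.5 else -1.0
        for i, v in enumerate(col):
            scores[i] += sign * v
    return scores

def aac(X, labels, subset):
    """Area above the ROC curve of the combined subset: 1 - AUC."""
    return 1.0 - auc(combined_score(X, labels, subset), labels)

def greedy_select(X, labels, k):
    """Forward selection: add the feature that minimizes the AAC of the
    whole selected subset, so candidates are judged jointly, not pairwise."""
    selected, remaining = [], list(range(len(X[0])))
    while remaining and len(selected) < k:
        best = min(remaining, key=lambda j: aac(X, labels, selected + [j]))
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy data (made up): feature 2 is uninformative alone (AUC 0.5) but
# cancels the noise in feature 0; feature 1 is weakly informative alone.
X = [[3, 1, -3], [0, 2, 0], [-3, 3, 3],   # class 0
     [4, 1, -3], [1, 2, 0], [-2, 4, 3]]   # class 1
y = [0, 0, 0, 1, 1, 1]

print(greedy_select(X, y, 2))  # [0, 2]
```

On this toy data, ranking features one at a time would prefer feature 1 over feature 2, whereas the subset-level criterion picks feature 2 because it complements feature 0.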
Subject(s)
Algorithms, Machine Learning, Area Under Curve, ROC Curve
ABSTRACT
KEY MESSAGE: Trans-splicing of PHYTOENE SYNTHASE 1 alters tomato fruit color, as shown by map-based cloning, functional complementation, and RACE, providing insight into fruit color development. Color is an important fruit quality trait and a major determinant of the economic value of tomato (Solanum lycopersicum). Fruit color inheritance in a yellow-fruited cherry tomato (cv. No. 22), named yellow-fruited tomato 2 (yft2), was shown to be controlled by a single recessive gene, YFT2. The YFT2 gene was mapped to a 95.7 kb region on chromosome 3, and the candidate gene, PHYTOENE SYNTHASE 1 (PSY1), was confirmed by functional complementation analysis. Constitutive overexpression of PSY1 in yft2 increased the accumulation of carotenoids and resulted in a red fruit color, while no causal mutation was detected in the YFT2 allele of yft2 compared with the red-fruited cherry tomato SL1995 or the cultivated variety (cv.) M82. Expression of the YFT2 3' region in yft2 was significantly lower than in SL1995, and further studies revealed a difference in YFT2 post-transcriptional processing in yft2 compared with SL1995 and cv. M82, resulting in a longer YFT2 transcript. The alternatively trans-spliced allele of YFT2 in yft2 is predicted to encode a novel LT-YFT2 protein of 432 amino acid (AA) residues, compared with the 412 AA YFT2 protein of SL1995. The trans-splicing event also resulted in significantly downregulated expression of YFT2 in yft2 tomato, and the YFT2 allele suppressed the expression of downstream genes in the carotenoid biosynthesis pathway, as well as carotenoid synthesis, through a feed-forward regulation mechanism. In conclusion, we found that trans-splicing of YFT2 alters tomato fruit color, providing new insights into fruit color development.