Results 1 - 12 of 12
1.
Comput Biol Med ; 143: 105296, 2022 Apr.
Article in English | MEDLINE | ID: mdl-35149458

ABSTRACT

Data mining has proven to be a reliable method for analyzing and discovering useful knowledge about various diseases, including cancer. In particular, the use of data mining and machine learning algorithms to study oral squamous cell carcinoma (OSCC), the most common form of oral cancer, is a new area of research. This malignant neoplasm can be studied using saliva samples. Saliva is an important biofluid that must be used to verify potential biomarkers associated with oral cancer. In this study, we first provide an overview of OSCC diagnosis based on machine learning and salivary metabolites. To our knowledge, this is the first study to apply advanced data mining techniques to diagnose OSCC. We then present new results from classification and feature selection algorithms used to identify potential salivary biomarkers of OSCC. To accomplish this task, we used the random forest importance filter feature selection algorithm and a wrapper methodology to evaluate the importance of metabolites obtained from gas chromatography-mass spectrometry (GC-MS) for differentiating the OSCC group from the control group. Salivary samples (n = 68) were collected for the control group, and the OSCC group were from patients matched for gender, age, and smoking habit. Classification was based on the Random Forest (RF) algorithm with 10-fold cross-validation. The results showed that glucuronic acid, maleic acid, and batyl alcohol can classify the samples with an area under the curve (AUC) of 0.91, versus an AUC of 0.76 using all 51 metabolites analyzed. The methodology used in this study can assist healthcare professionals and be adopted to discover diagnostic biomarkers for other diseases.
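
The AUC values quoted in this abstract summarize how well a classifier's scores rank OSCC samples above controls. As a minimal sketch, and not the authors' code, AUC can be computed from scores with the pairwise (Mann-Whitney) formulation. The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not data from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    labels: iterable of 0/1 (1 = positive class, e.g. OSCC)
    scores: classifier scores, higher means "more likely positive"
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    # Count positive/negative pairs that are ranked correctly; ties count as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from the study).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported 0.91 and 0.76 should be read.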

2.
Metabolites ; 12(1)2022 Jan 04.
Article in English | MEDLINE | ID: mdl-35050157

ABSTRACT

The urinary volatomic profile of an Indian cohort comprising 28 lung cancer (LC) patients and 27 healthy subjects (control group, CTRL) was established using headspace solid-phase microextraction combined with gas chromatography-mass spectrometry as a powerful approach to identify urinary volatile organic metabolites (uVOMs) that discriminate LC patients from CTRL. Overall, 147 VOMs of several chemical classes were identified in the groups studied, including naphthalene derivatives, phenols, and organosulphurs, which were augmented in the LC group. In contrast, benzene and terpenic derivatives were found to be more prevalent in the CTRL group. The volatomic data obtained were processed using advanced statistical analysis, namely partial least squares discriminant analysis (PLS-DA), support vector machine (SVM), random forest (RF), and multilayer perceptron (MLP) methods. This resulted in the identification of nine uVOMs with a higher potential to discriminate LC patients from CTRL subjects. These were furan, o-cymene, furfural, linalool oxide, viridiflorene, 2-bromo-phenol, tricyclazole, 4-methyl-phenol, and 1-(4-hydroxy-3,5-di-tert-butylphenyl)-2-methyl-3-morpholinopropan-1-one. The metabolic pathway analysis of the data obtained identified several altered biochemical pathways in LC, mainly affecting glycolysis/gluconeogenesis, pyruvate metabolism, and fatty acid biosynthesis. Moreover, acetate and octanoic, decanoic, and dodecanoic fatty acids were identified as the key metabolites responsible for such deregulation. Further studies involving larger cohorts of LC patients would allow us to consolidate the data obtained and challenge the potential of the uVOMs as candidate biomarkers for LC.

3.
Biol Trace Elem Res ; 199(1): 92-101, 2021 Jan.
Article in English | MEDLINE | ID: mdl-32356206

ABSTRACT

Osteoporosis and its consequence of fragility fracture represent a major public health problem. Human exposure to heavy metals has received considerable attention over the last decades. However, little is known about the influence of co-exposure to multiple heavy metals on bone density. The present study aimed to examine the association between exposure to metals and bone mineral density (BMD) loss. Blood and urine concentrations of 20 chemical elements were selected from three cycles (2005-2010) of the National Health and Nutrition Examination Survey (NHANES), in which we included white women over 50 years of age who had previously been selected for BMD testing (N = 1892). The bone loss group was defined as participants with a T-score < -1.0, and the normal group as participants with a T-score ≥ -1.0. We developed classification models based on support vector machines capable of determining which factors could best predict BMD loss. The model that included the five best features selected by random forest, namely age, body mass index, and urinary concentrations of arsenic (As), cadmium (Cd), and tungsten (W), achieved high scores for accuracy (92.18%), sensitivity (90.50%), and specificity (93.35%). These data demonstrate the importance of these factors and metals to the classification, since they alone were capable of generating a classification model with high prediction accuracy without requiring the other variables. In summary, our findings provide insight into the important, yet overlooked, impact that arsenic, cadmium, and tungsten have on overall bone health.
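
The grouping rule and the three reported scores can be illustrated with a short sketch. It assumes only the T-score threshold stated in the abstract and the standard confusion-matrix definitions of accuracy, sensitivity, and specificity. The function names and the toy labels are hypothetical, and this is not the authors' pipeline.

```python
def bone_loss_group(t_score):
    """Return 1 (bone loss) if T-score < -1.0, else 0 (normal), per the abstract."""
    return 1 if t_score < -1.0 else 0


def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity for binary labels (1 = bone loss)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Undefined ratios (empty class) are reported as NaN rather than raising.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),   # true positive rate
        "specificity": ratio(tn, tn + fp),   # true negative rate
    }


# Hypothetical T-scores and predictions (not NHANES data).
toy_t_scores = [-2.1, -1.0, -0.3, -1.5, 0.4, -1.2]
toy_true = [bone_loss_group(t) for t in toy_t_scores]
toy_pred = [1, 0, 1, 1, 0, 0]
toy_metrics = classification_metrics(toy_true, toy_pred)
```

Note that a T-score of exactly -1.0 falls in the normal group under the stated definition.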


Subjects
Heavy Metals , Osteoporosis , Bone Density , Data Mining , Female , Humans , Nutrition Surveys , Osteoporosis/epidemiology
4.
Med Biol Eng Comput ; 58(3): 519-528, 2020 Mar.
Article in English | MEDLINE | ID: mdl-31900818

ABSTRACT

Early diagnosis and treatment are the most important strategies to prevent deaths from several diseases. In this regard, data mining and machine learning techniques have been useful tools to help minimize errors and to provide useful information for diagnosis. Our paper presents a new feature selection algorithm. To validate our study, we used eight benchmark data sets that are commonly used by researchers developing machine learning methods for medical data classification. The experiments showed that our proposed feature selection method combined with a twin-bounded support vector machine (FSTBSVM) performs very efficiently. The robustness of the FSTBSVM is examined using classification accuracy, sensitivity, and specificity. The proposed FSTBSVM is a very promising technique for classification, and the results show that it is capable of producing good results with fewer features than the original data sets. Graphical abstract: Model using a new feature selection method and grid search with 10-fold CV to optimize model parameters in our FSTBSVM.
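
The 10-fold CV mentioned in the graphical abstract partitions the samples into ten test folds, each used once while the remaining samples train the model. The sketch below shows only that index bookkeeping. The generator name `k_fold_indices` is hypothetical, it splits without shuffling or stratification, and it does not reproduce the FSTBSVM method or its grid search.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation (no shuffling)."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# Hypothetical data set of 23 samples split into 10 folds.
folds = list(k_fold_indices(23, k=10))
```

In a grid search, each candidate parameter setting would be scored by averaging a metric over these folds, and the best-scoring setting kept.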


Subjects
Support Vector Machine , Databases as Topic , Female , Humans , Neural Networks, Computer
5.
Food Chem ; 302: 125340, 2020 Jan 01.
Article in English | MEDLINE | ID: mdl-31419775

ABSTRACT

In this study, 83 wines representing four commercial categories, "Argentinean Malbec", "Brazilian Merlot", "Uruguayan Tannat" and "Chilean Carménère", were analyzed according to their phenolic and volatile compounds. The objective was to identify the chemical compounds that would typify each category. From approximately 600 peaks obtained by chromatographic techniques, 169 were identified and 53 of them were selected for multivariate statistical analysis. Chilean Carménère was the best discriminated group by the methods applied in our study, followed by Argentinean Malbec. Brazilian Merlot mixed mainly with some Carménère, while Tannat mixed with all wine categories, especially Malbec. In general, Chilean Carménère wines can be characterized by a bluish color, higher amounts of sulphur dioxide, higher content of octanoic acid, isobutanol, ethyl isoamyl succinate and catechin, and a smaller amount of quercetin. These data can contribute to further authentication or typification of South American red wines.


Subjects
Food Analysis/statistics & numerical data , Phenols/analysis , Volatile Organic Compounds/analysis , Wine/analysis , Butanols/analysis , Caprylates/analysis , Catechin/analysis , Food Analysis/methods , Gas Chromatography-Mass Spectrometry/methods , Gas Chromatography-Mass Spectrometry/statistics & numerical data , Multivariate Analysis , Quercetin/analysis , South America , Sulfur Dioxide/analysis , Wine/classification
6.
Crit Rev Food Sci Nutr ; 59(12): 1868-1879, 2019.
Article in English | MEDLINE | ID: mdl-29363991

ABSTRACT

Rice is one of the most important staple foods around the world. Authentication of rice is one of the most frequently addressed concerns in the present literature, and includes recognition of its geographical origin and variety, certification of organic rice, and many other issues. Good results have been achieved by multivariate data analysis and data mining techniques when combined with specific parameters for ascertaining authenticity and many other useful characteristics of rice, such as quality and yield. This paper reviews recent research on discrimination and authentication of rice using multivariate data analysis and data mining techniques. We found that data obtained from image processing, molecular and atomic spectroscopy, elemental fingerprinting, genetic markers, molecular content and others are promising sources of information regarding geographical origin, variety and other aspects of rice, and are widely used in combination with multivariate data analysis techniques. Principal component analysis and linear discriminant analysis are the preferred methods, but several other data classification techniques, such as support vector machines and artificial neural networks, are also frequently present in studies and show high performance for discrimination of rice.


Subjects
Food Analysis , Oryza/chemistry , Databases, Factual , Discriminant Analysis , Image Processing, Computer-Assisted , Multivariate Analysis , Oryza/genetics , Principal Component Analysis , Spectrophotometry, Atomic , Spectrum Analysis, Raman
7.
Environ Int ; 116: 269-277, 2018 07.
Article in English | MEDLINE | ID: mdl-29704805

ABSTRACT

Human exposure to endocrine disrupting chemicals (EDCs) has received considerable attention over the last three decades. However, little is known about the influence of co-exposure to multiple EDCs on effect-biomarkers such as oxidative stress in Brazilian children. In this study, concentrations of 40 EDCs were determined in urine samples collected from 300 Brazilian children of ages 6-14 years and data were analyzed by advanced data mining techniques. Oxidative DNA damage was evaluated from the urinary concentrations of 8-hydroxy-2'-deoxyguanosine (8OHDG). Fourteen EDCs, including bisphenol A (BPA), methyl paraben (MeP), ethyl paraben (EtP), propyl paraben (PrP), 3,4-dihydroxy benzoic acid (3,4-DHB), methyl-protocatechuic acid (OH-MeP), ethyl-protocatechuic acid (OH-EtP), triclosan (TCS), triclocarban (TCC), 2-hydroxy-4-methoxybenzophenone (BP3), 2,4-dihydroxybenzophenone (BP1), bisphenol A bis(2,3-dihydroxypropyl) glycidyl ether (BADGE·2H2O), 2,4-dichlorophenol (2,4-DCP), and 2,5-dichlorophenol (2,5-DCP) were found in >50% of the urine samples analyzed. The highest geometric mean concentrations were found for MeP (43.1 ng/mL), PrP (3.12 ng/mL), 3,4-DHB (42.2 ng/mL), TCS (8.26 ng/mL), BP3 (3.71 ng/mL), and BP1 (4.85 ng/mL), and exposures to most of which were associated with personal care product (PCP) use. Statistically significant associations were found between urinary concentrations of 8OHDG and BPA, MeP, 3,4-DHB, OH-MeP, OH-EtP, TCS, BP3, 2,4-DCP, and 2,5-DCP. After clustering the data on the basis of i) 14 EDCs (exposure levels), ii) demography (age, gender and geographic location), and iii) 8OHDG (effect), two distinct clusters of samples were identified. 8OHDG concentration was the most critical parameter that differentiated the two clusters, followed by OH-EtP. When 8OHDG was removed from the dataset, predictability of exposure variables increased in the order of: OH-EtP > OH-MeP > 3,4-DHB > BPA > 2,4-DCP > MeP > TCS > EtP > BP1 > 2,5-DCP. 
Our results showed that co-exposure to OH-EtP, OH-MeP, 3,4-DHB, BPA, 2,4-DCP, MeP, TCS, EtP, BP1, and 2,5-DCP was associated with DNA damage in children. This is the first study to report exposure of Brazilian children to a wide range of EDCs and the data mining approach further strengthened our findings of chemical co-exposures and biomarkers of effect.


Subjects
Benzene Derivatives/urine , DNA Damage , Data Mining/methods , Brazil/epidemiology , Child , Computational Biology , Humans
8.
Psico USF ; 23(3): 425-436, 2018. tab
Article in English | LILACS | ID: biblio-948239

ABSTRACT

Completion of an undergraduate course within the time stipulated by the curriculum is desirable for young people and for society. The aim was to verify the reliability, sensitivity, and specificity of a broad set of predictors of academic performance for university students who completed the undergraduate course within the time stipulated by the curriculum, using a data mining methodology based on the Support Vector Machines algorithm. A simple approach is proposed for predicting course completion by students at a university in Brazil. The dataset comprises 170 students who finished the course and 117 who did not. With the proposed methodology, it was possible to predict course completion with an accuracy of 79.5% when using the 19 original variables. An accuracy of 75% was found using only five variables: course, year of the course, gender, and initial and final academic performance. (AU)


Subjects
Humans , Male , Female , Adult , Students/psychology , Forecasting , Academic Performance/psychology , Mental Health , Universities , Data Mining , Support Vector Machine , Social Skills
9.
J Forensic Sci ; 62(6): 1479-1486, 2017 Nov.
Article in English | MEDLINE | ID: mdl-28205217

ABSTRACT

Variations in the elemental composition of ecstasy samples produce spectral profiles that carry useful information for data analysis, and cluster analysis of these profiles can help uncover different categories of the drug. We provide a cluster analysis of ecstasy tablets based on their elemental composition. Twenty-five elements were determined by ICP-MS in tablets seized by the São Paulo State Police, Brazil. We employed the K-means clustering algorithm together with a C4.5 decision tree to help interpret the clustering results. We found that two clusters best described the data, which may correspond to the approximate number of sources of the drug supplying the cities where the seizures occurred. The C4.5 model was capable of differentiating the ecstasy samples from the two clusters with high prediction accuracy under leave-one-out cross-validation. The model used only Nd, Ni, and Pb concentration values to classify the samples.
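
The K-means step named in this abstract alternates between assigning each sample to its nearest centroid and recomputing centroids as cluster means. The sketch below is a plain Lloyd's-algorithm illustration under stated assumptions: it initializes centroids from the first k points (a simplification chosen for reproducibility), uses squared Euclidean distance, and runs on made-up two-dimensional points rather than the 25-element ICP-MS profiles. It is not the authors' implementation.

```python
def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; returns (assignments, centroids)."""
    # Assumption: deterministic initialization from the first k points.
    centroids = [list(p) for p in points[:k]]
    assignments = None
    for _ in range(max_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_assignments = [
            min(range(k), key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centroids[c])))
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no sample changed cluster
        assignments = new_assignments
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments, centroids


# Hypothetical concentrations for two elements (not data from the study).
toy_points = [(1.0, 1.0), (9.0, 9.0), (1.2, 0.8), (8.8, 9.1), (0.9, 1.1), (9.2, 8.9)]
toy_assignments, toy_centroids = kmeans(toy_points, k=2)
```

The resulting cluster labels could then serve as the target for an interpretable classifier, which is the role the abstract gives to the C4.5 decision tree.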


Subjects
Illicit Drugs/chemistry , N-Methyl-3,4-methylenedioxyamphetamine/chemistry , Algorithms , Brazil , Cluster Analysis , Decision Trees , Drug Contamination , Drug Trafficking , Humans , Mass Spectrometry/methods , Tablets
10.
Planta ; 242(5): 1123-38, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26067758

ABSTRACT

MAIN CONCLUSION: Chemical analyses and glycome profiling demonstrate differences in the structures of the xyloglucan, galactomannan, glucuronoxylan, and rhamnogalacturonan I isolated from soybean (Glycine max) roots and root hair cell walls. The root hair is a plant cell that extends only at its tip. All other root cells have the ability to grow in different directions (diffuse growth). Although both growth modes require controlled expansion of the cell wall, the types and structures of polysaccharides in the walls of diffuse and tip-growing cells from the same plant have not been determined. Soybean (Glycine max) is one of the few plants whose root hairs can be isolated in amounts sufficient for cell wall chemical characterization. Here, we describe the structural features of rhamnogalacturonan I, rhamnogalacturonan II, xyloglucan, glucomannan, and 4-O-methyl glucuronoxylan present in the cell walls of soybean root hairs and roots stripped of root hairs. Irrespective of cell type, rhamnogalacturonan II exists as a dimer that is cross-linked by a borate ester. Root hair rhamnogalacturonan I contains more neutral oligosaccharide side chains than its root counterpart. At least 90% of the glucuronic acid is 4-O-methylated in root glucuronoxylan. Only 50% of this glycose is 4-O-methylated in the root hair counterpart. Mono-O-acetylated fucose-containing subunits account for at least 60% of the neutral xyloglucan from root and root hair walls. By contrast, a galacturonic acid-containing xyloglucan was detected only in root hair cell walls. Soybean homologs of the Arabidopsis xyloglucan-specific galacturonosyltransferase are highly expressed only in root hairs. A mannose-rich polysaccharide was also detected only in root hair cell walls. Our data demonstrate that the walls of tip-growing root hair cells have structural features that distinguish them from the walls of other root cells.


Subjects
Cell Wall/chemistry , Glucans/chemistry , Glycine max/chemistry , Mannans/chemistry , Pectins/chemistry , Plant Roots/chemistry , Xylans/chemistry , Galactose/analogs & derivatives
11.
Food Chem ; 184: 154-9, 2015 Oct 01.
Article in English | MEDLINE | ID: mdl-25872438

ABSTRACT

A practical and easy method for controlling the authenticity of organic sugarcane samples, based on machine-learning algorithms and trace element determination by inductively coupled plasma mass spectrometry, is proposed. Reference ranges for 32 chemical elements in 22 samples of sugarcane (13 organic and 9 non-organic) were established, and then two algorithms, Naive Bayes (NB) and Random Forest (RF), were evaluated to classify the samples. Accurate results (>90%) were obtained when using all variables (i.e., 32 elements). However, accuracy was improved (95.4% for NB) when only eight minerals (Rb, U, Al, Sr, Dy, Nb, Ta, Mo), chosen by a feature selection algorithm, were employed. Thus, a fingerprint based on trace element levels, combined with machine learning classification algorithms, may be used as a simple alternative for authenticity evaluation of organic sugarcane samples.


Subjects
Mass Spectrometry/methods , Saccharum/chemistry , Trace Elements/analysis , Algorithms , Bayes Theorem , Reference Values , Spectrophotometry, Atomic
12.
J Food Sci ; 79(9): C1672-7, 2014 Sep.
Article in English | MEDLINE | ID: mdl-25124993

ABSTRACT

This article aims to evaluate 2 machine learning algorithms, decision trees and naïve Bayes (NB), for egg classification (free-range eggs compared with battery eggs). The database used for the study consisted of 15 chemical elements (As, Ba, Cd, Co, Cs, Cu, Fe, Mg, Mn, Mo, Pb, Se, Sr, V, and Zn) determined in 52 egg samples (20 free-range and 32 battery eggs) by inductively coupled plasma mass spectrometry. Our results demonstrated that decision trees and NB combined with the mineral contents of eggs provide a high level of accuracy (above 80% and 90%, respectively) for classification between free-range and battery eggs and can be used as an alternative method for adulteration evaluation.
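
Naïve Bayes classification on continuous measurements such as element concentrations is often implemented with a Gaussian likelihood per feature and class. The abstract does not say which NB variant was used, so the Gaussian form below is an assumption. The function names and the two-feature toy data are likewise hypothetical.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate a log-prior and per-feature (mean, variance) for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        stats = []
        for col in zip(*rows):
            mean = sum(col) / len(col)
            # A small constant keeps the variance strictly positive.
            var = sum((v - mean) ** 2 for v in col) / len(col) + 1e-9
            stats.append((mean, var))
        model[label] = (math.log(len(rows) / len(X)), stats)
    return model


def predict_gaussian_nb(model, row):
    """Return the class with the highest log-posterior for one sample."""
    def log_posterior(label):
        log_prior, stats = model[label]
        log_lik = sum(
            -0.5 * math.log(2 * math.pi * var) - (v - mean) ** 2 / (2 * var)
            for v, (mean, var) in zip(row, stats)
        )
        return log_prior + log_lik
    return max(model, key=log_posterior)


# Hypothetical concentrations of two elements (not data from the study).
toy_X = [[1.0, 5.0], [1.2, 5.5], [0.8, 4.8], [3.0, 2.0], [3.2, 2.2], [2.9, 1.8]]
toy_y = ["free-range"] * 3 + ["battery"] * 3
toy_model = fit_gaussian_nb(toy_X, toy_y)
toy_prediction = predict_gaussian_nb(toy_model, [1.1, 5.1])
```

The "naïve" part is the assumption that features are independent given the class, which is why the per-feature log-likelihoods are simply summed.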


Subjects
Eggs/analysis , Food Quality , Algorithms , Animals , Artificial Intelligence , Bayes Theorem , Chickens , Decision Trees , Female , Mass Spectrometry/methods , Pattern Recognition, Automated , Trace Elements/analysis