1.
Cytometry A ; 103(4): 304-312, 2023 04.
Article in English | MEDLINE | ID: mdl-36030398

ABSTRACT

Minimal residual disease (MRD) detection is a strong predictor for survival and relapse in acute myeloid leukemia (AML). MRD can be determined either by molecular assessment strategies or via multiparameter flow cytometry. The degree of bone marrow (BM) dilution with peripheral blood (PB) increases with aspiration volume, causing consecutive underestimation of the residual AML blast amount. To prevent false-negative MRD results, we developed Cinderella, a simple automated method for one-tube simultaneous measurement of hemodilution in BM samples and MRD level. The explainable artificial intelligence (XAI) Cinderella was trained and validated with the digital raw data of a flow cytometric "8-color" AML-MRD antibody panel in 126 BM and 23 PB samples from 35 patients. Cinderella predicted PB dilution with high accordance compared to the results of the Holdrinet formula (Pearson's correlation coefficient r = 0.94, R² = 0.89, p < 0.001). Unlike conventional neural networks, Cinderella calculated the distributions of 12 different cell populations that were assigned to true hematopoietic counterparts in a human-in-the-loop (HIL) approach. Besides characteristic BM cells such as myelocytes and myeloid progenitor cells, the XAI identified discriminating populations that were not specific for BM or PB (e.g., T cell/NK cell subpopulations and CD45 negative cells) and considered their frequency differences. Thus, Cinderella represents a HIL-XAI algorithm capable of calculating the degree of hemodilution in BM samples with an AML MRD immunophenotype panel. It is explicable, transparent, and paves a simple way to prevent false-negative MRD reports.


Subjects
Bone Marrow , Acute Myeloid Leukemia , Humans , Residual Neoplasm/diagnosis , Artificial Intelligence , Hemodilution
2.
BMC Bioinformatics ; 23(1): 233, 2022 Jun 16.
Article in English | MEDLINE | ID: mdl-35710346

ABSTRACT

BACKGROUND: Data transformations are commonly used in bioinformatics data processing in the context of data projection and clustering. The most used Euclidean metric is not scale invariant and therefore occasionally inappropriate for complex, e.g., multimodal distributed variables and may negatively affect the results of cluster analysis. Specifically, the squaring function in the definition of the Euclidean distance as the square root of the sum of squared differences between data points has the consequence that the value 1 implicitly defines a limit for distances within clusters versus distances between (inter-) clusters. METHODS: The Euclidean distances within a standard normal distribution (N(0,1)) follow a N(0,[Formula: see text]) distribution. The EDO-transformation of a variable X is proposed as [Formula: see text] following modeling of the standard deviation s by a mixture of Gaussians and selecting the dominant modes via item categorization. The method was compared in artificial and biomedical datasets with clustering of untransformed data, z-transformed data, and the recently proposed pooled variable scaling. RESULTS: A simulation study and applications to known real data examples showed that the proposed EDO scaling method is generally useful. The clustering results in terms of cluster accuracy, adjusted Rand index and Dunn's index outperformed the classical alternatives. Finally, the EDO transformation was applied to cluster a high-dimensional genomic dataset consisting of gene expression data for multiple samples of breast cancer tissues, and the proposed approach gave better results than classical methods and was compared with pooled variable scaling. CONCLUSIONS: For multivariate procedures of data analysis, it is proposed to use the EDO transformation as a better alternative to the established z-standardization, especially for nontrivially distributed data. The "EDOtrans" R package is available at https://cran.r-project.org/package=EDOtrans .
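The scale dependence of the Euclidean metric mentioned in the background can be illustrated with a minimal sketch. It does not implement the EDO transformation, whose formula is not reproduced in this record; it only shows, on hypothetical data (the variables `height_m` and `weight_kg` are assumptions made for illustration), that changing the unit of one variable changes which case is the nearest neighbour, and that the classical z-standardization removes this dependence.

```python
import math
import statistics


def euclidean(a, b):
    """Euclidean distance: square root of the sum of squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def z_standardize(column):
    """Classical z-transformation: subtract the mean, divide by the standard deviation."""
    mean = statistics.fmean(column)
    sd = statistics.stdev(column)
    return [(v - mean) / sd for v in column]


def pairwise(col1, col2):
    """Distances between all pairs of cases described by two variables."""
    rows = list(zip(col1, col2))
    n = len(rows)
    return {(i, j): euclidean(rows[i], rows[j])
            for i in range(n) for j in range(i + 1, n)}


# Hypothetical data: three cases, two variables.
height_m = [1.60, 1.80, 1.70]
weight_kg = [60.0, 62.0, 70.0]
height_cm = [h * 100 for h in height_m]

raw_m = pairwise(height_m, weight_kg)      # height in metres
raw_cm = pairwise(height_cm, weight_kg)    # same data, height in centimetres

# After z-standardization the unit of measurement no longer matters.
scaled_m = pairwise(z_standardize(height_m), z_standardize(weight_kg))
scaled_cm = pairwise(z_standardize(height_cm), z_standardize(weight_kg))

for pair in raw_m:
    print(pair, round(raw_m[pair], 2), round(raw_cm[pair], 2),
          round(scaled_m[pair], 2), round(scaled_cm[pair], 2))
```

With height in metres, case 0 lies closest to case 1; with height in centimetres it lies closest to case 2, although the data are the same.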


Subjects
Algorithms , Computational Biology , Cluster Analysis , Genomics , Normal Distribution
3.
Int J Mol Sci ; 23(22)2022 Nov 15.
Article in English | MEDLINE | ID: mdl-36430580

ABSTRACT

Bayesian inference is ubiquitous in science and widely used in biomedical research such as cell sorting or "omics" approaches, as well as in machine learning (ML), artificial neural networks, and "big data" applications. However, the calculation is not robust in regions of low evidence. In cases where one group has a lower mean but a higher variance than another group, new cases with larger values are implausibly assigned to the group with typically smaller values. An approach for a robust extension of Bayesian inference is proposed that proceeds in two main steps starting from the Bayesian posterior probabilities. First, cases with low evidence are labeled as "uncertain" class membership. The boundary for low probabilities of class assignment (threshold ε) is calculated using a computed ABC analysis as a data-based technique for item categorization. This leaves a number of cases with uncertain classification (p < ε). Second, cases with uncertain class membership are relabeled based on the distance to neighboring classified cases based on Voronoi cells. The approach is demonstrated on biomedical data typically analyzed with Bayesian statistics, such as flow cytometric data sets or biomarkers used in medical diagnostics, where it increased the class assignment accuracy by 1−10% depending on the data set. The proposed extension of the Bayesian inference of class membership can be used to obtain robust and plausible class assignments even for data at the extremes of the distribution and/or for which evidence is weak.
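The implausible assignment described above can be reproduced with a short sketch. The two class distributions and the equal priors are hypothetical values chosen for illustration, and the code shows only the problem in regions of low evidence; it does not implement the proposed extension (the ABC-analysis threshold ε and the Voronoi-based relabeling).

```python
from statistics import NormalDist

# Hypothetical classes: A has the lower mean but the higher variance.
classes = {
    "A": NormalDist(mu=0.0, sigma=3.0),
    "B": NormalDist(mu=5.0, sigma=1.0),
}
priors = {"A": 0.5, "B": 0.5}


def posterior(x):
    """Return the Bayesian posterior of each class and the evidence p(x)."""
    weighted = {name: priors[name] * dist.pdf(x) for name, dist in classes.items()}
    evidence = sum(weighted.values())
    return {name: w / evidence for name, w in weighted.items()}, evidence


for x in (5.0, 12.0):
    post, evidence = posterior(x)
    winner = max(post, key=post.get)
    print(f"x={x}: assigned to {winner}, evidence={evidence:.2e}")
```

A value of 5 is assigned to class B as expected, but a value of 12, far above both means, is assigned to class A because the wider distribution dominates in the tail, where the evidence is very small.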


Subjects
Big Data , Biomedical Research , Bayes Theorem , Probability , Uncertainty
4.
Bioinformatics ; 35(14): 2362-2370, 2019 07 15.
Article in English | MEDLINE | ID: mdl-30500872

ABSTRACT

MOTIVATION: The genetic architecture of diseases is becoming increasingly well known. This raises difficulties in picking suitable targets for further research among an increasing number of candidates. Although expression-based methods of gene set reduction are applied to laboratory-derived genetic data, the analysis of topical sets of genes gathered from knowledge bases requires a modified approach, as no quantitative information about gene expression is available. RESULTS: We propose a computational functional genomics-based approach to reducing sets of genes to the most relevant items based on the importance of the gene within the polyhierarchy of biological processes characterizing the disease. Knowledge bases about the biological roles of genes can provide a valid description of traits or diseases represented as a directed acyclic graph (DAG) picturing the polyhierarchy of disease-relevant biological processes. The proposed method uses a gene importance score derived from the location of the gene-related biological processes in the DAG. It attempts to recreate the DAG, and thereby the roles of the original gene set, with the least number of genes in descending order of importance. This achieved precision and recall of over 70% in recreating the components of the DAG characterizing the biological functions of n=540 genes relevant to pain with a subset of only the k=29 best-scoring genes. CONCLUSIONS: A new method for reduction of gene sets is shown that is able to reproduce over 70% of the biological processes in which the full gene set is involved while using only ∼5% of the original genes. AVAILABILITY AND IMPLEMENTATION: The necessary numerical parameters for the calculation of gene importance are implemented in the R package dbtORA at https://github.com/IME-TMP-FFM/dbtORA. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Genomics , Knowledge Bases , Computational Biology , Gene Expression , Software
5.
Eur J Anaesthesiol ; 37(3): 235-246, 2020 Mar.
Article in English | MEDLINE | ID: mdl-32028289

ABSTRACT

BACKGROUND: Persistent pain extending beyond 6 months after breast cancer surgery when adjuvant therapies have ended is a recognised phenomenon. The evolution of postsurgery pain is therefore of interest for future patient management in terms of possible prognoses for distinct groups of patients to enable better patient information. OBJECTIVE(S): An analysis aimed to identify subgroups of patients who share similar time courses of postoperative persistent pain. DESIGN: Prospective cohort study. SETTING: Helsinki University Hospital, Finland, between 2006 and 2010. PATIENTS: A total of 763 women treated for breast cancer at the Helsinki University Hospital. INTERVENTIONS: Employing a data science approach in a nonredundant reanalysis of data published previously, pain ratings acquired at 6, 12, 24 and 36 months after breast cancer surgery were analysed for a group structure of the temporal courses of pain. Unsupervised automated evolutionary (genetic) algorithms were used for patient cluster detection in the pain ratings and for Gaussian mixture modelling of the slopes of the linear relationship between pain ratings and acquisition times. MAIN OUTCOME MEASURES: Clusters or groups of patients sharing patterns in the time courses of pain between 6 and 36 months after breast cancer surgery. RESULTS: Three groups of patients with distinct time courses of pain were identified as the best solutions for both clustering of the pain ratings and multimodal modelling of the slopes of their temporal trends. In two clusters/groups, pain decreased or remained stable, and the two approaches suggested/identified similar subgroups representing 80/763 and 86/763 of the patients, respectively, in whom rather high pain levels tended to further increase over time. CONCLUSION: In the majority of patients, pain after breast cancer surgery decreased rapidly and disappeared or the intensity decreased over 3 years. However, in about a tenth of patients, moderate-to-severe pain tended to increase during the 3-year follow-up.
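The per-patient slope of the linear relationship between pain ratings and acquisition times, which the study then models as a Gaussian mixture, can be sketched as an ordinary least-squares slope. The ratings below are hypothetical and the helper name `least_squares_slope` is an assumption; the evolutionary algorithms used in the study are not reproduced here.

```python
def least_squares_slope(times, ratings):
    """Slope of the ordinary least-squares line of ratings against times."""
    n = len(times)
    mean_t = sum(times) / n
    mean_r = sum(ratings) / n
    sxy = sum((t - mean_t) * (r - mean_r) for t, r in zip(times, ratings))
    sxx = sum((t - mean_t) ** 2 for t in times)
    return sxy / sxx


# Acquisition times stated in the abstract (months after surgery).
MONTHS = [6, 12, 24, 36]

# Hypothetical pain ratings for three patients.
patients = {
    "decreasing": [6, 5, 3, 1],
    "stable": [2, 2, 2, 2],
    "increasing": [4, 5, 6, 7],
}

slopes = {name: least_squares_slope(MONTHS, ratings)
          for name, ratings in patients.items()}

for name, slope in slopes.items():
    print(f"{name}: {slope:+.3f} rating points per month")
```

A negative slope corresponds to pain that declines over the follow-up, a slope near zero to stable pain, and a positive slope to the increasing time course seen in about a tenth of patients.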


Subjects
Breast Neoplasms , Breast Neoplasms/surgery , Data Science , Female , Humans , Mastectomy , Postoperative Pain/diagnosis , Postoperative Pain/epidemiology , Postoperative Pain/etiology , Prospective Studies
6.
Haematologica ; 104(2): 277-287, 2019 02.
Article in English | MEDLINE | ID: mdl-30190345

ABSTRACT

Differential induction therapy of all subtypes of acute myeloid leukemia other than acute promyelocytic leukemia is impeded by the long time required to complete complex and diverse cytogenetic and molecular genetic analyses for risk stratification or targeted treatment decisions. Here, we describe a reliable, rapid and sensitive diagnostic approach that combines karyotyping and mutational screening in a single, integrated, next-generation sequencing assay. Numerical karyotyping was performed by low coverage whole genome sequencing followed by copy number variation analysis using a novel algorithm based on in silico-generated reference karyotypes. Translocations and DNA variants were examined by targeted resequencing of fusion transcripts and mutational hotspot regions using commercially available kits and analysis pipelines. For the identification of FLT3 internal tandem duplications and KMT2A partial tandem duplications, we adapted previously described tools. In a validation cohort including 22 primary patients' samples, 9/9 numerically normal karyotypes were classified correctly and 30/31 (97%) copy number variations reported by classical cytogenetics and fluorescence in situ hybridization analysis were uncovered by our next-generation sequencing karyotyping approach. Predesigned fusion and mutation panels were validated exemplarily on leukemia cell lines and a subset of patients' samples and identified all expected genomic alterations. Finally, blinded analysis of eight additional patients' samples using our comprehensive assay accurately reproduced reference results. Therefore, calculated karyotyping by low coverage whole genome sequencing enables fast and reliable detection of numerical chromosomal changes and, in combination with panel-based fusion and mutation screening, will greatly facilitate implementation of subtype-specific induction therapies in acute myeloid leukemia.


Subjects
Tumor Biomarkers , Genetic Association Studies , Genetic Predisposition to Disease , Acute Myeloid Leukemia/diagnosis , Acute Myeloid Leukemia/genetics , Adult , Aged , Aged 80 and over , Algorithms , Chromosome Aberrations , Computational Biology/methods , DNA Copy Number Variations , Female , Genetic Association Studies/methods , High-Throughput Nucleotide Sequencing , Humans , Karyotyping , Male , Middle Aged , Young Adult
7.
Int J Mol Sci ; 21(1)2019 Dec 20.
Article in English | MEDLINE | ID: mdl-31861946

ABSTRACT

Advances in flow cytometry enable the acquisition of large and high-dimensional data sets per patient. Novel computational techniques allow the visualization of structures in these data and, finally, the identification of relevant subgroups. Correct data visualizations and projections from the high-dimensional space to the visualization plane require the correct representation of the structures in the data. This work shows that frequently used techniques are unreliable in this respect. One of the most important methods for data projection in this area is the t-distributed stochastic neighbor embedding (t-SNE). We analyzed its performance on artificial and real biomedical data sets. t-SNE introduced a cluster structure for homogeneously distributed data that did not contain any subgroup structure. In other data sets, t-SNE occasionally suggested the wrong number of subgroups or projected data points belonging to different subgroups as if they belonged to the same subgroup. As an alternative approach, emergent self-organizing maps (ESOM) were used in combination with U-matrix methods. This approach allowed the correct identification of homogeneous data, while in sets containing distance- or density-based subgroup structures, the number of subgroups and data point assignments were correctly displayed. The results highlight possible pitfalls in the use of a currently widely applied algorithmic technique for the detection of subgroups in high-dimensional cytometric data and suggest a robust alternative.


Subjects
Computational Biology/methods , Flow Cytometry/methods , Machine Learning , Algorithms , CD Antigens/analysis , Datasets as Topic , Humans , Stochastic Processes
8.
Breast Cancer Res Treat ; 171(2): 399-411, 2018 Sep.
Article in English | MEDLINE | ID: mdl-29876695

ABSTRACT

BACKGROUND: Prevention of persistent pain following breast cancer surgery, via early identification of patients at high risk, is a clinical need. Supervised machine-learning was used to identify parameters that predict persistence of significant pain. METHODS: Over 500 demographic, clinical and psychological parameters were acquired up to 6 months after surgery from 1,000 women (aged 28-75 years) who were treated for breast cancer. Pain was assessed using an 11-point numerical rating scale before surgery and at months 1, 6, 12, 24, and 36. The ratings at months 12, 24, and 36 were used to allocate patients to either "persisting pain" or "non-persisting pain" groups. Supervised machine learning was applied to map the parameters to these diagnoses. RESULTS: A symbolic rule-based classifier tool was created that comprised 21 single or aggregated parameters, including demographic features, psychological and pain-related parameters, forming a questionnaire with "yes/no" items (decision rules). If at least 10 of the 21 rules applied, persisting pain was predicted at a cross-validated accuracy of 86% and a negative predictive value of approximately 95%. CONCLUSIONS: The present machine-learned analysis showed that, even with a large set of parameters acquired from a large cohort, early identification of these patients is only partly successful. This indicates that more parameters are needed for accurate prediction of persisting pain. However, with the current parameters it is possible, with a certainty of almost 95%, to exclude the possibility of persistent pain developing in a woman being treated for breast cancer.
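The decision logic of the questionnaire ("at least 10 of the 21 rules") and the negative predictive value can be sketched as follows. The 21 rules themselves are not listed in this record, so the yes/no answers are hypothetical, and the confusion-matrix counts are invented only to show how a negative predictive value of 95% is computed.

```python
N_RULES = 21      # number of yes/no decision rules stated in the abstract
THRESHOLD = 10    # persisting pain is predicted if at least this many rules apply


def predict_persisting_pain(answers):
    """Return True if at least THRESHOLD of the N_RULES yes/no items apply."""
    if len(answers) != N_RULES:
        raise ValueError(f"expected {N_RULES} answers, got {len(answers)}")
    return sum(bool(a) for a in answers) >= THRESHOLD


def negative_predictive_value(true_negatives, false_negatives):
    """Share of negative predictions that are truly negative."""
    return true_negatives / (true_negatives + false_negatives)


# Hypothetical questionnaires: 10 and 9 rules answered with "yes".
ten_yes = [True] * 10 + [False] * 11
nine_yes = [True] * 9 + [False] * 12

print(predict_persisting_pain(ten_yes))    # True
print(predict_persisting_pain(nine_yes))   # False

# Hypothetical counts: 95 correct and 5 incorrect "no persisting pain" predictions.
print(negative_predictive_value(95, 5))    # 0.95
```

A high negative predictive value means that a negative questionnaire result is a reliable basis for ruling out persisting pain, which is the use the conclusions emphasize.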


Subjects
Breast Neoplasms/complications , Machine Learning , Postoperative Pain/diagnosis , Postoperative Pain/etiology , Adult , Aged , Breast Neoplasms/surgery , Female , Follow-Up Studies , Humans , Mastectomy , Middle Aged , Postoperative Pain/prevention & control , Prognosis , Reproducibility of Results , Risk Factors , Supervised Machine Learning , Time Factors
9.
J Biomed Inform ; 66: 95-104, 2017 02.
Article in English | MEDLINE | ID: mdl-28040499

ABSTRACT

BACKGROUND: High-dimensional biomedical data are frequently clustered to identify subgroup structures pointing at distinct disease subtypes. It is crucial that the cluster algorithm used works correctly. However, by imposing a predefined shape on the clusters, classical algorithms occasionally suggest a cluster structure in homogeneously distributed data or assign data points to incorrect clusters. We analyzed whether this can be avoided by using emergent self-organizing feature maps (ESOM). METHODS: Data sets with different degrees of complexity were submitted to ESOM analysis with large numbers of neurons, using an interactive R-based bioinformatics tool. On top of the trained ESOM, the distance structure in the high-dimensional feature space was visualized in the form of a so-called U-matrix. Clustering results were compared with those provided by classical common cluster algorithms including single linkage, Ward and k-means. RESULTS: Ward clustering imposed cluster structures on cluster-less "golf ball", "cuboid" and "S-shaped" data sets that contained no structure at all (random data). Ward clustering also imposed structures on permuted real-world data sets. By contrast, the ESOM/U-matrix approach correctly found that these data contain no cluster structure. At the same time, ESOM/U-matrix was correct in identifying clusters in biomedical data truly containing subgroups. It was always correct in cluster structure identification in further canonical artificial data. Using intentionally simple data sets, it is shown that popular clustering algorithms typically used for biomedical data sets may fail to cluster data correctly, suggesting that they are also likely to perform erroneously on high-dimensional biomedical data. CONCLUSIONS: The present analyses emphasized that generally established classical hierarchical clustering algorithms carry a considerable tendency to produce erroneous results. By contrast, unsupervised machine-learned analysis of cluster structures, applied using the ESOM/U-matrix method, is a viable, unbiased method to identify true clusters in the high-dimensional space of complex data.


Subjects
Algorithms , Cluster Analysis , Machine Learning , Computational Biology
10.
Int J Mol Sci ; 18(6)2017 Jun 07.
Article in English | MEDLINE | ID: mdl-28590455

ABSTRACT

Lipid metabolism has been suggested to be a major pathophysiological mechanism of multiple sclerosis (MS). With the increasing knowledge about lipid signaling, acquired data become increasingly complex, making bioinformatics necessary in lipid research. We used unsupervised machine-learning to analyze lipid marker serum concentrations, pursuing the hypothesis that for the most relevant markers the emerging data structures will coincide with the diagnosis of MS. Machine learning was implemented as emergent self-organizing feature maps (ESOM) combined with the U*-matrix visualization technique. The data space consisted of serum concentrations of three main classes of lipid markers comprising eicosanoids (d = 11 markers), ceramides (d = 10), and lysophosphatidic acids (d = 6). They were analyzed in cohorts of MS patients (n = 102) and healthy subjects (n = 301). Clear data structures in the high-dimensional data space were observed in eicosanoid and ceramide serum concentrations, whereas no clear structure could be found in lysophosphatidic acid concentrations. With ceramide concentrations, the structures that had emerged from unsupervised machine-learning almost completely overlapped with the known grouping of MS patients versus healthy subjects. This was only partly provided by eicosanoid serum concentrations. Thus, unsupervised machine-learning identified distinct data structures of bioactive lipid serum concentrations. These structures could be superimposed with the known grouping of MS patients versus healthy subjects, which was almost completely possible with ceramides. Therefore, based on the present analysis, ceramides are first-line candidates for further exploration as druggable targets or biomarkers in MS.


Subjects
Lipid Metabolism , Lipids/blood , Machine Learning , Multiple Sclerosis/blood , Biomarkers , Case-Control Studies , Ceramides/blood , Eicosanoids/blood , Humans , Informatics/methods , Lysophospholipids/blood
11.
Chem Senses ; 41(9): 763-770, 2016 Nov 01.
Article in English | MEDLINE | ID: mdl-27566724

ABSTRACT

In the clinical diagnosis of olfactory function, the 2 quantitative extremes of lost or normal olfactory function are the focus, while no particular attention is directed at the interval between the 2 main diagnoses of "anosmia" and "normosmia". We analyzed the modal distribution of olfactory scores with the intention to describe a complex human olfactory pathology in a unifying model. In a cross-sectional retrospective study, olfactory performance scores acquired from 10714 individuals by means of a clinically established psychophysical test were analyzed with respect to their modal distribution by fitting a Gaussian mixture model (GMM) to the data. The probability distribution of all olfactory scores was found to be multimodal. It could be described as a mixture of 6 Gaussian distributions at a high statistical significance level of P < 10⁻⁵. Moreover, 9 different pathologies associated with the olfactory dysfunction could be shown to be reflected in 1-3 distinct Gaussians. This provides the possibility to assign distinct degrees of olfactory acuity to each etiology. Results indicate that human olfactory pathology is composed of clearly distinct subpathologies that can be connected with underlying subetiologies. We present a unifying data science-based model that satisfies the human olfactory pathology observed in 10714 subjects. The analysis of the distribution of their olfactory performance scores suggests a complex but very distinct human olfactory pathology. This implies a distinction of the olfactory diagnosis of hyposmia from those of anosmia or normosmia.

12.
Chem Senses ; 41(4): 339-44, 2016 05.
Article in English | MEDLINE | ID: mdl-26857742

ABSTRACT

The establishment of normal olfactory function by means of a simple and reliable test is one method that could minimize olfactory test procedures in the clinic. This retrospective study analyzed the identification of 16 odors by 613 subjects (aged 18-96 years, 266 men) as a part of a complex olfactory test battery by which 183, 251, and 179 subjects were diagnosed with anosmia, hyposmia, or normosmia, respectively. Cinnamon was identified as the best scoring odor, that is, identified correctly by most normosmic subjects, but identified correctly by the fewest anosmic patients. An exact calculation of the optimum number of items needed for a diagnosis of normosmia resulted in 1 single odor identification item as being sufficient. The inclusion of more items is solely determined by the acceptable proportion of chance, which in a 4-alternative forced choice paradigm is only 1.6% with 3 odors. A proposed screening test using cinnamon, fish odor, and banana established normosmia at a sensitivity of 80.4%, a specificity of 84.3%, and a negative predictive value of 91.3%. A positive test result reliably establishes normosmia, providing a confidence basis to terminate olfactory assessments following the application of only 3 odor identification items.
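The chance proportion quoted for the 3-odor test follows from simple arithmetic: in a 4-alternative forced choice, a subject guessing at random names each odor correctly with probability 1/4, so all n odors are named correctly by chance with probability (1/4)^n. A minimal sketch of that calculation, in which the function name is an assumption:

```python
def chance_all_correct(n_items, n_alternatives=4):
    """Probability of identifying all items correctly by pure guessing."""
    return (1 / n_alternatives) ** n_items


for n in (1, 2, 3):
    print(f"{n} odor(s): {chance_all_correct(n) * 100:.4f}% by chance")
# 1 odor(s): 25.0000% by chance
# 2 odor(s): 6.2500% by chance
# 3 odor(s): 1.5625% by chance
```

The value for 3 odors, 1.5625%, rounds to the 1.6% stated in the abstract.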


Subjects
Odorants/analysis , Olfaction Disorders/diagnosis , Adolescent , Adult , Aged , Aged 80 and over , Case-Control Studies , Female , Humans , Male , Middle Aged , Olfaction Disorders/prevention & control , Sensory Thresholds/physiology , Young Adult
13.
Eur J Clin Pharmacol ; 72(12): 1449-1461, 2016 Dec.
Article in English | MEDLINE | ID: mdl-27695919

ABSTRACT

OBJECTIVE: The public accessibility of "big data" about the molecular targets of drugs and the biological functions of genes allows novel data science-based approaches to pharmacology that link drugs directly with their effects on pathophysiologic processes. This provides a phenotypic path to drug discovery and repurposing. This paper compares the performance of a functional genomics-based criterion to the traditional drug target-based classification. METHODS: Knowledge discovery in the DrugBank and Gene Ontology databases allowed the construction of a "drug target versus biological process" matrix as a combination of "drug versus genes" and "genes versus biological processes" matrices. As a canonical example, such matrices were constructed for classical analgesic drugs. These matrices were projected onto a toroid grid of 50 × 82 artificial neurons using a self-organizing map (SOM). The distance and cluster structure of the high-dimensional feature space of the matrices was visualized on top of this SOM using a U-matrix. RESULTS: The cluster structure emerging on the U-matrix provided a correct classification of the analgesics into two main classes of opioid and non-opioid analgesics. The classification was flawless with both the functional genomics and the traditional target-based criterion. The functional genomics approach inherently included the drugs' modulatory effects on biological processes. The main pharmacological actions known from pharmacological science were captured, e.g., actions on lipid signaling for non-opioid analgesics that comprised many NSAIDs and actions on neuronal signal transmission for opioid analgesics. CONCLUSIONS: Using machine-learned techniques for computational drug classification in a comparative assessment, a functional genomics-based criterion was found to be similarly suitable for drug classification as the traditional target-based criterion. This supports the utility of functional genomics-based approaches to computational system pharmacology for drug discovery and repurposing.
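How a "drug versus biological process" matrix can arise as a combination of the two source matrices may be sketched as a Boolean matrix product; this is one plausible reading of "combination" and not necessarily the authors' exact construction. The tiny matrices are illustrative only: the gene-to-process assignments are assumptions made for this example, not entries queried from DrugBank or Gene Ontology. The second helper shows what a toroid grid means for distances between neurons on the 50 × 82 map.

```python
import math

drugs = ["morphine", "ibuprofen"]
genes = ["OPRM1", "PTGS1", "PTGS2"]
processes = ["neuronal signal transmission", "lipid signaling"]

# Rows: drugs, columns: genes (1 = the drug targets the gene product).
drug_vs_gene = [
    [1, 0, 0],
    [0, 1, 1],
]

# Rows: genes, columns: processes (illustrative assignments only).
gene_vs_process = [
    [1, 0],
    [0, 1],
    [0, 1],
]


def boolean_product(a, b):
    """Entry (i, j) is 1 if any gene links drug i to process j."""
    return [[int(any(a[i][k] and b[k][j] for k in range(len(b))))
             for j in range(len(b[0]))]
            for i in range(len(a))]


drug_vs_process = boolean_product(drug_vs_gene, gene_vs_process)

ROWS, COLS = 50, 82  # grid size stated in the abstract


def toroid_distance(p, q):
    """Euclidean grid distance when opposite borders of the map are joined."""
    d_row = abs(p[0] - q[0])
    d_col = abs(p[1] - q[1])
    d_row = min(d_row, ROWS - d_row)
    d_col = min(d_col, COLS - d_col)
    return math.hypot(d_row, d_col)


print(drug_vs_process)                      # [[1, 0], [0, 1]]
print(toroid_distance((0, 0), (49, 81)))    # opposite corners are neighbours
```

On a toroid grid the neurons at opposite corners are direct diagonal neighbours, so the map has no border at which the projection could be distorted.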


Subjects
Analgesics/classification , Genomics , Pharmaceutical Preparations/classification , Factual Databases , Drug Discovery , Gene Ontology , Humans , Machine Learning
14.
Hum Genet ; 134(11-12): 1221-38, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26385553

ABSTRACT

Micro-ribonucleic acids (miRNAs) play a role in pain, based on studies on models of neuropathic or inflammatory pain and clinical evidence. The present analysis made extensive use of computational biology, knowledge discovery methods, publicly available databases and data mining tools to merge results from genetic and miRNA research into an analysis of the systems biological roles of miRNAs in pain. We identified that about one-third of miRNAs detected through nociceptive research have been associated with a mere 18 regulated genes. Substituting the missing genetic information by computational data mining and based on comprehensive current empirical evidence of gene versus miRNA interactions, we have identified a total of 130 pain genes as being probably regulated by a total of 167 different miRNAs. Particularly pain-relevant roles of miRNAs include the control of gene expression at any level and regulation of interleukin-6-related pain entities. Among the miRNAs regulating pain genes are seven that are brain specific, hinting at their therapeutic utility for modulating central nervous mechanisms of pain.


Subjects
Genomics/methods , MicroRNAs/genetics , Pain/genetics , Data Mining , Genetic Databases , Genetic Epistasis , Gene Expression Regulation , Gene Regulatory Networks , Humans , Molecular Sequence Annotation
15.
Eur J Clin Pharmacol ; 71(4): 461-71, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25666029

ABSTRACT

BACKGROUND: Drug effects on the human sense of smell attract increasing interest, yet systematic evidence from controlled studies is sparse. The present cross-sectional approach to olfactory drug effects made use of the recent developments in informatics, knowledge discovery, and data mining allowing connecting drug-related information from humans with underlying molecular drug targets. METHODS: In this prospective cross-sectional study, n = 1008 outpatients at a general practitioner were enrolled. All currently taken medications were obtained, and olfactory function was assessed by means of a clinically established 12-item odor identification test. The association between the patients' sense of smell and the administered medications was based (i) on the active pharmacological substances and (ii) on the molecular targets queried from the publicly accessible DrugBank database. RESULTS: Of the 168 different substances, six were taken sufficiently often to be analyzed. The administration of levothyroxine was associated with a higher olfactory score (p = 0.033). For the 168 drugs, 323 different targets could be queried. Thirty-one gene products were addressed sufficiently often to be analyzed. Besides agonistic targeting of thyroid hormone receptors (genes THRA1, THRB1) agreeing with the above result, antagonistically targeting the adrenoceptor alpha 1A (gene ADRA1A) by several unrelated medications was associated with a significantly higher olfactory score (p = 0.012). CONCLUSIONS: The identified drug effects on olfaction are both biologically plausible based on supportive information from basic science studies. The novel molecular target-based approach suggested clear advantages over the classical drug or drug class-based approach. It increased the analyzable data volume fivefold and provided plausible hypotheses about mechanistic drug effects opening possibilities for drug discovery and repurposing.


Subjects
Odorants/analysis , Smell/drug effects , Adolescent , Adrenergic alpha-1 Receptor Agonists/therapeutic use , Adult , Aged , Aged 80 and over , Cross-Sectional Studies , Drug Delivery Systems/methods , Drug Discovery/methods , Female , Humans , Male , Middle Aged , Prospective Studies , Adrenergic alpha-1 Receptors/metabolism , Thyroid Hormone Receptors/metabolism , Thyroxine/therapeutic use , Young Adult
16.
Int J Mol Sci ; 16(10): 25897-911, 2015 Oct 28.
Article in English | MEDLINE | ID: mdl-26516852

ABSTRACT

Biomedical data obtained during cell experiments, laboratory animal research, or human studies often display a complex distribution. Statistical identification of subgroups in research data poses an analytical challenge. Here we introduce an interactive R-based bioinformatics tool, called "AdaptGauss". It enables a valid identification of a biologically meaningful multimodal structure in the data by fitting a Gaussian mixture model (GMM) to the data. The interface allows a supervised selection of the number of subgroups. This enables the expectation maximization (EM) algorithm to adapt more complex GMMs than usually observed with a noninteractive approach. Interactively fitting a GMM to heat pain threshold data acquired from human volunteers revealed a distribution pattern with four Gaussian modes located at temperatures of 32.3, 37.2, 41.4, and 45.4 °C. Noninteractive fitting was unable to identify a meaningful data structure. Obtained results are compatible with known activity temperatures of different TRP ion channels, suggesting the mechanistic contribution of different heat sensors to the perception of thermal pain. Thus, sophisticated analysis of the modal structure of biomedical data provides a basis for the mechanistic interpretation of the observations. As it may reflect the involvement of different TRP thermosensory ion channels, the analysis provides a starting point for hypothesis-driven laboratory experiments.
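The expectation maximization step that underlies such a fit can be sketched in plain Python for a one-dimensional mixture. This is not the AdaptGauss implementation or its interface: the data are synthetic, the two modes at 37 and 45 are arbitrary values chosen for the example, and the user-chosen starting values stand in for the interactive selection of the number of subgroups described above.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a Gaussian with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def fit_gmm(data, means, sds, weights, iterations=200):
    """Plain EM for a one-dimensional Gaussian mixture with user-chosen start values."""
    means, sds, weights = list(means), list(sds), list(weights)
    k = len(means)
    for _ in range(iterations):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            dens = [weights[j] * normal_pdf(x, means[j], sds[j]) for j in range(k)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and standard deviation per component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            sds[j] = max(math.sqrt(var), 1e-6)  # guard against a collapsing component
    return means, sds, weights


# Synthetic bimodal sample (seeded so the run is reproducible).
rng = random.Random(42)
data = [rng.gauss(37.0, 1.0) for _ in range(300)] + \
       [rng.gauss(45.0, 1.0) for _ in range(300)]

# The analyst decides on two components and supplies rough starting values.
means, sds, weights = fit_gmm(data, means=[35.0, 47.0], sds=[2.0, 2.0],
                              weights=[0.5, 0.5])
print([round(m, 1) for m in means], [round(w, 2) for w in weights])
```

Because EM only refines the starting configuration it is given, the choice of the number of components and their initial positions matters, which is why an interactive selection can reach mixtures that an automatic start does not.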


Subjects
Hot Temperature , Nociceptive Pain/metabolism , Pain Threshold , Thermosensing , Adolescent , Adult , Algorithms , Female , Humans , Male , Models, Neurological , Nociceptive Pain/physiopathology , Transient Receptor Potential Channels/metabolism
17.
BMC Genomics ; 15: 976, 2014 Nov 18.
Article in English | MEDLINE | ID: mdl-25404408

ABSTRACT

BACKGROUND: Micro-RNAs (miRNAs) are attributed the systems-biological role of a regulatory mechanism for the expression of protein-coding genes. Research has identified miRNA dysregulations in several distinct pathophysiological processes, which hints at distinct systems-biology functions of miRNAs. The present analysis approached the role of miRNAs from a genomics perspective and assessed the biological roles of 2954 genes and 788 human miRNAs that can be considered to interact, based on empirical evidence and computational predictions of miRNA versus gene interactions. RESULTS: From a genomics perspective, the biological processes involving the genes that are influenced by miRNAs comprise six major topics: biological regulation, cellular metabolism, information processing, development, gene expression and tissue homeostasis. The use of this knowledge as guidance for further research is sketched for two genetically defined functional areas: cell death and gene expression. Results suggest that the latter points to a fundamental role of miRNAs consisting of hyper-regulation of gene expression, i.e., the control of the expression of genes that themselves specifically control the expression of genes. CONCLUSIONS: Laboratory research identified contributions of miRNA regulation to several distinct biological processes. The present analysis transferred this knowledge to a systems-biology level. A comprehensible and precise description could be made of the biological processes in which the genes influenced by miRNAs are notably involved. This knowledge can be employed to guide future research concerning the biological role of miRNA (dys-)regulations. The analysis also suggests that miRNAs especially control the expression of genes that control the expression of genes.


Subjects
Gene Expression Regulation/genetics , MicroRNAs/genetics , Transcription, Genetic , Artificial Intelligence , Computational Biology , Genome, Human , Humans , Systems Biology
18.
Br J Clin Pharmacol ; 78(5): 961-9, 2014 Nov.
Article in English | MEDLINE | ID: mdl-24802974

ABSTRACT

AIMS: Olfactory loss impairs the patient's quality of life. In individualized therapies, olfactory drug effects gain clinical importance. Molecular evidence suggests that among drugs with potential olfactory effects is Δ(9)-tetrahydrocannabinol (THC), which is approved for several indications, including neuropathic pain or analgesia in cancer patients. The present study aimed at assessing the olfactory effects of THC to be expected during analgesic treatment. METHODS: The effects of 20 mg oral THC on olfaction were assessed in a placebo-controlled, randomized cross-over study in healthy volunteers. Using an established olfactory test (Sniffin' Sticks), olfactory thresholds, odour discrimination and odour identification were assessed in 15 subjects at baseline and 2 h after THC administration. RESULTS: Δ(9)-Tetrahydrocannabinol impaired the performance of subjects (n = 15) in the olfactory test. Specifically, olfactory thresholds were increased and odour discrimination performance was reduced. This resulted in a significant drop in the composite threshold, discrimination, identification (TDI) olfactory score by 5.5 points (from 37.7 ± 4.2 to 32.2 ± 5.6, 95% confidence interval for differences THC vs. placebo, -7.8 to -2.0, P = 0.003), which is known to be a subjectively perceptible impairment of olfactory function. CONCLUSIONS: Considering the resurgence of THC in medical use for several pathological conditions, the present results indicate that THC-based analgesics may be accompanied by subjectively noticeable reductions in olfactory acuity. In particular, for patients relying on their sense of smell, this might be relevant information for personalized therapy strategies.


Subjects
Analgesics, Non-Narcotic/adverse effects , Discrimination, Psychological/drug effects , Dronabinol/adverse effects , Healthy Volunteers , Olfactory Perception/drug effects , Administration, Oral , Adult , Analgesics, Non-Narcotic/administration & dosage , Analgesics, Non-Narcotic/pharmacology , Cross-Over Studies , Data Interpretation, Statistical , Double-Blind Method , Dronabinol/administration & dosage , Dronabinol/pharmacology , Female , Healthy Volunteers/psychology , Humans , Male , Odorants/analysis
19.
J Biomed Inform ; 46(5): 921-8, 2013 Oct.
Article in English | MEDLINE | ID: mdl-23896390

ABSTRACT

BACKGROUND: The association of genotyping information with common traits is not satisfactorily solved. One of the most complex traits is pain, and association studies have so far failed to provide reproducible predictions of pain phenotypes from genotypes in the general population despite a well-established genetic basis of pain. We therefore aimed at developing a method able to prospectively and highly accurately predict pain phenotype from the underlying genotype. METHODS: Complex phenotypes and genotypes were obtained from experimental pain data including four different pain stimuli and genotypes with respect to 30 reportedly pain-relevant variants in 10 genes. The training data set was obtained from 125 healthy volunteers and the independent prospective test data set was obtained from 89 subjects. The approach involved supervised machine learning. RESULTS: The phenotype-genotype association was reached in three major steps. First, the pain phenotype data were projected and clustered by means of emergent self-organizing map (ESOM) analysis and subsequent U-matrix visualization. Second, pain sub-phenotypes were identified by interpreting the cluster structure using classification and regression tree classifiers. Third, a supervised machine learning algorithm (Unweighted Label Rule generation) was applied to genetic markers reportedly modulating pain to obtain a complex genotype underlying the identified subgroups of subjects with homogeneous pain response. This procedure correctly identified 80% of the subjects as belonging to an extreme pain phenotype in an independently and prospectively assessed cohort. CONCLUSION: The developed methodology is a suitable basis for complex genotype-phenotype associations in pain. It may provide personalized treatments of complex traits. Due to its generality, this new method should also be applicable to association tasks other than pain.


Subjects
Artificial Intelligence , Genotype , Pain Management/methods , Phenotype , Humans
20.
Sci Rep ; 13(1): 5470, 2023 Apr 04.
Article in English | MEDLINE | ID: mdl-37016033

ABSTRACT

Selecting the k best features is a common task in machine learning. Typically, a few features have high importance, but many have low importance (right-skewed distribution). This report proposes a numerically precise method to address this skewed feature importance distribution in order to reduce a feature set to the informative minimum of items. Computed ABC analysis (cABC) is an item categorization method that aims to identify the most important items by partitioning a set of non-negative numerical items into subsets "A", "B", and "C" such that subset "A" contains the "few important" items based on specific properties of ABC curves defined by their relationship to Lorenz curves. In its recursive form, the cABC analysis can be applied again to subset "A". A generic image dataset and three biomedical datasets (lipidomics and two genomics datasets) with a large number of variables were used to perform the experiments. The experimental results show that the recursive cABC analysis limits the dimensions of the data projection to a minimum where the relevant information is still preserved and directs the feature selection in machine learning to the most important class-relevant information, including filtering feature sets for nonsense variables. Feature sets were reduced to 10% or less of the original variables and still provided accurate classification in data not used for feature selection. cABC analysis, in its recursive variant, provides a computationally precise means of reducing information to a minimum. The minimum is the result of a computation of the number of k most relevant items, rather than a decision to select the k best items from a list. In addition, there are precise criteria for stopping the reduction process. The reduction to the most important features can improve the human understanding of the properties of the data set. The cABC method is implemented in the Python package "cABCanalysis" available at https://pypi.org/project/cABCanalysis/.
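The idea of computing, rather than choosing, the size of subset "A" can be sketched in a few lines. The code below is a simplified illustration and does not use or reproduce the "cABCanalysis" package: it assumes that the boundary of subset "A" is the point of the ABC curve (cumulative share of the total plotted against the share of items, largest items first) that lies closest to the ideal point (0, 1). The limits of subsets "B" and "C" and the stopping criteria mentioned in the abstract are not modelled. The function names and the `depth` parameter are hypothetical.

```python
import math


def abc_curve(values):
    """Return the items sorted in descending order and the ABC curve points (effort, yield)."""
    if any(v < 0 for v in values):
        raise ValueError("ABC analysis expects non-negative items")
    ordered = sorted(values, reverse=True)
    total = sum(ordered)
    if total == 0:
        raise ValueError("at least one item must be positive")
    n = len(ordered)
    points, running = [], 0.0
    for i, v in enumerate(ordered, start=1):
        running += v
        points.append((i / n, running / total))  # (share of items, share of total)
    return ordered, points


def select_set_a(values):
    """Assumed rule: keep the items up to the curve point nearest to the ideal point (0, 1)."""
    ordered, points = abc_curve(values)
    best = min(range(len(points)),
               key=lambda i: math.hypot(points[i][0], 1.0 - points[i][1]))
    return ordered[:best + 1]


def recursive_set_a(values, depth):
    """Apply the selection repeatedly to the current subset 'A' (depth is a hypothetical knob)."""
    subset = list(values)
    for _ in range(depth):
        if len(subset) < 2:
            break
        subset = select_set_a(subset)
    return subset


# Illustrative importance scores: two dominant items and eight negligible ones.
importances = [100, 90, 1, 1, 1, 1, 1, 1, 1, 1]
print(select_set_a(importances))
print(recursive_set_a(importances, depth=2))
```

In this sketch the number of retained items falls out of the shape of the curve, which mirrors the abstract's point that k is computed and not preset; a fixed `depth` is only a stand-in for the stopping criteria of the published method.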
