Results 1 - 15 of 15
1.
Artif Intell Med ; 148: 102781, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38325926

ABSTRACT

The Concordance Index (C-index) is a commonly used metric in Survival Analysis for evaluating the performance of a prediction model. In this paper, we propose a decomposition of the C-index into a weighted harmonic mean of two quantities: one for ranking observed events versus other observed events, and the other for ranking observed events versus censored cases. This decomposition enables a finer-grained analysis of the relative strengths and weaknesses between different survival prediction methods. The usefulness of this decomposition is demonstrated through benchmark comparisons against classical models and state-of-the-art methods, together with the new variational generative neural-network-based method (SurVED) proposed in this paper. The performance of the models is assessed using four publicly available datasets with varying levels of censoring. Using the C-index decomposition and synthetic censoring, the analysis shows that deep learning models utilize the observed events more effectively than other models. This allows them to keep a stable C-index in different censoring levels. In contrast to such deep learning methods, classical machine learning models deteriorate when the censoring level decreases due to their inability to improve on ranking the events versus other events.
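As an illustration of the decomposition described in this abstract, the following minimal Python sketch computes the C-index separately over event-versus-event pairs and event-versus-censored pairs, then checks that a weighted harmonic mean of the two recovers the overall value. The function name `c_index_decomposition`, the weighting by each group's share of concordant pairs, the tie handling, and any input data are assumptions made for illustration; the abstract does not state the paper's exact formulation.

```python
def c_index_decomposition(times, events, risks):
    """Split the C-index into event-vs-event and event-vs-censored parts.

    times  : observed times (event or censoring time)
    events : 1 if the event was observed, 0 if censored
    risks  : predicted risk scores (higher = earlier expected event)
    Pairs with tied times are skipped (a simplifying assumption).
    """
    conc_ee = conc_ec = 0.0   # concordance credit per pair type
    n_ee = n_ec = 0           # number of comparable pairs per type
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue          # the earlier subject must have an observed event
        for j in range(n):
            if i == j or times[i] >= times[j]:
                continue
            if risks[i] > risks[j]:
                credit = 1.0
            elif risks[i] == risks[j]:
                credit = 0.5  # tied predictions get half credit
            else:
                credit = 0.0
            if events[j]:
                n_ee += 1
                conc_ee += credit
            else:
                n_ec += 1
                conc_ec += credit

    c_ee = conc_ee / n_ee
    c_ec = conc_ec / n_ec
    c_index = (conc_ee + conc_ec) / (n_ee + n_ec)
    # Assumed weight: share of concordance credit coming from event-event pairs.
    alpha = conc_ee / (conc_ee + conc_ec)
    harmonic_mean = 1.0 / (alpha / c_ee + (1.0 - alpha) / c_ec)
    return {"c_index": c_index, "c_ee": c_ee, "c_ec": c_ec,
            "alpha": alpha, "harmonic_mean": harmonic_mean}
```

With this choice of weight the identity holds exactly whenever both partial indices are non-zero; the sketch does not guard against empty pair sets or zero concordance.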


Subjects
Machine Learning; Neural Networks, Computer; Survival Analysis
2.
IEEE Trans Pattern Anal Mach Intell ; 46(7): 4763-4779, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38265905

ABSTRACT

Meta-learning empowers learning systems with the ability to acquire knowledge from multiple tasks, enabling faster adaptation and generalization to new tasks. This review provides a comprehensive technical overview of meta-learning, emphasizing its importance in real-world applications where data may be scarce or expensive to obtain. The article covers the state-of-the-art meta-learning approaches and explores the relationship between meta-learning and multi-task learning, transfer learning, domain adaptation and generalization, self-supervised learning, personalized federated learning, and continual learning. By highlighting the synergies between these topics and the field of meta-learning, the article demonstrates how advancements in one area can benefit the field as a whole, while avoiding unnecessary duplication of efforts. Additionally, the article delves into advanced meta-learning topics such as learning from complex multi-modal task distributions, unsupervised meta-learning, learning to efficiently adapt to data distribution shifts, and continual meta-learning. Lastly, the article highlights open problems and challenges for future research in the field. By synthesizing the latest research developments, this article provides a thorough understanding of meta-learning and its potential impact on various machine learning applications. We believe that this technical overview will contribute to the advancement of meta-learning and its practical implications in addressing real-world problems.

3.
Stud Health Technol Inform ; 302: 378-379, 2023 May 18.
Article in English | MEDLINE | ID: mdl-37203694

ABSTRACT

Synthetic data generation can be applied to Electronic Health Records (EHRs) to obtain synthetic versions that do not compromise patients' privacy. However, the proliferation of synthetic data generation techniques has led to the introduction of a wide variety of methods for evaluating the quality of generated data. This makes the task of evaluating generated data from different models challenging, as there is no consensus on the methods used; hence the need for standard ways of evaluating the generated data. In addition, the available methods do not assess whether dependencies between different variables are maintained in the synthetic data. Furthermore, synthetic time series EHRs (patient encounters) are not well investigated, as the available methods do not consider the temporality of patient encounters. In this work, we present an overview of evaluation methods and propose an evaluation framework to guide the evaluation of synthetic EHRs.
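One simple way to check whether dependencies between variables are preserved, offered here only as a hedged sketch and not as the framework this abstract proposes, is to compare pairwise Pearson correlations in the real and synthetic tables. The function names `pearson` and `max_correlation_gap` and the dict-of-columns data layout are assumptions for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def max_correlation_gap(real, synthetic):
    """Largest absolute difference in pairwise correlation.

    real, synthetic : dicts mapping the same column names to numeric lists.
    A value near 0 suggests linear dependencies are preserved.
    """
    columns = sorted(real)
    gap = 0.0
    for idx, first in enumerate(columns):
        for second in columns[idx + 1:]:
            r_real = pearson(real[first], real[second])
            r_syn = pearson(synthetic[first], synthetic[second])
            gap = max(gap, abs(r_real - r_syn))
    return gap
```

This only captures linear, pairwise, non-temporal structure, so it would not address the temporality of patient encounters that the abstract highlights.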


Subjects
Confidentiality; Electronic Health Records; Humans; Consensus
4.
Patterns (N Y) ; 3(10): 100600, 2022 Oct 14.
Article in English | MEDLINE | ID: mdl-36277818

ABSTRACT

Recent advances in artificial intelligence and deep machine learning have created a step change in how to measure human development indicators, in particular asset-based poverty. The combination of satellite imagery and deep machine learning now has the capability to estimate some types of poverty at a level close to what is achieved with traditional household surveys. An increasingly important issue beyond static estimations is whether this technology can contribute to scientific discovery and, consequently, new knowledge in the poverty and welfare domain. A foundation for achieving scientific insights is domain knowledge, which in turn translates into explainability and scientific consistency. We perform an integrative literature review focusing on three core elements relevant in this context (transparency, interpretability, and explainability) and investigate how they relate to the poverty, machine learning, and satellite imagery nexus. Our inclusion criteria are that papers cover poverty/wealth prediction, use survey data as the basis for the ground-truth poverty/wealth estimates, are applicable to both urban and rural settings, use satellite images as the basis for at least some of the inputs (features), and employ a method that includes deep neural networks. Our review of 32 papers shows that the status of the three core elements of explainable machine learning (transparency, interpretability, and domain knowledge) is varied and does not completely fulfill the requirements set up for scientific insights and discoveries. We argue that explainability is essential to support wider dissemination and acceptance of this research in the development community and that explainability means more than just interpretability.

5.
Sensors (Basel) ; 22(11), 2022 May 30.
Article in English | MEDLINE | ID: mdl-35684791

ABSTRACT

Machine Activity Recognition (MAR) can be used to monitor manufacturing processes and find bottlenecks and potential for improvement in production. Several interesting results on MAR techniques have been produced in the last decade, but mostly on construction equipment. Forklift trucks, which are ubiquitous and highly important industrial machines, have been missing from the MAR research. This paper presents a data-driven method for forklift activity recognition that uses Controller Area Network (CAN) signals and semi-supervised learning (SSL). The SSL enables the utilization of large quantities of unlabeled operation data to build better classifiers; after a two-step post-processing, the recognition results achieve balanced accuracy of 88% for driving activities and 95% for load-handling activities on a hold-out data set. In terms of the Matthews correlation coefficient for five activity classes, the final score is 0.82, which is equal to the recognition results of two non-domain experts who use videos of the activities. A particular success is that context can be used to capture the transport of small weight loads that are not detected by the forklift's built-in weight sensor.
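For readers unfamiliar with the two scores reported in this abstract, the sketch below shows how balanced accuracy and the multiclass Matthews correlation coefficient are commonly computed from true and predicted labels. It uses the standard multiclass generalization of the coefficient; whether the paper used exactly this form is an assumption, and the function names are hypothetical.

```python
import math
from collections import Counter

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so rare classes count as much as common ones."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return sum(hits[label] / totals[label] for label in totals) / len(totals)

def multiclass_mcc(y_true, y_pred):
    """Multiclass Matthews correlation coefficient.

    c   = number of correctly classified samples
    s   = total number of samples
    t_k = how often class k truly occurs
    p_k = how often class k is predicted
    """
    s = len(y_true)
    c = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    t_k = Counter(y_true)
    p_k = Counter(y_pred)
    classes = set(t_k) | set(p_k)
    numerator = c * s - sum(t_k[k] * p_k[k] for k in classes)
    var_true = s * s - sum(v * v for v in t_k.values())
    var_pred = s * s - sum(v * v for v in p_k.values())
    denominator = math.sqrt(var_true * var_pred)
    # Return 0.0 when the coefficient is undefined (e.g. one class only).
    return numerator / denominator if denominator else 0.0
```

Both functions accept any hashable labels, so activity names such as strings can be passed directly; a perfect prediction yields 1.0 for both scores.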


Subjects
Algorithms; Supervised Machine Learning
6.
Sci Rep ; 7(1): 11559, 2017 Sep 14.
Article in English | MEDLINE | ID: mdl-28912582

ABSTRACT

Several groups have proposed that genotypic determinants in gag and the gp41 cytoplasmic domain (gp41-CD) reduce protease inhibitor (PI) susceptibility without PI-resistance mutations in protease. However, no gag and gp41-CD mutations definitively responsible for reduced PI susceptibility have been identified in individuals with virological failure (VF) while receiving a boosted PI (PI/r)-containing regimen. To identify gag and gp41 mutations under selective PI pressure, we sequenced gag and/or gp41 in 61 individuals with VF on a PI/r (n = 40) or NNRTI (n = 20) containing regimen. We quantified nonsynonymous and synonymous changes in both genes and identified sites exhibiting signal for directional or diversifying selection. We also used published gag and gp41 polymorphism data to highlight mutations displaying a high selection index, defined as changing from a conserved to an uncommon amino acid. Many amino acid mutations developed in gag and in gp41-CD in both the PI- and NNRTI-treated groups. However, in neither gene were there discernible differences between the two groups in overall numbers of mutations, mutations displaying evidence of diversifying or directional selection, or mutations with a high selection index. If gag and/or gp41 encode PI-resistance mutations, they may not be confined to consistent mutations at a few sites.


Subjects
Evolution, Molecular; HIV Envelope Protein gp41/genetics; HIV Infections/drug therapy; HIV Protease Inhibitors/therapeutic use; HIV-1/isolation & purification; Ritonavir/therapeutic use; gag Gene Products, Human Immunodeficiency Virus/genetics; Adult; Female; Genotype; HIV Infections/virology; HIV-1/genetics; Humans; Male; Middle Aged; Mutation; Selection, Genetic; Sequence Analysis, DNA; Young Adult
7.
Bioinformatics ; 31(8): 1204-10, 2015 Apr 15.
Article in English | MEDLINE | ID: mdl-25504647

ABSTRACT

MOTIVATION: Understanding the substrate specificity of human immunodeficiency virus (HIV)-1 protease is important when designing effective HIV-1 protease inhibitors. Furthermore, characterizing and predicting the cleavage profile of HIV-1 protease is essential to generate and test hypotheses of how HIV-1 affects proteins of the human host. Currently available tools for predicting cleavage by HIV-1 protease can be improved. RESULTS: The linear support vector machine with orthogonal encoding is shown to be the best predictor for HIV-1 protease cleavage. It is considerably better than current publicly available predictor services. It is also found that schemes using physicochemical properties do not improve over the standard orthogonal encoding scheme. Some issues with the currently available data are discussed. AVAILABILITY AND IMPLEMENTATION: The datasets used, which are the most important part, are available at the UCI Machine Learning Repository. The tools used are all standard and easily available. CONTACT: thorsteinn.rognvaldsson@hh.se.
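To make "orthogonal encoding" concrete, the sketch below turns a peptide into a binary vector with one slot per (position, amino acid) pair, which is the kind of input a linear support vector machine can take. The peptide length of eight residues, the alphabet ordering, and the function name `orthogonal_encode` are assumptions for illustration; the abstract does not specify them.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, arbitrary fixed order
PEPTIDE_LENGTH = 8                    # assumed window size (octamer)

def orthogonal_encode(peptide):
    """One-hot ("orthogonal") encoding of a peptide.

    Each position contributes a block of 20 entries with a single 1 at the
    index of its residue, giving a vector of length 8 * 20 = 160.
    """
    if len(peptide) != PEPTIDE_LENGTH:
        raise ValueError("expected a peptide of length %d" % PEPTIDE_LENGTH)
    vector = [0] * (PEPTIDE_LENGTH * len(AMINO_ACIDS))
    for position, residue in enumerate(peptide.upper()):
        index = AMINO_ACIDS.find(residue)
        if index < 0:
            raise ValueError("unknown residue: %r" % residue)
        vector[position * len(AMINO_ACIDS) + index] = 1
    return vector
```

Because every residue at every position gets its own input, a linear model trained on these vectors has one weight per (position, residue) pair, which keeps the learned model easy to inspect.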


Subjects
Algorithms; HIV Protease/chemistry; HIV Protease/metabolism; Oligopeptides/metabolism; Artificial Intelligence; Humans; Support Vector Machine
8.
BMC Bioinformatics ; 10: 149, 2009 May 16.
Article in English | MEDLINE | ID: mdl-19445713

ABSTRACT

BACKGROUND: Proteases of human pathogens are becoming increasingly important drug targets, hence it is necessary to understand their substrate specificity and to interpret this knowledge in practically useful ways. New methods are being developed that produce large amounts of cleavage information for individual proteases and some have been applied to extract cleavage rules from data. However, the hitherto proposed methods for extracting rules have been neither easy to understand nor very accurate. To be practically useful, cleavage rules should be accurate, compact, and expressed in an easily understandable way. RESULTS: A new method is presented for producing cleavage rules for viral proteases with seemingly complex cleavage profiles. The method is based on orthogonal search-based rule extraction (OSRE) combined with spectral clustering. It is demonstrated on substrate data sets for human immunodeficiency virus type 1 (HIV-1) protease and hepatitis C (HCV) NS3/4A protease, showing excellent prediction performance for both HIV-1 cleavage and HCV NS3/4A cleavage, agreeing with observed HCV genotype differences. New cleavage rules (consensus sequences) are suggested for HIV-1 and HCV NS3/4A cleavages. The practical usability of the method is also demonstrated by using it to predict the location of an internal cleavage site in the HCV NS3 protease and to correct the location of a previously reported internal cleavage site in the HCV NS3 protease. The method is fast to converge and yields accurate rules, on par with previous results for HIV-1 protease and better than previous state-of-the-art for HCV NS3/4A protease. Moreover, the rules are fewer and simpler than previously obtained with rule extraction methods. CONCLUSION: A rule extraction methodology by searching for multivariate low-order predicates yields results that significantly outperform existing rule bases on out-of-sample data, but are more transparent to expert users. The approach yields rules that are easy to use and useful for interpreting experimental data.


Subjects
Data Interpretation, Statistical; Peptide Hydrolases/chemistry; Peptide Hydrolases/metabolism; Protease Inhibitors/chemistry; Proteomics/methods; Amino Acid Sequence; Catalytic Domain; Cluster Analysis; Computer Simulation; Databases, Protein; HIV Protease/chemistry; HIV Protease/genetics; HIV Protease/metabolism; Humans; Peptide Hydrolases/genetics; ROC Curve; Reproducibility of Results; Serine Endopeptidases/chemistry; Serine Endopeptidases/genetics; Serine Endopeptidases/metabolism; Viral Nonstructural Proteins/chemistry; Viral Nonstructural Proteins/genetics; Viral Nonstructural Proteins/metabolism; Viral Proteins/chemistry; Viral Proteins/genetics; Viral Proteins/metabolism
9.
Expert Rev Mol Diagn ; 7(4): 435-51, 2007 Jul.
Article in English | MEDLINE | ID: mdl-17620050

ABSTRACT

HIV-1 protease has a broad and complex substrate specificity, which hitherto has escaped a simple comprehensive definition. This, and the relatively high mutation rate of the retroviral protease, makes it challenging to design effective protease inhibitors. Several attempts have been made during the last two decades to elucidate the enigmatic cleavage specificity of HIV-1 protease and to predict cleavage of novel substrates using bioinformatic analysis methods. This review describes the methods that have been utilized to date to address this important problem and the results achieved. The data sets used are also reviewed and important aspects of these are highlighted.


Subjects
Computational Biology/methods; Computational Biology/trends; HIV-1/enzymology; Models, Chemical; Peptide Hydrolases/chemistry; Peptide Hydrolases/metabolism; Humans; Substrate Specificity/physiology
10.
Lab Anim (NY) ; 36(3): 36-40, 2007 Mar.
Article in English | MEDLINE | ID: mdl-17311048

ABSTRACT

Individual identification of laboratory rodents typically involves invasive methods, such as tattoos, ear clips, and implanted transponders. Beyond the ethical dilemmas they may present, these methods may cause pain or distress that confounds research results. The authors describe a prototype device for biometric identification of laboratory rodents that would allow researchers to identify rodents without the complications of other methods. The device, which uses the rodent's ear blood vessel pattern as the identifier, is fast, automatic, noninvasive, and painless.


Subjects
Animal Identification Systems/veterinary; Biometry/methods; Algorithms; Animal Identification Systems/methods; Animals; Biometry/instrumentation; Ear, External/blood supply; Female; Male; Mice; Mice, Inbred C57BL
11.
J Virol ; 79(19): 12477-86, 2005 Oct.
Article in English | MEDLINE | ID: mdl-16160175

ABSTRACT

Rapidly developing viral resistance to licensed human immunodeficiency virus type 1 (HIV-1) protease inhibitors is an increasing problem in the treatment of HIV-infected individuals and AIDS patients. A rational design of more effective protease inhibitors and discovery of potential biological substrates for the HIV-1 protease require accurate models for protease cleavage specificity. In this study, several popular bioinformatic machine learning methods, including support vector machines and artificial neural networks, were used to analyze the specificity of the HIV-1 protease. A new, extensive data set (746 peptides that have been experimentally tested for cleavage by the HIV-1 protease) was compiled, and the data were used to construct different classifiers that predicted whether the protease would cleave a given peptide substrate or not. The best predictor was a nonlinear predictor using two physicochemical parameters (hydrophobicity, or alternatively polarity, and size) for the amino acids, indicating that these properties are the key features recognized by the HIV-1 protease. The present in silico study provides new and important insights into the workings of the HIV-1 protease at the molecular level, supporting the recent hypothesis that the protease primarily recognizes a conformation rather than a specific amino acid sequence. Furthermore, we demonstrate that the presence of 1 to 2 lysine residues near the cleavage site of octameric peptide substrates seems to prevent cleavage efficiently, suggesting that this positively charged amino acid plays an important role in hindering the activity of the HIV-1 protease.


Subjects
Computational Biology; HIV Protease/genetics; HIV Protease/metabolism; HIV-1/enzymology; Algorithms; Artificial Intelligence; Computer Simulation; HIV Protease/chemistry; HIV-1/drug effects; HIV-1/genetics; Neural Networks, Computer; Substrate Specificity
12.
Proteomics ; 4(9): 2594-601, 2004 Sep.
Article in English | MEDLINE | ID: mdl-15352234

ABSTRACT

In order to maximize protein identification by peptide mass fingerprinting, noise peaks must be removed from spectra, and recalibration is often required. The preprocessing of the spectra before database searching is essential but is time-consuming. In addition, the optimal database search parameters often vary over a batch of samples. For high-throughput protein identification, these factors should be set automatically, with no or little human intervention. In the present work, automated batch filtering and recalibration using a statistical filter is described. The filter is combined with multiple data searches that are performed automatically. We show that, using several hundred protein digests, protein identification rates could be more than doubled, compared to standard database searching. Furthermore, automated large-scale in-gel digestion of proteins with endoproteinase LysC, and matrix-assisted laser desorption/ionization-time of flight (MALDI-TOF) analysis, followed by subsequent trypsin digestion and MALDI-TOF analysis were performed. Several proteins could be identified only after digestion with one of the enzymes, and some less significant protein identifications were confirmed after digestion with the other enzyme. The results indicate that identification of especially small and low-abundance proteins could be significantly improved after sequential digestions with two enzymes.


Subjects
Peptide Mapping/methods; Peptides/analysis; Proteins/chemistry; Animals; Data Interpretation, Statistical; Databases, Protein; Humans; Peptide Mapping/instrumentation; Proteins/metabolism; Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization
13.
Bioinformatics ; 20(18): 3628-35, 2004 Dec 12.
Article in English | MEDLINE | ID: mdl-15297302

ABSTRACT

UNLABELLED: A set of new algorithms and software tools for automatic protein identification using peptide mass fingerprinting is presented. The software is automatic, fast and modular to suit different laboratory needs, and it can be operated either via a Java user interface or called from within scripts. The software modules do peak extraction, peak filtering and protein database matching, and communicate via XML. Individual modules can therefore easily be replaced with other software if desired, and all intermediate results are available to the user. The algorithms are designed to operate without human intervention and contain several novel approaches. The performance and capabilities of the software are illustrated on spectra from different mass spectrometer manufacturers, and the factors influencing successful identification are discussed and quantified. MOTIVATION: Protein identification with mass spectrometric methods is a key step in modern proteomics studies. Some tools are available today for doing different steps in the analysis. Only a few commercial systems integrate all the steps in the analysis, often for only one vendor's hardware, and the details of these systems are not public. RESULTS: A complete system for doing protein identification with peptide mass fingerprints is presented, including everything from peak picking to matching the database protein. The details of the different algorithms are disclosed so that academic researchers can have full control of their tools. AVAILABILITY: The described software tools are available from the Halmstad University website www.hh.se/staff/bioinf/ SUPPLEMENTARY INFORMATION: Details of the algorithms are described in supporting information available from the Halmstad University website www.hh.se/staff/bioinf/
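The database-matching step mentioned in this abstract can be pictured with a deliberately simplified sketch: count how many theoretical peptide masses of each candidate protein have an observed peak within a parts-per-million tolerance, then rank the candidates. This is a generic illustration, not the algorithm of the described software; the function names, the 50 ppm default, and the dict-based database layout are all assumptions.

```python
def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6

def count_matches(observed_peaks, theoretical_masses, tol_ppm=50.0):
    """Number of theoretical masses that have at least one peak within tolerance."""
    matched = 0
    for theoretical in theoretical_masses:
        if any(abs(ppm_error(peak, theoretical)) <= tol_ppm
               for peak in observed_peaks):
            matched += 1
    return matched

def rank_candidates(observed_peaks, database, tol_ppm=50.0):
    """Rank candidate proteins by matched-peptide count, best first.

    database : dict mapping a protein name to its list of theoretical
               peptide masses (hypothetical layout for this sketch).
    """
    scores = {name: count_matches(observed_peaks, masses, tol_ppm)
              for name, masses in database.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

A raw match count ignores protein size and the chance of random matches, so practical scoring schemes are considerably more elaborate than this.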


Subjects
Peptide Mapping/methods; Proteins/chemistry; Sequence Alignment/methods; Sequence Analysis, Protein/methods; Software; Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization/methods; User-Computer Interface; Algorithms; Database Management Systems; Documentation/methods; Information Storage and Retrieval/methods; Programming Languages; Proteins/analysis
14.
Article in English | MEDLINE | ID: mdl-15203031

ABSTRACT

An automated peak picking strategy is presented where several peak sets with different signal-to-noise levels are combined to form a more reliable statement on the protein identity. The strategy is compared against both manual peak picking and industry standard automated peak picking on a set of mass spectra obtained after tryptic in-gel digestion of 2D-gel samples from human fetal fibroblasts. The set of spectra contains samples ranging from strong to weak spectra, and the proposed multiple-scale method is shown to be much better on weak spectra than the industry standard method and a human operator, and equal in performance to these on strong and medium strong spectra. It is also demonstrated that peak sets selected by a human operator display a considerable variability and that it is impossible to speak of a single "true" peak set for a given spectrum. The described multiple-scale strategy both avoids time-consuming parameter tuning and exceeds the human operator in protein identification efficiency. The strategy therefore promises reliable automated user-independent protein identification using peptide mass fingerprints.


Subjects
Peptides/chemistry; Cell Line; Electrophoresis, Gel, Two-Dimensional; Humans; Molecular Weight; Peptide Mapping; Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization
15.
Bioinformatics ; 20(11): 1702-9, 2004 Jul 22.
Article in English | MEDLINE | ID: mdl-14988129

ABSTRACT

UNLABELLED: Several papers have been published where nonlinear machine learning algorithms, e.g. artificial neural networks, support vector machines and decision trees, have been used to model the specificity of the HIV-1 protease and extract specificity rules. We show that the dataset used in these studies is linearly separable and that it is a misuse of nonlinear classifiers to apply them to this problem. The best solution on this dataset is achieved using a linear classifier like the simple perceptron or the linear support vector machine, and it is straightforward to extract rules from these linear models. We identify key residues in peptides that are efficiently cleaved by the HIV-1 protease and list the most prominent rules, relating them to experimental results for the HIV-1 protease. MOTIVATION: Understanding HIV-1 protease specificity is important when designing HIV inhibitors and several different machine learning algorithms have been applied to the problem. However, little progress has been made in understanding the specificity because nonlinear and overly complex models have been used. RESULTS: We show that the problem is much easier than what has previously been reported and that linear classifiers like the simple perceptron or linear support vector machines are at least as good predictors as nonlinear algorithms. We also show how sets of specificity rules can be generated from the resulting linear classifiers. AVAILABILITY: The datasets used are available at http://www.hh.se/staff/bioinf/
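The claim that a simple perceptron suffices on linearly separable data can be illustrated with a minimal training loop: the classic update rule reaches zero training errors in finitely many steps when a separating hyperplane exists, and never does otherwise. The sketch below is a generic textbook perceptron, not the authors' code; the function names and the `max_epochs` cap are assumptions.

```python
def train_perceptron(samples, labels, max_epochs=100):
    """Train a perceptron with labels in {-1, +1}.

    Returns (weights, bias, converged); converged is True only if a full
    pass over the data produced no misclassification.
    """
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            if y * activation <= 0:          # wrong side of (or on) the boundary
                weights = [w + y * xi for w, xi in zip(weights, x)]
                bias += y
                mistakes += 1
        if mistakes == 0:
            return weights, bias, True
    return weights, bias, False

def predict(weights, bias, x):
    """Classify one sample as +1 or -1."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else -1
```

Once training converges, each weight directly indicates how strongly its input pushes the decision toward one class, which is what makes reading rules off a linear model straightforward.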


Subjects
Algorithms; Artificial Intelligence; HIV Protease/chemistry; Neural Networks, Computer; Pattern Recognition, Automated; Protein Interaction Mapping/methods; Sequence Analysis, Protein/methods; Binding Sites; Computer Simulation; Databases, Protein; Enzyme Activation; Linear Models; Models, Chemical; Nonlinear Dynamics; Protein Binding; Reproducibility of Results; Sensitivity and Specificity; Sequence Alignment/methods