Results 1 - 14 of 14
1.
Sensors (Basel) ; 19(20), 2019 Oct 15.
Article in English | MEDLINE | ID: mdl-31618929

ABSTRACT

Understanding and correctly modeling urban mobility is a crucial issue for the development of smart cities. The estimation of individual trips from mobile phone positioning data (i.e., call detail records (CDR)) can naturally support urban and transport studies as well as marketing applications. Individual trips are often aggregated in an origin-destination (OD) matrix counting the number of trips from a given origin to a given destination. In the literature dealing with CDR data there are two main approaches to extract OD matrices from such data: (a) in time-based matrices, the analysis focuses on estimating mobility directly from a sequence of CDRs; (b) in routine-based matrices (OD by purpose), the analysis focuses on routine movements, such as the home-work commute, derived from a trip generation model. In both cases, the OD matrix measured by CDR counts is scaled to match the actual number of people moving in the area, and projected onto the road network to estimate actual flows on the streets. In this paper, we describe prototypical approaches to estimate OD matrices, describe an actual implementation, and present a number of experiments to evaluate the results from multiple perspectives.
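
The time-based approach and the scaling step described in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the record schema `(user_id, timestamp, zone)`, the function names, and the single `expansion_factor` are all assumptions made for illustration, and projection onto the road network is left out.

```python
from collections import Counter


def time_based_od(cdrs):
    """Count trips between consecutive, distinct zones for each user.

    cdrs: iterable of (user_id, timestamp, zone) tuples (hypothetical schema).
    """
    od = Counter()
    last_zone = {}
    # Process each user's records in chronological order.
    for user, _, zone in sorted(cdrs, key=lambda r: (r[0], r[1])):
        prev = last_zone.get(user)
        if prev is not None and prev != zone:
            od[(prev, zone)] += 1  # one observed trip prev -> zone
        last_zone[user] = zone
    return od


def scale_od(od, expansion_factor):
    """Scale CDR trip counts toward the real population (assumed uniform factor)."""
    return {pair: count * expansion_factor for pair, count in od.items()}


records = [
    ("u1", 1, "A"), ("u1", 2, "A"), ("u1", 3, "B"),
    ("u2", 1, "A"), ("u2", 5, "B"), ("u2", 9, "A"),
]
od = time_based_od(records)
scaled = scale_od(od, 2.5)
```

In practice the expansion factor would likely vary by zone or operator market share; a single constant is used here only to keep the sketch short.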

2.
Genome Res ; 21(6): 898-907, 2011 Jun.
Article in English | MEDLINE | ID: mdl-21482623

ABSTRACT

High-throughput X-ray absorption spectroscopy (HT-XAS) was used to measure transition metal content based on quantitative detection of X-ray fluorescence signals for 3879 purified proteins from several hundred different protein families generated by the New York SGX Research Center for Structural Genomics. Approximately 9% of the proteins analyzed showed the presence of transition metal atoms (Zn, Cu, Ni, Co, Fe, or Mn) in stoichiometric amounts. The method is highly automated and highly reliable based on comparison of the results to crystal structure data derived from the same protein set. To leverage the experimental metalloprotein annotations, we used a sequence-based de novo prediction method, MetalDetector, to identify Cys and His residues that bind to transition metals for the redundancy-reduced subset of 2411 sequences sharing <70% sequence identity and having at least one His or Cys. As HT-XAS identifies metal type and protein binding, while the bioinformatics analysis identifies metal-binding residues, the results were combined to identify putative metal-binding sites in the proteins and their associated families. We explored the combination of these data with homology models to generate detailed structure models of metal-binding sites for representative proteins. Finally, we used extended X-ray absorption fine structure data from two of the purified Zn metalloproteins to validate predicted metalloprotein binding site structures. This combination of experimental and bioinformatics approaches provides comprehensive active site analysis on the genome scale for metalloproteins as a class, revealing new insights into metalloprotein structure and function.


Subject(s)
Metalloproteins/chemistry , Software , X-Ray Absorption Spectroscopy/methods , Binding Sites/genetics , Computational Biology/methods , Fluorescence , Genomics/methods , Metals, Heavy/analysis , Synchrotrons
3.
J Chem Inf Model ; 54(8): 2380-90, 2014 Aug 25.
Article in English | MEDLINE | ID: mdl-25068386

ABSTRACT

Optical chemical structure recognition is the problem of converting a bitmap image containing a chemical structure formula into a standard structured representation of the molecule. We introduce a novel approach to this problem based on the pipelined integration of pattern recognition techniques with probabilistic knowledge representation and reasoning. Basic entities and relations (such as textual elements, points, lines, etc.) are first extracted by a low-level processing module. A probabilistic reasoning engine based on Markov logic, embodying chemical and graphical knowledge, is subsequently used to refine these pieces of information. An annotated connection table of atoms and bonds is finally assembled and converted into a standard chemical exchange format. We report a successful evaluation on two large image data sets, showing that the method compares favorably with the current state-of-the-art, especially on degraded low-resolution images. The system is available as a web server at http://mlocsr.dinfo.unifi.it.


Subject(s)
Markov Chains , Pattern Recognition, Automated/statistics & numerical data , Small Molecule Libraries/chemistry , Software , Computer Graphics , Databases, Chemical , Image Processing, Computer-Assisted
4.
Nucleic Acids Res ; 39(Web Server issue): W288-92, 2011 Jul.
Article in English | MEDLINE | ID: mdl-21576237

ABSTRACT

MetalDetector identifies CYS and HIS involved in transition metal protein binding sites, starting from sequence alone. A major new feature of release 2.0 is the ability to predict which residues are jointly involved in the coordination of the same metal ion. The server is available at http://metaldetector.dsi.unifi.it/v2.0/.


Subject(s)
Metalloproteins/chemistry , Metals/chemistry , Software , Binding Sites , Cysteine/chemistry , Histidine/chemistry , Internet , Sequence Analysis, Protein
5.
Front Public Health ; 10: 945181, 2022.
Article in English | MEDLINE | ID: mdl-35923956

ABSTRACT

Background: The COVID-19 pandemic prompted the scientific community to share timely evidence, including in the form of preprints that had not yet been peer reviewed. Purpose: To develop an artificial intelligence system for the analysis of the scientific literature by leveraging recent developments in the field of Argument Mining (AM). Methodology: Scientific quality criteria were borrowed from two selected Cochrane systematic reviews. Four independent reviewers gave a blind evaluation on a 1-5 scale to 40 papers for each review. These scores were matched with the automatic analysis performed by an AM system named MARGOT, which detected claims and supporting evidence for the cited papers. Outcomes were evaluated with inter-rater indices (Cohen's Kappa, Krippendorff's Alpha, s* statistics). Results: MARGOT performs differently on the two selected Cochrane reviews: the inter-rater indices show a fair-to-moderate agreement of the most relevant MARGOT metrics both with Cochrane and the skilled interval scores, with larger values for one of the two reviews. Discussion and conclusions: The noted discrepancy could stem from a limitation of the MARGOT system that can be improved; yet, the level of agreement between human reviewers also suggests that the two reviews differ in the complexity of debating controversial arguments. These preliminary results encourage expanding and deepening the investigation to other topics and a larger number of highly specialized reviewers, to reduce uncertainty in the evaluation process, thus supporting the retraining of AM systems.


Subject(s)
Artificial Intelligence , COVID-19 , COVID-19/diagnosis , COVID-19/epidemiology , Humans , Pandemics , Reproducibility of Results , Research
6.
IEEE Trans Neural Netw Learn Syst ; 32(10): 4291-4308, 2021 10.
Article in English | MEDLINE | ID: mdl-32915750

ABSTRACT

Attention is an increasingly popular mechanism used in a wide range of neural architectures. The mechanism itself has been realized in a variety of formats. However, because of the fast-paced advances in this domain, a systematic overview of attention is still missing. In this article, we define a unified model for attention architectures in natural language processing, with a focus on those designed to work with vector representations of the textual data. We propose a taxonomy of attention models according to four dimensions: the representation of the input, the compatibility function, the distribution function, and the multiplicity of the input and/or output. We present examples of how prior information can be exploited in attention models and discuss ongoing research efforts and open challenges in the area, providing the first extensive categorization of the vast body of literature in this exciting domain.
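
Two of the taxonomy dimensions named in this abstract, the compatibility function and the distribution function, can be made concrete with a small sketch. It assumes dot-product compatibility and a softmax distribution, which are common choices but are not stated in the abstract; the function names and the toy vectors are illustrative only.

```python
import math


def dot_compatibility(query, key):
    """Compatibility function: here, a plain dot product (one common choice)."""
    return sum(q * k for q, k in zip(query, key))


def softmax(scores):
    """Distribution function: maps raw scores to weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values,
           compatibility=dot_compatibility, distribution=softmax):
    """Score each key against the query, normalize, and mix the values."""
    scores = [compatibility(query, k) for k in keys]
    weights = distribution(scores)
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, context


keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
weights, context = attend([1.0, 1.0], keys, values)
```

Passing the two functions as parameters mirrors the idea that they are independent design axes: swapping in a different compatibility or distribution function leaves the rest of the computation unchanged.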

7.
Cogn Sci ; 45(6): e13009, 2021 06.
Article in English | MEDLINE | ID: mdl-34170027

ABSTRACT

The investigation of visual categorization has recently been aided by the introduction of deep convolutional neural networks (CNNs), which achieve unprecedented accuracy in picture classification after extensive training. Even if the architecture of CNNs is inspired by the organization of the visual brain, the similarity between CNN and human visual processing remains unclear. Here, we investigated this issue by engaging humans and CNNs in a two-class visual categorization task. To this end, pictures containing animals or vehicles were modified to contain only low/high spatial frequency (HSF) information, or were scrambled in the phase of the spatial frequency spectrum. For all types of degradation, accuracy increased as degradation was reduced for both humans and CNNs; however, the thresholds for accurate categorization varied between humans and CNNs. More remarkable differences were observed for HSF information compared to the other two types of degradation, both in terms of overall accuracy and image-level agreement between humans and CNNs. The difficulty with which the CNNs were shown to categorize high-passed natural scenes was reduced by picture whitening, a procedure which is inspired by how visual systems process natural images. The results are discussed concerning the adaptation to regularities in the visual environment (scene statistics); if the visual characteristics of the environment are not learned by CNNs, their visual categorization may depend only on a subset of the visual information on which humans rely, for example, on low spatial frequency information.


Subject(s)
Neural Networks, Computer , Visual Perception , Animals , Brain , Brain Mapping , Humans
8.
Bioinformatics ; 25(18): 2326-33, 2009 Sep 15.
Article in English | MEDLINE | ID: mdl-19592394

ABSTRACT

MOTIVATION: Accurate prediction of contacts between beta-strand residues can significantly contribute towards ab initio prediction of the 3D structure of many proteins. Contacts in the same protein are highly interdependent. Therefore, significant improvements can be expected by applying statistical relational learners that overcome the usual machine learning assumption that examples are independent and identically distributed. Furthermore, the dependencies among beta-residue contacts are subject to strong regularities, many of which are known a priori. In this article, we take advantage of Markov logic, a statistical relational learning framework that is able to capture dependencies between contacts, and constrain the solution according to domain knowledge expressed by means of weighted rules in a logical language. RESULTS: We introduce a novel hybrid architecture based on neural and Markov logic networks with grounding-specific weights. On a non-redundant dataset, our method achieves 44.9% F1 measure, with 47.3% precision and 42.7% recall, which is significantly better (P < 0.01) than previously reported performance obtained by 2D recursive neural networks. Our approach also significantly improves the number of chains for which beta-strands are nearly perfectly paired (36% of the chains are predicted with F1 ≥ 70% on coarse map). It also outperforms more general contact predictors on recent CASP 2008 targets.
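
As a quick consistency check on the figures reported above, the F1 measure is the harmonic mean of precision and recall. The short sketch below (the function name is an illustrative choice) recomputes it from the stated 47.3% precision and 42.7% recall.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


f1 = f1_score(47.3, 42.7)
print(round(f1, 1))  # 44.9, matching the value reported in the abstract
```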


Subject(s)
Markov Chains , Neural Networks, Computer , Proteins/chemistry , Computational Biology/methods , Databases, Protein , Protein Conformation
9.
Comput Methods Programs Biomed ; 185: 105153, 2020 Mar.
Article in English | MEDLINE | ID: mdl-31678792

ABSTRACT

BACKGROUND AND OBJECTIVES: Malignant lymphomas are cancers of the immune system and are characterized by enlarged lymph nodes that typically spread across many different sites. Many different histological subtypes exist, whose diagnosis is typically based on sampling (biopsy) of a single tumor site, whereas total body examinations with computed tomography and positron emission tomography, though not diagnostic, are able to provide a comprehensive picture of the patient. In this work, we exploit a data-driven approach based on multiple-instance learning algorithms and texture analysis features extracted from positron emission tomography to predict the differential diagnosis of the main malignant lymphoma subtypes. METHODS: We exploit a multiple-instance learning setting where support vector machines and random forests are used as classifiers both at the level of single VOIs (instances) and at the level of patients (bags). We present results on two datasets comprising patients that suffer from four different types of malignant lymphoma, namely diffuse large B cell lymphoma, follicular lymphoma, Hodgkin's lymphoma, and mantle cell lymphoma. RESULTS: Despite the complexity of the task, experimental results show that, with sufficient data samples, some cancer subtypes, such as Hodgkin's lymphoma, can be identified from texture information: in particular, we achieve 97.0% sensitivity (recall) and 94.1% positive predictive value (precision) on a dataset of 60 patients. CONCLUSIONS: The presented study indicates that texture analysis features extracted from positron emission tomography, combined with multiple-instance machine learning algorithms, can discriminate between different malignant lymphoma subtypes.
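
The instance/bag distinction in the METHODS above can be sketched in a few lines. The example assumes that some instance-level classifier has already produced a score in [0, 1] for each VOI, and it aggregates those scores per patient by a simple mean with a 0.5 threshold. That aggregation rule, the function name, and the toy data are assumptions for illustration; the abstract does not specify how the study combines instances into bag-level decisions.

```python
from collections import defaultdict


def bag_predictions(instance_scores, threshold=0.5):
    """Aggregate instance (VOI) scores into one decision per bag (patient).

    instance_scores: list of (patient_id, score) pairs, score in [0, 1].
    The mean-and-threshold rule is a hypothetical choice, not the paper's.
    """
    by_patient = defaultdict(list)
    for patient, score in instance_scores:
        by_patient[patient].append(score)
    return {p: (sum(s) / len(s)) >= threshold for p, s in by_patient.items()}


scores = [("p1", 0.9), ("p1", 0.7), ("p1", 0.2), ("p2", 0.1), ("p2", 0.4)]
labels = bag_predictions(scores)
```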


Subject(s)
Lymphoma/classification , Machine Learning , Algorithms , Datasets as Topic , Humans , Lymphoma/diagnostic imaging , Positron-Emission Tomography/methods , Sensitivity and Specificity , Support Vector Machine , Tomography, X-Ray Computed/methods
10.
Bioinformatics ; 24(18): 2094-5, 2008 Sep 15.
Article in English | MEDLINE | ID: mdl-18635571

ABSTRACT

The web server MetalDetector classifies histidine residues in proteins into one of two states (free or metal bound) and cysteines into one of three states (free, metal bound or disulfide bridged). A decision tree integrates predictions from two previously developed methods (DISULFIND and Metal Ligand Predictor). Cross-validated performance assessment indicates that our server predicts disulfide bonding state at 88.6% precision and 85.1% recall, while it identifies cysteines and histidines in transition metal-binding sites at 79.9% precision and 76.8% recall, and at 60.8% precision and 40.7% recall, respectively. AVAILABILITY: Freely available at http://metaldetector.dsi.unifi.it. SUPPLEMENTARY INFORMATION: Details and data can be found at http://metaldetector.dsi.unifi.it/help.php.


Subject(s)
Computational Biology/methods , Cysteine/chemistry , Disulfides/chemistry , Histidine/chemistry , Metalloproteins/chemistry , Sequence Analysis, Protein , Amino Acid Sequence , Binding Sites , Computer Simulation , Databases, Protein , Disulfides/metabolism , Internet , Metalloproteins/metabolism , Molecular Sequence Data , Sequence Alignment
11.
Front Big Data ; 2: 52, 2019.
Article in English | MEDLINE | ID: mdl-33693375

ABSTRACT

Deep learning is bringing remarkable contributions to the field of argumentation mining, but the existing approaches still need to fill the gap toward performing advanced reasoning tasks. In this position paper, we posit that neural-symbolic and statistical relational learning could play a crucial role in the integration of symbolic and sub-symbolic methods to achieve this goal.

12.
IEEE Trans Neural Netw Learn Syst ; 30(11): 3326-3337, 2019 11.
Article in English | MEDLINE | ID: mdl-30951479

ABSTRACT

Long short-term memory (LSTM) networks have recently shown remarkable performance in several tasks that deal with natural language generation, such as image captioning or poetry composition. Yet, only a few works have analyzed text generated by LSTMs in order to quantitatively evaluate to what extent such artificial texts resemble those generated by humans. We compared the statistical structure of LSTM-generated language to that of written natural language, and to those produced by Markov models of various orders. In particular, we characterized the statistical structure of language by assessing word-frequency statistics, long-range correlations, and entropy measures. Our main finding is that while both LSTM- and Markov-generated texts can exhibit features similar to real ones in their word-frequency statistics and entropy measures, LSTM-texts are shown to reproduce long-range correlations at scales comparable to those found in natural language. Moreover, for LSTM networks, a temperature-like parameter controlling the generation process shows an optimal value (for which the produced texts are closest to real language) that is consistent across the different statistical features investigated.
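
The abstract does not define its temperature-like parameter. A minimal sketch of the standard softmax-temperature sampling scheme, which is assumed here to be what is meant, shows how such a parameter reshapes the next-token distribution. The function names and the toy logits are illustrative.

```python
import math
import random


def temperature_softmax(logits, temperature):
    """Turn logits into probabilities; lower temperature sharpens the peak."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature, rng=random):
    """Draw one token index from the temperature-adjusted distribution."""
    probs = temperature_softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


logits = [2.0, 1.0, 0.0]
sharp = temperature_softmax(logits, 0.5)  # concentrates mass on the top token
flat = temperature_softmax(logits, 2.0)   # spreads mass more evenly
```

Low temperatures make generation nearly deterministic and repetitive, while high temperatures approach uniform sampling, which is consistent with the abstract's observation that an intermediate value yields the most natural statistics.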


Subject(s)
Markov Chains , Memory, Long-Term , Natural Language Processing , Neural Networks, Computer , Humans
14.
Article in English | MEDLINE | ID: mdl-21606549

ABSTRACT

Prediction of binding sites from sequence can significantly help toward determining the function of uncharacterized proteins on a genomic scale. The task is highly challenging due to the enormous number of alternative candidate configurations. Previous research has only considered this prediction problem starting from 3D information. When starting from sequence alone, only methods that predict the bonding state of selected residues are available. The sole exception consists of pattern-based approaches, which rely on very specific motifs and cannot be applied to discover truly novel sites. We develop new algorithmic ideas based on structured-output learning for determining transition-metal-binding sites coordinated by cysteines and histidines. The inference step (retrieving the best scoring output) is intractable for general output types (i.e., general graphs). However, under the assumption that no residue can coordinate more than one metal ion, we prove that metal binding has the algebraic structure of a matroid, allowing us to employ a very efficient greedy algorithm. We test our predictor in a highly stringent setting where the training set consists of protein chains belonging to SCOP folds different from the ones used for accuracy estimation. In this setting, our predictor achieves 56 percent precision and 60 percent recall in the identification of ligand-ion bonds.
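
The role of the greedy algorithm can be illustrated with a toy sketch. It encodes only the constraint stated in the abstract, that no residue coordinates more than one metal ion, and picks candidate bonds in decreasing score order. The candidate list, the scores, the residue and ion labels, and the rule of discarding non-positive scores are all hypothetical; the paper's actual scoring function and output structure are not reproduced here.

```python
def greedy_bonds(candidates):
    """Greedily select residue-ion bonds, at most one ion per residue.

    candidates: list of (score, residue, ion) tuples (hypothetical scores).
    Returns a dict mapping each selected residue to its ion.
    """
    chosen = {}
    # Visit candidates from highest to lowest score.
    for score, residue, ion in sorted(candidates, reverse=True):
        if score <= 0:
            break  # remaining bonds would not improve the total score
        if residue not in chosen:  # independence test: residue still free
            chosen[residue] = ion
    return chosen


candidates = [
    (0.9, "C12", "ion1"),
    (0.8, "C12", "ion2"),
    (0.7, "H40", "ion2"),
    (-0.2, "C55", "ion1"),
]
assignment = greedy_bonds(candidates)
```

Here `C12` is assigned to `ion1` because that bond scores highest, its competing bond to `ion2` is then rejected, and `C55` stays unbound because its only candidate has a negative score. For independence systems that form a matroid, this kind of greedy selection is known to return a maximum-weight solution, which is why the matroid property matters for efficiency.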


Subject(s)
Binding Sites , Computational Biology/methods , Metals , Proteins , Sequence Analysis, Protein/methods , Amino Acid Sequence , Databases, Protein , Metals/chemistry , Metals/metabolism , Molecular Sequence Data , Proteins/chemistry , Proteins/metabolism