Results 1 - 20 of 31
1.
Catheter Cardiovasc Interv ; 102(4): 631-640, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37579212

ABSTRACT

BACKGROUND: Visual assessment of the percentage diameter stenosis (%DSVE) of lesions is essential in coronary angiography (CAG) interpretation. We have previously developed an artificial intelligence (AI) model capable of accurate CAG segmentation. We aim to compare operators' %DSVE in angiography versus AI-segmented images. METHODS: Quantitative coronary analysis (QCA) %DS (%DSQCA) was previously performed in our published validation dataset. Operators were asked to estimate %DSVE of lesions in angiography versus AI-segmented images in separate sessions and differences were assessed using angiography %DSQCA as reference. RESULTS: A total of 123 lesions were included. %DSVE was significantly higher in both the angiography (77% ± 20% vs. 56% ± 13%, p < 0.001) and segmentation groups (59% ± 20% vs. 56% ± 13%, p < 0.001), with a much smaller absolute %DS difference in the latter. For lesions with %DSQCA of 50%-70% (60% ± 5%), an even higher discrepancy was found (angiography: 83% ± 13% vs. 60% ± 5%, p < 0.001; segmentation: 63% ± 15% vs. 60% ± 5%, p < 0.001). Similar, less pronounced, findings were observed for %DSQCA < 50% lesions, but not %DSQCA > 70% lesions. Agreement between %DSQCA/%DSVE across %DSQCA strata (<50%, 50%-70%, >70%) was approximately twice as high in the segmentation group (60.4% vs. 30.1%; p < 0.001). %DSVE inter-operator differences were smaller with segmentation. CONCLUSION: %DSVE was much less discrepant with segmentation versus angiography. Overestimation of %DSQCA < 70% lesions with angiography was especially common. Segmentation may reduce %DSVE overestimation and thus unwarranted revascularization.
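
The strata-level agreement reported in the abstract above (<50%, 50%-70%, >70%) can be illustrated with a minimal Python sketch. The function names, the boundary handling (50 and 70 are placed in the middle stratum here), and the sample values are assumptions made for illustration; none of them come from the study.

```python
def stratum(ds_percent):
    """Map a percentage diameter stenosis to one of three strata.

    Assumption: the boundary values 50 and 70 belong to the middle stratum.
    """
    if ds_percent < 50:
        return "<50%"
    if ds_percent <= 70:
        return "50%-70%"
    return ">70%"


def strata_agreement(qca_values, visual_values):
    """Fraction of lesions whose QCA and visual estimates share a stratum."""
    matches = sum(
        stratum(q) == stratum(v) for q, v in zip(qca_values, visual_values)
    )
    return matches / len(qca_values)


# Hypothetical lesions: reference QCA values and visual estimates.
qca = [45, 60, 65, 80]
visual = [55, 65, 85, 90]
print(strata_agreement(qca, visual))
```

In this made-up example two of the four lesions land in the same stratum, so the agreement is one half.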

2.
Proc Natl Acad Sci U S A ; 119(23): e2205971119, 2022 06 07.
Article in English | MEDLINE | ID: mdl-35609191
3.
ArXiv ; 2024 Apr 11.
Article in English | MEDLINE | ID: mdl-38903738

ABSTRACT

Whole Slide Images (WSI), obtained by high-resolution digital scanning of microscope slides at multiple scales, are the cornerstone of modern Digital Pathology. However, they represent a particular challenge to AI-based/AI-mediated analysis because pathology labeling is typically done at slide-level, instead of tile-level. It is not just that medical diagnoses are recorded at the specimen level; the detection of oncogene mutations is also experimentally obtained, and recorded by initiatives like The Cancer Genome Atlas (TCGA), at the slide level. This configures a dual challenge: a) accurately predicting the overall cancer phenotype and b) finding out what cellular morphologies are associated with it at the tile level. To address these challenges, a weakly supervised Multiple Instance Learning (MIL) approach was explored for two prevalent cancer types, Invasive Breast Carcinoma (TCGA-BRCA) and Lung Squamous Cell Carcinoma (TCGA-LUSC). This approach was explored for tumor detection at low magnification levels and TP53 mutations at various levels. Our results show that a novel additive implementation of MIL matched the performance of the reference implementation (AUC 0.96), and was only slightly outperformed by Attention MIL (AUC 0.97). More interestingly from the perspective of the molecular pathologist, these different AI architectures identify distinct sensitivities to morphological features (through the detection of Regions of Interest, RoI) at different magnification levels. Tellingly, TP53 mutation was most sensitive to features at the higher magnifications where cellular morphology is resolved.
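
For readers unfamiliar with attention-based MIL pooling, the following is a minimal sketch of the general idea, not the implementation evaluated in the abstract above: each tile gets an attention weight through a softmax, and the slide-level score is the weighted sum of the tile scores. All names and inputs are hypothetical; in a real model the tile scores and attention logits would be produced by trained networks.

```python
import math


def attention_pool(tile_scores, attention_logits):
    """Combine tile-level scores into one slide-level score.

    Weights are a softmax over the attention logits, so they are
    non-negative and sum to 1; tiles with large weights are the
    candidate regions of interest.
    """
    peak = max(attention_logits)  # subtract the max for numerical stability
    exps = [math.exp(a - peak) for a in attention_logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    slide_score = sum(w * s for w, s in zip(weights, tile_scores))
    return slide_score, weights


# Hypothetical slide with three tiles; the second tile dominates.
score, weights = attention_pool([0.1, 0.9, 0.2], [0.0, 3.0, 0.0])
print(score, weights)
```

Inspecting `weights` is what makes this family of models attractive for locating regions of interest: the slide label is the only supervision, yet each tile receives an explicit contribution.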

4.
J Invasive Cardiol ; 36(3)2024 Mar.
Article in English | MEDLINE | ID: mdl-38441988

ABSTRACT

OBJECTIVES: Coronary angiography (CAG)-derived physiology methods have been developed in an attempt to simplify and increase the usage of coronary physiology, based mostly on computational fluid dynamics algorithms. We aimed to develop a different approach based on artificial intelligence methods, which has seldom been explored. METHODS: Consecutive patients undergoing invasive instantaneous wave-free ratio (iFR) measurements were included. We developed artificial intelligence (AI) models capable of classifying target lesions as positive (iFR ≤ 0.89) or negative (iFR > 0.89). The predictions were then compared to the true measurements. RESULTS: Two hundred fifty measurements were included, and 3 models were developed. Model 3 had the best overall performance: accuracy, negative predictive value (NPV), positive predictive value (PPV), sensitivity, and specificity were 69%, 88%, 44%, 74%, and 67%, respectively. Performance differed per target vessel. For the left anterior descending artery (LAD), model 3 had the highest accuracy (66%), while model 2 had the highest NPV (86%) and sensitivity (91%). PPV was always low/modest. Model 1 had the highest specificity (68%). For the right coronary artery, model 1's accuracy was 86%, NPV was 97%, and specificity was 87%, but all models had low PPV (maximum 25%) and low/modest sensitivity (maximum 60%). For the circumflex, model 1 performed best: accuracy, NPV, PPV, sensitivity, and specificity were 69%, 96%, 24%, 80%, and 68%, respectively. CONCLUSIONS: We developed 3 AI models capable of binary iFR estimation from CAG images. Despite modest accuracy, the consistently high NPV is of potential clinical significance, as it would enable avoiding further invasive maneuvers after CAG. This pivotal study offers proof of concept for further development.
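
The five performance figures quoted in the abstract above all derive from a 2x2 confusion matrix. The sketch below shows the standard definitions; the function name and the counts are hypothetical and are not the study's data. It also shows how a high NPV can coexist with a modest PPV when positive lesions are the minority.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # positives correctly detected
        "specificity": tn / (tn + fp),  # negatives correctly detected
        "ppv": tp / (tp + fp),          # positive calls that are right
        "npv": tn / (tn + fn),          # negative calls that are right
    }


# Hypothetical counts: 40 truly positive lesions out of 200.
metrics = diagnostic_metrics(tp=30, fp=40, tn=120, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts sensitivity and specificity are both 0.75, yet the PPV is only about 0.43 while the NPV is about 0.92, because negatives outnumber positives four to one.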


Subject(s)
Artificial Intelligence , Deep Learning , Humans , Pilot Projects , X-Rays , Coronary Angiography
5.
China CDC Wkly ; 6(21): 478-486, 2024 May 24.
Article in English | MEDLINE | ID: mdl-38854463

ABSTRACT

Background: This study provides a detailed analysis of the daily fluctuations in coronavirus disease 2019 (COVID-19) case numbers in London from January 31, 2020 to February 24, 2022. The primary objective was to enhance understanding of the interactions among government pandemic responses, viral mutations, and the subsequent changes in COVID-19 case incidences. Methods: We employed the adaptive Fourier decomposition (AFD) method to analyze diurnal changes and further segmented the AFD into novel multi-component groups consisting of one to three elements. These restructured components were rigorously evaluated using Pearson correlation, and their effectiveness was compared with other signal analysis techniques. This study introduced a novel approach to differentiate individual components across various time-frequency scales using basis decomposition methods. Results: Analysis of London's daily COVID-19 data using AFD revealed a strong correlation between the "stay at home" directive and high-frequency components during the first epidemic wave. This indicates the need for sustained implementation of vaccination policies to maintain their effectiveness. Discussion: The AFD component method provides a comprehensive analysis of the immediate and prolonged impact of governmental policies on the spread of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). This robust tool has proven invaluable for analyzing COVID-19 pandemic data, offering critical insights that guide the formulation of future preventive and public health strategies.

6.
Sci Rep ; 13(1): 467, 2023 01 10.
Article in English | MEDLINE | ID: mdl-36627317

ABSTRACT

Given the inherent complexity of the human nervous system, insight into the dynamics of brain activity can be gained from studying smaller and simpler organisms. While some of the potential target organisms are simple enough that their behavioural and structural biology might be well-known and understood, others might still lead to computationally intractable models that require extensive resources to simulate. Since such organisms are frequently only acting as proxies to further our understanding of underlying phenomena or functionality, often one is not interested in the detailed evolution of every single neuron in the system. Instead, it is sufficient to observe the subset of neurons that capture the effect that the profound nonlinearities of the neuronal system have in response to different stimuli. In this paper, we consider the well-known nematode Caenorhabditis elegans and seek to investigate the possibility of generating lower complexity models that capture the system's dynamics with low error using only measured or simulated input-output information. Such models are often termed black-box models. We show how the nervous system of C. elegans can be modelled and simulated with data-driven models using different neural network architectures. Specifically, we target the use of state-of-the-art recurrent neural network architectures such as Long Short-Term Memory and Gated Recurrent Units and compare these architectures in terms of their properties and their accuracy (Root Mean Square Error), as well as the complexity of the resulting models. We show that Gated Recurrent Unit models with a hidden layer size of 4 are able to accurately reproduce the system response to very different stimuli. We furthermore explore the relative importance of their inputs as well as scalability to more scenarios.


Subject(s)
Caenorhabditis elegans , Nervous System Physiological Phenomena , Animals , Humans , Caenorhabditis elegans/physiology , Neural Networks, Computer , Neurons/physiology , Learning
7.
Diagnostics (Basel) ; 13(24)2023 Dec 05.
Article in English | MEDLINE | ID: mdl-38132189

ABSTRACT

Accurately predicting functional outcomes in stroke patients remains challenging yet clinically relevant. While brain CTs provide prognostic information, their practical value for outcome prediction is unclear. We analyzed a multi-center cohort of 743 ischemic stroke patients (<72 h onset), including their admission brain NCCT and CTA scans as well as their clinical data. Our goal was to predict the patients' future functional outcome, measured by the 3-month post-stroke modified Rankin Scale (mRS), dichotomized into good (mRS ≤ 2) and poor (mRS > 2). To this end, we developed deep learning models to predict the outcome from CT data only, and models that incorporate other patient variables. Three deep learning architectures were tested in the image-only prediction, achieving 0.779 ± 0.005 AUC. In addition, we created a model fusing imaging and tabular data by feeding the output of a deep learning model trained to detect occlusions on CT angiograms into our prediction framework, which achieved an AUC of 0.806 ± 0.082. These findings highlight how further refinement of prognostic models incorporating both image biomarkers and clinical data could enable more accurate outcome prediction for ischemic stroke patients.

8.
Rev Port Cardiol ; 42(7): 643-651, 2023 07.
Article in English, Portuguese | MEDLINE | ID: mdl-37001583

ABSTRACT

INTRODUCTION: Pulmonary embolism (PE) is a life-threatening condition, in which diagnostic uncertainty remains high given the lack of specificity in clinical presentation. It requires confirmation by computed tomography pulmonary angiography (CTPA). Electrocardiography (ECG) signals can be detected by artificial intelligence (AI) with precision. The purpose of this study was to develop an AI model for predicting PE using a 12-lead ECG. METHODS: We extracted 1014 ECGs from patients admitted to the emergency department who underwent CTPA due to suspected PE: 911 ECGs were used for development of the AI model and 103 ECGs for validation. An AI algorithm based on an ensemble neural network was developed. The performance of the AI model was compared against the guideline-recommended clinical prediction rules for PE (Wells and Geneva scores combined with a standard D-dimer cut-off of 500 ng/mL and an age-adjusted cut-off, PEGeD and YEARS algorithm). RESULTS: The AI model achieved greater specificity to detect PE than the commonly used clinical prediction rules. The AI model showed a specificity of 100% (95% confidence interval (CI): 94-100) and a sensitivity of 50% (95% CI: 33-67). The AI model performed significantly better than the other models (area under the curve 0.75; 95% CI 0.66-0.82; p<0.001), which had nearly no discriminative power. The incidence of typical PE ECG features was similar in patients with and without PE. CONCLUSION: We developed and validated a deep learning-based AI model for PE diagnosis using a 12-lead ECG and it demonstrated high specificity.


Subject(s)
Artificial Intelligence , Pulmonary Embolism , Humans , Pulmonary Embolism/diagnosis , Machine Learning , Electrocardiography/methods , Retrospective Studies
9.
Int J Cardiovasc Imaging ; 39(7): 1385-1396, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37027105

ABSTRACT

INTRODUCTION: We previously developed an artificial intelligence (AI) model for automatic coronary angiography (CAG) segmentation, using deep learning. To validate this approach, the model was applied to a new dataset and results are reported. METHODS: Retrospective selection of patients undergoing CAG and percutaneous coronary intervention or invasive physiology assessment over a one-month period from four centers. A single frame was selected from images containing a lesion with a 50-99% stenosis (visual estimation). Automatic Quantitative Coronary Analysis (QCA) was performed with validated software. Images were then segmented by the AI model. Lesion diameters, area overlap [based on true positive (TP) and true negative (TN) pixels] and a global segmentation score (GSS: 0-100 points), previously developed and published, were measured. RESULTS: 123 regions of interest from 117 images across 90 patients were included. There were no significant differences between lesion diameter, percentage diameter stenosis and distal border diameter between the original/segmented images. There was a statistically significant albeit minor difference [0.19 mm (0.09-0.28)] regarding proximal border diameter. Overlap accuracy ((TP + TN)/(TP + TN + FP + FN)), sensitivity (TP / (TP + FN)) and Dice Score (2TP / (2TP + FN + FP)) between original/segmented images were 99.9%, 95.1% and 94.8%, respectively. The GSS was 92 (87-96), similar to the previously obtained value in the training dataset. CONCLUSION: The AI model was capable of accurate CAG segmentation across multiple performance metrics, when applied to a multicentric validation dataset. This paves the way for future research on its clinical uses.
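
The three overlap formulas given in the abstract above translate directly into code. The sketch below evaluates them on two tiny binary masks; the function name and the masks are hypothetical and serve only to show how the pixel counts feed the formulas.

```python
def overlap_metrics(reference, predicted):
    """Pixel-wise accuracy, sensitivity and Dice score for two binary masks.

    Both masks are flat sequences of 0/1 values of equal length.
    """
    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r == 1 and p == 1)
    tn = sum(1 for r, p in pairs if r == 0 and p == 0)
    fp = sum(1 for r, p in pairs if r == 0 and p == 1)
    fn = sum(1 for r, p in pairs if r == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "dice": 2 * tp / (2 * tp + fn + fp),
    }


# Hypothetical 10-pixel masks: 4 vessel pixels in the reference.
reference = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
print(overlap_metrics(reference, predicted))
```

Accuracy counts background pixels as successes, which is why it tends to sit well above the Dice score when vessels occupy a small share of the frame.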


Subject(s)
Coronary Stenosis , Deep Learning , Humans , Coronary Stenosis/diagnostic imaging , Coronary Stenosis/therapy , Artificial Intelligence , Constriction, Pathologic , Retrospective Studies , X-Rays , Predictive Value of Tests , Coronary Angiography/methods
10.
Front Public Health ; 11: 1259084, 2023.
Article in English | MEDLINE | ID: mdl-38106897

ABSTRACT

Background: As China amends its "zero COVID" strategy, a sudden increase in the number of infections may overwhelm medical resources and its impact has not been quantified. Specific mitigation strategies are needed to minimize disruption to the healthcare system and to prepare for the next possible epidemic in advance. Method: We develop a stochastic compartmental model to project the burden on the medical system (that is, the number of fever clinic visits and admission beds) of China after adjustment to COVID-19 policy, which considers the epidemiological characteristics of the Omicron variant, age composition of the population, and vaccine effectiveness against infection and severe COVID-19. We also estimate the effect of four-dose vaccinations (heterologous and homologous), antipyretic drug supply, non-pharmacological interventions (NPIs), and triage treatment on mitigating the domestic infection peak. Result: As to the impact on the medical system, this epidemic is projected to result in 398.02 million fever clinic visits and 16.58 million hospitalizations, and the disruption period on the healthcare system is 18 and 30 days, respectively. Antipyretic drug supply and booster vaccination could reduce the burden on emergency visits and hospitalization, respectively, but neither could bring the burden down to current capacity. The synergy of several different strategies suggests that increasing the heterologous booster vaccination rate for older adults to over 90% is a key measure to alleviate the bed burden for respiratory diseases on the basis of expanded healthcare resource allocation. Conclusion: The Omicron epidemic that followed the adjustment to COVID-19 policy overloaded many local health systems across the country at the end of 2022. The combined effect of vaccination, antipyretic drug supply, triage treatment, and PHSMs could prevent overwhelming medical resources.
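
As a rough illustration of what a stochastic compartmental model is, the sketch below implements a discrete-time chain-binomial SIR model. It is far simpler than the model described in the abstract above (no age structure, vaccination, or healthcare compartments), and every name and parameter value is an assumption chosen for the example.

```python
import math
import random


def stochastic_sir(s, i, r, beta, gamma, days, seed=0):
    """Simulate daily S, I, R counts with random transitions.

    beta is the daily transmission rate, gamma the daily recovery rate.
    Each susceptible is infected, and each infectious person recovers,
    with a per-day probability derived from those rates.
    """
    rng = random.Random(seed)
    n = s + i + r
    p_recover = 1 - math.exp(-gamma)
    history = [(s, i, r)]
    for _ in range(days):
        p_infect = 1 - math.exp(-beta * i / n)
        new_infections = sum(rng.random() < p_infect for _ in range(s))
        new_recoveries = sum(rng.random() < p_recover for _ in range(i))
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


# Hypothetical run: 1000 people, 10 initially infectious.
trajectory = stochastic_sir(990, 10, 0, beta=0.3, gamma=0.1, days=30, seed=1)
peak_day = max(range(len(trajectory)), key=lambda d: trajectory[d][1])
print("peak infectious:", trajectory[peak_day][1], "on day", peak_day)
```

Running the simulation with many different seeds gives a distribution of peak sizes and timings, which is the kind of output a burden projection would summarize.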


Subject(s)
Antipyretics , COVID-19 , Humans , Aged , Antipyretics/therapeutic use , COVID-19/epidemiology , COVID-19/prevention & control , SARS-CoV-2 , China/epidemiology , Fever , Policy
11.
Bioinformatics ; 27(22): 3149-57, 2011 Nov 15.
Article in English | MEDLINE | ID: mdl-21965816

ABSTRACT

MOTIVATION: Uncovering mechanisms underlying gene expression control is crucial to understand complex cellular responses. Studies in gene regulation often aim to identify regulatory players involved in a biological process of interest, either transcription factors coregulating a set of target genes or genes eventually controlled by a set of regulators. These are frequently prioritized with respect to a context-specific relevance score. Current approaches rely on relevance measures accounting exclusively for direct transcription factor-target interactions, namely overrepresentation of binding sites or target ratios. Gene regulation has, however, intricate behavior with overlapping, indirect effects that should not be neglected. In addition, the rapid accumulation of regulatory data already enables the prediction of large-scale networks suitable for higher level exploration by methods based on graph theory. A paradigm shift is thus emerging, where isolated and constrained analyses will likely be replaced by whole-network, systemic-aware strategies. RESULTS: We present TFRank, a graph-based framework to prioritize regulatory players involved in transcriptional responses within the regulatory network of an organism, whereby every regulatory path containing genes of interest is explored and incorporated into the analysis. TFRank selected important regulators of yeast adaptation to stress induced by quinine and acetic acid, which were missed by a direct effect approach. Notably, they reportedly confer resistance toward the chemicals. In a preliminary study in humans, TFRank unveiled regulators involved in breast tumor growth and metastasis when applied to genes whose expression signatures correlated with short interval to metastasis.


Subject(s)
Gene Expression Regulation , Gene Regulatory Networks , Transcription Factors/metabolism , Transcription, Genetic , Acetic Acid/pharmacology , Binding Sites , Humans , Neoplasm Metastasis , Quinine/pharmacology , Saccharomyces cerevisiae/drug effects , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae Proteins/metabolism , Transcription, Genetic/drug effects
12.
Rev Port Cardiol ; 41(12): 1011-1021, 2022 12.
Article in English, Portuguese | MEDLINE | ID: mdl-36511271

ABSTRACT

INTRODUCTION AND OBJECTIVES: Although automatic artificial intelligence (AI) coronary angiography (CAG) segmentation is arguably the first step toward future clinical application, it is underexplored. We aimed to (1) develop AI models for CAG segmentation and (2) assess the results using similarity scores and a set of criteria defined by expert physicians. METHODS: Patients undergoing CAG were randomly selected in a retrospective study at a single center. Per incidence, an ideal frame was segmented, forming a baseline human dataset (BH), used for training a baseline AI model (BAI). Enhanced human segmentation (EH) was created by combining the best of both. An enhanced AI model (EAI) was trained using the EH. Results were assessed by experts using 11 weighted criteria, combined into a Global Segmentation Score (GSS: 0-100 points). Generalized Dice Score (GDS) and Dice Similarity Coefficient (DSC) were also used for AI model assessment. RESULTS: 1664 processed images were generated. GSS for BH, EH, BAI and EAI were 96.9±5.7; 98.9±3.1; 86.1±10.1 and 90±7.6, respectively (95% confidence interval, p<0.001 for both paired and global differences). The GDS for the BAI and EAI was 0.9234±0.0361 and 0.9348±0.0284, respectively. The DSC for the coronary tree was 0.8904±0.0464 and 0.9134±0.0410 for the BAI and EAI, respectively. The EAI outperformed the BAI in all coronary segmentation tasks, but performed less well in some catheter segmentation tasks. CONCLUSIONS: We successfully developed AI models capable of CAG segmentation, with good performance as assessed by all scores.


Subject(s)
Deep Learning , Humans , Tomography, X-Ray Computed , Artificial Intelligence , Retrospective Studies , X-Rays , Coronary Angiography
13.
BMC Bioinformatics ; 12: 163, 2011 May 16.
Article in English | MEDLINE | ID: mdl-21672185

ABSTRACT

BACKGROUND: Over the past few years, new massively parallel DNA sequencing technologies have emerged. These platforms generate massive amounts of data per run, greatly reducing the cost of DNA sequencing. However, these techniques also raise important computational difficulties mostly due to the huge volume of data produced, but also because of some of their specific characteristics such as read length and sequencing errors. Among the most critical problems is that of efficiently and accurately mapping reads to a reference genome in the context of re-sequencing projects. RESULTS: We present an efficient method for the local alignment of pyrosequencing reads produced by the GS FLX (454) system against a reference sequence. Our approach explores the characteristics of the data in these re-sequencing applications and uses state of the art indexing techniques combined with a flexible seed-based approach, leading to a fast and accurate algorithm which needs very little user parameterization. An evaluation performed using real and simulated data shows that our proposed method outperforms a number of mainstream tools on the quantity and quality of successful alignments, as well as on the execution time. CONCLUSIONS: The proposed methodology was implemented in a software tool called TAPyR--Tool for the Alignment of Pyrosequencing Reads--which is publicly available from http://www.tapyr.net.


Subject(s)
Sequence Analysis, DNA/methods , Algorithms , Animals , Base Sequence , High-Throughput Nucleotide Sequencing , Humans , Sequence Alignment , Software
14.
Nucleic Acids Res ; 36(Database issue): D132-6, 2008 Jan.
Article in English | MEDLINE | ID: mdl-18032429

ABSTRACT

The Yeast search for transcriptional regulators and consensus tracking (YEASTRACT) information system (www.yeastract.com) was developed to support the analysis of transcription regulatory associations in Saccharomyces cerevisiae. Last updated in September 2007, this database contains over 30 990 regulatory associations between Transcription Factors (TFs) and target genes and includes 284 specific DNA binding sites for 108 characterized TFs. Computational tools are also provided to facilitate the exploitation of the gathered data when solving a number of biological questions, in particular the ones that involve the analysis of global gene expression results. In this new release, YEASTRACT includes DISCOVERER, a set of computational tools that can be used to identify complex motifs over-represented in the promoter regions of co-regulated genes. The motifs identified are then clustered in families, represented by a position weight matrix and are automatically compared with the known transcription factor binding sites described in YEASTRACT. Additionally, in this new release, it is possible to generate graphic depictions of transcriptional regulatory networks for documented or potential regulatory associations between TFs and target genes. The visual display of these networks of interactions is instrumental in functional studies. Tutorials are available on the system to exemplify the use of all the available tools.


Subject(s)
Databases, Nucleic Acid , Gene Regulatory Networks , Promoter Regions, Genetic , Saccharomyces cerevisiae/genetics , Transcription Factors/metabolism , Binding Sites , Gene Expression Regulation, Fungal , Internet , Software
15.
J Bioinform Comput Biol ; 7(1): 55-74, 2009 Feb.
Article in English | MEDLINE | ID: mdl-19226660

ABSTRACT

Applications for the manipulation of molecular structures are usually computationally intensive. Problems like protein docking or ab-initio protein folding need to frequently determine if two atoms in the structure collide. Therefore, an efficient algorithm for this problem, usually referred to as clash detection, can greatly improve the application efficiency. This work focuses mainly on the ab-initio protein folding problem. A naive approach for the clash problem, the most commonly used by molecular structure programs, consists in calculating the distance between every pair of atoms. We propose an efficient data structure that uses a three-dimensional array to store the atoms' positions. We compare the proposed data structure with one of the best known general data structures for this type of problem (SAT tree) and with the naive approach. While the naive approach takes time linear in the number of atoms to verify if a new atom clashes with any previously set atoms, the proposed data structure takes constant time to perform the same verification. The SAT tree takes logarithmic time for the same task. The results show that the proposed data structure surpasses the other techniques for any protein size. The proposed data structure takes nearly half of the time of the SAT data structure and close to a fifth of the time of the naive approach for the larger proteins. We believe that this data structure can improve the existing molecular structure applications by decreasing the computational cost needed for clash detection. The data structure presented in this work can be used for any protein structure clash verification, as long as the atoms that need to be checked are kept in the 3D array. This data structure is particularly useful when manipulating large sets of atoms, for example, in applications like loop prediction, structure refinement of large proteins, and protein docking.
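
The grid idea behind constant-time clash checks can be sketched as follows. This is an illustration under stated assumptions, not the authors' data structure: a dictionary keyed by integer cell coordinates stands in for the three-dimensional array, the cell edge equals the clash distance, and the class and method names are invented for the example.

```python
import math
from collections import defaultdict


class ClashGrid:
    """Uniform spatial grid for checking whether a new atom clashes."""

    def __init__(self, min_dist):
        self.min_dist = min_dist        # atoms closer than this clash
        self.cells = defaultdict(list)  # (ix, iy, iz) -> atom positions

    def _cell(self, pos):
        # Cell edge equals min_dist, so any clashing atom must lie in
        # the same cell or in one of its 26 neighbours.
        return tuple(math.floor(c / self.min_dist) for c in pos)

    def clashes(self, pos):
        cx, cy, cz = self._cell(pos)
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for dz in (-1, 0, 1):
                    neighbours = self.cells.get((cx + dx, cy + dy, cz + dz), ())
                    for other in neighbours:
                        if math.dist(pos, other) < self.min_dist:
                            return True
        return False

    def add(self, pos):
        self.cells[self._cell(pos)].append(pos)


# Hypothetical usage with a 2.0 distance-unit clash threshold.
grid = ClashGrid(min_dist=2.0)
grid.add((0.0, 0.0, 0.0))
print(grid.clashes((1.0, 0.0, 0.0)), grid.clashes((3.0, 0.0, 0.0)))
```

The check inspects a fixed 27 cells, and because stored atoms do not clash with one another, each cell can hold only a bounded number of them, so the cost does not grow with the total number of atoms. The naive alternative compares the new atom against every stored atom.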


Subject(s)
Crystallography/methods , Models, Chemical , Models, Molecular , Protein Folding , Proteins/chemistry , Proteins/ultrastructure , Computer Simulation , Protein Conformation , Time Factors
16.
Biotechnol J ; 14(8): e1800613, 2019 Aug.
Article in English | MEDLINE | ID: mdl-30927505

ABSTRACT

Developments in biotechnology are increasingly dependent on the extensive use of big data, generated by modern high-throughput instrumentation technologies, and stored in thousands of databases, public and private. Future developments in this area depend, critically, on the ability of biotechnology researchers to master the skills required to effectively integrate their own contributions with the large amounts of information available in these databases. This article offers a perspective of the relations that exist between the fields of big data and biotechnology, including the related technologies of artificial intelligence and machine learning and describes how data integration, data exploitation, and process optimization correspond to three essential steps in any future biotechnology project. The article also lists a number of application areas where the ability to use big data will become a key factor, including drug discovery, drug recycling, drug safety, functional and structural genomics, proteomics, pharmacogenetics, and pharmacogenomics, among others.


Subject(s)
Artificial Intelligence , Big Data , Biotechnology/methods , Animals , Data Mining , Databases, Factual , Humans , Machine Learning
18.
BMC Bioinformatics ; 9: 89, 2008 Feb 07.
Article in English | MEDLINE | ID: mdl-18257925

ABSTRACT

BACKGROUND: Motif finding algorithms have developed in their ability to use computationally efficient methods to detect patterns in biological sequences. However, the posterior classification of the output still suffers from some limitations, which makes it difficult to assess the biological significance of the motifs found. Previous work has highlighted the existence of positional bias of motifs in the DNA sequences, which might indicate not only that the pattern is important, but also provide hints of the positions where these patterns occur preferentially. RESULTS: We propose to integrate position uniformity tests and over-representation tests to improve the accuracy of the classification of motifs. Using artificial data, we have compared three different statistical tests (Chi-Square, Kolmogorov-Smirnov and a Chi-Square bootstrap) to assess whether a given motif occurs uniformly in the promoter region of a gene. Using the test that performed best in this dataset, we proceeded to study the positional distribution of several well known cis-regulatory elements, in the promoter sequences of different organisms (S. cerevisiae, H. sapiens, D. melanogaster, E. coli and several dicotyledonous plants). The results show that position conservation is relevant for the transcriptional machinery. CONCLUSION: We conclude that many biologically relevant motifs appear heterogeneously distributed in the promoter region of genes, and therefore, that non-uniformity is a good indicator of biological relevance and can be used to complement over-representation tests commonly used. In this article we present the results obtained for the S. cerevisiae data sets.
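
A Chi-Square position uniformity test of the kind named in the abstract above can be sketched in a few lines. The binning scheme, function name, and motif positions below are assumptions made for illustration, and the sketch does not reproduce the paper's bootstrap variant.

```python
def chi_square_uniformity(positions, region_length, n_bins):
    """Chi-square statistic for motif positions against a uniform spread.

    positions are 0-based offsets inside a promoter region of
    region_length bases, split into n_bins equal-width bins.
    """
    counts = [0] * n_bins
    for pos in positions:
        index = min(pos * n_bins // region_length, n_bins - 1)
        counts[index] += 1
    expected = len(positions) / n_bins
    statistic = sum((c - expected) ** 2 / expected for c in counts)
    return statistic, counts


# Hypothetical motif hits clustered near the start of a 1000 bp promoter.
hits = [10, 20, 30, 40, 50, 60, 70, 80, 500, 900]
statistic, counts = chi_square_uniformity(hits, region_length=1000, n_bins=4)
print(statistic, counts)
```

With 4 bins there are 3 degrees of freedom, for which the 5% critical value of the chi-square distribution is about 7.81; a statistic above that threshold argues against a uniform positional distribution. With so few hits the expected count per bin is small, which is one reason a bootstrap version of the test can be preferable.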


Subject(s)
Algorithms , DNA/genetics , Models, Genetic , Promoter Regions, Genetic/genetics , Sequence Analysis, DNA/methods , Base Sequence , Computer Simulation , Models, Statistical , Molecular Sequence Data , Reproducibility of Results , Sensitivity and Specificity , Statistical Distributions
19.
Nucleic Acids Res ; 34(Database issue): D446-51, 2006 Jan 01.
Article in English | MEDLINE | ID: mdl-16381908

ABSTRACT

We present the YEAst Search for Transcriptional Regulators And Consensus Tracking (YEASTRACT; www.yeastract.com) database, a tool for the analysis of transcription regulatory associations in Saccharomyces cerevisiae. This database is a repository of 12 346 regulatory associations between transcription factors and target genes, based on experimental evidence which was spread throughout 861 bibliographic references. It also includes 257 specific DNA-binding sites for more than a hundred characterized transcription factors. Further information about each yeast gene included in the database was obtained from Saccharomyces Genome Database (SGD), Regulatory Sequences Analysis Tools and Gene Ontology (GO) Consortium. Computational tools are also provided to facilitate the exploitation of the gathered data when solving a number of biological questions as exemplified in the Tutorial also available on the system. YEASTRACT allows the identification of documented or potential transcription regulators of a given gene and of documented or potential regulons for each transcription factor. It also renders possible the comparison between DNA motifs, such as those found to be over-represented in the promoter regions of co-regulated genes, and the transcription factor-binding sites described in the literature. The system also provides a useful mechanism for grouping a list of genes (for instance a set of genes with similar expression profiles as revealed by microarray analysis) based on their regulatory associations with known transcription factors.


Subject(s)
Databases, Genetic , Gene Expression Regulation, Fungal , Promoter Regions, Genetic , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae/genetics , Transcription Factors/metabolism , Binding Sites , Computational Biology , DNA, Fungal/chemistry , DNA, Fungal/metabolism , Internet , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/metabolism , Software , Transcription, Genetic , User-Computer Interface
20.
IEEE/ACM Trans Comput Biol Bioinform ; 15(6): 1953-1959, 2018.
Article in English | MEDLINE | ID: mdl-29994736

ABSTRACT

Ischemic stroke is a leading cause of disability and death worldwide among adults. The individual prognosis after stroke is extremely dependent on treatment decisions physicians take during the acute phase. In the last five years, several scores such as the ASTRAL, DRAGON, and THRIVE have been proposed as tools to help physicians predict the patient functional outcome after a stroke. These scores are rule-based classifiers that use features available when the patient is admitted to the emergency room. In this paper, we apply machine learning techniques to the problem of predicting the functional outcome of ischemic stroke patients, three months after admission. We show that a pure machine learning approach achieves an Area Under the ROC Curve (AUC) (0.808±0.085) only marginally higher than that of the best score (0.771±0.056) when using the features available at admission. However, we observed that by progressively adding features available at further points in time, we can significantly increase the AUC to a value above 0.90. We conclude that the results obtained validate the use of the scores at the time of admission, but also point to the importance of using more features, which require more advanced methods, when possible.
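
The AUC used to compare models in the abstract above has a simple interpretation: it is the probability that a randomly chosen patient with the outcome receives a higher predicted score than a randomly chosen patient without it, with ties counted as one half. The sketch below computes it directly from that definition; the function name and the sample data are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels are 0/1 outcomes, scores are predicted risks. Quadratic
    in the number of patients, which is fine for small examples.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical outcomes (1 = poor outcome) and predicted risks.
labels = [0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80]
print(roc_auc(labels, scores))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so the gap between 0.771 and 0.808 reported above is small relative to the quoted standard deviations.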


Subject(s)
Brain Ischemia , Diagnosis, Computer-Assisted/methods , Machine Learning , Algorithms , Area Under Curve , Brain Ischemia/diagnosis , Brain Ischemia/epidemiology , Brain Ischemia/physiopathology , Brain Ischemia/therapy , Humans , Treatment Outcome