Results 1 - 16 of 16
1.
Front Oncol ; 13: 1242639, 2023.
Article in English | MEDLINE | ID: mdl-37869094

ABSTRACT

Introduction: Prostate cancer (PCa) is the most frequent tumor among men in Europe and has both indolent and aggressive forms. There are several treatment options, the choice of which depends on multiple factors. To further improve current prognostication models, we established the Turin Prostate Cancer Prognostication (TPCP) cohort, an Italian retrospective biopsy cohort of patients with PCa and long-term follow-up. This work presents this new cohort with its main characteristics and the distributions of some of its core variables, along with its potential contributions to PCa research. Methods: The TPCP cohort includes consecutive non-metastatic patients with first positive biopsy for PCa performed between 2008 and 2013 at the main hospital in Turin, Italy. The follow-up ended on December 31st 2021. The primary outcome is the occurrence of metastasis; death from PCa and overall mortality are the secondary outcomes. In addition to numerous clinical variables, the study's prognostic variables include histopathologic information assigned by a centralized uropathology review using a digital pathology software system specialized for the study of PCa, tumor DNA methylation in candidate genes, and features extracted from digitized slide images via Deep Neural Networks. Results: The cohort includes 891 patients followed up for a median time of 10 years. During this period, 97 patients had progression to metastatic disease and 301 died; of these, 56 died from PCa. In total, 65.3% of the cohort has a Gleason score less than or equal to 3 + 4, and 44.5% has a clinical stage cT1. Consistent with previous studies, age and clinical stage at diagnosis are important prognostic factors: the crude cumulative incidence of metastatic disease during the 14 years of follow-up increases from 9.1% among patients younger than 64 to 16.2% for patients in the age group of 75-84, and from 6.1% for cT1 stage to 27.9% in cT3 stage. Discussion: This study stands to be an important resource for updating existing prognostic models for PCa on an Italian cohort. In addition, the integrated collection of multi-modal data will allow development and/or validation of new models including new histopathological, digital, and molecular markers, with the goal of better directing clinical decisions to manage patients with PCa.
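
As a reading aid, the event counts reported above can be turned into simple proportions. This is only a minimal sketch: the helper name `percent` is hypothetical, and the resulting figures are plain proportions of the cohort, not the crude cumulative incidence estimates quoted in the abstract, which depend on follow-up time.

```python
# Counts reported in the abstract above.
cohort_size = 891
metastasis = 97
deaths_all = 301
deaths_pca = 56


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Share of the whole cohort with each outcome (simple proportions, no censoring).
pct_metastasis = percent(metastasis, cohort_size)
pct_death = percent(deaths_all, cohort_size)
# Share of all deaths that were attributed to prostate cancer.
pct_pca_among_deaths = percent(deaths_pca, deaths_all)

print(pct_metastasis, pct_death, pct_pca_among_deaths)
```

Under these assumptions, roughly one patient in nine progressed to metastatic disease, and fewer than one in five of the recorded deaths were due to PCa.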

2.
Nat Commun ; 14(1): 2577, 2023 05 04.
Article in English | MEDLINE | ID: mdl-37142591

ABSTRACT

Access to large volumes of so-called whole-slide images (high-resolution scans of complete pathological slides) has become a cornerstone of the development of novel artificial intelligence methods in pathology for diagnostic use, education/training of pathologists, and research. Nevertheless, a methodology based on risk analysis for evaluating the privacy risks associated with sharing such imaging data and applying the principle "as open as possible and as closed as necessary" is still lacking. In this article, we develop a model for privacy risk analysis for whole-slide images which focuses primarily on identity disclosure attacks, as these are the most important from a regulatory perspective. We introduce a taxonomy of whole-slide images with respect to privacy risks and a mathematical model for risk assessment and design. Based on this risk assessment model and the taxonomy, we conduct a series of experiments to demonstrate the risks using real-world imaging data. Finally, we develop guidelines for risk assessment and recommendations for low-risk sharing of whole-slide image data.


Subjects
Artificial Intelligence; Privacy; Image Processing, Computer-Assisted/methods; Diagnostic Imaging/methods
3.
Gigascience ; 10(9), 2021 09 16.
Article in English | MEDLINE | ID: mdl-34528664

ABSTRACT

BACKGROUND: The Investigation/Study/Assay (ISA) Metadata Framework is an established and widely used set of open source community specifications and software tools for enabling discovery, exchange, and publication of metadata from experiments in the life sciences. The original ISA software suite provided a set of user-facing Java tools for creating and manipulating the information structured in ISA-Tab, a now widely used tabular format. To make the ISA framework more accessible to machines and enable programmatic manipulation of experiment metadata, the JSON serialization ISA-JSON was developed. RESULTS: In this work, we present the ISA API, a Python library for the creation, editing, parsing, and validation of ISA-Tab and ISA-JSON formats by using a common data model engineered as Python object classes. We describe the ISA API feature set, early adopters, and its growing user community. CONCLUSIONS: The ISA API provides users with rich programmatic metadata-handling functionality to support automation, a common interface, and an interoperable medium between the 2 ISA formats, as well as with other life science data formats required for depositing data in public databases.
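
The idea of one object model feeding both a tabular and a JSON serialization can be sketched with the Python standard library alone. This is not the ISA API: the classes `Investigation`, `Study`, and `Assay` and the functions `to_json` and `to_table` below are hypothetical stand-ins that loosely mirror the Investigation/Study/Assay hierarchy, and the real ISA-Tab and ISA-JSON formats carry far more structure.

```python
import csv
import io
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Assay:
    measurement_type: str
    technology_type: str


@dataclass
class Study:
    identifier: str
    title: str
    assays: List[Assay] = field(default_factory=list)


@dataclass
class Investigation:
    identifier: str
    studies: List[Study] = field(default_factory=list)


def to_json(inv: Investigation) -> str:
    """Serialize the object model to a JSON document."""
    return json.dumps(asdict(inv), indent=2, sort_keys=True)


def to_table(inv: Investigation) -> str:
    """Flatten the same object model into tab-separated rows, one per assay."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(["study", "measurement_type", "technology_type"])
    for study in inv.studies:
        for assay in study.assays:
            writer.writerow(
                [study.identifier, assay.measurement_type, assay.technology_type]
            )
    return buf.getvalue()


# Build the model once, then emit both representations from it.
inv = Investigation(
    identifier="i1",
    studies=[
        Study("s1", "Example study", [Assay("metabolite profiling", "mass spectrometry")])
    ],
)
json_text = to_json(inv)
table_text = to_table(inv)
```

Keeping a single in-memory model and deriving each format from it is what lets the two serializations stay consistent; a converter between formats then reduces to parsing one and writing the other.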


Subjects
Biological Science Disciplines; Metadata; Databases, Factual; Software
4.
Stud Health Technol Inform ; 270: 443-447, 2020 Jun 16.
Article in English | MEDLINE | ID: mdl-32570423

ABSTRACT

Current high-throughput sequencing technologies allow us to acquire entire genomes in a very short time and at a relatively sustainable cost, resulting in an increasing diffusion of genetic testing capabilities in specialized clinical laboratories and research centers. In contrast, the impact of genomic information on clinical decisions is still limited, as effective interpretation is a challenging task. From the technological point of view, genomic data are large, have a complex granular nature, and strongly depend on the computational steps of the generation and processing workflows. This article introduces our work to create the openEHR Genomic Project and the set of genomic information models we developed to capture this complex structure and to preserve data provenance efficiently in a machine-readable format. The models support clinical actionability of data by improving their quality, fostering interoperability, and laying the basis for re-usability.


Subjects
Electronic Health Records; Genomics; Genetic Testing; Workflow
5.
Bioinformatics ; 35(19): 3752-3760, 2019 10 01.
Article in English | MEDLINE | ID: mdl-30851093

ABSTRACT

MOTIVATION: Developing a robust and performant data analysis workflow that integrates all necessary components whilst still being able to scale over multiple compute nodes is a challenging task. We introduce a generic method based on the microservice architecture, where software tools are encapsulated as Docker containers that can be connected into scientific workflows and executed using the Kubernetes container orchestrator. RESULTS: We developed a Virtual Research Environment (VRE) which facilitates rapid integration of new tools and development of scalable and interoperable workflows for performing metabolomics data analysis. The environment can be launched on-demand on cloud resources and desktop computers. IT-expertise requirements on the user side are kept to a minimum, and workflows can be re-used effortlessly by any novice user. We validate our method in the field of metabolomics on two mass spectrometry studies, one nuclear magnetic resonance spectroscopy study and one fluxomics study. We showed that the method scales dynamically with increasing availability of computational resources. We demonstrated that the method facilitates interoperability through integration of the major software suites, resulting in a turn-key workflow encompassing all steps for mass-spectrometry-based metabolomics including preprocessing, statistics and identification. Microservices is a generic methodology that can serve any scientific discipline and opens the way to new types of large-scale integrative science. AVAILABILITY AND IMPLEMENTATION: The PhenoMeNal consortium maintains a web portal (https://portal.phenomenal-h2020.eu) providing a GUI for launching the Virtual Research Environment. The GitHub repository https://github.com/phnmnl/ hosts the source code of all projects. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
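
One way to picture containerized tools connected into a workflow is as a dependency graph whose nodes each name a container image. The sketch below is an illustration only and does not reproduce the PhenoMeNal implementation: the image names and the dependency layout are assumptions, and ordering is done with Python's standard `graphlib` module rather than an orchestrator.

```python
from graphlib import TopologicalSorter

# Hypothetical step -> container image mapping (names are illustrative only).
images = {
    "preprocessing": "example/preprocess:1.0",
    "statistics": "example/stats:1.0",
    "identification": "example/identify:1.0",
}

# Each step maps to the set of steps whose output it consumes (assumed layout).
depends_on = {
    "preprocessing": set(),
    "statistics": {"preprocessing"},
    "identification": {"preprocessing"},
}

# A valid execution order runs every step after all of its dependencies.
order = list(TopologicalSorter(depends_on).static_order())

# Pair each step with the image that would run it.
plan = [(step, images[step]) for step in order]
```

Steps with no dependency on each other, such as the two downstream steps here, are the ones an orchestrator could schedule in parallel on separate nodes.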


Subjects
Data Analysis; Metabolomics; Computational Biology; Software; Workflow
6.
Gigascience ; 8(2), 2019 02 01.
Article in English | MEDLINE | ID: mdl-30535405

ABSTRACT

BACKGROUND: Metabolomics is the comprehensive study of a multitude of small molecules to gain insight into an organism's metabolism. The research field is dynamic and expanding with applications across biomedical, biotechnological, and many other applied biological domains. Its computationally intensive nature has driven requirements for open data formats, data repositories, and data analysis tools. However, the rapid progress has resulted in a mosaic of independent, and sometimes incompatible, analysis methods that are difficult to connect into a useful and complete data analysis solution. FINDINGS: PhenoMeNal (Phenome and Metabolome aNalysis) is an advanced and complete solution to set up Infrastructure-as-a-Service (IaaS) that brings workflow-oriented, interoperable metabolomics data analysis platforms into the cloud. PhenoMeNal seamlessly integrates a wide array of existing open-source tools that are tested and packaged as Docker containers through the project's continuous integration process and deployed based on a Kubernetes orchestration framework. It also provides a number of standardized, automated, and published analysis workflows in the user interfaces Galaxy, Jupyter, Luigi, and Pachyderm. CONCLUSIONS: PhenoMeNal constitutes a keystone solution in cloud e-infrastructures available for metabolomics. PhenoMeNal is a unique and complete solution for setting up cloud e-infrastructures through easy-to-use web interfaces that can be scaled to any custom public and private cloud environment. By harmonizing and automating software installation and configuration and through ready-to-use scientific workflow user interfaces, PhenoMeNal has succeeded in providing scientists with workflow-driven, reproducible, and shareable metabolomics data analysis platforms that are interfaced through standard data formats, accompanied by representative datasets, versioned, and tested for reproducibility and interoperability. The elastic implementation of PhenoMeNal further allows easy adaptation of the infrastructure to other application areas and 'omics research domains.


Subjects
Metabolomics/methods; Software; Cloud Computing; Humans; Workflow
7.
Bioinformatics ; 33(23): 3805-3807, 2017 Dec 01.
Article in English | MEDLINE | ID: mdl-29036536

ABSTRACT

MOTIVATION: Workflow managers for scientific analysis provide a high-level programming platform facilitating standardization, automation, collaboration and access to sophisticated computing resources. The Galaxy workflow manager provides a prime example of this type of platform. As compositions of simpler tools, workflows effectively comprise specialized computer programs implementing often very complex analysis procedures. To date, no simple way to automatically test Galaxy workflows and ensure their correctness has appeared in the literature. RESULTS: With wft4galaxy we offer a tool to bring automated testing to Galaxy workflows, making it feasible to bring continuous integration to their development and ensuring that defects are detected promptly. wft4galaxy can be easily installed as a regular Python program or launched directly as a Docker container, the latter reducing installation effort to a minimum. AVAILABILITY AND IMPLEMENTATION: Available at https://github.com/phnmnl/wft4galaxy under the Academic Free License v3.0. CONTACT: marcoenrico.piras@crs4.it.
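
The general pattern behind automated workflow testing is to run the workflow on fixed inputs and compare each produced output with a stored expectation. The following minimal sketch illustrates that pattern only; it does not use wft4galaxy or Galaxy, and `run_workflow_test` and `toy_workflow` are hypothetical names introduced for the example.

```python
from typing import Callable, Dict


def run_workflow_test(
    workflow: Callable[[Dict[str, str]], Dict[str, str]],
    inputs: Dict[str, str],
    expected: Dict[str, str],
) -> Dict[str, bool]:
    """Run the workflow and report, per expected output, whether it matched."""
    actual = workflow(inputs)
    return {name: actual.get(name) == value for name, value in expected.items()}


def toy_workflow(inputs: Dict[str, str]) -> Dict[str, str]:
    """Stand-in for a real workflow: upper-case the text, then count characters."""
    upper = inputs["text"].upper()
    return {"upper": upper, "length": str(len(upper))}


# A passing test case: both outputs match their expected values.
report = run_workflow_test(
    toy_workflow,
    inputs={"text": "acgt"},
    expected={"upper": "ACGT", "length": "4"},
)
all_passed = all(report.values())

# A failing test case: the expected length is deliberately wrong.
failing_report = run_workflow_test(
    toy_workflow,
    inputs={"text": "ac"},
    expected={"length": "3"},
)
```

Because the report is keyed by output name, a continuous integration job can fail the build and still say which output diverged.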


Subjects
Computational Biology/methods; Software; Workflow; Automation
8.
Gigascience ; 5: 26, 2016 06 07.
Article in English | MEDLINE | ID: mdl-27267963

ABSTRACT

With ever-increasing amounts of data being produced by next-generation sequencing (NGS) experiments, the requirements placed on supporting e-infrastructures have grown. In this work, we provide recommendations based on the collective experiences from participants in the EU COST Action SeqAhead for the tasks of data preprocessing, upstream processing, data delivery, and downstream analysis, as well as long-term storage and archiving. We cover demands on computational and storage resources, networks, software stacks, automation of analysis, education, and also discuss emerging trends in the field. E-infrastructures for NGS require substantial effort to set up and maintain over time, and with sequencing technologies and best practices for data analysis evolving rapidly it is important to prioritize both processing capacity and e-infrastructure flexibility when making strategic decisions to support the data analysis demands of tomorrow. Due to increasingly demanding technical requirements we recommend that e-infrastructure development and maintenance be handled by a professional service unit, be it internal or external to the organization, and emphasis should be placed on collaboration between researchers and IT professionals.


Subjects
High-Throughput Nucleotide Sequencing/methods; Sequence Analysis, DNA/methods; Computational Biology/methods; Humans; Information Storage and Retrieval; Internet; Software
9.
Biol Direct ; 10: 43, 2015 Aug 19.
Article in English | MEDLINE | ID: mdl-26282399

ABSTRACT

High-throughput technologies, such as next-generation sequencing, have turned molecular biology into a data-intensive discipline, requiring bioinformaticians to use high-performance computing resources and carry out data management and analysis tasks on a large scale. Workflow systems can be useful to simplify construction of analysis pipelines that automate tasks, support reproducibility and provide measures for fault-tolerance. However, workflow systems can incur significant development and administration overhead, so bioinformatics pipelines are often still built without them. We present the experiences with workflows and workflow systems within the bioinformatics community participating in a series of hackathons and workshops of the EU COST action SeqAhead. The organizations are working on similar problems, but we have addressed them with different strategies and solutions. This fragmentation of efforts is inefficient and leads to redundant and incompatible solutions. Based on our experiences we define a set of recommendations for future systems to enable efficient yet simple bioinformatics workflow construction and execution.


Subjects
Computational Biology/methods; Electronic Data Processing/methods; Workflow; High-Throughput Nucleotide Sequencing; Reproducibility of Results
10.
Bioinformatics ; 30(19): 2816-7, 2014 Oct.
Article in English | MEDLINE | ID: mdl-24928211

ABSTRACT

SUMMARY: BioBlend.objects is a new component of the BioBlend package, adding an object-oriented interface for the Galaxy REST-based application programming interface. It improves support for metacomputing on Galaxy entities by providing higher-level functionality and allowing users to more easily create programs to explore, query and create Galaxy datasets and workflows. AVAILABILITY AND IMPLEMENTATION: BioBlend.objects is available online at https://github.com/afgane/bioblend. The new object-oriented API is implemented by the galaxy/objects subpackage.


Subjects
Computational Biology/methods; Algorithms; Automation; Computer Graphics; Computer Systems; Programming Languages; Software; User-Computer Interface
11.
Bioinformatics ; 30(1): 119-20, 2014 Jan 01.
Article in English | MEDLINE | ID: mdl-24149054

ABSTRACT

SUMMARY: Hadoop MapReduce-based approaches have become increasingly popular due to their scalability in processing large sequencing datasets. However, as these methods typically require in-depth expertise in Hadoop and Java, they are still out of reach of many bioinformaticians. To solve this problem, we have created SeqPig, a library and a collection of tools to manipulate, analyze and query sequencing datasets in a scalable and simple manner. SeqPig scripts use the Hadoop-based distributed scripting engine Apache Pig, which automatically parallelizes and distributes data processing tasks. We demonstrate SeqPig's scalability over many computing nodes and illustrate its use with example scripts. AVAILABILITY AND IMPLEMENTATION: Available under the open source MIT license at http://sourceforge.net/projects/seqpig/


Subjects
High-Throughput Screening Assays/methods; Software Design
12.
Cell ; 155(1): 242-56, 2013 Sep 26.
Article in English | MEDLINE | ID: mdl-24074872

ABSTRACT

The complex network of specialized cells and molecules in the immune system has evolved to defend against pathogens, but inadvertent immune system attacks on "self" result in autoimmune disease. Both genetic regulation of immune cell levels and their relationships with autoimmunity are largely undetermined. Here, we report genetic contributions to quantitative levels of 95 cell types encompassing 272 immune traits, in a cohort of 1,629 individuals from four clustered Sardinian villages. We first estimated trait heritability, showing that it can be substantial, accounting for up to 87% of the variance (mean 41%). Next, by assessing ∼8.2 million variants, we identified, and confirmed in an extended set of 2,870 individuals, 23 independent variants at 13 loci associated with at least one trait. Notably, variants at three loci (HLA, IL2RA, and SH2B3/ATXN2) overlap with known autoimmune disease associations. These results connect specific cellular phenotypes to specific genetic variants, helping to explicate their involvement in disease.


Subjects
Flow Cytometry/methods; Genetic Predisposition to Disease; Genome-Wide Association Study; Immune System Diseases/genetics; Polymorphism, Single Nucleotide; Humans; Phenotype
13.
Bioinformatics ; 27(15): 2159-60, 2011 Aug 01.
Article in English | MEDLINE | ID: mdl-21697132

ABSTRACT

SUMMARY: SEAL is a scalable tool for short read pair mapping and duplicate removal. It computes mappings that are consistent with those produced by BWA and removes duplicates according to the same criteria employed by Picard MarkDuplicates. On a 16-node Hadoop cluster, it is capable of processing about 13 GB per hour in map+rmdup mode, while reaching a throughput of 19 GB per hour in mapping-only mode. AVAILABILITY: SEAL is available online at http://biodoop-seal.sourceforge.net/.
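
Duplicate removal of read pairs can be pictured as grouping pairs that map to the same place and keeping one representative per group. The sketch below is a simplified illustration and not SEAL's or Picard's implementation: the grouping key (chromosome, both 5' positions, orientation), the choice of the pair with the highest summed base quality, and all names such as `ReadPair` and `remove_duplicates` are assumptions made for the example.

```python
from typing import Dict, List, NamedTuple, Tuple


class ReadPair(NamedTuple):
    name: str              # assumed unique per pair
    chrom: str
    pos1: int              # 5' position of the first read
    pos2: int              # 5' position of the mate
    orientation: str       # e.g. "FR"
    base_quality_sum: int  # summed base qualities of both reads


def remove_duplicates(pairs: List[ReadPair]) -> List[ReadPair]:
    """Keep, for each mapping key, the pair with the highest quality sum."""
    best: Dict[Tuple[str, int, int, str], ReadPair] = {}
    for pair in pairs:
        key = (pair.chrom, pair.pos1, pair.pos2, pair.orientation)
        current = best.get(key)
        # On ties the pair seen first is kept.
        if current is None or pair.base_quality_sum > current.base_quality_sum:
            best[key] = pair
    kept_names = {p.name for p in best.values()}
    # Return survivors in their original input order.
    return [p for p in pairs if p.name in kept_names]


pairs = [
    ReadPair("a", "chr1", 100, 300, "FR", 60),
    ReadPair("b", "chr1", 100, 300, "FR", 75),  # duplicate of "a", higher quality
    ReadPair("c", "chr1", 150, 350, "FR", 50),
]
unique_pairs = remove_duplicates(pairs)
kept = [p.name for p in unique_pairs]
```

Because every pair with the same key has to be compared against the others, a distributed version would route pairs by key so that each group lands on a single worker.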


Subjects
Computational Biology/methods; Sequence Analysis, DNA/methods; Software; Algorithms; Sequence Alignment/methods
14.
Mol Inform ; 29(1-2): 51-64, 2010 Jan 12.
Article in English | MEDLINE | ID: mdl-27463848

ABSTRACT

Quantitative structure-activity relationship (QSAR) analysis has been frequently utilized as a computational tool for the prediction of several eco-toxicological parameters including the acute aquatic toxicity. In the present study, we describe a novel integrated strategy to describe the acute aquatic toxicity through the combination of both toxicokinetic and toxicodynamic behaviors of chemicals. In particular, a robust classification model (TOXclass) has been derived by combining Support Vector Machine (SVM) analysis with three classes of toxicokinetic-like molecular descriptors: the autocorrelation molecular electrostatic potential (autoMEP) vectors, Sterimol topological descriptors and logP(o/w) property values. The TOXclass model is able to assign chemicals to different levels of acute aquatic toxicity, providing an appropriate answer to the new regulatory requirements. Moreover, we have extended the above-mentioned toxicokinetic-like descriptor set with more toxicodynamic-like descriptors, for example HOMO and LUMO energies, to generate a valuable SVM classifier (MOAclass) for the prediction of the mode of action (MOA) of toxic chemicals. As preliminary validation of our approach, the toxicokinetic (TOXclass) and the toxicodynamic (MOAclass) models have been applied in series to inspect both aquatic toxicity hazard and mode of action of 296 chemical substances with unknown or uncertain toxicodynamic information to assess the potential ecological risk and the toxic mechanism.

15.
Nucleic Acids Res ; 37(Database issue): D284-90, 2009 Jan.
Article in English | MEDLINE | ID: mdl-18931373

ABSTRACT

MMsINC (http://mms.dsfarm.unipd.it/MMsINC/search) is a database of non-redundant, richly annotated and biomedically relevant chemical structures. A primary goal of MMsINC is to guarantee the highest quality and the uniqueness of each entry. MMsINC then adds value to these entries by including the analysis of crucial chemical properties, such as ionization and tautomerization processes, and the in silico prediction of 24 important molecular properties in the biochemical profile of each structure. MMsINC is consequently a natural input for different chemoinformatics and virtual screening applications. In addition, MMsINC supports various types of queries, including substructure queries and the novel 'molecular scissoring' query. MMsINC is interfaced with other primary data collectors, such as PubChem, Protein Data Bank (PDB), the Food and Drug Administration database of approved drugs and ZINC.


Subjects
Databases, Factual; Pharmaceutical Preparations/chemistry; Computational Biology; Ligands; Proteins/chemistry
16.
Nucleic Acids Res ; 34(Web Server issue): W714-9, 2006 Jul 01.
Article in English | MEDLINE | ID: mdl-16845105

ABSTRACT

Pathway Analyst (Path-A) is a publicly available web server (http://path-a.cs.ualberta.ca) that predicts metabolic pathways. It takes a FASTA format file containing a set of query protein sequences from a single organism (a partial or complete proteome) and identifies those sequences that are likely to participate in any of its supported metabolic pathways (currently 10). Path-A uses a number of machine-learning and sequence analysis techniques (e.g. SVM, BLAST and HMM) to predict pathways. Each machine-learned classifier exploits similarity between sequences in the pathways of its model organisms and sequences in the query set. It predicts the pathways that are present in the query organism and annotates each predicted reaction and catalyst, using the appropriate sequences from the query set. Path-A also provides a browsable and searchable database of the pathways for the model organisms that are used to make its predictions. Path-A's predictor sets (using different classifier technologies) have been evaluated using standard cross-validation techniques on a dataset of 10 metabolic pathways across 13 model organisms, a total of 125 organism-specific pathways. The most accurate classifier technology obtained a mean precision of 78.3% and a mean recall of 92.6% in predicting all catalyst proteins, of all reactions, in all pathways present in the dataset. Although Path-A currently only supports metabolic pathways, the underlying prediction techniques are general enough for other types of pathways. Consequently, it is our intent to extend Path-A to predict other types of pathways, including signalling pathways.
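
Precision and recall, the two figures quoted above, follow the standard definitions: precision is the fraction of predicted items that are correct, and recall is the fraction of true items that were predicted. The sketch below applies those definitions to made-up protein identifiers; the function name `precision_recall` and the example sets are hypothetical and are not taken from the Path-A evaluation.

```python
from typing import Set, Tuple


def precision_recall(predicted: Set[str], actual: Set[str]) -> Tuple[float, float]:
    """Return (precision, recall) for a predicted set against the true set."""
    true_positives = len(predicted & actual)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical catalyst proteins for one pathway.
predicted = {"P1", "P2", "P3", "P4"}
actual = {"P1", "P2", "P3", "P5", "P6"}

precision, recall = precision_recall(predicted, actual)
print(precision, recall)  # 0.75 0.6
```

A classifier with recall higher than precision, as reported for Path-A, misses few true catalysts but includes some proteins that do not belong to the pathway.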


Subjects
Metabolism; Sequence Analysis, Protein; Software; Algorithms; Artificial Intelligence; Computer Graphics; Internet; Proteomics; User-Computer Interface