Results 1 - 20 of 27
1.
PLoS Genet ; 17(8): e1009713, 2021 08.
Article in English | MEDLINE | ID: mdl-34460823

ABSTRACT

Genome-wide association studies (GWASs) have uncovered a wealth of associations between common variants and human phenotypes. Here, we present an integrative analysis of GWAS summary statistics from 36 phenotypes to decipher multitrait genetic architecture and its link with biological mechanisms. Our framework incorporates multitrait association mapping along with an investigation of the breakdown of genetic associations into clusters of variants harboring similar multitrait association profiles. Focusing on two subsets of immunity and metabolism phenotypes, we then demonstrate how genetic variants within clusters can be mapped to biological pathways and disease mechanisms. Finally, for the metabolism set, we investigate the link between gene cluster assignment and the success of drug targets in randomized controlled trials.


Subjects
Computational Biology/methods , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Cluster Analysis , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Phenotype
2.
Brief Bioinform ; 21(5): 1697-1705, 2020 09 25.
Article in English | MEDLINE | ID: mdl-31624831

ABSTRACT

The corpus of bioinformatics resources is huge and expanding rapidly, presenting life scientists with a growing challenge in selecting tools that fit the desired purpose. To address this, the European Infrastructure for Biological Information is supporting a systematic approach towards a comprehensive registry of tools and databases for all domains of bioinformatics, provided under a single portal (https://bio.tools). We describe here the practical means by which scientific communities, including individual developers and projects, through major service providers and research infrastructures, can describe their own bioinformatics resources and share these via bio.tools.


Subjects
Community Participation , Computational Biology/methods , Software , Computational Biology/standards , Database Management Systems , Europe , Humans
3.
Bioinformatics ; 37(17): 2798-2801, 2021 Sep 09.
Article in English | MEDLINE | ID: mdl-33594411

ABSTRACT

MOTIVATION: Viruses are ubiquitous in the living world, and their ability to infect more than one host defines their host range. However, information about which virus infects which host, and about which host is infected by which virus, is not readily available. RESULTS: We developed a web-based tool called the Viral Host Range database to record, analyze and disseminate experimental host range data for viruses infecting archaea, bacteria and eukaryotes. AVAILABILITY AND IMPLEMENTATION: The ViralHostRangeDB application is available from https://viralhostrangedb.pasteur.cloud. Its source code is freely available from the Gitlab instance of Institut Pasteur (https://gitlab.pasteur.fr/hub/viralhostrangedb).

4.
Bioinformatics ; 37(1): 89-96, 2021 Apr 09.
Article in English | MEDLINE | ID: mdl-33416858

ABSTRACT

MOTIVATION: One avenue to address the paucity of clinically testable targets is to reinvestigate the druggable genome by tackling complicated types of targets such as Protein-Protein Interactions (PPIs). Given the challenge of targeting those interfaces with small chemical compounds, it has become clear that learning from successful examples of PPI modulation is a powerful strategy. Freely accessible databases of PPI modulators that provide the community with tractable chemical and pharmacological data, as well as powerful tools to query them, are therefore essential to stimulate new drug discovery projects on PPI targets. RESULTS: Here, we present the new version of iPPI-DB, our manually curated database of PPI modulators. In this completely redesigned version of the database, we introduce a new web interface relying on crowdsourcing for the maintenance of the database. This interface was created to enable community contributions, whereby external experts can suggest new database entries. Moreover, the data model, the graphical interface, and the tools to query the database have been completely modernized and improved. We added new PPI modulators and new PPI targets, and extended our focus to stabilizers of PPIs as well. AVAILABILITY AND IMPLEMENTATION: The iPPI-DB server is available at https://ippidb.pasteur.fr. The source code for this server is available at https://gitlab.pasteur.fr/ippidb/ippidb-web/ and is distributed under GPL licence (http://www.gnu.org/licences/gpl). Queries can be shared through persistent links according to the FAIR data standards. Data can be downloaded from the website as csv files. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

5.
Nucleic Acids Res ; 48(W1): W41-W47, 2020 07 02.
Article in English | MEDLINE | ID: mdl-32383755

ABSTRACT

Nuclear magnetic resonance (NMR) spectroscopy is a method of choice to study the dynamics and determine the atomic structure of macromolecules in solution. The standalone program ARIA (Ambiguous Restraints for Iterative Assignment) for automated assignment of nuclear Overhauser enhancement (NOE) data and structure calculation is well established in the NMR community. To ultimately provide a perfectly transparent and easy to use service, we designed an online user interface to ARIA with additional functionalities. Data conversion, structure calculation setup and execution, followed by interactive visualization of the generated 3D structures are all integrated in ARIAweb and freely accessible at https://ariaweb.pasteur.fr.


Subjects
Nuclear Magnetic Resonance, Biomolecular , Proteins/chemistry , Software , Animals , Humans , Mice , Models, Molecular , RNA/chemistry
6.
Nucleic Acids Res ; 48(11): e64, 2020 06 19.
Article in English | MEDLINE | ID: mdl-32352514

ABSTRACT

The ability to block gene expression in bacteria with the catalytically inactive mutant of Cas9, known as dCas9, is quickly becoming a standard methodology to probe gene function, perform high-throughput screens, and engineer cells for desired purposes. Yet, we still lack a good understanding of the design rules that determine on-target activity for dCas9. Taking advantage of high-throughput screening data, we fit a model to predict the ability of dCas9 to block the RNA polymerase based on the target sequence, and validate its performance on independently generated datasets. We further design a novel genome-wide guide RNA library for E. coli MG1655, EcoWG1, using our model to choose guides with high activity while avoiding guides which might be toxic or have off-target effects. A screen performed using the EcoWG1 library during growth in rich medium improved upon previously published screens, demonstrating that very good performance can be attained using only a small number of well-designed guides. Being able to design effective, smaller libraries will help make CRISPRi screens even easier to perform and more cost-effective. Our model and materials are available to the community through crispr.pasteur.fr and Addgene.


Subjects
CRISPR-Associated Protein 9/metabolism , CRISPR-Cas Systems/genetics , Escherichia coli/genetics , High-Throughput Screening Assays , RNA, Guide, Kinetoplastida/genetics , Base Sequence , DNA-Directed RNA Polymerases/antagonists & inhibitors , Datasets as Topic , Linear Models , Reproducibility of Results
7.
Nucleic Acids Res ; 44(D1): D38-47, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26538599

ABSTRACT

Life sciences are yielding huge data sets that underpin scientific discoveries fundamental to improvement in human health, agriculture and the environment. In support of these discoveries, a plethora of databases and tools are deployed, in technically complex and diverse implementations, across a spectrum of scientific disciplines. The corpus of documentation of these resources is fragmented across the Web, with much redundancy, and has lacked a common standard of information. The outcome is that scientists must often struggle to find, understand, compare and use the best resources for the task at hand. Here we present a community-driven curation effort, supported by ELIXIR (the European infrastructure for biological information), that aspires to a comprehensive and consistent registry of information about bioinformatics resources. The sustainable upkeep of this Tools and Data Services Registry is assured by a curation effort driven by and tailored to local needs, and shared amongst a network of engaged partners. As of November 2015, the registry includes 1785 resources, with depositions from 126 individual registrations including 52 institutional providers and 74 individuals. With community support, the registry can become a standard for dissemination of information about bioinformatics resources: we welcome everyone to join us in this common endeavour. The registry is freely available at https://bio.tools.


Subjects
Computational Biology , Registries , Data Curation , Software
8.
HGG Adv ; 5(3): 100319, 2024 Jul 18.
Article in English | MEDLINE | ID: mdl-38872309

ABSTRACT

Since the first genome-wide association studies (GWASs), thousands of variant-trait associations have been discovered. However, comprehensively mapping the genetic determinant of complex traits through univariate testing can require prohibitive sample sizes. Multi-trait GWAS can circumvent this issue and improve statistical power by leveraging the joint genetic architecture of human phenotypes. Although many methodological hurdles of multi-trait testing have been solved, the strategy to select traits has been overlooked. In this study, we conducted multi-trait GWAS on approximately 20,000 combinations of 72 traits using an omnibus test as implemented in the Joint Analysis of Summary Statistics. We assessed which genetic features of the sets of traits analyzed were associated with an increased detection of variants compared with univariate screening. Several features of the set of traits, including the heritability, the number of traits, and the genetic correlation, drive the multi-trait test gain. Using these features jointly in predictive models captures a large fraction of the power gain of the multi-trait test (Pearson's r between the observed and predicted gain equals 0.43, p < 1.6 × 10^-60). Applying an alternative multi-trait approach (Multi-Trait Analysis of GWAS), we identified similar features of interest, but with an overall 70% lower number of new associations. Finally, selecting sets based on our data-driven models systematically outperformed the common strategy of selecting clinically similar traits. This work provides a unique picture of the determinant of multi-trait GWAS statistical power and outlines practical strategies for multi-trait testing.
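
The abstract above summarizes model quality with Pearson's r between observed and predicted gain. As a minimal sketch of that metric only, the correlation can be computed as below. The function name and the toy numbers are hypothetical illustrations and are not taken from the study or from the JASS package.

```python
import math


def pearson_r(observed, predicted):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    # Sum of cross-products and sums of squares around the means.
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_o * var_p)


# Hypothetical toy values, for illustration only.
observed_gain = [1.0, 2.0, 3.0, 4.0]
predicted_gain = [1.5, 1.9, 3.2, 3.8]
r = pearson_r(observed_gain, predicted_gain)
```
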


Subjects
Genome-Wide Association Study , Phenotype , Polymorphism, Single Nucleotide , Genome-Wide Association Study/methods , Humans , Polymorphism, Single Nucleotide/genetics , Quantitative Trait Loci/genetics , Models, Genetic , Quantitative Trait, Heritable
9.
Sci Data ; 10(1): 557, 2023 08 23.
Article in English | MEDLINE | ID: mdl-37612312

ABSTRACT

Findable, Accessible, Interoperable, and Reusable (FAIR) guiding principles tailored for research software have been proposed by the FAIR for Research Software (FAIR4RS) Working Group. They provide a foundation for optimizing the reuse of research software. The FAIR4RS principles are, however, aspirational and do not provide practical instructions to the researchers. To fill this gap, we propose in this work the first actionable step-by-step guidelines for biomedical researchers to make their research software compliant with the FAIR4RS principles. We designate them as the FAIR Biomedical Research Software (FAIR-BioRS) guidelines. Our process for developing these guidelines, presented here, is based on an in-depth study of the FAIR4RS principles and a thorough review of current practices in the field. To support researchers, we have also developed a workflow that streamlines the process of implementing these guidelines. This workflow is incorporated in FAIRshare, a free and open-source software application aimed at simplifying the curation and sharing of FAIR biomedical data and software through user-friendly interfaces and automation. Details about this tool are also presented.

10.
F1000Res ; 12: 1568, 2023.
Article in English | MEDLINE | ID: mdl-38076297

ABSTRACT

The 24th annual Bioinformatics Open Source Conference (BOSC 2023) was part of the 2023 conference on Intelligent Systems for Molecular Biology and the European Conference on Computational Biology (ISMB/ECCB 2023). Launched in 2000 and held yearly since, BOSC is the premier meeting covering open-source bioinformatics and open science. Like ISMB 2022, the 2023 meeting was a hybrid conference, with the in-person component hosted in Lyon, France. ISMB/ECCB attracted a near-record number of attendees, with over 2100 in person and about 900 more online. Approximately 200 people participated in BOSC sessions. In addition to 43 talks and 49 posters, BOSC featured two keynotes: Sara El-Gebali, who spoke about "A New Odyssey: Pioneering the Future of Scientific Progress Through Open Collaboration", and Joseph Yracheta, who spoke about "The Dissonance between Scientific Altruism & Capitalist Extraction: The Zero Trust and Federated Data Sovereignty Solution." Once again, a joint session brought together BOSC and the Bio-Ontologies COSI. The conference ended with a panel on Open and Ethical Data Sharing. As in prior years, BOSC was preceded by a CollaborationFest, a collaborative work event that brought together about 40 participants interested in synergistically combining ideas, shaping project plans, developing software, and more.


Subjects
Computational Biology , Software , Humans , Information Dissemination
11.
bioRxiv ; 2023 Oct 27.
Article in English | MEDLINE | ID: mdl-37961722

ABSTRACT

Since the first Genome-Wide Association Studies (GWAS), thousands of variant-trait associations have been discovered. However, the sample size required to detect additional variants using standard univariate association screening is increasingly prohibitive. Multi-trait GWAS offers a relevant alternative: it can improve statistical power and lead to new insights about gene function and the joint genetic architecture of human phenotypes. Although many methodological hurdles of multi-trait testing have been discussed, the strategy to select traits, among overwhelming possibilities, has been overlooked. In this study, we conducted extensive multi-trait tests using JASS (Joint Analysis of Summary Statistics) and assessed which genetic features of the analysed sets were associated with an increased detection of variants as compared to univariate screening. Our analyses identified multiple factors associated with the gain in association detection in multi-trait tests. Together, these factors of the analysed sets are predictive of the gain of the multi-trait test (Pearson's ρ equal to 0.43 between the observed and predicted gain, P < 1.6 × 10^-60). Applying an alternative multi-trait approach (MTAG, multi-trait analysis of GWAS), we found that in most scenarios, but particularly those with larger numbers of traits, JASS outperformed MTAG. Finally, we benchmarked several strategies to select sets of traits, including the prevalent strategy of selecting clinically similar traits, which systematically underperformed selecting clinically heterogeneous traits or selecting sets derived from our data-driven models. This work provides a unique picture of the determinant of multi-trait GWAS statistical power and outlines practical strategies for multi-trait testing.

12.
PeerJ Comput Sci ; 8: e1023, 2022.
Article in English | MEDLINE | ID: mdl-36092012

ABSTRACT

Scientific software registries and repositories improve software findability and research transparency, provide information for software citations, and foster preservation of computational methods in a wide range of disciplines. Registries and repositories play a critical role by supporting research reproducibility and replicability, but developing them takes effort and few guidelines are available to help prospective creators of these resources. To address this need, the FORCE11 Software Citation Implementation Working Group convened a Task Force to distill the experiences of the managers of existing resources in setting expectations for all stakeholders. In this article, we describe the resultant best practices which include defining the scope, policies, and rules that govern individual registries and repositories, along with the background, examples, and collaborative work that went into their development. We believe that establishing specific policies such as those presented here will help other scientific software registries and repositories better serve their users and their disciplines.

13.
F1000Res ; 11: 1034, 2022.
Article in English | MEDLINE | ID: mdl-36128559

ABSTRACT

The 23rd annual Bioinformatics Open Source Conference (BOSC 2022) was part of this year's conference on Intelligent Systems for Molecular Biology (ISMB). Launched in 2000 and held every year since, BOSC is the premier meeting covering open source bioinformatics and open science. ISMB 2022 was, for the first time, a hybrid conference, with the in-person component hosted in Madison, Wisconsin (USA). About 1000 people attended ISMB 2022 in person, with another 800 online. Approximately 200 people participated in BOSC sessions, which included 28 talks chosen from submitted abstracts, 46 posters, and a panel discussion, "Building and Sustaining Inclusive Open Science Communities". BOSC 2022 included joint keynotes with two other COSIs. Jason Williams gave a BOSC / Education COSI keynote entitled "Riding the bicycle: Including all scientists on a path to excellence". A joint session with Bio-Ontologies featured a keynote by Melissa Haendel, "The open data highway: turbo-boosting translational traffic with ontologies."


Subjects
Computational Biology , Systems Biology , Congresses as Topic , Humans
14.
Gigascience ; 10(1), 2021 01 27.
Article in English | MEDLINE | ID: mdl-33506265

ABSTRACT

BACKGROUND: Life scientists routinely face massive and heterogeneous data analysis tasks and must find and access the most suitable databases or software in a jungle of web-accessible resources. The diversity of information used to describe life-scientific digital resources presents an obstacle to their utilization. Although several standardization efforts are emerging, no information schema has been sufficiently detailed to enable uniform semantic and syntactic description (and cataloguing) of bioinformatics resources. FINDINGS: Here we describe biotoolsSchema, a formalized information model that balances the needs of conciseness for rapid adoption against the provision of rich technical information and scientific context. biotoolsSchema results from a series of community-driven workshops and is deployed in the bio.tools registry, providing the scientific community with >17,000 machine-readable and human-understandable descriptions of software and other digital life-science resources. We compare our approach to related initiatives and provide alignments to foster interoperability and reusability. CONCLUSIONS: biotoolsSchema supports the formalized, rigorous, and consistent specification of the syntax and semantics of bioinformatics resources, and enables cataloguing efforts such as bio.tools that help scientists to find, comprehend, and compare resources. The use of biotoolsSchema in bio.tools promotes the FAIRness of research software, a key element of open and reproducible developments for data-intensive sciences.


Subjects
Biological Science Disciplines , Computational Biology , Databases, Factual , Humans , Semantics , Software
15.
Orphanet J Rare Dis ; 16(1): 288, 2021 06 28.
Article in English | MEDLINE | ID: mdl-34183044

ABSTRACT

BACKGROUND: Epstein-Barr virus (EBV) targets B-cells where it establishes a latent infection. EBV can transform B-cells in vitro and is recognized as an oncogenic virus, especially in the setting of immune compromise. Indeed, immunodeficient patients may fail to control chronic EBV infection, leading to the development of EBV-driven lymphoid malignancies. Ataxia telangiectasia (AT) is a primary immune deficiency caused by mutations in the ATM gene, which is involved in the repair of double-strand breaks. Patients with AT are at high risk of developing cancers, mostly B-cell lymphoid malignancies, most of which are EBV-related. Aside from immune deficiency secondary to AT, loss of ATM function could also hinder the control of the virus within B-cells, favoring lymphomagenesis in AT patients. RESULTS: We used RNA sequencing on lymphoblastoid cell lines derived from patients with AT and healthy donors to analyze and compare both cellular and viral gene expression. We found numerous deregulated signaling pathways involving transcription, translation, oncogenesis and immune regulation. Specifically, the translational defect was confirmed in vitro, suggesting that the pathogenesis of AT may also involve a ribosomal defect. Concomitant analysis of viral gene expression did not reveal significant differential gene expression; however, analysis of the EBV interactome suggests that the viral latency genes EBNA-3A, EBNA-3C and LMP1 may be disrupted in LCL from AT patients. CONCLUSION: Our data support the notion that ATM deficiency deregulates cellular gene expression, possibly disrupting interactions with EBV latent genes and promoting the oncogenic potential of the virus. These preliminary findings provide a new step towards the understanding of EBV regulation and of AT pathogenesis.


Subjects
Ataxia Telangiectasia , Epstein-Barr Virus Infections , Ataxia Telangiectasia/genetics , Cell Line , Epstein-Barr Virus Infections/genetics , Epstein-Barr Virus Nuclear Antigens , Gene Expression , Herpesvirus 4, Human/genetics , Humans , RNA , Sequence Analysis, RNA
16.
F1000Res ; 10: 320, 2021.
Article in English | MEDLINE | ID: mdl-34136134

ABSTRACT

Workflows are the keystone of bioimage analysis, and the NEUBIAS (Network of European BioImage AnalystS) community is trying to gather the actors of this field and organize the information around them. One of its most recent outputs is the opening of the F1000Research NEUBIAS gateway, whose main objective is to offer a channel of publication for bioimage analysis workflows and associated resources. In this paper we want to express some personal opinions and recommendations related to finding, handling and developing bioimage analysis workflows. The emergence of "big data" in bioimaging and resource-intensive analysis algorithms make local data storage and computing solutions a limiting factor. At the same time, the need for data sharing with collaborators and a general shift towards remote work have created new challenges and avenues for the execution and sharing of bioimage analysis workflows. These challenges are to reproducibly run workflows in remote environments, in particular when their components come from different software packages, but also to document them and link their parameters and results by following the FAIR principles (Findable, Accessible, Interoperable, Reusable) to foster open and reproducible science. In this opinion paper, we focus on giving some directions to the reader to tackle these challenges and navigate through this complex ecosystem, in order to find and use workflows, and to compare workflows addressing the same problem. We also discuss tools to run workflows in the cloud and on High Performance Computing resources, and suggest ways to make these workflows FAIR.


Subjects
Computational Biology , Ecosystem , Algorithms , Information Storage and Retrieval , Workflow
17.
F1000Res ; 10: 897, 2021.
Article in English | MEDLINE | ID: mdl-34804501

ABSTRACT

Scientific data analyses often combine several computational tools in automated pipelines, or workflows. Thousands of such workflows have been used in the life sciences, though their composition has remained a cumbersome manual process due to a lack of standards for annotation, assembly, and implementation. Recent technological advances have returned the long-standing vision of automated workflow composition into focus. This article summarizes a recent Lorentz Center workshop dedicated to automated composition of workflows in the life sciences. We survey previous initiatives to automate the composition process, and discuss the current state of the art and future perspectives. We start by drawing the "big picture" of the scientific workflow development life cycle, before surveying and discussing current methods, technologies and practices for semantic domain modelling, automation in workflow development, and workflow assessment. Finally, we derive a roadmap of individual and community-based actions to work toward the vision of automated workflow development in the forthcoming years. A central outcome of the workshop is a general description of the workflow life cycle in six stages: 1) scientific question or hypothesis, 2) conceptual workflow, 3) abstract workflow, 4) concrete workflow, 5) production workflow, and 6) scientific results. The transitions between stages are facilitated by diverse tools and methods, usually incorporating domain knowledge in some form. Formal semantic domain modelling is hard and often a bottleneck for the application of semantic technologies. However, life science communities have made considerable progress here in recent years and are continuously improving, renewing interest in the application of semantic technologies for workflow exploration, composition and instantiation. Combined with systematic benchmarking with reference data and large-scale deployment of production-stage workflows, such technologies enable a more systematic process of workflow development than we know today. We believe that this can lead to more robust, reusable, and sustainable workflows in the future.
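
The six life-cycle stages listed in this abstract can be sketched as a small ordered type. This is an illustrative sketch only: the class and function names are hypothetical, and the strictly linear, forward-only transition is a simplifying assumption that the abstract does not state.

```python
from enum import IntEnum


class WorkflowStage(IntEnum):
    """The six workflow life-cycle stages, numbered as in the abstract."""
    SCIENTIFIC_QUESTION = 1   # scientific question or hypothesis
    CONCEPTUAL_WORKFLOW = 2
    ABSTRACT_WORKFLOW = 3
    CONCRETE_WORKFLOW = 4
    PRODUCTION_WORKFLOW = 5
    SCIENTIFIC_RESULTS = 6


def next_stage(stage):
    """Return the stage that follows `stage`, or None after the last one.

    Assumption: stages are traversed one at a time, in order.
    """
    if stage is WorkflowStage.SCIENTIFIC_RESULTS:
        return None
    return WorkflowStage(stage + 1)
```

For example, `next_stage(WorkflowStage.ABSTRACT_WORKFLOW)` yields `WorkflowStage.CONCRETE_WORKFLOW`.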


Subjects
Biological Science Disciplines , Computational Biology , Benchmarking , Software , Workflow
18.
F1000Res ; 10, 2021.
Article in English | MEDLINE | ID: mdl-37842337

ABSTRACT

Toxicology has been an active research field for many decades, with academic, industrial and government involvement. Modern omics and computational approaches are changing the field, from merely disease-specific observational models into target-specific predictive models. Traditionally, toxicology has strong links with other fields such as biology, chemistry, pharmacology and medicine. With the rise of synthetic and new engineered materials, alongside ongoing prioritisation needs in chemical risk assessment for existing chemicals, early predictive evaluations are becoming of utmost importance for both scientific and regulatory purposes. ELIXIR is an intergovernmental organisation that brings together life science resources from across Europe. To coordinate the linkage of various life science efforts around modern predictive toxicology, the establishment of a new ELIXIR Community is seen as instrumental. In the past few years, joint efforts, building on incidental overlap, have been piloted in the context of ELIXIR. For example, the EU-ToxRisk, diXa, HeCaToS, transQST, and the nanotoxicology community have worked with the ELIXIR TeSS, Bioschemas, and Compute Platforms and activities. In 2018, a core group of interested parties wrote a proposal, outlining a sketch of what this new ELIXIR Toxicology Community would look like. A recent workshop (held September 30th to October 1st, 2020) extended this into an ELIXIR Toxicology roadmap and a shortlist of limited-investment, high-gain collaborations to give body to this new community. This Whitepaper outlines the results of these efforts and defines our vision of the ELIXIR Toxicology Community and how it complements other ELIXIR activities.


Subjects
Biological Science Disciplines , Europe , Risk Assessment
19.
Bioinformatics ; 25(22): 3005-11, 2009 Nov 15.
Article in English | MEDLINE | ID: mdl-19689959

ABSTRACT

MOTIVATION: For the biologist, running bioinformatics analyses involves a time-consuming management of data and tools. Users need support to organize their work, retrieve parameters and reproduce their analyses. They also need to be able to combine their analytic tools using a safe data flow software mechanism. Finally, given that scientific tools can be difficult to install, it is particularly helpful for biologists to be able to use these tools through a web user interface. However, providing a web interface for a set of tools raises the problem that a single web portal cannot offer all the existing and possible services: it is the user, again, who has to cope with data copy among a number of different services. A framework enabling portal administrators to build a network of cooperating services would therefore clearly be beneficial. RESULTS: We have designed a system, Mobyle, to provide a flexible and usable Web environment for defining and running bioinformatics analyses. It embeds simple yet powerful data management features that allow the user to reproduce analyses and to combine tools using a hierarchical typing system. Mobyle offers invocation of services distributed over remote Mobyle servers, thus enabling a federated network of curated bioinformatics portals without the user having to learn complex concepts or to install sophisticated software. While being focused on the end user, the Mobyle system also addresses the need, for the bioinformatician, to automate remote services execution: PlayMOBY is a companion tool that automates the publication of BioMOBY web services, using Mobyle program definitions. AVAILABILITY: The Mobyle system is distributed under the terms of the GNU GPLv2 on the project web site (http://bioweb2.pasteur.fr/projects/mobyle/). It is already deployed on three servers: http://mobyle.pasteur.fr, http://mobyle.rpbs.univ-paris-diderot.fr and http://lipm-bioinfo.toulouse.inra.fr/Mobyle. The PlayMOBY companion is distributed under the terms of the CeCILL license, and is available at http://lipm-bioinfo.toulouse.inra.fr/biomoby/PlayMOBY/.


Subjects
Computational Biology/methods , Software , Databases, Factual , Internet , User-Computer Interface
20.
NAR Genom Bioinform ; 2(1): lqaa003, 2020 Mar.
Article in English | MEDLINE | ID: mdl-32002517

ABSTRACT

Genome-wide association study (GWAS) has been the driving force for identifying association between genetic variants and human phenotypes. Thousands of GWAS summary statistics covering a broad range of human traits and diseases are now publicly available. These GWAS have proven their utility for a range of secondary analyses, including in particular the joint analysis of multiple phenotypes to identify new associated genetic variants. However, although several methods have been proposed, there are very few large-scale applications published so far because of challenges in implementing these methods on real data. Here, we present JASS (Joint Analysis of Summary Statistics), a polyvalent Python package that addresses this need. Our package incorporates recently developed joint tests such as the omnibus approach and various weighted sum of Z-score tests while solving all practical and computational barriers for large-scale multivariate analysis of GWAS summary statistics. This includes data cleaning and harmonization tools, an efficient algorithm for fast derivation of joint statistics, an optimized data management process and a web interface for exploration purposes. Both benchmark analyses and real data applications demonstrated the robustness and strong potential of JASS for the detection of new associated genetic variants. Our package is freely available at https://gitlab.pasteur.fr/statistical-genetics/jass.
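
To make the two families of joint tests named in this abstract concrete, here is a minimal two-trait sketch of an omnibus statistic and a weighted sum of Z-scores, written with the standard library only. It is not the JASS API: the function names are hypothetical, and the null correlation `rho` between the two Z-scores is assumed to be known.

```python
import math


def omnibus_two_traits(z1, z2, rho):
    """Omnibus statistic z' R^-1 z for two traits and its p-value.

    R is the 2x2 correlation matrix [[1, rho], [rho, 1]] of the Z-scores
    under the null (assumed known). The statistic follows a chi-square
    distribution with 2 degrees of freedom under the null.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    stat = (z1 * z1 - 2.0 * rho * z1 * z2 + z2 * z2) / (1.0 - rho * rho)
    p_value = math.exp(-stat / 2.0)  # chi-square survival function, 2 df
    return stat, p_value


def sum_z_two_traits(z1, z2, rho, w1=1.0, w2=1.0):
    """Weighted sum of Z-scores, squared and scaled by its null variance.

    The statistic follows a chi-square distribution with 1 degree of
    freedom under the null.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    variance = w1 * w1 + w2 * w2 + 2.0 * w1 * w2 * rho
    if variance <= 0.0:
        raise ValueError("weights must not both be zero")
    stat = (w1 * z1 + w2 * z2) ** 2 / variance
    p_value = math.erfc(math.sqrt(stat / 2.0))  # chi-square survival function, 1 df
    return stat, p_value
```

The contrast between the two is visible on opposite-sign effects: with `rho = 0`, Z-scores of 2 and -2 give an omnibus statistic of 8 but an equal-weight sum statistic of 0, since the sum test only detects effects aligned with its weights.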
