1.
BMC Bioinformatics ; 23(1): 498, 2022 Nov 19.
Article in English | MEDLINE | ID: mdl-36402955

ABSTRACT

BACKGROUND: Genome-wide association studies (GWAS) are a powerful method to detect associations between variants and phenotypes. A GWAS requires several complex computations with large data sets, and many steps may need to be repeated with varying parameters. Running these analyses manually can be tedious, error-prone and hard to reproduce. RESULTS: The H3AGWAS workflow from the Pan-African Bioinformatics Network for H3Africa is a powerful, scalable and portable workflow that implements pre-association analysis, various association testing methods and post-association analysis of results. CONCLUSIONS: The workflow scales from laptop to cluster to cloud (e.g., SLURM, AWS Batch, Azure). All required software is containerised and can run under Docker or Singularity.


Subjects
Computational Biology , Genome-Wide Association Study , Workflow , Computational Biology/methods , Software , Phenotype
2.
BMC Bioinformatics ; 19(1): 457, 2018 Nov 29.
Article in English | MEDLINE | ID: mdl-30486782

ABSTRACT

BACKGROUND: The Pan-African bioinformatics network, H3ABioNet, comprises 27 research institutions in 17 African countries. H3ABioNet is part of the Human Heredity and Health in Africa program (H3Africa), an African-led research consortium funded by the US National Institutes of Health and the UK Wellcome Trust, aimed at using genomics to study and improve the health of Africans. A key role of H3ABioNet is to support H3Africa projects by building bioinformatics infrastructure such as portable and reproducible bioinformatics workflows for use on heterogeneous African computing environments. Processing and analysis of genomic data is an example of a big data application requiring complex interdependent data analysis workflows. Such bioinformatics workflows take the primary and secondary input data through several computationally-intensive processing steps using different software packages, where some of the outputs form inputs for other steps. Implementing scalable, reproducible, portable and easy-to-use workflows is particularly challenging. RESULTS: H3ABioNet has built four workflows to support (1) the calling of variants from high-throughput sequencing data; (2) the analysis of microbial populations from 16S rDNA sequence data; (3) genotyping and genome-wide association studies; and (4) single nucleotide polymorphism imputation. A week-long hackathon was organized in August 2016 with participants from six African bioinformatics groups, and US and European collaborators. Two of the workflows are built using the Common Workflow Language framework (CWL) and two using Nextflow. All the workflows are containerized for improved portability and reproducibility using Docker, and are publicly available for use by members of the H3Africa consortium and the international research community.
CONCLUSION: The H3ABioNet workflows have been implemented to offer ease of use for the end user and high levels of reproducibility and portability, while following modern, state-of-the-art bioinformatics data processing protocols. The H3ABioNet workflows serve the H3Africa consortium projects and are currently in use. All four workflows are also publicly available for research scientists worldwide to use and adapt for their respective needs. The H3ABioNet workflows will help develop bioinformatics capacity and assist genomics research within Africa and serve to increase the scientific output of H3Africa and its Pan-African Bioinformatics Network.


Subjects
Computational Biology/methods , Genomics/methods , Africa , Humans , Reproducibility of Results
3.
Bioinformatics ; 33(9): 1418-1420, 2017 May 1.
Article in English | MEDLINE | ID: mdl-28453679

ABSTRACT

Summary: BioPAXViz is a Cytoscape (version 3) application, providing a comprehensive framework for metabolic pathway visualization. Beyond the basic parsing, viewing and browsing roles, the main novel function that BioPAXViz provides is a visual comparative analysis of metabolic pathway topologies across pre-computed pathway phylogenomic profiles given a species phylogeny. Furthermore, BioPAXViz supports the display of hierarchical trees that allow efficient navigation through sets of variants of a single reference pathway. Thus, BioPAXViz can significantly facilitate, and contribute to, the study of metabolic pathway evolution and engineering. Availability and Implementation: BioPAXViz has been developed as a Cytoscape app and is available at: https://github.com/CGU-CERTH/BioPAX.Viz. The software is distributed under the MIT License and is accompanied by example files and data. Additional documentation is available at the aforementioned GitHub repository. Contact: ouzounis@certh.gr.


Subjects
Computational Biology/methods , Molecular Evolution , Metabolic Networks and Pathways/genetics , Software , Phylogeny
4.
Nat Genet ; 2024 Oct 02.
Article in English | MEDLINE | ID: mdl-39358599

ABSTRACT

Men of African descent have the highest prostate cancer incidence and mortality rates, yet the genetic basis of prostate cancer in African men has been understudied. We used genomic data from 3,963 cases and 3,509 controls from Ghana, Nigeria, Senegal, South Africa and Uganda to infer ancestry-specific genetic architectures and fine-map disease associations. Fifteen independent associations at 8q24.21, 6q22.1 and 11q13.3 reached genome-wide significance, including four new associations. Intriguingly, multiple lead associations are private alleles, a pattern arising from recent mutations and the out-of-Africa bottleneck. These African-specific alleles contribute to haplotypes with odds ratios above 2.4. We found that the genetic architecture of prostate cancer differs across Africa, with effect size differences contributing more to this heterogeneity than allele frequency differences. Population genetic analyses reveal that African prostate cancer associations are largely governed by neutral evolution. Collectively, our findings emphasize the utility of conducting genetic studies that use diverse populations.

5.
Cell Genom ; 3(10): 100386, 2023 Oct 11.
Article in English | MEDLINE | ID: mdl-37868041

ABSTRACT

A lack of diversity in genomics for health continues to hinder equitable leadership and access to precision medicine approaches for underrepresented populations. To avoid perpetuating biases within the genomics workforce and genomic data collection practices, equity, diversity, and inclusion (EDI) must be addressed. This paper documents the journey taken by the Global Alliance for Genomics and Health (a genomics-based standard-setting and policy-framing organization) to create a more equitable, diverse, and inclusive environment for its standards and members. Initial steps include the creation of two groups: the Equity, Diversity, and Inclusion Advisory Group and the Regulatory and Ethics Diversity Group. Following a framework that we call "Reflected in our Teams, Reflected in our Standards," both groups address EDI at different stages in their policy development process.

6.
Res Sq ; 2023 Oct 12.
Article in English | MEDLINE | ID: mdl-37886553

ABSTRACT

Men of African descent have the highest prostate cancer (CaP) incidence and mortality rates, yet the genetic basis of CaP in African men has been understudied. We used genomic data from 3,963 CaP cases and 3,509 controls recruited in Ghana, Nigeria, Senegal, South Africa, and Uganda, to infer ancestry-specific genetic architectures and fine-mapped disease associations. Fifteen independent associations at 8q24.21, 6q22.1, and 11q13.3 reached genome-wide significance, including four novel associations. Intriguingly, multiple lead SNPs are private alleles, a pattern arising from recent mutations and the out-of-Africa bottleneck. These African-specific alleles contribute to haplotypes with odds ratios above 2.4. We found that the genetic architecture of CaP differs across Africa, with effect size differences contributing more to this heterogeneity than allele frequency differences. Population genetic analyses reveal that African CaP associations are largely governed by neutral evolution. Collectively, our findings emphasize the utility of conducting genetic studies that use diverse populations.

7.
NPJ Precis Oncol ; 6(1): 39, 2022 Jun 17.
Article in English | MEDLINE | ID: mdl-35715489

ABSTRACT

Carriers of germline BRCA2 pathogenic sequence variants have elevated aggressive prostate cancer risk and are candidates for precision oncology treatments. We examined whether BRCA2-deficient (BRCA2d) prostate tumors have distinct genomic alterations compared with BRCA2-intact (BRCA2i) tumors. Among 2536 primary and 899 metastatic prostate tumors from the ICGC, GENIE, and TCGA databases, we identified 138 primary and 85 metastatic BRCA2d tumors. Total tumor mutation burden (TMB) was higher among primary BRCA2d tumors, although pathogenic TMB did not differ by tumor BRCA2 status. Pathogenic and total single nucleotide variant (SNV) frequencies at KMT2D were higher in BRCA2d primary tumors, as was the total SNV frequency at KMT2D in BRCA2d metastatic tumors. Homozygous deletions at NEK3, RB1, and APC were enriched in BRCA2d primary tumors, and RB1 deletions in metastatic BRCA2d tumors as well. TMPRSS2-ETV1 fusions were more common in BRCA2d tumors. These results identify somatic alterations that hallmark etiological and prognostic differences between BRCA2d and BRCA2i prostate tumors.

8.
Genome Biol ; 23(1): 194, 2022 Sep 13.
Article in English | MEDLINE | ID: mdl-36100952

ABSTRACT

BACKGROUND: Genome-wide association studies do not always replicate well across populations, limiting the generalizability of polygenic risk scores (PRS). Despite higher incidence and mortality rates of prostate cancer in men of African descent, much of what is known about cancer genetics comes from populations of European descent. To understand how well genetic predictions perform in different populations, we evaluated test characteristics of PRS from three previous studies using data from the UK Biobank and a novel dataset of 1298 prostate cancer cases and 1333 controls from Ghana, Nigeria, Senegal, and South Africa. RESULTS: Allele frequency differences cause predicted risks of prostate cancer to vary across populations. However, natural selection is not the primary driver of these differences. Comparing continental datasets, we find that polygenic predictions of case vs. control status are more effective for European individuals (AUC 0.608-0.707, OR 2.37-5.71) than for African individuals (AUC 0.502-0.585, OR 0.95-2.01). Furthermore, PRS that leverage information from African Americans yield modest AUC and odds ratio improvements for sub-Saharan African individuals. These improvements were larger for West Africans than for South Africans. Finally, we find that existing PRS are largely unable to predict whether African individuals develop aggressive forms of prostate cancer, as specified by higher tumor stages or Gleason scores. CONCLUSIONS: Genetic predictions of prostate cancer perform poorly if the study sample does not match the ancestry of the original GWAS. PRS built from European GWAS may be inadequate for application in non-European populations and perpetuate existing health disparities.
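To make the AUC figures in the abstract above concrete, here is a minimal Python sketch of how a polygenic risk score and its AUC are commonly computed. It is not the authors' pipeline: the function names, effect weights and genotype dosages are hypothetical, and the score is assumed to be a simple weighted sum of risk-allele counts.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of risk-allele counts (0, 1 or 2 per variant).

    Assumption: a plain additive score; published PRS may add other terms.
    """
    return sum(d * w for d, w in zip(dosages, weights))


def auc(case_scores, control_scores):
    """Probability that a random case scores higher than a random control.

    Ties count as half a win (the Mann-Whitney formulation of AUC).
    """
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical effect weights for three variants (illustrative only).
weights = [0.2, 0.1, 0.3]
cases = [[2, 1, 1], [1, 0, 2], [0, 1, 0]]
controls = [[0, 0, 1], [1, 1, 0], [0, 2, 0]]

case_scores = [polygenic_score(g, weights) for g in cases]
control_scores = [polygenic_score(g, weights) for g in controls]
print(round(auc(case_scores, control_scores), 3))
```

An AUC of 0.5 means the score separates cases from controls no better than chance, which is why values such as 0.502 for African individuals indicate poor transferability.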


Subjects
Genome-Wide Association Study , Prostatic Neoplasms , Africa South of the Sahara/epidemiology , Genetic Predisposition to Disease , Humans , Male , Prostatic Neoplasms/genetics , Risk Factors
9.
Inform Health Soc Care ; 45(1): 77-95, 2020 Jan.
Article in English | MEDLINE | ID: mdl-30653364

ABSTRACT

Background: While healthcare systems are investing resources in type 2 diabetes patients, self-management is becoming the new trend for these patients. Due to the pervasiveness of computing devices, a number of computerized systems are emerging to support the self-management of patients. Objective: The primary objective of this review is to identify and categorize the computational tools that exist for the self-management of type 2 diabetes, and to identify challenges that need to be addressed. Results: The tools have been categorized into web applications, mobile applications, games and ubiquitous diabetes management systems. We provide a detailed description of the salient features of each category along with a comparison of the various tools, listing their challenges and practical implications. A list of platforms that can be used to develop new tools for the self-management of type 2 diabetes, namely mobile applications development, sensor development, cloud computing, social media, and machine learning and predictive analysis platforms, is also provided. Discussions: This paper identifies a number of challenges in the existing categories of computational tools and consequently presents possible avenues for future research. Failure to address these issues will negatively affect the adoption rate of the self-management tools and applications.


Subjects
Type 2 Diabetes Mellitus/therapy , Health Behavior , Self-Management/methods , Blood Glucose Self-Monitoring/methods , Humans , Internet , Mobile Applications , Social Media , Telemedicine , Video Games
10.
Health Technol (Berl) ; 10(5): 1115-1127, 2020.
Article in English | MEDLINE | ID: mdl-32837807

ABSTRACT

Early detection of disease outbreaks is crucial, and even small improvements in detection can significantly affect a country's public health. In this work, we investigate the use of a crowdsourcing application and a real-time disease outbreak surveillance system for five diseases that are closely monitored in Mauritius: Influenza, Gastroenteritis, Upper Respiratory Tract Infection (URTI), Scabies and Conjunctivitis. We also analyze and correlate the collected data with past statistics. A crowdsourcing mobile application known as Disease Outbreak Tracker (DOT) was implemented and made public. A real-time disease surveillance system using the Early Aberration Reporting System algorithm (EARS) for analysis of the collected data was also implemented. The collected data were correlated to historical data for 2017. Data were successfully collected and plotted on a daily basis. The results show that a few cases of Flu and Scabies were reported in some districts. The EARS methods C1, C2 and C3 also depicted spikes above the set threshold on some days. The study covers data collected over a period of one month. Once symptoms data were collected using DOT, probabilistic methods were used to find the disease or diseases that the user was suffering from. The data were further processed to find the extent of the disease outbreak district-wise, per disease. These data were represented graphically for a rapid understanding of the situation in each district. Our findings concur with existing data for the same period for previous years, showing that the crowdsourcing application can aid in the detection of disease outbreaks.
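The EARS C1 method mentioned above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the DOT system's code: it assumes the commonly described C1 form (a 7-day baseline immediately before the current day, flagged when the standardized count exceeds 3), and the `min_sd` floor and the sample counts are hypothetical choices made here to avoid division by zero and to show a spike.

```python
from statistics import mean, stdev


def ears_c1(counts, baseline=7, threshold=3.0, min_sd=0.5):
    """Return indices of days whose C1 statistic exceeds the threshold.

    C1 compares each day's count with the mean and standard deviation
    of the `baseline` days immediately preceding it.
    `min_sd` is an assumed floor that guards against a zero deviation.
    """
    alerts = []
    for t in range(baseline, len(counts)):
        window = counts[t - baseline:t]
        sd = max(stdev(window), min_sd)
        c1 = (counts[t] - mean(window)) / sd
        if c1 > threshold:
            alerts.append(t)
    return alerts


# Hypothetical daily case counts for one disease in one district.
daily = [2, 3, 2, 4, 3, 2, 3, 3, 12, 3]
print(ears_c1(daily))
```

C2 and C3 are usually described as variants of the same idea (a lagged baseline, and a sum over recent days respectively); they are left out here to keep the sketch short.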

11.
R Soc Open Sci ; 7(12): 201293, 2020 Dec.
Article in English | MEDLINE | ID: mdl-33489277

ABSTRACT

The engineering of polymeric scaffolds for tissue regeneration has seen phenomenal growth during the past decades as materials scientists seek to understand cell biology and cell-material behaviour. Statistical methods are being applied to physico-chemical properties of polymeric scaffolds for tissue engineering (TE) to guide through the complexity of experimental conditions. We have attempted to use experimental in vitro data and physico-chemical data of electrospun polymeric scaffolds, tested for skin TE, to model scaffold performance using a machine learning (ML) approach. Fibre diameter, pore diameter, water contact angle and Young's modulus were used to find a correlation with the 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide (MTT) assay of L929 fibroblast cells on the scaffolds after 7 days. Six supervised learning algorithms were trained on the data using the Seaborn/Scikit-learn Python libraries. After hyperparameter tuning, random forest regression yielded the highest accuracy of 62.74%. The predictive model was also correlated with in vivo data. This is a first preliminary study on ML methods for the prediction of cell-material interactions on nanofibrous scaffolds.

12.
PLoS One ; 15(11): e0242780, 2020.
Article in English | MEDLINE | ID: mdl-33232371

ABSTRACT

As the genomic profile across cancers varies from person to person, patient prognosis and treatment may differ based on the mutational signature of each tumour. Thus, it is critical to understand genomic drivers of cancer and identify potential mutational commonalities across tumors originating at diverse anatomical sites. Large-scale cancer genomics initiatives, such as TCGA, ICGC and GENIE have enabled the analysis of thousands of tumour genomes. Our goal was to identify new cancer-causing mutations that may be common across tumour sites using mutational and gene expression profiles. Genomic and transcriptomic data from breast, ovarian, and prostate cancers were aggregated and analysed using differential gene expression methods to identify the effect of specific mutations on the expression of multiple genes. Mutated genes associated with the most differentially expressed genes were considered to be novel candidates for driver mutations, and were validated through literature mining, pathway analysis and clinical data investigation. Our driver selection method successfully identified 116 probable novel cancer-causing genes, with 4 discovered in patients having no alterations in any known driver genes: MXRA5, OBSCN, RYR1, and TG. The candidate genes previously not officially classified as cancer-causing showed enrichment in cancer pathways and in cancer diseases. They also matched expectations pertaining to properties of cancer genes, for instance, showing larger gene and protein lengths, and having mutation patterns suggesting oncogenic or tumor suppressor properties. Our approach allows for the identification of novel putative driver genes that are common across cancer sites using an unbiased approach without any a priori knowledge on pathways or gene interactions and is therefore an agnostic approach to the identification of putative common driver genes acting at multiple cancer sites.


Subjects
Nucleic Acid Databases , Neoplastic Gene Expression Regulation , Mutation , Oncogene Proteins , Precancerous Conditions , Transcriptome , Female , Gene Expression Profiling , Humans , Male , Oncogene Proteins/biosynthesis , Oncogene Proteins/genetics , Precancerous Conditions/genetics , Precancerous Conditions/metabolism
13.
Cancer Res ; 80(13): 2956-2966, 2020 Jul 1.
Article in English | MEDLINE | ID: mdl-32393663

ABSTRACT

Although prostate cancer is the leading cause of cancer mortality for African men, the vast majority of known disease associations have been detected in European study cohorts. Furthermore, most genome-wide association studies have used genotyping arrays that are hindered by SNP ascertainment bias. To overcome these disparities in genomic medicine, the Men of African Descent and Carcinoma of the Prostate (MADCaP) Network has developed a genotyping array that is optimized for African populations. The MADCaP Array contains more than 1.5 million markers and an imputation backbone that successfully tags over 94% of common genetic variants in African populations. This array also has a high density of markers in genomic regions associated with cancer susceptibility, including 8q24. We assessed the effectiveness of the MADCaP Array by genotyping 399 prostate cancer cases and 403 controls from seven urban study sites in sub-Saharan Africa. Samples from Ghana and Nigeria clustered together, whereas samples from Senegal and South Africa yielded distinct ancestry clusters. Using the MADCaP array, we identified cancer-associated loci that have large allele frequency differences across African populations. Polygenic risk scores for prostate cancer were higher in Nigeria than in Senegal. In summary, individual and population-level differences in prostate cancer risk were revealed using a novel genotyping array. SIGNIFICANCE: This study presents an Africa-specific genotyping array, which enables investigators to identify novel disease associations and to fine-map genetic loci that are associated with prostate and other cancers.


Subjects
Black People/genetics , Genetic Predisposition to Disease , Neoplasms/epidemiology , Neoplasms/genetics , Single Nucleotide Polymorphism , Prostatic Neoplasms/epidemiology , Prostatic Neoplasms/genetics , Case-Control Studies , Cohort Studies , Genetic Loci , Population Genetics , Genome-Wide Association Study , Humans , Male , Neoplasms/classification , Prostatic Neoplasms/classification , Risk Factors , South Africa/epidemiology
14.
Front Microbiol ; 10: 3119, 2019.
Article in English | MEDLINE | ID: mdl-32082269

ABSTRACT

Microbial genome-wide association studies (mGWAS) are a new and exciting research field that is adapting human GWAS methods to understand how variations in microbial genomes affect host or pathogen phenotypes, such as drug resistance, virulence, host specificity and prognosis. Several computational tools and methods have been developed or adapted from human GWAS to facilitate the discovery of novel mutations and structural variations that are associated with the phenotypes of interest. However, no comprehensive, end-to-end, user-friendly tool is currently available. The development of a broadly applicable pipeline presents a real opportunity among computational biologists. Here, (i) we review the prominent and promising tools, (ii) discuss analytical pitfalls and bottlenecks in mGWAS, (iii) provide insights into the selection of appropriate tools, (iv) highlight the gaps that still need to be filled and how users and developers can work together to overcome these bottlenecks. Use of mGWAS research can inform drug repositioning decisions as well as accelerate the discovery and development of more effective vaccines and antimicrobials for pressing infectious diseases of global health significance, such as HIV, TB, influenza, and malaria.

15.
BMJ Open ; 9(11): e029539, 2019 Nov 26.
Article in English | MEDLINE | ID: mdl-31772086

ABSTRACT

OBJECTIVE: This project aimed to develop and propose a standardised reporting guideline for kidney disease research and clinical data reporting, in order to improve kidney disease data quality and integrity, and to address the challenges associated with managing 'Big Data'. METHODS: A list of recommendations was proposed for the reporting guideline based on the systematic review and consolidation of previously published data collection and reporting standards, including PhenX measures and Minimal Information about a Proteomics Experiment (MIAPE). Thereafter, these recommendations were reviewed by domain specialists using an online survey, developed in Research Electronic Data Capture (REDCap). Following interpretation and consolidation of the survey results, the recommendations were mapped to existing ontologies using Zooma, Ontology Lookup Service and the Bioportal search engine. Additionally, an associated eXtensible Markup Language schema was created for the REDCap implementation to increase user friendliness and adoption. RESULTS: The online survey was completed by 53 respondents; the majority of respondents were dual clinician-researchers (57%), based in Australia (35%), Africa (33%) and North America (22%). Data elements within the reporting standard were identified as participant-level, study-level and experiment-level information, further subdivided into essential or optional information. CONCLUSION: The reporting guideline is readily employable for kidney disease research projects, and also adaptable for clinical utility. The adoption of the reporting guideline in kidney disease research can increase data quality and the value for long-term preservation, ensuring researchers gain the maximum benefit from their collected and generated data.


Subjects
Guidelines as Topic/standards , Kidney Diseases/therapy , Nephrology/standards , Translational Biomedical Research/standards , Biomedical Research/standards , Humans , Reproducibility of Results , Research Design
16.
Microbiol Res ; 211: 31-46, 2018 Jun.
Article in English | MEDLINE | ID: mdl-29705204

ABSTRACT

A number of examples of putative eukaryote-to-prokaryote horizontal gene transfer (HGT) have been proposed in the past using phylogenetic analysis in support of these claims, but none have attempted to map these gene transfers to the presence of genomic islands (GIs) in the host. Two of these cases have been examined in detail, including an ATP sulfurylase (ATPS) gene and a class I fructose bisphosphate aldolase (FBA I) gene that were putatively transferred to cyanobacteria of the genus Prochlorococcus from either green or red algae, respectively. Unlike previous investigations of HGT, parametric methods were initially used to detect genomic islands, then more traditional phylogenomic and phylogenetic methods were used to confirm or deny the HGT status of these genes. The combination of these three methods of analysis (detection of GIs, determination of genomic neighborhoods, and traditional phylogeny) lends strong support to the claim that trans-domain HGT has occurred in only one of these cases and further suggests a new insight into the method of transmission of FBA I, namely that cyanophage-mediated transfer may have been responsible for the HGT event in question. The described methods were then applied to a range of prochlorococcal genomes in order to characterize a candidate for eukaryote-to-prokaryote HGT that had not been previously studied by others. Application of the same methodology used to confirm or deny HGT for ATPS and FBA I identified a Δ12 fatty acid desaturase (FAD) gene that was likely transferred to Prochlorococcus from either green or red algae.


Subjects
Bacteriophages/genetics , Cyanobacteria/genetics , Eukaryota/genetics , Molecular Evolution , Horizontal Gene Transfer , Genomic Islands , Base Composition , Chlorophyta/genetics , Fructose-Bisphosphate Aldolase/genetics , Bacterial Genes/genetics , Genomics , Microsatellite Repeats , Phylogeny , Prochlorococcus/genetics , Rhodophyta/genetics , Protein Sequence Analysis , Sulfate Adenylyltransferase/classification , Sulfate Adenylyltransferase/genetics
17.
AAS Open Res ; 1: 9, 2018.
Article in English | MEDLINE | ID: mdl-32382696

ABSTRACT

The need for portable and reproducible genomics analysis pipelines is growing globally as well as in Africa, especially with the growth of collaborative projects like the Human Heredity and Health in Africa Consortium (H3Africa). The Pan-African H3Africa Bioinformatics Network (H3ABioNet) recognized the need for portable, reproducible pipelines adapted to heterogeneous compute environments, and for the nurturing of technical expertise in workflow languages and containerization technologies. To address this need, in 2016 H3ABioNet arranged its first Cloud Computing and Reproducible Workflows Hackathon, with the purpose of building key genomics analysis pipelines able to run on heterogeneous computing environments and meeting the needs of H3Africa research projects. This paper describes the preparations for this hackathon and reflects upon the lessons learned about its impact on building the technical and scientific expertise of African researchers. The workflows developed were made publicly available in GitHub repositories and deposited as container images on quay.io.

18.
Biosystems ; 156-157: 72-85, 2017.
Article in English | MEDLINE | ID: mdl-28392341

ABSTRACT

A multitude of algorithms for sequence comparison, short-read assembly and whole-genome alignment have been developed in the general context of molecular biology, to support technology development for high-throughput sequencing, numerous applications in genome biology and fundamental research on comparative genomics. The computational complexity of these algorithms has been previously reported in original research papers, yet this often neglected property has not been reviewed previously in a systematic manner and for a wider audience. We provide a review of space and time complexity of key sequence analysis algorithms and highlight their properties in a comprehensive manner, in order to identify potential opportunities for further research in algorithm or data structure optimization. The complexity aspect is poised to become pivotal as we will be facing challenges related to the continuous increase of genomic data on unprecedented scales and complexity in the foreseeable future, when robust biological simulation at the cell level and above becomes a reality.
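As one illustration of the space and time trade-offs the abstract above refers to, the sketch below computes the classic dynamic-programming edit distance between two sequences. It is an assumption that this algorithm is representative of the sequence comparison methods the review covers; the function name and test strings are chosen here for illustration. The full table costs O(mn) time and O(mn) space, but keeping only two rows reduces space to O(min(m, n)) while the time stays O(mn).

```python
def edit_distance(a, b):
    """Levenshtein distance using two rows of the DP table.

    Time is O(len(a) * len(b)); space is O(min(len(a), len(b))).
    """
    if len(a) < len(b):
        a, b = b, a  # keep the shorter sequence along the row
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, 1):
        curr = [i]  # deleting the first i characters of a
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


print(edit_distance("kitten", "sitting"))  # prints 3
```

The two-row form returns only the distance; recovering the alignment itself needs either the full table or a divide-and-conquer scheme, which is the kind of trade-off a complexity review would weigh.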


Subjects
Algorithms , Genomics , Sequence Alignment , Animals , Computational Biology , Genome , Humans , DNA Sequence Analysis
19.
Glob Heart ; 12(2): 91-98, 2017 Jun.
Article in English | MEDLINE | ID: mdl-28302555

ABSTRACT

BACKGROUND: Although pockets of bioinformatics excellence have developed in Africa, generally, large-scale genomic data analysis has been limited by the availability of expertise and infrastructure. H3ABioNet, a pan-African bioinformatics network, was established to build capacity specifically to enable H3Africa (Human Heredity and Health in Africa) researchers to analyze their data in Africa. Since the inception of the H3Africa initiative, H3ABioNet's role has evolved in response to changing needs from the consortium and the African bioinformatics community. OBJECTIVES: H3ABioNet set out to develop core bioinformatics infrastructure and capacity for genomics research in various aspects of data collection, transfer, storage, and analysis. METHODS AND RESULTS: Various resources have been developed to address genomic data management and analysis needs of H3Africa researchers and other scientific communities on the continent. NetMap was developed and used to build an accurate picture of network performance within Africa and between Africa and the rest of the world, and Globus Online has been rolled out to facilitate data transfer. A participant recruitment database was developed to monitor participant enrollment, and data is being harmonized through the use of ontologies and controlled vocabularies. The standardized metadata will be integrated to provide a search facility for H3Africa data and biospecimens. Because H3Africa projects are generating large-scale genomic data, facilities for analysis and interpretation are critical. H3ABioNet is implementing several data analysis platforms that provide a large range of bioinformatics tools or workflows, such as Galaxy, the Job Management System, and eBiokits. A set of reproducible, portable, and cloud-scalable pipelines to support the multiple H3Africa data types are also being developed and dockerized to enable execution on multiple computing infrastructures. 
In addition, new tools have been developed for analysis of the uniquely divergent African data and for downstream interpretation of prioritized variants. To provide support for these and other bioinformatics queries, an online bioinformatics helpdesk backed by broad consortium expertise has been established. Further support is provided by means of various modes of bioinformatics training. CONCLUSIONS: For the past 4 years, the development of infrastructure support and human capacity through H3ABioNet has significantly contributed to the establishment of African scientific networks, data analysis facilities, and training programs. Here, we describe the infrastructure and how it has affected genomics and bioinformatics research in Africa.


Subjects
Biomedical Research/methods , Computational Biology/trends , Genomics/methods , Africa , Humans