Results 1 - 20 of 33
1.
Clin Immunol ; : 110288, 2024 Jun 29.
Article in English | MEDLINE | ID: mdl-38950723

ABSTRACT

Interleukin-2 (IL-2) holds promise for the treatment of cancer and autoimmune diseases, but its high-dose usage is associated with systemic immunotoxicity. Differential IL-2 receptor (IL-2R) regulation might impact the function of cells upon IL-2 stimulation, possibly inducing cellular changes similar to those seen in patients with hypomorphic IL2RB mutations, who present with multiorgan autoimmunity. Here, we show that sustained high-dose IL-2 stimulation of human lymphocytes drastically reduces IL-2Rβ surface expression, especially on T cells, resulting in impaired IL-2R signaling, which correlates with high IL-2Rα baseline expression. IL-2R signaling in NK cells is maintained. CD4+ T cells, especially regulatory T cells, are more broadly affected than CD8+ T cells, consistent with lineage-specific differences in IL-2 responsiveness. Given the resemblance between the cellular characteristics of high-dose IL-2-stimulated cells and those of cells from patients with IL-2Rβ defects, the impact of continuous IL-2 stimulation on IL-2R signaling should be considered in the onset of clinical adverse events during IL-2 therapy.

2.
Bioinformatics ; 38(4): 1171-1172, 2022 Jan 27.
Article in English | MEDLINE | ID: mdl-34791064

ABSTRACT

SUMMARY: COBREXA.jl is a Julia package for scalable, high-performance constraint-based reconstruction and analysis of very large-scale biological models. Its primary purpose is to facilitate the integration of modern high-performance computing environments with the processing and analysis of large-scale metabolic models of challenging complexity. We report the architecture of the package, and demonstrate how the design promotes analysis scalability on several use-cases with multi-organism community models. AVAILABILITY AND IMPLEMENTATION: https://doi.org/10.17881/ZKCR-BT30. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Computing Methodologies , Software , Models, Biological
3.
Nucleic Acids Res ; 45(20): 11495-11514, 2017 Nov 16.
Article in English | MEDLINE | ID: mdl-29059321

ABSTRACT

The post-genomic era has provided researchers with a deluge of protein sequences. However, a significant fraction of the proteins encoded by sequenced genomes remains without an identified function. Here, we aim at determining how many enzymes of uncertain or unknown function are still present in the Saccharomyces cerevisiae and human proteomes. Using information available in the Swiss-Prot, BRENDA and KEGG databases in combination with a Hidden Markov Model-based method, we estimate that >600 yeast and 2000 human proteins (>30% of their proteins of unknown function) are enzymes whose precise function(s) remain(s) to be determined. This illustrates the impressive scale of the 'unknown enzyme problem'. We extensively review classical biochemical as well as more recent systematic experimental and computational approaches that can be used to support enzyme function discovery research. Finally, we discuss the possible roles of the elusive catalysts in light of recent developments in the fields of enzymology and metabolism as well as the significance of the unknown enzyme problem in the context of metabolic modeling, metabolic engineering and rare disease research.


Subjects
Biocatalysis , Genome, Fungal/genetics , Genome, Human/genetics , Metabolome/genetics , Saccharomyces cerevisiae/enzymology , Base Sequence , Chromosome Mapping , Databases, Genetic , Databases, Protein , Enzymes/analysis , Enzymes/genetics , Humans , Metabolomics/methods , Proteome/genetics , Quantitative Trait Loci , Saccharomyces cerevisiae/genetics
4.
Bioinformatics ; 33(12): 1852-1858, 2017 Jun 15.
Article in English | MEDLINE | ID: mdl-28200120

ABSTRACT

MOTIVATION: The extraction of sequence variants from the literature remains an important task. Existing methods primarily target standard (ST) mutation mentions (e.g. 'E6V'), leaving relevant mentions in natural language (NL) largely untapped (e.g. 'glutamic acid was substituted by valine at residue 6'). RESULTS: We introduced three new corpora suggesting named-entity recognition (NER) to be more challenging than anticipated: 28-77% of all articles contained mentions only available in NL. Our new method nala captured NL and ST by combining conditional random fields with word embedding features learned unsupervised from the entire PubMed. In our hands, nala substantially outperformed the state-of-the-art. For instance, we compared all unique mentions in new discoveries correctly detected by any of three methods (SETH, tmVar, or nala). Neither SETH nor tmVar discovered anything missed by nala, while nala uniquely tagged 33% of mentions. For NL mentions the corresponding value shot up to 100% nala-only. AVAILABILITY AND IMPLEMENTATION: Source code, API and corpora freely available at: http://tagtog.net/-corpora/IDP4+ . CONTACT: nala@rostlab.org. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
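
To make the ST/NL distinction in this abstract concrete, the following minimal Python sketch shows why a standard mention such as 'E6V' is easy to capture with a pattern while the natural-language phrasing is not. It is an illustration only and is not the nala, SETH, or tmVar implementation; the names `ST_MUTATION` and `find_st_mentions` are hypothetical, and the pattern covers only single-residue substitutions written with one-letter codes.

```python
import re

# One-letter codes for the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical pattern: wild-type residue, 1-based position, mutant residue (e.g. "E6V").
ST_MUTATION = re.compile(rf"\b([{AMINO_ACIDS}])([1-9][0-9]*)([{AMINO_ACIDS}])\b")

def find_st_mentions(text):
    """Return (wild_type, position, mutant) tuples for simple ST substitution mentions."""
    return [(m.group(1), int(m.group(2)), m.group(3))
            for m in ST_MUTATION.finditer(text)]

text = "The E6V variant was studied; glutamic acid was substituted by valine at residue 6."
# Only the ST mention is found; the NL description of the same change is missed.
print(find_st_mentions(text))
```

A pattern of this kind finds nothing in the second half of the sentence, which is the gap the abstract attributes to methods that target ST mentions only.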


Subjects
Data Mining/methods , Mutation , Natural Language Processing , Software , Humans , PubMed , Unsupervised Machine Learning
5.
Nat Chem Biol ; 11(5): 347-354, 2015 May.
Article in English | MEDLINE | ID: mdl-25848931

ABSTRACT

Huntington's disease (HD) is a currently incurable neurodegenerative condition caused by an abnormally expanded polyglutamine tract in huntingtin (HTT). We identified new modifiers of mutant HTT toxicity by performing a large-scale 'druggable genome' siRNA screen in human cultured cells, followed by hit validation in Drosophila. We focused on glutaminyl cyclase (QPCT), which had one of the strongest effects on mutant HTT-induced toxicity and aggregation in the cell-based siRNA screen and also rescued these phenotypes in Drosophila. We found that QPCT inhibition induced the levels of the molecular chaperone αB-crystallin and reduced the aggregation of diverse proteins. We generated new QPCT inhibitors using in silico methods followed by in vitro screening, which rescued the HD-related phenotypes in cell, Drosophila and zebrafish HD models. Our data reveal a new HD druggable target affecting mutant HTT aggregation and provide proof of principle for a discovery pipeline from druggable genome screen to drug development.


Subjects
Aminoacyltransferases/drug effects , Aminoacyltransferases/genetics , Huntington Disease/drug therapy , Huntington Disease/genetics , RNA, Small Interfering , Aminoacyltransferases/antagonists & inhibitors , Animals , Cells, Cultured , Computational Biology , Drosophila , Drug Evaluation, Preclinical , Enzyme Inhibitors/pharmacology , Enzyme Inhibitors/therapeutic use , Green Fluorescent Proteins/metabolism , Humans , Huntingtin Protein , Mice , Mice, Inbred C57BL , Mutation/genetics , Nerve Tissue Proteins/genetics , Nerve Tissue Proteins/metabolism , Neurons/drug effects , Neurons/metabolism , Zebrafish , alpha-Crystallin B Chain/metabolism
6.
Bioinformatics ; 30(22): 3249-56, 2014 Nov 15.
Article in English | MEDLINE | ID: mdl-25100685

ABSTRACT

SUMMARY: The iterative process of finding relevant information in biomedical literature and performing bioinformatics analyses might result in an endless loop for an inexperienced user, considering the exponential growth of scientific corpora and the plethora of tools designed to mine PubMed® and related biological databases. Herein, we describe BioTextQuest+, a web-based interactive knowledge exploration platform with significant advances over its predecessor (BioTextQuest), aiming to bridge processes such as bioentity recognition, functional annotation, document clustering and data integration towards literature mining and concept discovery. BioTextQuest+ enables PubMed and OMIM querying, retrieval of abstracts related to a targeted request and optimal detection of genes, proteins, molecular functions, pathways and biological processes within the retrieved documents. The front-end interface facilitates the browsing of document clustering per subject, the analysis of term co-occurrence, the generation of tag clouds containing highly represented terms per cluster and at-a-glance popup windows with information about relevant genes and proteins. Moreover, to support experimental research, BioTextQuest+ addresses integration of its primary functionality with biological repositories and software tools able to deliver further bioinformatics services. The Google-like interface extends beyond simple use by offering a range of advanced parameterization for expert users. We demonstrate the functionality of BioTextQuest+ through several exemplary research scenarios including author disambiguation, functional term enrichment, knowledge acquisition and concept discovery linking major human diseases, such as obesity and ageing. AVAILABILITY: The service is accessible at http://bioinformatics.med.uoc.gr/biotextquest. CONTACT: g.pavlopoulos@gmail.com or georgios.pavlopoulos@esat.kuleuven.be SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Data Mining/methods , Software , Authorship , Cluster Analysis , Disease/genetics , Genes , Humans , Internet , Medical Subject Headings , Proteins , PubMed , Publications
7.
Cell Commun Signal ; 13: 21, 2015 Mar 31.
Article in English | MEDLINE | ID: mdl-25880691

ABSTRACT

BACKGROUND: Gastrointestinal stromal tumours (GIST) are mainly characterised by the presence of activating mutations in either of the two receptor tyrosine kinases c-KIT or platelet-derived growth factor receptor-α (PDGFRα). Most mechanistic studies dealing with GIST mutations have focused on c-KIT and far less is known about the signalling characteristics of the mutated PDGFRα proteins. Here, we study the signalling capacities and corresponding transcriptional responses of the different PDGFRα proteins under comparable genomic conditions. RESULTS: We demonstrate that the constitutive signalling via the oncogenic PDGFRα mutants favours a mislocalisation of the receptors and that this modifies the signalling characteristics of the mutated receptors. We show that signalling via the oncogenic PDGFRα mutants is not solely characterised by a constitutive activation of the conventional PDGFRα signalling pathways. In contrast to wild-type PDGFRα signal transduction, the activation of STAT factors (STAT1, STAT3 and STAT5) is an integral part of signalling mediated via mutated PDGF-receptors. Furthermore, this unconventional STAT activation by mutated PDGFRα is already initiated in the endoplasmic reticulum whereas the conventional signalling pathways rather require cell surface expression of the receptor. Finally, we demonstrate that the activation of STAT factors also translates into a biologic response as highlighted by the induction of STAT target genes. CONCLUSION: We show that the overall oncogenic response is the result of different signatures emanating from different cellular compartments. Furthermore, STAT mediated responses are an integral part of mutated PDGFRα signalling.


Subjects
Gastrointestinal Neoplasms/metabolism , Mutation , Neoplasm Proteins/metabolism , Receptor, Platelet-Derived Growth Factor alpha/metabolism , STAT Transcription Factors/metabolism , Signal Transduction , Cell Line, Tumor , Endoplasmic Reticulum/genetics , Endoplasmic Reticulum/metabolism , Endoplasmic Reticulum/pathology , Enzyme Activation/genetics , Gastrointestinal Neoplasms/genetics , Gastrointestinal Neoplasms/pathology , Humans , Neoplasm Proteins/genetics , Receptor, Platelet-Derived Growth Factor alpha/genetics , STAT Transcription Factors/genetics
8.
PLoS Comput Biol ; 10(12): e1003951, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25474213

ABSTRACT

Huge research effort has been invested over many years to determine the phenotypes of natural or artificial mutations in HIV proteins; interpretation of mutation phenotypes is an invaluable source of new knowledge. The results of this research effort are recorded in the scientific literature, but it is difficult for virologists to find them rapidly. Manually locating data on phenotypic variation within the approximately 270,000 available HIV-related research articles, or the further 1,500 articles that are published each month, is a daunting task. Accordingly, the HIV research community would benefit from a resource cataloguing the available HIV mutation literature. We have applied computational text-mining techniques to parse and map mutagenesis and polymorphism information from the HIV literature, have enriched the data with ancillary information and have developed a public, web-based interface through which it can be intuitively explored: the HIV mutation browser. The current release of the HIV mutation browser describes the phenotypes of 7,608 unique mutations at 2,520 sites in the HIV proteome, resulting from the analysis of 120,899 papers. The mutation information for each protein is organised in a residue-centric manner and each residue is linked to the relevant experimental literature. The importance of HIV as a global health burden advocates extensive effort to maximise the efficiency of HIV research. The HIV mutation browser provides a valuable new resource for the research community. The HIV mutation browser is available at: http://hivmut.org.


Subjects
Computational Biology/methods , Databases, Genetic , HIV Infections/virology , HIV-1/genetics , Mutation/genetics , Amino Acid Sequence , Humans , Molecular Sequence Data
9.
J Proteome Res ; 12(12): 5954-62, 2013 Dec 06.
Article in English | MEDLINE | ID: mdl-24006944

ABSTRACT

Cell-cell interactions are of fundamental importance for cellular function. In islets of Langerhans, which control blood glucose levels by secreting insulin in response to the blood glucose concentration, the secretory response of intact islets is higher than that of insulin-producing beta-cells not arranged in the islet architecture. The objective was to define mechanisms by which cellular performance is enhanced when cells are arranged in three-dimensional space. The task was addressed by making a comprehensive analysis based on protein expression patterns generated from insulin-secreting MIN6 cells grown as islet-like clusters, so-called pseudoislets, and in monolayers. After culture, glucose-stimulated insulin secretion (GSIS) was measured from monolayers and pseudoislets. GSIS rose 6-fold in pseudoislets but only 3-fold in monolayers when the glucose concentration was increased from 2 to 20 mmol/L. Proteins from pseudoislets and monolayers were extracted and analyzed by liquid-chromatography mass spectrometry, and differentially expressed proteins were mapped onto KEGG pathways. Protein profiling identified 1576 proteins, which were common to pseudoislets and monolayers. When mapped onto KEGG pathways, 11 highly enriched pathways were identified. On the basis of differences in expression of proteins belonging to the pathways in pseudoislets and monolayers, predictions of differential pathway activation were performed. Mechanisms enhancing insulin secretory capacity of the beta-cell, when situated in the islet, include pathways regulating glucose metabolism, cell interaction, and translational regulation.


Subjects
Glucose/pharmacology , Insulin-Secreting Cells/cytology , Insulin-Secreting Cells/drug effects , Insulin/metabolism , Signal Transduction , Animals , Cell Communication , Cell Culture Techniques , Chromatography, Liquid , Gene Expression , Gene Expression Profiling , Glucose/metabolism , Insulin Secretion , Insulin-Secreting Cells/metabolism , Mass Spectrometry , Mice , Molecular Sequence Annotation
10.
Mol Pain ; 9: 48, 2013 Sep 25.
Article in English | MEDLINE | ID: mdl-24067145

ABSTRACT

BACKGROUND: Cancer-associated pain is a major cause of poor quality of life in cancer patients and is frequently resistant to conventional therapy. Recent studies indicate that some hematopoietic growth factors, namely granulocyte macrophage colony stimulating factor (GMCSF) and granulocyte colony stimulating factor (GCSF), are abundantly released in the tumor microenvironment and play a key role in regulating tumor-nerve interactions and tumor-associated pain by activating receptors on dorsal root ganglion (DRG) neurons. Moreover, these hematopoietic factors have been highly implicated in postsurgical pain, inflammatory pain and osteoarthritic pain. However, the molecular mechanisms via which G-/GMCSF bring about nociceptive sensitization and elicit pain are not known. RESULTS: In order to elucidate G-/GMCSF mediated transcriptional changes in the sensory neurons, we performed a comprehensive, genome-wide analysis of changes in the transcriptome of DRG neurons brought about by exposure to GMCSF or GCSF. We present complete information on regulated genes and validated profiling analyses and report novel regulatory networks and interaction maps revealed by detailed bioinformatics analyses. Amongst these, we validate calpain 2, matrix metalloproteinase 9 (MMP9) and a RhoGTPase Rac1 as well as Tumor necrosis factor alpha (TNFα) as transcriptional targets of G-/GMCSF and demonstrate the importance of MMP9 and Rac1 in GMCSF-induced nociceptor sensitization. CONCLUSION: With an integrative approach of bioinformatics, in vivo pharmacology and behavioral analyses, our results not only indicate that transcriptional control by G-/GMCSF signaling regulates a variety of established pain modulators, but also uncover a large number of novel targets, paving the way for translational analyses in the context of pain disorders.


Subjects
Ganglia, Spinal/drug effects , Granulocyte-Macrophage Colony-Stimulating Factor/pharmacology , Macrophage Colony-Stimulating Factor/pharmacology , Sensory Receptor Cells/drug effects , Animals , Mice , Mice, Inbred C57BL , Signal Transduction/drug effects
11.
Front Bioinform ; 3: 1101505, 2023.
Article in English | MEDLINE | ID: mdl-37502697

ABSTRACT

Introduction: Investigation of molecular mechanisms of human disorders, especially rare diseases, requires exploration of various knowledge repositories for building precise hypotheses and complex data interpretation. Recently, increasingly more resources offer diagrammatic representation of such mechanisms, including disease-dedicated schematics in pathway databases and disease maps. However, collection of knowledge across them is challenging, especially for research projects with limited manpower. Methods: In this article we present an automated workflow for construction of maps of molecular mechanisms for rare diseases. The workflow requires a standardized definition of a disease using Orphanet or HPO identifiers to collect relevant genes and variants, and to assemble a functional, visual repository of related mechanisms, including data overlays. The diagrams composing the final map are unified to a common systems biology format from CellDesigner SBML, GPML and SBML+layout+render. The constructed resource contains disease-relevant genes and variants as data overlays for immediate visual exploration, including embedded genetic variant browser and protein structure viewer. Results: We demonstrate the functionality of our workflow on two examples of rare diseases: Kawasaki disease and retinitis pigmentosa. Two maps are constructed based on their corresponding identifiers. Moreover, for the retinitis pigmentosa use-case, we include a list of differentially expressed genes to demonstrate how to tailor the workflow using omics datasets. Discussion: In summary, our work allows for an ad-hoc construction of molecular diagrams combined from different sources, preserving their layout and graphical style, but integrating them into a single resource. This reduces the time-consuming tasks of prototyping a molecular disease map, enabling visual exploration, hypothesis building, data visualization and further refinement. The code of the workflow is open and accessible at https://gitlab.lcsb.uni.lu/minerva/automap/.

12.
Front Neurol ; 14: 1330321, 2023.
Article in English | MEDLINE | ID: mdl-38174101

ABSTRACT

Background: Deep phenotyping of Parkinson's disease (PD) is essential to investigate this fastest-growing neurodegenerative disorder. Since 2015, over 800 individuals with PD and atypical parkinsonism along with more than 800 control subjects have been recruited in the frame of the observational, monocentric, nation-wide, longitudinal-prospective Luxembourg Parkinson's study. Objective: To profile the baseline dataset and to explore risk factors, comorbidities and clinical profiles associated with PD, atypical parkinsonism and controls. Methods: Epidemiological and clinical characteristics of all 1,648 participants divided into disease and control groups were investigated. Then, a cross-sectional group comparison was performed between the three largest groups: PD, progressive supranuclear palsy (PSP) and controls. Subsequently, multiple linear and logistic regression models were fitted adjusting for confounders. Results: The mean (SD) age at onset (AAO) of PD was 62.3 (11.8) years with 15% early onset (AAO < 50 years), mean disease duration 4.90 (5.16) years, male sex 66.5% and mean MDS-UPDRS III 35.2 (16.3). For PSP, the respective values were: 67.6 (8.2) years, all PSP with AAO > 50 years, 2.80 (2.62) years, 62.7% and 53.3 (19.5). The highest frequency of hyposmia was detected in PD followed by PSP and controls (72.9%; 53.2%; 14.7%), challenging the use of hyposmia as a discriminating feature in PD vs. PSP. Alcohol abstinence was significantly higher in PD than controls (17.6 vs. 12.9%, p = 0.003). Conclusion: The Luxembourg Parkinson's study constitutes a valuable resource to strengthen the understanding of complex traits in the aforementioned neurodegenerative disorders. It corroborated several previously observed clinical profiles, and provided insight on frequency of hyposmia in PSP and dietary habits, such as alcohol abstinence in PD. Clinical trial registration: clinicaltrials.gov, NCT05266872.

13.
Nucleic Acids Res ; 38(1): 26-38, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19858102

ABSTRACT

Life scientists are often interested in comparing two gene sets to gain insight into differences between two distinct, but related, phenotypes or conditions. Several tools have been developed for comparing gene sets, most of which find Gene Ontology (GO) terms that are significantly over-represented in one gene set. However, such tools often return GO terms that are too generic or too few to be informative. Here, we present Martini, an easy-to-use tool for comparing gene sets. Martini is based, not on GO, but on keywords extracted from Medline abstracts; Martini also supports a much wider range of species than comparable tools. To evaluate Martini we created a benchmark based on the human cell cycle, and we tested several comparable tools (CoPub, FatiGO, Marmite and ProfCom). Martini had the best benchmark performance, delivering a more detailed and accurate description of function. Martini also gave best or equal performance with three other datasets (related to Arabidopsis, melanoma and ovarian cancer), suggesting that Martini represents an advance in the automated comparison of gene sets. In agreement with previous studies, our results further suggest that literature-derived keywords are a richer source of gene-function information than GO annotations. Martini is freely available at http://martini.embl.de.


Subjects
Genes , Software , Terminology as Topic , Arabidopsis/genetics , Cell Cycle/genetics , Dictionaries as Topic , Genes, Neoplasm , Genes, Plant , Humans , MEDLINE , Melanoma/genetics
14.
Front Immunol ; 13: 1002629, 2022.
Article in English | MEDLINE | ID: mdl-36439150

ABSTRACT

Immune mediated inflammatory diseases (IMIDs) are a heterogeneous group of debilitating, multifactorial and unrelated conditions characterized by a dysregulated immune response leading to destructive chronic inflammation. The immune dysregulation can affect various organ systems: gut (e.g., inflammatory bowel disease), joints (e.g., rheumatoid arthritis), skin (e.g., psoriasis, atopic dermatitis), resulting in significant morbidity, reduced quality of life, increased risk for comorbidities, and premature death. As there are no reliable disease progression and therapy response biomarkers currently available, it is very hard to predict how the disease will develop and which treatments will be effective in a given patient. In addition, a considerable proportion of patients do not respond sufficiently to the treatment. ImmUniverse is a large collaborative consortium of 27 partners funded by the Innovative Medicines Initiative (IMI), which is sponsored by the European Union (Horizon 2020) and in-kind contributions of participating pharmaceutical companies within the European Federation of Pharmaceutical Industries and Associations (EFPIA). ImmUniverse aims to advance our understanding of the molecular mechanisms underlying two immune-mediated diseases, ulcerative colitis (UC) and atopic dermatitis (AD), by pursuing an integrative multi-omics approach. As a consequence of the heterogeneity among IMID patients, a comprehensive, evidence-based identification of novel biomarkers is necessary to enable appropriate patient stratification that would account for the inter-individual differences in disease severity, drug efficacy, side effects or prognosis. This would guide clinicians in the management of patients and represent a major step towards personalized medicine. ImmUniverse will combine the existing and novel advanced technologies, including multi-omics, to characterize both the tissue microenvironment and blood. This comprehensive, systems biology-oriented approach will allow for identification and validation of tissue and circulating biomarker signatures as well as mechanistic principles, which will provide information about disease severity and future disease progression. This truly makes the ImmUniverse Consortium an unparalleled approach.


Subjects
Dermatitis, Atopic , Precision Medicine , Humans , Quality of Life , Biomarkers , Disease Progression
16.
Drug Discov Today ; 26(3): 626-630, 2021 Mar.
Article in English | MEDLINE | ID: mdl-33338655

ABSTRACT

Translational research today is data-intensive and requires multi-stakeholder collaborations to generate and pool data together for integrated analysis. This leads to the challenge of harmonization of data from different sources with different formats and standards, which is often overlooked during project planning and thus becomes a bottleneck of the research progress. We report on our experience and lessons learnt about data curation for translational research garnered over the course of the European Translational Research Infrastructure & Knowledge management Services (eTRIKS) program (https://www.etriks.org), a unique, 5-year, cross-organizational, cross-cultural collaboration project funded by the Innovative Medicines Initiative of the EU. Here, we discuss the obstacles and suggest what steps are needed for effective data curation in translational research, especially for projects involving multiple organizations from academia and industry.


Subjects
Cooperative Behavior , Data Curation , Translational Research, Biomedical/organization & administration , Cross-Cultural Comparison , Humans
17.
Gigascience ; 9(11), 2020 Nov 18.
Article in English | MEDLINE | ID: mdl-33205814

ABSTRACT

BACKGROUND: The amount of data generated in large clinical and phenotyping studies that use single-cell cytometry is constantly growing. Recent technological advances allow the easy generation of data with hundreds of millions of single-cell data points with >40 parameters, originating from thousands of individual samples. The analysis of that amount of high-dimensional data places heavy demands on both the hardware and the software of high-performance computational resources. Current software tools often do not scale to datasets of such size; users are thus forced to downsample the data to bearable sizes, in turn losing accuracy and the ability to detect many underlying complex phenomena. RESULTS: We present GigaSOM.jl, a fast and scalable implementation of clustering and dimensionality reduction for flow and mass cytometry data. The implementation of GigaSOM.jl in the high-level and high-performance programming language Julia makes it accessible to the scientific community and allows for efficient handling and processing of datasets with billions of data points using distributed computing infrastructures. We describe the design of GigaSOM.jl, measure its performance and horizontal scaling capability, and showcase the functionality on a large dataset from a recent study. CONCLUSIONS: GigaSOM.jl facilitates the use of commonly available high-performance computing resources to process the largest available datasets within minutes, while producing results of the same quality as the current state-of-the-art software. Measurements indicate that the performance scales to much larger datasets. The example use on the data from a massive mouse phenotyping effort confirms the applicability of GigaSOM.jl to huge-scale studies.
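
As a rough illustration of why clustering of this kind can scale horizontally, the sketch below shows a chunk-wise batch codebook update in plain Python: each data chunk produces per-code sums and counts independently, and the partial results are merged afterwards. This is not the GigaSOM.jl code or its API. All function names are hypothetical, and the neighbourhood smoothing that distinguishes a self-organizing map from a k-means-style step is deliberately left out to keep the example short.

```python
def nearest_code(point, codebook):
    """Index of the codebook vector closest to point (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: sum((p - c) ** 2 for p, c in zip(point, codebook[i])))

def chunk_sums(chunk, codebook):
    """Partial result for one chunk: coordinate sums and point counts per nearest code."""
    dim = len(codebook[0])
    sums = [[0.0] * dim for _ in codebook]
    counts = [0] * len(codebook)
    for point in chunk:
        i = nearest_code(point, codebook)
        counts[i] += 1
        for d in range(dim):
            sums[i][d] += point[d]
    return sums, counts

def merge_and_update(partials, codebook):
    """Merge the partial results of all chunks and move each code to the mean of its points."""
    updated = []
    for i, code in enumerate(codebook):
        total = sum(counts[i] for _, counts in partials)
        if total == 0:
            updated.append(list(code))  # no points assigned: keep the code unchanged
            continue
        updated.append([sum(sums[i][d] for sums, _ in partials) / total
                        for d in range(len(code))])
    return updated

# Two chunks stand in for data held on two separate workers.
chunks = [[[0.0, 0.0], [0.2, 0.0]], [[1.0, 1.0], [0.8, 1.0]]]
codebook = [[0.0, 0.1], [1.0, 0.9]]
partials = [chunk_sums(chunk, codebook) for chunk in chunks]
print(merge_and_update(partials, codebook))
```

Because `chunk_sums` needs only its own chunk and the current codebook, the per-chunk work can run on separate machines, and only the small sums and counts have to be communicated.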


Subjects
Algorithms , Programming Languages , Animals , Cluster Analysis , Mice , Software
18.
Front Neurol ; 11: 524, 2020.
Article in English | MEDLINE | ID: mdl-32655481

ABSTRACT

Over the past two decades, our understanding of Parkinson's disease (PD) has been gleaned from the discoveries made in familial and/or sporadic forms of PD in the Caucasian population. The transferability and the clinical utility of genetic discoveries to other ethnically diverse populations are unknown. The Indian population has been under-represented in PD research. The Genetic Architecture of PD in India (GAP-India) project aims to develop one of the largest clinical/genomic bio-banks for PD in India. Specifically, the GAP-India project aims to: (1) develop a pan-Indian deeply phenotyped clinical repository of Indian PD patients; (2) perform whole-genome sequencing in 500 PD samples to catalog Indian genetic variability and to develop an Indian PD map for the scientific community; (3) perform a genome-wide association study to identify novel loci for PD and (4) develop a user-friendly web-portal to disseminate results for the scientific community. Our "hub-spoke" model follows an integrative approach to develop a pan-Indian outreach to develop a comprehensive cohort for PD research in India. The alignment of standard operating procedures for recruiting patients and collecting biospecimens with international standards ensures harmonization of data/bio-specimen collection at the beginning and also ensures stringent quality control parameters for sample processing. Data sharing and protection policies follow the guidelines established by local and national authorities. We are currently in the recruitment phase, targeting recruitment of 10,200 PD patients and 10,200 healthy volunteers by the end of 2020. After its completion, the GAP-India project will fill a critical gap that exists in PD research and will contribute a comprehensive genetic catalog of the Indian PD population to identify novel targets for PD.

20.
BMC Bioinformatics ; 9: 141, 2008 Mar 05.
Article in English | MEDLINE | ID: mdl-18321373

ABSTRACT

BACKGROUND: Modern proteomes evolved by modification of pre-existing ones. It is extremely important to comparative biology that related proteins be identified as members of the same cognate group, since a characterized putative homolog could be used to find clues about the function of uncharacterized proteins from the same group. Typically, databases of related proteins focus on those from completely-sequenced genomes. Unfortunately, relatively few organisms have had their genomes fully sequenced; accordingly, many proteins are ignored by the currently available databases of cognate proteins, despite the large number of important genes that are functionally described only for these incomplete proteomes. RESULTS: We have developed a method to cluster cognate proteins from multiple organisms beginning with only one sequence, through connectivity saturation with that Seed sequence. We show that the generated clusters are in agreement with some other approaches based on full genome comparison. CONCLUSION: The method produced results that are as reliable as those produced by conventional clustering approaches. Generating clusters based only on individual proteins of interest is less time consuming than generating clusters for whole proteomes.
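
The idea of growing a cluster outward from a single seed can be sketched as a graph traversal. The Python example below is a simplified illustration, not the authors' algorithm: the abstract does not spell out the saturation criterion, so plain reachability is assumed here, and the `neighbours` table of similarity hits and the protein names in it are hypothetical placeholders for results that a sequence search would supply.

```python
from collections import deque

def expand_from_seed(seed, neighbours):
    """Grow a cluster from one seed until no new sequences can be reached."""
    cluster = {seed}
    queue = deque([seed])
    while queue:
        current = queue.popleft()
        # Hits of the current sequence; sequences without recorded hits contribute nothing.
        for hit in neighbours.get(current, ()):
            if hit not in cluster:
                cluster.add(hit)
                queue.append(hit)
    return cluster

# Hypothetical similarity hits; protX and protY are not connected to the seed.
neighbours = {
    "seed": ["protA", "protB"],
    "protA": ["seed", "protC"],
    "protB": ["seed"],
    "protC": ["protA"],
    "protX": ["protY"],
}
print(sorted(expand_from_seed("seed", neighbours)))
```

Only sequences connected to the seed are ever visited, which matches the abstract's point that clustering around one protein of interest takes less work than clustering whole proteomes.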


Subjects
Algorithms , Cluster Analysis , Multigene Family , Pattern Recognition, Automated/methods , Proteome/chemistry , Sequence Alignment/methods , Sequence Analysis, Protein/methods , Amino Acid Sequence , Molecular Sequence Data , Sequence Homology, Amino Acid