1.
PLoS Biol ; 21(7): e3002204, 2023 07.
Article in English | MEDLINE | ID: mdl-37478129

ABSTRACT

Research data is optimized when it can be freely accessed and reused. To maximize research equity, transparency, and reproducibility, policymakers should take concrete steps to ensure that research software is openly accessible and reusable.


Subjects
Policy, Software, Reproducibility of Results
2.
PLoS Biol ; 20(12): e3001901, 2022 12.
Article in English | MEDLINE | ID: mdl-36508416

ABSTRACT

Does reductionism, in the era of machine learning and now interpretable AI, facilitate or hinder scientific insight? The protein ribbon diagram, as a means of visual reductionism, is a case in point.


Subjects
Machine Learning, Synapses
3.
BMC Bioinformatics ; 25(1): 11, 2024 Jan 04.
Article in English | MEDLINE | ID: mdl-38177985

ABSTRACT

BACKGROUND: Machine learning (ML) has a rich history in structural bioinformatics, and modern approaches, such as deep learning, are revolutionizing our knowledge of the subtle relationships between biomolecular sequence, structure, function, dynamics and evolution. As with any advance that rests upon statistical learning approaches, the recent progress in biomolecular sciences is enabled by the availability of vast volumes of sufficiently-variable data. To be useful, such data must be well-structured, machine-readable, intelligible and manipulable. These and related requirements pose challenges that become especially acute at the computational scales typical in ML. Furthermore, in structural bioinformatics such data generally relate to protein three-dimensional (3D) structures, which are inherently more complex than sequence-based data. A significant and recurring challenge concerns the creation of large, high-quality, openly-accessible datasets that can be used for specific training and benchmarking tasks in ML pipelines for predictive modeling projects, along with reproducible splits for training and testing.
RESULTS: Here, we report 'Prop3D', a platform that allows for the creation, sharing and extensible reuse of libraries of protein domains, featurized with biophysical and evolutionary properties that can range from detailed, atomically-resolved physicochemical quantities (e.g., electrostatics) to coarser, residue-level features (e.g., phylogenetic conservation). As a community resource, we also supply a 'Prop3D-20sf' protein dataset, obtained by applying our approach to CATH. We have developed and deployed the Prop3D framework, both in the cloud and on local HPC resources, to systematically and reproducibly create comprehensive datasets via the Highly Scalable Data Service (HSDS). Our datasets are freely accessible via a public HSDS instance, or they can be used with accompanying Python wrappers for popular ML frameworks.
CONCLUSION: Prop3D and its associated Prop3D-20sf dataset can be of broad utility in at least three ways. Firstly, the Prop3D workflow code can be customized and deployed on various cloud-based compute platforms, with scalability achieved largely by saving the results to distributed HDF5 files via HSDS. Secondly, the linked Prop3D-20sf dataset provides a hand-crafted, already-featurized dataset of protein domains for 20 highly-populated CATH families; importantly, provision of this pre-computed resource can aid the more efficient development (and reproducible deployment) of ML pipelines. Thirdly, Prop3D-20sf's construction explicitly takes into account (in creating datasets and data-splits) the enigma of 'data leakage', stemming from the evolutionary relationships between proteins.
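
The data-leakage concern raised in this abstract can be illustrated with a group-aware split, in which all domains from one evolutionary group land on the same side of the train/test boundary. The sketch below is a generic illustration and not the Prop3D procedure: the function name `split_by_group`, the domain identifiers, and the family labels are all hypothetical.

```python
import random
from collections import defaultdict


def split_by_group(domain_to_family, test_fraction=0.2, seed=0):
    """Split domains into train/test so that no family appears in both sets."""
    # Collect the domains that belong to each family.
    by_family = defaultdict(list)
    for domain, family in domain_to_family.items():
        by_family[family].append(domain)

    # Shuffle whole families (not individual domains) with a fixed seed,
    # so the split is reproducible.
    families = sorted(by_family)
    random.Random(seed).shuffle(families)

    # Hold out at least one family for testing.
    n_test = max(1, round(len(families) * test_fraction))
    test_families = set(families[:n_test])

    train = [d for f in families if f not in test_families for d in by_family[f]]
    test = [d for f in families if f in test_families for d in by_family[f]]
    return train, test


# Hypothetical identifiers, for illustration only.
domains = {
    "dom_a1": "fam_A", "dom_a2": "fam_A",
    "dom_b1": "fam_B",
    "dom_c1": "fam_C", "dom_c2": "fam_C",
    "dom_d1": "fam_D",
    "dom_e1": "fam_E",
}
train, test = split_by_group(domains, test_fraction=0.2, seed=0)
```

Shuffling families rather than individual domains is the design choice that matters here: a per-domain shuffle could place close homologs on both sides of the split and inflate test performance.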


Subjects
Computational Biology, Proteins, Humans, Phylogeny, Computational Biology/methods, Workflow, Machine Learning
4.
PLoS Biol ; 19(3): e3001165, 2021 03.
Article in English | MEDLINE | ID: mdl-33735179

ABSTRACT

Why would a computational biologist with 40 years of research experience say bioinformatics is dead? The short answer is, in being the Founding Dean of a new School of Data Science, what we do suddenly looks different.


Subjects
Computational Biology/methods, Computational Biology/trends, Data Science/trends, Computational Biology/education, Curriculum, Data Science/methods, Humans, Information Dissemination/methods, Schools, Students
5.
PLoS Comput Biol ; 19(1): e1010851, 2023 01.
Article in English | MEDLINE | ID: mdl-36652496

ABSTRACT

Systematically discovering protein-ligand interactions across the entire human and pathogen genomes is critical in chemical genomics, protein function prediction, drug discovery, and many other areas. However, more than 90% of gene families remain "dark", i.e., their small-molecule ligands are undiscovered due to experimental limitations or human/historical biases. Existing computational approaches typically fail when the dark protein differs from those with known ligands. To address this challenge, we have developed a deep learning framework, called PortalCG, which consists of four novel components: (i) a 3-dimensional ligand binding site enhanced sequence pre-training strategy to encode the evolutionary links between ligand-binding sites across gene families; (ii) an end-to-end pretraining-fine-tuning strategy to reduce the impact of inaccuracy of predicted structures on function predictions by recognizing the sequence-structure-function paradigm; (iii) a new out-of-cluster meta-learning algorithm that extracts and accumulates information learned from predicting ligands of distinct gene families (meta-data) and applies the meta-data to a dark gene family; and (iv) a stress model selection step, using different gene families in the test data from those in the training and development data sets to facilitate model deployment in a real-world scenario. In extensive and rigorous benchmark experiments, PortalCG considerably outperformed state-of-the-art techniques of machine learning and protein-ligand docking when applied to dark gene families, and demonstrated its generalization power for target identifications and compound screenings under out-of-distribution (OOD) scenarios. Furthermore, in an external validation for the multi-target compound screening, the performance of PortalCG surpassed the rational design from medicinal chemists.
Our results also suggest that a differentiable sequence-structure-function deep learning framework, where protein structural information serves as an intermediate layer, could be superior to conventional methodology where predicted protein structures were used for the compound screening. We applied PortalCG to two case studies to exemplify its potential in drug discovery: designing selective dual-antagonists of dopamine receptors for the treatment of opioid use disorder (OUD), and illuminating the understudied human genome for target diseases that do not yet have effective and safe therapeutics. Our results suggested that PortalCG is a viable solution to the OOD problem in exploring understudied regions of protein functional space.


Subjects
Algorithms, Proteins, Humans, Ligands, Proteins/chemistry, Binding Sites, Machine Learning, Protein Binding
6.
PLoS Comput Biol ; 19(12): e1011652, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38060459

ABSTRACT

Information is the cornerstone of research, from experimental (meta)data and computational processes to complex inventories of reagents and equipment. These 10 simple rules discuss best practices for leveraging laboratory information management systems to transform this large information load into useful scientific findings.

7.
PLoS Comput Biol ; 18(8): e1010395, 2022 08.
Article in English | MEDLINE | ID: mdl-36006874

ABSTRACT

Special sessions are important parts of scientific meetings and conferences: They gather together researchers and students interested in a specific topic and can strongly contribute to the success of the conference itself. Moreover, they can be a first step for trainees and students toward organizing a scientific event. Organizing a special session, however, can be difficult for beginners and students. Here, we provide ten simple rules to follow to organize a special session at a scientific conference.


Subjects
Research Personnel, Students, Humans
8.
PLoS Comput Biol ; 18(6): e1010130, 2022 06.
Article in English | MEDLINE | ID: mdl-35737640

ABSTRACT

Communication is a fundamental part of scientific development and methodology. With the advancement of the internet and social networks, communication has become rapid and sometimes overwhelming, especially in science. It is important to provide scientists with useful, effective, and dynamic tools to establish and build a fluid communication framework that allows for scientific advancement. Therefore, in this article, we present advice and recommendations that can help promote and improve science communication while respecting an adequate balance in the degree of commitment toward collaborative work. We have developed 10 rules shown in increasing order of commitment that are grouped into 3 key categories: (1) speak (based on active participation); (2) join (based on joining scientific groups); and (3) assess (based on the analysis and retrospective consideration of the weaknesses and strengths). We include examples and resources that provide actionable strategies for involvement and engagement with science communication, from basic steps to more advanced, introspective, and long-term commitments. Overall, we aim to help spread science from within and encourage and engage scientists to become involved in science communication effectively and dynamically.


Subjects
Communication, Social Networking, Retrospective Studies
9.
J Chem Inf Model ; 63(4): 1362-1370, 2023 02 27.
Article in English | MEDLINE | ID: mdl-36780612

ABSTRACT

KRAS, a common human oncogene, has been recognized as a critical drug target in treating multiple cancers. After four decades of effort, one allosteric KRAS drug (Sotorasib) has been approved, inspiring more KRAS-targeted drug research. Here, we provide the features of KRAS binding pockets and ligand-binding characteristics of KRAS complexes using a structural systems pharmacology approach. Three distinct binding sites (conserved nucleotide-binding site, shallow Switch-I/II pocket, and allosteric Switch-II/α3 pocket) are characterized. Ligand-binding features are determined based on encoded KRAS-inhibitor interaction fingerprints. Finally, the flexibility of the three distinct binding sites to accommodate different potential ligands, based on MD simulation, is discussed. Collectively, these findings are intended to facilitate rational KRAS drug design.
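
As a rough illustration of how encoded interaction fingerprints can be compared, the sketch below computes the Tanimoto (Jaccard) similarity between two binary fingerprints. It is a generic example under stated assumptions, not the encoding or metric used in the paper: the bit layout and the two example fingerprints are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two equal-length binary fingerprints."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    # Bits set in both fingerprints (shared interactions).
    both = sum(1 for a, b in zip(fp_a, fp_b) if a and b)
    # Bits set in at least one fingerprint (all observed interactions).
    either = sum(1 for a, b in zip(fp_a, fp_b) if a or b)
    # Convention assumed here: two empty fingerprints count as identical.
    return both / either if either else 1.0


# Hypothetical fingerprints: each position stands for one
# residue/interaction-type pair, 1 = interaction present.
complex_1 = [1, 1, 0, 1, 0, 0]
complex_2 = [1, 0, 0, 1, 1, 0]
similarity = tanimoto(complex_1, complex_2)  # 2 shared bits / 4 set bits = 0.5
```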


Subjects
Neoplasms, Proto-Oncogene Proteins p21(ras), Humans, Ligands, Binding Sites, Drug Design, Neoplasms/drug therapy, Mutation
10.
J Proteome Res ; 19(11): 4698-4705, 2020 11 06.
Article in English | MEDLINE | ID: mdl-32946692

ABSTRACT

The coronavirus disease of 2019 (COVID-19) pandemic speaks to the need for drugs that not only are effective but also remain effective given the mutation rate of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). To this end, we describe structural binding-site insights for facilitating COVID-19 drug design when targeting RNA-dependent RNA polymerase (RDRP), a common conserved component of RNA viruses. We combined an RDRP structure data set, including 384 RDRP PDB structures and all corresponding RDRP-ligand interaction fingerprints, thereby revealing the structural characteristics of the active sites for application to RDRP-targeted drug discovery. Specifically, we revealed the intrinsic ligand-binding modes and associated RDRP structural characteristics. Four types of binding modes with corresponding binding pockets were determined, suggesting two major subpockets available for drug discovery. We screened a drug data set of 7894 compounds against these binding pockets and presented the top-10 small molecules as a starting point in further exploring the prevention of virus replication. In summary, the binding characteristics determined here help rationalize RDRP-targeted drug discovery and provide insights into the specific binding mechanisms important for containing the SARS-CoV-2 virus.


Subjects
Betacoronavirus, Coronavirus Infections/virology, Drug Discovery/methods, Viral Pneumonia/virology, RNA-Dependent RNA Polymerase, Viral Proteins, Betacoronavirus/chemistry, Betacoronavirus/metabolism, Binding Sites, COVID-19, Humans, Molecular Docking Simulation, Pandemics, Protein Binding, RNA-Dependent RNA Polymerase/chemistry, RNA-Dependent RNA Polymerase/metabolism, SARS-CoV-2, Viral Proteins/chemistry, Viral Proteins/metabolism
11.
Annu Rev Pharmacol Toxicol ; 57: 245-262, 2017 01 06.
Article in English | MEDLINE | ID: mdl-27814027

ABSTRACT

Systems pharmacology aims to holistically understand mechanisms of drug actions to support drug discovery and clinical practice. Systems pharmacology modeling (SPM) is data driven. It integrates an exponentially growing amount of data at multiple scales (genetic, molecular, cellular, organismal, and environmental). The goal of SPM is to develop mechanistic or predictive multiscale models that are interpretable and actionable. The current explosions in genomics and other omics data, as well as the tremendous advances in big data technologies, have already enabled biologists to generate novel hypotheses and gain new knowledge through computational models of genome-wide, heterogeneous, and dynamic data sets. More work is needed to interpret and predict a drug response phenotype, which is dependent on many known and unknown factors. To gain a comprehensive understanding of drug actions, SPM requires close collaborations between domain experts from diverse fields and integration of heterogeneous models from biophysics, mathematics, statistics, machine learning, and semantic webs. This creates challenges in model management, model integration, model translation, and knowledge integration. In this review, we discuss several emergent issues in SPM and potential solutions using big data technology and analytics. The concurrent development of high-throughput techniques, cloud computing, data science, and the semantic web will likely allow SPM to be findable, accessible, interoperable, reusable, reliable, interpretable, and actionable.


Subjects
Statistical Data Interpretation, Factual Databases/statistics & numerical data, Clinical Pharmacology/methods, Systems Biology/methods, Animals, High-Throughput Screening Assays/methods, High-Throughput Screening Assays/trends, Humans, Clinical Pharmacology/trends, Systems Biology/trends
12.
BMC Med ; 18(1): 369, 2020 11 25.
Article in English | MEDLINE | ID: mdl-33234138

ABSTRACT

BACKGROUND: Given that an individual's age and gender are strongly predictive of coronavirus disease 2019 (COVID-19) outcomes, do such factors imply anything about preferable therapeutic options?
METHODS: An analysis of electronic health records for a large (68,466-case), international COVID-19 cohort, in 5-year age strata, revealed age-dependent sex differences. In particular, we surveyed the effects of systemic hormone administration in women. The primary outcome for estradiol therapy was death. Odds ratios (ORs) and Kaplan-Meier survival curves were analyzed for 37,086 women with COVID-19 in two age groups: pre- (15-49 years) and peri-/post-menopausal (>50 years).
RESULTS: The incidence of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection is higher in women than men (by about +15%) and, in contrast, the fatality rate is higher in men (about +50%). Interestingly, the relationships between these quantities are linked to age: pre-adolescent girls and boys had the same risk of infection and fatality rate, while adult premenopausal women had a significantly higher risk of infection than men in the same 5-year age stratum (about 16,000 vs. 12,000 cases). This ratio changed again in peri- and postmenopausal women, with infection susceptibility converging with men. While fatality rates increased continuously with age for both sexes, at 50 years, there was a steeper increase for men. Thus far, these types of intricacies have been largely neglected. Because the hormone 17β-estradiol influences expression of the human angiotensin-converting enzyme 2 (ACE2) protein, which plays a role in SARS-CoV-2 cellular entry, propensity score matching was performed for the women's sub-cohort, comparing users vs. non-users of estradiol. This retrospective study of hormone therapy in female COVID-19 patients shows that the fatality risk for women >50 years receiving estradiol therapy (user group) is reduced by more than 50%; the OR was 0.33, 95% CI [0.18, 0.62] and the hazard ratio (HR) was 0.29, 95% CI [0.11, 0.76]. For younger, pre-menopausal women (15-49 years), the risk of COVID-19 fatality is the same irrespective of estradiol treatment, probably because of higher endogenous estradiol levels.
CONCLUSIONS: As of this writing, still no effective drug treatment is available for COVID-19; since estradiol shows such a strong improvement regarding fatality in COVID-19, we suggest prospective studies on the potentially more broadly protective roles of this naturally occurring hormone.
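
For readers unfamiliar with how an odds ratio and its 95% confidence interval are obtained from a 2x2 table, the sketch below uses the standard log-odds (Woolf) approximation. It is illustrative only: the counts are hypothetical and are not taken from the study, and the study's propensity score matching step is not reproduced here.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: treated, died        b: treated, survived
    c: untreated, died      d: untreated, survived
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, NOT from the study.
estimate, lower, upper = odds_ratio_ci(a=20, b=80, c=40, d=60)
# estimate = (20 * 60) / (80 * 40) = 0.375
```

An interval that lies entirely below 1, as the one reported in the abstract does, indicates lower odds of death in the treated group at the chosen confidence level.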


Subjects
COVID-19/epidemiology, Estradiol/therapeutic use, Peptidyl-Dipeptidase A/therapeutic use, Viral Pneumonia/epidemiology, Adolescent, Adult, COVID-19/prevention & control, Female, Humans, Male, Middle Aged, Viral Pneumonia/drug therapy, Retrospective Studies, SARS-CoV-2, Sex Characteristics, Young Adult
13.
PLoS Biol ; 15(3): e2002041, 2017 03.
Article in English | MEDLINE | ID: mdl-28301467

ABSTRACT

The iconic image of the DNA double helix embodies the central role that three-dimensional structures play in understanding biological processes, which, in turn, impact health and well-being. Here, that role is explored through the eyes of one scientist, who has been lucky enough to have over 150 talented people pass through his laboratory. Each contributed to that understanding. What follows is a small fraction of their story, with an emphasis on basic research outcomes of importance to society at large.


Subjects
DNA/analysis, Life, Proteins/analysis, Computational Biology, Drug Discovery, Human Genome, Humans
14.
PLoS Biol ; 15(7): e2003082, 2017 Jul.
Article in English | MEDLINE | ID: mdl-28715407

ABSTRACT

This article describes efforts at the National Institutes of Health (NIH) from 2013 to 2016 to train a national workforce in biomedical data science. We provide an analysis of the Big Data to Knowledge (BD2K) training program strengths and weaknesses with an eye toward future directions aimed at any funder and potential funding recipient worldwide. The focus is on extramurally funded programs that have a national or international impact rather than the training of NIH staff, which was addressed by the NIH's internal Data Science Workforce Development Center. From its inception, the major goal of BD2K was to narrow the gap between needed and existing biomedical data science skills. As biomedical research increasingly relies on computational, mathematical, and statistical thinking, supporting the training and education of the workforce of tomorrow requires new emphases on analytical skills. From 2013 to 2016, BD2K jump-started training in this area for all levels, from graduate students to senior researchers.


Subjects
Computational Biology/education, Biomedical Research/education, Computational Biology/trends, National Institutes of Health (U.S.), Research Personnel/education, Teaching, United States
15.
PLoS Biol ; 15(4): e2001818, 2017 04.
Article in English | MEDLINE | ID: mdl-28388615

ABSTRACT

The thesis presented here is that biomedical research is based on the trusted exchange of services. That exchange would be conducted more efficiently if the trusted software platforms to exchange those services, if they exist, were more integrated. While simpler and narrower in scope than the services governing biomedical research, comparison to existing internet-based platforms, like Airbnb, can be informative. We illustrate how the analogy to internet-based platforms works and does not work and introduce The Commons, under active development at the National Institutes of Health (NIH) and elsewhere, as an example of the move towards platforms for research.


Subjects
Biomedical Research/standards, Database Management Systems/standards, Information Dissemination/methods, Program Evaluation/standards, Social Change, Trust, Animals, Biomedical Research/trends, Communication Barriers, Database Management Systems/trends, Efficiency, Humans, Internet, National Institutes of Health (U.S.), Periodicals as Topic/standards, Periodicals as Topic/trends, Program Evaluation/trends, Research Support as Topic/trends, Scientific Misconduct, Software, Technology Transfer, United States, Workforce
16.
PLoS Biol ; 15(8): e2002617, 2017 Aug.
Article in English | MEDLINE | ID: mdl-28763440

ABSTRACT

The Open Science Prize was established with the following objectives: first, to encourage the crowdsourcing of open data to make breakthroughs that are of biomedical significance; second, to illustrate that funders can indeed work together when scientific interests are aligned; and finally, to encourage international collaboration between investigators with the intent of achieving important innovations that would not be possible otherwise. The process for running the competition and the successes and challenges that arose are presented.


Subjects
Awards and Prizes, Crowdsourcing, Internationality
17.
PLoS Comput Biol ; 15(4): e1006842, 2019 04.
Article in English | MEDLINE | ID: mdl-31009453

ABSTRACT

Many proteins fold into highly regular and repetitive three-dimensional structures. The analysis of structural patterns and repeated elements is fundamental to understanding protein function and evolution. We present recent improvements to the CE-Symm tool for systematically detecting and analyzing the internal symmetry and structural repeats in proteins. In addition to the accurate detection of internal symmetry, the tool is now capable of i) reporting the type of symmetry, ii) identifying the smallest repeating unit, iii) describing the arrangement of repeats with transformation operations and symmetry axes, and iv) comparing the similarity of all the internal repeats at the residue level. CE-Symm 2.0 helps the user investigate proteins with a robust and intuitive sequence-to-structure analysis, with many applications in protein classification, functional annotation and evolutionary studies. We describe the algorithmic extensions of the method and demonstrate its applications to the study of interesting cases of protein evolution.


Subjects
Algorithms, Computational Biology/methods, Proteins/chemistry, Software, Amino Acid Sequence, Protein Databases, Molecular Models, Protein Sequence Analysis
18.
PLoS Comput Biol ; 19(3): e1010911, 2023 Mar.
Article in English | MEDLINE | ID: mdl-36862619
19.
PLoS Comput Biol ; 14(6): e1006144, 2018 06.
Article in English | MEDLINE | ID: mdl-29902176

ABSTRACT

Biomedical research has become a digital data-intensive endeavor, relying on secure and scalable computing, storage, and network infrastructure, which has traditionally been purchased, supported, and maintained locally. For certain types of biomedical applications, cloud computing has emerged as an alternative to locally maintained traditional computing approaches. Cloud computing offers users pay-as-you-go access to services such as hardware infrastructure, platforms, and software for solving common biomedical computational problems. Cloud computing services offer secure on-demand storage and analysis and are differentiated from traditional high-performance computing by their rapid availability and scalability of services. As such, cloud services are engineered to address big data problems and enhance the likelihood of data and analytics sharing, reproducibility, and reuse. Here, we provide an introductory perspective on cloud computing to help the reader determine its value to their own research.


Subjects
Biomedical Research/methods, Cloud Computing, Computational Biology/methods, Humans
20.
PLoS Comput Biol ; 19(12): e1011698, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38127691