1.
Nat Methods ; 18(4): 389-396, 2021 04.
Article in English | MEDLINE | ID: mdl-33828272

ABSTRACT

Protein engineering has enormous academic and industrial potential. However, it is limited by the lack of experimental assays that are consistent with the design goal and sufficiently high throughput to find rare, enhanced variants. Here we introduce a machine learning-guided paradigm that can use as few as 24 functionally assayed mutant sequences to build an accurate virtual fitness landscape and screen ten million sequences via in silico directed evolution. As demonstrated in two dissimilar proteins, GFP from Aequorea victoria (avGFP) and E. coli strain TEM-1 β-lactamase, top candidates from a single round are diverse and as active as engineered mutants obtained from previous high-throughput efforts. By distilling information from natural protein sequence landscapes, our model learns a latent representation of 'unnaturalness', which helps to guide search away from nonfunctional sequence neighborhoods. Subsequent low-N supervision then identifies improvements to the activity of interest. In sum, our approach enables efficient use of resource-intensive high-fidelity assays without sacrificing throughput, and helps to accelerate engineered proteins into the fermenter, field and clinic.


Subjects
Deep Learning , Protein Engineering/methods , Algorithms , Models, Molecular , beta-Lactamases/chemistry
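
The in silico directed evolution step mentioned in the abstract of entry 1 can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not specify the search procedure, so a simple Metropolis-style acceptance rule is used, and `toy_fitness` is a hypothetical stand-in for a trained fitness model (it only counts matches to an arbitrary target string). The names `mutate`, `directed_evolution`, `temperature` and `steps` are likewise hypothetical.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_fitness(seq, target):
    """Hypothetical stand-in for a learned fitness model.

    Returns the fraction of positions at which seq matches target.
    """
    return sum(a == b for a, b in zip(seq, target)) / len(target)


def mutate(seq, rng):
    """Return a copy of seq with one random position substituted."""
    pos = rng.randrange(len(seq))
    new_aa = rng.choice([a for a in AMINO_ACIDS if a != seq[pos]])
    return seq[:pos] + new_aa + seq[pos + 1:]


def directed_evolution(start, score, steps=2000, temperature=0.01, seed=0):
    """Search sequence space with single mutations and a Metropolis rule.

    Assumption: this acceptance rule is illustrative, not the method
    described in the cited article.
    """
    rng = random.Random(seed)
    current, current_score = start, score(start)
    best, best_score = current, current_score
    for _ in range(steps):
        proposal = mutate(current, rng)
        proposal_score = score(proposal)
        delta = proposal_score - current_score
        # Always accept improvements; accept declines with a small probability.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_score = proposal, proposal_score
            if current_score > best_score:
                best, best_score = current, current_score
    return best, best_score


target = "MKTAYIAKQR"  # arbitrary target, for illustration only
start = "A" * len(target)
best, best_score = directed_evolution(start, lambda s: toy_fitness(s, target))
print(best, best_score)
```

In a realistic setting the `score` argument would be a model fitted to the small set of assayed variants; the search loop itself does not depend on how that model is built.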
2.
Nat Methods ; 16(12): 1315-1322, 2019 12.
Article in English | MEDLINE | ID: mdl-31636460

ABSTRACT

Rational protein engineering requires a holistic understanding of protein function. Here, we apply deep learning to unlabeled amino-acid sequences to distill the fundamental features of a protein into a statistical representation that is semantically rich and structurally, evolutionarily and biophysically grounded. We show that the simplest models built on top of this unified representation (UniRep) are broadly applicable and generalize to unseen regions of sequence space. Our data-driven approach predicts the stability of natural and de novo designed proteins, and the quantitative function of molecularly diverse mutants, competitively with the state-of-the-art methods. UniRep further enables two orders of magnitude efficiency improvement in a protein engineering task. UniRep is a versatile summary of fundamental protein features that can be applied across protein engineering informatics.


Subjects
Deep Learning , Protein Engineering/methods , Amino Acid Sequence , Mutation , Protein Stability
3.
BMC Genomics ; 17 Suppl 2: 395, 2016 06 23.
Article in English | MEDLINE | ID: mdl-27356864

ABSTRACT

BACKGROUND: Somatic mutations in cancer cells affect various genomic elements disrupting important cell functions. In particular, mutations in DNA binding sites recognized by transcription factors can alter regulator binding affinities and, consequently, expression of target genes. A number of promoter mutations have been linked with an increased risk of cancer. Cancer somatic mutations in binding sites of selected transcription factors have been found under positive selection. However, action and significance of negative selection in non-coding regions remain controversial. RESULTS: Here we present analysis of transcription factor binding motifs co-localized with non-coding variants. To avoid statistical bias we account for mutation signatures of different cancer types. For many transcription factors, including multiple members of FOX, HOX, and NR families, we show that human cancers accumulate fewer mutations than expected by chance that increase or decrease affinity of predicted binding sites. Such stability of binding motifs is even more exhibited in DNase accessible regions. CONCLUSIONS: Our data demonstrate negative selection against binding sites alterations and suggest that such selection pressure protects cancer cells from rewiring of regulatory circuits. Further analysis of transcription factors with conserved binding motifs can reveal cell regulatory pathways crucial for the survivability of various human cancers.


Subjects
DNA/metabolism , Mutation , Neoplasms/genetics , Transcription Factors/metabolism , Binding Sites , DNA/chemistry , DNA/genetics , Humans , Neoplasms/metabolism , Promoter Regions, Genetic , Protein Binding , Selection, Genetic , Transcription Factors/chemistry
4.
Database (Oxford) ; 2015: bav067, 2015.
Article in English | MEDLINE | ID: mdl-26153137

ABSTRACT

Epigenetics refers to stable and long-term alterations of cellular traits that are not caused by changes in the DNA sequence per se. Rather, covalent modifications of DNA and histones affect gene expression and genome stability via proteins that recognize and act upon such modifications. Many enzymes that catalyse epigenetic modifications or are critical for enzymatic complexes have been discovered, and this is encouraging investigators to study the role of these proteins in diverse normal and pathological processes. Rapidly growing knowledge in the area has resulted in the need for a resource that compiles, organizes and presents curated information to the researchers in an easily accessible and user-friendly form. Here we present EpiFactors, a manually curated database providing information about epigenetic regulators, their complexes, targets and products. EpiFactors contains information on 815 proteins, including 95 histones and protamines. For 789 of these genes, we include expression values across several samples, in particular a collection of 458 human primary cell samples (for approximately 200 cell types, in many cases from three individual donors), covering most mammalian cell steady states, 255 different cancer cell lines (representing approximately 150 cancer subtypes) and 134 human postmortem tissues. Expression values were obtained by the FANTOM5 consortium using Cap Analysis of Gene Expression technique. EpiFactors also contains information on 69 protein complexes that are involved in epigenetic regulation. The resource is practical for a wide range of users, including biologists, pharmacologists and clinicians.


Subjects
Databases, Genetic , Epigenesis, Genetic , Genomic Instability , Histones , Neoplasm Proteins , Neoplasms , Protamines , Epigenomics , Histones/biosynthesis , Histones/genetics , Humans , Neoplasm Proteins/biosynthesis , Neoplasm Proteins/genetics , Neoplasms/genetics , Neoplasms/metabolism , Protamines/genetics , Protamines/metabolism