ABSTRACT
Protein flexibility ranges from simple hinge movements to functional disorder. Around half of all human proteins contain apparently disordered regions with little 3D or functional information, and many of these proteins are associated with disease. Building on the evolutionary couplings approach previously successful in predicting 3D states of ordered proteins and RNA, we developed a method to predict the potential for ordered states for all apparently disordered proteins with sufficiently rich evolutionary information. The approach is highly accurate (79%) for residue interactions as tested in more than 60 known disordered regions captured in a bound or specific condition. Assessing the potential for structure of more than 1,000 apparently disordered regions of human proteins reveals a continuum of structural order with at least 50% with clear propensity for three- or two-dimensional states. Co-evolutionary constraints reveal hitherto unseen structures of functional importance in apparently disordered proteins.
Subjects
Intrinsically Disordered Proteins/chemistry, Directed Molecular Evolution/methods, Genomics, Humans, Intrinsically Disordered Proteins/genetics, Protein Secondary Structure, Protein Tertiary Structure, Proteome/chemistry, Proteome/genetics
ABSTRACT
Non-coding RNAs are ubiquitous, but the discovery of new RNA gene sequences far outpaces the research on the structure and functional interactions of these RNA gene sequences. We mine the evolutionary sequence record to derive precise information about the function and structure of RNAs and RNA-protein complexes. As in protein structure prediction, we use maximum entropy global probability models of sequence co-variation to infer evolutionarily constrained nucleotide-nucleotide interactions within RNA molecules and nucleotide-amino acid interactions in RNA-protein complexes. The predicted contacts allow all-atom blinded 3D structure prediction at good accuracy for several known RNA structures and RNA-protein complexes. For unknown structures, we predict contacts in 160 non-coding RNA families. Beyond 3D structure prediction, evolutionary couplings help identify important functional interactions, e.g., at switch points in riboswitches and at a complex nucleation site in HIV. Aided by increasing sequence accumulation, evolutionary coupling analysis can accelerate the discovery of functional interactions and 3D structures involving RNA.
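For orientation, a maximum entropy model of sequence co-variation of the kind mentioned above is commonly written as a pairwise (Potts-type) distribution over aligned sequences. The notation below (sequence \sigma of length L, site terms h_i, pair couplings J_{ij}, normalizer Z) does not appear in the abstract; it is an assumed, standard form and not necessarily the authors' exact formulation.

```latex
P(\sigma_1,\dots,\sigma_L) \;=\; \frac{1}{Z}\,
\exp\!\Big( \sum_{i=1}^{L} h_i(\sigma_i) \;+\; \sum_{1 \le i < j \le L} J_{ij}(\sigma_i,\sigma_j) \Big)
```

In this reading, position pairs (i, j) with strong inferred couplings J_{ij} are the candidates for evolutionarily constrained nucleotide-nucleotide or nucleotide-amino acid contacts.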
Subjects
Nucleic Acid Conformation, Untranslated RNA/chemistry, Entropy, Molecular Evolution, Molecular Models, RNA Folding, Untranslated RNA/genetics, Untranslated RNA/metabolism, RNA-Binding Proteins/chemistry, RNA-Binding Proteins/metabolism, Ribosomes/metabolism
ABSTRACT
Three billion years of evolution has produced a tremendous diversity of protein molecules, but the full potential of proteins is likely to be much greater. Accessing this potential has been challenging for both computation and experiments because the space of possible protein molecules is much larger than the space of those likely to have functions. Here we introduce Chroma, a generative model for proteins and protein complexes that can directly sample novel protein structures and sequences, and that can be conditioned to steer the generative process towards desired properties and functions. To enable this, we introduce a diffusion process that respects the conformational statistics of polymer ensembles, an efficient neural architecture for molecular systems that enables long-range reasoning with sub-quadratic scaling, layers for efficiently synthesizing three-dimensional structures of proteins from predicted inter-residue geometries and a general low-temperature sampling algorithm for diffusion models. Chroma achieves protein design as Bayesian inference under external constraints, which can involve symmetries, substructure, shape, semantics and even natural-language prompts. The experimental characterization of 310 proteins shows that sampling from Chroma results in proteins that are highly expressed, fold and have favourable biophysical properties. The crystal structures of two designed proteins exhibit atomistic agreement with Chroma samples (a backbone root-mean-square deviation of around 1.0 Å). With this unified approach to protein design, we hope to accelerate the programming of protein matter to benefit human health, materials science and synthetic biology.
Subjects
Algorithms, Computer Simulation, Protein Conformation, Proteins, Humans, Bayes Theorem, Directed Molecular Evolution, Machine Learning, Molecular Models, Protein Folding, Proteins/chemistry, Proteins/metabolism, Semantics, Synthetic Biology/methods, Synthetic Biology/trends
ABSTRACT
BACKGROUND: Successful replantation relies on proper preservation of traumatically amputated parts. The established protocol for preservation, however, is inconsistently adhered to. The objective of this study is to examine the rate of proper preservation in multiple patient populations. METHODS: A retrospective review of patients from 2015 to 2019 at a single academic institution was conducted. Patients were included if they suffered a traumatic amputation, the amputated part was present for evaluation by the hand surgery team, and modality of preservation was documented. Additional data including method of patient transport, replantation attempt, and operative outcome were assessed. Patients were stratified based on whether proper preservation was employed and compared using chi-square tests. RESULTS: Ninety-one patients were included, thirty-one (34.1%) of whom had amputated parts which were properly preserved. Patients from referring facilities were more likely to present with properly preserved parts (45.0%) than those presenting from home (25.5%), though this did not meet significance (P = .051). In total, 74 patients arrived via EMS with 35.1% adherence to preservation protocol. Of the 31 patients who had properly preserved parts, 58.1% underwent attempted replant; of the 60 patients who had improperly preserved parts, 23.3% underwent attempted replantation (P = .001). CONCLUSIONS: The majority of patients who suffer traumatic amputations do not present with properly preserved amputated parts, limiting potential replantation. With a direct correlation to attempted replantation, proper preservation is a crucial aspect of care and should not be overlooked when seeking to optimize efforts and results. LEVEL OF EVIDENCE: Level IV.
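The chi-square comparison of replantation attempts can be re-derived from the figures in the abstract. The sketch below is illustrative only: the cell counts (18 of 31 and 14 of 60) are back-calculated from the reported percentages, which is an assumption, and the helper name `chi_square_2x2` is hypothetical. It uses the Pearson statistic without continuity correction; the abstract does not say which variant the authors used.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper-tail p-value equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Assumed counts, back-calculated from the abstract:
# 58.1% of 31 properly preserved -> 18 attempted replants
# 23.3% of 60 improperly preserved -> 14 attempted replants
attempted_proper, n_proper = 18, 31
attempted_improper, n_improper = 14, 60

stat, p = chi_square_2x2(
    attempted_proper, n_proper - attempted_proper,
    attempted_improper, n_improper - attempted_improper,
)
print(f"chi2 = {stat:.2f}, p = {p:.4f}")
```

Under these assumed counts the statistic is about 10.8 with a p-value near 0.001, in line with the reported P = .001.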
Subjects
Traumatic Amputation/therapy, Emergency Medical Services/standards, Replantation/standards, Female, Hand Injuries/therapy, Humans, Male, Retrospective Studies
ABSTRACT
The functions of proteins and RNAs are defined by the collective interactions of many residues, and yet most statistical models of biological sequences consider sites nearly independently. Recent approaches have demonstrated benefits of including interactions to capture pairwise covariation, but leave higher-order dependencies out of reach. Here we show how it is possible to capture higher-order, context-dependent constraints in biological sequences via latent variable models with nonlinear dependencies. We found that DeepSequence ( https://github.com/debbiemarkslab/DeepSequence ), a probabilistic model for sequence families, predicted the effects of mutations across a variety of deep mutational scanning experiments substantially better than existing methods based on the same evolutionary data. The model, learned in an unsupervised manner solely on the basis of sequence information, is grounded with biologically motivated priors, reveals the latent organization of sequence families, and can be used to explore new parts of sequence space.
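As a rough guide to how a latent variable model of a sequence family can score mutations, the sketch below writes the usual variational lower bound on the sequence log-likelihood and a log-ratio score between a mutant and the wild type. The symbols (x for a sequence, z for the latent variable, q for the approximate posterior, \theta for decoder parameters) are assumptions introduced here; the abstract itself gives no equations.

```latex
\log p(x \mid \theta) \;\ge\;
\mathbb{E}_{q(z \mid x)}\big[\log p(x \mid z, \theta)\big]
\;-\; D_{\mathrm{KL}}\big(q(z \mid x)\,\|\,p(z)\big)
\;=\; \mathcal{L}(x)

\Delta(x^{\mathrm{mut}}) \;=\; \log \frac{p(x^{\mathrm{mut}} \mid \theta)}{p(x^{\mathrm{wt}} \mid \theta)}
\;\approx\; \mathcal{L}(x^{\mathrm{mut}}) - \mathcal{L}(x^{\mathrm{wt}})
```

A more negative \Delta would indicate a mutation that the model considers less compatible with the sequence family than the wild type.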
Subjects
Computational Biology/methods, Molecular Evolution, High-Throughput Nucleotide Sequencing/methods, Theoretical Models, Mutation, Algorithms, Humans
ABSTRACT
SUMMARY: Coevolutionary sequence analysis has become a commonly used technique for de novo prediction of the structure and function of proteins, RNA, and protein complexes. We present the EVcouplings framework, a fully integrated open-source application and Python package for coevolutionary analysis. The framework enables generation of sequence alignments, calculation and evaluation of evolutionary couplings (ECs), and de novo prediction of structure and mutation effects. The combination of an easy to use, flexible command line interface and an underlying modular Python package makes the full power of coevolutionary analyses available to entry-level and advanced users. AVAILABILITY AND IMPLEMENTATION: https://github.com/debbiemarkslab/evcouplings.
Subjects
Sequence Analysis, Software, Proteins, RNA, Sequence Alignment
ABSTRACT
Lack of quality data and difficulty generating these data hinder quantitative understanding of reaction kinetics. Specifically, conventional methods to generate transition state structures are deficient in speed, accuracy, or scope. We describe a novel method to generate three-dimensional transition state structures for isomerization reactions using reactant and product geometries. Our approach relies on a graph neural network to predict the transition state distance matrix and a least squares optimization to reconstruct the coordinates based on which entries of the distance matrix the model perceives to be important. We feed the structures generated by our algorithm through a rigorous quantum mechanics workflow to ensure the predicted transition state corresponds to the ground truth reactant and product. In both generating viable geometries and predicting accurate transition states, our method achieves excellent results. We envision workflows like this, which combine neural networks and quantum chemistry calculations, will become the preferred methods for computing chemical reactions.
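The coordinate-reconstruction step can be illustrated with a small weighted least squares fit. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `reconstruct`, the uniform weights, the toy four-point geometry, and the plain gradient-descent optimizer are all chosen here for illustration. In the described method the distance matrix and the weights would come from the graph neural network.

```python
import math

def reconstruct(dist, weights, start, lr=0.05, steps=3000):
    """Gradient descent on sum over i<j of w_ij * (|x_i - x_j| - d_ij)^2."""
    pts = [list(p) for p in start]
    n = len(pts)
    for _ in range(steps):
        grad = [[0.0, 0.0, 0.0] for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                diff = [pts[i][k] - pts[j][k] for k in range(3)]
                r = math.sqrt(sum(c * c for c in diff))
                coef = 2.0 * weights[i][j] * (r - dist[i][j]) / r
                for k in range(3):
                    grad[i][k] += coef * diff[k]
                    grad[j][k] -= coef * diff[k]
        for i in range(n):
            for k in range(3):
                pts[i][k] -= lr * grad[i][k]
    return pts

# Toy target geometry (hypothetical): four points forming a corner tetrahedron.
true_pts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
dist = [[math.dist(a, b) for b in true_pts] for a in true_pts]
weights = [[1.0] * 4 for _ in range(4)]  # uniform weights, assumed for the demo

# Start from a deterministically perturbed copy of the target coordinates.
start = [(x + 0.1 * (i % 2), y - 0.1 * (i % 3), z + 0.05 * i)
         for i, (x, y, z) in enumerate(true_pts)]

fitted = reconstruct(dist, weights, start)
max_err = max(abs(math.dist(fitted[i], fitted[j]) - dist[i][j])
              for i in range(4) for j in range(i + 1, 4))
print(f"largest distance error: {max_err:.2e}")
```

The fit recovers the geometry only up to rotation and translation, so the check compares pairwise distances and not raw coordinates. Down-weighting selected entries of `weights` would mimic a model that treats some distances as less important.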
ABSTRACT
Natural environments are filled with multiple, often competing, signals. In contrast, biological systems are often studied in "well-controlled" environments where only a single input is varied, potentially missing important interactions between signals. Catabolite repression of galactose by glucose is one of the best-studied eukaryotic signal integration systems. In this system, it is believed that galactose metabolic (GAL) genes are induced only when glucose levels drop below a threshold. In contrast, we show that GAL gene induction occurs at a constant external galactose:glucose ratio across a wide range of sugar concentrations. We systematically perturbed the components of the canonical galactose/glucose signaling pathways and found that these components do not account for ratio sensing. Instead we provide evidence that ratio sensing occurs upstream of the canonical signaling pathway and results from the competitive binding of the two sugars to hexose transporters. We show that a mutant that behaves as the classical model expects (i.e., cannot use galactose above a glucose threshold) has a fitness disadvantage compared with wild type. A number of common biological signaling motifs can give rise to ratio sensing, typically through negative interactions between opposing signaling molecules. We therefore suspect that this previously unidentified nutrient sensing paradigm may be common and overlooked in biology.
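One way to see how competitive binding at a transporter could yield ratio sensing is a simple equilibrium occupancy model. The symbols below (dissociation constants K_gal and K_glc, galactose-bound fraction f_gal) are assumptions made for illustration and are not taken from the abstract.

```latex
f_{\mathrm{gal}} \;=\;
\frac{[\mathrm{gal}]/K_{\mathrm{gal}}}
     {1 + [\mathrm{gal}]/K_{\mathrm{gal}} + [\mathrm{glc}]/K_{\mathrm{glc}}}

\text{If } [\mathrm{gal}]/K_{\mathrm{gal}} + [\mathrm{glc}]/K_{\mathrm{glc}} \gg 1:
\qquad
f_{\mathrm{gal}} \;\approx\;
\frac{[\mathrm{gal}]/[\mathrm{glc}]}
     {[\mathrm{gal}]/[\mathrm{glc}] + K_{\mathrm{gal}}/K_{\mathrm{glc}}}
```

In the saturating regime the occupancy depends only on the ratio [gal]/[glc], so scaling both sugars by the same factor leaves the signal unchanged.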
Subjects
Galactose/metabolism, Glucose/metabolism, Saccharomyces cerevisiae/genetics, Culture Media, Fungal Genes, Fluorescence Microscopy, Saccharomyces cerevisiae/metabolism, Signal Transduction
ABSTRACT
BACKGROUND: The ethical practice of medicine has always been of utmost importance, and plastic surgery is no exception. The literature is devoid of information on the teaching of ethics and professionalism in plastic surgery. In light of this, a survey was sent to ascertain the status of ethics training in plastic surgery residencies. METHODS: A 21-question survey was sent from the American Council of Academic Plastic Surgeons meeting to 180 plastic surgery program directors and coordinators via email. Survey questions inquired about practice environment, number of residents, presence of a formal ethics training program, among others. Binary regression was used to determine if any relationships existed between categorical variables, and Poisson linear regression was used to assess relationships between continuous variables. Statistical significance was set at a P value of 0.05. RESULTS: A total of 104 members responded to the survey (58% response rate). Sixty-three percent were program directors, and most (89%) practiced in academic settings. Sixty-two percent in academics reported having a formal training program, and 60% in private practice reported having one. Only 40% of programs with fewer than 10 residents had ethics training, whereas 78% of programs with more than 20 residents did. The odds of having a training program were slightly higher (odds ratio, 1.1) with more residents (P = 0.17). CONCLUSIONS: Despite the lack of information in the literature, formal ethics and professionalism training does exist in many plastic surgery residencies, although barriers to implementation do exist. Plastic surgery leadership should be involved in the development of standardized curricula to help overcome these barriers.
Subjects
Medical Ethics/education, Professionalism/education, Professionalism/ethics, Plastic Surgery/education, Plastic Surgery/ethics, Cross-Sectional Studies, Graduate Medical Education, Humans, Internship and Residency, Surveys and Questionnaires, United States
ABSTRACT
OBJECTIVE: The present status of global mission trips of all of the academic Plastic Surgery programs was surveyed. We aimed to provide information and guidelines for other interested programs on creating a global health elective in compliance with American Board of Plastic Surgery (ABPS) and Accreditation Council for Graduate Medical Education Residency Review Committee (ACGME/RRC) requirements. DESIGN: A free-response survey was sent to all of the Plastic Surgery Residency program directors inquiring about their present policy on international mission trips for residents and faculty. Questions included time spent in mission, cases performed, sponsoring organizations, and whether cases are being counted in their resident Plastic Surgery Operative Logs (PSOL). RESULTS: Thirty-one programs responded, with 23 programs presently sponsoring international mission trips. Thirteen programs support residents going on nonprogram-sponsored trips where the majority of these programs partner with outside organizations. Many programs do not count cases performed on mission trips as part of ACGME index case requirement. Application templates for international rotations to comply with ABPS and ACGME/RRC requirements were created to facilitate the participation of interested programs. CONCLUSIONS: Many Plastic Surgery Residency programs are sponsoring international mission trips for their residents; however, there is a lack of uniformity and administrative support in pursuing these humanitarian efforts. The creation of a dynamic centralized database will help interested programs and residents seek out the global health experience they desire and ensure standardization of the educational experience they obtain during these trips.
Subjects
Accreditation, Graduate Medical Education/methods, Internship and Residency/organization & administration, Medical Missions/organization & administration, Plastic Surgery/organization & administration, Humans
ABSTRACT
A major challenge in protein design is to augment existing functional proteins with multiple property enhancements. Altering several properties likely necessitates numerous primary sequence changes, and novel methods are needed to accurately predict combinations of mutations that maintain or enhance function. Models of sequence co-variation (e.g., EVcouplings), which leverage extensive information about various protein properties and activities from homologous protein sequences, have proven effective for many applications including structure determination and mutation effect prediction. We apply EVcouplings to computationally design variants of the model protein TEM-1 β-lactamase. Nearly all the 14 experimentally characterized designs were functional, including one with 84 mutations from the nearest natural homolog. The designs also had large increases in thermostability, increased activity on multiple substrates, and nearly identical structure to the wild type enzyme. This study highlights the efficacy of evolutionary models in guiding large sequence alterations to generate functional diversity for protein design applications.
Subjects
Molecular Evolution, Mutation, Protein Engineering, beta-Lactamases, beta-Lactamases/genetics, beta-Lactamases/metabolism, beta-Lactamases/chemistry, Protein Engineering/methods, Molecular Models, Amino Acid Sequence, Enzyme Stability, Protein Conformation
ABSTRACT
BACKGROUND: As health care costs in the United States continue to rise, there is increasing attention on cost-saving measures. One area of investigation is the utility of pathologic examination of specimens from routine procedures with a suspected benign pathology. We assessed the utility and cost of routine pathologic analysis for wrist ganglion cyst excision. METHODS: A retrospective cohort study of all wrist ganglion cyst excisions performed by seven hand surgeons was conducted from 2015 to 2019 at Penn State Hershey Medical Center. Preoperative and intraoperative diagnoses, pathologic diagnosis, and pathology cost were assessed. RESULTS: A total of 407 patients underwent ganglion cyst excision, with 318 (78.1%) specimens sent for pathologic review. Of the 318, 317 (99.6%) specimens were concordant with the preoperative or intraoperative diagnosis of ganglion cyst. One specimen (0.3%) resulted as a benign cystic vascular malformation. The charge per specimen was $258, totaling $81,786 spent confirming benign pathology that was clinically correctly diagnosed by the operating surgeon in 99.6% of cases. CONCLUSIONS: Routine pathologic analysis is not indicated in cases in which surgeons have a high clinical suspicion for ganglion cyst based on preoperative and intraoperative findings. Pathologic review should be reserved for cases with atypical presentations or intraoperative findings.
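The cost figure in the results can be reproduced directly from the reported counts. The short sketch below uses only numbers from the abstract; the variable names are illustrative.

```python
# Figures reported in the abstract.
total_excisions = 407         # patients who underwent ganglion cyst excision
specimens_sent = 318          # specimens sent for pathologic review
concordant_specimens = 317    # pathology agreed with the clinical diagnosis
charge_per_specimen = 258     # US dollars

share_sent = specimens_sent / total_excisions
cost_concordant = concordant_specimens * charge_per_specimen
cost_all_sent = specimens_sent * charge_per_specimen

print(f"Sent to pathology: {share_sent:.1%}")
print(f"Charges for concordant specimens: ${cost_concordant:,}")
print(f"Charges for all specimens sent: ${cost_all_sent:,}")
```

The reported $81,786 corresponds to the 317 concordant specimens (317 × $258), i.e., the charges for confirming diagnoses that were already clinically correct, not to all 318 specimens sent.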
ABSTRACT
Designing optimized proteins is important for a range of practical applications. Protein design is a rapidly developing field that would benefit from approaches that enable many changes in the amino acid primary sequence, rather than a small number of mutations, while maintaining structure and enhancing function. Homologous protein sequences contain extensive information about various protein properties and activities that have emerged over billions of years of evolution. Evolutionary models of sequence co-variation, derived from a set of homologous sequences, have proven effective in a range of applications including structure determination and mutation effect prediction. In this work we apply one of these models (EVcouplings) to computationally design highly divergent variants of the model protein TEM-1 β-lactamase, and characterize these designs experimentally using multiple biochemical and biophysical assays. Nearly all designed variants were functional, including one with 84 mutations from the nearest natural homolog. Surprisingly, all functional designs had large increases in thermostability and most had a broadening of available substrates. These property enhancements occurred while maintaining a nearly identical structure to the wild type enzyme. Collectively, this work demonstrates that evolutionary models of sequence co-variation (1) are able to capture complex epistatic interactions that successfully guide large sequence departures from natural contexts, and (2) can be applied to generate functional diversity useful for many applications in protein design.
ABSTRACT
Esophageal strictures may be caused by many etiologies. Patients suffer from dysphagia and many are tube-feed dependent. Cervical esophageal reconstruction is challenging for the plastic surgeon, and although there are reports utilizing chest wall flaps or even free flaps, the use of a sternocleidomastoid (SCM) myocutaneous flap provides an ideal reconstruction in select patients who require noncircumferential "patch" cervical esophagoplasty. We present two cases of esophageal reconstruction in which we demonstrate our technique for harvesting and insetting the SCM flap, with particular emphasis on design of the skin paddle and elucidation of the vascular anatomy. We believe that the SCM flap is simple, reliable, convenient, and technically easy to perform. There is minimal donor site morbidity with no functional loss. The SCM myocutaneous flap is a viable option for reconstructing partial esophageal defects and obviates the need to perform staged procedures or more extensive operations such as free tissue transfer.
Subjects
Esophageal Stenosis/surgery, Esophagus/surgery, Skeletal Muscle/transplantation, Plastic Surgery Procedures/methods, Surgical Flaps, Esophageal Stenosis/etiology, Female, Humans, Male, Middle Aged
ABSTRACT
Systematic perturbation of cells followed by comprehensive measurements of molecular and phenotypic responses provides informative data resources for constructing computational models of cell biology. Models that generalize well beyond training data can be used to identify combinatorial perturbations of potential therapeutic interest. Major challenges for machine learning on large biological datasets are to find global optima in a complex multidimensional space and mechanistically interpret the solutions. To address these challenges, we introduce a hybrid approach that combines explicit mathematical models of cell dynamics with a machine-learning framework, implemented in TensorFlow. We tested the modeling framework on a perturbation-response dataset of a melanoma cell line after drug treatments. The models can be efficiently trained to describe cellular behavior accurately. Even though completely data driven and independent of prior knowledge, the resulting de novo network models recapitulate some known interactions. The approach is readily applicable to various kinetic models of cell biology. A record of this paper's Transparent Peer Review process is included in the Supplemental Information.
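To make "explicit mathematical models of cell dynamics" concrete, one generic form for such a kinetic network model is an ordinary differential equation per molecular or phenotypic node. The equation below is an assumed illustration, not taken from the abstract: x_i is the activity of node i, w_{ij} are trainable interaction weights, u_i is the applied perturbation, \phi is a saturating nonlinearity such as tanh, and \epsilon_i and \alpha_i are response and decay rates.

```latex
\frac{dx_i(t)}{dt} \;=\; \epsilon_i\,\phi\!\Big(\sum_{j \neq i} w_{ij}\,x_j(t) + u_i\Big) \;-\; \alpha_i\,x_i(t)
```

In a hybrid setup of this kind, the parameters w_{ij}, \epsilon_i and \alpha_i would be fitted by gradient-based optimization so that the simulated steady states match the measured perturbation responses, and the fitted w_{ij} can then be read as a candidate interaction network.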
Subjects
Computational Biology/methods, Combination Drug Therapy/methods, Machine Learning/standards, Neoplasms/therapy, Humans
ABSTRACT
Antibody-based therapeutics and vaccines are essential to combat COVID-19 morbidity and mortality after severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection. Multiple mutations in SARS-CoV-2 that could impair antibody defenses propagated in human-to-human transmission and spillover or spillback events between humans and animals. To develop prevention and therapeutic strategies, we formed an international consortium to map the epitope landscape on the SARS-CoV-2 spike protein, defining and structurally illustrating seven receptor binding domain (RBD)-directed antibody communities with distinct footprints and competition profiles. Pseudovirion-based neutralization assays reveal spike mutations, individually and clustered together in variants, that affect antibody function among the communities. Key classes of RBD-targeted antibodies maintain neutralization activity against these emerging SARS-CoV-2 variants. These results provide a framework for selecting antibody treatment cocktails and understanding how viral variants might affect antibody therapeutic efficacy.
Subjects
Neutralizing Antibodies/immunology, Viral Antibodies/immunology, Epitope Mapping, Immunodominant Epitopes/immunology, SARS-CoV-2/immunology, Coronavirus Spike Glycoprotein/immunology, Neutralizing Antibodies/therapeutic use, Viral Antibodies/therapeutic use, Viral Antigens/chemistry, Viral Antigens/immunology, COVID-19/therapy, Humans, Immunodominant Epitopes/chemistry, Protein Binding, Protein Domains, Coronavirus Spike Glycoprotein/chemistry
ABSTRACT
Background: Indication for intervention in Dupuytren disease is influenced by many factors, including location and extent of disease, surgeon preference, and comfort level with different treatment techniques. The aim of this study was to determine current Dupuytren disease management trends. Methods: A questionnaire was sent through the American Society for Surgery of the Hand to all members. In addition to demographic data, questions focused on indications for different procedural interventions based on location of disease, age, and activity level of the patient. Results: Approximately 24% of respondents completed the survey. Respondents were mostly orthopedic surgeons in private practice who do not work with residents or fellows. Respondents preferred collagenase over needle aponeurotomy and limited fasciectomy for primary Dupuytren disease involving only the metacarpophalangeal (MCP) joint. Limited fasciectomy was the preferred treatment for primary Dupuytren disease involving the MCP and proximal interphalangeal joints. For a patient amenable to any treatment option, the majority would use collagenase, although 87.1% felt that fasciectomy offered the longest disease-free interval. Furthermore, given the option of a young, working patient, 42.7% would use collagenase, while plastic and general surgeons were more likely to treat this patient with limited fasciectomy. More plastic surgeons (vs orthopedic) believe that limited fasciectomy yields the longest disease-free interval. For a patient amenable to any surgical option, orthopedic surgeons prefer collagenase, whereas plastic hand surgeons prefer a limited fasciectomy. Conclusion: There are several procedural options for the treatment of Dupuytren disease. This study details current practice patterns among hand surgeons and reveals the increasingly prevalent use of collagenase.
Subjects
Dupuytren Contracture/therapy, Hand/surgery, Orthopedic Surgeons/statistics & numerical data, Physicians' Practice Patterns/trends, Adult, Collagenases/therapeutic use, Disease Management, Fasciotomy/trends, Female, Humans, Male, Metacarpophalangeal Joint/surgery, Middle Aged, Surveys and Questionnaires, Treatment Outcome
ABSTRACT
The annotation of the Escherichia coli K-12 genome in the EcoCyc database is one of the most accurate, complete and multidimensional genome annotations. Of the 4460 E. coli genes, EcoCyc assigns biochemical functions to 76%, and 66% of all genes had their functions determined experimentally. EcoCyc assigns E. coli genes to Gene Ontology and to MultiFun. Seventy-five percent of gene products contain reviews authored by the EcoCyc project that summarize the experimental literature about the gene product. EcoCyc information was derived from 15 000 publications. The database contains extensive descriptions of E. coli cellular networks, describing its metabolic, transport and transcriptional regulatory processes. A comparison to genome annotations for other model organisms shows that the E. coli genome contains the most experimentally determined gene functions in both relative and absolute terms: 2941 (66%) for E. coli, 2319 (37%) for Saccharomyces cerevisiae, 1816 (5%) for Arabidopsis thaliana, 1456 (4%) for Mus musculus and 614 (4%) for Drosophila melanogaster. Database queries to EcoCyc survey the global properties of E. coli cellular networks and illuminate the extent of information gaps for E. coli, such as dead-end metabolites. EcoCyc provides a genome browser with novel properties, and a novel interactive display of transcriptional regulatory networks.
Subjects
Genetic Databases, Escherichia coli K12/genetics, Bacterial Genome, Computational Biology, Escherichia coli K12/metabolism, Escherichia coli Proteins/genetics, Escherichia coli Proteins/physiology, Gene Regulatory Networks, Bacterial Genes, Software
ABSTRACT
MetaCyc is a database of metabolic pathways and enzymes located at http://MetaCyc.org/. Its goal is to serve as a metabolic encyclopedia, containing a collection of non-redundant pathways central to small molecule metabolism, which have been reported in the experimental literature. Most of the pathways in MetaCyc occur in microorganisms and plants, although animal pathways are also represented. MetaCyc contains metabolic pathways, enzymatic reactions, enzymes, chemical compounds, genes and review-level comments. Enzyme information includes substrate specificity, kinetic properties, activators, inhibitors, cofactor requirements and links to sequence and structure databases. Data are curated from the primary literature by curators with expertise in biochemistry and molecular biology. MetaCyc serves as a readily accessible comprehensive resource on microbial and plant pathways for genome analysis, basic research, education, metabolic engineering and systems biology. Querying, visualization and curation of the database is supported by SRI's Pathway Tools software. The PathoLogic component of Pathway Tools is used in conjunction with MetaCyc to predict the metabolic network of an organism from its annotated genome. SRI and the European Bioinformatics Institute employed this tool to create pathway/genome databases (PGDBs) for 165 organisms, available at the BioCyc.org website. These PGDBs also include predicted operons and pathway hole fillers.
Subjects
Factual Databases, Enzymes/chemistry, Metabolism, Animals, Bacteria/enzymology, Bacteria/metabolism, Environmental Pollutants/metabolism, Enzymes/analysis, Enzymes/genetics, Humans, Internet, Plants/enzymology, Plants/metabolism, Software, User-Computer Interface
ABSTRACT
The EcoCyc database (http://EcoCyc.org/) is a comprehensive source of information on the biology of the prototypical model organism Escherichia coli K12. The mission for EcoCyc is to contain both computable descriptions of, and detailed comments describing, all genes, proteins, pathways and molecular interactions in E. coli. Through ongoing manual curation, extensive information such as summary comments, regulatory information, literature citations and evidence types has been extracted from 8862 publications and added to Version 8.5 of the EcoCyc database. The EcoCyc database can be accessed through a World Wide Web interface, while the downloadable Pathway Tools software and data files enable computational exploration of the data and provide enhanced querying capabilities that web interfaces cannot support. For example, EcoCyc contains carefully curated information that can be used as training sets for bioinformatics prediction of entities such as promoters, operons, genetic networks, transcription factor binding sites, metabolic pathways, functionally related genes, protein complexes and protein-ligand interactions.