1.
Curr Top Med Chem ; 20(4): 305-317, 2020.
Article in English | MEDLINE | ID: mdl-31878856

ABSTRACT

BACKGROUND: Cheminformatics models are able to predict different outputs (activity, property, chemical reactivity) in single molecules or complex molecular systems (catalyzed organic synthesis, metabolic reactions, nanoparticles, etc.). OBJECTIVE: Cheminformatics prediction of complex catalytic enantioselective reactions is a major goal in organic synthesis research and the chemical industry. Markov Chain Molecular Descriptors (MCDs) have been widely used to solve Cheminformatics problems. There are different types of Markov chain descriptors, such as Markov-Shannon entropies (Shk), Markov Means (Mk), and Markov Moments (πk). However, there are other possible MCDs that have not been used before. In addition, MCDs are very often calculated with specific software that is not always available to general users, and there is no publicly available R library for calculating them. This limits the availability of MCD-based Cheminformatics procedures. METHODS: We studied the enantiomeric excess ee(%)[Rcat] for 324 α-amidoalkylation reactions. These reactions have a complex mechanism that depends on various factors. The model includes MCDs of the substrate, solvent, chiral catalyst, and product, along with values of reaction time, temperature, catalyst load, etc. We tested several Machine Learning regression algorithms. The Random Forest regression model has R² > 0.90 in training and test. Secondly, the biological activity of 5644 compounds against colorectal cancer was studied. RESULTS: We developed a model able to predict the cases of preclinical assays with specificity and sensitivity of 70-82% in both training and validation series. CONCLUSION: The work shows the potential of the new tool for computational studies in organic and medicinal chemistry.
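
The abstract names Markov-Shannon entropies (Shk) but does not define them. The following is a minimal sketch of one common construction, not the article's implementation: it assumes a row-stochastic matrix built from the molecular graph (each atom moves to itself or a bonded neighbour with probability proportional to a hypothetical atomic weight), a uniform starting distribution, and the natural logarithm. The function names and the weighting scheme are illustrative assumptions.

```python
import math


def markov_matrix(adjacency, weights):
    """Row-stochastic matrix: from atom i, move to atom j (bonded, or j == i)
    with probability proportional to the weight of atom j (assumed scheme)."""
    size = len(adjacency)
    matrix = []
    for i in range(size):
        row = [weights[j] if (adjacency[i][j] or i == j) else 0.0
               for j in range(size)]
        total = sum(row)
        matrix.append([value / total for value in row])
    return matrix


def step(distribution, matrix):
    """One Markov step: multiply the probability row vector by the matrix."""
    size = len(matrix)
    return [sum(distribution[i] * matrix[i][j] for i in range(size))
            for j in range(size)]


def markov_shannon_entropies(adjacency, weights, max_k):
    """Shannon entropy of the atom distribution after k = 0..max_k steps."""
    size = len(adjacency)
    matrix = markov_matrix(adjacency, weights)
    distribution = [1.0 / size] * size  # assumption: uniform start
    entropies = []
    for _ in range(max_k + 1):
        entropies.append(-sum(p * math.log(p) for p in distribution if p > 0))
        distribution = step(distribution, matrix)
    return entropies


# Hypothetical three-atom chain with equal atomic weights.
chain = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
print(markov_shannon_entropies(chain, [1.0, 1.0, 1.0], 3))
```

Under these assumptions, each molecule (substrate, solvent, catalyst, product) would contribute one short vector of entropies that can be used as regression inputs.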


Subjects
Cheminformatics , Pharmaceutical Chemistry , Markov Chains , Algorithms , Humans , Machine Learning
3.
Curr Top Med Chem ; 13(14): 1681-91, 2013.
Article in English | MEDLINE | ID: mdl-23889046

ABSTRACT

The transport of molecules inside cells is a very important topic, especially in drug metabolism. Experimental testing of new proteins for transporter molecular function is expensive and inefficient because of the large number of new peptides. Therefore, there is a need for cheap and fast theoretical models to predict transporter proteins. In the current work, the primary structure of a protein is represented as a molecular Star graph, characterized by a series of topological indices. The dataset was made up of 2,503 protein chains, of which 413 have transporter molecular function and 2,090 have no transporter function. These indices were used as input to several classification techniques to find the best Quantitative Structure-Activity Relationship (QSAR) model for evaluating the transporter function of a new protein chain. Among several feature selection techniques, Support Vector Machine Recursive Feature Elimination yields a classification model based on 20 attributes with a true positive rate of 83% and a false positive rate of 16.7%.
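
The two rates quoted above follow directly from the confusion matrix of a binary classifier. As a minimal sketch (the function name and the toy labels are illustrative and are not data from the study), they can be computed as follows, with 1 standing for "transporter" and 0 for "non-transporter":

```python
def true_and_false_positive_rates(actual, predicted):
    """Return (TPR, FPR) for binary labels, where 1 marks the positive class."""
    pairs = list(zip(actual, predicted))
    true_pos = sum(1 for a, p in pairs if a == 1 and p == 1)
    false_neg = sum(1 for a, p in pairs if a == 1 and p == 0)
    false_pos = sum(1 for a, p in pairs if a == 0 and p == 1)
    true_neg = sum(1 for a, p in pairs if a == 0 and p == 0)
    # TPR: share of real positives found; FPR: share of real negatives flagged.
    return (true_pos / (true_pos + false_neg),
            false_pos / (false_pos + true_neg))


# Toy labels only, chosen to show an imbalanced set.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
tpr, fpr = true_and_false_positive_rates(actual, predicted)
print(f"TPR = {tpr:.3f}, FPR = {fpr:.3f}")
```

Reporting both rates matters for an imbalanced dataset such as this one, where overall accuracy alone can hide poor recognition of the minority class.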


Subjects
Carrier Proteins/chemistry , Support Vector Machine , Animals , Carrier Proteins/metabolism , Humans , Protein Conformation , Quantitative Structure-Activity Relationship
4.
Front Biosci (Elite Ed) ; 5(2): 446-60, 2013 Jan 01.
Article in English | MEDLINE | ID: mdl-23277001

ABSTRACT

It can usually take more than ten years from the time a new drug is discovered until it can be launched on the market. Regulatory requirements are part of the drug discovery and development process and act at every developmental stage. Regulatory affairs works to establish an effective and uniform balance between voluntary and regulatory compliance and agency responsiveness to consumer needs. It evaluates and coordinates all proposed legal actions to ascertain compliance with regulatory policy. The ontology presented for regulatory affairs and drug research and development makes it possible to correlate information from different levels and to discover new relationships between the legal aspects. In addition, the transparency of the information is affected by the inability of existing integration strategies to organize and apply the available knowledge to the range of real scientific and business issues in critical safety and regulatory applications. Therefore, semantic technologies based on ontologies make the knowledge reusable by several applications across the business, from discovery to corporate affairs.


Subjects
Synthetic Chemistry Techniques/methods , Factual Databases , Drug Discovery/legislation & jurisprudence , Drug Discovery/methods , Preclinical Drug Evaluation/methods , Government Regulation , Information Dissemination/methods , Information Dissemination/legislation & jurisprudence , Internet , Software
5.
J Theor Biol ; 317: 331-7, 2013 Jan 21.
Article in English | MEDLINE | ID: mdl-23116665

ABSTRACT

Aging and quality of life are an important research topic nowadays in areas such as the life sciences, chemistry, and pharmacology. People live longer and, thus, want to spend that extra time with a better quality of life. In this regard, there exists a tiny subset of molecules in nature, named antioxidant proteins, that may influence the aging process. However, testing every single protein in order to identify its properties is quite expensive and inefficient. For this reason, this work proposes a model in which the primary structure of the protein is represented using complex network graphs, which can be used to reduce the number of proteins to be tested for antioxidant biological activity. The graph obtained as a representation helps describe the complex system by means of topological indices. More specifically, in this work, Randic's Star Networks have been used as well as the associated indices, calculated with the S2SNet tool. In order to simulate the existing proportion of antioxidant proteins in nature, a dataset containing 1999 proteins, of which 324 are antioxidant proteins, was created. Using this data as input, Star Graph Topological Indices were calculated with the S2SNet tool. These indices were then used as input to several classification techniques. Among the techniques utilised, the Random Forest has shown the best performance, achieving a score of 94% correctly classified instances. Although the target class (antioxidant proteins) represents a tiny subset of the dataset, the proposed model is able to achieve 81.8% correctly classified instances for this class, with a precision of 81.3%.
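
To make the star-network idea concrete, here is a minimal sketch that does not reproduce S2SNet. It assumes the simplest (non-embedded) form of the graph: a dummy centre node, one branch per distinct residue type, and each residue appended to the end of its branch in sequence order. It then computes one classical topological index, the Wiener index (the sum of shortest-path distances over all node pairs). The function names are hypothetical.

```python
from collections import deque


def star_graph(sequence):
    """Adjacency lists of a non-embedded star graph; node 0 is the dummy centre
    and node i (i >= 1) is the i-th residue of the sequence."""
    edges = {0: []}
    branch_tip = {}  # residue type -> last node placed on that branch
    for index, residue in enumerate(sequence, start=1):
        parent = branch_tip.get(residue, 0)
        edges[index] = [parent]
        edges[parent].append(index)
        branch_tip[residue] = index
    return edges


def wiener_index(edges):
    """Sum of shortest-path distances over all unordered node pairs (BFS)."""
    total = 0
    for source in edges:
        distance = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in edges[node]:
                if neighbour not in distance:
                    distance[neighbour] = distance[node] + 1
                    queue.append(neighbour)
        total += sum(distance.values())
    return total // 2  # every pair was counted from both ends


print(wiener_index(star_graph("AAB")))  # branches A (two nodes) and B (one node)
```

In this sketch the branch lengths encode residue composition only; encoding the order of residues as well would need extra edges between consecutive residues, which is omitted here.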


Subjects
Algorithms , Antioxidants/metabolism , Proteins/metabolism , Amino Acid Sequence , Protein Databases , Molecular Sequence Data , Proteins/chemistry , Quantitative Structure-Activity Relationship , ROC Curve
6.
J Neurosci Methods ; 209(2): 410-9, 2012 Aug 15.
Article in English | MEDLINE | ID: mdl-22814089

ABSTRACT

The recognition of seizures is very important for the diagnosis of patients with epilepsy. A seizure is a process of rhythmic discharge in the brain that occurs rarely and unpredictably. This behavior creates a need for automatic detection of seizures using the signals of long-term electroencephalographic (EEG) recordings. Because EEG signals are non-stationary, conventional methods of frequency analysis are not the best alternative for obtaining good results for diagnostic purposes. The present work proposes, for the first time, a method of EEG signal analysis based on star graph topological indices (SGTIs). The signal information, such as amplitude and time of occurrence, is codified into invariant SGTIs, which are the basis for classification models that can discriminate epileptic EEG records from non-epileptic ones. The method with SGTIs and the simplest linear discriminant methods provides results similar to those previously published, which are based on time-frequency analysis and artificial neural networks. Thus, this work proposes a simpler and faster alternative for automatic detection of seizures from EEG recordings.


Subjects
Brain Mapping , Brain Waves/physiology , Electroencephalography/methods , Seizures/diagnosis , Computer-Assisted Signal Processing , Fourier Analysis , Humans , Support Vector Machine
7.
J Theor Biol ; 257(2): 303-11, 2009 Mar 21.
Article in English | MEDLINE | ID: mdl-19111559

ABSTRACT

Cancer diagnosis is a complex process and, sometimes, the specific markers can interfere or produce negative results. Thus, new simple and fast theoretical models are required. One option is complex network graph theory, which permits us to describe any real system, from small molecules to complex genetic, neural or social networks, by transforming real properties into topological indices. This work converts protein primary structure data into specific Randic's star network topological indices using the new sequence to star networks (S2SNet) application. A set of 1054 proteins was selected from previous works; it contains proteins related or not related to two types of cancer, human breast cancer (HBC) and human colon cancer (HCC). The general discriminant analysis method generates an input-coded multi-target classification model with training/predicting set accuracies of 90.0% for the forward stepwise model type. In addition, a protein subset was modified by single amino acid mutations with higher log-odds PAM250 values and tested with the new classification model to determine whether the mutants can be related to HBC or HCC. In conclusion, we have shown that, using simple input data such as the primary protein sequence and the simplest linear analysis, it is possible to obtain accurate classification models that can predict whether a new protein is related to the two types of cancer. These results promote the use of S2SNet in clinical proteomics.


Subjects
Breast Neoplasms/classification , Colonic Neoplasms/classification , Computer Simulation , Statistical Models , Neoplasm Proteins/classification , Amino Acid Sequence , Breast Neoplasms/metabolism , Colonic Neoplasms/metabolism , Discriminant Analysis , Female , Humans , Male , Molecular Sequence Data , Mutation , Neoplasm Proteins/genetics , Automated Pattern Recognition
8.
Bioorg Med Chem ; 17(1): 165-75, 2009 Jan 01.
Article in English | MEDLINE | ID: mdl-19026553

ABSTRACT

Efficient drugs such as statins or mevinic acids are inhibitors of the rate-limiting enzyme of cholesterol biosynthesis, 3-hydroxy-3-methyl-glutaryl coenzyme A reductase (HMGR), an enzyme responsible for the double reduction of 3-hydroxy-3-methyl-glutaryl coenzyme A into mevalonic acid. These compounds promoted the synthesis and evaluation of new inhibitors of HMGR, named HMGRIs. The high number of possible candidates creates the need for Quantitative Structure-Activity Relationship (QSAR) models to guide HMGRI synthesis. The reported QSAR models have two main problems: the homogeneous series of the compounds and the chirality of many candidates. In this work, we propose for the first time a QSAR model for a very large and heterogeneous series of HMGRIs. The model is based on the Topological Indices (TIs) of molecular structures. Using the predictions of this model as input, we construct the first complex network that describes the drug-drug similarity relationships for more than 1600 experimentally non-explored chiral HMGRI isomers. We also present a reduced version of this network (Giant Component) that contains the most representative set of chiral HMGRI candidates. The work suggests a new mixed application in the QSAR study of relevant aspects of structural diversity by using chiral/non-chiral TIs combined with complex networks.


Subjects
Hydroxymethylglutaryl-CoA Reductase Inhibitors/chemistry , Hydroxymethylglutaryl-CoA Reductase Inhibitors/pharmacology , Quantitative Structure-Activity Relationship , Humans , Hydroxymethylglutaryl CoA Reductases , Molecular Models , Stereoisomerism
9.
J Theor Biol ; 256(3): 458-66, 2009 Feb 07.
Article in English | MEDLINE | ID: mdl-18992259

ABSTRACT

The importance of promoter sequences in the regulation of function in several important mycobacterial pathogens creates the need to design simple and fast theoretical models that can predict them. This work proposes two DNA promoter QSAR models based on pseudo-folding lattice network (LN) and star-graph (SG) topological indices. In addition, a comparative study with the previous RNA electrostatic parameters of thermodynamically-driven secondary structure folding representations has been carried out. The best model of this work was obtained with only two LN stochastic electrostatic potentials and is characterized by accuracy, selectivity and specificity of 90.87%, 82.96% and 92.95%, respectively. In addition, we point out the dependence of the SG results on the DNA sequence codification and propose a QSAR model based on codons and only three SG spectral moments.
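
The dependence on sequence codification can be illustrated with a small sketch. The abstract does not describe the encoding in detail, so the following only assumes that "codification" means choosing the unit that labels each star-graph branch: single nucleotides (at most 4 branches) or codons (at most 64 branches). The function name and the example sequence are hypothetical.

```python
def codify(dna, unit_length):
    """Split a DNA string into consecutive units of the given length.
    Trailing bases that do not fill a complete unit are dropped (assumption)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % unit_length
    return [dna[i:i + unit_length] for i in range(0, usable, unit_length)]


sequence = "atggcataa"  # illustrative sequence, not from the study
nucleotides = codify(sequence, 1)
codons = codify(sequence, 3)
print(len(set(nucleotides)), "branch types with nucleotide codification")
print(len(set(codons)), "branch types with codon codification:", codons)
```

Because the number and length of branches change with the unit chosen, any index computed on the resulting graph changes too, which is consistent with the dependence the authors report.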


Subjects
Bacterial DNA/genetics , Mycobacterium/genetics , Genetic Promoter Regions/genetics , Codon , Markov Chains , Biological Models , Protein Secondary Structure , Quantitative Structure-Activity Relationship
10.
Bioorg Med Chem ; 16(22): 9684-93, 2008 Nov 15.
Article in English | MEDLINE | ID: mdl-18951807

ABSTRACT

Numerical parameters of molecular networks, also referred to as Topological Indices or Connectivity Indices (CIs), have been used in Bioorganic and Medicinal Chemistry to find Quantitative Structure-Activity, Property or Toxicity Relationship (QSAR, QSPR and QSTR) models. QSPR models generally use CIs as inputs to predict the biological activity of compounds. However, the literature shows little effort to find QSAR-like models for other biologically and chemically relevant systems. For instance, the blood proteome constitutes a protein-rich information reservoir, since the serum proteome Mass Spectra (MS) represent a potential information source for the early detection of biomarkers for diseases and/or drug-induced toxicities. The concept of a mass spectrum network (MS network) for a single protein is already well known. However, there are no reported results on the use of CIs for an MS network of a whole proteome to explore MS patterns. In this work, we introduce for the first time a novel network representation and the CIs for the MS of blood proteome samples. The new network is based on Randic's Spiral network, previously introduced for protein sequences. The new MS CIs, called here Spiral Markov Connectivity (SMC(k)) of the MS Spiral graph, can be calculated with the software MARCH-INSIDE, combining network and Markov model theory. The SMC(k) values could be used to seek QSAR-like models, called in this work Quantitative Proteome-Property Relationships (QPPRs). We calculate the SMC(k) values for 62 blood samples and fit a QPPR model that discriminates the proteome MS typical of individuals susceptible to drug-induced cardiotoxicity from control samples. The accuracy, sensitivity, and specificity values of the QPPR model were between 73.08% and 87.5% in training and validation series. This work points to QPPR models as a powerful tool for MS detection of biomarkers in proteomics.


Subjects
Mass Spectrometry/methods , Proteome/analysis , Toxicity Tests , Algorithms , Biomarkers/blood , Computer Simulation , Markov Chains , Biological Models , Proteomics/methods , Quantitative Structure-Activity Relationship , Software
11.
J Theor Biol ; 254(4): 775-83, 2008 Oct 21.
Article in English | MEDLINE | ID: mdl-18692072

ABSTRACT

The development of complex network graphs permits us to describe any real system, such as social, neural, computer or genetic networks, by transforming real properties into topological indices (TIs). This work uses Randic's star networks to convert protein primary structure data into specific topological indices that are used to construct a natural/random protein classification model. The set of natural proteins contains 1046 protein chains selected from the pre-compiled CulledPDB list from PISCES Dunbrack's Web Lab. This set is characterized by a protein homology of 20%, a structure resolution of 1.6 Å and an R-factor lower than 25%. The set of random amino acid chains contains 1046 sequences, which were generated by a Python script according to the same type of residues and average chain length found in the natural set. A new Sequence to Star Networks (S2SNet) wxPython GUI application (with a Graphviz graphics back-end) was designed by our group to transform any character sequence into the following star network topological indices: Shannon entropy of Markov matrices, trace of connectivity matrices, Harary number, Wiener index, Gutman index, Schultz index, Moreau-Broto indices, Balaban distance connectivity index, Kier-Hall connectivity indices and Randic connectivity index. The model was constructed with the General Discriminant Analysis methods from the STATISTICA package and gave training/predicting set accuracies of 90.77% for the forward stepwise model type. In conclusion, this study extends for the first time the classical TIs to protein star network TIs by proposing a model that can predict whether a protein or protein fragment is natural or random using only the amino acid sequence data. This classification can be used in studies of protein function by replacing some fragments with random amino acid sequences, or to detect fake amino acid sequences or errors in proteins. These results promote the use of the S2SNet application not only for protein structure analysis but also for mass spectroscopy, clinical proteomics and imaging, or DNA/RNA structure analysis.
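
The random control set is described only briefly, so the following is a sketch under stated assumptions, not the authors' script. It assumes that "same type of residues" means drawing uniformly from the residue letters that occur in the natural set, and that every random chain has the rounded average length of the natural chains. The function name and the example chains are hypothetical.

```python
import random


def random_chains(natural_chains, count, seed=0):
    """Generate `count` random chains using the residue types and the
    (rounded) average chain length observed in `natural_chains`."""
    rng = random.Random(seed)  # fixed seed so the set is reproducible
    residue_types = sorted(set("".join(natural_chains)))
    average_length = round(
        sum(len(chain) for chain in natural_chains) / len(natural_chains)
    )
    return [
        "".join(rng.choice(residue_types) for _ in range(average_length))
        for _ in range(count)
    ]


natural = ["MKV", "MKVLA"]  # illustrative chains only
for chain in random_chains(natural, 3):
    print(chain)
```

Drawing residues according to their observed frequencies, rather than uniformly, would be an equally plausible reading of the abstract and would make the classification task harder.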


Subjects
Statistical Models , Computer Neural Networks , Protein Conformation , Proteins/classification , Amino Acid Sequence , Animals , Protein Databases , Biological Models , Molecular Sequence Data , Proteins/chemistry
12.
J Theor Biol ; 254(2): 476-82, 2008 Sep 21.
Article in English | MEDLINE | ID: mdl-18606172

ABSTRACT

The huge number of new proteins that need fast characterization of enzymatic activity creates a demand for theoretical protein QSAR models. The protein parameters that can be used for an enzyme/non-enzyme classification include the simpler indices such as composition, sequence and connectivity, also called topological indices (TIs), and the computationally expensive 3D descriptors. A comparison of 3D versus lower-dimension indices has not been reported with respect to their power to discriminate proteins according to enzyme action. A set of 966 proteins (enzymes and non-enzymes) whose structural characteristics are provided by PDB/DSSP files was analyzed with Python/Biopython scripts, STATISTICA and Weka. The list of indices includes, but is not restricted to, pure composition indices (residue fractions), DSSP secondary structure protein composition and 3D indices (surface and access). We also used mixed indices such as composition-sequence indices (Chou's pseudo-amino acid compositions or coupling numbers), 3D-composition (surface fractions) and DSSP secondary structure amino acid composition/propensities (obtained with our Prot-2S Web tool). In addition, we extend and test for the first time several classic TIs for Randic's protein sequence Star graphs using our Sequence to Star Graph (S2SG) Python application. All the indices were processed with general discriminant analysis (GDA) models, neural networks (NN) and machine learning (ML) methods, and the results are presented versus complexity, average of Shannon's information entropy (Sh) and data/method type. This study compares for the first time all these classes of indices to assess the ratios between model accuracy and indices/model complexity in enzyme/non-enzyme discrimination. The use of different methods and data of different complexity shows that one cannot establish a direct relation between the complexity and the accuracy of the model.
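
The simplest class of indices mentioned above, pure composition indices, can be sketched in a few lines. The example below computes residue fractions for a sequence and, as an illustrative summary of that composition, its Shannon entropy in bits. This entropy is an assumption added for illustration and is not necessarily the quantity Sh reported in the article; the function names are hypothetical.

```python
import math
from collections import Counter


def residue_fractions(sequence):
    """Fraction of each residue type in the sequence (pure composition index)."""
    counts = Counter(sequence)
    return {residue: count / len(sequence) for residue, count in counts.items()}


def shannon_entropy(fractions):
    """Shannon entropy, in bits, of a discrete distribution given as a dict."""
    return -sum(p * math.log2(p) for p in fractions.values() if p > 0)


fractions = residue_fractions("AABB")  # illustrative sequence only
print(fractions, shannon_entropy(fractions))
```

Indices of this kind need only the sequence, which is why they are far cheaper to compute than descriptors derived from PDB/DSSP structure files.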


Subjects
Computer Simulation , Molecular Models , Proteins/classification , Algorithms , Amino Acid Sequence , Animals , Base Sequence , Enzymes/classification , Molecular Sequence Data , Automated Pattern Recognition , Protein Conformation , Quantitative Structure-Activity Relationship , Protein Sequence Analysis
13.
J Chem Phys ; 123(1): 014309, 2005 Jul 01.
Article in English | MEDLINE | ID: mdl-16035837

ABSTRACT

Accurate ground-state intermolecular potential-energy surfaces are obtained for the HCCH-He, Ne, and Ar van der Waals complexes. The interaction energies are calculated at the coupled cluster singles and doubles including connected triple excitations level and fitted to analytic functions. For the three complexes we start with systematic basis set studies carried out at several intermolecular geometries, using augmented correlation consistent polarized valence basis sets x-aug-cc-pVXZ (x=-,d; X=D,T,Q,5), also extended with a set of 3s3p2d1f1g midbond functions. The aug-cc-pVQZ-33211 surfaces of the HCCH-He, Ne, and Ar complexes are characterized by absolute minima of -24.22, -50.20, and -122.17 cm⁻¹ at distances R between the rare-gas atom and the HCCH center of mass of 4.35, 3.95, and 3.99 Å, respectively, and at angles between the vector R and the HCCH main symmetry axis of 0°, 43.3°, and 60.6°. The results are compared with, and considerably improve on, those previously available.

14.
J Chem Phys ; 121(3): 1390-6, 2004 Jul 15.
Article in English | MEDLINE | ID: mdl-15260683

ABSTRACT

Using the coupled cluster singles and doubles including connected triple excitations model with the augmented correlation consistent polarized valence double zeta basis set extended with a set of 3s3p2d1f1g midbond functions, we evaluate the ground state intermolecular potential energy surface of the chlorobenzene-argon van der Waals complex. The minima of 420 cm⁻¹ are characterized by Ar atom position vectors of length 3.583 Å, forming an angle of 9.87° with respect to the axis perpendicular to the chlorobenzene plane. These results are compared to those obtained for similar complexes and to the available experimental data. From the potential, the three-dimensional vibrational eigenfunctions and eigenvalues are calculated, and the results allow the experimental assignment to be corrected and completed.
