Results 1 - 18 of 18
1.
Article in English | MEDLINE | ID: mdl-38236684

ABSTRACT

Interactive visualization can support fluid exploration but is often limited to predetermined tasks. Scripting can support a vast range of queries but may be more cumbersome for free-form exploration. Embedding interactive visualization in scripting environments, such as computational notebooks, provides an opportunity to leverage the strengths of both direct manipulation and scripting. We investigate interactive visualization design methodology, choices, and strategies under this paradigm through a design study of calling context trees used in performance analysis, a field which exemplifies typical exploratory data analysis workflows with Big Data and hard-to-define problems. We first produce a formal task analysis assigning tasks to graphical or scripting contexts based on their specificity, frequency, and suitability. We then design a notebook-embedded interactive visualization and validate it with intended users. In a follow-up study, we present participants with multiple graphical and scripting interaction modes to elicit feedback about notebook-embedded visualization design, finding consensus in support of the interaction model. We report and reflect on observations regarding the process and design implications for combining visualization and scripting in notebooks.

2.
mSystems ; 7(3): e0031222, 2022 06 28.
Article in English | MEDLINE | ID: mdl-35543104

ABSTRACT

Microbial symbiosis drives physiological processes of higher-order systems, including the acquisition and consumption of nutrients that support symbiotic partner reproduction. Metabolic analytics provide new avenues to examine how chemical ecology, or the conversion of existing biomass to new forms, changes over a symbiotic life cycle. We applied these approaches to the nematode Steinernema carpocapsae, its mutualist bacterium, Xenorhabdus nematophila, and the insects they infect. The nematode-bacterium pair infects, kills, and reproduces in an insect until nutrients are depleted. To understand the conversion of insect biomass over time into either nematode or bacterium biomass, we integrated information from trophic, metabolomic, and gene regulation analyses. Trophic analysis established bacteria as meso-predators and primary insect consumers. Nematodes hold a trophic position of 4.6, indicative of an apex predator, consuming bacteria and likely other nematodes. Metabolic changes associated with Galleria mellonella insect bioconversion were assessed using multivariate statistical analyses of metabolomics data sets derived from sampling over an infection time course. Statistically significant, discrete phases were detected, indicating the insect chemical environment changes reproducibly during bioconversion. A novel hierarchical clustering method was designed to probe molecular abundance fluctuation patterns over time, revealing distinct metabolite clusters that exhibit similar abundance shifts across the time course. Composite data suggest bacterial tryptophan and nematode kynurenine pathways are coordinated for reciprocal exchange of tryptophan and NAD+ and for synthesis of intermediates that can have complex effects on bacterial phenotypes and nematode behaviors. Our analysis of pathways and metabolites reveals the chemistry underlying the recycling of organic material during carnivory. 
IMPORTANCE The processes by which organic life is consumed and reborn in a complex ecosystem were investigated through a multiomics approach applied to the tripartite Xenorhabdus bacterium-Steinernema nematode-Galleria insect symbiosis. Trophic analyses demonstrate the primary consumers of the insect are the bacteria, and the nematode in turn consumes the bacteria. This suggests the Steinernema-Xenorhabdus mutualism is a form of agriculture in which the nematode cultivates the bacterial food sources by inoculating them into insect hosts. Metabolomics analysis revealed a shift in biological material throughout progression of the life cycle: active infection, insect death, and conversion of cadaver tissues into bacterial biomass and nematode tissue. We show that each phase of the life cycle is metabolically distinct, with significant differences including those in the tricarboxylic acid cycle and amino acid pathways. Our findings demonstrate that symbiotic life cycles can be defined by reproducible stage-specific chemical signatures, enhancing our broad understanding of metabolic processes that underpin a three-way symbiosis.


Subjects
Moths; Rhabditida; Xenorhabdus; Animals; Ecosystem; Tryptophan; Insects; Xenorhabdus/genetics; Rhabditida/microbiology
3.
Data Brief ; 40: 107780, 2022 Feb.
Article in English | MEDLINE | ID: mdl-35036484

ABSTRACT

Neural Networks (NNs) are increasingly used across scientific domains to extract knowledge from experimental or computational data. An NN is composed of natural or artificial neurons that serve as simple processing units and are interconnected into a model architecture; it acquires knowledge from the environment through a learning process and stores this knowledge in its connections. The learning process is conducted by training. During NN training, the learning process can be tracked by periodically validating the NN and calculating its fitness. The resulting sequence of fitness values (i.e., validation accuracy or validation loss) is called the NN learning curve. The development of tools for NN design requires knowledge of diverse NNs and their complete learning curves. Generally, only final fully-trained fitness values for highly accurate NNs are made available to the community, hampering efforts to develop tools for NN design and leaving unaddressed aspects such as explaining the generation of an NN and reproducing its learning process. Our dataset fills this gap by fully recording the structure, metadata, and complete learning curves for a wide variety of random NNs throughout their training. Our dataset captures the lifespan of 6000 NNs throughout generation, training, and validation stages. It consists of a suite of 6000 tables, each table representing the lifespan of one NN. We generate each NN with randomized parameter values and train it for 40 epochs on one of three diverse image datasets (i.e., CIFAR-100, FashionMNIST, SVHN). We calculate and record each NN's fitness with high frequency (every half epoch) to capture the evolution of the training and validation process. As a result, for each NN, we record the generated parameter values describing the structure of that NN, the image dataset on which the NN trained, and all loss and accuracy values for the NN every half epoch. 
We put our dataset to the service of researchers studying NN performance and its evolution throughout training and validation. Statistical methods can be applied to our dataset to analyze the shape of learning curves in diverse NNs, and the relationship between an NN's structure and its fitness. Additionally, the structural data and metadata that we record enable the reconstruction and reproducibility of the associated NN.
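
As a rough illustration of the learning-curve idea described in this abstract, the sketch below pairs a sequence of fitness values with the epoch at which each was recorded, assuming one measurement every half epoch. The function names, the assumption that the first measurement falls at epoch 0.5, and the sample accuracies are all hypothetical; the dataset's actual table layout is not described here.

```python
from typing import List, Tuple


def learning_curve(fitness: List[float], step: float = 0.5) -> List[Tuple[float, float]]:
    """Pair each recorded fitness value with the epoch at which it was measured.

    Assumes (hypothetically) that the first measurement is taken at `step`
    epochs and that later measurements follow at regular `step` intervals.
    """
    return [((i + 1) * step, value) for i, value in enumerate(fitness)]


def best_point(curve: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Return the (epoch, fitness) pair with the highest fitness value."""
    return max(curve, key=lambda point: point[1])


# Made-up validation accuracies covering the first two epochs.
accuracies = [0.10, 0.18, 0.25, 0.24]
curve = learning_curve(accuracies)
peak_epoch, peak_accuracy = best_point(curve)
```

Under these assumptions, 40 epochs sampled every half epoch would give a curve of 80 points per NN.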

4.
IEEE/ACM Trans Comput Biol Bioinform ; 18(4): 1336-1349, 2021.
Article in English | MEDLINE | ID: mdl-31603792

ABSTRACT

In order to successfully predict a protein's function throughout its trajectory, in addition to uncovering changes in its conformational state, it is necessary to employ techniques that maintain its 3D information while performing at scale. We extend a protein representation that encodes secondary and tertiary structure into fixed-size color images, and a neural network architecture (called GEM-net) that leverages our encoded representation. We show the applicability of our method in two ways: (1) performing protein function prediction, achieving accuracy between 78 and 83 percent, and (2) visualizing and detecting conformational changes in protein trajectories during molecular dynamics simulations.


Subjects
Computational Biology/methods; Computer Graphics; Image Processing, Computer-Assisted/methods; Protein Conformation; Proteins/chemistry; Molecular Dynamics Simulation; Neural Networks, Computer
5.
Philos Trans A Math Phys Eng Sci ; 378(2166): 20190063, 2020 Mar 06.
Article in English | MEDLINE | ID: mdl-31955686

ABSTRACT

This paper presents a survey of three algorithms to transform atomic-level molecular snapshots from molecular dynamics (MD) simulations into metadata representations that are suitable for in situ analytics based on machine learning methods. MD simulations studying the classical time evolution of a molecular system at atomic resolution are widely recognized in the fields of chemistry, material sciences, molecular biology and drug design; these simulations are one of the most common simulations on supercomputers. Next-generation supercomputers will have a dramatically higher performance than current systems, generating more data that needs to be analysed (e.g. in terms of number and length of MD trajectories). In the future, the coordination of data generation and analysis can no longer rely on manual, centralized analysis traditionally performed after the simulation is completed or on current data representations that have been defined for traditional visualization tools. Powerful data preparation phases (i.e. phases in which original raw data is transformed to concise and still meaningful representations) will need to precede data analysis phases. Here, we discuss three algorithms for transforming traditionally used molecular representations into concise and meaningful metadata representations. The transformations can be performed locally. The new metadata can be fed into machine learning methods for runtime in situ analysis of larger MD trajectories supported by high-performance computing. In this paper, we provide an overview of the three algorithms and their use for three different applications: protein-ligand docking in drug design; protein folding simulations; and protein engineering based on analytics of protein functions depending on proteins' three-dimensional structures. This article is part of a discussion meeting issue 'Numerical algorithms for high-performance computational science'.

6.
J Comput Chem ; 38(16): 1419-1430, 2017 06 15.
Article in English | MEDLINE | ID: mdl-28093787

ABSTRACT

The transition toward exascale computing will be accompanied by a performance dichotomy. Computational peak performance will rapidly increase; I/O performance will either grow slowly or be completely stagnant. Essentially, the rate at which data are generated will grow much faster than the rate at which data can be read from and written to the disk. MD simulations will soon face the I/O problem of efficiently writing to and reading from disk on the next generation of supercomputers. This article targets MD simulations at the exascale and proposes a novel technique for in situ data analysis and indexing of MD trajectories. Our technique maps individual trajectories' substructures (i.e., α-helices, β-strands) to metadata frame by frame. The metadata captures the conformational properties of the substructures. The ensemble of metadata can be used for automatic, strategic analysis within a trajectory or across trajectories, without manually identifying those portions of trajectories in which critical changes take place. We demonstrate our technique's effectiveness by applying it to 26.3k helices and 31.2k strands from 9917 PDB proteins and by providing three empirical case studies. © 2017 Wiley Periodicals, Inc.


Subjects
Data Science/methods; Molecular Dynamics Simulation; Proteins/chemistry; Models, Theoretical; Protein Structure, Secondary
8.
J Comput Chem ; 36(16): 1196-212, 2015 Jun 15.
Article in English | MEDLINE | ID: mdl-25868455

ABSTRACT

In this study, we examine the temperature dependence of free energetics of nanotube association using graphical processing unit-enabled all-atom molecular dynamics simulations (FEN ZI) with two (10,10) single-walled carbon nanotubes in 3 m NaI aqueous salt solution. Results suggest that the free energy, enthalpy and entropy changes for the association process are all reduced at the high temperature, in agreement with previous investigations using other hydrophobes. Via the decomposition of free energy into individual components, we found that solvent contribution (including water, anion, and cation contributions) is correlated with the spatial distribution of the corresponding species and is influenced distinctly by the temperature. We studied the spatial distribution and the structure of the solvent in different regions: intertube, intratube and the bulk solvent. By calculating the fluctuation of coarse-grained tube-solvent surfaces, we found that tube-water interfacial fluctuation exhibits the strongest temperature dependence. By taking ions to be a solvent-like medium in the absence of water, tube-anion interfacial fluctuation shows similar but weaker dependence on temperature, while tube-cation interfacial fluctuation shows no dependence in general. These characteristics are discussed via the malleability of their corresponding solvation shells relative to the nanotube surface. Hydrogen bonding profiles and tetrahedrality of water arrangement are also computed to compare the structure of solvent in the solvent bulk and intertube region. The hydrophobic confinement induces a relatively lower concentration environment in the intertube region, therefore causing different intertube solvent structures which depend on the tube separation. 
This study is relevant in the continuing discourse on hydrophobic interactions (as they impact generally a broad class of phenomena in biology, biochemistry, and materials science and soft condensed matter research), and interpretations of hydrophobicity in terms of alternative but parallel signatures such as interfacial fluctuations, dewetting transitions, and enhanced fluctuation probabilities at interfaces.


Subjects
Nanotubes, Carbon/chemistry; Sodium Iodide/chemistry; Thermodynamics; Hydrogen Bonding; Hydrophobic and Hydrophilic Interactions; Molecular Dynamics Simulation; Solutions/chemistry; Temperature; Water/chemistry
9.
BMC Struct Biol ; 13 Suppl 1: S3, 2013.
Article in English | MEDLINE | ID: mdl-24564983

ABSTRACT

BACKGROUND: Ribonucleic acid (RNA) molecules play important roles in many biological processes including gene expression and regulation. Their secondary structures are crucial for the RNA functionality, and the prediction of the secondary structures is widely studied. Our previous research shows that cutting long sequences into shorter chunks, predicting secondary structures of the chunks independently using thermodynamic methods, and reconstructing the entire secondary structure from the predicted chunk structures can yield better accuracy than predicting the secondary structure using the RNA sequence as a whole. The chunking, prediction, and reconstruction processes can use different methods and parameters, some of which produce more accurate predictions than others. In this paper, we study the prediction accuracy and efficiency of three different chunking methods using seven popular secondary structure prediction programs that apply to two datasets of RNA with known secondary structures, which include both pseudoknotted and non-pseudoknotted sequences, as well as a family of viral genome RNAs whose structures have not been predicted before. Our modularized MapReduce framework based on Hadoop allows us to study the problem in a parallel and robust environment. RESULTS: On average, the maximum accuracy retention values are larger than one for our chunking methods and the seven prediction programs over 50 non-pseudoknotted sequences, meaning that the secondary structure predicted using chunking is more similar to the real structure than the secondary structure predicted by using the whole sequence. We observe similar results for the 23 pseudoknotted sequences, except for the NUPACK program using the centered chunking method. The performance analysis for 14 long RNA sequences from the Nodaviridae virus family outlines how the coarse-grained mapping of chunking and predictions in the MapReduce framework exhibits shorter turnaround times for short RNA sequences. 
However, as the lengths of the RNA sequences increase, the fine-grained mapping can surpass the coarse-grained mapping in performance. CONCLUSIONS: By using our MapReduce framework together with statistical analysis on the accuracy retention results, we observe how the inversion-based chunking methods can outperform predictions using the whole sequence. Our chunk-based approach also enables us to predict secondary structures for very long RNA sequences, which is not feasible with traditional methods alone.


Subjects
Computational Biology/methods; Nucleic Acid Conformation; RNA/chemistry; Algorithms; Base Pairing; Base Sequence; Models, Molecular; Nodaviridae/genetics; RNA, Viral/chemistry; Reproducibility of Results; Sequence Inversion
10.
Article in English | MEDLINE | ID: mdl-26023357

ABSTRACT

Secondary structures of ribonucleic acid (RNA) molecules play important roles in many biological processes including gene expression and regulation. Experimental observations and computing limitations suggest that we can approach the secondary structure prediction problem for long RNA sequences by segmenting them into shorter chunks, predicting the secondary structures of each chunk individually using existing prediction programs, and then assembling the results to give the structure of the original sequence. The selection of cutting points is a crucial component of the segmenting step. Noting that stem-loops and pseudoknots always contain an inversion, i.e., a stretch of nucleotides followed closely by its inverse complementary sequence, we developed two cutting methods for segmenting long RNA sequences based on inversion excursions: the centered and optimized methods. Each step of searching for inversions, chunking, and predictions can be performed in parallel. In this paper we use a MapReduce framework, i.e., Hadoop, to extensively explore meaningful inversion stem lengths and gap sizes for the segmentation and identify correlations between chunking methods and prediction accuracy. We show that for a set of long RNA sequences in the RFAM database, whose secondary structures are known to contain pseudoknots, our approach predicts secondary structures more accurately than methods that do not segment the sequence, when the latter predictions are possible computationally. We also show that, as sequences exceed certain lengths, some programs cannot computationally predict pseudoknots while our chunking methods can. Overall, our predicted structures still retain the accuracy level of the original prediction programs when compared with known experimental secondary structure.
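
The notion of an inversion used in this abstract (a stretch of nucleotides followed closely by its inverse complementary sequence) can be illustrated with a small brute-force search. This is only a sketch under simplifying assumptions: it considers strict Watson-Crick pairs (no G-U wobble), a fixed stem length, and a maximum gap, and the names `find_inversions`, `stem_len`, and `max_gap` are hypothetical rather than taken from the paper. It does not reproduce the centered or optimized cutting methods.

```python
from typing import List, Tuple

# Strict Watson-Crick complements for RNA (an assumption; wobble pairs are ignored).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the inverse complementary sequence of an RNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_inversions(seq: str, stem_len: int, max_gap: int) -> List[Tuple[int, int, int]]:
    """Find stems whose reverse complement follows within `max_gap` nucleotides.

    Returns a list of (stem_start, gap, partner_start) tuples using 0-based indices.
    """
    hits = []
    n = len(seq)
    for i in range(n - 2 * stem_len + 1):
        target = reverse_complement(seq[i:i + stem_len])
        for gap in range(max_gap + 1):
            j = i + stem_len + gap
            if j + stem_len > n:
                break  # partner would run past the end of the sequence
            if seq[j:j + stem_len] == target:
                hits.append((i, gap, j))
    return hits


# Toy hairpin: GGG pairs with CCC across a four-nucleotide gap.
toy_hits = find_inversions("GGGAAAUCCC", stem_len=3, max_gap=4)
```

Because each starting position is examined independently, a search of this kind splits naturally across workers, which fits the abstract's remark that the inversion search can be performed in parallel.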

11.
J Comput Chem ; 32(14): 2958-73, 2011 Nov 15.
Article in English | MEDLINE | ID: mdl-21793003

ABSTRACT

We present results of molecular dynamics simulations of fully hydrated DMPC bilayers performed on graphics processing units (GPUs) using current state-of-the-art non-polarizable force fields and a local GPU-enabled molecular dynamics code named FEN ZI. We treat the conditionally convergent electrostatic interaction energy exactly using the particle mesh Ewald method (PME) for solution of Poisson's Equation for the electrostatic potential under periodic boundary conditions. We discuss elements of our implementation of the PME algorithm on GPUs as well as pertinent performance issues. We proceed to show results of simulations of extended lipid bilayer systems using our program, FEN ZI. We performed simulations of DMPC bilayer systems consisting of 17,004, 68,484, and 273,936 atoms in explicit solvent. We present bilayer structural properties (atomic number densities, electron density profiles), deuterium order parameters (S(CD)), electrostatic properties (dipole potential, water dipole moments), and orientational properties of water. Predicted properties demonstrate excellent agreement with experiment and previous all-atom molecular dynamics simulations. We observe no statistically significant differences in calculated structural or electrostatic properties for different system sizes, suggesting the small bilayer simulations (less than 100 lipid molecules) provide equivalent representation of structural and electrostatic properties associated with significantly larger systems (over 1000 lipid molecules). We stress that the three system size representations will have differences in other properties such as surface capillary wave dynamics or surface tension related effects that are not probed in the current study. The latter properties are inherently dependent on system size. 
This contribution suggests the suitability of applying emerging GPU technologies to studies of an important class of biological environments, that of lipid bilayers and their associated integral membrane proteins. We envision that this technology will push the boundaries of fully atomic-resolution modeling of these biological systems, thus enabling unprecedented exploration of meso-scale phenomena (mechanisms, kinetics, energetics) with atomic detail at commodity hardware prices.


Subjects
Computer Graphics; Dimyristoylphosphatidylcholine/chemistry; Lipid Bilayers/chemistry; Molecular Dynamics Simulation; Algorithms; Molecular Structure; Static Electricity
12.
J Chem Inf Model ; 51(9): 2047-65, 2011 Sep 26.
Article in English | MEDLINE | ID: mdl-21644546

ABSTRACT

The performances of several two-step scoring approaches for molecular docking were assessed for their ability to predict binding geometries and free energies. Two new scoring functions designed for "step 2 discrimination" were proposed and compared to our CHARMM implementation of the linear interaction energy (LIE) approach using the Generalized-Born with Molecular Volume (GBMV) implicit solvation model. A scoring function S1 was proposed by considering only "interacting" ligand atoms as the "effective size" of the ligand and extended to an empirical regression-based pair potential S2. The S1 and S2 scoring schemes were trained and 5-fold cross-validated on a diverse set of 259 protein-ligand complexes from the Ligand Protein Database (LPDB). The regression-based parameters for S1 and S2 also demonstrated reasonable transferability in the CSARdock 2010 benchmark using a new data set (NRC HiQ) of diverse protein-ligand complexes. The ability of the scoring functions to accurately predict ligand geometry was evaluated by calculating the discriminative power (DP) of the scoring functions to identify native poses. The parameters for the LIE scoring function with the optimal discriminative power (DP) for geometry (step 1 discrimination) were found to be very similar to the best-fit parameters for binding free energy over a large number of protein-ligand complexes (step 2 discrimination). Reasonable performance of the scoring functions in enrichment of active compounds in four different protein target classes established that the parameters for S1 and S2 provided reasonable accuracy and transferability. Additional analysis was performed to definitively separate scoring function performance from molecular weight effects. This analysis included the prediction of ligand binding efficiencies for a subset of the CSARdock NRC HiQ data set where the number of ligand heavy atoms ranged from 17 to 35. 
This range of ligand heavy atoms is where improved accuracy of predicted ligand efficiencies is most relevant to real-world drug design efforts.
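
For readers unfamiliar with ligand efficiency, the sketch below uses one common definition: the negative binding free energy divided by the number of heavy (non-hydrogen) atoms. Whether the study used exactly this form is an assumption, and the ligand names and energies are invented for illustration only.

```python
from typing import List, Tuple


def ligand_efficiency(delta_g_kcal_per_mol: float, heavy_atoms: int) -> float:
    """Binding free energy per heavy atom, sign-flipped so that larger is better.

    Uses the common definition LE = -dG / N_heavy (assumed, not confirmed by the source).
    """
    if heavy_atoms <= 0:
        raise ValueError("heavy_atoms must be positive")
    return -delta_g_kcal_per_mol / heavy_atoms


# Hypothetical ligands: (name, predicted binding free energy in kcal/mol, heavy-atom count).
ligands: List[Tuple[str, float, int]] = [("lig_a", -8.5, 17), ("lig_b", -10.5, 35)]

# Rank by efficiency rather than by raw binding energy.
ranked = sorted(ligands, key=lambda lig: ligand_efficiency(lig[1], lig[2]), reverse=True)
```

In this toy example the smaller ligand ranks first even though the larger one binds more strongly in absolute terms, which is the kind of molecular-weight effect that normalizing by heavy-atom count is meant to separate out.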


Subjects
Proteins/chemistry; Databases, Protein; Ligands; Models, Chemical; Protein Binding; Regression Analysis
13.
J Comput Chem ; 32(3): 375-85, 2011 Feb.
Article in English | MEDLINE | ID: mdl-20862755

ABSTRACT

Molecular dynamics (MD) simulations are a vital tool in chemical research, as they are able to provide an atomistic view of chemical systems and processes that is not obtainable through experiment. However, large-scale MD simulations require access to multicore clusters or supercomputers that are not always available to all researchers. Recently, scientists have returned to exploring the power of graphics processing units (GPUs) for various applications, such as MD, enabled by the recent advances in hardware and integrated programming interfaces such as NVIDIA's CUDA platform. One area of particular interest within the context of chemical applications is that of aqueous interfaces, the salt solutions of which have found application as model systems for studying atmospheric processes as well as physical behaviors such as the Hofmeister effect. Here, we present results of GPU-accelerated simulations of the liquid-vapor interface of aqueous sodium iodide solutions. Analysis of various properties, such as density and surface tension, demonstrates that our model is consistent with previous studies of similar systems. In particular, we find that the current combination of water and ion force fields coupled with the ability to simulate surfaces of differing area enabled by GPU hardware is able to reproduce the experimental trend of increasing salt solution surface tension relative to pure water. In terms of performance, our GPU implementation performs equivalently to CHARMM running on 21 CPUs. Finally, we address possible issues with the accuracy of MD simulations caused by nonstandard single-precision arithmetic implemented on current GPUs.


Subjects
Molecular Dynamics Simulation; Sodium Iodide/chemistry; Water/chemistry; Ions/chemistry
14.
Virus Res ; 150(1-2): 12-21, 2010 Jun.
Article in English | MEDLINE | ID: mdl-20176063

ABSTRACT

Nodamura virus (NoV; family Nodaviridae) contains a bipartite positive-strand RNA genome that replicates via negative-strand intermediates. The specific structural and sequence determinants for initiation of nodavirus RNA replication have not yet been identified. For the related nodavirus Flock House virus (FHV), undefined sequences within the 3'-terminal 50 nucleotides (nt) of FHV RNA2 are essential for its replication. We previously showed that a conserved stem-loop structure (3'SL) is predicted to form near the 3' end of the RNA2 segments of seven nodaviruses, including NoV. We hypothesized that the 3'SL structure from NoV RNA2 is an essential cis-acting element for RNA replication. To determine whether the structure can actually form within RNA2, we analyzed the secondary structure of NoV RNA2 in vitro transcripts using nuclease mapping. The resulting nuclease maps were 86% consistent with the predicted 3'SL structure, suggesting that it can form in solution. We used a well-defined reverse genetic system for launch of NoV replication in yeast cells to test the function of the 3'SL in the viral life cycle. Deletion of the nucleotides that comprise the 3'SL from a NoV2-GFP chimeric replicon resulted in a severe defect in RNA2 replication. A minimal replicon containing the 5'-terminal 17 nt and the 3'-terminal 54 nt of RNA2 (including the predicted 3'SL) retained the ability to replicate in yeast, suggesting that this region is able to direct replication of a heterologous mRNA. These data suggest that the 3'SL plays an essential role in replication of NoV RNA2. The conservation of the predicted 3'SL suggests that this common motif may play a role in RNA replication for the other members of the Nodaviridae.


Subjects
3' Untranslated Regions; Nodaviridae/physiology; Nucleic Acid Conformation; RNA, Viral/genetics; Virus Replication; Inverted Repeat Sequences; Nodaviridae/genetics; RNA, Viral/metabolism; Ribonucleases/metabolism; Saccharomyces cerevisiae/genetics; Saccharomyces cerevisiae/metabolism
15.
Article in English | MEDLINE | ID: mdl-25705724

ABSTRACT

In this paper, we present a dynamic programming algorithm that runs in polynomial time and allows us to achieve the optimal, non-overlapping segmentation of a long RNA sequence into segments (chunks). The secondary structure of each chunk is predicted independently, then combined with the structures predicted for the other chunks, to generate a complete secondary structure prediction that is thus a combination of local energy minima. The proposed approach not only is more efficient and accurate than other traditionally used methods that are based on global energy minimizations, but it also allows scientists to overcome computing and storage constraints when trying to predict the secondary structure of long RNA sequences.
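
The general shape of a polynomial-time dynamic program for non-overlapping segmentation can be sketched as follows. This is a generic illustration, not the paper's algorithm: the abstract does not state the objective being optimized, so the `score` callback, the `max_len` bound on chunk length, and the toy scoring rule are all hypothetical.

```python
from typing import Callable, List, Tuple


def optimal_segmentation(
    n: int,
    score: Callable[[int, int], float],
    max_len: int,
) -> Tuple[float, List[Tuple[int, int]]]:
    """Split positions 0..n into contiguous, non-overlapping chunks of maximal total score.

    `score(start, end)` rates the chunk covering [start, end); it is a
    placeholder for whatever quality measure one wishes to optimize.
    Runs in O(n * max_len) evaluations of `score`.
    """
    best = [float("-inf")] * (n + 1)  # best[e]: best total score for the prefix of length e
    back = [0] * (n + 1)              # back[e]: start of the last chunk in that solution
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            candidate = best[start] + score(start, end)
            if candidate > best[end]:
                best[end] = candidate
                back[end] = start
    # Walk the back-pointers to recover the chunk boundaries.
    chunks = []
    end = n
    while end > 0:
        start = back[end]
        chunks.append((start, end))
        end = start
    chunks.reverse()
    return best[n], chunks


# Toy objective: reward chunks of exactly three nucleotides.
total, chunks = optimal_segmentation(6, lambda s, e: 1.0 if e - s == 3 else 0.0, max_len=4)
```

Each prefix is solved once and reused, which is what keeps the search polynomial even though the number of possible segmentations grows exponentially with sequence length.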

16.
IEEE Eng Med Biol Mag ; 28(2): 58-69, 2009.
Article in English | MEDLINE | ID: mdl-19349252

ABSTRACT

In biological systems, the binding of small molecule ligands to proteins is a crucial process for almost every aspect of biochemistry and molecular biology. Enzymes are proteins that function by catalyzing specific biochemical reactions that convert reactants into products. Complex organisms are typically composed of cells in which thousands of enzymes participate in complex and interconnected biochemical pathways. Some enzymes serve as sequential steps in specific pathways (such as energy metabolism), while others function to regulate entire pathways and cellular functions [1]. Small molecule ligands can be designed to bind to a specific enzyme and inhibit the biochemical reaction. Inhibiting the activity of key enzymes may result in the entire biochemical pathways being turned on or off [2], [3]. Many small molecule drugs marketed today function in this generic way as enzyme inhibitors. If research identifies a specific enzyme as being crucial to the progress of disease, then this enzyme may be targeted with an inhibitor, which may slow down or reverse the progress of disease. In this way, enzymes are targeted from specific pathogens (e.g., virus, bacteria, fungi) for infectious diseases [4], [5], and human enzymes are targeted for noninfectious diseases such as cardiovascular disease, cancer, diabetes, and neurodegenerative diseases [6].


Subjects
Computational Biology/methods; Models, Molecular; Protein Binding; Proteins; Algorithms; Computer Simulation; Ligands; Monte Carlo Method; Proteins/chemistry; Proteins/metabolism; Solvents/chemistry
17.
Nucleic Acids Res ; 37(Database issue): D127-35, 2009 Jan.
Article in English | MEDLINE | ID: mdl-18988624

ABSTRACT

Pseudoknots have been recognized to be an important type of RNA secondary structure responsible for many biological functions. PseudoBase, a widely used database of pseudoknot secondary structures developed at Leiden University, contains over 250 records of pseudoknots obtained in the past 25 years through crystallography, NMR, mutational experiments and sequence comparisons. To promptly address the growing analysis requests of researchers on RNA structures and bring together information from multiple sources across the Internet to a single platform, we designed and implemented PseudoBase++, an extension of PseudoBase for easy searching, formatting and visualization of pseudoknots. PseudoBase++ (http://pseudobaseplusplus.utep.edu) maps the PseudoBase dataset into a searchable relational database including additional functionalities such as pseudoknot type. PseudoBase++ links each pseudoknot in PseudoBase to the GenBank record of the corresponding nucleotide sequence and allows scientists to automatically visualize RNA secondary structures with PseudoViewer. It also includes the capabilities of fine-grained reference searching and collecting new pseudoknot information.


Subjects
Databases, Nucleic Acid; RNA/chemistry; Base Pairing; Computer Graphics; Systems Integration
18.
Parallel Comput ; 34(11): 661-680, 2008 11 01.
Article in English | MEDLINE | ID: mdl-19885376

ABSTRACT

As ribonucleic acid (RNA) molecules play important roles in many biological processes including gene expression and regulation, their secondary structures have been the focus of many recent studies. Despite the computing power of supercomputers, computationally predicting secondary structures with thermodynamic methods is still not feasible when the RNA molecules have long nucleotide sequences and include complex motifs such as pseudoknots. This paper presents RNAVLab (RNA Virtual Laboratory), a virtual laboratory for studying RNA secondary structures including pseudoknots that allows scientists to address this challenge. Two important case studies show the versatility and functionalities of RNAVLab. The first study quantifies its capability to rebuild longer secondary structures from motifs found in systematically sampled nucleotide segments. The extensive sampling and predictions are made feasible in a short turnaround time because of the grid technology used. The second study shows how RNAVLab allows scientists to study the viral RNA genome replication mechanisms used by members of the virus family Nodaviridae.
