1.
J Chem Inf Model ; 62(5): 1178-1189, 2022 03 14.
Article in English | MEDLINE | ID: mdl-35235748

ABSTRACT

Structure-based, virtual High-Throughput Screening (vHTS) methods for predicting ligand activity in drug discovery are important when there are no or relatively few known compounds that interact with a therapeutic target of interest. State-of-the-art computational vHTS necessarily relies on effective methods for pose sampling and docking and generating an accurate affinity score from the docked poses. However, proteins are dynamic; in vivo ligands bind to a conformational ensemble. In silico docking to the single conformation represented by a crystal structure can adversely affect the pose quality. Here, we introduce AtomNet PoseRanker (ANPR), a graph convolutional network trained to identify and rerank crystal-like ligand poses from a sampled ensemble of protein conformations and ligand poses. In contrast to conventional vHTS methods that incorporate receptor flexibility, a deep learning approach can internalize valid cognate and noncognate binding modes corresponding to distinct receptor conformations, thereby learning to infer and account for receptor flexibility even on single conformations. ANPR significantly enriched pose quality in docking to cognate and noncognate receptors of the PDBbind v2019 data set. Improved pose rankings that better represent experimentally observed ligand binding modes improve hit rates in vHTS campaigns and thereby advance computational drug discovery, especially for novel therapeutic targets or novel binding sites.


Subject(s)
Proteins; Binding Sites; Ligands; Molecular Docking Simulation; Protein Binding; Protein Conformation; Proteins/chemistry
2.
Nat Rev Genet ; 11(9): 647-57, 2010 Sep.
Article in English | MEDLINE | ID: mdl-20717155

ABSTRACT

Today we can generate hundreds of gigabases of DNA and RNA sequencing data in a week for less than US$5,000. The astonishing rate of data generation by these low-cost, high-throughput technologies in genomics is being matched by that of other technologies, such as real-time imaging and mass spectrometry-based flow cytometry. Success in the life sciences will depend on our ability to properly interpret the large-scale, high-dimensional data sets that are generated by these technologies, which in turn requires us to adopt advances in informatics. Here we discuss how we can master the different types of computational environments that exist - such as cloud and heterogeneous computing - to successfully tackle our big data problems.


Subject(s)
Computational Biology/methods; Animals; Genomics/methods; Humans; Sequence Analysis, DNA/methods
3.
N Engl J Med ; 364(1): 33-42, 2011 Jan 06.
Article in English | MEDLINE | ID: mdl-21142692

ABSTRACT

BACKGROUND: Although cholera has been present in Latin America since 1991, it had not been epidemic in Haiti for at least 100 years. Recently, however, there has been a severe outbreak of cholera in Haiti. METHODS: We used third-generation single-molecule real-time DNA sequencing to determine the genome sequences of 2 clinical Vibrio cholerae isolates from the current outbreak in Haiti, 1 strain that caused cholera in Latin America in 1991, and 2 strains isolated in South Asia in 2002 and 2008. Using primary sequence data, we compared the genomes of these 5 strains and a set of previously obtained partial genomic sequences of 23 diverse strains of V. cholerae to assess the likely origin of the cholera outbreak in Haiti. RESULTS: Both single-nucleotide variations and the presence and structure of hypervariable chromosomal elements indicate that there is a close relationship between the Haitian isolates and variant V. cholerae El Tor O1 strains isolated in Bangladesh in 2002 and 2008. In contrast, analysis of genomic variation of the Haitian isolates reveals a more distant relationship with circulating South American isolates. CONCLUSIONS: The Haitian epidemic is probably the result of the introduction, through human activity, of a V. cholerae strain from a distant geographic source. (Funded by the National Institute of Allergy and Infectious Diseases and the Howard Hughes Medical Institute.).
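
The single-nucleotide comparison described in the abstract can be illustrated with a minimal sketch. It assumes the genomes have already been aligned to equal length; the function name `snv_distance` and the toy sequences are hypothetical and do not come from the study, whose actual analysis pipeline is not described here.

```python
def snv_distance(seq_a: str, seq_b: str, ignore: str = "-N") -> int:
    """Count single-nucleotide differences between two pre-aligned sequences.

    Positions where either sequence has a gap or an ambiguous base
    (characters listed in `ignore`) are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a not in ignore and b not in ignore
    )


# Toy, made-up alignments purely for illustration (not real V. cholerae data).
toy_alignment = {
    "isolate_A": "ACGTACGTAC",
    "isolate_B": "ACGTACGTAC",
    "isolate_C": "ACGAACGTTC",
}
names = sorted(toy_alignment)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        d = snv_distance(toy_alignment[first], toy_alignment[second])
        print(first, second, d)
```

A smaller count suggests a closer relationship between two isolates, although the abstract notes that the authors also considered the presence and structure of hypervariable chromosomal elements, which this sketch does not model.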


Subject(s)
Cholera/microbiology; Genes, Bacterial; Vibrio cholerae/classification; Vibrio cholerae/genetics; Cholera/epidemiology; Chromosome Mapping; Disease Outbreaks; Feces/microbiology; Genetic Variation; Genome, Bacterial; Haiti/epidemiology; History, 18th Century; Humans; Phylogeny; Sequence Analysis, DNA; Serotyping; Vibrio cholerae/isolation & purification; Vibrio cholerae O1/genetics
5.
Commun Biol ; 3(1): 318, 2020 06 25.
Article in English | MEDLINE | ID: mdl-32587328

ABSTRACT

We performed shallow single-cell sequencing of genomic DNA across 1475 cells from a cell-line, COLO829, to resolve overall complexity and clonality. This melanoma tumor-line has been previously characterized by multiple technologies and is a benchmark for evaluating somatic alterations. In some of these studies, COLO829 has shown conflicting and/or indeterminate copy number and, thus, single-cell sequencing provides a tool for gaining insight. Following shallow single-cell sequencing, we first identified at least four major sub-clones by discriminant analysis of principal components of single-cell copy number data. Based on clustering, break-point and loss of heterozygosity analysis of aggregated data from sub-clones, we identified distinct hallmark events that were validated within bulk sequencing and spectral karyotyping. In summary, COLO829 exhibits a classical Dutrillaux's monosomic/trisomic pattern of karyotype evolution with endoreduplication, where consistent sub-clones emerge from the loss/gain of abnormal chromosomes. Overall, our results demonstrate how shallow copy number profiling can uncover hidden biological insights.


Subject(s)
Melanoma/genetics; Melanoma/pathology; Single-Cell Analysis/methods; Cell Line, Tumor; DNA Copy Number Variations; Humans; Karyotyping; Loss of Heterozygosity; Sequence Analysis, DNA
6.
Proteins ; 46(4): 368-79, 2002 Mar 01.
Article in English | MEDLINE | ID: mdl-11835512

ABSTRACT

Our recently developed off-lattice bead model capable of simulating protein structures with mixed alpha/beta content has been extended to model the folding of a ubiquitin-like protein and provides a means for examining the more complex kinetics involved in the folding of larger proteins. Using trajectories generated from constant-temperature Langevin dynamics simulations and sampling with the multiple multi-histogram method over five order parameters, we are able to characterize the free energy landscape for folding and find evidence for folding through compact intermediates. Our model reproduces the observation that the C-terminal loop structure in ubiquitin is the last to fold and most likely plays a spectator role in the folding kinetics. The possibility of a productive metastable intermediate along the folding pathway consisting of collapsed states with no secondary structure, and of intermediates or transition structures involving secondary structural elements occurring early in the sequence, is also supported by our model. The kinetics of folding remain multi-exponential below the folding temperature, with glass-like kinetics appearing at T/T_f of approximately 0.86. This new physicochemical model, designed to be predictive, helps validate the value of modeling protein folding at this level of detail for genomic-scale studies, and motivates further studies of other protein topologies and the impact of more complex energy functions, such as the addition of solvation forces.
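
As a rough illustration of constant-temperature Langevin dynamics, the following sketch integrates a single particle in a one-dimensional harmonic well with a simple Euler-type scheme. It is a toy under stated assumptions, not the authors' bead model: the function name, the harmonic force, the reduced units, and all parameter values are hypothetical.

```python
import math
import random


def langevin_trajectory(x0, v0, n_steps, dt=0.01, mass=1.0, friction=1.0,
                        kT=1.0, spring_k=1.0, seed=0):
    """Integrate dv = (F/m - friction*v) dt + sqrt(2*friction*kT/m) dW.

    F = -spring_k * x is an assumed toy force (harmonic well).
    Returns the list of positions, including the starting point.
    """
    rng = random.Random(seed)
    x, v = x0, v0
    # Standard deviation of the random velocity kick over one time step.
    noise_scale = math.sqrt(2.0 * friction * kT * dt / mass)
    positions = [x]
    for _ in range(n_steps):
        force = -spring_k * x
        v += (force / mass - friction * v) * dt + noise_scale * rng.gauss(0.0, 1.0)
        x += v * dt  # position update uses the freshly updated velocity
        positions.append(x)
    return positions


# With kT = 0 the noise vanishes and the particle simply relaxes to the minimum.
relaxed = langevin_trajectory(x0=1.0, v0=0.0, n_steps=2000, kT=0.0)
print(abs(relaxed[-1]) < 0.01)
```

The friction term removes energy and the random term returns it, so that together they hold the system near the target temperature; production simulations typically use more careful integrators than this one.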


Subject(s)
Models, Molecular; Ubiquitin/chemistry; Amino Acid Sequence; Animals; Computer Simulation; Kinetics; Models, Theoretical; Protein Folding; Protein Structure, Secondary; Proteins/chemistry; Thermodynamics
7.
J Comput Biol ; 9(1): 35-54, 2002.
Article in English | MEDLINE | ID: mdl-11911794

ABSTRACT

We examine the ability of our recently introduced minimalist protein model to reproduce experimentally measured thermodynamic and kinetic changes upon sequence mutation in the well-studied immunoglobulin-binding protein L. We have examined five different sequence mutations of protein L that are meant to mimic the same mutation type studied experimentally: two different mutations which disrupt the natural preference in the beta-hairpin #1 and beta-hairpin #2 turn regions, two different helix mutants where a surface polar residue in the alpha-helix has been mutated to a hydrophobic residue, and a final mutant to further probe the role of nonnative hydrophobic interactions in the folding process. These simulated mutations are analyzed in terms of various kinetic and thermodynamic changes with respect to wild type, but in addition we evaluate the structure-activity relationship of our model protein based on the phi-value calculated from both the kinetic and thermodynamic perspectives. We find that the simulated thermodynamic phi-values reproduce the experimental trends in the mutations studied and allow us to circumvent the difficult interpretation of the complicated kinetics of our model. Furthermore, the level of resolution of the model allows us to directly predict what experiments seek in regard to protein engineering studies of protein folding--namely the residues or portions of the polypeptide chain that contribute to the crucial step in the folding of the wild-type protein.
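
For readers unfamiliar with phi-values, the sketch below uses the common textbook definition: the mutation-induced change in the free energy of the transition state relative to the unfolded state, divided by the corresponding change in overall folding stability. Whether the authors computed their values exactly this way is not stated in the abstract; the function name, the `min_ddg` cutoff, and the example numbers are assumptions for illustration.

```python
def phi_value(ddg_ts_unfolded: float, ddg_native_unfolded: float,
              min_ddg: float = 0.5) -> float:
    """Return phi = ddG(TS - U) / ddG(N - U) for a single mutation.

    Both arguments are mutant-minus-wild-type free energy changes in the
    same units. `min_ddg` is an assumed cutoff: when the stability change
    is tiny, the ratio is dominated by noise and is not reported.
    """
    if abs(ddg_native_unfolded) < min_ddg:
        raise ValueError("stability change too small for a reliable phi-value")
    return ddg_ts_unfolded / ddg_native_unfolded


# Hypothetical numbers: the mutation destabilizes the native state by 2.0
# and the transition state by 1.0 (same energy units).
print(phi_value(1.0, 2.0))
```

A value near 1 is usually read as the mutated site being native-like in the transition state, and a value near 0 as it being unstructured there.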


Subject(s)
Bacterial Proteins/chemistry; DNA-Binding Proteins/chemistry; Protein Engineering/methods; Bacterial Proteins/metabolism; Computer Simulation; DNA-Binding Proteins/metabolism; Kinetics; Models, Chemical; Mutation/genetics; Protein Conformation; Protein Folding; Thermodynamics
8.
Nat Biotechnol ; 30(7): 701-707, 2012 Jul 01.
Article in English | MEDLINE | ID: mdl-22750883

ABSTRACT

Advances in DNA sequencing technology have improved our ability to characterize most genomic diversity. However, accurate resolution of large structural events is challenging because of the short read lengths of second-generation technologies. Third-generation sequencing technologies, which can yield longer multikilobase reads, have the potential to address limitations associated with genome assembly. Here we combine sequencing data from second- and third-generation DNA sequencing technologies to assemble the two-chromosome genome of a recent Haitian cholera outbreak strain into two nearly finished contigs at >99.9% accuracy. Complex regions with clinically relevant structure were completely resolved. In separate control assemblies on experimental and simulated data for the canonical N16961 cholera reference strain, we obtained 14 scaffolds of greater than 1 kb for the experimental data and 8 scaffolds of greater than 1 kb for the simulated data, which allowed us to correct several errors in contigs assembled from the short-read data alone. This work provides a blueprint for the next generation of rapid microbial identification and full-genome assembly.


Subject(s)
Cholera/genetics; Genome, Bacterial; Sequence Analysis, DNA/methods; Algorithms; Base Sequence; Computational Biology; Contig Mapping; Genes, rRNA/genetics; Molecular Sequence Data
9.
Science ; 323(5910): 133-8, 2009 Jan 02.
Article in English | MEDLINE | ID: mdl-19023044

ABSTRACT

We present single-molecule, real-time sequencing data obtained from a DNA polymerase performing uninterrupted template-directed synthesis using four distinguishable fluorescently labeled deoxyribonucleoside triphosphates (dNTPs). We detected the temporal order of their enzymatic incorporation into a growing DNA strand with zero-mode waveguide nanostructure arrays, which provide optical observation volume confinement and enable parallel, simultaneous detection of thousands of single-molecule sequencing reactions. Conjugation of fluorophores to the terminal phosphate moiety of the dNTPs allows continuous observation of DNA synthesis over thousands of bases without steric hindrance. The data report directly on polymerase dynamics, revealing distinct polymerization states and pause sites corresponding to DNA secondary structure. Sequence data were aligned with the known reference sequence to assay biophysical parameters of polymerization for each template position. Consensus sequences were generated from the single-molecule reads at 15-fold coverage, showing a median accuracy of 99.3%, with no systematic error beyond fluorophore-dependent error rates.
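
The idea of building a consensus from multiple single-molecule reads can be sketched as a per-column majority vote. This is a simplified stand-in, not the procedure used in the paper: it assumes the reads are already aligned to equal length with `-` marking gaps, and the function name and toy reads are hypothetical.

```python
from collections import Counter


def majority_consensus(aligned_reads):
    """Return the most frequent base in each alignment column.

    Gaps ('-') are ignored; a column containing only gaps yields 'N'.
    Ties are resolved in favor of the base seen first in that column.
    """
    if not aligned_reads:
        raise ValueError("at least one read is required")
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("all reads must be aligned to the same length")
    consensus = []
    for column in zip(*aligned_reads):
        counts = Counter(base for base in column if base != "-")
        consensus.append(counts.most_common(1)[0][0] if counts else "N")
    return "".join(consensus)


# Three made-up reads with one random error each; the vote removes them.
toy_reads = ["ACGTTGCA", "ACCTTGCA", "ACGTTGAA"]
print(majority_consensus(toy_reads))
```

Because errors that occur at random positions rarely coincide across reads, voting over more reads raises consensus accuracy, which is consistent with the abstract reporting accuracy at a stated fold coverage.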


Subject(s)
DNA-Directed DNA Polymerase/metabolism; Sequence Analysis, DNA/methods; Base Sequence; Consensus Sequence; DNA/biosynthesis; DNA, Circular/chemistry; DNA, Single-Stranded/chemistry; Deoxyribonucleotides/metabolism; Enzymes, Immobilized; Fluorescent Dyes; Kinetics; Nanostructures; Spectrometry, Fluorescence
10.
Genome Res ; 18(10): 1638-42, 2008 Oct.
Article in English | MEDLINE | ID: mdl-18775913

ABSTRACT

Forward genetic mutational studies, adaptive evolution, and phenotypic screening are powerful tools for creating new variant organisms with desirable traits. However, mutations generated in the process cannot be easily identified with traditional genetic tools. We show that new high-throughput, massively parallel sequencing technologies can completely and accurately characterize a mutant genome relative to a previously sequenced parental (reference) strain. We studied a mutant strain of Pichia stipitis, a yeast capable of converting xylose to ethanol. This unusually efficient mutant strain was developed through repeated rounds of chemical mutagenesis, strain selection, transformation, and genetic manipulation over a period of seven years. We resequenced this strain on three different sequencing platforms. Surprisingly, we found fewer than a dozen mutations in open reading frames. All three sequencing technologies were able to identify each single nucleotide mutation given at least 10-15-fold nominal sequence coverage. Our results show that detecting mutations in evolved and engineered organisms is rapid and cost-effective at the whole-genome level using new sequencing technologies. Identification of specific mutations in strains with altered phenotypes will add insight into specific gene functions and guide further metabolic engineering efforts.
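
Nominal sequence coverage, as mentioned in the abstract, is commonly computed as the total number of sequenced bases divided by the genome size. The sketch below shows that arithmetic; the function name and the read count, read length, and genome size are hypothetical round numbers, not values from the study.

```python
def nominal_coverage(n_reads: int, read_length: int, genome_size: int) -> float:
    """Return nominal fold coverage = (reads x read length) / genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return n_reads * read_length / genome_size


# Hypothetical run: 1,000,000 reads of 150 bases over a 15,000,000-base genome.
coverage = nominal_coverage(1_000_000, 150, 15_000_000)
print(coverage)
print(coverage >= 10)  # compare against a chosen minimum fold coverage
```

Nominal coverage is an average; real coverage varies along the genome, so individual positions can fall below the threshold even when the average exceeds it.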


Subject(s)
DNA Mutational Analysis/methods; Genome, Fungal; Mutation; Pichia/genetics; Sequence Alignment; Sequence Analysis, DNA