Results 1 - 4 of 4
1.
iScience ; 25(10): 105081, 2022 Oct 21.
Article in English | MEDLINE | ID: mdl-36204272

ABSTRACT

Matching the treatment to an individual patient's tumor state can increase therapeutic efficacy and reduce tumor recurrence. Circulating tumor cells (CTCs) derived from solid tumors are promising subjects for theragnostic analysis. To analyze how CTCs represent tumor states, we established cell lines from CTCs, primary and metastatic tumors from a mouse model and provided phenotypic and multiomic analyses of these cells. CTCs and metastatic cells, but not primary tumor cells, shared stochastic mutations and similar hypomethylation levels at transcription start sites. CTCs and metastatic tumor cells shared a hybrid epithelial/mesenchymal transcriptome state with reduced adhesive and enhanced mobilization characteristics. We tested anti-cancer drugs on tumor cells from a metastatic breast cancer patient. CTC responses mirrored the impact of drugs on metastatic rather than primary tumors. Our multiomic and clinical anti-cancer drug response results reveal that CTCs resemble metastatic tumors and establish CTCs as an ex vivo tool for personalized medicine.

2.
Comput Math Methods Med ; 2020: 7231205, 2020.
Article in English | MEDLINE | ID: mdl-32952600

ABSTRACT

Although sequencing a human genome has become affordable, identifying genetic variants from whole-genome sequence data remains a hurdle for researchers without adequate computing equipment or bioinformatics support. GATK has long been a gold-standard method for identifying genetic variants and has been widely used in genome projects and population genetic studies. More recently, the Google Brain team developed DeepVariant, which uses deep neural networks to build an image classification model for identifying genetic variants. However, the superior accuracy of DeepVariant comes at the cost of computational intensity, which largely constrains its applications. Accordingly, we present DeepVariant-on-Spark to optimize resource allocation, enable multi-GPU support, and accelerate the processing of the DeepVariant pipeline. To make DeepVariant-on-Spark more accessible, we have deployed it to the Google Cloud Platform (GCP). Users can deploy DeepVariant-on-Spark on the GCP within 20 minutes by following our instructions and start analyzing at least ten whole-genome sequencing datasets using free credits provided by the GCP. DeepVariant-on-Spark is freely available for small-scale genome analysis using a cloud-based computing framework, which is suitable for pilot testing or preliminary studies, while retaining the flexibility and scalability needed for large-scale sequencing projects.
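
The abstract does not describe how the pipeline distributes work, so the following is only a minimal sketch of one common way to parallelize variant calling: cut each contig into fixed-size regions and hand the regions out to workers (for example, one worker per GPU). The function names `make_regions` and `assign_round_robin`, the chunk size, and the round-robin policy are all assumptions for illustration and are not taken from DeepVariant-on-Spark.

```python
def make_regions(contig_lengths, chunk_size):
    """Split each contig into half-open (contig, start, end) regions.

    contig_lengths: dict mapping contig name to its length in bases.
    chunk_size: maximum region length (hypothetical tuning parameter).
    """
    regions = []
    for contig, length in contig_lengths.items():
        for start in range(0, length, chunk_size):
            regions.append((contig, start, min(start + chunk_size, length)))
    return regions


def assign_round_robin(regions, n_workers):
    """Distribute regions over n_workers in round-robin order.

    This is an illustrative scheduling policy only; a real pipeline
    might balance by read depth or region size instead.
    """
    shards = [[] for _ in range(n_workers)]
    for i, region in enumerate(regions):
        shards[i % n_workers].append(region)
    return shards


if __name__ == "__main__":
    # Toy contig lengths, not real chromosome sizes.
    regions = make_regions({"chr1": 250, "chr2": 100}, chunk_size=100)
    for worker_id, shard in enumerate(assign_round_robin(regions, 2)):
        print(worker_id, shard)
```

Each worker would then run the variant caller on its own list of regions, and the per-region results would be merged afterwards.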


Subject(s)
Cloud Computing , Deep Learning , Genetic Variation , Whole Genome Sequencing/statistics & numerical data , Cloud Computing/economics , Computational Biology/methods , Cost-Benefit Analysis , Genome, Human , High-Throughput Nucleotide Sequencing/economics , High-Throughput Nucleotide Sequencing/standards , High-Throughput Nucleotide Sequencing/statistics & numerical data , Humans , Neural Networks, Computer , Software , Whole Genome Sequencing/economics , Whole Genome Sequencing/standards
3.
Nucleic Acids Res ; 35(Web Server issue): W465-72, 2007 Jul.
Article in English | MEDLINE | ID: mdl-17553839

ABSTRACT

This article presents a web server, iPDA, which aims to identify the disordered regions of a query protein. Automatic prediction of disordered regions from protein sequences is an important problem in structural biology. The proposed classifier, DisPSSMP2, differs from several existing disorder predictors in its use of position-specific scoring matrices with respect to physicochemical properties (PSSMP), where the physicochemical properties adopted here specifically take the disorder propensity of amino acids into account. The web server iPDA integrates DisPSSMP2 with several other sequence predictors in order to investigate the functional role of the detected disordered regions. The predicted information includes sequence conservation, secondary structure, sequence complexity and hydrophobic clusters. According to the proportion of predicted secondary structure elements, iPDA dynamically adjusts the cutoff threshold for determining protein disorder. Furthermore, a pattern mining package for detecting sequence conservation is embedded in iPDA to discover potential binding regions of the query protein, which helps uncover the relationship between protein function and primary sequence. The web service is available at http://biominer.bime.ntu.edu.tw/ipda and mirrored at http://biominer.cse.yzu.edu.tw/ipda.


Subject(s)
Algorithms , Computational Biology/methods , Models, Molecular , Proteins/chemistry , Sequence Alignment/methods , Sequence Analysis, Protein/methods , Software , Amino Acid Motifs , Amino Acid Sequence , Escherichia coli/metabolism , Humans , Molecular Sequence Data , Protein Structure, Secondary , Proteins/analysis , Sequence Homology, Amino Acid
4.
BMC Bioinformatics ; 7: 319, 2006 Jun 23.
Article in English | MEDLINE | ID: mdl-16796745

ABSTRACT

BACKGROUND: More and more disordered regions have been discovered in protein sequences, and many of them are found to be functionally significant. Previous studies reveal that the disordered regions of a protein can be predicted from its primary structure, the amino acid sequence. One widely accepted observation is that ordered regions usually have a compositional bias toward hydrophobic amino acids, whereas disordered regions are biased toward charged amino acids. Recent studies further show that employing evolutionary information such as position-specific scoring matrices (PSSMs) improves the prediction accuracy of protein disorder. As more and more machine learning techniques have been introduced to protein disorder detection, extracting more useful features with biological insight is attracting more attention.
RESULTS: This paper first studies the effect on prediction accuracy of a condensed position-specific scoring matrix with respect to physicochemical properties (PSSMP), where the PSSMP is derived by merging the amino acid columns of a PSSM that belong to a certain property into a single column. Next, we decompose each conventional physicochemical property of amino acids into two disjoint groups that have a propensity for order and disorder, respectively, and show experimentally that some of the new properties perform better than their parent properties in predicting protein disorder. To obtain an effective and compact feature set for this problem, we propose a hybrid feature selection method that inherits the efficiency of univariate analysis and the effectiveness of stepwise feature selection, which explores combinations of multiple features. The experimental results show that the selected feature set improves the performance of a classifier built with Radial Basis Function Networks (RBFN) in comparison with feature sets constructed with PSSMs or with PSSMPs that adopt only the conventional physicochemical properties.
CONCLUSION: Distinguishing disordered regions from ordered regions in protein sequences facilitates the exploration of protein structures and functions. Results based on independent testing data reveal that the proposed predicting model DisPSSMP performs the best among several of the existing packages doing similar tasks, without either under-predicting or over-predicting the disordered regions. Furthermore, the selected properties are demonstrated to be useful in finding discriminating patterns for order/disorder classification.
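
The column-merging step that defines a PSSMP can be illustrated with a short sketch. The abstract does not say how columns are merged or which amino acids belong to each property, so the code below assumes that merging means summing the scores, and the `PROPERTY_GROUPS` memberships and the column order in `AMINO_ACIDS` are illustrative placeholders rather than the groupings used by DisPSSMP.

```python
# Column order of the input PSSM (an assumed, commonly used ordering).
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"

# Hypothetical property groups, for illustration only.
PROPERTY_GROUPS = {
    "hydrophobic": "AILMFVWC",
    "charged": "DEKR",
}


def condense_pssm(pssm, groups=PROPERTY_GROUPS, alphabet=AMINO_ACIDS):
    """Merge the amino acid columns of a PSSM into one column per property.

    pssm: list of rows, one per sequence position, each holding one score
          per amino acid in `alphabet` order.
    Returns (property_names, condensed_rows). Merging is done by summation,
    which is an assumption; other aggregations (e.g. the mean) are possible.
    """
    index = {aa: i for i, aa in enumerate(alphabet)}
    names = list(groups)
    condensed = [
        [sum(row[index[aa]] for aa in groups[name]) for name in names]
        for row in pssm
    ]
    return names, condensed


if __name__ == "__main__":
    # One toy position whose scores are simply 0..19.
    names, condensed = condense_pssm([list(range(20))])
    print(names, condensed)
```

With this layout, a 20-column PSSM row shrinks to one value per property, which is the kind of compact feature vector a downstream classifier would consume.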


Subject(s)
Models, Molecular , Proteins/chemistry , Sequence Analysis, Protein , Software , Amino Acid Sequence , Cluster Analysis , Computer Simulation , Databases, Protein , Molecular Sequence Data