Results 1 - 20 of 26
1.
J Fluoresc ; 30(3): 637-656, 2020 May.
Article in English | MEDLINE | ID: mdl-32314139

ABSTRACT

The accuracy of detecting protein crystals in fluorescence microscopy images is critical for high-throughput and automated systems. Although the trace fluorescent labeling method can highlight protein crystals, reflection and emission from the fluorescent dye are not always due to crystal regions. Therefore, analysis of the peak wavelength in the emission spectrum of a fluorophore may not always yield effective results. In this paper, we show that using the subordinate color intensity, corresponding to wavelengths longer than the peak wavelength of the emission spectrum, can improve the accuracy of protein crystal detection. Hence, we have built a segmentation method based on the percentile intensity of the subordinate color for trace fluorescently labeled (TFL'd) protein crystallization trial images. Compared with using the dominant color channel, our segmentation method on the subordinate color channel reduced the misclassification rate of likely-leads or crystals as non-crystals by between 2.02% and 9.71%, depending on the classifier. Similarly, the accuracy of the classifiers increased by between 1.77% and 5.53%. Our method reached around 94% accuracy while keeping misclassification of likely-leads and crystals as non-crystals below 1%. Moreover, to evaluate the generalizability of our method, we conducted new wet lab experiments on two proteins, Concanavalin A (Con A) and Ab inorganic pyrophosphatase (AbIPPase), and the misclassification rate was below 1%. Our experiments show that using the subordinate channel may be more helpful for TFL'd protein trial image classification.
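The percentile-based segmentation idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the nearest-rank percentile definition, the default percentile of 90, and the choice of channel index are all assumptions made for illustration. The image is taken to be a list of rows of (R, G, B) tuples, and the caller decides which channel counts as the subordinate one.

```python
import math


def percentile(values, q):
    """Return the q-th percentile (0-100) of values using the nearest-rank rule."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile of an empty sequence is undefined")
    # Nearest-rank: the smallest value with at least q% of the data at or below it.
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


def segment_by_channel_percentile(image, channel, q=90.0):
    """Binarize an RGB image on one color channel.

    image   -- list of rows, each row a list of (r, g, b) tuples
    channel -- index of the channel to threshold (hypothetical choice:
               0 for red when green is the dominant emission color)
    q       -- percentile used as the threshold (assumed default)

    Returns a mask of 0/1 values with 1 where the channel intensity is
    strictly above the q-th percentile of that channel.
    """
    values = [pixel[channel] for row in image for pixel in row]
    threshold = percentile(values, q)
    return [[1 if pixel[channel] > threshold else 0 for pixel in row]
            for row in image]
```

For example, calling `segment_by_channel_percentile(image, channel=0, q=75)` keeps only the pixels whose red intensity exceeds the 75th percentile of the red channel; the resulting mask would then feed region-feature extraction and a classifier.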


Subjects
Color , Concanavalin A/chemistry , Optical Imaging , Phosphoric Monoester Hydrolases/chemistry , Crystallization , Fluorescence Microscopy , Phosphoric Monoester Hydrolases/metabolism
2.
Methods Mol Biol ; 2652: 187-197, 2023.
Article in English | MEDLINE | ID: mdl-37093476

ABSTRACT

Protein crystallization is a complex process, where every component and physical parameter of the crystallization process may have an effect on the outcome. Crystallization conditions are typically arrived at by a screening process, where the target is subjected to a broad array of solution conditions with the goal of obtaining at least one condition that can be carried on to a structure. Ionic liquids (ILs) have been found to be useful additives for improving the outcomes of the crystallization process, with existing data indicating that the IL structure has an effect. We describe a method for quickly preparing a series of solutions that vary in just one component, in this case a series of ILs that are used as crystallization additives. The method results in a screening grid, where the crystallization conditions being tested are constant in any one column in the Y dimension and the ILs are constant in any one row in the X dimension. This provides a systematic approach to determining effective ILs for obtaining crystals from a limited set of promising starting crystallization conditions. The approach generates an X-Y array of conditions, where the basic precipitant conditions are kept constant in one plate dimension and the additives are kept constant in the second dimension, generating a 12 × 8 array of conditions. This approach would also be useful for surveying other classes of protein crystallization additives in a systematic fashion.
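The X-Y screening grid described in this abstract can be sketched as a simple mapping from plate wells to (condition, additive) pairs. This is an illustrative sketch only: the function name, the well-naming scheme (rows A-H, columns 1-12, as on a standard 96-well plate), and the assignment of conditions to the 12 columns and additives to the 8 rows are assumptions, not details taken from the chapter.

```python
ROW_LABELS = "ABCDEFGH"   # 8 plate rows (assumed to hold one additive each)
MAX_COLUMNS = 12          # 12 plate columns (assumed to hold one condition each)


def build_screen_grid(conditions, additives):
    """Map each well name (e.g. 'A1') to a (condition, additive) pair.

    Every well in a given column shares the same crystallization condition,
    and every well in a given row shares the same additive.
    """
    if len(conditions) > MAX_COLUMNS:
        raise ValueError("at most 12 conditions fit across the plate columns")
    if len(additives) > len(ROW_LABELS):
        raise ValueError("at most 8 additives fit down the plate rows")
    grid = {}
    for row_index, additive in enumerate(additives):
        for column, condition in enumerate(conditions, start=1):
            grid[f"{ROW_LABELS[row_index]}{column}"] = (condition, additive)
    return grid
```

With 12 conditions and 8 additives this yields 96 wells; one of the additive rows could hold a water control, which is a common design choice but is again an assumption here.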


Subjects
Ionic Liquids , Ionic Liquids/chemistry , Crystallization , Proteins/chemistry
3.
Crystals (Basel) ; 11(10), 2021 Oct.
Article in English | MEDLINE | ID: mdl-34745654

ABSTRACT

Among its attributes, the mythical philosopher's stone is supposedly capable of turning base metals to gold or silver. In an analogous fashion, we are finding that protein crystallization optimization using ionic liquids (ILs) often results in the conversion of base protein precipitate to crystals. Recombinant inorganic pyrophosphatases (8 of the 11 proteins) from pathogenic bacteria as well as several other proteins were tested for optimization by 23 ILs, plus a dH2O control, at IL concentrations of 0.1, 0.2, and 0.4 M. The ILs were used as additives, and all proteins were crystallized in the presence of at least one IL. For 9 of the 11 proteins, precipitation conditions were converted to crystals with at least one IL. The ILs could be ranked in order of effectiveness, and it was found that ~83% of the precipitation-derived crystallization conditions could be obtained with a suite of just eight ILs, with the top two ILs accounting for ~50% of the hits. Structural trends were found in the effectiveness of the ILs, with shorter-alkyl-chain ILs being more effective. The two top ILs, accounting for ~50% of the unique crystallization results, were choline dihydrogen phosphate and 1-butyl-3-methylimidazolium tetrafluoroborate. Curiously, however, a butyl group was present on the cation of four of the top eight ILs.

4.
IEEE/ACM Trans Comput Biol Bioinform ; 17(6): 2074-2085, 2020.
Article in English | MEDLINE | ID: mdl-31034419

ABSTRACT

The data representation as well as naming conventions used in commercial screen files by different companies make the automated analysis of crystallization experiments difficult and time-consuming. In order to reduce the human effort required to deal with this problem, we present an approach for computationally matching elements of two schemas using linguistic schema matching methods and then transform the input screen format to another format with naming defined by the user. This approach is tested on a number of commercial screens from different companies and the results of the experiments showed an overall accuracy of 97 percent on schema matching which is significantly better than the other two matchers we tested. Our tool enables mapping a screen file in one format to another format preferred by the expert using their preferred chemical names.


Subjects
Computational Biology/methods , Crystallization/classification , Data Mining/methods , Protein Databases , Proteins , Proteins/chemistry , Proteins/classification , Terminology as Topic
5.
Acta Crystallogr D Struct Biol ; 75(Pt 8): 743-752, 2019 Aug 01.
Article in English | MEDLINE | ID: mdl-31373573

ABSTRACT

The haloacid dehalogenase (HAD) superfamily is one of the largest known groups of enzymes and the majority of its members catalyze the hydrolysis of phosphoric acid monoesters into a phosphate ion and an alcohol. Despite the fact that sequence similarity between HAD phosphatases is generally very low, the members of the family possess some characteristic features, such as a Rossmann-like fold, HAD signature motifs or the requirement for Mg²⁺ ion as an obligatory cofactor. This study focuses on a new hypothetical HAD phosphatase from Thermococcus thioreducens. The protein crystallized in space group P2₁2₁2, with unit-cell parameters a = 66.3, b = 117.0, c = 33.8 Å, and the crystals contained one molecule in the asymmetric unit. The protein structure was determined by X-ray crystallography and was refined to 1.75 Å resolution. The structure revealed a putative active site common to all HAD members. Computational docking into the crystal structure was used to propose substrates of the enzyme. The activity of this thermophilic enzyme towards several of the selected substrates was confirmed at temperatures of 37°C as well as 60°C.


Subjects
Hydrolases/chemistry , Phosphoric Monoester Hydrolases/chemistry , Thermococcus/enzymology , Binding Sites , Catalytic Domain , X-Ray Crystallography/methods , Kinetics , Molecular Models , Substrate Specificity
6.
Methods Mol Biol ; 426: 377-85, 2008.
Article in English | MEDLINE | ID: mdl-18542877

ABSTRACT

Trace fluorescent labeling, typically less than 1%, can be a powerful aid in macromolecule crystallization. Precipitation concentrates a solute, and crystals are the most densely packed solid form. The more densely packed the fluorescing material, the brighter the emission from it; thus, fluorescence intensity of a solid phase is a good indication of whether or not one has crystals. The more brightly fluorescing crystalline phase is easily distinguishable, even when embedded in an amorphous precipitate. This approach conveys several distinct advantages: one can see what the protein is doing in response to the imposed conditions, and distinguishing between amorphous and microcrystalline precipitated phases is considerably simpler. The higher fluorescence intensity of the crystalline phase led the authors to test if they could derive crystallization conditions from screen outcomes that had no obvious crystalline material, but simply "bright spots" in the precipitated phase. Preliminary results show that the presence of these bright spots, not observable under white light, is indeed a good indicator of potential crystallization conditions.


Subjects
Endo-1,4-beta Xylanases/chemistry , Fluorescent Dyes/chemistry , Macromolecular Substances/chemistry , Crystallization , X-Ray Crystallography/methods
7.
Acta Crystallogr F Struct Biol Commun ; 73(Pt 12): 657-663, 2017 Dec 01.
Article in English | MEDLINE | ID: mdl-29199986

ABSTRACT

A wide variety of crystallization solutions are screened to establish conditions that promote the growth of a diffraction-quality crystal. Screening these conditions requires the assessment of many crystallization plates for the presence of crystals. Automated systems for screening and imaging are very expensive. A simple approach to imaging trace fluorescently labeled protein crystals in crystallization plates has been devised, and can be implemented at a cost as low as $50. The proteins β-lactoglobulin B, trypsin and purified concanavalin A (ConA) were trace fluorescently labeled using three different fluorescent probes: Cascade Yellow (CY), Carboxyrhodamine 6G (CR) and Pacific Blue (PB). A crystallization screening plate was set up using β-lactoglobulin B labeled with CR, trypsin labeled with CY, ConA labeled with each probe, and a mixture consisting of 50% PB-labeled ConA and 50% CR-labeled ConA. The wells of these plates were imaged using a commercially available macro-imaging lens attachment for smart devices that have a camera. Several types of macro lens attachments were tested with smartphones and tablets. Images with the highest quality were obtained with an iPhone 6S and an AUKEY Ora 10× macro lens. Depending upon the fluorescent probe employed and its Stokes shift, a light-emitting diode or a laser diode was used for excitation. An emission filter was used for the imaging of protein crystals labeled with CR and crystals with two-color fluorescence. This approach can also be used with microscopy systems commonly used to observe crystallization plates.


Subjects
Fluorescent Dyes/chemistry , Molecular Imaging/instrumentation , Molecular Imaging/methods , Proteins/chemistry , Color , Concanavalin A/chemistry , Costs and Cost Analysis , Crystallization , Equipment Design , Fluorescence , Lactoglobulins/chemistry , Molecular Imaging/economics , Rhodamines/chemistry , Smartphone/economics , Smartphone/instrumentation
8.
Article in English | MEDLINE | ID: mdl-26992178

ABSTRACT

In general, a single thresholding technique is developed or enhanced to separate foreground objects from background for a domain of images. This idea may not generate satisfactory results for all images in a dataset, since different images may require different types of thresholding methods for proper binarization or segmentation. To overcome this limitation, in this study, we propose a novel approach called "super-thresholding" that utilizes a supervised classifier to decide an appropriate thresholding method for a specific image. This method provides a generic framework that allows selection of the best thresholding method among different thresholding techniques that are beneficial for the problem domain. A classifier model is built using features extracted either a priori, from the original image only, or a posteriori, by analyzing the outputs of the thresholding methods together with the original image. This model is applied to identify the thresholding method for new images of the domain. We applied our method to protein crystallization images, and then we compared our results with six thresholding techniques. Numerical results are provided using four different correctness measurements. Super-thresholding outperforms the best single thresholding method by around 10 percent, and it gives the best performance for the protein crystallization dataset in our experiments.


Subjects
Crystallization/methods , Computer-Assisted Image Processing/methods , Proteins/chemistry , Supervised Machine Learning , Algorithms , Protein Databases
9.
BioData Min ; 10: 14, 2017.
Article in English | MEDLINE | ID: mdl-28465724

ABSTRACT

BACKGROUND: A large number of features are extracted from protein crystallization trial images to improve the accuracy of classifiers for predicting the presence of crystals or phases of the crystallization process. The excessive number of features and the computationally intensive image processing methods needed to extract them make automated classification tools inconvenient to use on stand-alone computing systems, because of the time required to complete the classification tasks. Combinations of image feature sets, feature reduction and classification techniques for crystallization images benefiting from trace fluorescence labeling are investigated. RESULTS: Features are categorized into intensity, graph, histogram, texture, shape adaptive, and region features (using binarized images generated by Otsu's, green percentile, and morphological thresholding). The effects of normalization, feature reduction with principal component analysis (PCA), and feature selection using a random forest classifier are also analyzed. The time required to extract feature categories is computed and an estimated time of extraction is provided for feature category combinations. We have conducted around 8624 experiments (different combinations of feature categories, binarization methods, feature reduction/selection, normalization, and crystal categories). The best experimental results are obtained using combinations of intensity features, region features using Otsu's thresholding, region features using green percentile G90 thresholding, region features using green percentile G99 thresholding, graph features, and histogram features. Using this feature set combination, 96% accuracy (without misclassifying crystals as non-crystals) was achieved for the first level of classification to determine presence of crystals. Since missing a crystal is not desired, our algorithm is adjusted to achieve a high sensitivity rate. In the second level of classification, 74.2% accuracy was achieved for (5-class) crystal sub-category classification. The best classification rates were achieved using the random forest classifier. CONTRIBUTIONS: The feature extraction and classification could be completed in about 2 s per image on a stand-alone computing system, which is suitable for real-time analysis. These results enable research groups to select features according to their hardware setups for real-time analysis.

10.
Prog Biophys Mol Biol ; 88(3): 359-86, 2005 Jul.
Article in English | MEDLINE | ID: mdl-15652250

ABSTRACT

The common goal for structural genomic centers and consortiums is to decipher as quickly as possible the three-dimensional structures for a multitude of recombinant proteins derived from known genomic sequences. Since X-ray crystallography is the foremost method to acquire atomic resolution for macromolecules, the limiting step is obtaining protein crystals that can be useful for structure determination. High-throughput methods have been developed in recent years to clone, express, purify, crystallize and determine the three-dimensional structure of a protein gene product rapidly using automated devices, commercialized kits and consolidated protocols. However, the average number of protein structures obtained for most structural genomic groups has been very low compared to the total number of proteins purified. As more entire genomic sequences are obtained for different organisms from the three kingdoms of life, only the proteins that can be crystallized and whose structures can be obtained easily are studied. Consequently, an astonishing number of genomic proteins remain unexamined. In the era of high-throughput processes, traditional methods in molecular biology, protein chemistry and crystallization are eclipsed by automation and pipeline practices. The necessity for high-rate production of protein crystals and structures has prevented the usage of more intellectual strategies and creative approaches in experimental executions. Fundamental principles and personal experiences in protein chemistry and crystallization are minimally exploited only to obtain "low-hanging fruit" protein structures. We review the practical aspects of today's high-throughput manipulations and discuss the challenges in fast-paced protein crystallization and tools for crystallography. Structural genomic pipelines can be improved with information gained from low-throughput tactics that may help us reach the higher-bearing fruits. Examples of recent developments in this area are reported from the efforts of the Southeast Collaboratory for Structural Genomics (SECSG).


Subjects
Crystallization/instrumentation , Crystallization/methods , X-Ray Crystallography/instrumentation , X-Ray Crystallography/methods , Proteins/chemistry , Proteins/ultrastructure , Multiprotein Complexes/analysis , Multiprotein Complexes/chemistry , Multiprotein Complexes/ultrastructure , Proteins/analysis , Systems Integration
11.
IEEE Trans Nanobioscience ; 15(2): 101-12, 2016 03.
Article in English | MEDLINE | ID: mdl-26955046

ABSTRACT

The goal of protein crystallization screening is the determination of the main factors of importance to crystallizing the protein under investigation. One of the major issues in determining these factors is that screening is often expanded to many hundreds or thousands of conditions to maximize coverage of combinatorial chemical space and thus the chances of a successful (crystalline) outcome. In this paper, we propose an experimental design method called "Associative Experimental Design (AED)" and an optimization method that includes eliminating prohibited combinations and prioritizing reagents based on AED analysis of results from protein crystallization experiments. AED generates candidate cocktails based on these initial screening results. These results are analyzed to determine those screening factors in chemical space that are most likely to lead to higher scoring outcomes, crystals. We have tested AED on three proteins derived from the hyperthermophile Thermococcus thioreducens, and we applied an optimization method to these proteins. Our AED method generated novel cocktails (count provided in parentheses) leading to crystals for three proteins as follows: Nucleoside diphosphate kinase (4), HAD superfamily hydrolase (2), Nucleoside kinase (1). After getting promising results, we have tested our optimization method on four different proteins. The AED method with optimization yielded 4, 3, and 20 crystalline conditions for holo Human Transferrin, archaeal exosome protein, and Nucleoside diphosphate kinase, respectively.


Subjects
Computational Biology/methods , Crystallization/methods , Protein Conformation , Proteins/chemistry , Algorithms , Research Design
12.
Article in English | MEDLINE | ID: mdl-27045831

ABSTRACT

Automated image analysis of microscopic images such as protein crystallization images and cellular images is an important research area. If objects in a scene appear at different depths with respect to the camera's focal point, objects outside the depth of field usually appear blurred. Therefore, scientists capture a collection of images with different depths of field. Focal stacking is a technique of creating a single focused image from a stack of images collected with different depths of field. In this paper, we introduce a novel focal stacking technique, FocusALL, which is based on our modified Harris Corner Response Measure. We also propose enhanced FocusALL for application on images collected under high resolution and varying illumination. FocusALL resolves problems related to the assumption that focus regions have high contrast and high intensity. In particular, FocusALL generates sharper boundaries around protein crystal regions and good in-focus images for high-resolution images in reasonable time. FocusALL outperforms other methods on protein crystallization images and performs comparably well on other datasets such as retinal epithelial images and simulated datasets.


Subjects
Computer-Assisted Image Processing/methods , Microscopy/methods , Algorithms , Cluster Analysis , Humans , Biological Models , Proteins/chemistry , Retinal Pigment Epithelium/diagnostic imaging
13.
Biochim Biophys Acta ; 1652(1): 52-63, 2003 Nov 03.
Article in English | MEDLINE | ID: mdl-14580996

ABSTRACT

The thermal stability of a recombinant alpha-amylase from Bacillus halmapalus (BHA) has been investigated using circular dichroism spectroscopy (CD) and differential scanning calorimetry (DSC). This alpha-amylase is homologous to other Bacillus alpha-amylases for which crystallographic studies have identified three calcium binding sites in the structure. Denaturation of BHA is irreversible with a Tm of approximately 89 °C, and DSC thermograms can be described using a one-step irreversible model. A 5 °C increase in Tm was observed in the presence of 10-fold excess CaCl2. However, a concomitant increase in the tendency to aggregate was also observed. The presence of 30-40-fold excess calcium chelator (ethylenediaminetetraacetic acid (EDTA) or ethylene glycol-bis[beta-aminoethyl ether] N,N,N',N'-tetraacetic acid (EGTA)) results in a large destabilization of BHA, corresponding to a Tm about 40 °C lower as determined by both CD and DSC. Ten-fold excess EGTA reveals complex DSC thermograms corresponding to both reversible and irreversible transitions, which probably originate from different populations of BHA/calcium complexes. Combined interpretation of these observations and structural information on homologous alpha-amylases forms the basis for a suggested mechanism for the inactivation of BHA. The mechanism includes irreversible thermal denaturation of different BHA/calcium complexes and the calcium binding equilibria. Furthermore, the model accounts for a temperature-induced reversible structural change associated with calcium binding.


Subjects
Bacillus/enzymology , Calcium Chloride/pharmacology , alpha-Amylases/chemistry , Binding Sites , Calcium Chloride/chemistry , Calcium Chloride/metabolism , Differential Scanning Calorimetry , Chelating Agents/chemistry , Chelating Agents/pharmacology , Circular Dichroism , Edetic Acid/chemistry , Edetic Acid/pharmacology , Egtazic Acid/chemistry , Egtazic Acid/pharmacology , Enzyme Stability , Hot Temperature , Kinetics , Protein Denaturation , Recombinant Proteins/chemistry , Recombinant Proteins/genetics , Thermodynamics , alpha-Amylases/genetics
14.
Acta Crystallogr F Struct Biol Commun ; 71(Pt 2): 121-31, 2015 Feb.
Article in English | MEDLINE | ID: mdl-25664782

ABSTRACT

Successful protein crystallization screening experiments are dependent upon the experimenter being able to identify positive outcomes. The introduction of fluorescence techniques has brought a powerful and versatile tool to the aid of the crystal grower. Trace fluorescent labeling, in which a fluorescent probe is covalently bound to a subpopulation (<0.5%) of the protein, enables the use of visible fluorescence. Alternatively, one can avoid covalent modification and use UV fluorescence, exploiting the intrinsic fluorescent amino acids present in most proteins. By the use of these techniques, crystals that had previously been obscured in the crystallization drop can readily be identified and distinguished from amorphous precipitate or salt crystals. Additionally, lead conditions that may not have been obvious as such under white-light illumination can be identified. In all cases review of the screening plate is considerably accelerated, as the eye can quickly note objects of increased intensity.


Subjects
Crystallization/methods , Proteins/chemistry , Fluorescence Spectrometry/methods , Animals , Crystallization/instrumentation , Fluorescent Dyes/chemistry , Humans
15.
Cryst Growth Des ; 15(11): 5254-5262, 2015.
Article in English | MEDLINE | ID: mdl-26640418

ABSTRACT

Thousands of experiments corresponding to different combinations of conditions are set up to determine the relevant conditions for successful protein crystallization. In recent years, high-throughput robotic set-ups have been developed to automate the protein crystallization experiments, and imaging techniques are used to monitor the crystallization progress. Images are collected multiple times during the course of an experiment. The huge number of collected images makes manual review tedious and discouraging. In this paper, utilizing trace fluorescence labeling, we describe an automated system called CrystPro for monitoring protein crystal growth in crystallization trial images by analyzing the time sequence images. Given the sets of image sequences, the objective is to develop an efficient and reliable system to detect crystal growth changes such as new crystal formation and increase of crystal size. CrystPro consists of three major steps: identification of crystallization trials proper for spatio-temporal analysis, spatio-temporal analysis of identified trials, and crystal growth analysis. We evaluated the performance of our system on 3 crystallization image datasets (PCP-ILopt-11, PCP-ILopt-12, and PCP-ILopt-13) and compared our results with expert scores. Our results indicate a) 98.3% accuracy and 0.896 sensitivity on identification of trials for spatio-temporal analysis, b) 77.4% accuracy and 0.986 sensitivity on identifying crystal pairs with new crystal formation, and c) 85.8% accuracy and 0.667 sensitivity on crystal size increase detection. The results show that our method is reliable and efficient for tracking growth of crystals and determining useful image sequences for further review by the crystallographers.

16.
Acta Crystallogr F Struct Biol Commun ; 71(Pt 7): 806-14, 2015 Jul.
Article in English | MEDLINE | ID: mdl-26144224

ABSTRACT

Fluorescence can be a powerful tool to aid in the crystallization of proteins. In the trace-labeling approach, the protein is covalently derivatized with a high-quantum-yield visible-wavelength fluorescent probe. The final probe concentration typically labels ≤0.20% of the protein molecules, which has been shown to not affect the crystal nucleation or diffraction quality. The labeled protein is then used in a plate-screening experiment in the usual manner. As the most densely packed state of the protein is the crystalline form, crystals show as the brightest objects in the well under fluorescent illumination. A study has been carried out on the effects of trace fluorescent labeling on the screening results obtained compared with nonlabeled protein, and it was found that, considering the stochastic nature of the crystal nucleation process, the presence of the probe did not affect the outcomes obtained. Other effects are realised when using fluorescence. Crystals are clearly seen even when buried in precipitate. This approach also finds 'hidden' leads, in the form of bright spots, with ∼30% of the leads found being optimized to crystals in a single-pass optimization trial. The use of visible fluorescence also enables the selection of colors that bypass interfering substances, and the screening materials do not have to be UV-transparent.


Subjects
Fluorescent Dyes/analysis , Plant Proteins/analysis , Staining and Labeling/methods , Crystallization/methods , Fluorescence Microscopy/methods , Plant Proteins/chemistry , Proteins/analysis , Proteins/chemistry
17.
Proc IEEE Southeastcon ; 2014, 2014 Mar.
Article in English | MEDLINE | ID: mdl-25983535

ABSTRACT

One of the difficulties for proper imaging in microscopic image analysis is defocusing. Microscopic images such as cellular images and protein images need to be properly focused for image analysis. A small difference in focal depth affects the details of an object significantly. In this paper, we introduce a novel auto-focusing approach based on the Harris Corner Response Measure (HCRM) and compare its performance with some existing auto-focusing methods. We perform our experiments on protein images as well as a simulated image stack to evaluate the performance of our method. Our results show that our HCRM-based technique outperforms other techniques.

18.
Proc IEEE Southeastcon ; 2014, 2014 Mar.
Article in English | MEDLINE | ID: mdl-25914518

ABSTRACT

In this paper, we investigate the performance of two wrapper methods for semi-supervised learning algorithms for classification of protein crystallization images with limited labeled images. Firstly, we evaluate the performance of a semi-supervised approach using self-training with naïve Bayesian (NB) and sequential minimal optimization (SMO) as the base classifiers. The confidence values returned by these classifiers are used to select high-confidence predictions to be used for self-training. Secondly, we analyze the performance of Yet Another Two Stage Idea (YATSI) semi-supervised learning using NB, SMO, multilayer perceptron (MLP), J48 and random forest (RF) classifiers. These results are compared with basic supervised learning using the same training sets. We perform our experiments on a dataset consisting of 2250 protein crystallization images for different proportions of training and test data. Our results indicate that NB and SMO using both self-training and YATSI semi-supervised approaches improve accuracies with respect to supervised learning. On the other hand, MLP, J48 and RF perform better using basic supervised learning. Overall, the random forest classifier yields the best accuracy with supervised learning for our dataset.

19.
Proc IEEE Southeastcon ; 2014, 2014 Mar.
Article in English | MEDLINE | ID: mdl-25914519

ABSTRACT

In this paper, we investigate the performance of classification of protein crystallization images captured during protein crystal growth process. We group protein crystallization images into 3 categories: noncrystals, likely leads (conditions that may yield formation of crystals) and crystals. In this research, we only consider the subcategories of noncrystal and likely leads protein crystallization images separately. We use 5 different classifiers to solve this problem and we applied some data preprocessing methods such as principal component analysis (PCA), min-max (MM) normalization and z-score (ZS) normalization methods to our datasets in order to evaluate their effects on classifiers for the noncrystal and likely leads datasets. We performed our experiments on 1606 noncrystal and 245 likely leads images independently. We had satisfactory results for both datasets. We reached 96.8% accuracy for noncrystal dataset and 94.8% accuracy for likely leads dataset. Our target is to investigate the best classifiers with optimal preprocessing techniques on both noncrystal and likely leads datasets.
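The two normalization schemes named in this abstract, min-max (MM) and z-score (ZS), follow standard formulas, sketched below for a single feature column. The function names and the handling of constant columns (returning zeros) are illustrative assumptions; the abstract does not say how the authors implemented these steps, and the population standard deviation is chosen here only as one common convention.

```python
import statistics


def min_max_normalize(column):
    """Rescale a feature column linearly to the range [0, 1]."""
    lowest, highest = min(column), max(column)
    if highest == lowest:
        # Constant feature: no spread to rescale (assumed convention).
        return [0.0 for _ in column]
    return [(value - lowest) / (highest - lowest) for value in column]


def z_score_normalize(column):
    """Shift a feature column to zero mean and scale it to unit variance."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)  # population standard deviation
    if std == 0:
        # Constant feature: every value equals the mean (assumed convention).
        return [0.0 for _ in column]
    return [(value - mean) / std for value in column]
```

In practice the minimum, maximum, mean, and standard deviation would be computed on the training split only and then reused for the test split, so that no information leaks from test images into preprocessing.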

20.
Cryst Growth Des ; 13(7): 2728-2736, 2013 Jul 03.
Article in English | MEDLINE | ID: mdl-24532991

ABSTRACT

In this paper, we describe the design and implementation of a stand-alone real-time system for protein crystallization image acquisition and classification with a goal to assist crystallographers in scoring crystallization trials. An in-house assembled fluorescence microscopy system is built for image acquisition. The images are classified into three categories: non-crystals, likely leads, and crystals. Image classification consists of two main steps: image feature extraction and classification based on multilayer perceptron (MLP) neural networks. Our feature extraction involves applying multiple thresholding techniques, identifying high intensity regions (blobs), and generating intensity and blob features to obtain a 45-dimensional feature vector per image. To reduce the risk of missing crystals, we introduce a max-class ensemble classifier which applies multiple classifiers and chooses the highest score (or class). We performed our experiments on 2250 images consisting of 67% non-crystal, 18% likely leads, and 15% clear crystal images and tested our results using 10-fold cross validation. Our results demonstrate that the method is very efficient (< 3 seconds to process and classify an image) and has comparatively high accuracy. Our system only misses 1.2% of the crystals (classified as non-crystals), most likely due to low illumination or out-of-focus image capture, and has an overall accuracy of 88%.
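The max-class ensemble rule mentioned in this abstract can be sketched in a few lines. This is a hedged illustration, not the authors' code: the function name, the label strings, and the assumed ordering of classes (non-crystal lowest, crystal highest) are introduced here for the example. Each base classifier votes with a class label, and the ensemble returns the highest-ranked label among the votes, which biases the result toward not missing crystals.

```python
# Classes in ascending order of interest (assumed ordering).
CLASSES = ("non-crystal", "likely lead", "crystal")
_RANK = {name: index for index, name in enumerate(CLASSES)}


def max_class_ensemble(predictions):
    """Return the highest-ranked class among the base classifiers' predictions.

    predictions -- non-empty iterable of labels drawn from CLASSES,
                   one label per base classifier
    """
    # A single "crystal" vote is enough for the ensemble to report a crystal.
    return max(predictions, key=_RANK.__getitem__)
```

The trade-off of this rule is deliberate: it accepts more false alarms (non-crystals reported as leads or crystals) in exchange for fewer missed crystals, which matches the stated goal of reducing the risk of missing crystals.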
