Results 1 - 6 of 6
1.
IEEE Trans Pattern Anal Mach Intell ; 41(12): 3057-3070, 2019 Dec.
Article in English | MEDLINE | ID: mdl-30371353

ABSTRACT

Sampling is an important and effective strategy in analyzing "big data," whereby a smaller subset of a dataset is used to estimate the characteristics of its entire population. The main goal in sampling is often to achieve a significant gain in the computational time. However, a major obstacle towards this goal is the assessment of the smallest sample size needed to ensure, with a high probability, a faithful representation of the entire dataset, especially when the dataset is composed of a large number of diverse structures (e.g., clusters). To address this problem, we propose a method referred to as the Sparse Withdrawal of Inliers in a First Trial (SWIFT) that determines the smallest sample size of a subset of a dataset sampled in one grab, with the guarantee that the subset provides a sufficient number of samples from each of the underlying structures necessary for the discovery and inference. The latter is established with high probability, and the lower bound of the smallest sample size depends on probabilistic guarantees. In addition, we derive an upper bound on the smallest sample size that allows for detection of the structures and show that the two bounds are very close to each other in a variety of scenarios. We show that the problem can be modeled using either a hypergeometric or a multinomial probability mass function (pmf), and derive accurate mathematical bounds to determine a tight approximation to the sample size, thus leading to a sparse sampling strategy. The key features of the proposed method are: (i) sparseness of the sampled subset for analyzing data, where the level of sparseness is independent of the population size; (ii) no prior knowledge of the distribution of data, or the number of underlying structures in the data; and (iii) robustness in the presence of an overwhelming number of outliers.
We evaluate the method thoroughly in terms of accuracy, its behavior against different parameters, and its effectiveness in reducing the computational cost in various applications of computer vision, such as subspace clustering and structure from motion.
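
The hypergeometric model mentioned in this abstract can be illustrated with a brute-force search. The sketch below is not the SWIFT bound itself, which the abstract does not spell out; it assumes a single structure with `K` members in a population of `N`, and the function names and parameters are hypothetical. It finds the smallest one-grab sample size `n` for which at least `k` members of the structure are drawn with probability at least `p`.

```python
from math import comb


def prob_at_least(N, K, n, k):
    """P(X >= k) for X ~ Hypergeometric(N, K, n): draw n items without
    replacement from N items, K of which belong to the structure."""
    total = comb(N, n)
    # Sum P(X = i) for i < k. math.comb(a, b) returns 0 when b > a,
    # so impossible combinations contribute nothing.
    below = sum(comb(K, i) * comb(N - K, n - i) for i in range(min(k, n + 1)))
    return 1 - below / total


def smallest_sample_size(N, K, k, p):
    """Smallest n with P(X >= k) >= p, or None if no n <= N achieves it.
    Illustrative linear search, not the closed-form bounds of the paper."""
    for n in range(k, N + 1):
        if prob_at_least(N, K, n, k) >= p:
            return n
    return None
```

For example, `smallest_sample_size(10, 5, 1, 0.9)` asks how many of 10 items must be drawn so that a 5-member structure is hit at least once with probability 0.9.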

2.
J Nanobiotechnology ; 9: 20, 2011 May 23.
Article in English | MEDLINE | ID: mdl-21605409

ABSTRACT

BACKGROUND: Gold nanoparticles (AuNPs) scatter light intensely at or near their surface plasmon wavelength region. Using AuNPs coupled with dynamic light scattering (DLS) detection, we developed a facile nanoparticle immunoassay for serum protein biomarker detection and analysis. A serum sample was first mixed with a citrate-protected AuNP solution. Proteins from the serum were adsorbed to the AuNPs to form a protein corona on the nanoparticle surface. An antibody solution was then added to the assay solution to analyze the target proteins of interest that are present in the protein corona. The protein corona formation and the subsequent binding of antibody to the target proteins in the protein corona were detected by DLS. RESULTS: Using this simple assay, we discovered multiple molecular aberrations associated with prostate cancer from both mouse and human blood serum samples. From the mouse serum study, we observed differences in the size of the protein corona and mouse IgG level between different groups of mice (i.e., mice with aggressive or less aggressive prostate cancer, and normal healthy controls). Furthermore, it was found from both the mouse model and the human serum sample study that the level of vascular endothelial growth factor (VEGF, a protein that is associated with tumor angiogenesis) adsorbed to the AuNPs is decreased in cancer samples compared to non-cancerous or less malignant cancer samples. CONCLUSION: The molecular aberrations observed from this study may become new biomarkers for prostate cancer detection. The nanoparticle immunoassay reported here can be used as a convenient and general tool to screen and analyze serum proteins and to discover new biomarkers associated with cancer and other human diseases.


Subjects
Biomarkers, Tumor/blood , Gold/chemistry , Immunoassay , Metal Nanoparticles/chemistry , Prostatic Neoplasms/diagnosis , Animals , Humans , Male , Mice , Vascular Endothelial Growth Factors/blood
3.
Anal Quant Cytol Histol ; 32(5): 280-90, 2010 Oct.
Article in English | MEDLINE | ID: mdl-22043504

ABSTRACT

OBJECTIVE: To distinguish untreated lung cancer cells from normal cells through quantitative analysis and statistical inference of centrosomal features extracted from cell images. STUDY DESIGN: Recent research indicates that human cancer cell development is accompanied by centrosomal abnormalities. For quantitative analysis of centrosome abnormalities, high-resolution images of normal and untreated cancer lung cells were acquired. After the images were preprocessed and segmented, 11 features were extracted. Correlations among the features were calculated to remove redundant features. Ten nonredundant features were selected for further analysis. The mean values of the 10 centrosome features were compared between cancer and normal cells by the two-sample t-test; distributions of the 10 features of cancer and normal centrosomes were compared by the two-sample Kolmogorov-Smirnov test. RESULTS: Both tests reject the null hypothesis that the means and distributions of features coincide for normal and cancer cells. The 10 centrosome features separate normal from cancer cells at the 5% significance level and show strong evidence that all 10 features exhibit major differences between normal and cancer cells. CONCLUSION: Centrosomes from untreated cancer and normal bronchial epithelial cells can be distinguished through objective measurement and quantitative analysis, suggesting a new approach for lung cancer detection, early diagnosis and prognosis.
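
The two comparisons named in this abstract, one on means and one on whole distributions, can be sketched per feature as follows. This is a minimal illustration in pure Python, not the authors' pipeline: the abstract does not say whether a pooled or unequal-variance t-test was used, so the Welch (unequal-variance) form is an assumption, and only the test statistics are computed here. Obtaining p-values would require the corresponding reference distributions, for example from a statistics library.

```python
from bisect import bisect_right
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Two-sample t statistic with unequal variances (Welch form, assumed)."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest absolute gap
    between the empirical distribution functions of the two samples."""
    sa, sb = sorted(a), sorted(b)
    d = 0.0
    for x in sa + sb:
        # Empirical CDF values at x: fraction of each sample that is <= x.
        fa = bisect_right(sa, x) / len(sa)
        fb = bisect_right(sb, x) / len(sb)
        d = max(d, abs(fa - fb))
    return d
```

A large absolute `welch_t` points to a difference in means, while `ks_statistic` close to 1 indicates that the two samples barely overlap in distribution.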


Subjects
Centrosome , Lung Neoplasms , Epithelial Cells , Humans , Lung , Prognosis
4.
J Physiol Paris ; 102(4-6): 304-21, 2008.
Article in English | MEDLINE | ID: mdl-18984042

ABSTRACT

Evolutionary studies of communication can benefit from classification procedures that allow individual animals to be assigned to groups (e.g. species) on the basis of high-dimensional data representing their signals. Prior to classification, signals are usually transformed by a signal processing procedure into structural features. Applications of these signal processing procedures to animal communication have been largely restricted to the manual or semi-automated identification of landmark features from graphical representations of signals. Nonetheless, theory predicts that automated time-frequency-based digital signal processing (DSP) procedures can represent signals more efficiently (using fewer features) than can landmark procedures or frequency-based DSP, allowing more accurate classification. Moreover, DSP procedures are objective in that they require little previous knowledge of signal diversity, and are relatively free from potentially ungrounded assumptions of cross-taxon homology. Using a model data set of electric organ discharge waveforms from five sympatric species of the electric fish Gymnotus, we adopted an exhaustive simulation approach to investigate the classificatory performance of different signal processing procedures. We considered a landmark procedure, a frequency-based DSP procedure (the fast Fourier transform), and two kinds of time-frequency-based DSP procedures (a short-time Fourier transform, and several implementations of the discrete wavelet transform, DWT). The features derived from each of these signal processing procedures were then subjected to dimension reduction procedures to separate those features which permit the most effective discrimination among groups of signalers. We considered four alternative dimension reduction methods. Finally, each combination of reduced data was submitted to classification by linear discriminant analysis.
Our results support theoretical predictions that time-frequency DSP procedures (especially DWT) permit more efficient discrimination of groups. The performance of signal processing was found to depend largely upon the dimension reduction procedure employed, and upon the number of resulting features. Because the best combinations of procedures are dataset-dependent and difficult to predict, we conclude that simulations of the kind described here, or at least simplified versions of them, should be routinely executed before classification of animal signals, especially unfamiliar ones.
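
To make the DWT feature-extraction step concrete, here is a minimal sketch of a full Haar wavelet decomposition in pure Python. The abstract does not say which wavelet families were tested, so the choice of the Haar wavelet is an assumption made only because it is the simplest DWT; the function names are hypothetical. The resulting coefficients would play the role of the "features" that are later passed to dimension reduction and classification.

```python
from math import sqrt


def haar_step(signal):
    """One Haar analysis level: scaled pairwise sums (approximation)
    and scaled pairwise differences (detail)."""
    s = 1 / sqrt(2)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_dwt(signal):
    """Full Haar decomposition of a signal whose length is a power of two.
    Returns [final approximation, coarsest details, ..., finest details]."""
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    approx = list(signal)
    details = []
    while len(approx) > 1:
        approx, detail = haar_step(approx)
        details = detail + details  # coarser levels go in front
    return approx + details
```

Because the transform is orthonormal, the sum of squared coefficients equals the sum of squared samples, which gives a quick sanity check on any waveform passed through `haar_dwt`.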


Subjects
Animal Communication , Electric Fish/physiology , Multivariate Analysis , Signal Processing, Computer-Assisted , Animals , Computer Simulation , Models, Biological
5.
BMC Bioinformatics ; 9: 415, 2008 Oct 06.
Article in English | MEDLINE | ID: mdl-18837969

ABSTRACT

BACKGROUND: Gene expression levels in a given cell can be influenced by different factors, namely pharmacological or medical treatments. The response to a given stimulus is usually different for different genes and may depend on time. One of the goals of modern molecular biology is the high-throughput identification of genes associated with a particular treatment or a biological process of interest. From a methodological and computational point of view, analyzing high-dimensional time course microarray data requires a very specific set of tools which are usually not included in standard software packages. Recently, the authors of this paper developed a fully Bayesian approach which allows one to identify differentially expressed genes in a 'one-sample' time-course microarray experiment, to rank them and to estimate their expression profiles. The method is based on explicit expressions for calculations and is, hence, very computationally efficient. RESULTS: The software package BATS (Bayesian Analysis of Time Series) presented here implements the methodology described above. It allows a user to automatically identify and rank differentially expressed genes and to estimate their expression profiles when at least 5-6 time points are available. The package has a user-friendly interface. BATS successfully manages various technical difficulties which arise in time-course microarray experiments, such as a small number of observations, non-uniform sampling intervals and replicated or missing data. CONCLUSION: BATS is free, user-friendly software for the analysis of both simulated and real microarray time course experiments. The software, the user manual and a brief illustrative example are freely available online at the BATS website: http://www.na.iac.cnr.it/bats.


Subjects
Bayes Theorem , Oligonucleotide Array Sequence Analysis/methods , User-Computer Interface , Algorithms , Gene Expression , Gene Expression Profiling/methods , Humans , Research Design , Sample Size , Time Factors
6.
Stat Appl Genet Mol Biol ; 6: Article24, 2007.
Article in English | MEDLINE | ID: mdl-17910530

ABSTRACT

The objective of the present paper is to develop a truly functional Bayesian method specifically designed for time series microarray data. The method allows one to identify differentially expressed genes in a time-course microarray experiment, to rank them and to estimate their expression profiles. Each gene expression profile is modeled as an expansion over some orthonormal basis, where the coefficients and the number of basis functions are estimated from the data. The proposed procedure deals successfully with various technical difficulties that arise in typical microarray experiments such as a small number of observations, non-uniform sampling intervals and missing or replicated data. The procedure allows one to account for various types of errors and offers a good compromise between nonparametric techniques and techniques based on normality assumptions. In addition, all evaluations are performed using analytic expressions, so the entire procedure requires very small computational effort. The procedure is studied using both simulated and real data, and is compared with competitive recent approaches. Finally, the procedure is applied to a case study of a human breast cancer cell line stimulated with estrogen. We succeeded in finding new significant genes that were not marked in an earlier work on the same dataset.
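
The idea of modeling an expression profile as an expansion over an orthonormal basis can be sketched as follows. This is not the Bayesian estimator of the paper: the sketch assumes a normalized Legendre basis on [-1, 1], time points already rescaled to that interval, a fixed number of basis functions, and plain least squares in place of posterior estimation. All function names are hypothetical. It does handle non-uniform sampling times, since the fit uses the basis values at whatever times were observed.

```python
from math import sqrt


def legendre_basis(t, degree):
    """Orthonormal Legendre functions phi_0..phi_degree at t in [-1, 1],
    built from the three-term recurrence and scaled by sqrt((2n + 1) / 2)."""
    p = [1.0, t]
    for n in range(1, degree):
        p.append(((2 * n + 1) * t * p[n] - n * p[n - 1]) / (n + 1))
    return [sqrt((2 * n + 1) / 2) * p[n] for n in range(degree + 1)]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[pivot] = M[pivot], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def fit_profile(times, values, degree):
    """Least-squares coefficients of the expansion sum_n c_n * phi_n(t),
    obtained from the normal equations (assumes more points than terms)."""
    rows = [legendre_basis(t, degree) for t in times]
    m = degree + 1
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(m)] for i in range(m)]
    Aty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(m)]
    return solve(AtA, Aty)
```

In the paper the number of basis functions is itself estimated from the data; here `degree` is simply an input, which is the main simplification of this sketch.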


Subjects
Algorithms , Bayes Theorem , Oligonucleotide Array Sequence Analysis/methods , Breast Neoplasms , Cell Line, Tumor , Computer Simulation , Estradiol/pharmacology , Estrogens/pharmacology , Female , Gene Expression Profiling , Humans , Models, Statistical