Search | BVS Infectious and Parasitic Diseases

Dynamic Asset Allocation with Expected Shortfall via Quantum Annealing.

Xu, Hanjing; Dasgupta, Samudra; Pothen, Alex; Banerjee, Arnab.

Entropy (Basel) ; 25(3), 2023 Mar 21.

Article in English | MEDLINE | ID: mdl-36981429

ABSTRACT

Recent advances in quantum hardware offer new approaches to solve various optimization problems that can be computationally expensive when classical algorithms are employed. We propose a hybrid quantum-classical algorithm to solve a dynamic asset allocation problem where a target return and a target risk metric (expected shortfall) are specified. We propose an iterative algorithm that treats the target return as a constraint in a Markowitz portfolio optimization model, and dynamically adjusts the target return to satisfy the targeted expected shortfall. The Markowitz optimization is formulated as a Quadratic Unconstrained Binary Optimization (QUBO) problem. The use of the expected shortfall risk metric enables the modeling of extreme market events. We compare the results from D-Wave's 2000Q and Advantage quantum annealers using real-world financial data. Both quantum annealers are able to generate portfolios with more than 80% of the return of the classical optimal solutions, while satisfying the expected shortfall. We observe that experiments on assets with higher correlations tend to perform better, which may help to design practical quantum applications in the near term.
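
The QUBO formulation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the authors' encoding, so everything here is an assumption: each binary variable marks whether an asset is included, the return target is enforced with a quadratic penalty of weight `penalty`, and a brute-force search stands in for the quantum annealer. The function names are hypothetical.

```python
from itertools import product


def markowitz_qubo(cov, mu, target_return, penalty):
    """Build an upper-triangular QUBO matrix for
    x^T cov x + penalty * (mu . x - target_return)^2 with binary x.
    The constant term penalty * target_return^2 is dropped."""
    n = len(mu)
    Q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Because x_i^2 == x_i, the linear penalty terms fold into the diagonal.
        Q[i][i] = cov[i][i] + penalty * (mu[i] ** 2 - 2 * target_return * mu[i])
        for j in range(i + 1, n):
            # Off-diagonal entries carry both symmetric halves (i, j) and (j, i).
            Q[i][j] = 2 * cov[i][j] + 2 * penalty * mu[i] * mu[j]
    return Q


def qubo_energy(Q, x):
    """Evaluate x^T Q x for an upper-triangular Q and a 0/1 vector x."""
    n = len(x)
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(i, n))


def brute_force_minimum(Q):
    """Exhaustive search over all bit strings (classical stand-in, small n only)."""
    return min(product((0, 1), repeat=len(Q)), key=lambda x: qubo_energy(Q, x))
```

The penalty weight has to be large enough that violating the return target costs more than any achievable reduction in risk; otherwise the minimizer can ignore the constraint. The iterative adjustment of the target return to meet the expected-shortfall target is not shown.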

flowVS: channel-specific variance stabilization in flow cytometry.

Azad, Ariful; Rajwa, Bartek; Pothen, Alex.

BMC Bioinformatics ; 17: 291, 2016 Jul 28.

Article in English | MEDLINE | ID: mdl-27465477

ABSTRACT

BACKGROUND: Comparing phenotypes of heterogeneous cell populations from multiple biological conditions is at the heart of scientific discovery based on flow cytometry (FC). When the biological signal is measured by the average expression of a biomarker, standard statistical methods require that variance be approximately stabilized in populations to be compared. Since the mean and variance of a cell population are often correlated in fluorescence-based FC measurements, a preprocessing step is needed to stabilize the within-population variances. RESULTS: We present a variance-stabilization algorithm, called flowVS, that removes the mean-variance correlations from cell populations identified in each fluorescence channel. flowVS transforms each channel from all samples of a data set by the inverse hyperbolic sine (asinh) transformation. For each channel, the parameters of the transformation are optimally selected by Bartlett's likelihood-ratio test so that the populations attain homogeneous variances. The optimum parameters are then used to transform the corresponding channels in every sample. flowVS is therefore an explicit variance-stabilization method that stabilizes within-population variances in each channel by evaluating the homoskedasticity of clusters with a likelihood-ratio test. With two publicly available datasets, we show that flowVS removes the mean-variance dependence from raw FC data and makes the within-population variance relatively homogeneous. We demonstrate that alternative transformation techniques such as flowTrans, flowScape, logicle, and FCSTrans might not stabilize variance. Besides flow cytometry, flowVS can also be applied to stabilize variance in microarray data. With a publicly available data set we demonstrate that flowVS performs as well as the VSN software, a state-of-the-art approach developed for microarrays. 
CONCLUSIONS: The homogeneity of variance in cell populations across FC samples is desirable when extracting features uniformly and comparing cell populations with different levels of marker expressions. The newly developed flowVS algorithm solves the variance-stabilization problem in FC and microarrays by optimally transforming data with the help of Bartlett's likelihood-ratio test. On two publicly available FC datasets, flowVS stabilizes within-population variances more evenly than the available transformation and normalization techniques. flowVS-based variance stabilization can help in performing comparison and alignment of phenotypically identical cell populations across different samples. flowVS and the datasets used in this paper are publicly available in Bioconductor.
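
The idea of choosing an asinh parameter by Bartlett's test can be sketched in a few lines. This is not the flowVS package API (flowVS is distributed through Bioconductor); it is an illustrative sketch that assumes the cell populations in one channel have already been identified, that the transformation has the form asinh(x / c) with a single cofactor c, and that candidate cofactors come from a user-supplied grid. The function names are hypothetical.

```python
import math


def bartlett_statistic(groups):
    """Bartlett's test statistic for equality of variances across groups.
    Assumes every group has at least two values and nonzero variance."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    total = sum(sizes)
    variances = []
    for g in groups:
        mean = sum(g) / len(g)
        variances.append(sum((v - mean) ** 2 for v in g) / (len(g) - 1))
    pooled = sum((n - 1) * s for n, s in zip(sizes, variances)) / (total - k)
    numerator = (total - k) * math.log(pooled) - sum(
        (n - 1) * math.log(s) for n, s in zip(sizes, variances)
    )
    correction = 1 + (sum(1 / (n - 1) for n in sizes) - 1 / (total - k)) / (3 * (k - 1))
    return numerator / correction


def best_cofactor(populations, cofactors):
    """Return the cofactor c whose asinh(x / c) transform gives the smallest
    Bartlett statistic, i.e. the most homogeneous within-population variances."""
    def score(c):
        return bartlett_statistic([[math.asinh(v / c) for v in pop] for pop in populations])
    return min(cofactors, key=score)
```

A smaller statistic means the transformed populations have more similar variances, so the grid search picks the cofactor that best removes the mean-variance dependence among the candidates tried.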

Subjects

Algorithms; Flow Cytometry; Analysis of Variance; Antigens, CD/metabolism; Humans; Lymphocytes/cytology; Lymphocytes/metabolism

A novel statistical methodology for quantifying the spatial arrangements of axons in peripheral nerves.

Shemonti, Abida Sanjana; Plebani, Emanuele; Biscola, Natalia P; Jaffey, Deborah M; Havton, Leif A; Keast, Janet R; Pothen, Alex; Dundar, M Murat; Powley, Terry L; Rajwa, Bartek.

Front Neurosci ; 17: 1072779, 2023.

Article in English | MEDLINE | ID: mdl-36968498

ABSTRACT

A thorough understanding of the neuroanatomy of peripheral nerves is required for a better insight into their function and the development of neuromodulation tools and strategies. In biophysical modeling, it is commonly assumed that the complex spatial arrangement of myelinated and unmyelinated axons in peripheral nerves is random; however, in reality the axonal organization is inhomogeneous and anisotropic. Present quantitative neuroanatomy methods analyze peripheral nerves in terms of the number of axons and the morphometric characteristics of the axons, such as area and diameter. In this study, we employed spatial statistics and point process models to describe the spatial arrangement of axons and Sinkhorn distances to compute the similarities between these arrangements (in terms of first- and second-order statistics) in various vagus and pelvic nerve cross-sections. We utilized high-resolution transmission electron microscopy (TEM) images that have been segmented using a custom-built high-throughput deep learning system based on a highly modified U-Net architecture. Our findings show a novel approach to quantifying similarities between spatial point patterns using metrics derived from the solution to the optimal transport problem. We also present a generalizable pipeline for quantitative analysis of peripheral nerve architecture. Our data demonstrate differences between male- and female-originating samples and similarities between the pelvic and abdominal vagus nerves.
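
A Sinkhorn distance is an entropy-regularized optimal transport cost, and the basic iteration is short enough to sketch. The study compares first- and second-order statistics of axon patterns; the sketch below instead applies the iteration directly to raw point coordinates with uniform weights and a squared Euclidean cost, which is an assumption made only to keep the example small. The function name and the default `epsilon` and `iterations` values are hypothetical.

```python
import math


def sinkhorn_distance(points_a, points_b, epsilon=0.1, iterations=200):
    """Entropy-regularized transport cost between two point sets with uniform
    weights. Coordinates should be scaled so that cost / epsilon stays moderate;
    very large values make exp() underflow to zero."""
    n, m = len(points_a), len(points_b)
    cost = [[math.dist(p, q) ** 2 for q in points_b] for p in points_a]
    kernel = [[math.exp(-c / epsilon) for c in row] for row in cost]
    a = [1.0 / n] * n  # source marginal
    b = [1.0 / m] * m  # target marginal
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iterations):
        # Alternately rescale rows and columns to match the two marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    # The transport plan is P[i][j] = u[i] * kernel[i][j] * v[j].
    return sum(u[i] * kernel[i][j] * v[j] * cost[i][j] for i in range(n) for j in range(m))
```

Because of the regularization, the value for two identical point sets is small but not exactly zero, and it approaches the unregularized optimal transport cost as `epsilon` shrinks.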

Matching phosphorylation response patterns of antigen-receptor-stimulated T cells via flow cytometry.

Azad, Ariful; Pyne, Saumyadipta; Pothen, Alex.

BMC Bioinformatics ; 13 Suppl 2: S10, 2012 Mar 13.

Article in English | MEDLINE | ID: mdl-22536861

ABSTRACT

BACKGROUND: When flow cytometric data on mixtures of cell populations are collected from samples under different experimental conditions, computational methods are needed (a) to classify the samples into similar groups, and (b) to characterize the changes within the corresponding populations due to the different conditions. Manual inspection has been used in the past to study such changes, but high-dimensional experiments necessitate developing new computational approaches to this problem. A robust solution to this problem is to construct distinct templates to summarize all samples from a class, and then to compare these templates to study the changes across classes or conditions. RESULTS: We designed a hierarchical algorithm, flowMatch, to first match the corresponding clusters across samples for producing robust meta-clusters, and to then construct a high-dimensional template as a collection of meta-clusters for each class of samples. We applied the algorithm on flow cytometry data obtained from human blood cells before and after stimulation with anti-CD3 monoclonal antibody, which is reported to change phosphorylation responses of memory and naive T cells. The flowMatch algorithm is able to construct representative templates from the samples before and after stimulation, and to match corresponding meta-clusters across templates. The templates of the pre-stimulation and post-stimulation data corresponding to memory and naive T cell populations clearly show, at the level of the meta-clusters, the overall phosphorylation shift due to the stimulation. CONCLUSIONS: We concisely represent each class of samples by a template consisting of a collection of meta-clusters (representative abstract populations). Using flowMatch, the meta-clusters across samples can be matched to assess overall differences among the samples of various phenotypes or time-points.
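
The step of matching clusters across two samples and merging them into meta-clusters can be sketched as follows. This is a simplified stand-in and not the flowMatch algorithm: it assumes each cluster is summarized by its centroid only, that both samples have the same number of clusters, and that a one-to-one matching with minimum total Euclidean distance is acceptable. The function names are hypothetical.

```python
import math
from itertools import permutations


def match_clusters(centroids_a, centroids_b):
    """Return index pairs (i, j) of the one-to-one matching that minimizes the
    total centroid distance. Brute force, so only suitable for a few clusters."""
    if len(centroids_a) != len(centroids_b):
        raise ValueError("this sketch requires the same number of clusters")
    n = len(centroids_a)
    best = min(
        permutations(range(n)),
        key=lambda perm: sum(
            math.dist(centroids_a[i], centroids_b[perm[i]]) for i in range(n)
        ),
    )
    return list(enumerate(best))


def meta_clusters(centroids_a, centroids_b):
    """Average each matched pair of centroids into one meta-cluster centroid."""
    return [
        tuple((x + y) / 2 for x, y in zip(centroids_a[i], centroids_b[j]))
        for i, j in match_clusters(centroids_a, centroids_b)
    ]
```

A template in this simplified picture is just the list returned by `meta_clusters`; comparing a pre-stimulation template with a post-stimulation one then reduces to another call to `match_clusters`.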

Subjects

Algorithms; Flow Cytometry; Receptors, Antigen, T-Cell/metabolism; T-Lymphocytes/immunology; Humans; Phosphorylation

Physical and in silico approaches identify DNA-PK in a Tax DNA-damage response interactome.

Ramadan, Emad; Ward, Michael; Guo, Xin; Durkin, Sarah S; Sawyer, Adam; Vilela, Marcelo; Osgood, Christopher; Pothen, Alex; Semmes, Oliver J.

Retrovirology ; 5: 92, 2008 Oct 15.

Article in English | MEDLINE | ID: mdl-18922151

ABSTRACT

BACKGROUND: We have initiated an effort to exhaustively map interactions between HTLV-1 Tax and host cellular proteins. The resulting Tax interactome will have significant utility toward defining new and understanding known activities of this important viral protein. In addition, the completion of a full Tax interactome will also help shed light upon the functional consequences of these myriad Tax activities. The physical mapping process involved the affinity isolation of Tax complexes followed by sequence identification using tandem mass spectrometry. To date we have mapped 250 cellular components within this interactome. Here we present our approach to prioritizing these interactions via an in silico culling process. RESULTS: We first constructed an in silico Tax interactome comprised of 46 literature-confirmed protein-protein interactions. This number was then reduced to four Tax-interactions suspected to play a role in DNA damage response (Rad51, TOP1, Chk2, 53BP1). The first-neighbor and second-neighbor interactions of these four proteins were assembled from available human protein interaction databases. Through an analysis of betweenness and closeness centrality measures, and numbers of interactions, we ranked proteins in the first neighborhood. When this rank list was compared to the list of physical Tax-binding proteins, DNA-PK was the highest ranked protein common to both lists. An overlapping clustering of the Tax-specific second-neighborhood protein network showed DNA-PK to be one of three bridge proteins that link multiple clusters in the DNA damage response network. CONCLUSION: The interaction of Tax with DNA-PK represents an important biological paradigm as suggested via consensus findings in vivo and in silico. We present this methodology as an approach to discovery and as a means of validating components of a consensus Tax interactome.
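
Ranking proteins by a centrality measure and by number of interactions can be illustrated on a toy graph. The sketch below covers closeness centrality and degree only (betweenness is omitted), assumes an unweighted, undirected network stored as an adjacency dictionary, and uses hypothetical function names. It is not the authors' pipeline.

```python
from collections import deque


def closeness_centrality(graph, source):
    """Closeness of `source`: number of reachable nodes divided by the sum of
    shortest-path lengths to them (breadth-first search, unweighted graph)."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    total = sum(distances.values())
    return (len(distances) - 1) / total if total else 0.0


def rank_by_closeness(graph):
    """Sort nodes by decreasing closeness, then decreasing degree, then name."""
    return sorted(
        graph,
        key=lambda node: (-closeness_centrality(graph, node), -len(graph[node]), node),
    )
```

Intersecting the top of such a ranking with an experimentally derived list of binding partners mirrors, in miniature, the comparison of in silico and physical results described above.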

Subjects

Calcium-Binding Proteins/metabolism; DNA Damage; Gene Products, tax/metabolism; HTLV-I Infections/metabolism; Human T-lymphotropic virus 1/metabolism; Protein Interaction Mapping; Calcium-Binding Proteins/genetics; Cell Line; Gene Products, tax/genetics; HTLV-I Infections/virology; Human T-lymphotropic virus 1/genetics; Humans; Protein Binding

Immunophenotype Discovery, Hierarchical Organization, and Template-Based Classification of Flow Cytometry Samples.

Azad, Ariful; Rajwa, Bartek; Pothen, Alex.

Front Oncol ; 6: 188, 2016.

Article in English | MEDLINE | ID: mdl-27630823

ABSTRACT

We describe algorithms for discovering immunophenotypes from large collections of flow cytometry samples and using them to organize the samples into a hierarchy based on phenotypic similarity. The hierarchical organization is helpful for effective and robust cytometry data mining, including the creation of collections of cell populations characteristic of different classes of samples, robust classification, and anomaly detection. We summarize a set of samples belonging to a biological class or category with a statistically derived template for the class. Whereas individual samples are represented in terms of their cell populations (clusters), a template consists of generic meta-populations (a group of homogeneous cell populations obtained from the samples in a class) that describe key phenotypes shared among all those samples. We organize an FC data collection in a hierarchical data structure that supports the identification of immunophenotypes relevant to clinical diagnosis. A robust template-based classification scheme is also developed, but our primary focus is on the discovery of phenotypic signatures and inter-sample relationships in an FC data collection. This collective analysis approach is more efficient and robust since templates describe phenotypic signatures common to cell populations in several samples while ignoring noise and small sample-specific variations. We have applied the template-based scheme to analyze several datasets, including one representing a healthy immune system and one of acute myeloid leukemia (AML) samples. The last task is challenging due to the phenotypic heterogeneity of the several subtypes of AML. However, we identified thirteen immunophenotypes corresponding to subtypes of AML and were able to distinguish acute promyelocytic leukemia (APL) samples with the markers provided. Clinically, this is helpful since APL has a different treatment regimen from other subtypes of AML.
Core algorithms used in our data analysis are available in the flowMatch package at www.bioconductor.org. It has been downloaded nearly 6,000 times since 2014.

Genome prediction of putative genome-linked viral protein (VPg) of astroviruses.

Al-Mutairy, Badr; Walter, Jolan E; Pothen, Alex; Mitchell, Douglas K.

Virus Genes ; 31(1): 21-30, 2005 Aug.

Article in English | MEDLINE | ID: mdl-15965605

ABSTRACT

Positive-sense single-stranded RNA (+ssRNA) viruses replicate by uncoating the RNA genome for translation to provide viral proteins essential for genome replication and the production of new viral particles. The viral proteins are synthesized from a polyprotein precursor, which is cleaved nascently. The synthesized proteins include viral RNA-dependent RNA polymerase (RdRP), viral genome-linked protein (VPg), and a helicase. VPg is covalently attached to the genomic form of +ssRNA viruses. Helicases and NTPase unwind the RNA before replication. VPg and helicases have been identified in +ssRNA families; however, the presence of VPg and helicase in the Astroviridae, another +ssRNA family, has not been fully elucidated. Computational tools were utilized to provide sequence analysis evidence for the presence and genomic location of astrovirus VPg and helicase. The HMMER program v2.1.1 was used to build a Hidden Markov Model (HMM) profile for calicivirus VPg to search for conserved motifs in the astrovirus genome. We performed phylogenetic analysis of two genomic regions of astroviruses and caliciviruses (encoding the RdRP and VPg). We identified a putative VPg coding region in astrovirus. This region was located in open reading frame 1a (ORF1a) and included sites with high sequence similarity to the VPg coding regions of Caliciviridae, Picornaviridae, and Potyviridae. A region encoding a putative astrovirus helicase identified conserved motifs only with pestivirus helicase sequences. Sequence analysis and comparison to other +ssRNA viruses support the presence of VPg in the Astroviridae. Further laboratory analysis will be necessary to confirm these findings.

Subjects

Genome, Viral; Mamastrovirus/genetics; Viral Proteins/genetics; Amino Acid Sequence; Animals; Humans; Mamastrovirus/chemistry; Mamastrovirus/enzymology; Molecular Sequence Data; RNA Helicases/chemistry; RNA Helicases/genetics; RNA-Dependent RNA Polymerase/chemistry; RNA-Dependent RNA Polymerase/genetics; Sequence Alignment; Viral Proteins/chemistry

Computational protein biomarker prediction: a case study for prostate cancer.

Wagner, Michael; Naik, Dayanand N; Pothen, Alex; Kasukurti, Srinivas; Devineni, Raghu Ram; Adam, Bao-Ling; Semmes, O John; Wright, George L.

BMC Bioinformatics ; 5: 26, 2004 Mar 11.

Article in English | MEDLINE | ID: mdl-15113409

ABSTRACT

BACKGROUND: Recent technological advances in mass spectrometry pose challenges in computational mathematics and statistics to process the mass spectral data into predictive models with clinical and biological significance. We discuss several classification-based approaches to finding protein biomarker candidates using protein profiles obtained via mass spectrometry, and we assess their statistical significance. Our overall goal is to implicate peaks that have a high likelihood of being biologically linked to a given disease state, and thus to narrow the search for biomarker candidates. RESULTS: Thorough cross-validation studies and randomization tests are performed on a prostate cancer dataset with over 300 patients, obtained at the Eastern Virginia Medical School using SELDI-TOF mass spectrometry. We obtain average classification accuracies of 87% on a four-group classification problem using a two-stage linear SVM-based procedure and just 13 peaks, with other methods performing comparably. CONCLUSIONS: Modern feature selection and classification methods are powerful techniques for both the identification of biomarker candidates and the related problem of building predictive models from protein mass spectrometric profiles. Cross-validation and randomization are essential tools that must be performed carefully in order not to bias the results unfairly. However, only a biological validation and identification of the underlying proteins will ultimately confirm the actual value and power of any computational predictions.
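
A randomization test of the kind mentioned above asks how often a classifier does as well on shuffled class labels as on the true ones. The sketch below is illustrative only: it swaps in a nearest-centroid classifier with leave-one-out cross-validation in place of the two-stage linear SVM procedure, and the function names, the number of rounds, and the seed are assumptions.

```python
import random


def loo_accuracy(samples, labels):
    """Leave-one-out accuracy of a nearest-centroid classifier.
    Assumes every class keeps at least one member after one sample is held out."""
    correct = 0
    for held in range(len(samples)):
        centroids = {}
        for label in sorted(set(labels)):
            members = [
                s for i, (s, l) in enumerate(zip(samples, labels))
                if l == label and i != held
            ]
            if members:
                centroids[label] = [sum(col) / len(members) for col in zip(*members)]
        predicted = min(
            centroids,
            key=lambda l: sum((x - c) ** 2 for x, c in zip(samples[held], centroids[l])),
        )
        correct += predicted == labels[held]
    return correct / len(samples)


def randomization_p_value(samples, labels, rounds=200, seed=0):
    """Fraction of label shufflings whose accuracy reaches the observed accuracy
    (with the usual +1 correction so the estimate is never exactly zero)."""
    rng = random.Random(seed)
    observed = loo_accuracy(samples, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(shuffled)
        if loo_accuracy(samples, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (rounds + 1)
```

A small p-value indicates that the observed accuracy is unlikely to arise from the class structure alone, which is the sense in which randomization guards against over-optimistic cross-validation results.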

Subjects

Biomarkers, Tumor/classification; Computational Biology/methods; Neoplasm Proteins/classification; Prostatic Neoplasms/chemistry; Biomarkers, Tumor/biosynthesis; Computational Biology/statistics & numerical data; Humans; Male; Neoplasm Proteins/biosynthesis; Predictive Value of Tests; Prostatic Neoplasms/diagnosis; Prostatic Neoplasms/genetics; Random Allocation; Reproducibility of Results; Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization/methods; Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization/statistics & numerical data

Protocols for disease classification from mass spectrometry data.

Wagner, Michael; Naik, Dayanand; Pothen, Alex.

Proteomics ; 3(9): 1692-8, 2003 Sep.

Article in English | MEDLINE | ID: mdl-12973727

ABSTRACT

We report our results in classifying protein matrix-assisted laser desorption/ionization-time of flight mass spectra obtained from serum samples into diseased and healthy groups. We discuss in detail five of the steps in preprocessing the mass spectral data for biomarker discovery, as well as our criterion for choosing a small set of peaks for classifying the samples. Cross-validation studies with four selected proteins yielded misclassification rates in the 10-15% range for all the classification methods. Three of these proteins or protein fragments are down-regulated and one up-regulated in lung cancer, the disease under consideration in this data set. When cross-validation studies are performed, care must be taken to ensure that the test set does not influence the choice of the peaks used in the classification. Misclassification rates are lower when both the training and test sets are used to select the peaks used in classification versus when only the training set is used. This expectation was validated for various statistical discrimination methods when thirteen peaks were used in cross-validation studies. One particular classification method, a linear support vector machine, exhibited especially robust performance when the number of peaks was varied from four to thirteen, and when the peaks were selected from the training set alone. Experiments with the samples randomly assigned to the two classes confirmed that misclassification rates were significantly higher in such cases than those observed with the true data. This indicates that our findings are indeed significant. We found closely matching masses in a database for protein expression in lung cancer for three of the four proteins we used to classify lung cancer. Data from additional samples, increased experience with the performance of various preprocessing techniques, and affirmation of the biological roles of the proteins that help in classification, will strengthen our conclusions in the future.
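
The caution about keeping the test set out of peak selection can be made concrete with a short sketch in which peaks are chosen inside each cross-validation fold from the training samples only. The peak-ranking criterion (absolute difference of class means), the nearest-centroid classifier, and the function names are all assumptions chosen for brevity; the abstract does not specify the authors' criterion. The sketch assumes exactly two classes, both present in every training fold.

```python
def select_peaks(samples, labels, count):
    """Return the indices of the `count` peaks whose class means differ most."""
    first, second = sorted(set(labels))

    def class_mean(label, peak):
        values = [s[peak] for s, l in zip(samples, labels) if l == label]
        return sum(values) / len(values)

    n_peaks = len(samples[0])
    scores = [abs(class_mean(first, p) - class_mean(second, p)) for p in range(n_peaks)]
    return sorted(range(n_peaks), key=lambda p: -scores[p])[:count]


def cross_validate(samples, labels, count, folds):
    """k-fold accuracy with peak selection repeated inside every fold."""
    correct = 0
    for fold in range(folds):
        test_idx = [i for i in range(len(samples)) if i % folds == fold]
        train_idx = [i for i in range(len(samples)) if i % folds != fold]
        train_x = [samples[i] for i in train_idx]
        train_y = [labels[i] for i in train_idx]
        # Peaks are chosen from the training samples only, never the test fold.
        peaks = select_peaks(train_x, train_y, count)
        centroids = {}
        for label in sorted(set(train_y)):
            rows = [[x[p] for p in peaks] for x, y in zip(train_x, train_y) if y == label]
            centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        for i in test_idx:
            reduced = [samples[i][p] for p in peaks]
            predicted = min(
                centroids,
                key=lambda l: sum((a - b) ** 2 for a, b in zip(reduced, centroids[l])),
            )
            correct += predicted == labels[i]
    return correct / len(samples)
```

Calling `select_peaks` once on the full data set before splitting would let the held-out samples influence which peaks are used, which is the source of the optimistic bias described above.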

Subjects

Disease/classification; Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization/methods; Biomarkers/chemistry; Blood Proteins/classification; Humans; Lung Neoplasms/classification; Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization/statistics & numerical data
