Results 1 - 20 of 21
1.
Sci Rep ; 12(1): 9045, 2022 05 31.
Article in English | MEDLINE | ID: mdl-35641545

ABSTRACT

Studies of long-term antibody responses to SARS-CoV-2 have focused on responses to full-length spike protein, specific domains within spike, or nucleoprotein. In this study, we used high-density peptide microarrays representing the complete proteome of SARS-CoV-2 to identify binding sites (epitopes) targeted by antibodies present in the blood of COVID-19 resolved cases at 5 months post-diagnosis. Compared to previous studies that evaluated epitope-specific responses early post-diagnosis (< 60 days), we found that epitope-specific responses to nucleoprotein and spike protein have contracted, and that responses to membrane protein have expanded. Although antibody titers to full-length spike and nucleoprotein remain steady over months, taken together our data suggest that the population of epitope-specific antibodies that contribute to this reactivity is dynamic and evolves over time. Further, the spike epitopes bound by polyclonal antibodies in COVID-19 convalescent serum samples aligned with known target sites that can neutralize viral activity, suggesting that the maintenance of these antibodies might provide rapid serological immunity. Finally, the most dominant epitopes for membrane protein and spike showed high diagnostic accuracy, providing novel biomarkers to refine blood-based antibody tests. This study provides new insights into the specific regions of SARS-CoV-2 targeted by serum antibodies long after infection.


Subjects
Antibodies, Viral; COVID-19; Convalescence; Antibodies, Viral/blood; COVID-19/blood; COVID-19/therapy; Coronavirus Nucleocapsid Proteins; Epitopes; Humans; Immunization, Passive; Phosphoproteins; SARS-CoV-2; Spike Glycoprotein, Coronavirus; COVID-19 Serotherapy
2.
Genes (Basel) ; 12(10)2021 09 28.
Article in English | MEDLINE | ID: mdl-34680918

ABSTRACT

Gene set analysis has been widely used to gain insight from high-throughput expression studies. Although various tools and methods have been developed for gene set analysis, there is no consensus among researchers regarding best practice(s). Most often, evaluation studies have reported contradictory recommendations of which methods are superior. Therefore, an unbiased quantitative framework for evaluations of gene set analysis methods will be valuable. Such a framework requires gene expression datasets where enrichment status of gene sets is known a priori. In the absence of such gold standard datasets, artificial datasets are commonly used for evaluations of gene set analysis methods; however, they often rely on oversimplifying assumptions that make them biased in favor of or against a given method. In this paper, we propose a quantitative framework for evaluation of gene set analysis methods by synthesizing expression datasets using real data, without relying on oversimplifying or unrealistic assumptions, while preserving complex gene-gene correlations and retaining the distribution of expression values. The utility of the quantitative approach is shown by evaluating ten widely used gene set analysis methods. An implementation of the proposed method is publicly available. We suggest using Silver to evaluate existing and new gene set analysis methods. Evaluation using Silver provides a better understanding of current methods and can aid in the development of gene set analysis methods to achieve higher specificity without sacrificing sensitivity.


Subjects
Databases, Genetic/standards; Genomics/methods; Software; Datasets as Topic/standards
3.
Nutr Res ; 92: 139-149, 2021 08.
Article in English | MEDLINE | ID: mdl-34311227

ABSTRACT

A number of studies have demonstrated that patients with autoimmune disease have lower levels of vitamin D, prompting speculation that vitamin D might suppress inflammation and immune responses in children with juvenile idiopathic arthritis (JIA). The objective of this study was to compare vitamin D levels in children with JIA at disease onset with those of healthy children. We hypothesized that children and adolescents with JIA have lower vitamin D levels than healthy children and adolescents. Data from a Canadian cohort of children with new-onset JIA (n = 164, data collection 2007-2012) were compared to Canadian Health Measures Survey (CHMS) data (n = 4027, data collection 2007-2011). We compared 25-hydroxy vitamin D (25(OH)D) concentrations with measures of inflammation, vitamin D supplement use, milk intake, and season of birth. Mean 25(OH)D level was significantly higher in patients with JIA (79 ± 3.1 nmol/L) than in healthy controls (68 ± 1.8 nmol/L; P < .05). Patients with JIA more often used vitamin D-containing supplements (50% vs. 7%; P < .05). The prevalence of 25(OH)D deficiency (<30 nmol/L) was 6% for both groups. Children with JIA with 25(OH)D deficiency or insufficiency (<50 nmol/L) had higher C-reactive protein levels. Children with JIA were more often born in the fall and winter compared to healthy children. In contrast to earlier studies, we found vitamin D levels in Canadian children with JIA were higher compared to healthy children and associated with more frequent use of vitamin D supplements. Among children with JIA, low vitamin D levels were associated with indicators of greater inflammation.


Subjects
Arthritis, Juvenile/blood; Dietary Supplements; Inflammation; Parturition; Seasons; Vitamin D Deficiency/blood; Vitamin D/blood; Animals; Arthritis, Juvenile/complications; Arthritis, Juvenile/immunology; Autoimmune Diseases; C-Reactive Protein/metabolism; Canada/epidemiology; Case-Control Studies; Child; Child, Preschool; Cohort Studies; Female; Humans; Infant, Newborn; Inflammation/etiology; Inflammation/metabolism; Male; Milk; Vitamin D/analogs & derivatives; Vitamin D/therapeutic use; Vitamin D Deficiency/complications; Vitamin D Deficiency/drug therapy; Vitamin D Deficiency/immunology
4.
Front Bioinform ; 1: 694324, 2021.
Article in English | MEDLINE | ID: mdl-36303765

ABSTRACT

Antibodies are critical effector molecules of the humoral immune system. Upon infection or vaccination, populations of antibodies are generated which bind to various regions of the invading pathogen or exogenous agent. Defining the reactivity and breadth of this antibody response provides an understanding of the antigenic determinants and enables the rational development and assessment of vaccine candidates. High-resolution analysis of these populations typically requires advanced techniques such as B cell receptor repertoire sequencing, mass spectrometry of isolated immunoglobulins, or phage display libraries that are dependent upon equipment and expertise which are prohibitive for many labs. High-density peptide microarrays representing diverse populations of putative linear epitopes (immunoarrays) are an effective alternative for high-throughput examination of antibody reactivity and diversity. While a promising technology, widespread adoption of immunoarrays has been limited by the need for, and relative absence of, user-friendly tools for consideration and visualization of the emerging data. To address this limitation, we developed EPIphany, a software platform with a simple web-based user interface, aimed at biological users, that provides access to important analysis parameters, data normalization options, and a variety of unique data visualization options. This platform provides researchers the greatest opportunity to extract biologically meaningful information from the immunoarray data, thereby facilitating the discovery and development of novel immuno-therapeutics.

5.
Front Genet ; 11: 654, 2020.
Article in English | MEDLINE | ID: mdl-32695141

ABSTRACT

Gene set analysis methods are widely used to provide insight into high-throughput gene expression data. There are many gene set analysis methods available. These methods rely on various assumptions and have different requirements, strengths and weaknesses. In this paper, we classify gene set analysis methods based on their components, describe the underlying requirements and assumptions for each class, and provide directions for future research in developing and evaluating gene set analysis methods.

6.
Rheumatology (Oxford) ; 59(5): 1066-1075, 2020 05 01.
Article in English | MEDLINE | ID: mdl-32321162

ABSTRACT

OBJECTIVE: To identify discrete clusters comprising clinical features and inflammatory biomarkers in children with JIA and to determine cluster alignment with JIA categories. METHODS: A Canadian prospective inception cohort comprising 150 children with JIA was evaluated at baseline (visit 1) and after six months (visit 2). Data included clinical manifestations and inflammation-related biomarkers. Probabilistic principal component analysis identified sets of composite variables, or principal components, from 191 original variables. To discern new clinical-biomarker clusters (clusters), Gaussian mixture models were fit to the data. Newly-defined clusters and JIA categories were compared. Agreement between the two was assessed using Kruskal-Wallis analyses and contingency plots. RESULTS: Three principal components recovered 35% (three clusters) and 40% (five clusters) of the variance in patient profiles in visits 1 and 2, respectively. None of the clusters aligned precisely with any of the seven JIA categories but rather spanned multiple categories. Results demonstrated that the newly defined clinical-biomarker clusters are more homogeneous than JIA categories. CONCLUSION: Applying unsupervised data mining to clinical and inflammatory biomarker data discerns discrete clusters that intersect multiple JIA categories. Results suggest that certain groups of patients within different JIA categories are more aligned pathobiologically than their separate clinical categorizations suggest. Applying data mining analyses to complex datasets can generate insights into JIA pathogenesis and could contribute to biologically based refinements in JIA classification.


Subjects
Arthritis, Juvenile/blood; Arthritis, Juvenile/physiopathology; Inflammation Mediators/blood; Adolescent; Age Factors; Arthritis, Juvenile/epidemiology; Biomarkers/blood; Canada/epidemiology; Child; Cluster Analysis; Cohort Studies; Data Mining; Female; Humans; Incidence; Male; Normal Distribution; Prospective Studies; Risk Assessment; Severity of Illness Index; Sex Factors; Syndrome
7.
Rheumatology (Oxford) ; 59(9): 2402-2411, 2020 09 01.
Article in English | MEDLINE | ID: mdl-31919503

ABSTRACT

OBJECTIVE: To identify early predictors of disease activity at 18 months in JIA using clinical and biomarker profiling. METHODS: Clinical and biomarker data were collected at JIA diagnosis in a prospective longitudinal inception cohort of 82 children with non-systemic JIA, and their ability to predict an active joint count of 0, a physician global assessment of disease activity of ≤1 cm, and inactive disease by Wallace 2004 criteria 18 months later was assessed. Correlation-based feature selection and ReliefF were used to shortlist predictors and random forest models were trained to predict outcomes. RESULTS: From the original 112 features, 13 effectively predicted 18-month outcomes. They included age, number of active/effused joints, wrist, ankle and/or knee involvement, ESR, ANA positivity and plasma levels of five inflammatory biomarkers (IL-10, IL-17, IL-12p70, soluble low-density lipoprotein receptor-related protein 1 and vitamin D), at enrolment. The clinical plus biomarker panel predicted active joint count = 0, physician global assessment ≤ 1, and inactive disease after 18 months with 0.79, 0.80 and 0.83 accuracy and 0.84, 0.83, 0.88 area under the curve, respectively. Using clinical features alone resulted in 0.75, 0.72 and 0.80 accuracy, and area under the curve values of 0.81, 0.78 and 0.83, respectively. CONCLUSION: A panel of five plasma biomarkers combined with clinical features at the time of diagnosis more accurately predicted short-term disease activity in JIA than clinical characteristics alone. If validated in external cohorts, such a panel may guide more rationally conceived, biologically based, personalized treatment strategies in early JIA.


Subjects
Arthritis, Juvenile/diagnosis; Interleukins/blood; Low Density Lipoprotein Receptor-Related Protein-1/blood; Severity of Illness Index; Vitamin D/blood; Adolescent; Ankle Joint/pathology; Area Under Curve; Arthritis, Juvenile/blood; Arthritis, Juvenile/pathology; Biomarkers/blood; Canada; Child; Child, Preschool; Female; Humans; Interleukin-10/blood; Interleukin-12/blood; Interleukin-17/blood; Knee Joint/pathology; Longitudinal Studies; Male; Predictive Value of Tests; Prospective Studies; Wrist Joint/pathology
8.
J Eukaryot Microbiol ; 67(3): 337-351, 2020 05.
Article in English | MEDLINE | ID: mdl-31925980

ABSTRACT

Plasmodiophora brassicae (Wor.) is an obligate intracellular plant pathogen affecting Brassicas worldwide. Identification of effector proteins is key to understanding the interaction between P. brassicae and its susceptible host plants. To date, there is very little information available on putative effector proteins secreted by P. brassicae during a secondary infection of susceptible host plants, resulting in root gall production. A bioinformatics pipeline approach to RNA-Seq data from Arabidopsis thaliana (L.) Heynh. root tissues at 17, 20, and 24 d postinoculation (dpi) identified 32 small secreted P. brassicae proteins (SSPbPs) that were highly expressed over this secondary infection time frame. Functional signal peptides were confirmed for 31 of the SSPbPs, supporting the accuracy of the pipeline designed to identify secreted proteins. Expression profiles at 0, 2, 5, 7, 14, 21, and 28 dpi verified the involvement of some of the SSPbPs in secondary infection. For seven of the SSPbPs, a functional domain was identified using Blast2GO and 3D structure analysis and domain functionality was confirmed for SSPbP22, a kinase localized to the cytoplasm and nucleus.


Subjects
Arabidopsis/parasitology; Gene Expression Profiling/methods; Plasmodiophorida/genetics; Protozoan Proteins/genetics; Up-Regulation; Models, Molecular; Plant Roots/parasitology; Plasmodiophorida/metabolism; Protein Conformation; Protein Domains; Protein Sorting Signals; Protozoan Proteins/chemistry; Sequence Analysis, RNA
9.
J Bioinform Comput Biol ; 17(5): 1940010, 2019 10.
Article in English | MEDLINE | ID: mdl-31856670

ABSTRACT

Gene set analysis is a quantitative approach for generating biological insight from gene expression datasets. The abundance of gene set analysis methods speaks to their popularity, but raises the question of the extent to which results are affected by the choice of method. Our systematic analysis of 13 popular methods using 6 different datasets, from both DNA microarray and RNA-Seq origin, shows that this choice matters a great deal. We observed that the overall number of gene sets reported by each method differed by up to 2 orders of magnitude, and there was a bias toward reporting large gene sets with some methods. Furthermore, there was substantial disagreement between the 20 most statistically significant gene sets reported by the methods. This was also observed when expanding to the 100 most statistically significant reported gene sets. For different datasets of the same phenotype/condition, the top 20 and top 100 most significant results also showed little to no agreement even when using the same method. GAGE, PAGE, and ORA were the only methods able to achieve relatively high reproducibility when comparing the 20 and 100 most statistically significant gene sets. Biological validation on a juvenile idiopathic arthritis (JIA) dataset showed wide variation in terms of the relevance of the top 20 and top 100 most significant gene sets to known biology of the disease, where GAGE predicted the most relevant gene sets, followed by GSEA, ORA, and PAGE.


Subjects
Databases, Genetic; Gene Expression Profiling/statistics & numerical data; Arthritis, Juvenile/genetics; Gene Expression Profiling/methods; Gene Expression Profiling/standards; Humans; Oligonucleotide Array Sequence Analysis/statistics & numerical data; Phenotype; Psoriasis/genetics; Reproducibility of Results
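
Editor's note: ORA (over-representation analysis), one of the methods named in the abstract above, is commonly implemented as an upper-tail hypergeometric test. The sketch below is a minimal illustration of that test only, not the implementation evaluated in the paper; the function name `ora_p_value` and its parameters are hypothetical.

```python
from math import comb


def ora_p_value(universe_size: int, gene_set_size: int,
                de_size: int, overlap: int) -> float:
    """Upper-tail hypergeometric probability P(X >= overlap).

    universe_size: number of genes measured in the experiment
    gene_set_size: number of those genes that belong to the gene set
    de_size:       number of genes called differentially expressed
    overlap:       differentially expressed genes that are in the gene set
    """
    max_overlap = min(gene_set_size, de_size)
    total = comb(universe_size, de_size)
    # Sum the probabilities of every overlap at least as large as observed.
    tail = sum(
        comb(gene_set_size, k) * comb(universe_size - gene_set_size, de_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Toy numbers chosen only for illustration.
    print(ora_p_value(universe_size=10, gene_set_size=3, de_size=3, overlap=3))
```

A small p-value suggests the gene set contains more differentially expressed genes than chance would predict. In practice the p-values are also corrected for multiple testing across gene sets, which this sketch omits.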
10.
Hum Genomics ; 13(Suppl 1): 42, 2019 10 22.
Article in English | MEDLINE | ID: mdl-31639047

ABSTRACT

BACKGROUND: Gene set analysis is a well-established approach for interpretation of data from high-throughput gene expression studies. Achieving reproducible results is an essential requirement in such studies. One factor of a gene expression experiment that can affect reproducibility is the choice of sample size. However, choosing an appropriate sample size can be difficult, especially because the choice may be method-dependent. Further, sample size choice can have unexpected effects on specificity. RESULTS: In this paper, we report on a systematic, quantitative approach to study the effect of sample size on the reproducibility of the results from 13 gene set analysis methods. We also investigate the impact of sample size on the specificity of these methods. Rather than relying on synthetic data, the proposed approach uses real expression datasets to offer an accurate and reliable evaluation. CONCLUSION: Our findings show that, as a general pattern, the results of gene set analysis become more reproducible as sample size increases. However, the extent of reproducibility and the rate at which it increases vary from method to method. In addition, even in the absence of differential expression, some gene set analysis methods report a large number of false positives, and increasing sample size does not lead to reducing these false positives. The results of this research can be used when selecting a gene set analysis method from those available.


Subjects
Databases, Genetic; Gene Expression Profiling; Humans; Reproducibility of Results; Sample Size
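
Editor's note: the abstract above does not state how reproducibility was quantified. One simple way to compare two runs of a gene set analysis method is the Jaccard overlap of their top-ranked gene sets. The sketch below assumes that measure purely for illustration; `top_k_overlap` and the gene set names are hypothetical.

```python
def top_k_overlap(ranked_a: list[str], ranked_b: list[str], k: int = 20) -> float:
    """Jaccard overlap between the k most significant gene sets of two runs.

    Each input is a list of gene set names ordered from most to least
    significant. Returns a value in [0, 1]; 1 means identical top-k sets.
    """
    top_a, top_b = set(ranked_a[:k]), set(ranked_b[:k])
    union = top_a | top_b
    if not union:
        return 1.0  # two empty result lists agree trivially
    return len(top_a & top_b) / len(union)


if __name__ == "__main__":
    run_1 = ["set1", "set2", "set3"]
    run_2 = ["set2", "set3", "set4"]
    print(top_k_overlap(run_1, run_2, k=3))
```

Repeating such a comparison across subsamples of increasing size would give one view of how agreement changes with sample size, under the assumption stated above.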
11.
Virulence ; 9(1): 1344-1353, 2018.
Article in English | MEDLINE | ID: mdl-30146948

ABSTRACT

Clubroot is an economically important disease affecting Brassica plants worldwide. Plasmodiophora brassicae is the protist pathogen associated with the disease, and its soil-borne obligate parasitic nature has impeded studies related to its biology and the mechanisms involved in its infection of the plant host. The identification of effector proteins is key to understanding how the pathogen manipulates the plant's immune response and the genes involved in resistance. After more than 140 years studying clubroot and P. brassicae, very little is known about the effectors playing key roles in the infection process and subsequent disease progression. Here we analyze the information available for identified effectors and suggest several features of effector genes that can be used in the search for others. Based on the information presented in this review, we propose a comprehensive bioinformatics pipeline for effector identification and provide a list of the bioinformatics tools available for such.


Subjects
Brassica/parasitology; Disease Resistance/genetics; Plant Diseases/parasitology; Plasmodiophorida/immunology; Brassica/immunology; Computational Biology; Host-Parasite Interactions; Plant Diseases/immunology; Plasmodiophorida/pathogenicity; Transcription Factors/genetics; Transcriptome
12.
BMC Genomics ; 19(1): 23, 2018 01 05.
Article in English | MEDLINE | ID: mdl-29304736

ABSTRACT

BACKGROUND: Clubroot is an important disease caused by the obligate parasite Plasmodiophora brassicae that infects the Brassicaceae. As a soil-borne pathogen, P. brassicae induces the generation of abnormal tissue in the root, resulting in the formation of galls. Root infection negatively affects the uptake of water and nutrients in host plants, severely reducing their growth and productivity. Many studies have emphasized the molecular and physiological effects of the clubroot disease on root tissues. The aim of the present study is to better understand the effect of P. brassicae on the transcriptome of both shoot and root tissues of Arabidopsis thaliana. RESULTS: Transcriptome profiling using RNA-seq was performed on both shoot and root tissues at 17, 20 and 24 days post inoculation (dpi) of A. thaliana, a model plant host for P. brassicae. The number of differentially expressed genes (DEGs) between infected and uninfected samples was larger in shoot than in root. In both shoot and root, more genes were differentially regulated at 24 dpi than the two earlier time points. Genes that were highly regulated in response to infection in both shoot and root primarily were involved in the metabolism of cell wall compounds, lipids, and shikimate pathway metabolites. Among hormone-related pathways, several jasmonic acid biosynthesis genes were upregulated in both shoot and root tissue. Genes encoding enzymes involved in cell wall modification, biosynthesis of sucrose and starch, and several classes of transcription factors were generally differently regulated in shoot and root. CONCLUSIONS: These results highlight the similarities and differences in the transcriptomic response of above- and below-ground tissues of the model host Arabidopsis following P. brassicae infection. The main transcriptomic changes in root metabolism during clubroot disease progression were identified. An overview of DEGs in the shoot underlined the physiological changes in above-ground tissues following pathogen establishment and disease progression. This study provides insights into host tissue-specific molecular responses to clubroot development and may have applications in the development of clubroot markers for more effective breeding strategies.


Subjects
Arabidopsis/genetics; Arabidopsis/parasitology; Gene Expression Regulation, Plant; Plant Diseases/parasitology; Plasmodiophorida; Transcriptome; Arabidopsis/anatomy & histology; Arabidopsis/metabolism; Gene Expression Profiling; Plant Diseases/genetics; Plant Growth Regulators/biosynthesis; Plant Roots/genetics; Plant Roots/metabolism; Plant Roots/parasitology; Plant Shoots/genetics; Plant Shoots/metabolism; Plant Shoots/parasitology; Transcription Factors/genetics; Transcription Factors/metabolism
13.
Article in English | MEDLINE | ID: mdl-28368811

ABSTRACT

De novo peptide sequencing using tandem mass spectrometry (MS/MS) data has become a major computational method for sequence identification in recent years. With the development of new instruments and technology, novel computational methods have emerged with enhanced performance. However, there are only a few methods focusing on ECD/ETD spectra, which mainly contain variants of c-ions and z-ions. Here, a de novo sequencing method for ECD/ETD spectra, NovoExD, is presented. NovoExD applies a new form of spectrum graph with multiple edge types (called a GMET), considers multiple peptide tags, and integrates amino acid combination (AAC) and fragment ion charge information. Its performance is compared with another successful de novo sequencing method, pNovo+, which has an option for ECD/ETD spectra. Experiments conducted on three different datasets show that the average full length peptide identification accuracy of NovoExD is as high as 88.70 percent, and that NovoExD's average accuracy is more than 20 percent greater on all datasets than that of pNovo+.


Subjects
Peptides/analysis; Peptides/chemistry; Sequence Analysis, Protein/methods; Tandem Mass Spectrometry/methods; Algorithms; Databases, Protein
14.
Proteomics ; 16(20): 2615-2624, 2016 10.
Article in English | MEDLINE | ID: mdl-27402425

ABSTRACT

In tandem mass spectrometry (MS/MS), several different fragmentation techniques are possible, including collision-induced dissociation (CID), higher-energy collisional dissociation (HCD), electron-capture dissociation (ECD), and electron transfer dissociation (ETD). When using pairs of spectra for de novo peptide sequencing, the most popular methods are designed for CID (or HCD) and ECD (or ETD) spectra because of the complementarity between them. Less attention has been paid to the use of CID and HCD spectra pairs. In this study, a new de novo peptide sequencing method is proposed for these spectra pairs. This method includes a CID and HCD spectra merging criterion and a parent mass correction step, along with improvements to our previously proposed algorithm for sequencing merged spectra. Three pairs of spectral datasets were used to investigate and compare the performance of the proposed method with other existing methods designed for single spectrum (HCD or CID) sequencing. Experimental results showed that full-length peptide sequencing accuracy was increased significantly by using spectra pairs in the proposed method, with the highest accuracy reaching 81.31%.


Subjects
Peptides/chemistry; Sequence Analysis, Protein/methods; Tandem Mass Spectrometry/methods; Algorithms; Humans; Proteomics/methods
15.
Protein Pept Lett ; 22(11): 983-91, 2015.
Article in English | MEDLINE | ID: mdl-26295161

ABSTRACT

Tandem mass spectrometry (MS/MS) has emerged as a major technology for peptide sequencing. Typically, there are three kinds of methods for the peptide sequencing: database searching, peptide tagging, and de novo sequencing. De novo sequencing has drawn increasing attention because of its independence from existing protein databases and potential for identifying new proteins, proteins resulting from mutations, proteins with unexpected modifications and so on. Recently, with the improvements in the accuracy of MS/MS and development of alternative fragmentation modes of MS/MS, many new de novo sequencing methods have been formulated. This paper reviews these recently developed sequencing methods including those for alternative MS/MS spectra. The paper first introduces background knowledge on peptide sequencing and mass spectrometry, and then reviews de novo peptide sequencing methods for traditional CID spectra. After that, it focuses on the recent development of de novo methods for alternative MS/MS spectra. In addition, methods using multiple spectra from the same peptide are surveyed. Finally, conclusions and some directions of future work are discussed.


Subjects
Peptides/chemistry; Proteomics/methods; Sequence Analysis, Protein/methods; Tandem Mass Spectrometry/methods; Peptides/analysis
16.
IEEE Trans Nanobioscience ; 14(4): 478-484, 2015 Jun.
Article in English | MEDLINE | ID: mdl-25935039

ABSTRACT

With tandem mass spectrometry (MS/MS), spectra can be generated by various fragmentation techniques including collision-induced dissociation (CID), higher-energy collisional dissociation (HCD), electron capture dissociation (ECD), electron transfer dissociation (ETD) and so on. At the same time, de novo sequencing using multiple spectra from the same peptide generated by different fragmentation techniques is becoming popular in proteomics studies. The focus of this study is the use of paired spectra from CID (or HCD) and ECD (or ETD) fragmentation because of the complementarity between them. We present a de novo peptide sequencing framework for multiple tandem mass spectra, and apply it to the paired-spectra sequencing problem. The performance of the framework on paired spectra is compared to another successful method named pNovo+. The results show that our proposed method outperforms pNovo+ in terms of full length peptide sequencing accuracy on three pairs of experimental datasets, with the accuracy increasing up to 13.6% compared to pNovo+.

17.
IEEE Trans Nanobioscience ; 13(2): 65-72, 2014 Jun.
Article in English | MEDLINE | ID: mdl-24771591

ABSTRACT

In recent years, de novo peptide sequencing from mass spectrometry data has developed as one of the major peptide identification methods with the emergence of new instruments and advanced computational methods. However, there are still limitations to this method; for example, the typically used spectrum graph model cannot represent all the information and relationships inherent in tandem mass spectra (MS/MS spectra). Here, we present a new method named NovoHCD which applies a spectrum graph model with multiple types of edges (called a multi-edge graph), and integrates into it amino acid combination (AAC) information and peptide tags. In addition, information on immonium ions observed particularly in higher-energy collisional dissociation (HCD) spectra is incorporated. Comparisons between NovoHCD and another successful de novo peptide sequencing method for HCD spectra, pNovo, were performed. Experiments were conducted on five HCD spectral datasets. Results show that NovoHCD outperforms pNovo in terms of full length peptide identification accuracy; specifically, the accuracy increases 13%-21% over the five datasets.


Subjects
Peptides/chemistry; Sequence Analysis, Protein/methods; Algorithms; Models, Theoretical; Tandem Mass Spectrometry
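
Editor's note: the "typically used spectrum graph model" mentioned in the abstract above can be illustrated in a few lines. In the basic form, nodes are fragment masses and an edge joins two nodes whose mass difference matches an amino acid residue; a path from mass 0 to the full peptide mass spells a candidate sequence. The sketch below shows only this plain single-edge-type graph on idealized prefix masses. It is not NovoHCD's multi-edge graph, and the residue table is a deliberately small subset chosen for illustration; all function names are hypothetical.

```python
# Monoisotopic residue masses (Da) for a small illustrative subset of amino acids.
RESIDUE_MASS = {
    "G": 57.02146,
    "A": 71.03711,
    "S": 87.03203,
    "V": 99.06841,
    "L": 113.08406,
}


def build_spectrum_graph(prefix_masses, tol=0.01):
    """Connect node i to node j when their mass difference matches a residue."""
    nodes = sorted(prefix_masses)
    edges = {i: [] for i in range(len(nodes))}
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            diff = nodes[j] - nodes[i]
            for residue, mass in RESIDUE_MASS.items():
                if abs(diff - mass) <= tol:
                    edges[i].append((j, residue))
    return nodes, edges


def enumerate_sequences(prefix_masses, tol=0.01):
    """Return every residue string spelled by a path from the first to the last node."""
    nodes, edges = build_spectrum_graph(prefix_masses, tol)
    last = len(nodes) - 1
    results = []

    def walk(node, sequence):
        if node == last:
            results.append("".join(sequence))
            return
        for next_node, residue in edges[node]:
            walk(next_node, sequence + [residue])

    walk(0, [])
    return results


if __name__ == "__main__":
    # Idealized prefix masses for the tripeptide G-A-S (no noise, no missing peaks).
    print(enumerate_sequences([0.0, 57.02146, 128.05857, 215.0906]))
```

Real spectra contain noise peaks, missing fragments, and several ion types, which is why the methods listed here add scoring, peptide tags, and additional edge types on top of this basic structure.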
18.
Article in English | MEDLINE | ID: mdl-19841683

ABSTRACT

Computational gene regulation models provide a means for scientists to draw biological inferences from time-course gene expression data. Based on the state-space approach, we developed a new modeling tool for inferring gene regulatory networks, called time-delayed Gene Regulatory Networks (tdGRNs). tdGRN takes time-delayed regulatory relationships into consideration when developing the model. In addition, a priori biological knowledge from genome-wide location analysis is incorporated into the structure of the gene regulatory network. tdGRN is evaluated on both an artificial dataset and a published gene expression data set. It not only determines regulatory relationships that are known to exist but also uncovers potential new ones. The results indicate that the proposed tool is effective in inferring gene regulatory relationships with time delay. tdGRN is complementary to existing methods for inferring gene regulatory networks. The novel part of the proposed tool is that it is able to infer time-delayed regulatory relationships.

19.
J Bioinform Comput Biol ; 4(5): 959-80, 2006 Oct.
Article in English | MEDLINE | ID: mdl-17099936

ABSTRACT

Hidden Markov models (HMMs) are one of various methods that have been applied to the prediction of major histocompatibility complex (MHC) binding peptides. In terms of model topology, a fully-connected HMM (fcHMM) has the greatest potential to predict binders, at the cost of intensive computation. While a profile HMM (pHMM) performs dramatically fewer computations, it potentially merges overlapping patterns into one, which results in some patterns being missed. In a profile HMM a state corresponds to a position on a peptide, while in an fcHMM a state has no specific biological meaning. This work proposes optimally-connected HMMs (ocHMMs), which do not merge overlapping patterns and yet, through topological reductions, have greatly reduced connectivity compared with an fcHMM. The parameters of ocHMMs are initialized using a novel amino acid grouping approach called "multiple property grouping." Each group represents a state in an ocHMM. The proposed ocHMMs are compared to a pHMM implementation using HMMER, based on performance tests on two MHC alleles, HLA (Human Leukocyte Antigen)-A*0201 and HLA-B*3501. The results show that the heuristic approaches can be adjusted to make an ocHMM achieve higher predictive accuracy than HMMER. Hence, such obtained ocHMMs are worthy of trial for predicting MHC-binding peptides.


Subjects
Artificial Intelligence; Histocompatibility Antigens Class I/chemistry; Models, Chemical; Models, Molecular; Peptides/chemistry; Protein Interaction Mapping/methods; Sequence Analysis, Protein/methods; Amino Acid Sequence; Binding Sites; Computer Simulation; Markov Chains; Molecular Sequence Data; Protein Binding
20.
BMC Bioinformatics ; 7 Suppl 4: S13, 2006 Dec 12.
Article in English | MEDLINE | ID: mdl-17217505

ABSTRACT

BACKGROUND: One type of DNA microarray experiment is discovery of gene expression patterns for a cell line undergoing a biological process over a series of time points. Two important issues with such an experiment are the number of time points, and the interval between them. In the absence of biological knowledge regarding appropriate values, it is natural to question whether the behaviour of progressively generated data may by itself determine a threshold beyond which further microarray experiments do not contribute to pattern discovery. Additionally, such a threshold implies a minimum number of microarray experiments, which is important given the cost of these experiments. RESULTS: We have developed a method for determining the minimum number of microarray experiments (i.e. time points) for temporal gene expression, assuming that the span between time points is given and the hierarchical clustering technique is used for gene expression pattern discovery. The key idea is a similarity measure for two clusterings which is expressed as a function of the data for progressive time points. While the experiments are underway, this function is evaluated. When the function reaches its maximum, it indicates the set of experiments reach a saturated state. Therefore, further experiments do not contribute to the discrimination of patterns. CONCLUSION: The method has been verified with two previously published gene expression datasets. For both experiments, the number of time points determined with our method is less than in the published experiments. It is noted that the overall approach is applicable to other clustering techniques.


Subjects
Algorithms; Cluster Analysis; Gene Expression Profiling/methods; Gene Expression/physiology; Models, Genetic; Oligonucleotide Array Sequence Analysis/methods; Pattern Recognition, Automated/methods; Artificial Intelligence; Computer Simulation; Data Interpretation, Statistical; Discriminant Analysis; Models, Statistical; Sample Size
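
Editor's note: the abstract above relies on a similarity measure between two clusterings but does not define it. The Rand index is one standard measure of this kind and is used below only as a stand-in, to show how agreement between the clustering at one time point and the next could be tracked. The function name `rand_index` and the toy labels are hypothetical.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of item pairs on which two clusterings agree.

    A pair counts as agreement when both clusterings put the two items in
    the same cluster, or both put them in different clusters.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same items")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        return 1.0  # fewer than two items: nothing to disagree on
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


if __name__ == "__main__":
    # Cluster labels for four genes after t and t + 1 time points (toy data).
    after_t = [0, 0, 1, 1]
    after_t_plus_1 = [1, 1, 0, 0]  # same grouping, different label names
    print(rand_index(after_t, after_t_plus_1))
```

Under this stand-in, a similarity that stops increasing as time points are added would correspond to the saturation the abstract describes.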