ABSTRACT
SUMMARY: Despite the improvement in variant detection algorithms, visual inspection of the read-level data remains an essential step for accurate identification of variants in genome analysis. We developed BamSnap, an efficient BAM file viewer utilizing a graphics library and BAM indexing. In contrast to existing viewers, BamSnap can generate high-quality snapshots rapidly, with customized tracks and layout. As an example, we produced read-level images at 1000 genomic loci for >2500 whole-genomes. AVAILABILITY AND IMPLEMENTATION: BamSnap is freely available at https://github.com/parklab/bamsnap. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
ABSTRACT
OBJECTIVES: Determining gene-gene interactions and the missing heritability of complex diseases is a challenging topic in genome-wide association studies. The multifactor dimensionality reduction (MDR) method is one of the most commonly used methods for identifying gene-gene interactions with dichotomous phenotypes. For quantitative phenotypes, the generalized MDR or quantitative MDR (QMDR) methods have been proposed. These methods are known as univariate methods because they consider only one phenotype. To date, there are few methods for analyzing multiple phenotypes. METHODS: To address this problem, we propose a multivariate QMDR method (Multi-QMDR) for multivariate correlated phenotypes. We summarize the multivariate phenotypes into a univariate score by dimensional reduction analysis, and then classify the samples accordingly into high-risk and low-risk groups. We use different ways of summarizing, mainly based on the principal components. Multi-QMDR is model-free and easy to implement. RESULTS: Multi-QMDR is applied to lipid-related traits. The properties of Multi-QMDR were investigated through simulation studies. Empirical studies show that Multi-QMDR outperforms existing univariate and multivariate methods at identifying causal interactions. CONCLUSIONS: The Multi-QMDR approach improves the performance of QMDR when multiple quantitative phenotypes are available.
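The two steps described in this abstract (summarize correlated phenotypes into one score, then split genotype cells into high-risk and low-risk groups) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of the first principal component obtained by power iteration, and the rule "cell mean above the overall mean is high risk" are assumptions made here for illustration, and the toy data are invented.

```python
import math
from collections import defaultdict

def first_pc_scores(rows, iters=200):
    """Project centered phenotype rows onto the first principal component.

    Uses power iteration on the sample covariance matrix (an illustrative
    choice; any PCA routine would do). The sign of the component is arbitrary
    in general; starting from a vector of ones keeps it positive when all
    phenotypes are positively correlated.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    v = [1.0] * p
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return [sum(c[j] * v[j] for j in range(p)) for c in centered]

def label_cells(genotypes, scores):
    """Label each multi-locus genotype cell by comparing its mean score
    with the overall mean score (assumed decision rule)."""
    overall = sum(scores) / len(scores)
    cells = defaultdict(list)
    for g, s in zip(genotypes, scores):
        cells[g].append(s)
    return {g: ("high" if sum(v) / len(v) > overall else "low")
            for g, v in cells.items()}

# Hypothetical toy data: two correlated phenotypes, two genotype cells.
phenotypes = [(1.0, 1.1), (1.2, 0.9), (0.9, 1.0),
              (3.0, 3.2), (3.1, 2.9), (2.8, 3.0)]
genotypes = [("AA", "BB")] * 3 + [("Aa", "Bb")] * 3
scores = first_pc_scores(phenotypes)
labels = label_cells(genotypes, scores)
```

The resulting binary labels would then feed the usual MDR-style evaluation of candidate interaction models.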
Subject(s)
Genetic Epistasis, Multifactor Dimensionality Reduction, Computer Simulation, Gene Regulatory Networks, Humans, Lipid Metabolism/genetics, Multivariate Analysis, Phenotype, Single Nucleotide Polymorphism/genetics
ABSTRACT
We propose a new approach to detect gene × gene joint action in genome-wide association studies (GWASs) for case-control designs. This approach offers an exhaustive search for all two-way joint action (including, as a special case, single gene action) that is computationally feasible at the genome-wide level and has reasonable statistical power under most genetic models. We found that the presence of any gene × gene joint action may imply differences in three types of genetic components: the minor allele frequencies and the amounts of Hardy-Weinberg disequilibrium may differ between cases and controls, and between the two genetic loci the degree of linkage disequilibrium may differ between cases and controls. Using Fisher's method, it is possible to combine the different sources of genetic information in an overall test for detecting gene × gene joint action. The proposed statistical analysis is efficient and its simplicity makes it applicable to GWASs. In the current study, we applied the proposed approach to a GWAS on schizophrenia and found several potential gene × gene interactions. Our application illustrates the practical advantage of the proposed method.
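Fisher's method, which the abstract uses to combine the different sources of genetic information, can be sketched as follows. The statistic is -2 times the sum of the log p-values, referred to a chi-square distribution with 2k degrees of freedom for k independent tests. The function name and the three example p-values (standing in for the allele-frequency, Hardy-Weinberg disequilibrium, and linkage disequilibrium components) are hypothetical, and independence of the component tests is assumed.

```python
import math

def fisher_combined_pvalue(pvalues):
    """Combine independent p-values with Fisher's method.

    Returns (statistic, combined_p). For 2k degrees of freedom the chi-square
    survival function has the closed form
        exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    so no statistics library is needed.
    """
    k = len(pvalues)
    stat = -2.0 * sum(math.log(p) for p in pvalues)
    half = stat / 2.0
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return stat, math.exp(-half) * total

# Hypothetical component p-values for one pair of loci.
stat, combined = fisher_combined_pvalue([0.04, 0.20, 0.01])
# Sanity check: a single p-value is returned unchanged.
_, single = fisher_combined_pvalue([0.05])
```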
Subject(s)
Genes, Genome-Wide Association Study/methods, Schizophrenia/genetics, Case-Control Studies, Genetic Epistasis/genetics, Gene Frequency/genetics, Genetic Loci/genetics, Genetic Predisposition to Disease/genetics, Human Genome/genetics, Humans, Linkage Disequilibrium/genetics, Genetic Models, Single Nucleotide Polymorphism/genetics
ABSTRACT
BACKGROUND: microRNA (miRNA) expression plays an influential role in cancer classification and malignancy, and miRNAs are feasible as alternative diagnostic markers for pancreatic cancer, a highly aggressive neoplasm with silent early symptoms, high metastatic potential, and resistance to conventional therapies. METHODS: In this study, we evaluated the benefits of multi-omics data analysis by integrating miRNA and mRNA expression data in pancreatic cancer. Using support vector machine (SVM) modelling and leave-one-out cross validation (LOOCV), we evaluated the diagnostic performance of single- or multi-markers based on miRNA and mRNA expression profiles from 104 PDAC tissues and 17 benign pancreatic tissues. To select more reliable and robust markers, we performed validation with independent datasets from the Gene Expression Omnibus (GEO) data repository. For validation, miRNA activity was estimated from miRNA-target gene interactions and mRNA expression datasets in pancreatic cancer. RESULTS: Using a comprehensive identification approach, we successfully identified 705 multi-markers having powerful diagnostic performance for PDAC. In addition, these marker candidates were annotated with cancer pathways using gene ontology analysis. CONCLUSIONS: Our prediction models have strong potential for the diagnosis of pancreatic cancer.
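The leave-one-out cross validation loop mentioned in the METHODS can be sketched in a few lines. To keep the example dependency-free, a nearest-centroid classifier is swapped in for the SVM used in the study; the function names and the two-marker toy expression values are hypothetical.

```python
def nearest_centroid_predict(train_x, train_y, x):
    """Predict the label whose class centroid is closest to x.
    (Stand-in classifier; the study itself used an SVM.)"""
    centroids = {}
    for label in set(train_y):
        pts = [p for p, y in zip(train_x, train_y) if y == label]
        centroids[label] = [sum(col) / len(pts) for col in zip(*pts)]

    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    return min(centroids, key=lambda lab: dist2(centroids[lab], x))

def loocv_accuracy(xs, ys):
    """Hold out each sample once, train on the rest, and score the held-out
    prediction; return the fraction predicted correctly."""
    hits = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]:
            hits += 1
    return hits / len(xs)

# Hypothetical expression values for a two-marker panel.
xs = [(5.0, 4.8), (5.2, 5.1), (4.9, 5.3), (1.0, 1.2), (0.8, 1.1), (1.1, 0.9)]
ys = ["PDAC"] * 3 + ["benign"] * 3
accuracy = loocv_accuracy(xs, ys)
```

With strongly imbalanced classes such as 104 tumors versus 17 benign samples, a class-balanced measure is usually reported alongside raw accuracy.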
Subject(s)
Tumor Biomarkers/genetics, Computational Biology, MicroRNAs/metabolism, Pancreatic Neoplasms/diagnosis, Pancreatic Neoplasms/genetics, Messenger RNA/metabolism, Transcriptome, Humans, Pancreatic Neoplasms/metabolism
ABSTRACT
BACKGROUND/AIM: Lung cancer remains a leading cause of cancer-related mortality worldwide, necessitating the development of effective early diagnostic strategies. Despite advancements in imaging and screening technologies, late-stage diagnoses remain common, limiting treatment options and reducing survival rates. Thus, there is a critical need for reliable, minimally invasive biomarkers to improve early detection and patient outcomes. Plasma protein biomarkers offer promising potential for early lung cancer detection and continuous disease monitoring. This study explored the potential of specific plasma protein markers as early indicators of lung cancer. PATIENTS AND METHODS: Plasma samples were collected from normal healthy individuals and lung cancer patients, and protein purification and analysis were conducted using LC-MS/MS. A mixed-effect model was applied to select lung cancer-related protein markers based on label-free relative quantification values. RESULTS: We identified 29 proteins with potential for early lung cancer diagnosis, including complement proteins (CFB, C3, C8G, C1QA, C1R, C6), orosomucoid proteins (ORM1, ORM2), ceruloplasmin (CP), alpha-1-B glycoprotein (A1BG), and others. These proteins play diverse roles in immune response, inflammation, and cell signaling, suggesting their relevance in lung cancer pathophysiology. CONCLUSION: Our findings suggest the potential of plasma proteins as early diagnostic biomarkers for lung cancer. Further validation in larger cohorts is needed to confirm their clinical utility. Integrating these biomarkers into existing diagnostic modalities could enhance early detection accuracy, leading to improved patient outcomes.
Subject(s)
Tumor Biomarkers, Blood Proteins, Lung Neoplasms, Humans, Lung Neoplasms/blood, Lung Neoplasms/diagnosis, Tumor Biomarkers/blood, Blood Proteins/analysis, Female, Male, Middle Aged, Tandem Mass Spectrometry, Aged, Early Detection of Cancer/methods, Proteomics/methods, Liquid Chromatography, Case-Control Studies, Adult
ABSTRACT
This research introduces a vascular phenotypic and proteomic analysis (VPT) platform designed to perform high-throughput experiments on vascular development. The VPT platform utilizes an open-channel configuration that facilitates angiogenesis by precise alignment of endothelial cells, allowing for a 3D morphological examination and protein analysis. We study the effects of antiangiogenic agents (bevacizumab, ramucirumab, cabozantinib, regorafenib, wortmannin, chloroquine, and paclitaxel) on cytoskeletal integrity and angiogenic sprouting, observing an approximately 50% reduction in sprouting at higher drug concentrations. Precise LC-MS/MS analyses reveal global protein expression changes in response to four of these drugs, providing insights into the signaling pathways related to the cell cycle, cytoskeleton, cellular senescence, and angiogenesis. Our findings emphasize the intricate relationship between cytoskeletal alterations and angiogenic responses, underlining the significance of integrating morphological and proteomic data for a comprehensive understanding of angiogenesis. The VPT platform not only advances our understanding of drug impacts on vascular biology but also offers a versatile tool for analyzing proteome and morphological features across various models beyond blood vessels.
Subject(s)
Angiogenesis Inhibitors, Human Umbilical Vein Endothelial Cells, Proteomics, Humans, Angiogenesis Inhibitors/pharmacology, Angiogenesis Inhibitors/chemistry, Human Umbilical Vein Endothelial Cells/metabolism, Human Umbilical Vein Endothelial Cells/drug effects, Phenotype, Physiologic Neovascularization/drug effects
ABSTRACT
MOTIVATION: For the past few decades, many statistical methods in genome-wide association studies (GWAS) have been developed to identify SNP-SNP interactions for case-control studies. However, there has been less work for prospective cohort studies involving survival time. Recently, Gui et al. (2011) proposed a novel method, called Surv-MDR, for detecting gene-gene interactions associated with survival time. Surv-MDR is an extension of the multifactor dimensionality reduction (MDR) method to the survival phenotype by using the log-rank test for defining a binary attribute. However, the Surv-MDR method has some drawbacks in the sense that it needs more intensive computations and does not allow for a covariate adjustment. In this article, we propose a new approach, called Cox-MDR, which is an extension of the generalized multifactor dimensionality reduction (GMDR) to the survival phenotype by using a martingale residual as a score to classify multi-level genotypes as high- and low-risk groups. The advantages of Cox-MDR over Surv-MDR are that it allows for the effects of discrete and quantitative covariates in the framework of the Cox regression model and requires less computation than Surv-MDR. RESULTS: Through simulation studies, we compared the power of Cox-MDR with those of Surv-MDR and the Cox regression model for various heritability and minor allele frequency combinations, with and without covariate adjustment. We found that Cox-MDR and the Cox regression model perform better than Surv-MDR for a low minor allele frequency of 0.2, but Surv-MDR has high power for a minor allele frequency of 0.4. However, when the effect of the covariate is adjusted for, Cox-MDR and the Cox regression model perform much better than Surv-MDR. We also compared the performance of Cox-MDR and Surv-MDR on a real dataset of leukemia patients to detect gene-gene interactions associated with survival time. CONTACT: leesy@sejong.ac.kr; tspark@snu.ac.kr.
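The idea of scoring subjects by a martingale residual and then labeling genotype cells can be sketched under simplifying assumptions. The example below uses residuals from a null model with no covariates (observed event indicator minus the Nelson-Aalen cumulative hazard at the subject's own time), whereas the abstract describes residuals from a Cox model that may include covariates. The rule "a cell is high risk when its residuals sum to a positive value" is an assumed GMDR-style convention, and all names and data are hypothetical.

```python
from collections import defaultdict

def null_martingale_residuals(times, events):
    """Martingale residuals under a covariate-free null model:
    residual_i = event_i - H(t_i), with H the Nelson-Aalen estimator."""
    event_times = sorted({t for t, e in zip(times, events) if e})
    cumhaz, running = [], 0.0
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        running += deaths / at_risk
        cumhaz.append((t, running))

    def hazard_at(t):
        value = 0.0
        for s, h in cumhaz:
            if s <= t:
                value = h
        return value

    return [e - hazard_at(t) for t, e in zip(times, events)]

def label_cells_by_residual(genotypes, residuals):
    """High risk if the residuals in a genotype cell sum to more than zero
    (more events than expected under the null); otherwise low risk."""
    sums = defaultdict(float)
    for g, r in zip(genotypes, residuals):
        sums[g] += r
    return {g: ("high" if s > 0 else "low") for g, s in sums.items()}

# Hypothetical follow-up data: event indicator 1 = death, 0 = censored.
times = [2, 3, 5, 7, 8, 9]
events = [1, 1, 1, 0, 0, 0]
genotypes = ["GG", "GG", "GG", "AA", "AA", "AA"]
residuals = null_martingale_residuals(times, events)
cell_labels = label_cells_by_residual(genotypes, residuals)
```

Because the residuals are computed once per dataset rather than once per candidate model, this kind of scoring avoids refitting a survival test for every genotype combination.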
Subject(s)
Multifactor Dimensionality Reduction/methods, Proportional Hazards Models, Algorithms, Female, Gene Frequency, Genome-Wide Association Study, Genotype, Humans, Acute Myeloid Leukemia/genetics, Acute Myeloid Leukemia/mortality, Male, Genetic Models, Phenotype, Single Nucleotide Polymorphism
ABSTRACT
BACKGROUND: Because common complex diseases are affected by multiple genes and environmental factors, it is essential to investigate gene-gene and/or gene-environment interactions to understand the genetic architecture of complex diseases. After the great success of large-scale genome-wide association (GWA) studies using high-density single nucleotide polymorphism (SNP) chips, the study of gene-gene interaction has become the next challenge. Multifactor dimensionality reduction (MDR) analysis has been widely used for gene-gene interaction analysis. In practice, however, it is not easy to perform high-order gene-gene interaction analyses via MDR at the genome-wide level because it requires exploring a huge search space and suffers from a computational burden due to high dimensionality. RESULTS: We propose a dimensional reduction approach, Gene-MDR analysis, for fast and efficient high-order gene-gene interaction analysis. The proposed Gene-MDR method is composed of two-step applications of MDR: within- and between-gene MDR analyses. First, within-gene MDR analysis summarizes each gene effect via MDR analysis by combining multiple SNPs from the same gene. Second, between-gene MDR analysis then performs interaction analysis using the summarized gene effects from within-gene MDR analysis. We apply the Gene-MDR method to bipolar disorder (BD) GWA data from the Wellcome Trust Case Control Consortium (WTCCC). The results demonstrate that Gene-MDR is capable of detecting high-order gene-gene interactions associated with BD. CONCLUSION: By reducing the dimension of genome-wide data from the SNP level to the gene level, Gene-MDR efficiently identifies high-order gene-gene interactions. Therefore, Gene-MDR can provide a key to understanding complex disease etiology.
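The two-step structure (within-gene MDR, then between-gene MDR on the summarized gene effects) can be sketched with the classic MDR cell-labeling rule, in which a genotype combination is high risk when its case-to-control ratio exceeds the overall ratio. This is an illustrative simplification, not the published software: the function name, the strict-inequality tie rule, and the toy genotypes are assumptions, and cross-validation and model selection are omitted.

```python
from collections import Counter

def mdr_reduce(genotype_rows, status):
    """Collapse multi-locus genotypes into one binary attribute.

    A cell is coded 1 (high risk) when cases > (overall case/control ratio)
    * controls, and 0 otherwise. Assumes at least one control overall.
    """
    n_case = sum(status)
    n_ctrl = len(status) - n_case
    threshold = n_case / n_ctrl
    cases, ctrls = Counter(), Counter()
    for g, y in zip(genotype_rows, status):
        (cases if y else ctrls)[tuple(g)] += 1
    high = {g for g in set(cases) | set(ctrls)
            if cases[g] > threshold * ctrls[g]}
    return [1 if tuple(g) in high else 0 for g in genotype_rows]

# Hypothetical data: 4 cases then 4 controls, two SNPs in each of two genes.
status = [1, 1, 1, 1, 0, 0, 0, 0]
gene_a_snps = [(0, 1), (0, 1), (1, 1), (0, 0), (0, 0), (0, 0), (1, 1), (0, 1)]
gene_b_snps = [(2, 0), (1, 0), (2, 0), (2, 0), (1, 0), (0, 0), (0, 0), (2, 0)]

# Step 1: within-gene MDR summarizes each gene as one binary variable.
gene_a = mdr_reduce(gene_a_snps, status)
gene_b = mdr_reduce(gene_b_snps, status)

# Step 2: between-gene MDR treats the summaries as the new "genotypes".
between = mdr_reduce(list(zip(gene_a, gene_b)), status)
```

The search space shrinks because the second step enumerates combinations of genes rather than combinations of individual SNPs.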
Subject(s)
Computational Biology/methods, Gene-Environment Interaction, Genome-Wide Association Study/methods, Multifactor Dimensionality Reduction/methods, Algorithms, Bipolar Disorder/genetics, Genetic Predisposition to Disease, Humans, Single Nucleotide Polymorphism
ABSTRACT
Disease biomarkers are predicted to be in low abundance; thus, the most crucial step of biomarker discovery is the efficient fractionation of clinical samples into protein sets that define disease stages and/or predict disease development. For this purpose, we developed a new platform that uses peptide-based size exclusion chromatography (pep-SEC) to quantify disease biomarker candidates. This new platform has many advantages over previously described biomarker profiling platforms, including short run time, high resolution, and good reproducibility, which make it suitable for large-scale analysis. We combined this platform with isotope labeling and label-free methods to identify and quantitate differentially expressed proteins in hepatocellular carcinoma (HCC) tissues. When we combined pep-SEC with a gas phase fractionation method, which broadens precursor ion selection, the protein coverage was significantly increased, which is critical for the global profiling of HCC specimens. Furthermore, pep-SEC-LC-MS/MS analysis enhanced the detection of low-abundance proteins (e.g. insulin receptor substrate 2 and carboxylesterase 1) and glycopeptides in HCC plasma. Thus, our pep-SEC platform is an efficient and versatile pre-fractionation system for the large-scale profiling and quantitation of candidate biomarkers in complex disease proteomes.
Subject(s)
Tumor Biomarkers/analysis, Hepatocellular Carcinoma/chemistry, Gel Chromatography/methods, Liver Neoplasms/chemistry, Peptide Fragments/analysis, Amino Acid Sequence, Tumor Biomarkers/blood, Hepatocellular Carcinoma/blood, Hepatocellular Carcinoma/metabolism, Liquid Chromatography, Glycopeptides/analysis, Humans, Liver/chemistry, Liver Neoplasms/blood, Liver Neoplasms/metabolism, Molecular Sequence Data, Neoplasm Proteins/analysis, Neoplasm Proteins/metabolism, Peptide Fragments/metabolism, Peptide Mapping, Proteomics, Tandem Mass Spectrometry
ABSTRACT
BACKGROUND: Quantification of protein expression by means of mass spectrometry (MS) has been introduced in various proteomics studies. In particular, two label-free quantification methods, spectral counting and spectral feature analysis, have been extensively investigated in a wide variety of proteomic studies. The cornerstone of both methods is peptide identification based on a proteomic database search and subsequent estimation of peptide retention time. However, they often suffer from restrictive database search and inaccurate estimation of the liquid chromatography (LC) retention time. Furthermore, conventional peptide identification methods based on spectral library search algorithms such as SEQUEST or SpectraST have been found to provide neither the best match nor high-scored matches. Lastly, these methods are limited in the sense that target peptides cannot be identified unless they have been previously generated and stored in the database or spectral libraries. To overcome these limitations, we propose a novel method, namely Quantification method based on Finding the Identical Spectral set for a Homogenous peptide (Q-FISH), to estimate a peptide's abundance from its tandem mass spectrometry (MS/MS) spectra through the direct comparison of experimental spectra. Intuitively, our Q-FISH method compares all possible pairs of experimental spectra in order to identify both known and novel proteins, significantly enhancing identification accuracy by grouping replicated spectra from the same peptide targets. RESULTS: We applied Q-FISH to Nano-LC-MS/MS data obtained from human hepatocellular carcinoma (HCC) and normal liver tissue samples to identify differentially expressed peptides between the normal and disease samples. For a total of 44,318 spectra obtained through MS/MS analysis, Q-FISH yielded 14,747 clusters. Among these, 5,777 clusters were identified only in the HCC sample, 6,648 clusters only in the normal tissue sample, and 2,323 clusters in both the HCC and normal tissue samples. While it will be interesting to investigate peptide clusters found in only one sample, we further examined the spectral clusters identified in both the HCC and normal samples, since our goal is to identify and assess differentially expressed peptides quantitatively. The next step was to perform a beta-binomial test to isolate differentially expressed peptides between the HCC and normal tissue samples. This test resulted in 84 peptides with significantly differential spectral counts between the HCC and normal tissue samples. We independently identified 50 and 95 peptides by SEQUEST, of which 24 and 56 peptides, respectively, were found to be known biomarkers for human liver cancer. Comparing the Q-FISH and SEQUEST results, we found that 22 of the 84 differentially expressed peptides identified by Q-FISH were also identified by SEQUEST. Remarkably, of these 22 peptides discovered by both Q-FISH and SEQUEST, 13 peptides are known for human liver cancer and the remaining 9 peptides are known to be associated with other cancers. CONCLUSIONS: We proposed a novel statistical method, Q-FISH, for accurately identifying protein species and simultaneously quantifying the expression levels of identified peptides from mass spectrometry data. Q-FISH analysis of human HCC and liver tissue samples identified many protein biomarkers that are highly relevant to HCC. Q-FISH can be a useful tool for both peptide identification and quantification in mass spectrometry data analysis. It may also prove to be more effective in discovering novel protein biomarkers than SEQUEST and other standard methods.
Subject(s)
Peptides/analysis, Proteomics/methods, Algorithms, Hepatocellular Carcinoma/metabolism, Liquid Chromatography, Cluster Analysis, Humans, Liver Neoplasms/metabolism, Software, Tandem Mass Spectrometry/methods
ABSTRACT
Although cell lineage information is fundamental to understanding organismal development, very little direct information is available for humans. We performed high-depth (250×) whole-genome sequencing of multiple tissues from three individuals to identify hundreds of somatic single-nucleotide variants (sSNVs). Using these variants as "endogenous barcodes" in single cells, we reconstructed early embryonic cell divisions. Targeted sequencing of clonal sSNVs in different organs (about 25,000×) and in more than 1000 cortical single cells, as well as single-nucleus RNA sequencing and single-nucleus assay for transposase-accessible chromatin sequencing of ~100,000 cortical single cells, demonstrated asymmetric contributions of early progenitors to extraembryonic tissues, distinct germ layers, and organs. Our data suggest onset of gastrulation at an effective progenitor pool of about 170 cells and about 50 to 100 founders for the forebrain. Thus, mosaic mutations provide a permanent record of human embryonic development at very high resolution.
Subject(s)
Cell Lineage, Gastrulation, Mutation, Neural Stem Cells/cytology, Prosencephalon/cytology, Adolescent, Adult, Cell Division, Clone Cells/cytology, Embryonic Development/genetics, Female, Gastrula/cytology, Genetic Variation, Germ Layers/cytology, Humans, Male, Neurons/cytology, Organogenesis, Single Nucleotide Polymorphism, Prosencephalon/embryology, Single-Cell Analysis, Whole Genome Sequencing
ABSTRACT
We characterize the landscape of somatic mutations (mutations occurring after fertilization) in the human brain using ultra-deep (~250×) whole-genome sequencing of prefrontal cortex from 59 donors with autism spectrum disorder (ASD) and 15 control donors. We observe a mean of 26 somatic single-nucleotide variants per brain present in ≥4% of cells, with enrichment of mutations in coding and putative regulatory regions. Our analysis reveals that the first cell division after fertilization produces ~3.4 mutations, followed by 2-3 mutations in subsequent generations. This suggests that a typical individual possesses ~80 somatic single-nucleotide variants present in ≥2% of cells (comparable to the number of de novo germline mutations per generation), with about half of individuals having at least one potentially function-altering somatic mutation somewhere in the cortex. ASD brains show an excess of somatic mutations in neural enhancer sequences compared with controls, suggesting that mosaic enhancer mutations may contribute to ASD risk.
Subject(s)
Autism Spectrum Disorder/pathology, Prefrontal Cortex/pathology, Cell Division/genetics, Chromatin/genetics, Embryonic Development/genetics, Genetic Epigenesis, Exons, Female, Gene Regulatory Networks/genetics, Genetic Predisposition to Disease, Human Genome/genetics, Germ-Line Mutation/genetics, High-Throughput Nucleotide Sequencing, Humans, Single Nucleotide Polymorphism, Pregnancy, Whole Genome Sequencing
ABSTRACT
The Asia Oceania Human Proteome Organisation (AOHUPO) has embarked on a Membrane Proteomics Initiative with goals of systematic comparison of strategies for analysis of membrane proteomes and discovery of membrane proteins. This multilaboratory project is based on the analysis of a subcellular fraction from mouse liver that contains endoplasmic reticulum and other organelles. In this study, we present the strategy used for the preparation and initial characterization of the membrane sample, including validation that the carbonate-washing step enriches for integral and lipid-anchored membrane proteins. Analysis of 17 independent data sets from five types of proteomic workflows is in progress.
Subject(s)
Cell Membrane/chemistry, Intracellular Membranes/chemistry, Membrane Proteins/chemistry, Proteome, Proteomics/standards, Animals, Asia, Carbonates, Humans, Membrane Proteins/standards, Mice, Oceania, Organizations, Proteomics/methods
ABSTRACT
MOTIVATION: Gene-gene interactions are important contributors to complex biological traits. Multifactor dimensionality reduction (MDR) is a method to analyze gene-gene interactions and has been applied to many genetic studies of complex diseases. In order to identify the best interaction model associated with disease susceptibility, MDR classifiers corresponding to interaction models have been constructed and evaluated as predictors of disease status via a measure such as balanced accuracy (BA). It has been shown that the performance of MDR tends to depend on the choice of the evaluation measure. RESULTS: In this article, we introduce two types of new evaluation measures. First, we develop weighted BA (wBA), which utilizes the quantitative information on the effect size of each multi-locus genotype on a trait. Second, we employ ordinal association measures to assess the performance of MDR classifiers. Simulation studies were conducted to compare the proposed measures with BA, the current measure. Our results showed that wBA and tau-b improved the power of MDR in detecting gene-gene interactions. Notably, the power increment was higher when the data contained a greater number of genetic markers. Finally, we applied the proposed evaluation measures to real data.
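Two of the evaluation measures named in this abstract can be computed directly from the 2×2 table that cross-classifies predicted risk group against disease status. The sketch below covers balanced accuracy and Kendall's tau-b, which for a 2×2 table reduces to the phi coefficient. The weighted BA is not reproduced because the abstract does not give its formula. Function names and counts are hypothetical.

```python
import math

def balanced_accuracy(tp, fn, fp, tn):
    """Mean of sensitivity and specificity."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2

def kendall_tau_b_2x2(tp, fn, fp, tn):
    """Kendall's tau-b for a 2x2 table.

    Concordant pairs = tp * tn, discordant pairs = fp * fn; the tie
    corrections reduce the denominator to the square root of the product
    of the four marginal totals.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fn) * (fp + tn) * (tp + fp) * (fn + tn))
    return numerator / denominator

# Hypothetical counts: cases predicted high/low risk, controls high/low risk.
tp, fn, fp, tn = 40, 10, 20, 30
ba = balanced_accuracy(tp, fn, fp, tn)
tau_b = kendall_tau_b_2x2(tp, fn, fp, tn)
```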
Subject(s)
Genetic Predisposition to Disease, Genotype, Computer Simulation, Gene Expression, Gene Frequency, Genetic Markers
ABSTRACT
Detection of mosaic mutations that arise in normal development is challenging, as such mutations are typically present in only a minute fraction of cells and there is no clear matched control for removing germline variants and systematic artifacts. We present MosaicForecast, a machine-learning method that leverages read-based phasing and read-level features to accurately detect mosaic single-nucleotide variants and indels, achieving a multifold increase in specificity compared with existing algorithms. Using single-cell sequencing and targeted sequencing, we validated 80-90% of the mosaic single-nucleotide variants and 60-80% of indels detected in human brain whole-genome sequencing data. Our method should help elucidate the contribution of mosaic somatic mutations to the origin and development of disease.
Subject(s)
INDEL Mutation, Single Nucleotide Polymorphism, Single-Cell Analysis/methods, Whole Genome Sequencing/methods, Brain Chemistry, Germ-Line Mutation, Humans, Machine Learning, Mosaicism, Software
ABSTRACT
Reversible phosphorylation of proteins is the most common PTM in cell-signaling pathways. Despite this, high-throughput methods for the systematic detection, identification, and quantification of phosphorylated peptides have yet to be developed. In this paper, we describe the establishment of an efficient online titanium dioxide (TiO2)-based 3-D LC (strong cationic exchange/TiO2/C18)-MS³-linear ion trap system, which provides fully automatic and highly efficient identification of phosphorylation sites in complex peptide mixtures. Using this system, low-abundance phosphopeptides were isolated from cell lines, plasma, and tissue of healthy and hepatocellular carcinoma (HCC) patients. Furthermore, the phosphorylation sites were identified and the differences in phosphorylation levels between healthy and HCC patient specimens were quantified by labeling the phosphopeptides with isotopic analogs of amino acids (stable isotope labeling with amino acids in cell culture for HepG2 cells) or water (H₂¹⁸O for tissues and plasma). Two examples of potential HCC phospho-biomarkers, plectin-1 (phospho-Ser-4253) and alpha-HS-glycoprotein (phospho-Ser-138 and -312), were identified by this analysis. Our results suggest that this comprehensive TiO2-based online 3-D LC-MS³-linear ion trap system with high-throughput potential will be useful for the global profiling and quantification of the phosphoproteome and the identification of disease biomarkers.
Subject(s)
Tumor Biomarkers/analysis, Hepatocellular Carcinoma/chemistry, Liver Neoplasms/chemistry, Phosphopeptides/analysis, Acetates/chemistry, Amino Acid Sequence, Tumor Biomarkers/blood, Tumor Biomarkers/isolation & purification, Hepatocellular Carcinoma/blood, Ion Exchange Chromatography/methods, Humans, Isotope Labeling/methods, Liver Neoplasms/blood, Mass Spectrometry/methods, Molecular Sequence Data, Phosphopeptides/blood, Phosphopeptides/isolation & purification, Proteome/analysis, Reproducibility of Results, Titanium/chemistry
ABSTRACT
We have developed a proteome database (DB), BiomarkerDigger (http://biomarkerdigger.org), that automates data analysis, searching, and metadata-gathering functions. The metadata-gathering function searches proteome DBs for protein-protein interaction, Gene Ontology, protein domain, Online Mendelian Inheritance in Man, and tissue expression profile information and integrates it into protein data sets that are accessed through a search function in BiomarkerDigger. This DB also facilitates cross-proteome comparisons by classifying proteins based on their annotation. BiomarkerDigger highlights relationships between a given protein in a proteomic data set and any known biomarkers or biomarker candidates. The newly developed BiomarkerDigger system is useful for multi-level synthesis, comparison, and analysis of data sets obtained from currently available web sources. We demonstrate the application of this resource to the identification of a serological biomarker for hepatocellular carcinoma by comparison of plasma and tissue proteomic data sets from healthy volunteers and cancer patients.
Subject(s)
Tumor Biomarkers/analysis, Protein Databases, Plasma Cell Neoplasms/metabolism, Proteomics/methods, Software, Computational Biology/methods, Humans, Theoretical Models, User-Computer Interface
ABSTRACT
UNLABELLED: The pharmaceutical industry has been striving to reduce the costs of drug development and increase productivity. Among the many different attempts, drug repositioning (retargeting existing drugs) has come into the spotlight because of its financial efficiency. We introduce IDMap, which predicts novel relationships between targets and chemicals and is thus capable of repositioning marketed drugs by using text mining and chemical structure information. Also capable of mapping commercial chemicals to possible drug targets and vice versa, IDMap creates a convenient environment for identifying potential leads and their targets, especially in the field of drug repositioning. AVAILABILITY: The IDMap executable and its user manual, including color images, are freely available to non-commercial users at http://www.equispharm.com/idmap
Subject(s)
Artificial Intelligence, Database Management Systems, Factual Databases, Drug Design, Drug Therapy/methods, Pharmaceutical Preparations/chemistry, Software, Information Storage and Retrieval/methods
ABSTRACT
Whole-genome sequencing of DNA from single cells has the potential to reshape our understanding of mutational heterogeneity in normal and diseased tissues. However, a major difficulty is distinguishing amplification artifacts from biologically derived somatic mutations. Here, we describe linked-read analysis (LiRA), a method that accurately identifies somatic single-nucleotide variants (sSNVs) by using read-level phasing with nearby germline heterozygous polymorphisms, thereby enabling the characterization of mutational signatures and estimation of somatic mutation rates in single cells.
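The read-level phasing idea can be illustrated with a toy linkage check between a candidate variant and a nearby germline heterozygous site. This is a loose sketch, not the LiRA algorithm: the decision rule (all variant-supporting reads share one germline allele, and no read on that haplotype shows the reference base at the candidate site), the labels, and the input representation are assumptions introduced here for illustration.

```python
from collections import Counter

def linkage_call(read_pairs):
    """Classify a candidate variant from reads spanning both sites.

    read_pairs: iterable of (germline_allele, candidate_allele) tuples, where
    candidate_allele is "ref" or "alt". Returns "linked", "artifact_like",
    or "no_support" under the simplified rule described above.
    """
    counts = Counter(read_pairs)
    alt_haplotypes = {hap for (hap, cand) in counts if cand == "alt"}
    if not alt_haplotypes:
        return "no_support"
    if len(alt_haplotypes) > 1:
        # The variant appears on both parental haplotypes.
        return "artifact_like"
    hap = next(iter(alt_haplotypes))
    if counts[(hap, "ref")] > 0:
        # Same haplotype shows both alleles at the candidate site.
        return "artifact_like"
    return "linked"

# Hypothetical spanning reads; "A" and "G" are the germline SNP alleles.
consistent = [("A", "alt")] * 6 + [("G", "ref")] * 5
discordant = [("A", "alt")] * 3 + [("A", "ref")] * 4 + [("G", "ref")] * 5
call_consistent = linkage_call(consistent)
call_discordant = linkage_call(discordant)
```

A production caller would also need to handle sequencing errors, read depth thresholds, and sites with no nearby heterozygous polymorphism, none of which are modeled here.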