Results 1 - 20 of 69
1.
Sci Rep ; 14(1): 16537, 2024 07 17.
Article in English | MEDLINE | ID: mdl-39019929

ABSTRACT

Long noncoding RNAs (lncRNAs) are RNA molecules longer than 200 nucleotides that do not code for functional proteins. Although genes play a vital role in the immune response against disease, it is less well known that lncRNAs also contribute through gene regulation. Bovine tuberculosis is a significant zoonotic disease caused by Mycobacterium bovis (M. bovis) in cattle. Here, we report the in-silico analysis of publicly available transcriptomic data from calves infected with M. bovis. A total of 51,812 lncRNAs were extracted across all samples. A total of 216 genes and 260 lncRNAs were differentially expressed across the 4 comparisons: infected vs uninfected at 8 and at 20 weeks post-infection (WPI), and 8 vs 20 WPI within the infected and within the uninfected animals. Gene Ontology (GO) and functional annotation showed that 8 differentially expressed genes (DEGs) were annotated with immune system GO terms and 2 DEGs with REACTOME immune system pathways. Co-expression analysis of differentially expressed lncRNAs (DElncRNAs) with DEGs revealed the involvement of lncRNAs with genes annotated with immune-related GO terms and pathways. Overall, our study sheds light on the dynamic transcriptomic changes in response to M. bovis infection, particularly highlighting the involvement of lncRNAs with immune-related genes. The identified immune pathways and gene-lncRNA interactions offer valuable insights for further research into host-pathogen interactions and potential avenues for genetic improvement strategies in cattle.


Subject(s)
Mycobacterium bovis , RNA, Long Noncoding , Transcriptome , Tuberculosis, Bovine , Animals , Cattle , RNA, Long Noncoding/genetics , Tuberculosis, Bovine/genetics , Tuberculosis, Bovine/immunology , Tuberculosis, Bovine/microbiology , Computer Simulation , Gene Expression Profiling , Gene Ontology , Gene Regulatory Networks , Gene Expression Regulation
2.
Cureus ; 16(5): e60044, 2024 May.
Article in English | MEDLINE | ID: mdl-38854210

ABSTRACT

Background: Clinical trial matching, essential for advancing medical research, involves detailed screening of potential participants to ensure alignment with specific trial requirements. Research staff face challenges due to the high volume of eligible patients and the complexity of varying eligibility criteria. The traditional manual process, both time-consuming and error-prone, often leads to missed opportunities. Recently, large language models (LLMs), specifically generative pre-trained transformers (GPTs), have become impressive and impactful tools. Utilizing such tools from artificial intelligence (AI) and natural language processing (NLP) may enhance the accuracy and efficiency of this process through automated patient screening against established criteria. Methods: We used 202 longitudinal patient records from the National NLP Clinical Challenges (n2c2) 2018 Challenge. These records were annotated by medical professionals and evaluated against 13 selection criteria encompassing various health assessments. Our approach involved embedding medical documents into a vector database to determine relevant document sections and then using an LLM (OpenAI's GPT-3.5 Turbo and GPT-4) in tandem with structured and chain-of-thought prompting techniques for systematic document assessment against the criteria. Misclassified criteria were also examined to identify classification challenges. Results: This study achieved an accuracy of 0.81, sensitivity of 0.80, specificity of 0.82, and a micro F1 score of 0.79 using GPT-3.5 Turbo, and an accuracy of 0.87, sensitivity of 0.85, specificity of 0.89, and a micro F1 score of 0.86 using GPT-4. Notably, some criteria in the ground truth appeared mislabeled, an issue we could not explore further due to insufficient label generation guidelines on the website. Conclusion: Our findings underscore the potential of AI and NLP technologies, including LLMs, in the clinical trial matching process. The study demonstrated strong capabilities in identifying eligible patients and minimizing false inclusions. Such automated systems promise to alleviate the workload of research staff and improve clinical trial enrollment, thus accelerating the process and enhancing the overall feasibility of clinical research. Further work is needed to determine the potential of this approach when implemented on real clinical data.
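
As a reading aid for the metrics reported above, the following is a minimal sketch of how accuracy, sensitivity, specificity, and F1 can be derived from pooled binary decisions (criterion met or not met). The labels are invented for illustration and are not n2c2 data, the function names are hypothetical, and computing F1 from counts pooled across criteria is only one common reading of "micro F1"; the study's own evaluation code may differ.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int = 0  # criterion met, predicted met
    fp: int = 0  # criterion not met, predicted met
    tn: int = 0  # criterion not met, predicted not met
    fn: int = 0  # criterion met, predicted not met


def tally(y_true, y_pred):
    """Pool binary decisions (1 = met, 0 = not met) into one confusion matrix."""
    c = Confusion()
    for t, p in zip(y_true, y_pred):
        if t and p:
            c.tp += 1
        elif p:
            c.fp += 1
        elif t:
            c.fn += 1
        else:
            c.tn += 1
    return c


def metrics(c):
    """Return accuracy, sensitivity, specificity and F1 from pooled counts."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": (c.tp + c.tn) / total,
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "f1": 2 * c.tp / (2 * c.tp + c.fp + c.fn),
    }


# Hypothetical labels, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
print(metrics(tally(y_true, y_pred)))
```

High specificity corresponds to the "minimizing false inclusions" claim in the abstract, since it falls as false positives rise.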

3.
Environ Sci Pollut Res Int ; 31(24): 35173-35193, 2024 May.
Article in English | MEDLINE | ID: mdl-38722519

ABSTRACT

Nowadays, concurrent attention to economic development and ecological issues is becoming an important trend. In this paper, we measure the eco-efficiency of 285 Chinese cities from 2003 to 2019 using a non-radial directional distance function and the data envelopment analysis method, based on which we analyze the club convergence of cities' eco-efficiency using the logt test; we estimate the impact of open public data platforms on eco-efficiency and its convergence using a multi-period difference in difference model and panel-ordered logit model, respectively. We find that, first, open public data platforms improve cities' eco-efficiency by about 6.5%, and the impact mechanisms include scale efficiency, technical efficiency, and total factor productivity, or, at the micro level, increasing the economic agglomeration degree, boosting the amount of foreign investment used, and increasing green innovation level. Second, there are three convergence clubs of eco-efficiency in China's cities, whose average eco-efficiency trends are above, close to, and below average, respectively. Third, public data platforms significantly increase the probability of cities belonging to the convergence clubs of high and medium eco-efficiency (Clubs 1 and 2) and decrease the probability of belonging to the low one (Club 3). However, the mechanisms only include technical efficiency and total factor productivity, or the amount of foreign investment used and the green innovation level at the micro level.
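
The difference-in-differences idea mentioned above can be illustrated with a two-group, two-period sketch. This is a deliberate simplification: the paper uses a multi-period model with staggered adoption, and the eco-efficiency scores below are invented placeholders rather than values from the study.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-group, two-period difference-in-differences on group means.

    The change in the control group stands in for what would have happened
    to the treated group without the policy (parallel-trends assumption).
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical eco-efficiency scores for cities with and without a platform.
effect = did_estimate(
    treated_pre=[0.50, 0.54],
    treated_post=[0.60, 0.64],
    control_pre=[0.40, 0.44],
    control_post=[0.45, 0.49],
)
print(round(effect, 3))
```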


Subject(s)
Cities , China , Economic Development , Conservation of Natural Resources , Ecology
4.
Cancer Res Treat ; 2024 Apr 30.
Article in English | MEDLINE | ID: mdl-38697846

ABSTRACT

This paper provides a comprehensive overview of the Cancer Public Library Database (CPLD), established under the Korean Clinical Data Utilization for Research Excellence project (K-CURE). The CPLD links data from four major population-based public sources: the Korea National Cancer Incidence Database in the Korea Central Cancer Registry, cause-of-death data in Statistics Korea, the National Health Information Database in the National Health Insurance Service, and the National Health Insurance Research Database in the Health Insurance Review & Assessment Service. These databases are linked using an encrypted resident registration number. The CPLD, established in 2022 and updated annually, comprises 1,983,499 men and women newly diagnosed with cancer between 2012 and 2019. It contains data on cancer registration and death, demographics, medical claims, general health checkups, and national cancer screening. The most common cancers among men in the CPLD were stomach (16.1%), lung (14.0%), colorectal (13.3%), prostate (9.6%), and liver (9.3%) cancers. The most common cancers among women were thyroid (20.4%), breast (16.6%), colorectal (9.0%), stomach (7.8%), and lung (6.2%) cancers. Among them, 571,285 died between 2012 and 2020 owing to cancer (89.2%) or other causes (10.8%). Upon approval, the CPLD is accessible to researchers through the K-CURE portal. The CPLD is a unique resource for diverse cancer research to investigate medical use before a cancer diagnosis, during initial diagnosis and treatment, and long-term follow-up. This offers expanded insight into healthcare delivery across the cancer continuum, from screening to end-of-life care.

5.
Elife ; 13, 2024 Jun 03.
Article in English | MEDLINE | ID: mdl-38742735

ABSTRACT

Transcriptomic profiling has become a standard approach to quantify a cell state, which has led to the accumulation of a huge number of public gene expression datasets. However, both reuse of these datasets and analysis of newly generated ones require significant technical expertise. Here, we present Phantasus: a user-friendly web application for interactive gene expression analysis that provides streamlined access to more than 96,000 public gene expression datasets and also allows analysis of user-uploaded datasets. Phantasus integrates an intuitive and highly interactive JavaScript-based heatmap interface with the ability to run sophisticated R-based analysis methods. Overall, Phantasus allows users to go all the way from loading, normalizing, and filtering data to differential gene expression and downstream analysis. Phantasus can be accessed online at https://alserglab.wustl.edu/phantasus or can be installed locally from Bioconductor (https://bioconductor.org/packages/phantasus). Phantasus source code is available at https://github.com/ctlab/phantasus under an MIT license.


Subject(s)
Gene Expression Profiling , Internet , Software , Gene Expression Profiling/methods , Computational Biology/methods , Humans
6.
JMIR Public Health Surveill ; 10: e53330, 2024 Jul 02.
Article in English | MEDLINE | ID: mdl-38666756

ABSTRACT

BACKGROUND: The prevalence of type 2 diabetes mellitus (DM) and pre-diabetes mellitus (pre-DM) has been increasing among youth in recent decades in the United States, prompting an urgent need for understanding and identifying their associated risk factors. Such efforts, however, have been hindered by the lack of easily accessible youth pre-DM/DM data. OBJECTIVE: We aimed to first build a high-quality, comprehensive epidemiological data set focused on youth pre-DM/DM. Subsequently, we aimed to make these data accessible by creating a user-friendly web portal to share them and the corresponding codes. Through this, we hope to address this significant gap and facilitate youth pre-DM/DM research. METHODS: Building on data from the National Health and Nutrition Examination Survey (NHANES) from 1999 to 2018, we cleaned and harmonized hundreds of variables relevant to pre-DM/DM (fasting plasma glucose level ≥100 mg/dL or glycated hemoglobin ≥5.7%) for youth aged 12-19 years (N=15,149). We identified individual factors associated with pre-DM/DM risk using bivariate statistical analyses and predicted pre-DM/DM status using our Ensemble Integration (EI) framework for multidomain machine learning. We then developed a user-friendly web portal named Prediabetes/diabetes in youth Online Dashboard (POND) to share the data and codes. RESULTS: We extracted 95 variables potentially relevant to pre-DM/DM risk organized into 4 domains (sociodemographic, health status, diet, and other lifestyle behaviors). The bivariate analyses identified 27 significant correlates of pre-DM/DM (P<.001, Bonferroni adjusted), including race or ethnicity, health insurance, BMI, added sugar intake, and screen time. Among these factors, 16 were also identified by the EI methodology (Fisher P of overlap=7.06×10⁻⁶). In addition, the EI approach identified 11 further predictive variables, including some known factors (eg, meat and fruit intake and family income) and some less recognized ones (eg, number of rooms in homes). The factors identified in both analyses spanned all 4 of the domains mentioned. These data and results, as well as other exploratory tools, can be accessed on POND. CONCLUSIONS: Using NHANES data, we built one of the largest public epidemiological data sets for studying youth pre-DM/DM and identified potential risk factors using complementary analytical approaches. Our results align with the multifactorial nature of pre-DM/DM with correlates across several domains. Also, our data-sharing platform, POND, facilitates a wide range of applications to inform future youth pre-DM/DM studies.
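
The Bonferroni adjustment named in the results can be sketched as follows. The variable names and raw P values are hypothetical, and the abstract does not state how many tests the correction was applied over, so the count used here is simply the number of entries supplied.

```python
def bonferroni(p_values, alpha=0.001):
    """Bonferroni adjustment: multiply each raw P value by the number of
    tests (capped at 1) and compare the adjusted value with alpha."""
    m = len(p_values)
    adjusted = {}
    for name, p in p_values.items():
        p_adj = min(1.0, p * m)
        adjusted[name] = (p_adj, p_adj < alpha)
    return adjusted


# Hypothetical raw P values, for illustration only.
raw = {"bmi": 1e-6, "screen_time": 2e-4, "variable_c": 0.3, "variable_d": 0.001}
for name, (p_adj, significant) in bonferroni(raw).items():
    print(name, p_adj, significant)
```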


Subject(s)
Diabetes Mellitus, Type 2 , Internet , Nutrition Surveys , Humans , Adolescent , Child , Female , Male , Diabetes Mellitus, Type 2/epidemiology , United States/epidemiology , Young Adult , Prediabetic State/epidemiology , Risk Factors , Datasets as Topic , Prevalence
7.
Article in English | MEDLINE | ID: mdl-38550554

ABSTRACT

Introduction: Patient selection remains challenging as the clinical use of re-irradiation (re-RT) increases. Re-RT data is limited to retrospective studies and small prospective single-institution reports, resulting in small, heterogenous data sets. Validated prognostic and predictive biomarkers are derived from large-volume studies with long-term follow-up. This review aims to examine existing re-RT publications and available data sets and discuss strategies using artificial intelligence (AI) to approach small data sets to optimize the use of re-RT data. Methods: Re-RT publications were identified where associated public data was present. The existing literature on small data sets to identify biomarkers was also explored. Results: Publications with associated public data were identified, with glioma and nasopharyngeal cancers emerging as the most common tumor sites where the use of re-RT was the primary management approach. Existing and emerging AI strategies have been used to approach small data sets including data generation, augmentation, discovery, and transfer learning. Conclusions: Further data is needed to generate adaptive frameworks, improve the collection of specimens for molecular analysis, and improve the interpretability of results in re-RT data.

8.
Proteomics ; : e2400005, 2024 Mar 31.
Article in English | MEDLINE | ID: mdl-38556628

ABSTRACT

We here present a chatbot assistant infrastructure (https://www.ebi.ac.uk/pride/chatbot/) that simplifies user interactions with the PRIDE database's documentation and dataset search functionality. The framework utilizes multiple Large Language Models (LLM): llama2, chatglm, mixtral (mistral), and openhermes. It also includes a web service API (Application Programming Interface), web interface, and components for indexing and managing vector databases. An Elo-ranking system-based benchmark component is included in the framework as well, which allows for evaluating the performance of each LLM and for improving PRIDE documentation. The chatbot not only allows users to interact with PRIDE documentation but can also be used to search and find PRIDE datasets using an LLM-based recommendation system, enabling dataset discoverability. Importantly, while our infrastructure is exemplified through its application in the PRIDE database context, the modular and adaptable nature of our approach positions it as a valuable tool for improving user experiences across a spectrum of bioinformatics and proteomics tools and resources, among other domains. The integration of advanced LLMs, innovative vector-based construction, the benchmarking framework, and optimized documentation collectively form a robust and transferable chatbot assistant infrastructure. The framework is open-source (https://github.com/PRIDE-Archive/pride-chatbot).
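
To make the Elo-ranking benchmark concrete, here is a minimal sketch of the standard Elo update applied to pairwise comparisons between models. The starting rating, the K-factor of 32, and the comparison outcomes are assumptions chosen for illustration; the PRIDE chatbot's actual benchmark component may use different settings.

```python
def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a, rating_b, score_a, k=32.0):
    """Return both new ratings after one comparison.

    score_a is 1.0 if A's answer was preferred, 0.0 if B's was, 0.5 for a tie.
    """
    e_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - e_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - e_a))
    return new_a, new_b


# Hypothetical outcomes of pairwise answer comparisons: (model A, model B, score for A).
ratings = {"llama2": 1000.0, "mixtral": 1000.0}
for a, b, score_a in [("mixtral", "llama2", 1.0), ("mixtral", "llama2", 0.5)]:
    ratings[a], ratings[b] = update(ratings[a], ratings[b], score_a)
print(ratings)
```

Because each update moves the two ratings by equal and opposite amounts, the sum of ratings stays constant across comparisons.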

9.
Zhonghua Gan Zang Bing Za Zhi ; 31(7): 716-722, 2023 Jul 20.
Article in Chinese | MEDLINE | ID: mdl-37580254

ABSTRACT

Objective: To analyze the expression levels of the F9 gene and F9 protein in hepatocellular carcinoma by combining multiple gene chip data, real-time fluorescence quantitative PCR (RT qPCR), and immunohistochemistry. Additionally, explore their correlation with the occurrence and development of hepatocellular carcinoma, as well as with various clinical indicators and prognosis. Methods: The mRNA microarray dataset from the GEO database was analyzed to identify the F9 gene with significant expression differences associated with hepatocellular carcinoma. Liver cancer and adjacent tissues were collected from 18 cases of hepatocellular carcinoma. RT-qPCR method was used to detect the F9 gene expression level. Immunohistochemistry was used to detect the F9 protein level. Combined with the TCGA database information, the correlation between F9 gene expression level and prognostic and clinicopathological parameters was analyzed. The biological function of F9 co-expressed genes associated with hepatocellular carcinoma was analyzed by the Gene Ontology (GO) and the Kyoto Encyclopedia of Genes and Genomes (KEGG). Statistical analysis was performed using Graphpad Prism software. Results: Meta-analysis results showed that the expression of the F9 gene was lower in HCC tissues than in non-cancerous tissues. Immunohistochemistry results were basically consistent with those of RT-qPCR. The data obtained from TCGA showed that the F9 gene had lower expression values in stages III-IV, T3-T4, and patients with vascular invasion. A total of 127 genes were selected for bioinformatics analysis as co-expressed genes of F9, which were highly enriched in redox processes and metabolic pathways. Conclusion: This study validates that the F9 gene and F9 protein are lower in HCC. The down-regulation of the F9 gene predicts adverse outcomes, which may provide a new therapeutic target for HCC.


Subject(s)
Carcinoma, Hepatocellular , Liver Neoplasms , Humans , Carcinoma, Hepatocellular/pathology , Liver Neoplasms/pathology , Down-Regulation , Prognosis , Gene Expression , Gene Expression Regulation, Neoplastic
10.
Cancers (Basel) ; 15(13), 2023 Jun 27.
Article in English | MEDLINE | ID: mdl-37444480

ABSTRACT

Death is a crucial outcome in retrospective cohort studies, serving as a criterion for analyzing mortality in a database. This study aimed to assess the quality of extracted death data and investigate the potential of the final-administered medication as a variable to quantify accuracy for the validation dataset. Electronic health records from both an in-hospital system and the Korean Central Cancer Registry were used for this study. The gold standard was established by examining the differences between the dates of in-hospital deaths and cancer-registered deaths. Cosine similarity was employed to quantify the final-administered medication similarities between the gold standard and other cohorts. The gold standard was defined as patients who died in the hospital after 2006 and whose final hospital visit/discharge date and death date differed by 0 or 1 day. For all three criteria, (a) cancer stage, (b) cancer type, and (c) type of final visit, there was a positive correlation between mortality rates and the similarities of the final-administered medication. This study introduces a measure that can provide additional accurate information regarding death and differentiate the reliability of the dataset.
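
A minimal sketch of the cosine similarity step follows, assuming each cohort is summarized as a count of final-administered medications. The drug names and counts are placeholders, and the study's actual vector construction is not described in the abstract.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors stored as mappings."""
    keys = set(a) | set(b)
    dot = sum(a.get(k, 0) * b.get(k, 0) for k in keys)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty cohort has no defined direction
    return dot / (norm_a * norm_b)


# Hypothetical final-administered medication counts per cohort.
gold_standard = Counter({"drug_a": 40, "drug_b": 25, "drug_c": 5})
other_cohort = Counter({"drug_a": 10, "drug_b": 8, "drug_d": 12})
print(round(cosine_similarity(gold_standard, other_cohort), 3))
```

Cosine similarity compares the shape of the two medication profiles rather than their size, so cohorts with very different patient counts remain comparable.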

11.
Proteomics ; 23(20): e2300188, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37488995

ABSTRACT

Relative and absolute intensity-based protein quantification across cell lines, tissue atlases and tumour datasets is increasingly available in public datasets. These atlases enable researchers to explore fundamental biological questions, such as protein existence, expression location, quantity and correlation with RNA expression. Most studies provide MS1 feature-based label-free quantitative (LFQ) datasets; however, growing numbers of isobaric tandem mass tags (TMT) datasets remain unexplored. Here, we compare traditional intensity-based absolute quantification (iBAQ) proteome abundance ranking to an analogous method using reporter ion proteome abundance ranking with data from an experiment where LFQ and TMT were measured on the same samples. This new TMT method substitutes reporter ion intensities for MS1 feature intensities in the iBAQ framework. Additionally, we compared LFQ-iBAQ values to TMT-iBAQ values from two independent large-scale tissue atlas datasets (one LFQ and one TMT) using robust bottom-up proteomic identification, normalisation and quantitation workflows.
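
The substitution described above can be sketched in a few lines. iBAQ is commonly defined as a protein's summed peptide intensities divided by its number of theoretically observable peptides; the sketch assumes that definition and simply feeds in either MS1 feature intensities (LFQ) or reporter ion intensities (TMT). The protein names and intensity values are hypothetical.

```python
def ibaq(intensities, n_observable_peptides):
    """Summed peptide intensities divided by the number of theoretically
    observable peptides for the protein."""
    if n_observable_peptides <= 0:
        raise ValueError("n_observable_peptides must be positive")
    return sum(intensities) / n_observable_peptides


def abundance_rank(values):
    """Rank proteins from most (1) to least abundant."""
    ordered = sorted(values, key=values.get, reverse=True)
    return {protein: rank for rank, protein in enumerate(ordered, start=1)}


# Hypothetical inputs: protein -> (peptide intensities, observable peptide count).
lfq = {"protein_x": ([4e6, 2e6], 10), "protein_y": ([9e6], 3)}
tmt = {"protein_x": ([5e4, 3e4], 10), "protein_y": ([7e4], 3)}

lfq_rank = abundance_rank({p: ibaq(i, n) for p, (i, n) in lfq.items()})
tmt_rank = abundance_rank({p: ibaq(i, n) for p, (i, n) in tmt.items()})
print(lfq_rank, tmt_rank)
```

Comparing the two rankings, rather than the raw values, sidesteps the different intensity scales of MS1 features and reporter ions.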

13.
J Thorac Dis ; 15(5): 2450-2457, 2023 May 30.
Article in English | MEDLINE | ID: mdl-37324106

ABSTRACT

Background: The prevalence of asthma has increased in many countries. However, whether this increase is confined to a specific age band is not well known. Thus, we analyzed the increase in asthma prevalence by age band and the related factors. Methods: We analyzed the trend of asthma prevalence in 10-year age bands using 2007 to 2018 data from the Korean National Health and Nutrition Survey. We determined the presence of subject-reported, physician-diagnosed asthma in 89,179 subjects. Multiple logistic regression analyses with a complex sample design were conducted to identify the risk factors for asthma. Results: Among all age ranges, only the 20s age band showed an increasing trend in asthma prevalence, from 0.7% in 2007 to 5.1% in 2018 (P<0.001 for joinpoint regression). Among the 7,658 subjects in the 20s age band, 237 (3.1%) had asthma. In the asthma group, 54.9% were male, 43.9% were ever-smokers, 44.6% had allergic rhinitis, 25.3% had atopic dermatitis, and 29.1% were obese. Multiple logistic regression analysis showed that asthma was related to allergic rhinitis [odds ratio (OR), 2.78; 95% confidence interval (CI): 2.03-3.81] and atopic dermatitis (OR, 4.13; 95% CI: 2.85-5.98), but not to male sex, ever-smoking, obesity, or socioeconomic status. Conclusions: From 2007 to 2018, the prevalence of asthma significantly increased in the 20s age band in South Korea. This may be related to the increase in cases of allergic rhinitis and atopic dermatitis.
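
For readers unfamiliar with the OR and CI notation above, the following sketch computes an unadjusted odds ratio and a Wald-type 95% confidence interval from a 2x2 table. It is only an illustration of the quantity: the reported estimates come from multiple logistic regression with a complex sample design, which this does not reproduce, and the counts below are invented.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval on the log scale.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only.
print(odds_ratio_ci(20, 10, 10, 20))
```

An interval that excludes 1, as both reported intervals do, indicates an association at the chosen confidence level.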

14.
Artif Intell Rev ; : 1-32, 2023 Mar 15.
Article in English | MEDLINE | ID: mdl-37362900

ABSTRACT

The volume of data generated by today's digitally connected world is enormous, and a significant portion of it is publicly available. These data sources are web archives, public databases, and social networks such as Facebook, Twitter, LinkedIn, Emails, Telegrams, etc. Open-source intelligence (OSINT) extracts information from a collection of publicly available and accessible data. OSINT can provide a solution to the challenges in extracting and gathering intelligence from various publicly available information and social networks. OSINT is currently expanding at an incredible rate, bringing new artificial intelligence-based approaches to address issues of national security, political campaign, the cyber industry, criminal profiling, and society, as well as cyber threats and crimes. In this paper, we have described the current state of OSINT tools/techniques and the state of the art for various applications of OSINT in cyber security. In addition, we have discussed the challenges and future directions to develop autonomous models. These models can provide solutions for different social network-based security, digital forensics, and cyber crime-based problems using various machine learning (ML), deep learning (DL) and artificial intelligence (AI) with OSINT.

15.
Risk Manag Healthc Policy ; 16: 1101-1117, 2023.
Article in English | MEDLINE | ID: mdl-37346248

ABSTRACT

Purpose: This study verifies the effectiveness of a health promotion project that a local public health center conducted by systematically linking health examination results from the Health Insurance Corporation, and emphasizes the importance of linking health-related public data. Methods: A survey was conducted to measure the effect on health behavior improvement using the EQ-5D-5L and demographic variables. Results: Residents who had used public health centers (3.13) saw a greater need for services systematically linked with health checkup data than those who had not (2.93). In addition, residents who had used public health centers reported that their chronic diseases had improved compared with a year earlier (2.78→2.93). Those who had experienced services linked with health checkup data (3.04) rated the improvement in their chronic diseases and health conditions higher than those who had not (2.81). However, on the EQ-5D-5L, mobility after using the service did not differ between users and non-users. Furthermore, in self-management, daily life, and similar dimensions, users' management ability improved more than that of non-users relative to before using the service. Conclusion: This study showed improved health levels when the public health center's health promotion service was provided by systematically linking the health checkup data of the Health Insurance Corporation in Korea. To increase the effectiveness of health data-linked projects, guidelines for linking public health data should be prepared and the data-linked project expanded. Health checkup results should be further subdivided to provide customized services, and dedicated personnel should be secured to reinforce the system link.

16.
Front Genet ; 14: 1106631, 2023.
Article in English | MEDLINE | ID: mdl-37065493

ABSTRACT

The human genome project galvanized the scientific community around an ambitious goal. Upon completion, the project delivered several discoveries, and a new era of research commenced. More importantly, novel technologies and analysis methods materialized during the project period. The cost reduction allowed many more labs to generate high-throughput datasets. The project also served as a model for other extensive collaborations that generated large datasets. These datasets were made public and continue to accumulate in repositories. As a result, the scientific community should consider how these data can be utilized effectively for the purposes of research and the public good. A dataset can be re-analyzed, curated, or integrated with other forms of data to enhance its utility. We highlight three important areas to achieve this goal in this brief perspective. We also emphasize the critical requirements for these strategies to be successful. We draw on our own experience and others in using publicly available datasets to support, develop, and extend our research interest. Finally, we underline the beneficiaries and discuss some risks involved in data reuse.

17.
J Proteome Res ; 22(4): 1181-1192, 2023 04 07.
Article in English | MEDLINE | ID: mdl-36963412

ABSTRACT

Using data from 183 public human data sets from PRIDE, a machine learning model was trained to identify tissue and cell-type specific protein patterns. PRIDE projects were searched with ionbot and tissue/cell type annotation was manually added. Data from physiological samples were used to train a Random Forest model on protein abundances to classify samples into tissues and cell types. Subsequently, a one-vs-all classification and feature importance were used to analyze the most discriminating protein abundances per class. Based on protein abundance alone, the model was able to predict tissues with 98% accuracy, and cell types with 99% accuracy. The F-scores describe a clear view on tissue-specific proteins and tissue-specific protein expression patterns. In-depth feature analysis shows slight confusion between physiologically similar tissues, demonstrating the capacity of the algorithm to detect biologically relevant patterns. These results can in turn inform downstream uses, from identification of the tissue of origin of proteins in complex samples such as liquid biopsies, to studying the proteome of tissue-like samples such as organoids and cell lines.


Subject(s)
Proteome , Proteomics , Humans , Proteomics/methods , Proteome/genetics , Proteome/metabolism , Algorithms , Machine Learning
18.
J Proteome Res ; 22(3): 729-742, 2023 03 03.
Article in English | MEDLINE | ID: mdl-36577097

ABSTRACT

The availability of proteomics datasets in the public domain, and in the PRIDE database, in particular, has increased dramatically in recent years. This unprecedented large-scale availability of data provides an opportunity for combined analyses of datasets to get organism-wide protein abundance data in a consistent manner. We have reanalyzed 24 public proteomics datasets from healthy human individuals to assess baseline protein abundance in 31 organs. We defined tissue as a distinct functional or structural region within an organ. Overall, the aggregated dataset contains 67 healthy tissues, corresponding to 3,119 mass spectrometry runs covering 498 samples from 489 individuals. We compared protein abundances between different organs and studied the distribution of proteins across these organs. We also compared the results with data generated in analogous studies. Additionally, we performed gene ontology and pathway-enrichment analyses to identify organ-specific enriched biological processes and pathways. As a key point, we have integrated the protein abundance results into the resource Expression Atlas, where they can be accessed and visualized either individually or together with gene expression data coming from transcriptomics datasets. We believe this is a good mechanism to make proteomics data more accessible for life scientists.


Subject(s)
Proteome , Proteomics , Humans , Proteome/analysis , Proteomics/methods , Gene Expression Profiling , Databases, Factual , Mass Spectrometry/methods , Databases, Protein
19.
Crit Rev Microbiol ; 49(3): 391-413, 2023 May.
Article in English | MEDLINE | ID: mdl-35468027

ABSTRACT

Staphylococcus aureus is a notorious pathogen posing challenges in the medical industry due to drug resistance and biofilm formation. The horizon of knowledge on S. aureus pathogenesis has expanded with the advancement of data-driven bioinformatics techniques. Mining information from sequenced genomes and their expression data is an economic approach that alleviates wastage of resources and redundancy in experiments. The current review covers how big data bioinformatics has been used in the analysis of S. aureus from publicly available -omics data to uncover mechanisms of infection and inhibition. Particularly, advances in the past two decades in biomarker discovery, host responses, phenotype identification, consolidation of information, and drug development are discussed highlighting the challenges and shortcomings. Overall, the review summarizes the diverse aspects of scrupulous re-analysis of S. aureus proteomic and transcriptomic expression datasets retrieved from public repositories in terms of the efforts taken, benefits offered, and follow-up actions. The detailed review thus serves as a reference and aid for (i) Computational biologists by briefing the approaches utilized for bacterial omics re-analysis concerning S. aureus and (ii) Experimental biologists by elucidating the potential of bioinformatics in biological research to generate reliable postulates in a prompt and economical manner.


Subject(s)
Staphylococcal Infections , Staphylococcus aureus , Humans , Proteomics , Big Data , Staphylococcal Infections/drug therapy , Staphylococcal Infections/microbiology , Computational Biology
20.
Chinese Journal of Hepatology ; (12): 716-722, 2023.
Article in Chinese | WPRIM (Western Pacific) | ID: wpr-986200

ABSTRACT

Objective: To analyze the expression levels of the F9 gene and F9 protein in hepatocellular carcinoma by combining multiple gene chip data, real-time fluorescence quantitative PCR (RT qPCR), and immunohistochemistry. Additionally, explore their correlation with the occurrence and development of hepatocellular carcinoma, as well as with various clinical indicators and prognosis. Methods: The mRNA microarray dataset from the GEO database was analyzed to identify the F9 gene with significant expression differences associated with hepatocellular carcinoma. Liver cancer and adjacent tissues were collected from 18 cases of hepatocellular carcinoma. RT-qPCR method was used to detect the F9 gene expression level. Immunohistochemistry was used to detect the F9 protein level. Combined with the TCGA database information, the correlation between F9 gene expression level and prognostic and clinicopathological parameters was analyzed. The biological function of F9 co-expressed genes associated with hepatocellular carcinoma was analyzed by the Gene Ontology (GO) and the Kyoto Encyclopedia of Genes and Genomes (KEGG). Statistical analysis was performed using Graphpad Prism software. Results: Meta-analysis results showed that the expression of the F9 gene was lower in HCC tissues than in non-cancerous tissues. Immunohistochemistry results were basically consistent with those of RT-qPCR. The data obtained from TCGA showed that the F9 gene had lower expression values in stages III-IV, T3-T4, and patients with vascular invasion. A total of 127 genes were selected for bioinformatics analysis as co-expressed genes of F9, which were highly enriched in redox processes and metabolic pathways. Conclusion: This study validates that the F9 gene and F9 protein are lower in HCC. The down-regulation of the F9 gene predicts adverse outcomes, which may provide a new therapeutic target for HCC.


Subject(s)
Humans , Carcinoma, Hepatocellular/pathology , Liver Neoplasms/pathology , Down-Regulation , Prognosis , Gene Expression , Gene Expression Regulation, Neoplastic