1.
Clin Transl Med ; 14(4): e1657, 2024 04.
Article En | MEDLINE | ID: mdl-38629623

PURPOSE: Systematic repurposing of approved medicines for another indication may accelerate drug development in oncology. We present a strategy combining biomarker testing with drug repurposing to identify new treatments for patients with advanced cancer. METHODS: Tumours were sequenced with the Illumina TruSight Oncology 500 (TSO-500) platform or the FoundationOne CDx panel. Mutations were screened by two medical oncologists and pathogenic mutations were categorised referencing literature. Variants of unknown significance were classified as potentially pathogenic using plausible mechanisms and computational prediction of pathogenicity. Gain of function (GOF) mutations were evaluated through repurposing databases Probe Miner (PM), Broad Institute Drug Repurposing Hub (Broad Institute DRH) and TOPOGRAPH. GOF mutations were repurposing events if identified in PM, not indexed in TOPOGRAPH and excluding mutations with a known Food and Drug Administration (FDA)-approved biomarker. The computational repurposing approach was validated by evaluating its ability to identify FDA-approved biomarkers. The total repurposable genome was identified by evaluating all possible gene-FDA drug-approved combinations in the PM dataset. RESULTS: The computational repurposing approach was accurate at identifying FDA therapies with known biomarkers (94%). Using next-generation sequencing molecular reports (n = 94), a meaningful percentage of patients (14%) could have an off-label therapeutic identified. The frequency of theoretical drug repurposing events in The Cancer Genome Atlas pan-cancer dataset was 73% of the samples in the cohort. CONCLUSION: A computational drug repurposing approach may assist in identifying novel repurposing events in cancer patients with no access to standard therapies. Further validation is needed to confirm a precision oncology approach using drug repurposing.


Neoplasms , Humans , Neoplasms/drug therapy , Neoplasms/genetics , Drug Repositioning , Precision Medicine , Pharmaceutical Preparations , Biomarkers
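The repurposing rule stated in the METHODS of entry 1 can be illustrated with a small Python sketch. It is not the authors' pipeline: the `Mutation` class, the `is_repurposing_event` function, the gene-level set lookups, and the placeholder gene names are all assumptions made for illustration. The Broad Institute DRH lookup is left out because the abstract does not say how it enters the rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mutation:
    """Hypothetical record for one reported mutation."""
    gene: str
    gain_of_function: bool


def is_repurposing_event(mutation, probe_miner_genes, topograph_genes, fda_biomarker_genes):
    """Apply the rule from the abstract: a GOF mutation counts if it is
    found in Probe Miner, is not indexed in TOPOGRAPH, and has no known
    FDA-approved biomarker. Matching by gene name is an assumption."""
    return (
        mutation.gain_of_function
        and mutation.gene in probe_miner_genes
        and mutation.gene not in topograph_genes
        and mutation.gene not in fda_biomarker_genes
    )


# Placeholder lookup sets; real content would come from the databases.
probe_miner_genes = {"GENE_A", "GENE_B", "GENE_C"}
topograph_genes = {"GENE_B"}
fda_biomarker_genes = {"GENE_C"}

mutations = [
    Mutation("GENE_A", True),   # passes all three conditions
    Mutation("GENE_B", True),   # excluded: indexed in TOPOGRAPH
    Mutation("GENE_C", True),   # excluded: FDA-approved biomarker
    Mutation("GENE_A", False),  # excluded: not gain of function
    Mutation("GENE_D", True),   # excluded: not found in Probe Miner
]

events = [
    m for m in mutations
    if is_repurposing_event(m, probe_miner_genes, topograph_genes, fda_biomarker_genes)
]
print([m.gene for m in events])
```

Keeping the three conditions in one predicate makes each exclusion easy to audit against the wording of the abstract.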
2.
JCO Precis Oncol ; 8: e2300317, 2024 Jan.
Article En | MEDLINE | ID: mdl-38190581

Advances in genomics have enabled anticancer therapies to be tailored to target specific genomic alterations. Single-arm trials (SATs), including those incorporated within umbrella, basket, and platform trials, are widely adopted when it is not feasible to conduct randomized controlled trials in rare biomarker-defined subpopulations. External controls (ECs), defined as control arm data derived outside the clinical trial, have gained renewed interest as a strategy to supplement evidence generated from SATs to allow comparative analysis. There are increasing examples demonstrating the application of EC in precision oncology trials. The prospective application of EC in conducting comparative studies is associated with distinct methodological challenges, the specific considerations for EC use in biomarker-defined subpopulations have not been adequately discussed, and a formal framework is yet to be established. In this review, we present a framework for conducting a prospective comparative analysis using EC. Key steps are (1) defining the purpose of using EC to address the study question, (2) determining if the external data are fit for purpose, (3) developing a transparent study protocol and a statistical analysis plan, and (4) interpreting results and drawing conclusions on the basis of a prespecified hypothesis. We specify the considerations required for the biomarker-defined subpopulations, which include (1) specifying the comparator and biomarker status of the comparator group, (2) defining lines of treatment, (3) assessment of the biomarker testing panels used, and (4) assessment of cohort stratification in tumor-agnostic studies. We further discuss novel clinical trial designs and statistical techniques leveraging EC to propose future directions to advance evidence generation and facilitate drug development in precision oncology.


Neoplasms , Humans , Neoplasms/drug therapy , Precision Medicine , Medical Oncology , Treatment Outcome , Biomarkers
4.
JAAD Int ; 14: 39-47, 2024 Mar.
Article En | MEDLINE | ID: mdl-38089398

Background: Real-time review of frozen sections underpins the quality of Mohs surgery. There is an unmet need for low-cost techniques that can improve Mohs surgery by reliably corroborating cancerous regions of interest and surgical margin proximity. Objective: To test that deep learning models can identify nonmelanoma skin cancer regions in Mohs frozen section specimens. Methods: Deep learning models were developed on archival images of focused microscopic views (FMVs) containing regions of annotated, invasive nonmelanoma skin cancer between 2015 and 2018, then validated on prospectively collected images in a temporal cohort (2019-2021). Results: The tile-based classification models were derived using 1423 focused microscopic view images from 154 patients and tested on 374 images from 66 patients. The best models detected basal cell carcinomas with a median average precision of 0.966 and median area under the receiver operating curve of 0.889 at 100x magnification (0.943 and 0.922 at 40x magnification). For invasive squamous cell carcinomas, high median average precision of 0.904 was achieved at 100x magnification. Limitations: Single institution study with limited cases of squamous cell carcinoma and rare nonmelanoma skin cancer. Conclusion: Deep learning appears highly accurate for detecting skin cancers in Mohs frozen sections, supporting its potential for enhancing surgical margin control and increasing operational efficiency.

6.
Int J Cancer ; 153(7): 1413-1422, 2023 10 01.
Article En | MEDLINE | ID: mdl-37424386

The Dutch Drug Rediscovery Protocol (DRUP) and the Australian Cancer Molecular Screening and Therapeutic (MoST) Program are similar nonrandomized, multidrug, pan-cancer trial platforms that aim to identify signals of clinical activity of molecularly matched targeted therapies or immunotherapies outside their approved indications. Here, we report results for advanced or metastatic cancer patients with tumors harboring cyclin D-CDK4/6 pathway alterations treated with CDK4/6 inhibitors palbociclib or ribociclib. We included adult patients that had therapy-refractory solid malignancies with the following alterations: amplifications of CDK4, CDK6, CCND1, CCND2 or CCND3, or complete loss of CDKN2A or SMARCA4. Within MoST, all patients were treated with palbociclib, whereas in DRUP, palbociclib and ribociclib were assigned to different cohorts (defined by tumor type and alteration). The primary endpoint for this combined analysis was clinical benefit, defined as confirmed objective response or stable disease ≥16 weeks. We treated 139 patients with a broad variety of tumor types; 116 with palbociclib and 23 with ribociclib. In 112 evaluable patients, the objective response rate was 0% and clinical benefit rate at 16 weeks was 15%. Median progression-free survival was 4 months (95% CI: 3-5 months), and median overall survival 5 months (95% CI: 4-6 months). In conclusion, only limited clinical activity of palbociclib and ribociclib monotherapy in patients with pretreated cancers harboring cyclin D-CDK4/6 pathway alterations was observed. Our findings indicate that monotherapy use of palbociclib or ribociclib is not recommended and that merging data of two similar precision oncology trials is feasible.


Breast Neoplasms , Neoplasms , Humans , Female , Neoplasms/drug therapy , Cyclins , Australia , Precision Medicine , Aminopyridines/therapeutic use , Cyclin D , Cyclin-Dependent Kinase 4 , Breast Neoplasms/drug therapy , Protein Kinase Inhibitors/pharmacology , Cyclin-Dependent Kinase 6 , DNA Helicases , Nuclear Proteins
7.
PLoS One ; 18(4): e0284327, 2023.
Article En | MEDLINE | ID: mdl-37053216

Intragenic CpG dinucleotides are tightly conserved in evolution yet are also vulnerable to methylation-dependent mutation, raising the question as to why these functionally critical sites have not been deselected by more stable coding sequences. We previously showed in cell lines that altered exonic CpG methylation can modify promoter start sites, and hence protein isoform expression, for the human TP53 tumor suppressor gene. Here we extend this work to the in vivo setting by testing whether synonymous germline modifications of exonic CpG sites affect murine development, fertility, longevity, or cancer incidence. We substituted the DNA-binding exons 5-8 of Trp53, the mouse ortholog of human TP53, with variant-CpG (either CpG-depleted or -enriched) sequences predicted to encode the normal p53 amino acid sequence; a control construct was also created in which all non-CpG sites were synonymously substituted. Homozygous Trp53-null mice were the only genotype to develop tumors. Mice with variant-CpG Trp53 sequences remained tumor-free, but were uniquely prone to dental anomalies causing jaw malocclusion (p < .0001). Since the latter phenotype also characterises murine Rett syndrome due to dysfunction of the trans-repressive MeCP2 methyl-CpG-binding protein, we hypothesise that CpG sites may exert non-coding phenotypic effects via pre-translational cis-interactions of 5-methylcytosine with methyl-binding proteins which regulate mRNA transcript initiation, expression or splicing, although direct effects on mRNA structure or translation are also possible.


Genes, p53 , Neoplasms , Mice , Humans , Animals , Mutation , Neoplasms/genetics , Methyl-CpG-Binding Protein 2/genetics , RNA, Messenger , CpG Islands , DNA Methylation
8.
Front Oncol ; 13: 1074091, 2023.
Article En | MEDLINE | ID: mdl-36910667
9.
NPJ Precis Oncol ; 5(1): 58, 2021 Jun 23.
Article En | MEDLINE | ID: mdl-34162978

While several resources exist that interpret therapeutic significance of genomic alterations in cancer, many regional real-world issues limit access to drugs. There is a need for a pragmatic, evidence-based, context-adapted tool to guide clinical management based on molecular biomarkers. To this end, we have structured a compendium of approved and experimental therapies with associated biomarkers following a survey of drug regulatory databases, existing knowledge bases, and published literature. Each biomarker-disease-therapy triplet was categorised using a tiering system reflective of key therapeutic considerations: approved and reimbursed therapies with respect to a jurisdiction (Tier 1), evidence of efficacy or approval in another jurisdiction (Tier 2), evidence of antitumour activity (Tier 3), and plausible biological rationale (Tier 4). Two resistance categories were defined: lack of efficacy (Tier R1) or antitumor activity (Tier R2). Based on this framework, we curated a digital resource focused on drugs relevant in the Australian healthcare system (TOPOGRAPH: Therapy Oriented Precision Oncology Guidelines for Recommending Anticancer Pharmaceuticals). As of November 2020, TOPOGRAPH comprised 2810 biomarker-disease-therapy triplets in 989 expert-appraised entries, including 373 therapies, 199 biomarkers, and 106 cancer types. In the 345 therapies catalogued, 84 (24%) and 65 (19%) were designated Tiers 1 and 2, respectively, while 271 (79%) therapies were supported by preclinical studies, early clinical trials, retrospective studies, or case series (Tiers 3 and 4). A companion algorithm was also developed to support rational, context-appropriate treatment selection informed by molecular biomarkers. This framework can be readily adapted to build similar resources in other jurisdictions to support therapeutic decision-making.
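The four-level tiering described in entry 9 can be sketched as a simple precedence rule in Python. This is only an illustration: the `Evidence` fields, the `assign_tier` function, and the choice to return the strongest applicable tier are assumptions, not the published TOPOGRAPH algorithm, and the resistance tiers (R1, R2) are not modelled here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Evidence:
    """Hypothetical evidence flags for one biomarker-disease-therapy triplet."""
    approved_and_reimbursed_locally: bool = False  # Tier 1 criterion
    efficacy_or_approval_elsewhere: bool = False   # Tier 2 criterion
    antitumour_activity: bool = False              # Tier 3 criterion
    biological_rationale: bool = False             # Tier 4 criterion


def assign_tier(evidence: Evidence) -> Optional[str]:
    """Return the strongest tier whose criterion is met, or None.
    Checking from Tier 1 downwards is an assumed precedence."""
    if evidence.approved_and_reimbursed_locally:
        return "Tier 1"
    if evidence.efficacy_or_approval_elsewhere:
        return "Tier 2"
    if evidence.antitumour_activity:
        return "Tier 3"
    if evidence.biological_rationale:
        return "Tier 4"
    return None


example = Evidence(antitumour_activity=True, biological_rationale=True)
print(assign_tier(example))
```

Because Tier 1 depends on approval and reimbursement in a given jurisdiction, the same triplet could plausibly receive a different tier when the flags are filled in for another healthcare system.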

10.
Ecancermedicalscience ; 15: ed119, 2021.
Article En | MEDLINE | ID: mdl-35211208

Population aging is causing a demographic redistribution with implications for the future of healthcare. How will this affect oncology? First, there will be an overall rise in cancer affecting older adults, even though age-specific cancer incidences continue to fall due to better prevention. Second, there will be a wider spectrum of health functionality in this expanding cohort of older adults, with differences between "physiologically older" and "physiologically younger" patients becoming more important for optimal treatment selection. Third, greater teamwork with supportive care, geriatric, mental health and rehabilitation experts will come to enrich oncologic decision-making by making it less formulaic than it is at present. Success in this transition to a more nuanced professional mindset will depend in part on the development of user-friendly computational tools that can integrate a complex mix of quantitative and qualitative inputs from evidence-based medicine, functional and cognitive assessments, and the personal priorities of older adults.

11.
JCO Clin Cancer Inform ; 2: 1-14, 2018 12.
Article En | MEDLINE | ID: mdl-30652600

PURPOSE: There is as yet no computer-processable resource to describe treatment end points in cancer, hindering our ability to systematically capture and share outcomes data to inform better patient care. To address these unmet needs, we have built an ontology, the Cancer Care Treatment Outcome Ontology (CCTOO), to organize high-level concepts of treatment end points with structured knowledge representation to facilitate standardized sharing of real-world data. METHODS: End points from oncology trials in ClinicalTrials.gov were extracted, queried using the keyword cancer, and followed by an expert appraisal. Synonyms and relevant terms were imported from the National Cancer Institute Thesaurus and Common Terminology Criteria for Adverse Events. Logical relationships among concepts were manually represented by production rules. The applicability of 1,847 rules was tested in an index case. RESULTS: After removing duplicated terms from 54,705 trial entries, an ontology holding 1,133 terms was built. CCTOO organized concepts into four domains (cancer treatment, health services, physical, and psychosocial health-related concepts), 13 subgroups (including efficacy, safety, and quality of life), and two (taxonomic and evaluative) concept hierarchies. This ontology has a comprehensive term coverage in the cancer trial literature: at least one term was mentioned in 98% of MEDLINE abstracts of phase I to III trials, whereas concepts about efficacy were mentioned in 7,208 (79%) phase I, 15,051 (92%) phase II, and 3,884 (86%) phase III trials. The event sequence of the index case was readily convertible to a comprehensive profile incorporating response, treatment toxicity, and survival by applying the set of production rules curated in the CCTOO. CONCLUSION: CCTOO categorizes high-level treatment end points used in oncology and provides a mechanism for profiling individual patient data by outcomes to facilitate translational analysis.


Biological Ontologies/trends , Neoplasms/therapy , Quality of Life/psychology , Humans , Treatment Outcome
12.
Aust Fam Physician ; 46(4): 189-193, 2017.
Article En | MEDLINE | ID: mdl-28376570

BACKGROUND: Internal medicine is in flux because of the 'omics revolution', with cancer medicine being a good example. Molecular technologies that detect alterations in gene-based structure or function are having an impact on diagnosis, prognosis and treatment of cancer. OBJECTIVE: In this article, recent advances in gene-based characterisation of cancer are presented, and illustrated where possible by clinical applications. DISCUSSION: The research-based vision of precision medicine is now on its way to becoming a clinical reality. A key limiting factor is the small number of therapeutic options available for customisation, which contrasts with the rising abundance of omics-derived data. However, further translational progress is anticipated over the next decade.


Neoplasms/genetics , Neoplasms/therapy , Precision Medicine/methods , Review Literature as Topic , DNA/pharmacology , DNA/therapeutic use , Humans , Karyotype , Precision Medicine/instrumentation , Proteomics/methods , RNA/pharmacology , RNA/therapeutic use
13.
BMC Cancer ; 16(1): 929, 2016 12 01.
Article En | MEDLINE | ID: mdl-27905893

BACKGROUND: Multidisciplinary team (MDT) meetings are used to optimise expert decision-making about treatment options, but such expertise is not digitally transferable between centres. To help standardise medical decision-making, we developed a machine learning model designed to predict MDT decisions about adjuvant breast cancer treatments. METHODS: We analysed MDT decisions regarding adjuvant systemic therapy for 1065 breast cancer cases over eight years. Machine learning classifiers with and without bootstrap aggregation were correlated with MDT decisions (recommended, not recommended, or discussable) regarding adjuvant cytotoxic, endocrine and biologic/targeted therapies, then tested for predictability using stratified ten-fold cross-validations. The predictions so derived were duly compared with those based on published (ESMO and NCCN) cancer guidelines. RESULTS: Machine learning more accurately predicted adjuvant chemotherapy MDT decisions than did simple application of guidelines. No differences were found between MDT- vs. ESMO/NCCN- based decisions to prescribe either adjuvant endocrine (97%, p = 0.44/0.74) or biologic/targeted therapies (98%, p = 0.82/0.59). In contrast, significant discrepancies were evident between MDT- and guideline-based decisions to prescribe chemotherapy (87%, p < 0.01, representing 43% and 53% variations from ESMO/NCCN guidelines, respectively). Using ten-fold cross-validation, the best classifiers achieved areas under the receiver operating characteristic curve (AUC) of 0.940 for chemotherapy (95% C.I., 0.922-0.958), 0.899 for the endocrine therapy (95% C.I., 0.880-0.918), and 0.977 for trastuzumab therapy (95% C.I., 0.955-0.999) respectively. Overall, bootstrap aggregated classifiers performed better among all evaluated machine learning models. CONCLUSIONS: A machine learning approach based on clinicopathologic characteristics can predict MDT decisions about adjuvant breast cancer drug therapies. 
The discrepancy between MDT- and guideline-based decisions regarding adjuvant chemotherapy implies that certain non-clinicopathologic criteria, such as patient preference and resource availability, are factored into clinical decision-making by local experts but not captured by guidelines.


Antineoplastic Combined Chemotherapy Protocols/therapeutic use , Breast Neoplasms/diagnosis , Breast Neoplasms/drug therapy , Clinical Decision-Making , Machine Learning , Models, Theoretical , Patient Care Team , Adult , Aged , Aged, 80 and over , Algorithms , Chemotherapy, Adjuvant , Combined Modality Therapy , Computer Simulation , Female , Humans , Middle Aged , Supervised Machine Learning
14.
J Biomed Inform ; 49: 221-6, 2014 Jun.
Article En | MEDLINE | ID: mdl-24681202

MOTIVATION: Gene set enrichment analysis (GSEA) annotates gene microarray data with functional information from the biomedical literature to improve gene-disease association prediction. We hypothesize that supplementing GSEA with comprehensive gene function catalogs built automatically using information extracted from the scientific literature will significantly enhance GSEA prediction quality. METHODS: Gold standard gene sets for breast cancer (BrCa) and colorectal cancer (CRC) were derived from the literature. Two gene function catalogs (CMeSH and CUMLS) were automatically generated in two steps: (1) using Entrez Gene to associate all recorded human genes with PubMed article IDs, and (2) taking the genes mentioned in each PubMed article and associating each with the article's MeSH terms (in CMeSH) and extracted UMLS concepts (in CUMLS). Microarray data from the Gene Expression Omnibus for BrCa and CRC were then annotated using CMeSH and CUMLS and, for comparison, also with several pre-existing catalogs (C2, C4 and C5 from the Molecular Signatures Database). Ranking was done using a standard GSEA implementation (GSEA-p). Gene function predictions for enriched array data were evaluated against the gold standard by measuring area under the receiver operating characteristic curve (AUC). RESULTS: Comparison of ranking using the literature enrichment catalogs, the pre-existing catalogs and five randomly generated catalogs shows that the literature-derived enrichment catalogs are more effective. The AUC for BrCa using the unenriched gene expression dataset was 0.43, increasing to 0.89 after gene set enrichment with CUMLS. The AUC for CRC using the unenriched gene expression dataset was 0.54, increasing to 0.9 after enrichment with CMeSH. C2 increased AUC (BrCa 0.76, CRC 0.71) but C4 and C5 performed poorly (between 0.35 and 0.5). The randomly generated catalogs also performed poorly, equivalent to random guessing.
DISCUSSION: Gene set enrichment significantly improved prediction of gene-disease association. Selection of enrichment catalog had a substantial effect on prediction accuracy. The literature based catalogs performed better than the MSigDB catalogs, possibly because they are more recent. Catalogs generated automatically from the literature can be kept up to date. CONCLUSION: Prediction of gene-disease association is a fundamental task in biomedical research. GSEA provides a promising method when using literature-based enrichment catalogs. AVAILABILITY: The literature based catalogs generated and used in this study are available from http://www2.chi.unsw.edu.au/literature-enrichment.


Genetic Predisposition to Disease , Breast Neoplasms/genetics , Colorectal Neoplasms/genetics , Female , Genome-Wide Association Study , Humans
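Entry 14 evaluates gene rankings against a gold standard using AUC. As a rough illustration of that metric (not the authors' evaluation code), the Python sketch below computes AUC as the fraction of gold-standard/non-gold-standard gene pairs in which the gold-standard gene has the higher score, counting ties as half. The function name `auc_from_scores`, the gene labels, and the scores are hypothetical.

```python
def auc_from_scores(scores, gold_standard):
    """Compute AUC for a gene ranking.

    scores: dict mapping gene name -> ranking score (higher = more relevant).
    gold_standard: set of gene names treated as true positives.
    Uses the pairwise (Mann-Whitney) formulation of AUC.
    """
    positives = [s for gene, s in scores.items() if gene in gold_standard]
    negatives = [s for gene, s in scores.items() if gene not in gold_standard]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative gene")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Made-up scores for five placeholder genes.
scores = {"G1": 0.9, "G2": 0.8, "G3": 0.4, "G4": 0.3, "G5": 0.1}
gold_standard = {"G1", "G3"}
auc = auc_from_scores(scores, gold_standard)
print(round(auc, 3))
```

Under this definition a value near 0.5 corresponds to random ordering, which is how the abstract characterises the randomly generated catalogs.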
15.
J Clin Neuromuscul Dis ; 13(3): 105-12, 2012 Mar.
Article En | MEDLINE | ID: mdl-22538304

OBJECTIVES: Chronic inflammatory demyelinating polyradiculoneuropathy is a treatable neuropathy that is challenging to diagnose and has a broad spectrum of presentations. We report the clinical, electrodiagnostic, and radiographic presentations in three patients whose workup revealed hypertrophic nerve roots. METHODS: We retrospectively reviewed the clinical, electrodiagnostic, and imaging data for patients diagnosed with chronic inflammatory demyelinating polyradiculoneuropathy over a 3-year period. RESULTS: All patients had features of proximal and distal neuropathy with progressive or recurrent courses. Diagnosis and management were significantly altered by the concomitant clinical findings and/or radiographic findings. CONCLUSIONS: Our cases highlight the use of magnetic resonance imaging to evaluate for nerve root hypertrophy as an additional tool to electrodiagnostic testing in the setting of refractory or atypical neuropathy condition. Awareness of the radiographic features will assist in confirmation of the diagnosis, institution of the appropriate therapy, and prevention of inadequate or delay of treatment.


Polyradiculoneuropathy, Chronic Inflammatory Demyelinating/diagnosis , Polyradiculoneuropathy, Chronic Inflammatory Demyelinating/physiopathology , Spinal Nerve Roots/pathology , Aged , Electric Stimulation/methods , Female , Humans , Hypertrophy/complications , Hypertrophy/pathology , Longitudinal Studies , Magnetic Resonance Imaging , Male , Middle Aged , Neural Conduction/physiology , Peripheral Nerves/physiopathology , Reaction Time/physiology , Retrospective Studies
16.
BMC Bioinformatics ; 12: 112, 2011 Apr 21.
Article En | MEDLINE | ID: mdl-21510898

BACKGROUND: The identification of drug characteristics is a clinically important task, but it requires much expert knowledge and consumes substantial resources. We have developed a statistical text-mining approach (BInary Characteristics Extractor and biomedical Properties Predictor: BICEPP) to help experts screen drugs that may have important clinical characteristics of interest. RESULTS: BICEPP first retrieves MEDLINE abstracts containing drug names, then selects tokens that best predict the list of drugs which represents the characteristic of interest. Machine learning is then used to classify drugs using a document frequency-based measure. Evaluation experiments were performed to validate BICEPP's performance on 484 characteristics of 857 drugs, identified from the Australian Medicines Handbook (AMH) and the PharmacoKinetic Interaction Screening (PKIS) database. Stratified cross-validations revealed that BICEPP was able to classify drugs into all 20 major therapeutic classes (100%) and 157 (of 197) minor drug classes (80%) with areas under the receiver operating characteristic curve (AUC) > 0.80. Similarly, AUC > 0.80 could be obtained in the classification of 173 (of 238) adverse events (73%), up to 12 (of 15) groups of clinically significant cytochrome P450 enzyme (CYP) inducers or inhibitors (80%), and up to 11 (of 14) groups of narrow therapeutic index drugs (79%). Interestingly, it was observed that the keywords used to describe a drug characteristic were not necessarily the most predictive ones for the classification task. CONCLUSIONS: BICEPP has sufficient classification power to automatically distinguish a wide range of clinical properties of drugs. This may be used in pharmacovigilance applications to assist with rapid screening of large drug databases to identify important characteristics for further evaluation.


Data Mining , Drug-Related Side Effects and Adverse Reactions , Pharmaceutical Preparations/analysis , Databases, Factual , Drug Interactions , Humans , Pharmacokinetics , Therapeutic Uses
17.
Br J Clin Pharmacol ; 71(5): 727-36, 2011 May.
Article En | MEDLINE | ID: mdl-21223357

AIMS: To catalogue the perpetrators of CYP-mediated pharmacokinetic drug-drug interactions (PK-DDIs) using clinically relevant criteria, and to compare this with an analogous catalogue. METHODS: Candidate inhibitors and inducers of CYP1A2, CYP2C9, CYP2C19, CYP2D6 and CYP3A ('perpetrators') were evaluated using published clinical pharmacokinetic interaction studies. Studies were selected on the basis of ≥six human subjects, use of a validated in vivo probe substrate for the CYP enzyme, and clinically relevant dosing. Inhibitors were described according to the FDA classifications of strong, moderate or weak, whereas inducers were classified as major (≥twofold decrease in AUC) or weak (

Cytochrome P-450 Enzyme System/physiology , Drug Interactions/physiology , Catalogs, Drug as Topic , Cytochrome P-450 Enzyme Inhibitors , Cytochrome P-450 Enzyme System/biosynthesis , Enzyme Induction , Enzyme Inhibitors/pharmacology , Evidence-Based Medicine/methods , Humans
18.
BMC Bioinformatics ; 10: 86, 2009 Mar 17.
Article En | MEDLINE | ID: mdl-19292914

BACKGROUND: In silico candidate gene prioritisation (CGP) aids the discovery of gene functions by ranking genes according to an objective relevance score. While several CGP methods have been described for identifying human disease genes, corresponding methods for prokaryotic gene function discovery are lacking. Here we present two prokaryotic CGP methods, based on phylogenetic profiles, to assist with this task. RESULTS: Using gene occurrence patterns in sample genomes, we developed two CGP methods (statistical and inductive CGP) to assist with the discovery of bacterial gene functions. Statistical CGP exploits the differences in gene frequency against phenotypic groups, while inductive CGP applies supervised machine learning to identify gene occurrence patterns across genomes. Three rediscovery experiments were designed to evaluate the CGP frameworks. The first experiment attempted to rediscover peptidoglycan genes with 417 published genome sequences. Both CGP methods achieved best areas under receiver operating characteristic curve (AUC) of 0.911 in Escherichia coli K-12 (EC-K12) and 0.978 in Streptococcus agalactiae 2603 (SA-2603) genomes, with an average improvement in precision of >3.2-fold and a maximum of >27-fold using statistical CGP. A median AUC of >0.95 could still be achieved with as few as 10 genome examples in each group of genome examples in the rediscovery of the peptidoglycan metabolism genes. In the second experiment, a maximum of 109-fold improvement in precision was achieved in the rediscovery of anaerobic fermentation genes in EC-K12. The last experiment attempted to rediscover genes from 31 metabolic pathways in SA-2603, where 14 pathways achieved AUC >0.9 and 28 pathways achieved AUC >0.8 with the best inductive CGP algorithms. CONCLUSION: Our results demonstrate that the two CGP methods can assist with the study of functionally uncategorised genomic regions and discovery of bacterial gene-function relationships.
Our rediscovery experiments also provide a set of standard tasks against which future methods may be compared.


Computational Biology/methods , Genes, Bacterial , Phylogeny , Algorithms , Gene Expression Profiling/methods , Genome, Bacterial/genetics
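The idea behind statistical CGP in entry 18, ranking genes by how differently they occur across phenotypic groups of genomes, can be sketched in Python as follows. The abstract does not give the statistic actually used, so the plain frequency difference here is an assumption, as are the function names and the toy genomes.

```python
def frequency_difference_scores(genomes, positive_genomes):
    """Score each gene by (frequency in phenotype-positive genomes)
    minus (frequency in phenotype-negative genomes).

    genomes: dict mapping genome name -> set of gene names present.
    positive_genomes: set of genome names showing the phenotype.
    The simple difference is an assumed stand-in for the real statistic.
    """
    positive = [genes for name, genes in genomes.items() if name in positive_genomes]
    negative = [genes for name, genes in genomes.items() if name not in positive_genomes]
    if not positive or not negative:
        raise ValueError("need genomes in both phenotypic groups")

    all_genes = set().union(*genomes.values())
    scores = {}
    for gene in all_genes:
        freq_pos = sum(gene in genes for genes in positive) / len(positive)
        freq_neg = sum(gene in genes for genes in negative) / len(negative)
        scores[gene] = freq_pos - freq_neg
    return scores


def rank_genes(scores):
    """Order genes by descending score, breaking ties alphabetically."""
    return sorted(scores, key=lambda gene: (-scores[gene], gene))


# Toy presence/absence profiles for four placeholder genomes.
genomes = {
    "genome1": {"geneA", "geneB"},
    "genome2": {"geneA", "geneC"},
    "genome3": {"geneB"},
    "genome4": {"geneC"},
}
positive_genomes = {"genome1", "genome2"}

scores = frequency_difference_scores(genomes, positive_genomes)
ranking = rank_genes(scores)
print(ranking)
```

A gene present in every phenotype-positive genome and absent from the rest scores 1.0 and rises to the top of the candidate list.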
...