Results 1 - 20 of 24
1.
Article in English | MEDLINE | ID: mdl-38777805

ABSTRACT

OBJECTIVES: Biomedical knowledge graphs play a pivotal role in various biomedical research domains. Term clustering, which aims to identify synonymous terms, is a crucial step in constructing these knowledge graphs. Due to a lack of knowledge, previous contrastive learning models trained with Unified Medical Language System (UMLS) synonyms struggle to cluster difficult terms and do not generalize well beyond UMLS terms. In this work, we leverage the world knowledge from large language models (LLMs) and propose Contrastive Learning for Representing Terms via Explanations (CoRTEx) to enhance term representation and significantly improve term clustering. MATERIALS AND METHODS: The model training involves generating explanations for a cleaned subset of UMLS terms using ChatGPT. We employ contrastive learning, considering term and explanation embeddings simultaneously, and progressively introduce hard negative samples. Additionally, a ChatGPT-assisted BIRCH algorithm is designed for efficient clustering of a new ontology. RESULTS: We established a clustering test set and a hard negative test set, on which our model consistently achieves the highest F1 score. With CoRTEx embeddings and the modified BIRCH algorithm, we grouped 35 580 932 terms from the Biomedical Informatics Ontology System (BIOS) into 22 104 559 clusters with O(N) queries to ChatGPT. Case studies highlight the model's efficacy in handling challenging samples, aided by information from explanations. CONCLUSION: By aligning terms to their explanations, CoRTEx demonstrates superior accuracy over benchmark models and robustness beyond its training set, and it is suitable for clustering terms for large-scale biomedical ontologies.
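
The abstract above does not state which contrastive objective is used to align terms with their explanations. As a rough illustration only, the sketch below assumes an InfoNCE-style loss, a common choice for contrastive learning, in which each term embedding is pulled toward its own explanation embedding and pushed away from the other explanations in the batch. The function names, the temperature value, and the toy two-dimensional vectors are all hypothetical and do not come from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(term_vecs, expl_vecs, temperature=0.1):
    """Mean InfoNCE loss over a batch (assumed objective, not the paper's).

    expl_vecs[i] is the positive for term_vecs[i]; every other explanation
    in the batch serves as a negative.
    """
    losses = []
    for i, term in enumerate(term_vecs):
        logits = [cosine(term, expl) / temperature for expl in expl_vecs]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(x - top) for x in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy batch: two terms and two explanations (hypothetical embeddings).
terms = [[1.0, 0.0], [0.0, 1.0]]
aligned = info_nce_loss(terms, [[1.0, 0.0], [0.0, 1.0]])
swapped = info_nce_loss(terms, [[0.0, 1.0], [1.0, 0.0]])
```

In this toy batch the loss is close to zero when each term sits next to its own explanation and large when the explanations are swapped, which is the behavior a contrastive objective of this kind is meant to reward.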

2.
Front Mol Biosci ; 11: 1325041, 2024.
Article in English | MEDLINE | ID: mdl-38419689

ABSTRACT

Protein-RNA interactions are central to numerous cellular processes. In this work, we present an easy and straightforward NMR-based approach to determine the RNA binding site of RNA binding proteins and to evaluate the binding of pairs of proteins to a single-stranded RNA (ssRNA) under physiological conditions, in this case in nuclear extracts. By incorporation of a 19F atom on the ribose of different nucleotides along the ssRNA sequence, we show that, upon addition of an RNA binding protein, the intensity of the 19F NMR signal changes when the 19F atom is located near the protein binding site. Furthermore, we show that the addition of pairs of proteins to a ssRNA containing two 19F atoms at two different locations informs on their concurrent binding or competition. We demonstrate that such studies can be done in a nuclear extract that mimics the physiological environment in which these protein-ssRNA interactions occur. Finally, we demonstrate that a trifluoromethoxy group (-OCF3) incorporated in the 2'ribose position of ssRNA sequences increases the sensitivity of the NMR signal, leading to decreased measurement times, and reduces the issue of RNA degradation in cellular extracts.

3.
Sci Data ; 10(1): 909, 2023 Dec 18.
Article in English | MEDLINE | ID: mdl-38110415

ABSTRACT

Retrieval-based Clinical Decision Support (ReCDS) can aid clinical workflow by providing relevant literature and similar patients for a given patient. However, the development of ReCDS systems has been severely obstructed by the lack of diverse patient collections and publicly available large-scale patient-level annotation datasets. In this paper, we collect a novel dataset of patient summaries and relations called PMC-Patients to benchmark two ReCDS tasks: Patient-to-Article Retrieval (ReCDS-PAR) and Patient-to-Patient Retrieval (ReCDS-PPR). Specifically, we extract patient summaries from PubMed Central articles using simple heuristics and utilize the PubMed citation graph to define patient-article relevance and patient-patient similarity. PMC-Patients contains 167k patient summaries with 3.1 M patient-article relevance annotations and 293k patient-patient similarity annotations, which is the largest-scale resource for ReCDS and also one of the largest patient collections. Human evaluation and analysis show that PMC-Patients is a diverse dataset with high-quality annotations. We also implement and evaluate several ReCDS systems on the PMC-Patients benchmarks to show its challenges and conduct several case studies to show the clinical utility of PMC-Patients.


Subject(s)
Decision Support Systems, Clinical , Patients , Humans , PubMed
4.
J Biomed Inform ; 137: 104266, 2023 01.
Article in English | MEDLINE | ID: mdl-36494059

ABSTRACT

Liver cancer is a common malignant tumor, and its clinical stage is closely related to the clinical treatment and prognosis of patients. Currently, the BCLC staging system revised by the BCLC group of the University of Barcelona is the globally recognized staging system for liver cancer. However, as related research has deepened, the current staging system can no longer fully meet clinical needs. In this work, we propose a novel machine learning method for constructing an automatic hepatocellular carcinoma staging model that incorporates far more clinical variables than any existing staging system. Our model is based on random survival forests, which generate a unique hazard function for each patient. B-splines are used to embed the hazard functions into vectors in a low-dimensional space, and a hierarchical clustering method groups similar patients to form staging cohorts. The resulting staging system significantly outperforms the BCLC system in terms of distinctiveness between patients in different stages.


Subject(s)
Carcinoma, Hepatocellular , Liver Neoplasms , Humans , Neoplasm Staging , Retrospective Studies , Liver Neoplasms/diagnosis , Liver Neoplasms/pathology , Carcinoma, Hepatocellular/diagnosis , Carcinoma, Hepatocellular/pathology , Prognosis
5.
ACS Omega ; 7(44): 40569-40577, 2022 Nov 08.
Article in English | MEDLINE | ID: mdl-36385847

ABSTRACT

In recent times, the importance of peptides in the biomedical domain has received increasing attention because of their effect on the treatment of multiple diseases. However, accurate identification of peptide toxicity is a vital prerequisite for successful large-scale implementation in industry. Existing computational methods have achieved good results in toxicity prediction, and we present an improved model based on different deep learning architectures. The modifications focus mainly on two aspects: sequence encoding and variational information bottlenecks. One of our modified designs shows a clear increase in sensitivity, while the rest perform well and add novelty to the peptide toxicity prediction domain. In detail, our best model achieved accuracies of 97.38% and 95.03% in protein and peptide toxicity predictions, respectively. The performance was superior to that of previous predictors on the same datasets.

6.
Front Plant Sci ; 13: 996265, 2022.
Article in English | MEDLINE | ID: mdl-36204049

ABSTRACT

Cysteine-rich polycomb-like protein (CPP) is a member of the cysteine-rich transcription factor family that regulates plant growth and development. In the present work, we characterized twelve genes encoding CPP transcription factors in soybean (Glycine max). Phylogenetic analyses classified the CPP genes into six clades. Sequence logo analyses of G. max and G. soja amino acid residues showed high conservation. The presence of growth- and stress-related cis-acting elements in the upstream regions of GmCPPs highlights their role in plant development and tolerance to abiotic stress. Ka/Ks levels showed that GmCPPs experienced limited selection pressure, with limited functional divergence arising from segmental or whole-genome duplication events. Using the pan-genome of soybean, a single nucleotide polymorphism was identified in GmCPP-6. To perform high-throughput genotyping, a kompetitive allele-specific PCR (KASP) marker was developed. Association analyses indicated that the GmCPP-6-T allele of GmCPP-6 (in the exon region) was associated with higher thousand-seed weight under both water regimes (well-watered and water-limited). Taken together, these results provide vital information for further deciphering the biological functions of CPP genes in soybean molecular breeding.

7.
J Biomed Inform ; 126: 103983, 2022 02.
Article in English | MEDLINE | ID: mdl-34990838

ABSTRACT

OBJECTIVE: This paper aims to propose knowledge-aware embedding, a critical tool for medical term normalization. METHODS: We develop CODER (Cross-lingual knowledge-infused medical term embedding) via contrastive learning based on a medical knowledge graph (KG), the Unified Medical Language System, and similarities are calculated utilizing both terms and relation triplets from the KG. Training with relations injects medical knowledge into embeddings and can potentially improve their performance as machine learning features. RESULTS: We evaluate CODER on zero-shot term normalization, semantic similarity, and relation classification benchmarks, and the results show that CODER outperforms various state-of-the-art biomedical word embeddings, concept embeddings, and contextual embeddings. CONCLUSION: CODER embeddings reflect the semantic similarity and relatedness of medical concepts very well. One can use CODER for embedding-based medical term normalization or to provide features for machine learning. Similar to other pretrained language models, CODER can also be fine-tuned for specific tasks. Code and models are available at https://github.com/GanjinZero/CODER.


Subject(s)
Natural Language Processing , Unified Medical Language System , Language , Machine Learning , Semantics
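
Embedding-based term normalization, as mentioned in the abstract above, is usually done by mapping a query term to the vocabulary concept whose embedding is most similar. The sketch below illustrates that idea with cosine similarity. It does not load or call the CODER model: the three-dimensional vectors, the vocabulary, and the function name `normalize_term` are hypothetical stand-ins for real embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def normalize_term(query_vec, vocabulary):
    """Return (concept_name, similarity) for the most similar concept.

    `vocabulary` maps concept names to embedding vectors.
    """
    scored = ((name, cosine(query_vec, vec)) for name, vec in vocabulary.items())
    return max(scored, key=lambda pair: pair[1])


# Hypothetical toy embeddings; a real system would obtain these from a model.
vocabulary = {
    "myocardial infarction": [0.9, 0.1, 0.0],
    "hepatocellular carcinoma": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]  # pretend embedding of the query term "heart attack"

best_concept, score = normalize_term(query, vocabulary)
```

With real embeddings the vocabulary would hold many thousands of concepts, and an approximate nearest-neighbor index would typically replace the linear scan shown here.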
8.
Int J Med Sci ; 17(18): 3073-3081, 2020.
Article in English | MEDLINE | ID: mdl-33173428

ABSTRACT

Patient-derived xenograft (PDX) models are effective preclinical cancer models that reproduce the tumor microenvironment of the human body. These models have been widely used for drug screening, biomarker development, co-clinical trials, and personalized medicine. However, the low success rate and the long tumorigenesis period have largely limited their usage. In the present study, we compared PDX establishment between hepatocellular cancer (HCC) and metastatic liver cancer (MLC) and identified the key factors affecting the transplantation rate of PDXs. Surgically resected tumor specimens obtained from patients were subcutaneously inoculated into immunodeficient mice to construct PDX models. The overall transplantation rate was 38.5% (20/52), with the rate in the HCC group (28.1%, 9/32) being lower than that in the MLC group (56.2%, 9/16). In addition, the HCC group had a significantly longer latency period than the MLC group when constructing PDX models. Hematoxylin and eosin staining results showed that the histopathology of all generations in PDX models was similar to the original tumor in all three types of cancer. The transplantation rate of PDX models in HCC patients was significantly associated with blood type (P=0.001), TNM stage (P=0.023), lymph node metastasis (P=0.042) and peripheral blood CA19-9 level (P=0.049), while the transplantation rate of PDX models in MLC patients was significantly associated with tumor size (P=0.034). This study demonstrates that PDX models can effectively reproduce the histological patterns of human tumors. The transplantation rate depends on the type of original tumor. Furthermore, it shows that the invasiveness of the original liver cancer affects the possibility of its growth in immunodeficient mice.


Subject(s)
Carcinoma, Hepatocellular/pathology , Colorectal Neoplasms/pathology , Liver Neoplasms/secondary , Liver/pathology , Tumor Microenvironment , Animals , Carcinoma, Hepatocellular/surgery , Colorectal Neoplasms/surgery , Female , Hepatectomy , Humans , Liver/surgery , Liver Neoplasms/surgery , Male , Mice , Middle Aged , Xenograft Model Antitumor Assays/methods
9.
Am J Transl Res ; 11(5): 3128-3139, 2019.
Article in English | MEDLINE | ID: mdl-31217882

ABSTRACT

Tumor samples of pancreatic ductal adenocarcinoma patients, who underwent resection surgery, were implanted into NOD/SCID mice to construct pancreatic cancer patient-derived xenograft (PDX) models and explore the biological changes in the different generations of PDXs. Ten PDXs were successfully generated, and the tumor formation rate of F1 PDXs was found to be 38.46%, which was lower than F2 (77.78%) and F3 (71.43%) PDXs. In addition, latent periods of tumorigenesis of F2 and F3 PDXs were significantly shorter, compared to that in F1 PDXs (P<0.05). Comparison of H&E staining of tumor tissue from primary pancreatic cancer and PDXs showed that all three generations of PDXs had similar histopathology to primary pancreatic cancer, indicating that PDXs may well reproduce the histological patterns of primary human cancer. Besides, Ki67 expression was increased in all three generations of PDXs compared to primary tumors of patients, and additionally, EpCAM expression was increased in F3 PDXs. These results were corroborated by the real-time qPCR and western blot results. Therefore, we concluded that PDXs are able to preserve the differentiation degree, morphological characteristics, and structural features of tumor cells. Furthermore, the latent periods of tumorigenesis are shortened after the first generation, which may be attributed to an increase in expression levels of tumor promoters such as Ki67 and EpCAM. PDX models may become an efficient tool for pancreatic cancer research.

10.
Org Biomol Chem ; 16(35): 6576-6585, 2018 09 11.
Article in English | MEDLINE | ID: mdl-30168560

ABSTRACT

The labelling of DNA oligonucleotides with signalling groups that give a unique response to duplex formation depending on the target sequence is a highly effective strategy in the design of DNA-based hybridisation sensors. A key challenge in the design of these so-called base discriminating probes (BDPs) is to understand how the local environment of the signalling group affects the sensing response. The work herein describes a comprehensive study involving a variety of photophysical techniques, NMR studies and molecular dynamics simulations, on anthracene-tagged oligonucleotide probes that can sense single base changes (point variants) in target DNA strands. A detailed analysis of the fluorescence sensing mechanism is provided, with a particular focus on rationalising the high dependence of this process on not only the linker stereochemistry but also the site of nucleobase variation within the target strand. The work highlights the various factors and techniques used to respectively underpin and rationalise the BDP approach to point variant sensing, which relies on different responses to duplex formation rather than different duplex binding strengths.


Subject(s)
Anthracenes/chemistry , DNA/chemistry , DNA/genetics , Molecular Probes/chemistry , Polymorphism, Single Nucleotide , Base Sequence , Molecular Dynamics Simulation , Nucleic Acid Conformation , Staining and Labeling
11.
ACS Chem Biol ; 11(3): 717-21, 2016 Mar 18.
Article in English | MEDLINE | ID: mdl-26580817

ABSTRACT

The ability to discriminate between epigenetic variants in DNA is a necessary tool if we are to increase our understanding of the roles that they play in various biological processes and medical conditions. Herein, it is demonstrated how a simple two-step fluorescent probe assay can be used to differentiate all three major epigenetic variants of cytosine at a single locus site in a target strand of DNA.


Subject(s)
5-Methylcytosine/chemistry , Anthracenes/chemistry , Cytosine/analogs & derivatives , Cytosine/chemistry , DNA/chemistry , Fluorescent Dyes/chemistry , Nucleic Acid Conformation
12.
PLoS One ; 9(4): e95097, 2014.
Article in English | MEDLINE | ID: mdl-24755680

ABSTRACT

Förster resonance energy transfer (FRET) technology relies on the close proximity of two compatible fluorophores for energy transfer. Tagged (Cy3 and Cy5) complementary DNA strands forming a stable duplex and a doubly-tagged single strand were shown to demonstrate FRET outside of a cellular environment. FRET was also observed after transfecting these DNA strands into fixed and live cells using methods such as microinjection and electroporation, but not when using lipid based transfection reagents, unless in the presence of the endosomal acidification inhibitor bafilomycin. Avoiding the endocytosis pathway is essential for efficient delivery of intact DNA probes into cells.


Subject(s)
DNA, Single-Stranded/metabolism , Electroporation , Fluorescence Resonance Energy Transfer/methods , Microinjections , Animals , CHO Cells , Carbocyanines/metabolism , Cell Survival , Cricetinae , Cricetulus , DNA Probes/metabolism , Microscopy, Confocal , Transfection
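
The proximity requirement described in the abstract above follows from the standard Förster relation, in which transfer efficiency falls off with the sixth power of the donor-acceptor distance: E = 1 / (1 + (r / R0)^6). The sketch below evaluates that relation. The Förster radius of 5.0 nm is an illustrative assumption, not a value reported in the abstract, and the function name is hypothetical.

```python
def fret_efficiency(distance_nm, forster_radius_nm):
    """Förster transfer efficiency E = 1 / (1 + (r / R0)**6)."""
    ratio = distance_nm / forster_radius_nm
    return 1.0 / (1.0 + ratio ** 6)


# Assumed Förster radius for illustration only (not taken from the abstract).
R0_NM = 5.0

at_half_radius = fret_efficiency(2.5, R0_NM)     # fluorophores held close together
at_forster_radius = fret_efficiency(5.0, R0_NM)  # efficiency is 0.5 by definition
at_double_radius = fret_efficiency(10.0, R0_NM)  # fluorophores far apart
```

Because of the sixth-power dependence, efficiency drops from nearly complete transfer at half the Förster radius to under 2% at twice that radius, which is why loss of the FRET signal is a sensitive indicator that a doubly tagged probe is no longer intact.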
13.
Chem Commun (Camb) ; 48(100): 12165-7, 2012 Dec 28.
Article in English | MEDLINE | ID: mdl-23090440

ABSTRACT

The design, synthesis and electrochemical behaviour of an oligomer consisting of linked thymine-functionalised ferrocene units are reported, which, as a so-called form of ferrocene nucleic acid (FcNA), acts as a structural mimic of DNA.


Subject(s)
Biomimetic Materials/chemistry , DNA/chemistry , Ferrous Compounds/chemistry , Organometallic Compounds/chemistry , Polymers/chemistry , Biomimetic Materials/chemical synthesis , Metallocenes , Models, Molecular , Molecular Conformation , Polymers/chemical synthesis
14.
J Am Chem Soc ; 134(26): 10791-4, 2012 Jul 04.
Article in English | MEDLINE | ID: mdl-22694485

ABSTRACT

Modified DNA strands undergo a reversible light-induced reaction involving the intramolecular photodimerization of two appended anthracene tags. The photodimers exhibit markedly different binding behavior toward a complementary strand that depends on the number of bases between the modified positions. By preforming the duplex, photochromism can be suppressed, illustrating dual-mode gated behavior.


Subject(s)
Anthracenes/chemistry , DNA/radiation effects , Anthracenes/radiation effects , Base Sequence , DNA/chemistry , Light
15.
Bioorg Med Chem Lett ; 22(1): 129-32, 2012 Jan 01.
Article in English | MEDLINE | ID: mdl-22169264

ABSTRACT

Single nucleotide polymorphisms within a sequence of a gene associated with prostate cancer were identified using oligodeoxynucleotide probe sequences bearing internal anthracene fluorophores proximal to the SNP site. Depending upon the nature of the synthesised target sequences, probe-target duplex formation could lead to enhanced or attenuated fluorescence emission from the anthracene, enabling detection of a proximal base-pair as either matching or mismatching.


Subject(s)
DNA Probes/chemistry , DNA/chemistry , Fluorescent Dyes/chemistry , Polymorphism, Single Nucleotide , Prostatic Neoplasms/metabolism , Anthracenes/chemistry , Base Pair Mismatch , Base Sequence , DNA/genetics , Humans , Male , Models, Chemical , Molecular Sequence Data , Nucleic Acid Conformation , Prostatic Neoplasms/genetics , Spectrophotometry, Ultraviolet , Temperature , Thermodynamics
16.
Chem Commun (Camb) ; 47(23): 6629-31, 2011 Jun 21.
Article in English | MEDLINE | ID: mdl-21562680

ABSTRACT

A fluorescent DNA probe containing an anthracene group attached via an anucleosidic linker can identify all four DNA bases at a single site as well as the epigenetic modification C/5-MeC via a hybridisation sensing assay.


Subject(s)
Cytosine/chemistry , DNA/chemistry , Fluorescent Dyes/chemistry , Nucleic Acid Hybridization/methods , Anthracenes/chemistry , Circular Dichroism , DNA Methylation , DNA Probes/chemistry , Spectrometry, Fluorescence
17.
Org Biomol Chem ; 8(12): 2728-34, 2010 Jun 21.
Article in English | MEDLINE | ID: mdl-20393654

ABSTRACT

HyBeacon probes have been used to characterise SNPs in the CYP2C9 and VKORC1 genes associated with variations in the efficiency of warfarin metabolism. PCR amplification of genomic DNA and probe target melting analysis provided a robust and reliable method to differentiate polymorphic sequences at these loci. Probes capped with 5'-trimethoxystilbene were found to exhibit larger fluorescence differences between hybridised and dissociated states than uncapped probes, generating melting peaks of considerably improved height. 3'-Pyrene modifications enhanced probe signal-to-noise and melting peak heights with the additional benefit of acting as PCR stoppers, and 5',3'-doubly capped probes gave the best analytical results.


Subject(s)
Fluorescent Dyes/chemistry , Oligonucleotide Probes/chemistry , Polymorphism, Genetic , Warfarin/metabolism , Aryl Hydrocarbon Hydroxylases/genetics , Cytochrome P-450 CYP2C9 , DNA/genetics , DNA/metabolism , Genotype , Humans , Mixed Function Oxygenases/genetics , Polymerase Chain Reaction/methods , Vitamin K Epoxide Reductases
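
Melting peaks of the kind mentioned in the abstract above are conventionally obtained by plotting the negative first derivative of fluorescence against temperature, so that the peak position gives the melting temperature and the peak height reflects how sharply fluorescence changes on dissociation. The sketch below shows that calculation with central differences on a synthetic curve. The sigmoidal signal, its 60 °C midpoint, and the function name are hypothetical and are not data from the study.

```python
import math


def melting_peak(temps, fluorescence):
    """Return (peak_temperature, peak_height) of the -dF/dT curve.

    The derivative is estimated with central differences, so the first and
    last readings are not considered as peak candidates.
    """
    best_temp, best_height = None, float("-inf")
    for i in range(1, len(temps) - 1):
        slope = (fluorescence[i + 1] - fluorescence[i - 1]) / (temps[i + 1] - temps[i - 1])
        if -slope > best_height:
            best_temp, best_height = temps[i], -slope
    return best_temp, best_height


# Synthetic melt curve: fluorescence falls as the probe dissociates.
temps = [40 + 0.5 * i for i in range(81)]  # 40.0 to 80.0 in 0.5 degree steps
assumed_tm = 60.0                          # hypothetical midpoint
signal = [1.0 / (1.0 + math.exp((t - assumed_tm) / 2.0)) for t in temps]

tm, height = melting_peak(temps, signal)
```

A larger fluorescence difference between the hybridised and dissociated states scales the whole curve, and therefore the peak height, which is consistent with the taller melting peaks the abstract reports for capped probes.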
18.
Chembiochem ; 10(11): 1839-51, 2009 Jul 20.
Article in English | MEDLINE | ID: mdl-19554592

ABSTRACT

Anthraquinone and pyrene analogues attached to the 3' and/or 5' termini of triplex-forming oligonucleotides (TFOs) by various linkers increased the stability of parallel triple helices. The modifications are simple to synthesize and can be introduced during standard solid-phase oligonucleotide synthesis. Potent triplex stability was achieved by using doubly modified TFOs, which in the most favourable cases gave an increase in melting temperature of 30 °C over the unmodified counterparts and maintained their selectivity for the correct target duplex. Such TFOs can produce triplexes with melting temperatures of 40 °C at pH 7 even though they do not contain any triplex-stabilizing base analogues. These studies have implications for the design of triplex-forming oligonucleotides for use in biology and nanotechnology.


Subject(s)
Anthraquinones/chemistry , DNA/chemistry , Oligonucleotides/chemistry , Pyrenes/chemistry , Amines/chemical synthesis , Amines/chemistry , Anthraquinones/chemical synthesis , Base Sequence , Nucleic Acid Conformation , Pyrenes/chemical synthesis , Transition Temperature , Ultraviolet Rays
19.
J Org Chem ; 74(6): 2350-6, 2009 Mar 20.
Article in English | MEDLINE | ID: mdl-19220046

ABSTRACT

The synthesis of a novel C4-linked C2-imidazole ribonucleoside phosphoramidite (ICN-C2-PA 1) with a two-carbon linker between imidazole and ribose moieties is described. In the phosphoramidite, POM and 2-cyanoethyl groups were selected to protect the endocyclic amine function of imidazole and the 2'-hydroxyl function of D-ribose, respectively. The C2-imidazole nucleoside, a flexible structural mimic of a purine nucleobase, was successfully incorporated using ICN-C2-PA 1 into position 638 of the VS ribozyme through 2'-TBDMS chemistry to study the role of G638 in general acid-base catalysis. The modified VS ribozyme (G638C2Imz) exhibited significantly greater catalytic activity than observed with the C0-imidazole that has no carbon atoms linking the ribose and the C4-imidazole. Imidazole nucleoside analogues with variable spacer lengths could provide a valuable general methodology for exploring the catalytic mechanisms of ribozymes.


Subject(s)
Organophosphorus Compounds/chemical synthesis , RNA, Catalytic/chemistry , Ribonucleosides/chemical synthesis , Catalysis , Imidazoles/chemical synthesis
20.
Article in English | MEDLINE | ID: mdl-18058509

ABSTRACT

The synthesis of two anthraquinone phosphoramidites is described. In both cases the anthraquinone moiety is attached via a linker to the 5-position of a uracil base, allowing incorporation at any thymidine position in an oligonucleotide sequence. Anthraquinone-modified oligonucleotides have potential applications as triplex stabilizers and fluorescence quenchers.


Subject(s)
Anthraquinones/chemistry , Oligodeoxyribonucleotides/chemistry , Oligodeoxyribonucleotides/chemical synthesis , Base Sequence , Drug Design , Drug Stability , Fluorescence , Molecular Structure , Nucleic Acid Conformation