Results 1 - 6 of 6
1.
BMC Bioinformatics ; 25(1): 281, 2024 Aug 27.
Article in English | MEDLINE | ID: mdl-39192204

ABSTRACT

BACKGROUND: Mining the vast pool of biomedical literature to extract accurate responses and relevant references is challenging due to the domain's interdisciplinary nature, specialized jargon, and continuous evolution. Early natural language processing (NLP) approaches often led to incorrect answers as they failed to comprehend the nuances of natural language. However, transformer models have significantly advanced the field by enabling the creation of large language models (LLMs), enhancing question-answering (QA) tasks. Despite these advances, current LLM-based solutions for specialized domains like biology and biomedicine still struggle to generate up-to-date responses while avoiding "hallucination", that is, generating plausible but factually incorrect responses.

RESULTS: Our work focuses on enhancing prompts using a retrieval-augmented architecture to guide LLMs in generating meaningful responses for biomedical QA tasks. We evaluated two approaches: one relying on text embedding and vector similarity in a high-dimensional space, and our proposed method, which uses explicit signals in user queries to extract meaningful contexts. For robust evaluation, we tested these methods on 50 specific and challenging questions from diverse biomedical topics, comparing their performance against a baseline model, BM25. Retrieval performance of our method was significantly better than that of the others, achieving a median Precision@10 of 0.95, where Precision@10 is the fraction of the top 10 retrieved chunks that are relevant. We used GPT-4, OpenAI's most advanced LLM, to maximize the answer quality and manually assessed LLM-generated responses. Our method achieved a median answer quality score of 2.5, surpassing both the baseline model and the text embedding-based approach. We developed a QA bot, WeiseEule ( https://github.com/wasimaftab/WeiseEule-LocalHost ), which utilizes these methods for comparative analysis and also offers advanced features for review writing and identifying relevant articles for citation.

CONCLUSIONS: Our findings highlight the importance of prompt enhancement methods that utilize explicit signals in user queries over traditional text embedding-based approaches to improve LLM-generated responses for specialized queries in specialized domains such as biology and biomedicine. By providing users complete control over the information fed into the LLM, our approach addresses some of the major drawbacks of existing web-based chatbots and LLM-based QA systems, including hallucinations and the generation of irrelevant or outdated responses.
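
The retrieval metric named in the abstract, Precision@10 summarized as a median over questions, can be illustrated with a minimal sketch. This is not the WeiseEule implementation: the function name `precision_at_k`, the chunk IDs, and the relevance sets below are hypothetical, and the example uses k = 4 on three toy queries only to keep the data small.

```python
from statistics import median


def precision_at_k(retrieved, relevant, k=10):
    """Return the fraction of the top-k retrieved chunk IDs that are relevant.

    Assumption: the denominator is always k, so a retriever that returns
    fewer than k chunks is penalized for the missing slots.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = retrieved[:k]
    hits = sum(1 for chunk_id in top_k if chunk_id in relevant)
    return hits / k


# Hypothetical toy data: query -> (ranked chunk IDs, set of relevant chunk IDs).
runs = {
    "q1": (["c1", "c2", "c3", "c4"], {"c1", "c3"}),
    "q2": (["c5", "c6", "c7", "c8"], {"c5", "c6", "c7", "c8"}),
    "q3": (["c9", "c10", "c11", "c12"], {"c12"}),
}

# One Precision@4 score per query, then the median across queries.
scores = [precision_at_k(ranked, rel, k=4) for ranked, rel in runs.values()]
median_p = median(scores)
print(scores, median_p)
```

Reporting the median rather than the mean, as the abstract does, makes the summary less sensitive to a few questions on which retrieval fails badly.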


Subject(s)
Data Mining , Natural Language Processing , Data Mining/methods , Information Storage and Retrieval/methods
2.
Elife ; 9, 2020 05 20.
Article in English | MEDLINE | ID: mdl-32432549

ABSTRACT

Histone acetylation and deposition of H2A.Z variant are integral aspects of active transcription. In Drosophila, the single DOMINO chromatin regulator complex is thought to combine both activities via an unknown mechanism. Here we show that alternative isoforms of the DOMINO nucleosome remodeling ATPase, DOM-A and DOM-B, directly specify two distinct multi-subunit complexes. Both complexes are necessary for transcriptional regulation but through different mechanisms. The DOM-B complex incorporates H2A.V (the fly ortholog of H2A.Z) genome-wide in an ATP-dependent manner, like the yeast SWR1 complex. The DOM-A complex, instead, functions as an ATP-independent histone acetyltransferase complex similar to the yeast NuA4, targeting lysine 12 of histone H4. Our work provides an instructive example of how different evolutionary strategies lead to similar functional separation. In yeast and humans, nucleosome remodeling and histone acetyltransferase complexes originate from gene duplication and paralog specification. Drosophila generates the same diversity by alternative splicing of a single gene.


Cells contain a large number of proteins that control the activity of genes in response to various signals and changes in their environment. Often these proteins work together in groups called complexes. In the fruit fly Drosophila melanogaster, one of these complexes is called DOMINO. The DOMINO complex alters gene activity by interacting with other proteins called histones which influence how the genes are packaged and accessed within the cell. DOMINO works in two separate ways. First, it can replace certain histones with other variants that regulate genes differently. Second, it can modify histones by adding a chemical marker to them, which alters how they interact with genes. It was not clear how DOMINO can do both of these things and how that is controlled, but it was known that cells can make two different forms of the central component of the complex, called DOM-A and DOM-B, which are both encoded by the same gene. Scacchetti et al. have now studied fruit flies to understand the activities of these forms. This revealed that they do have different roles and that gene activity in cells changes if either one is lost. The two forms operate as part of complexes with different compositions and only DOM-A includes the TIP60 enzyme that is needed to modify histones. As such, it seems that DOM-B primarily replaces histones with variant forms, while DOM-A modifies existing histones. This means that each form has a unique role associated with each of the two known behaviors of this complex. The presence of two different DOMINO complexes is common to flies and, probably, other insects. Yet, in other living things, such as mammals and yeast, their two roles are carried out by protein complexes originating from two distinct genes. This illustrates a concept called convergent evolution, where different organisms find different solutions for the same problem. As such, these findings provide an insight into the challenges encountered through evolution and the diverse solutions that have developed. They will also help us to understand the ways in which protein activities can adapt to different needs over evolutionary time.


Subject(s)
Drosophila Proteins/metabolism , Drosophila/enzymology , Histone Acetyltransferases/metabolism , Multiprotein Complexes/metabolism , Transcription Factors/metabolism , Adenosine Triphosphatases/genetics , Adenosine Triphosphatases/metabolism , Animals , Chromatin Assembly and Disassembly , Drosophila/genetics , Drosophila Proteins/genetics , Histone Acetyltransferases/genetics , Histones/genetics , Histones/metabolism , Multiprotein Complexes/genetics , Nucleosomes/genetics , Nucleosomes/metabolism , Saccharomyces cerevisiae/enzymology , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae Proteins/metabolism , Transcription Factors/genetics
3.
Bioessays ; 41(4): e1800201, 2019 04.
Article in English | MEDLINE | ID: mdl-30919497

ABSTRACT

Transcription is a potential threat to genome integrity, and transcription-associated DNA damage must be repaired for proper messenger RNA (mRNA) synthesis and for cells to transmit their genome intact into progeny. For a wide range of structurally diverse DNA lesions, cells employ the highly conserved nucleotide excision repair (NER) pathway to restore their genome back to its native form. Recent evidence suggests that NER factors function, in addition to the canonical DNA repair mechanism, in processes that facilitate mRNA synthesis or shape the 3D chromatin architecture. Here, these findings are critically discussed, and a working model is proposed that explains the puzzling clinical heterogeneity of NER syndromes and highlights the relevance of physiological, transcription-associated DNA damage to mammalian development and disease.


Subject(s)
DNA Repair/genetics , Genomic Instability , Transcription, Genetic , Animals , Chromatin/chemistry , Chromatin/metabolism , DNA Damage/genetics , Humans , RNA, Messenger/biosynthesis
4.
Nat Cell Biol ; 19(5): 421-432, 2017 May.
Article in English | MEDLINE | ID: mdl-28368372

ABSTRACT

Inborn defects in DNA repair are associated with complex developmental disorders whose causal mechanisms are poorly understood. Using an in vivo biotinylation tagging approach in mice, we show that the nucleotide excision repair (NER) structure-specific endonuclease ERCC1-XPF complex interacts with the insulator binding protein CTCF, the cohesin subunits SMC1A and SMC3 and with MBD2; the factors co-localize with ATRX at the promoters and control regions (ICRs) of imprinted genes during postnatal hepatic development. Loss of Ercc1 or exposure to MMC triggers the localization of CTCF to heterochromatin, the dissociation of the CTCF-cohesin complex and ATRX from promoters and ICRs, altered histone marks and the aberrant developmental expression of imprinted genes without altering DNA methylation. We propose that ERCC1-XPF cooperates with CTCF and cohesin to facilitate the developmental silencing of imprinted genes and that persistent DNA damage triggers chromatin changes that affect gene expression programs associated with NER disorders.


Subject(s)
Cell Cycle Proteins/metabolism , Chromosomal Proteins, Non-Histone/metabolism , DNA Repair , DNA-Binding Proteins/metabolism , Endonucleases/metabolism , Gene Silencing , Genomic Imprinting , Repressor Proteins/metabolism , Age Factors , Animals , Animals, Newborn , CCCTC-Binding Factor , Cell Cycle Proteins/genetics , Cells, Cultured , Chondroitin Sulfate Proteoglycans/genetics , Chondroitin Sulfate Proteoglycans/metabolism , Chromosomal Proteins, Non-Histone/genetics , Coculture Techniques , DNA Damage , DNA Helicases/genetics , DNA Helicases/metabolism , DNA-Binding Proteins/genetics , Endonucleases/genetics , Fibroblasts/enzymology , Gene Expression Regulation, Developmental , Genotype , Histones/metabolism , Liver/enzymology , Mice, 129 Strain , Mice, Inbred C57BL , Mice, Transgenic , Nuclear Proteins/genetics , Nuclear Proteins/metabolism , Phenotype , Promoter Regions, Genetic , Repressor Proteins/genetics , X-linked Nuclear Protein , Cohesins
5.
PLoS Biol ; 12(11): e1002005, 2014 Nov.
Article in English | MEDLINE | ID: mdl-25423365

ABSTRACT

Myriapods (e.g., centipedes and millipedes) display a simple homonomous body plan relative to other arthropods. All members of the class are terrestrial, but they attained terrestriality independently of insects. Myriapoda is the only arthropod class not represented by a sequenced genome. We present an analysis of the genome of the centipede Strigamia maritima. It retains a compact genome that has undergone less gene loss and shuffling than previously sequenced arthropods, and many orthologues of genes conserved from the bilaterian ancestor that have been lost in insects. Our analysis locates many genes in conserved macro-synteny contexts, and many small-scale examples of gene clustering. We describe several examples where S. maritima shows different solutions from insects to similar problems. The insect olfactory receptor gene family is absent from S. maritima, and olfaction in air is likely effected by expansion of other receptor gene families. For some genes S. maritima has evolved paralogues to generate coding sequence diversity, where insects use alternate splicing. This is most striking for the Dscam gene, which in Drosophila generates more than 100,000 alternate splice forms, but in S. maritima is encoded by over 100 paralogues. We see an intriguing linkage between the absence of any known photosensory proteins in a blind organism and the additional absence of canonical circadian clock genes. The phylogenetic position of myriapods allows us to identify where in arthropod phylogeny several particular molecular mechanisms and traits emerged. For example, we conclude that juvenile hormone signalling evolved with the emergence of the exoskeleton in the arthropods and that RR-1 containing cuticle proteins evolved in the lineage leading to Mandibulata. We also identify when various gene expansions and losses occurred. The genome of S. maritima offers us a unique glimpse into the ancestral arthropod genome, while also displaying many adaptations to its specific life history.


Subject(s)
Arthropods/genetics , Genome , Synteny , Animals , Circadian Rhythm Signaling Peptides and Proteins/genetics , DNA Methylation , Evolution, Molecular , Female , Genome, Mitochondrial , Hormones/genetics , Male , Multigene Family , Phylogeny , Polymorphism, Genetic , Protein Kinases/genetics , RNA, Untranslated/genetics , Receptors, Odorant/genetics , Selenoproteins/genetics , Sex Chromosomes , Transcription Factors/genetics
6.
Evol Dev ; 12(4): 347-52, 2010.
Article in English | MEDLINE | ID: mdl-20618430

ABSTRACT

Geophilomorph centipedes show variation in segment number (a) between closely related species and (b) within and between populations of the same species. We have previously shown for a Scottish population of the coastal centipede Strigamia maritima that the temperature of embryonic development is one of the factors that affects the segment number of hatchlings, and hence of adults, as these animals grow epimorphically--that is, without postembryonic addition of segments. Here, we show, using temperature-shift experiments, that the main developmental period during which embryos are sensitive to environmental temperature is surprisingly early, during blastoderm formation and before, or very shortly after, the onset of segmentation.


Subject(s)
Arthropods/embryology , Body Patterning/physiology , Embryonic Development , Temperature , Animals , Arthropods/anatomy & histology , Blastoderm/growth & development , Blastoderm/ultrastructure , Embryo, Nonmammalian , Female , Male , Time Factors