Results 1 - 20 of 225
1.
Nucleic Acids Res ; 2024 Sep 24.
Article in English | MEDLINE | ID: mdl-39315698

ABSTRACT

Epigenetic aberration is one of the major driving factors in human cancer, often leading to acquired resistance to chemotherapies. Various small molecule epigenetic modulators have been reported. Nonetheless, outcomes from animal models and clinical trials have underscored the substantial setbacks attributed to pronounced on- and off-target toxicities. To address these challenges, CRISPR/dCas9 technology is emerging as a potent tool for precise modulation of epigenetic mechanisms. However, this technology involves co-expressing exogenous epigenetic modulator proteins, which presents technical challenges in preparation and delivery with potential undesirable side effects. Recently, our research demonstrated that Cas9 tagged with the Phe-Cys-Pro-Phe (FCPF)-peptide motif can be specifically targeted by perfluorobiphenyl (PFB) derivatives. Here, we integrated the FCPF-tag into dCas9 and established a chemically inducible platform for epigenome editing, called Chem-CRISPR/dCas9FCPF. We designed a series of chemical inhibitor-PFB conjugates targeting various epigenetic modulator proteins. Focusing on JQ1, a pan-BET inhibitor, we demonstrate that c-MYC-sgRNA-guided JQ1-PFB specifically inhibits BRD4 in close proximity to the c-MYC promoter/enhancer, thereby effectively repressing the intricate transcription networks orchestrated by c-MYC as compared with JQ1 alone. In conclusion, our Chem-CRISPR/dCas9FCPF platform significantly increased target specificity of chemical epigenetic inhibitors, offering a viable alternative to conventional fusion protein systems for epigenome editing.

2.
Sci Rep ; 14(1): 20051, 2024 08 29.
Article in English | MEDLINE | ID: mdl-39209947

ABSTRACT

Skin inflammation with the potential sequel of moist epitheliolysis and edema constitute the most frequent breast radiotherapy (RT) acute side effects. The aim of this study was to compare the predictive value of tissue-derived radiomics features to that of the total breast volume (TBV) for moist cells epitheliolysis, as a surrogate for skin inflammation, and for edema. Radiomics features were extracted from computed tomography (CT) scans of 252 breast cancer patients from two volumes of interest: TBV and glandular tissue (GT). Machine learning classifiers were trained on radiomics and clinical features, which were evaluated for both side effects. The best radiomics model was a least absolute shrinkage and selection operator (LASSO) classifier, using TBV features, predicting moist cells epitheliolysis, achieving an area under the receiver operating characteristic curve (AUROC) of 0.74. This was comparable to TBV (AUROC of 0.75). Combined models of radiomics and clinical features did not improve performance. Exclusion of volume-correlated features slightly reduced the predictive performance (AUROC 0.71). We could demonstrate the general propensity of planning CT-based radiomics models to predict breast RT-dependent side effects. Mammary tissue was more predictive than glandular tissue. The performance of the radiomics features was influenced by their high correlation to TBV.


Subject(s)
Breast Neoplasms , Tomography, X-Ray Computed , Humans , Female , Breast Neoplasms/radiotherapy , Breast Neoplasms/diagnostic imaging , Breast Neoplasms/pathology , Tomography, X-Ray Computed/methods , Middle Aged , Aged , Adult , Machine Learning , Breast/diagnostic imaging , Breast/pathology , Breast/radiation effects , Radiomics
3.
Genome Biol ; 25(1): 222, 2024 Aug 16.
Article in English | MEDLINE | ID: mdl-39152483

ABSTRACT

BACKGROUND: Reproducibility is a major concern in biomedical studies, and existing publication guidelines do not solve the problem. Batch effects and quality imbalances between groups of biological samples are major factors hampering reproducibility. Yet, the latter is rarely considered in the scientific literature. RESULTS: Our analysis uses 40 clinically relevant RNA-seq datasets to quantify the impact of quality imbalance between groups of samples on the reproducibility of gene expression studies. High-quality imbalance is frequent (14 datasets; 35%), and hundreds of quality markers are present in more than 50% of the datasets. Enrichment analysis suggests common stress-driven effects among the low-quality samples and highlights a complementary role of transcription factors and miRNAs to regulate stress response. Preliminary ChIP-seq results show similar trends. Quality imbalance has an impact on the number of differential genes derived by comparing control to disease samples (the higher the imbalance, the higher the number of genes), on the proportion of quality markers in top differential genes (the higher the imbalance, the higher the proportion; up to 22%) and on the proportion of known disease genes in top differential genes (the higher the imbalance, the lower the proportion). We show that removing outliers based on their quality score improves the resulting downstream analysis. CONCLUSIONS: Thanks to a stringent selection of well-designed datasets, we demonstrate that quality imbalance between groups of samples can significantly reduce the relevance of differential genes, consequently reducing reproducibility between studies. Appropriate experimental design and analysis methods can substantially reduce the problem.


Subject(s)
Sequence Analysis, RNA , Humans , Reproducibility of Results
4.
NAR Genom Bioinform ; 6(2): lqae053, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38774515

ABSTRACT

Genetic variation within populations plays a crucial role in driving evolution. Unlike the average protein sequence, the evolution of homorepeats can be influenced by DNA replication slippage, when DNA polymerases either add or skip repeats of nucleotides. While there are some diseases known to be caused by abnormal changes in the length of amino acid homorepeats, naturally occurring variations in homorepeat length remain relatively unexplored. In our study, we examined the variation in amino acid homorepeat length of human individuals by analyzing 125 748 exomes, as well as 15 708 whole genomes. Our analyses revealed significant variability in homorepeat length across the human population, indicating that these motifs are prone to mutations at higher rates than non-repeat sequences. We focused our study on glutamine homorepeats, also known as polyQ sequences, and found that shorter polyQ sequences tend to exhibit greater length variation, while longer ones primarily undergo deletions. Notably, polyQ sequences that are more conserved across primates tend to show less variation within the human population, indicating stronger selective pressure to maintain their length. Overall, our results demonstrate that there is large natural variation in the length of homorepeats within the human population, with no apparent impact on observable traits.

5.
Int J Mol Sci ; 25(5)2024 Mar 05.
Article in English | MEDLINE | ID: mdl-38474241

ABSTRACT

Tandem repeats (TRs) in protein sequences are consecutive, highly similar sequence motifs. Some types of TRs fold into structural units that pack together in ensembles, forming either an (open) elongated domain or a (closed) propeller, where the last unit of the ensemble packs against the first one. Here, we examine TR proteins (TRPs) to see how their sequence, structure, and evolutionary properties favor them for a function as mediators of protein interactions. Our observations suggest that TRPs bind other proteins using large, structured surfaces like globular domains; in particular, open-structured TR ensembles are favored by flexible termini and the possibility to tightly coil against their targets. While, intuitively, open ensembles of TRs seem prone to evolve due to their potential to accommodate insertions and deletions of units, these evolutionary events are unexpectedly rare, suggesting that they are advantageous for the emergence of the ancestral sequence but are early fixed. We hypothesize that their flexibility makes it easier for further proteins to adapt to interact with them, which would explain their large number of protein interactions. We provide insight into the properties of open TR ensembles, which make them scaffolds for alternative protein complexes to organize genes, RNA and proteins.


Subject(s)
Proteins , Tandem Repeat Sequences , Proteins/chemistry , Amino Acid Sequence
6.
Front Mol Neurosci ; 16: 1280546, 2023.
Article in English | MEDLINE | ID: mdl-38125008

ABSTRACT

Spinocerebellar ataxia type 1 (SCA1) is an autosomal dominant neurodegenerative disease caused by a trinucleotide (CAG) repeat expansion in the ATXN1 gene. It is characterized by the presence of polyglutamine (polyQ) intranuclear inclusion bodies (IIBs) within affected neurons. In order to investigate the impact of polyQ IIBs in SCA1 pathogenesis, we generated a novel protein aggregation model by inducible overexpression of the mutant ATXN1(Q82) isoform in human neuroblastoma SH-SY5Y cells. Moreover, we developed a simple and reproducible protocol for the efficient isolation of insoluble IIBs. Biophysical characterization showed that polyQ IIBs are enriched in RNA molecules which were further identified by next-generation sequencing. Finally, a protein interaction network analysis indicated that sequestration of essential RNA transcripts within ATXN1(Q82) IIBs may affect the ribosome resulting in error-prone protein synthesis and global proteome instability. These findings provide novel insights into the molecular pathogenesis of SCA1, highlighting the role of polyQ IIBs and their impact on critical cellular processes.

7.
Curr Issues Mol Biol ; 45(12): 9904-9916, 2023 Dec 09.
Article in English | MEDLINE | ID: mdl-38132464

ABSTRACT

Lipids are important modifiers of protein function, particularly as parts of lipoproteins, which transport lipophilic substances and mediate cellular uptake of circulating lipids. As such, lipids are of particular interest as blood biological markers for cardiovascular disease (CVD) as well as for conditions linked to CVD such as atherosclerosis, diabetes mellitus, obesity and dietary states. Notably, lipid research is particularly well developed in the context of CVD because of the relevance and multiple causes and risk factors of CVD. The advent of methods for high-throughput screening of biological molecules has recently resulted in the generation of lipidomic profiles that allow monitoring of lipid compositions in biological samples in an untargeted manner. These and other earlier advances in biomedical research have shaped the knowledge we have about lipids in CVD. To evaluate the knowledge acquired on the multiple biological functions of lipids in CVD and the trends in their research, we collected a dataset of references from the PubMed database of biomedical literature focused on plasma lipids and CVD in human and mouse. Using annotations from these records, we were able to categorize significant associations between lipids and particular types of research approaches, distinguish non-biological lipids used as markers, identify differential research between human and mouse models, and detect the increasingly mechanistic nature of the results in this field. Using known associations between lipids and proteins that metabolize or transport them, we constructed a comprehensive lipid-protein network, which we used to highlight proteins strongly connected to lipids found in the CVD-lipid literature. Our approach points to a series of proteins for which lipid-focused research would bring insights into CVD, including Prostaglandin G/H synthase 2 (PTGS2, a.k.a. COX2) and Acylglycerol kinase (AGK). In this review, we summarize our findings, putting them in a historical perspective of the evolution of lipid research in CVD.

8.
Curr Opin Struct Biol ; 83: 102726, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37924569

ABSTRACT

Homorepeats (or polyX), protein segments containing repetitions of the same amino acid, are abundant in proteomes from all kingdoms of life and are involved in crucial biological functions as well as several neurodegenerative and developmental diseases. Mainly inserted in disordered segments of proteins, the structure/function relationships of homorepeats remain largely unexplored. In this review, we summarize present knowledge for the most abundant homorepeats, highlighting the role of the inherent structure and the conformational influence exerted by their flanking regions. Recent experimental and computational methods enable residue-specific investigations of these regions and promise novel structural and dynamic information for this elusive group of proteins. This information should increase our knowledge about the structural bases of phenomena such as liquid-liquid phase separation and trinucleotide repeat disorders.


Subject(s)
Intrinsically Disordered Proteins , Proteome , Proteome/chemistry , Protein Conformation , Repetitive Sequences, Amino Acid , Amino Acids , Structure-Activity Relationship , Intrinsically Disordered Proteins/chemistry
9.
Comput Struct Biotechnol J ; 21: 5408-5412, 2023.
Article in English | MEDLINE | ID: mdl-38022702

ABSTRACT

PolyXY regions are compositionally biased regions composed of two different amino acids. They are classified according to the arrangement of the two amino acid types 'X' and 'Y' into direpeats (composed of alternating amino acids, e.g. 'XYXYXY'), joined (composed of two consecutive stretches of each amino acid, e.g. 'XXXYYY') and shuffled (other arrangements, e.g., 'XYXXYY'). They have been characterized at the amino acid level in all domains of life, and are described as often found within intrinsically disordered regions. Since DNA replication slippage has been proposed as a driver of repeat variation, and given that some polyXY have a repetitive nature, we hypothesized that characterizing the nucleotide coding of various types of polyXY could give hints about their origin and evolution. To test this, we obtained all polyXY regions in the human transcriptome, categorized them, and studied their coding nucleotide sequences. We observed that polyXY exacerbates the codon biases, and that the similarity between the X and Y codons is higher than in the background proteome. Our results support a general mechanism of emergence and evolution of polyXY from single-codon polyX. PolyXY are revealed as hotspots for replication slippage, particularly those composed of repeats: joined and direpeat polyXY. Inter-conversion to shuffled polyXY disrupts nucleotide repeats and restricts further evolution by replication slippage, a mechanism that we previously observed in polyX. Our results shed light on polyXY composition and should simplify the determination of their functions.

10.
Sci Rep ; 13(1): 17427, 2023 10 13.
Article in English | MEDLINE | ID: mdl-37833283

ABSTRACT

Patients suffering from painful spinal bone metastases (PSBMs) often undergo palliative radiation therapy (RT), which is effective in approximately two thirds of patients. In this exploratory investigation, we assessed the effectiveness of machine learning (ML) models trained on radiomics, semantic and clinical features to estimate complete pain response. Gross tumour volumes (GTV) and clinical target volumes (CTV) of 261 PSBMs were segmented on planning computed tomography (CT) scans. Radiomics, semantic and clinical features were collected for all patients. Random forest (RFC) and support vector machine (SVM) classifiers were compared using repeated nested cross-validation. The best radiomics classifier was trained on CTV with an area under the receiver-operator curve (AUROC) of 0.62 ± 0.01 (RFC; 95% confidence interval). The semantic model achieved a comparable AUROC of 0.63 ± 0.01 (RFC), significantly below the clinical model (SVM, AUROC: 0.80 ± 0.01); and slightly lower than the spinal instability neoplastic score (SINS; LR, AUROC: 0.65 ± 0.01). A combined model did not improve performance (AUROC: 0.74 ± 0.01). We could demonstrate that radiomics and semantic analyses of planning CTs allowed for limited prediction of therapy response to palliative RT. ML predictions based on established clinical parameters achieved the best results.


Subject(s)
Neoplasms , Tomography, X-Ray Computed , Humans , ROC Curve , Tomography, X-Ray Computed/methods , Neoplasms/radiotherapy , Machine Learning , Pain , Retrospective Studies
11.
Genes (Basel) ; 14(9)2023 08 28.
Article in English | MEDLINE | ID: mdl-37761851

ABSTRACT

Intrinsically disordered regions (IDRs) in protein sequences are emerging as functionally important elements for interaction and regulation. Although IDRs are generally flexible, we previously showed, by observation of experimentally obtained structures, that they contain regions of reduced sequence complexity that have an increased propensity to form structure. Here we expand the universe of cases taking advantage of structural predictions by AlphaFold. Our studies focus on low complexity regions (LCRs) found within IDRs, where these LCRs have only one or two residue types (polyX and polyXY, respectively). In addition to confirming previous observations that polyE and polyEK have a tendency towards helical structure, we find a similar tendency for other LCRs such as polyQ and polyER, most of them including charged residues. We analyzed the position of polyXY-containing IDRs within proteins, which allowed us to show that polyAG and polyAK accumulate at the N-terminus, with the latter showing increased helical propensity at that location. Functional enrichment analysis of polyXY with helical propensity indicated functions requiring interaction with RNA and DNA. Our work adds evidence of the function of LCRs in interaction-dependent structuring of disordered regions, encouraging the development of tools for the prediction of their dynamic structural properties.


Subject(s)
RNA , Amino Acid Sequence , Protein Domains
12.
Int J Mol Sci ; 24(18)2023 Sep 13.
Article in English | MEDLINE | ID: mdl-37762354

ABSTRACT

Tuberculosis remains the leading cause of death from a single pathogen. On the other hand, antimicrobial resistance (AMR) makes it increasingly difficult to deal with this disease. We present the hyperbolic embedding of the Mycobacterium tuberculosis protein interaction network (mtbPIN) of resistant strain (MTB XDR1219) to determine the biological relevance of its latent geometry. In this hypermap, proteins with similar interacting partners occupy close positions. An analysis of the hypermap of available drug targets (DTs) and their direct and intermediate interactors was used to identify potentially useful drug combinations and drug targets. We identify rpsA and rpsL as close DTs targeted by different drugs (pyrazinamide and aminoglycosides, respectively) and propose that the combination of these drugs could have a synergistic effect. We also used the hypermap to explain the effects of drugs that affect multiple DTs, for example, forcing the bacteria to deal with multiple stresses like ethambutol, which affects the synthesis of both arabinogalactan and lipoarabinomannan. Our strategy uncovers novel potential DTs, such as dprE1 and dnaK proteins, which interact with two close DT pairs: arabinosyltransferases (embC and embB), Ser/Thr protein kinase (pknB) and RNA polymerase (rpoB), respectively. Our approach provides mechanistic explanations for existing drugs and suggests new DTs. This strategy can also be applied to the study of other resistant strains.

13.
J Struct Biol ; 215(4): 108023, 2023 12.
Article in English | MEDLINE | ID: mdl-37652396

ABSTRACT

Tandem Repeat Proteins (TRPs) are a class of proteins with repetitive amino acid sequences that have been studied extensively for over two decades. Different features at the level of sequence, structure, function and evolution have been attributed to them by various authors. And yet many of their salient features appear only when looking at specific subclasses of protein tandem repeats. Here, we attempt to rationalize the existing knowledge on TRPs by pointing out several dichotomies. The emerging picture is more nuanced than generally assumed and allows us to draw some boundaries of what is not a "proper" TRP. We conclude with an operational definition of a specific subset, which we have denominated STRPs (Structural Tandem Repeat Proteins), which separates a subclass of tandem repeats with distinctive features from several other less well-defined types of repeats. We believe that this definition will help researchers in the field to better characterize the biological meaning of this large yet largely understudied group of proteins.


Subject(s)
Proteins , Tandem Repeat Sequences , Proteins/genetics , Proteins/chemistry , Tandem Repeat Sequences/genetics , Amino Acid Sequence
14.
J Thromb Haemost ; 21(10): 2797-2810, 2023 10.
Article in English | MEDLINE | ID: mdl-37481073

ABSTRACT

BACKGROUND: Recurrent events frequently occur after venous thromboembolism (VTE) and remain difficult to predict based on established genetic, clinical, and proteomic contributors. The role of circulating microRNAs (miRNAs) has yet to be explored in detail. OBJECTIVES: To identify circulating miRNAs predictive of recurrent VTE or death, and to interpret their mechanistic involvement. METHODS: Data from 181 participants of a cohort study of acute VTE and 302 individuals with a history of VTE from a population-based cohort were investigated. Next-generation sequencing was performed on EDTA plasma samples to detect circulating miRNAs. The endpoint of interest was recurrent VTE or death. Penalized regression was applied to identify an outcome-relevant miRNA signature, and results were validated in the population-based cohort. The involvement of miRNAs in coregulatory networks was assessed using principal component analysis, and the associated clinical and molecular phenotypes were investigated. Mechanistic insights were obtained from target gene and pathway enrichment analyses. RESULTS: A total of 1950 miRNAs were detected across cohorts after postprocessing. In the discovery cohort, 50 miRNAs were associated with recurrent VTE or death (cross-validated C-index, 0.65). A weighted miRNA score predicted outcome over an 8-year follow-up period (HRSD, 2.39; 95% CI, 1.98-2.88; P < .0001). The independent validation cohort validated 20 miRNAs (ORSD for score, 3.47; 95% CI, 2.37-5.07; P < .0001; cross-validated-area under the curve, 0.61). Principal component analysis revealed 5 miRNA networks with distinct relationships to clinical phenotype and outcome. Mapping of target genes indicated regulation via transcription factors and kinases involved in signaling pathways associated with fibrinolysis. CONCLUSION: Circulating miRNAs predicted the risk of recurrence or death after VTE over several years, both in the acute and chronic phases.


Subject(s)
Circulating MicroRNA , MicroRNAs , Venous Thromboembolism , Humans , Circulating MicroRNA/genetics , Venous Thromboembolism/diagnosis , Venous Thromboembolism/genetics , Cohort Studies , Proteomics , MicroRNAs/genetics
15.
Biomolecules ; 13(7)2023 07 13.
Article in English | MEDLINE | ID: mdl-37509152

ABSTRACT

Tandem repeats in proteins are patterns of residues repeated directly adjacent to each other. The evolution of these repeats can be assessed by using groups of homologous sequences, which can help point to events of unit duplication or deletion. High pressure in a protein family for variation of a given type of repeat might point to its function. Here, we propose the analysis of protein families to calculate protein short tandem repeats (pSTRs) in each protein sequence and assess their variability within the family in terms of number of units. To facilitate this analysis, we developed the pSTR tool, a method to analyze the evolution of protein short tandem repeats in a given protein family by pairwise comparisons between evolutionarily related protein sequences. We evaluated pSTR unit number variation in protein families of 12 complete metazoan proteomes. We hypothesize that families with more dynamic ensembles of repeats could reflect particular roles of these repeats in processes that require more adaptability.


Subject(s)
Microsatellite Repeats , Proteome , Animals , Amino Acid Sequence , Evolution, Molecular
17.
Eur J Med Chem ; 257: 115513, 2023 Sep 05.
Article in English | MEDLINE | ID: mdl-37253308

ABSTRACT

The identification of small molecules capable of replacing transcription factors has been a longstanding challenge in the generation of human chemically induced pluripotent stem cells (iPSCs). Recent studies have shown that ectopic expression of OCT4, one of the master pluripotency regulators, compromised the developmental potential of resulting iPSCs. This highlights the importance of finding endogenous OCT4 inducers for the generation of clinical-grade human iPSCs. Through a cell-based high throughput screen, we have discovered several new OCT4-inducing compounds (O4Is). In this work, we prepared metabolically stable analogues, including O4I4, which activate endogenous OCT4 and associated signaling pathways in various cell lines. By combining these with a transcription factor cocktail consisting of SOX2, KLF4, MYC, and LIN28 (referred to as "CSKML"), we succeeded in reprogramming human fibroblasts into a stable and authentic pluripotent state without the need for exogenous OCT4. In Caenorhabditis elegans and Drosophila, O4I4 extends lifespan, suggesting the potential application of OCT4-inducing compounds in regenerative medicine and rejuvenation therapy.


Subject(s)
Cellular Reprogramming , Induced Pluripotent Stem Cells , Humans , Kruppel-Like Factor 4 , Induced Pluripotent Stem Cells/metabolism , Transcription Factors/metabolism , Aging , Cell Differentiation
18.
EMBO J ; 42(11): e110384, 2023 06 01.
Article in English | MEDLINE | ID: mdl-37083045

ABSTRACT

Most adult hippocampal neural stem cells (NSCs) remain quiescent, with only a minor portion undergoing active proliferation and neurogenesis. The molecular mechanisms that trigger the transition from quiescence to activation are still poorly understood. Here, we found the activity of the transcriptional co-activator Yap1 to be enriched in active NSCs. Genetic deletion of Yap1 led to a significant reduction in the relative proportion of active NSCs, supporting a physiological role of Yap1 in regulating the transition from quiescence to activation. Overexpression of wild-type Yap1 in adult NSCs did not induce NSC activation, suggesting tight upstream control mechanisms, but overexpression of a gain-of-function mutant (Yap1-5SA) elicited cell cycle entry in NSCs and hilar astrocytes. Consistent with a role of Yap1 in NSC activation, single cell RNA sequencing revealed a partial induction of an activated NSC gene expression program. Furthermore, Yap1-5SA expression also induced expression of Taz and other key components of the Yap/Taz regulon that were previously identified in glioblastoma stem cell-like cells. Consequently, dysregulated Yap1 activity led to repression of hippocampal neurogenesis, aberrant cell differentiation, and partial acquisition of a glioblastoma stem cell-like signature.


Subject(s)
Glioblastoma , Neural Stem Cells , Adult , Humans , Glioblastoma/metabolism , Cell Differentiation/physiology , Hippocampus/metabolism , Neurogenesis/genetics , Transcription Factors/genetics , Transcription Factors/metabolism , Adaptor Proteins, Signal Transducing/genetics , Adaptor Proteins, Signal Transducing/metabolism , Neural Stem Cells/metabolism
19.
Cancers (Basel) ; 15(7)2023 Apr 05.
Article in English | MEDLINE | ID: mdl-37046811

ABSTRACT

BACKGROUND: The aim of this study was to develop and validate radiogenomic models to predict the MDM2 gene amplification status and differentiate between ALTs and lipomas on preoperative MR images. METHODS: MR images were obtained in 257 patients diagnosed with ALTs (n = 65) or lipomas (n = 192) using histology and the MDM2 gene analysis as a reference standard. The protocols included T2-, T1-, and fat-suppressed contrast-enhanced T1-weighted sequences. Additionally, 50 patients were obtained from a different hospital for external testing. Radiomic features were selected using mRMR. Using repeated nested cross-validation, the machine-learning models were trained on radiomic features and demographic information. For comparison, the external test set was evaluated by three radiology residents and one attending radiologist. RESULTS: A LASSO classifier trained on radiomic features from all sequences performed best, with an AUC of 0.88, 70% sensitivity, 81% specificity, and 76% accuracy. In comparison, the radiology residents achieved 60-70% accuracy, 55-80% sensitivity, and 63-77% specificity, while the attending radiologist achieved 90% accuracy, 96% sensitivity, and 87% specificity. CONCLUSION: A radiogenomic model combining features from multiple MR sequences showed the best performance in predicting the MDM2 gene amplification status. The model showed a higher accuracy compared to the radiology residents, though lower compared to the attending radiologist.

20.
J Struct Biol ; 215(2): 107962, 2023 06.
Article in English | MEDLINE | ID: mdl-37031868

ABSTRACT

Nucleocytoplasmic large DNA viruses (NCLDVs or giant viruses) stand out because of their relatively large genomes encoding hundreds of proteins. These species give us an unprecedented opportunity to study the emergence and evolution of repeats in protein sequences. On the one hand, as viruses, these species have a restricted set of functions, which can help us better define the functional landscape of repeats. On the other hand, given the particular use of the genetic machinery of the host, it is worth asking whether this allows the variations of genetic material that lead to repeats in non-viral species. To support research in the characterization of repeat protein evolution and function, we present here an analysis focused on the repeat proteins of giant viruses, namely tandem repeats (TRs), short repeats (SRs), and homorepeats (polyX). Proteins with large and short repeats are not very frequent in non-eukaryotic organisms because of the difficulties that their folding may entail; however, their presence in giant viruses highlights their advantage for performance in the protein environment of the eukaryotic host. The heterogeneous content of these TRs, SRs and polyX in some viruses hints at diverse needs. Comparisons to homologs suggest that some of these viruses make extensive use of the mechanisms that generate these repeats, and also have the capacity to adopt genes with repeats. Giant viruses could be very good models for the study of the emergence and evolution of protein repeats.


Subject(s)
Giant Viruses , Viruses , Giant Viruses/genetics , Evolution, Molecular , DNA Viruses/genetics , Proteins/genetics , Viruses/genetics , Eukaryota