ABSTRACT
Single-cell genome analyses of human oocytes are important for meiosis research and preimplantation genomic screening. However, the nonuniformity of single-cell whole-genome amplification has hindered their use. Here, we demonstrate genome analyses of single human oocytes using multiple annealing and looping-based amplification cycle (MALBAC)-based sequencing technology. By sequencing the triads of the first and second polar bodies (PB1 and PB2) and the oocyte pronuclei from the same female egg donors, we phase the genomes of these donors with detected SNPs and determine the crossover maps of their oocytes. Our data exhibit the expected crossover interference and indicate weak chromatid interference. Further, the genome of the oocyte pronucleus, including information regarding aneuploidy and SNPs in disease-associated alleles, can be accurately deduced from the genomes of PB1 and PB2. The MALBAC-based preimplantation genomic screening in in vitro fertilization (IVF) enables accurate and cost-effective selection of normal fertilized eggs for embryo transfer.
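The deduction of the oocyte pronucleus genotype from PB1 and PB2 can be illustrated at a single maternal SNP. The sketch below is not the authors' pipeline; it assumes a euploid meiosis with no genotyping error or allele dropout, so that the four chromatids (two in PB1, one in PB2, one in the oocyte) carry each donor allele exactly twice. The function name `deduce_oocyte_allele` and its arguments are hypothetical.

```python
from collections import Counter


def deduce_oocyte_allele(donor_genotype, pb1_alleles, pb2_allele):
    """Infer the maternal allele left in the oocyte pronucleus at one SNP.

    Assumption (illustrative only): euploid meiosis, error-free calls.
    donor_genotype: the donor's two alleles, e.g. ("A", "G")
    pb1_alleles:    the two alleles observed in PB1, e.g. ("A", "A")
    pb2_allele:     the single allele observed in PB2, e.g. "G"
    """
    if len(donor_genotype) != 2 or len(pb1_alleles) != 2:
        raise ValueError("expected two donor alleles and two PB1 alleles")

    # After DNA replication each donor allele is present on two chromatids.
    pool = Counter(donor_genotype) + Counter(donor_genotype)
    used = Counter(pb1_alleles) + Counter([pb2_allele])

    # Counter subtraction keeps only positive counts, so a non-empty result
    # means the polar bodies carry more copies than the donor could supply.
    if used - pool:
        raise ValueError("polar-body alleles inconsistent with donor genotype")

    remaining = pool - used  # exactly one chromatid is left for the oocyte
    return next(iter(remaining.elements()))


# PB1 homozygous (no crossover between centromere and SNP): oocyte matches PB2.
print(deduce_oocyte_allele(("A", "G"), ("A", "A"), "G"))
# PB1 heterozygous (crossover occurred): oocyte carries the allele PB2 lacks.
print(deduce_oocyte_allele(("A", "G"), ("A", "G"), "A"))
```

Under these assumptions, a heterozygous PB1 also signals a crossover between the centromere and the SNP, which is the kind of information a crossover map is built from.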
Subjects
In Vitro Fertilization, Human Genome, Oocytes/metabolism, DNA Sequence Analysis/methods, Adult, Aneuploidy, Blastocyst/metabolism, Female, Humans, Polar Bodies/metabolism, Single Nucleotide Polymorphism, Single-Cell Analysis, Tissue Donors
ABSTRACT
Predicting protein-DNA binding specificity is a challenging yet essential task for understanding gene regulation. Protein-DNA complexes usually exhibit binding to a selected DNA target site, whereas a protein binds, with varying degrees of binding specificity, to a wide range of DNA sequences. This information is not directly accessible in a single structure. Here, to access this information, we present Deep Predictor of Binding Specificity (DeepPBS), a geometric deep-learning model designed to predict binding specificity from protein-DNA structure. DeepPBS can be applied to experimental or predicted structures. Interpretable protein heavy atom importance scores for interface residues can be extracted. When aggregated at the protein residue level, these scores are validated through mutagenesis experiments. Applied to designed proteins targeting specific DNA sequences, DeepPBS was demonstrated to predict experimentally measured binding specificity. DeepPBS offers a foundation for machine-aided studies that advance our understanding of molecular interactions and guide experimental designs and synthetic biology.
Subjects
DNA-Binding Proteins, DNA, Deep Learning, Protein Binding, DNA/metabolism, DNA/chemistry, DNA-Binding Proteins/metabolism, DNA-Binding Proteins/chemistry, Binding Sites, Computational Biology/methods, Molecular Models
ABSTRACT
Sequence-dependent DNA shape plays an important role in understanding protein-DNA binding mechanisms. High-throughput prediction of DNA shape features has become a valuable tool in the fields of protein-DNA recognition, transcription factor-DNA binding specificity, and gene regulation. However, our widely used webserver, DNAshape, relies on statistically summarized pentamer query tables to look up DNA shape features. These query tables do not consider flanking regions longer than two base pairs, and acquiring a query table for hexamers or higher-order k-mers is still unrealistic due to limitations in achieving sufficient statistical coverage in molecular simulations or structural biology experiments. A recent deep-learning method, Deep DNAshape, trained on limited simulation data, can predict DNA shape features at the core of a DNA fragment while considering flanking regions of up to seven base pairs. However, Deep DNAshape is rather complicated to install and, unlike the pentamer-based DNAshape webserver, must be run locally, creating a barrier for users. Here, we present the Deep DNAshape webserver, which combines the benefits of both methods while being accurate, fast, and accessible to all users. Additional improvements of the webserver include real-time detection of user input, interactive visualization tools, and different modes of analysis. URL: https://deepdnashape.usc.edu.
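The pentamer query-table approach described above can be sketched as a sliding-window lookup: each pentamer maps to one value for its central base pair, so only two flanking base pairs on each side influence the result. This is a minimal illustration, not the DNAshape implementation; the function name `lookup_shape` and the two table values are placeholders invented for the example, not real shape parameters.

```python
from typing import Dict, List, Optional


def lookup_shape(sequence: str, table: Dict[str, float], k: int = 5) -> List[Optional[float]]:
    """Assign a shape value to the central position of every k-mer window.

    Positions closer than k // 2 to either end have no full window and
    stay None, as do windows missing from the table.
    """
    flank = k // 2
    seq = sequence.upper()
    values: List[Optional[float]] = [None] * len(seq)
    for i in range(flank, len(seq) - flank):
        kmer = seq[i - flank:i + flank + 1]
        values[i] = table.get(kmer)
    return values


# Placeholder table (arbitrary numbers, NOT real DNA shape values).
TOY_TABLE = {"ACGTA": 1.0, "CGTAC": 2.0}

print(lookup_shape("ACGTAC", TOY_TABLE))

# A complete table needs one entry per k-mer, so its size grows as 4**k.
print(4 ** 5, 4 ** 6, 4 ** 7)
```

The last line shows why extending the table to longer k-mers quickly becomes demanding: each extra base pair multiplies the number of entries that need statistical coverage by four.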
Subjects
DNA, Internet, Nucleic Acid Conformation, Software, DNA/chemistry, Deep Learning
ABSTRACT
BACKGROUND: Benign paroxysmal positional vertigo (BPPV) is a prevalent form of vertigo whose diagnosis requires a skilled physician to observe the nystagmus and vertigo resulting from specific changes in the patient's position. In this study, we aim to explore the integration of eye movement video and position information for BPPV diagnosis and apply artificial intelligence (AI) methods to improve the accuracy of BPPV diagnosis. METHODS: We collected eye movement video and diagnostic data from 518 patients with BPPV who visited the hospital for examination from January to March 2021 and developed a BPPV dataset. Based on the characteristics of the dataset, we propose a multimodal deep learning diagnostic model, which combines a video understanding model, an autoencoder, and a cross-attention mechanism. RESULT: Evaluation on the test set showed that the average accuracy of the model reached 81.7%, demonstrating the effectiveness of the proposed multimodal deep learning method for BPPV diagnosis. Furthermore, our study highlights the significance of combining head position information and eye movement information in BPPV diagnosis. We also found that postural and eye movement information plays a critical role in the diagnosis of BPPV, as demonstrated by exploring the necessity of postural information for the diagnostic model and the contribution of cross-attention mechanisms to the fusion of postural and oculomotor information. Our results underscore the potential of AI-based methods for improving the accuracy of BPPV diagnosis and the importance of considering both postural and oculomotor information in BPPV diagnosis.
Subjects
Deep Learning, Pathologic Nystagmus, Humans, Benign Paroxysmal Positional Vertigo/diagnosis, Artificial Intelligence, Pathologic Nystagmus/diagnosis, Hospitals
ABSTRACT
Uncovering the mechanisms that affect the binding specificity of transcription factors (TFs) is critical for understanding the principles of gene regulation. Although sequence-based models have been used successfully to predict TF binding specificities, we found that including DNA shape information in these models improved their accuracy and interpretability. Previously, we developed a method for modeling DNA binding specificities based on DNA shape features extracted from Monte Carlo (MC) simulations. Prediction accuracies of our models, however, have not yet been compared to accuracies of models incorporating DNA shape information extracted from X-ray crystallography (XRC) data or Molecular Dynamics (MD) simulations. Here, we integrated DNA shape information extracted from MC or MD simulations and XRC data into predictive models of TF binding and compared their performance. Models that incorporated structural information consistently showed improved performance over sequence-based models regardless of data source. Furthermore, we derived and validated nine additional DNA shape features beyond our original set of four features. The expanded repertoire of 13 distinct DNA shape features, including six intra-base pair and six inter-base pair parameters and minor groove width, is available in our R/Bioconductor package DNAshapeR and enables a comprehensive structural description of the double helix on a genome-wide scale.
Subjects
Algorithms, Computational Biology/methods, DNA/chemistry, Genome-Wide Association Study/methods, Transcription Factors/chemistry, Base Sequence, X-Ray Crystallography, DNA/genetics, DNA/metabolism, Molecular Dynamics Simulation, Monte Carlo Method, Nucleic Acid Conformation, Protein Binding, Reproducibility of Results, Transcription Factors/metabolism
ABSTRACT
Understanding the mechanisms of protein-DNA binding is critical in comprehending gene regulation. Three-dimensional DNA structure, also described as DNA shape, plays a key role in these mechanisms. In this study, we present a deep learning-based method, Deep DNAshape, that fundamentally changes the current k-mer based high-throughput prediction of DNA shape features by accurately accounting for the influence of extended flanking regions, without the need for extensive molecular simulations or structural biology experiments. By using the Deep DNAshape method, DNA structural features can be predicted for any length and number of DNA sequences in a high-throughput manner, providing an understanding of the effects of flanking regions on DNA structure in a target region of a sequence. The Deep DNAshape method provides access to the influence of distant flanking regions on a region of interest. Our findings reveal that DNA shape readout mechanisms of a core target are quantitatively affected by flanking regions, including extended flanking regions, providing valuable insights into the detailed structural readout mechanisms of protein-DNA binding. Furthermore, when incorporated in machine learning models, the features generated by Deep DNAshape improve the model prediction accuracy. Collectively, Deep DNAshape can serve as a versatile and powerful tool for diverse DNA structure-related studies.
Subjects
Deep Learning, Proteins/metabolism, Protein Binding, Machine Learning, DNA/metabolism
ABSTRACT
BACKGROUND: Primary liver cancer (PHC) stands as one of the most prevalent malignant diseases in clinical settings. Studies have indicated that transcatheter arterial chemoembolization (TACE) treatment exhibits superior clinical outcomes, potentially increasing the complete necrosis rate in patients with PHC. A correlation exists between the clinical outcomes of TACE surgery and the process of epithelial-mesenchymal transition (EMT), yet the underlying mechanism remains a mystery. Hence, it is crucial to investigate the impact and mechanism of EMT on hepatocellular carcinoma (HCC). METHODS: Retrospectively, patients with advanced liver cancer who underwent TACE were selected and categorized into two groups based on the assessment of clinical efficacy: the effective group and the ineffective group. The expression levels of nuclear factor-kappa B (NF-κB), matrix metalloproteinase 9 (MMP9), Ki-67, B-cell lymphoma-2 (Bcl-2), Bcl-2-associated X (Bax), Vimentin, E-cadherin, and N-cadherin in tumor tissues were evaluated using reverse transcription-polymerase chain reaction (RT-PCR). In vitro, Huh7 cells were cultured, and lentivirus infections were utilized to inhibit the overexpression of NF-κB and MMP9. The determination of EMT and cell viability was conducted through Cell Counting Kit-8 (CCK-8) assays, RT-PCR, and Western blot. RESULTS: Sixty patients diagnosed with advanced liver cancer were selected for the study. Based on their clinical outcomes, 30 patients with advanced hepatocellular carcinoma were categorized into the effective group, while the remaining 30 patients were categorized into the ineffective group. The results of the Western blot analysis indicated that, in comparison to the effective group, the expression levels of NF-κB, MMP9, Ki-67, Bcl-2, Vimentin, and N-cadherin were significantly higher in the tumor tissues of the ineffective group. Conversely, the expression of Bax and E-cadherin was notably lower in the effective group. 
Following the individual knockdown of NF-κB and MMP9, the cell experiments revealed a remarkable decrease in the expression levels of Ki-67, Bcl-2, Vimentin, and N-cadherin, whereas the expression of Bax and E-cadherin showed significant elevation (p < 0.05). Furthermore, there was a significant increase in cell viability and a decrease in cell apoptosis after the knockdown of NF-κB and MMP9. CONCLUSIONS: The NF-κB/MMP9 signaling axis serves as a pivotal regulator that fosters proliferation and impedes apoptosis in Huh7 cells by modulating the process of EMT.
Subjects
Hepatocellular Carcinoma, Epithelial-Mesenchymal Transition, Liver Neoplasms, Signal Transduction, Aged, Female, Humans, Male, Middle Aged, Hepatocellular Carcinoma/pathology, Hepatocellular Carcinoma/metabolism, Hepatocellular Carcinoma/therapy, Hepatocellular Carcinoma/genetics, Tumor Cell Line, Cell Proliferation, Disease Progression, Neoplastic Gene Expression Regulation, Liver Neoplasms/pathology, Liver Neoplasms/metabolism, Liver Neoplasms/genetics, Liver Neoplasms/therapy, Matrix Metalloproteinase 9/metabolism, NF-kappa B/metabolism, Retrospective Studies
ABSTRACT
Understanding the mechanisms of protein-DNA binding is critical in comprehending gene regulation. Three-dimensional DNA shape plays a key role in these mechanisms. In this study, we present a deep learning-based method, Deep DNAshape, that fundamentally changes the current k-mer based high-throughput prediction of DNA shape features by accurately accounting for the influence of extended flanking regions, without the need for extensive molecular simulations or structural biology experiments. By using the Deep DNAshape method, refined DNA shape features can be predicted for any length and number of DNA sequences in a high-throughput manner, providing a deeper understanding of the effects of flanking regions on DNA shape in a target region of a sequence. The Deep DNAshape method provides access to the influence of distant flanking regions on a region of interest. Our findings reveal that DNA shape readout mechanisms of a core target are quantitatively affected by flanking regions, including extended flanking regions, providing valuable insights into the detailed structural readout mechanisms of protein-DNA binding. Furthermore, when incorporated in machine learning models, the features generated by Deep DNAshape improve the model prediction accuracy. Collectively, Deep DNAshape can serve as a versatile and powerful tool for diverse DNA structure-related studies.
ABSTRACT
Predicting specificity in protein-DNA interactions is a challenging yet essential task for understanding gene regulation. Here, we present Deep Predictor of Binding Specificity (DeepPBS), a geometric deep-learning model designed to predict binding specificity across protein families based on protein-DNA structures. The DeepPBS architecture allows investigation of different family-specific recognition patterns. DeepPBS can be applied to predicted structures and can aid in the modeling of protein-DNA complexes. DeepPBS is interpretable and can be used to calculate protein heavy atom-level importance scores, as demonstrated in a case study on the p53-DNA interface. When aggregated at the protein residue level, these scores conform well with alanine scanning mutagenesis experimental data. The inference time for DeepPBS is sufficiently fast for analyzing simulation trajectories, as demonstrated on a molecular-dynamics simulation of a Drosophila Hox-DNA tertiary complex with its cofactor. DeepPBS and its corresponding data resources offer a foundation for machine-aided protein-DNA interaction studies, guiding experimental choices and complex design, as well as advancing our understanding of molecular interactions.
ABSTRACT
The Origin Recognition Complex (ORC) is an evolutionarily conserved six-subunit protein complex that binds specific sites at many locations to coordinately replicate the entire eukaryote genome. Though highly conserved in structure, ORC's selectivity for replication origins has diverged tremendously between yeasts and humans to adapt to vastly different life cycles. In this work, we demonstrate that the selectivity determinant of ORC for DNA binding lies in a 19-amino acid insertion helix in the Orc4 subunit, which is present in yeast but absent in human. Removal of this motif from Orc4 transforms the yeast ORC, which selects origins based on base-specific binding at defined locations, into one whose selectivity is dictated by chromatin landscape and afforded with plasticity, as reported for human. Notably, the altered yeast ORC has acquired an affinity for regions near transcriptional start sites (TSSs), which the human ORC also favors.
Subjects
Origin Recognition Complex/metabolism, Saccharomyces cerevisiae/metabolism, Amino Acid Sequence, Base Sequence, Binding Sites, Fungal DNA/metabolism, G2 Phase/genetics, Fungal Genome, Humans, Genetic Models, Mutation/genetics, Nucleosomes/metabolism, Nucleotide Motifs/genetics, Origin Recognition Complex/chemistry, S Phase, Saccharomyces cerevisiae/genetics, Saccharomyces cerevisiae Proteins/metabolism, Stochastic Processes, Transcription Initiation Site
ABSTRACT
Meiotic recombination creates genetic diversity and ensures segregation of homologous chromosomes. Previous population analyses yielded results averaged among individuals and affected by evolutionary pressures. We sequenced 99 sperm from an Asian male by using the newly developed amplification method, multiple annealing and looping-based amplification cycles (MALBAC), to phase the personal genome and map recombination events at high resolution. These events are nonuniformly distributed across the genome in the absence of selection pressure. The paucity of recombination near transcription start sites observed in individual sperm indicates that such a phenomenon is intrinsic to the molecular mechanism of meiosis. Interestingly, a decreased crossover frequency combined with an increase of autosomal aneuploidy is observable on a global per-sperm basis.
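Once the donor genome is phased, mapping recombination events in a single sperm amounts to finding where the sperm's alleles switch from one parental haplotype to the other. The sketch below illustrates only that idea and is not the authors' method: the function name `find_crossovers`, the labels "H1"/"H2", and the input layout are assumptions, and it applies no filtering for genotyping errors, which a real analysis would need.

```python
from typing import List, Optional, Sequence, Tuple


def find_crossovers(
    positions: Sequence[int],
    origins: Sequence[Optional[str]],
) -> List[Tuple[int, int]]:
    """Locate haplotype switches along one chromosome of one sperm cell.

    positions: sorted coordinates of heterozygous SNPs
    origins:   per SNP, the parental haplotype the sperm allele matches
               ("H1" or "H2"), or None where no call was made
    Returns the (left SNP, right SNP) interval bracketing each switch.
    """
    crossovers: List[Tuple[int, int]] = []
    last_pos: Optional[int] = None
    last_origin: Optional[str] = None
    for pos, origin in zip(positions, origins):
        if origin is None:
            continue  # uncalled SNPs carry no phase information
        if last_origin is not None and origin != last_origin:
            crossovers.append((last_pos, pos))
        last_pos, last_origin = pos, origin
    return crossovers


snp_positions = [100, 200, 300, 400, 500]
snp_origins = ["H1", "H1", None, "H2", "H2"]
print(find_crossovers(snp_positions, snp_origins))
```

The width of each returned interval is the resolution of the crossover call, which is set by the spacing of informative heterozygous SNPs.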