1.
bioRxiv ; 2024 Jun 25.
Article in English | MEDLINE | ID: mdl-38979156

ABSTRACT

Cellular senescence, a stress-induced stable proliferation arrest associated with an inflammatory Senescence-Associated Secretory Phenotype (SASP), is a cause of aging. In senescent cells, Cytoplasmic Chromatin Fragments (CCFs) activate SASP via the anti-viral cGAS/STING pathway. PML protein organizes PML nuclear bodies (NBs), also involved in senescence and anti-viral immunity. The HIRA histone H3.3 chaperone localizes to PML NBs in senescent cells. Here, we show that HIRA and PML are essential for SASP expression, tightly linked to HIRA's localization to PML NBs. Inactivation of HIRA does not directly block expression of NF-κB target genes. Instead, an H3.3-independent HIRA function activates SASP through a CCF-cGAS-STING-TBK1-NF-κB pathway. HIRA physically interacts with p62/SQSTM1, an autophagy regulator and negative SASP regulator. HIRA and p62 co-localize in PML NBs, linked to their antagonistic regulation of SASP, with PML NBs controlling their spatial configuration. These results outline a role for HIRA and PML in regulation of SASP.

2.
Genome Biol ; 25(1): 118, 2024 05 13.
Article in English | MEDLINE | ID: mdl-38741205

ABSTRACT

The precision-recall curve (PRC) and the area under the precision-recall curve (AUPRC) are useful for quantifying classification performance. They are commonly used in situations with imbalanced classes, such as cancer diagnosis and cell type annotation. We evaluate 10 popular tools for plotting PRC and computing AUPRC, which were collectively used in more than 3000 published studies. We find the AUPRC values computed by the tools rank classifiers differently and some tools produce overly-optimistic results.


Subject(s)
Software, Humans, Area Under Curve, Computational Biology/methods
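
The abstract above notes that different tools can report different AUPRC values for the same classifier. The following is a minimal sketch of one way this can happen, assuming two common estimators: step-wise average precision and trapezoidal (linear) interpolation between precision-recall points. The function names and the toy data are hypothetical and are not taken from the paper or from any of the tools it evaluates.

```python
def pr_points(labels, scores):
    """Return (recall, precision) pairs, one per distinct score threshold."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    total_pos = sum(labels)
    points = []
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume all items tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((tp / total_pos, tp / (tp + fp)))
    return points


def average_precision(points):
    """Step-wise estimate: each recall increment is weighted by the precision reached."""
    area, prev_recall = 0.0, 0.0
    for recall, precision in points:
        area += (recall - prev_recall) * precision
        prev_recall = recall
    return area


def trapezoid_auprc(points):
    """Linear interpolation between points, anchored at recall 0 with the first precision."""
    area = 0.0
    prev_recall, prev_precision = 0.0, points[0][1]
    for recall, precision in points:
        area += (recall - prev_recall) * (precision + prev_precision) / 2
        prev_recall, prev_precision = recall, precision
    return area


# Toy data (hypothetical): 3 positives and 3 negatives.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
points = pr_points(labels, scores)
print(average_precision(points), trapezoid_auprc(points))
```

On this toy input the two estimators disagree, which is the kind of discrepancy that can change how classifiers are ranked when results from different tools are compared.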
3.
Sci Rep ; 14(1): 5583, 2024 03 07.
Article in English | MEDLINE | ID: mdl-38448490

ABSTRACT

In this report, we present OLAF-Seq, a novel strategy to construct a long-read sequencing library such that adjacent fragments are linked with end-terminal duplications. We use the CRISPR-Cas9 nickase enzyme and a pool of multiple sgRNAs to perform non-random fragmentation of targeted long DNA molecules (>300 kb) into smaller library-sized fragments (about 20 kbp) in a way that retains physical linkage information (up to 1000 bp) between adjacent fragments. DNA molecules targeted for fragmentation are preferentially ligated with adaptors for sequencing, so this method can enrich targeted regions while taking advantage of the long-read sequencing platforms. This enables the sequencing of target regions with significantly lower total coverage, and the genome sequence within linker regions provides information for assembly and phasing. We demonstrated the validity and efficacy of the method first using phage and then by sequencing a panel of 100 full-length cancer-related genes (including both exons and introns) in the human genome. When the designed linkers contained heterozygous genetic variants, long haplotypes could be established. This sequencing strategy can be readily applied in both PacBio and Oxford Nanopore platforms for both long and short genes with an easy protocol. This economically viable approach is useful for targeted enrichment of hundreds of target genomic regions and where long no-gap contigs need deep sequencing.


Subject(s)
Bacteriophages, CRISPR-Cas Systems Guide RNA, Humans, DNA Sequence Analysis, Genomics, CRISPR-Associated Protein 9, DNA/genetics
4.
bioRxiv ; 2024 Feb 07.
Article in English | MEDLINE | ID: mdl-38370825

ABSTRACT

The precision-recall curve (PRC) and the area under it (AUPRC) are useful for quantifying classification performance. They are commonly used in situations with imbalanced classes, such as cancer diagnosis and cell type annotation. We evaluated 10 popular tools for plotting PRC and computing AUPRC, which were collectively used in >3,000 published studies. We found the AUPRC values computed by the tools rank classifiers differently and some tools produce overly-optimistic results.

5.
Diabetes Care ; 46(6): 1271-1281, 2023 06 01.
Article in English | MEDLINE | ID: mdl-37125963

ABSTRACT

OBJECTIVE: In this study we aim to unravel genetic determinants of coronary heart disease (CHD) in type 2 diabetes (T2D) and explore their applications. RESEARCH DESIGN AND METHODS: We performed a two-stage genome-wide association study for CHD in Chinese patients with T2D (3,596 case and 8,898 control subjects), followed by replications in European patients with T2D (764 case and 4,276 control subjects) and general populations (n = 51,442-547,261). Each identified variant was examined for its association with a wide range of phenotypes and its interactions with glycemic, blood pressure (BP), and lipid controls in incident cardiovascular diseases. RESULTS: We identified a novel variant (rs10171703) for CHD (odds ratio 1.21 [95% CI 1.13-1.30]; P = 2.4 × 10⁻⁸) and BP (β ± SE 0.130 ± 0.017; P = 4.1 × 10⁻¹⁴) at PDE1A in Chinese T2D patients but found only a modest association with CHD in general populations. This variant modulated the effects of BP goal attainment (130/80 mmHg) on CHD (P_interaction = 0.0155) and myocardial infarction (MI) (P_interaction = 5.1 × 10⁻⁴). Patients with CC genotype of rs10171703 had >40% reduction in either cardiovascular event in response to BP control (2.9 × 10⁻⁸ < P < 3.6 × 10⁻⁵), those with CT genotype had no difference (0.0726 < P < 0.2614), and those with TT genotype had a threefold increase in MI risk (P = 6.7 × 10⁻³). CONCLUSIONS: We discovered a novel CHD- and BP-related variant at PDE1A that interacted with BP goal attainment with divergent effects on CHD risk in Chinese patients with T2D. Incorporating this information may facilitate individualized treatment strategies for precision care in diabetes, only when our findings are validated.


Subject(s)
Coronary Disease, Cyclic Nucleotide Phosphodiesterases Type 1, Type 2 Diabetes Mellitus, Myocardial Infarction, Humans, Coronary Disease/genetics, Type 2 Diabetes Mellitus/complications, East Asian People, Genome-Wide Association Study, Goals, Myocardial Infarction/complications, Myocardial Infarction/genetics, Single Nucleotide Polymorphism, Risk Assessment, Risk Factors, Cyclic Nucleotide Phosphodiesterases Type 1/genetics
6.
Nat Commun ; 14(1): 2543, 2023 05 15.
Article in English | MEDLINE | ID: mdl-37188670

ABSTRACT

Epigenetic markers are potential biomarkers for diabetes and related complications. Using a prospective cohort from the Hong Kong Diabetes Register, we perform two independent epigenome-wide association studies to identify methylation markers associated with baseline estimated glomerular filtration rate (eGFR) and subsequent decline in kidney function (eGFR slope), respectively, in 1,271 type 2 diabetes subjects. Here we show 40 (30 previously unidentified) and eight (all previously unidentified) CpG sites individually reach epigenome-wide significance for baseline eGFR and eGFR slope, respectively. We also develop a multisite analysis method, which selects 64 and 37 CpG sites for baseline eGFR and eGFR slope, respectively. These models are validated in an independent cohort of Native Americans with type 2 diabetes. Our identified CpG sites are near genes enriched for functional roles in kidney diseases, and some show association with renal damage. This study highlights the potential of methylation markers in risk stratification of kidney disease among type 2 diabetes individuals.


Subject(s)
Type 2 Diabetes Mellitus, Diabetic Nephropathies, Chronic Renal Insufficiency, Humans, Diabetic Nephropathies/genetics, Diabetic Nephropathies/metabolism, Type 2 Diabetes Mellitus/complications, Type 2 Diabetes Mellitus/genetics, Type 2 Diabetes Mellitus/metabolism, Prospective Studies, DNA Methylation/genetics, Disease Progression, Kidney/metabolism, Genetic Markers, Chronic Renal Insufficiency/genetics
7.
Genome Biol ; 24(1): 79, 2023 04 18.
Article in English | MEDLINE | ID: mdl-37072822

ABSTRACT

A promising alternative to comprehensively performing genomics experiments is to, instead, perform a subset of experiments and use computational methods to impute the remainder. However, identifying the best imputation methods and what measures meaningfully evaluate performance are open questions. We address these questions by comprehensively analyzing 23 methods from the ENCODE Imputation Challenge. We find that imputation evaluations are challenging and confounded by distributional shifts from differences in data collection and processing over time, the amount of available data, and redundancy among performance measures. Our analyses suggest simple steps for overcoming these issues and promising directions for more robust research.


Subject(s)
Algorithms, Epigenomics, Genomics/methods
8.
NAR Cancer ; 5(1): zcad012, 2023 Mar.
Article in English | MEDLINE | ID: mdl-36879684

ABSTRACT

Immune checkpoint inhibitors (ICIs) have led to durable responses in cancer patients, yet their efficacy varies significantly across cancer types and patients. To stratify patients based on their potential clinical benefits, there have been substantial research efforts in identifying biomarkers and computational models that can predict the efficacy of ICIs, and it has become difficult to keep track of all of them. It is also difficult to compare findings of different studies since they involve different cancer types, ICIs, and various other details. To make it easy to access the latest information about ICI efficacy, we have developed a knowledgebase and a corresponding web-based portal (https://iciefficacy.org/). Our knowledgebase systematically records information about latest publications related to ICI efficacy, predictors proposed, and datasets used to test them. All information recorded is checked carefully by a manual curation process. The web-based portal provides functions to browse, search, filter, and sort the information. Digests of method details are provided based on the original descriptions in the publications. Evaluation results of the effectiveness of the predictors reported in the publications are summarized for quick overviews. Overall, our resource provides centralized access to the burst of information produced by the vibrant research on ICI efficacy.

9.
Proc Natl Acad Sci U S A ; 120(1): e2208623119, 2023 01 03.
Article in English | MEDLINE | ID: mdl-36584300

ABSTRACT

Haploinsufficiency for SOX9, the master chondrogenesis transcription factor, can underlie campomelic dysplasia (CD), an autosomal dominant skeletal malformation syndrome, because heterozygous Sox9 null mice recapitulate the bent limb (campomelia) and some other phenotypes associated with CD. However, in vitro cell assays suggest haploinsufficiency may not apply for certain mutations, notably those that truncate the protein, but in these cases in vivo evidence is lacking and underlying mechanisms are unknown. Here, using conditional mouse mutants, we compared the impact of a heterozygous Sox9 null mutation (Sox9+/-) with the Sox9+/Y440X CD mutation that truncates the C-terminal transactivation domain but spares the DNA-binding domain. While some Sox9+/Y440X mice survived, all Sox9+/- mice died perinatally. However, the skeletal defects were more severe and IHH signaling in developing limb cartilage was significantly enhanced in Sox9+/Y440X compared with Sox9+/-. Activating Sox9Y440X specifically in the chondrocyte-osteoblast lineage caused milder campomelia, and revealed cell- and noncell autonomous mechanisms acting on chondrocyte differentiation and osteogenesis in the perichondrium. Transcriptome analyses of developing Sox9+/Y440X limbs revealed dysregulated expression of genes for the extracellular matrix, as well as changes consistent with aberrant WNT and HH signaling. SOX9Y440X failed to interact with β-catenin and was unable to suppress transactivation of Ihh in cell-based assays. We propose enhanced HH signaling in the adjacent perichondrium induces asymmetrically localized excessive perichondrial osteogenesis resulting in campomelia. Our study implicates combined haploinsufficiency/hypomorphic and dominant-negative actions of SOX9Y440X, cell-autonomous and noncell autonomous mechanisms, and dysregulated WNT and HH signaling, as the cause of human campomelia.


Subject(s)
Hedgehogs, Wnt Signaling Pathway, Humans, Mice, Animals, Hedgehogs/metabolism, Gene Expression Regulation, SOX9 Transcription Factor/genetics, SOX9 Transcription Factor/metabolism, Cell Differentiation/genetics, Proteins/metabolism, Chondrocytes/metabolism
10.
Brief Bioinform ; 25(1), 2023 11 22.
Article in English | MEDLINE | ID: mdl-38233091

ABSTRACT

Structural variations (SVs) are commonly found in cancer genomes. They can cause gene amplification, deletion and fusion, among other functional consequences. With an average read length of hundreds of kilobases, nano-channel-based optical DNA mapping is powerful in detecting large SVs. However, existing SV calling methods are not tailored for cancer samples, which have special properties such as mixed cell types and sub-clones. Here we propose the Cancer Optical Mapping for detecting Structural Variations (COMSV) method that is specifically designed for cancer samples. It shows high sensitivity and specificity in benchmark comparisons. Applying to cancer cell lines and patient samples, COMSV identifies hundreds of novel SVs per sample.


Subject(s)
Human Genome, Neoplasms, Humans, DNA Sequence Analysis/methods, High-Throughput Nucleotide Sequencing/methods, Neoplasms/genetics
11.
Sci Rep ; 12(1): 20423, 2022 11 28.
Article in English | MEDLINE | ID: mdl-36443333

ABSTRACT

Common variants in RET and NRG1 have been associated with Hirschsprung disease (HSCR), a congenital disorder characterised by incomplete innervation of distal gut, in East Asian (EA) populations. However, the allelic effects so far identified do not fully explain its heritability, suggesting the presence of epistasis, where effect of one genetic variant differs depending on other (modifier) variants. Few instances of epistasis have been documented in complex diseases due to modelling complexity and data challenges. We proposed four epistasis models to comprehensively capture epistasis for HSCR between and within RET and NRG1 loci using whole genome sequencing (WGS) data in EA samples. 65 variants within the Topologically Associating Domain (TAD) of RET demonstrated significant epistasis with the lead enhancer variant (RET+3; rs2435357). These epistatic variants formed two linkage disequilibrium (LD) clusters represented by rs2506026 and rs2506028 that differed in minor allele frequency and the best-supported epistatic model. Intriguingly, rs2506028 is in high LD with one cis-regulatory variant (rs2506030) highlighted previously, suggesting that detected epistasis might be mediated through synergistic effects on transcription regulation of RET. Our findings demonstrated the advantages of WGS data for detecting epistasis, and support the presence of interactive effects of regulatory variants in RET for HSCR.


Subject(s)
Hirschsprung Disease, Humans, Hirschsprung Disease/genetics, Genetic Epistasis, Whole Genome Sequencing, Alleles, Asian People, Proto-Oncogene Proteins c-ret/genetics
13.
BMC Genomics ; 23(1): 422, 2022 Jun 06.
Article in English | MEDLINE | ID: mdl-35668367

ABSTRACT

BACKGROUND: After an infection, human cells may contain viral genomes in the form of episomes or integrated DNA. Comparing the genomic sequences of different strains of a virus in human cells can often provide useful insights into its behaviour, activity and pathology, and may help develop methods for disease prevention and treatment. To support such comparative analyses, the viral genomes need to be accurately reconstructed from a large number of samples. Previous efforts either rely on customized experimental protocols or require high similarity between the sequenced genomes and a reference, both of which limit the general applicability of these approaches. In this study, we propose a pipeline, named ASPIRE, for reconstructing viral genomes accurately from short reads data of human samples, which are increasingly available from genome projects and personal genomics. ASPIRE contains a basic part that involves de novo assembly, tiling and gap filling, and additional components for iterative refinement, sequence corrections and wrapping. RESULTS: Evaluated by the alignment quality of sequencing reads to the reconstructed genomes, these additional components improve the assembly quality in general, and in some particular samples quite substantially, especially when the sequenced genome is significantly different from the reference. We use ASPIRE to reconstruct the genomes of Epstein Barr Virus (EBV) from the whole-genome sequencing data of 61 nasopharyngeal carcinoma (NPC) samples and provide these sequences as a resource for EBV research. CONCLUSIONS: ASPIRE improves the quality of the reconstructed EBV genomes in published studies and outperforms TRACESPipe in some samples considered.


Subject(s)
Epstein-Barr Virus Infections, Human Herpesvirus 4, Epstein-Barr Virus Infections/genetics, Viral Genome, Genomics/methods, Human Herpesvirus 4/genetics, Humans, Phylogeny, DNA Sequence Analysis/methods
14.
Bioinformatics ; 38(10): 2683-2691, 2022 05 13.
Article in English | MEDLINE | ID: mdl-35561158

ABSTRACT

MOTIVATION: Recombination is one of the essential genetic processes for sexually reproducing organisms, which can happen more frequently in some regions, called recombination hotspots. Although several factors, such as PRDM9 binding motifs, are known to be related to the hotspots, their contributions to the recombination hotspots have not been quantified, and other determinants are yet to be elucidated. Here, we propose a computational method, RHSNet, based on deep learning and signal processing, to identify and quantify the hotspot determinants in a purely data-driven manner, utilizing datasets from various studies, populations, sexes and species. RESULTS: RHSNet can significantly outperform other sequence-based methods on multiple datasets across different species, sexes and studies. In addition to being able to identify hotspot regions and the well-known determinants accurately, more importantly, RHSNet can quantify the determinants that contribute significantly to the recombination hotspot formation in the relation between PRDM9 binding motif, histone modification and GC content. Further cross-sex, cross-population and cross-species studies suggest that the proposed method has the generalization power and potential to identify and quantify the evolutionary determinant motifs. AVAILABILITY AND IMPLEMENTATION: https://github.com/frankchen121212/RHSNet. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Deep Learning, Genetic Recombination, Meiosis
15.
Cancer Lett ; 525: 115-130, 2022 01 28.
Article in English | MEDLINE | ID: mdl-34736960

ABSTRACT

Hepatocellular carcinoma (HCC) is a major cancer burden worldwide with increasing incidence in many developed countries. Super-enhancers (SEs) drive gene expressions required for cell type-specificity and tumor cell identity. However, their roles in HCC remain unclear because of data scarcity from primary tumors. Herein, chromatin profiling of non-alcoholic fatty liver disease (NAFLD)-associated HCCs and matched liver tissues uncovered an average of ∼500 somatically-acquired SEs per patient. The identified SE-target genes were functionally enriched for aberrant metabolism and cancer phenotypes, especially chromatin regulators including deacetylases and Polycomb repressive complexes. Notably, all examined tumors exhibited SE activation of Sirtuin 7 (SIRT7), genome-wide promoter H3K18 deacetylation and concurrent H3K27me3, as well as tumor-suppressor gene silencing. Depletion of SIRT7 SE in hepatoma cells induced global H3K18 acetylation and reactivated key metabolic and immune regulators, leading to marked suppression of tumorigenicity in vitro and in vivo. In concordance, SIRT7 physically interacted with the methyltransferase EZH2, and they were co-expressed in primary HCCs. In summary, our integrative analysis establishes a compendium of SEs in NAFLD-associated HCCs and uncovers SIRT7-driven chromatin regulatory network as potential druggable vulnerability of this increasingly prevalent cancer.


Subject(s)
Hepatocellular Carcinoma/genetics, Genetic Enhancer Elements/genetics, Liver Neoplasms/genetics, Sirtuins/genetics, Carcinogenesis/genetics, Hepatocellular Carcinoma/pathology, Cellular Reprogramming/genetics, Epigenomics, Female, Gene Silencing, Humans, Liver Neoplasms/pathology, Male, Sirtuins/antagonists & inhibitors
16.
Genome Res ; 31(12): 2340-2353, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34663689

ABSTRACT

Circular RNAs (circRNAs) are abundantly expressed in cancer. Their resistance to exonucleases enables them to have potentially stable interactions with different types of biomolecules. Alternative splicing can create different circRNA isoforms that have different sequences and unequal interaction potentials. The study of circRNA function thus requires knowledge of complete circRNA sequences. Here we describe psirc, a method that can identify full-length circRNA isoforms and quantify their expression levels from RNA sequencing data. We confirm the effectiveness and computational efficiency of psirc using both simulated and actual experimental data. Applying psirc on transcriptome profiles from nasopharyngeal carcinoma and normal nasopharynx samples, we discover and validate circRNA isoforms differentially expressed between the two groups. Compared with the assumed circular isoforms derived from linear transcript annotations, some of the alternatively spliced circular isoforms have 100 times higher expression and contain substantially fewer microRNA response elements, showing the importance of quantifying full-length circRNA isoforms.

17.
Nat Commun ; 12(1): 4193, 2021 07 07.
Article in English | MEDLINE | ID: mdl-34234122

ABSTRACT

Interplay between EBV infection and acquired genetic alterations during nasopharyngeal carcinoma (NPC) development remains vague. Here we report a comprehensive genomic analysis of 70 NPCs, combining whole-genome sequencing (WGS) of microdissected tumor cells with EBV oncogene expression to reveal multiple aspects of cellular-viral co-operation in tumorigenesis. Genomic aberrations along with EBV-encoded LMP1 expression underpin constitutive NF-κB activation in 90% of NPCs. A similar spectrum of somatic aberrations and viral gene expression undermine innate immunity in 79% of cases and adaptive immunity in 47% of cases; mechanisms by which NPC may evade immune surveillance despite its pro-inflammatory phenotype. Additionally, genomic changes impairing TGFBR2 promote oncogenesis and stabilize EBV infection in tumor cells. Fine-mapping of CDKN2A/CDKN2B deletion breakpoints reveals homozygous MTAP deletions in 32-34% of NPCs that confer marked sensitivity to MAT2A inhibition. Our work concludes that NPC is a homogeneously NF-κB-driven and immune-protected, yet potentially druggable, cancer.


Subject(s)
Epstein-Barr Virus Infections/immunology, Human Herpesvirus 4/genetics, Nasopharyngeal Carcinoma/immunology, Nasopharyngeal Neoplasms/immunology, Tumor Escape/genetics, Animals, Antineoplastic Agents/pharmacology, Antineoplastic Agents/therapeutic use, Carcinogenesis/drug effects, Carcinogenesis/genetics, Carcinogenesis/immunology, Tumor Cell Line, Cyclin-Dependent Kinase Inhibitor p15/genetics, Cyclin-Dependent Kinase Inhibitor p16/genetics, Epstein-Barr Virus Infections/genetics, Epstein-Barr Virus Infections/therapy, Epstein-Barr Virus Infections/virology, Female, Viral Gene Expression Regulation/immunology, Human Herpesvirus 4/immunology, Human Herpesvirus 4/pathogenicity, Host-Pathogen Interactions/genetics, Host-Pathogen Interactions/immunology, Humans, Methionine Adenosyltransferase/antagonists & inhibitors, Methionine Adenosyltransferase/metabolism, Mice, NF-kappa B/metabolism, Nasopharyngeal Carcinoma/genetics, Nasopharyngeal Carcinoma/therapy, Nasopharyngeal Carcinoma/virology, Nasopharyngeal Neoplasms/genetics, Nasopharyngeal Neoplasms/therapy, Nasopharyngeal Neoplasms/virology, Nasopharynx/immunology, Nasopharynx/pathology, Nasopharynx/surgery, Nasopharynx/virology, Transforming Growth Factor-beta Type II Receptor/genetics, Transforming Growth Factor-beta Type II Receptor/metabolism, Sequence Deletion, Signal Transduction/drug effects, Signal Transduction/genetics, Signal Transduction/immunology, Tumor Escape/drug effects, Whole Genome Sequencing, Xenograft Model Antitumor Assays
18.
Sci Rep ; 11(1): 14392, 2021 07 13.
Article in English | MEDLINE | ID: mdl-34257379

ABSTRACT

Epstein-Barr virus (EBV) has been recently found to generate novel circular RNAs (circRNAs) through backsplicing. However, comprehensive catalogs of EBV circRNAs in other cell lines and their functional characterization are still lacking. In this study, we have identified a list of putative EBV circRNAs in GM12878, an EBV-transformed lymphoblastoid cell line, with a significant majority encoded from the EBV latent genes. A novel EBV circRNA derived from the exon 5 of LMP-2 gene which exhibited highest prevalence, was further validated using RNase R assay and Sanger sequencing. This circRNA, which we term circLMP-2_e5, can be universally detected in a panel of EBV-positive cell lines modelling different latency programs. It ranges from lower expression in nasopharyngeal carcinoma (NPC) cells to higher expression in B cells, and is localized to both the cytoplasm and the nucleus. We provide evidence that circLMP-2_e5 is expressed concomitantly with its cognate linear LMP-2 RNA upon EBV lytic reactivation, and may be produced as a result of exon skipping, with its circularization possibly occurring without the involvement of cis elements in the short flanking introns. Furthermore, we show that circLMP-2_e5 is not involved in regulating cell proliferation, host innate immune response, its linear parental transcripts, or EBV lytic reactivation. Taken together, our study expands the current repertoire of putative EBV circRNAs, broadens our understanding of the biology of EBV circRNAs, and lays the foundation for further investigation of their function in the EBV life cycle and disease development.


Subject(s)
Human Herpesvirus 4, Circular RNA, Cell Line, Humans
19.
Commun Biol ; 4(1): 83, 2021 01 19.
Article in English | MEDLINE | ID: mdl-33469163

ABSTRACT

Whole genome duplication (WGD) has occurred in relatively few sexually reproducing invertebrates. Consequently, the WGD that occurred in the common ancestor of horseshoe crabs ~135 million years ago provides a rare opportunity to decipher the evolutionary consequences of a duplicated invertebrate genome. Here, we present a high-quality genome assembly for the mangrove horseshoe crab Carcinoscorpius rotundicauda (1.7 Gb, N50 = 90.2 Mb, with 89.8% sequences anchored to 16 pseudomolecules, 2n = 32), and a resequenced genome of the tri-spine horseshoe crab Tachypleus tridentatus (1.7 Gb, N50 = 109.7 Mb). Analyses of gene families, microRNAs, and synteny show that horseshoe crabs have undergone three rounds (3R) of WGD. Comparison of C. rotundicauda and T. tridentatus genomes from populations from several geographic locations further elucidates the diverse fates of both coding and noncoding genes. Together, the present study represents a cornerstone for improving our understanding of invertebrate WGD events on the evolutionary fates of genes and microRNAs, at both the individual and population level. We also provide improved genomic resources for horseshoe crabs, of applied value for breeding programs and conservation of this fascinating and unusual invertebrate lineage.


Subject(s)
Gene Duplication/genetics, Horseshoe Crabs/genetics, MicroRNAs/genetics, Animals, Molecular Evolution, Genome/genetics, Genomics, Phylogeny
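
The assembly statistics quoted in the abstract above include N50 values. As a reminder of what that metric measures, here is a minimal sketch, assuming the usual definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The function name and the toy lengths are hypothetical, not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("n50 requires at least one length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum covers half the assembly is the N50.
        if running >= half_total:
            return length


# Toy example (hypothetical lengths, arbitrary units).
print(n50([2, 3, 4, 5, 6]))
```

A higher N50 for the same total length indicates that more of the assembly sits in long, contiguous sequences.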
20.
Bioinformatics ; 36(Suppl_2): i625-i633, 2020 12 30.
Article in English | MEDLINE | ID: mdl-33381843

ABSTRACT

MOTIVATION: In de novo sequence assembly, a standard pre-processing step is k-mer counting, which computes the number of occurrences of every length-k sub-sequence in the sequencing reads. Sequencing errors can produce many k-mers that do not appear in the genome, leading to the need for an excessive amount of memory during counting. This issue is particularly serious when the genome to be assembled is large, the sequencing depth is high, or when the memory available is limited. RESULTS: Here, we propose a fast near-exact k-mer counting method, CQF-deNoise, which has a module for dynamically removing noisy false k-mers. It automatically determines the suitable time and number of rounds of noise removal according to a user-specified wrong removal rate. We tested CQF-deNoise comprehensively using data generated from a diverse set of genomes with various data properties, and found that the memory consumed was almost constant regardless of the sequencing errors while the noise removal procedure had minimal effects on counting accuracy. Compared with four state-of-the-art k-mer counting methods, CQF-deNoise consistently performed the best in terms of memory usage, consuming 49-76% less memory than the second best method. When counting the k-mers from a human dataset with around 60× coverage, the peak memory usage of CQF-deNoise was only 10.9 GB (gigabytes) for k = 28 and 21.5 GB for k = 55. De novo assembly of 106× human sequencing data using CQF-deNoise for k-mer counting required only 2.7 h and 90 GB peak memory. AVAILABILITY AND IMPLEMENTATION: The source codes of CQF-deNoise and SH-assembly are available at https://github.com/Christina-hshi/CQF-deNoise.git and https://github.com/Christina-hshi/SH-assembly.git, respectively, both under the BSD 3-Clause license.


Subject(s)
Algorithms, Software, Base Sequence, Genome, High-Throughput Nucleotide Sequencing, Humans, DNA Sequence Analysis
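
The abstract above describes k-mer counting and the way sequencing errors inflate the number of distinct k-mers. The following is a minimal sketch of exact k-mer counting with an in-memory dictionary, assuming short toy reads. It is not CQF-deNoise: it uses no counting quotient filter and performs no noise removal, and the function name and reads are hypothetical.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every length-k substring across all reads (exact, in memory)."""
    counts = Counter()
    for read in reads:
        for start in range(len(read) - k + 1):
            counts[read[start:start + k]] += 1
    return counts


# Two error-free copies of a toy read plus one copy with a single substitution (T -> A).
reads = ["ACGTAC", "ACGTAC", "ACGAAC"]
counts = count_kmers(reads, k=3)

# k-mers seen only once are the typical signature of sequencing errors.
singletons = sorted(kmer for kmer, n in counts.items() if n == 1)
print(len(counts), singletons)
```

In this toy case a single erroneous base adds three k-mers that do not occur in the error-free reads, which illustrates why memory use grows with the error rate when every distinct k-mer is stored.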