1.
Comput Struct Biotechnol J ; 23: 1919-1928, 2024 Dec.
Article in English | MEDLINE | ID: mdl-38711760

ABSTRACT

The decrease in sequencing expenses has facilitated the creation of reference genomes and proteomes for an expanding array of organisms. Nevertheless, no established repository that details organism-specific genomic and proteomic sequences of specific lengths, referred to as kmers, exists to our knowledge. In this article, we present kmerDB, a database accessible through an interactive web interface that provides kmer-based information from genomic and proteomic sequences in a systematic way. kmerDB currently contains 202,340,859,107 base pairs and 19,304,903,356 amino acids, spanning 54,039 and 21,865 reference genomes and proteomes, respectively, as well as 6,905,362 and 149,305,183 genomic and proteomic species-specific sequences, termed quasi-primes. Additionally, we provide access to 5,186,757 nucleic and 214,904,089 peptide sequences absent from every genome and proteome, termed primes. kmerDB features a user-friendly interface offering various search options and filters for easy parsing and searching. The service is available at: www.kmerdb.com.
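
To make the two definitions in this abstract concrete, the following is a minimal sketch on toy data, not kmerDB's actual pipeline. It assumes exact string matching on a single strand (reverse complements are ignored), and the function names `kmers`, `quasi_primes`, and `primes` are hypothetical, chosen for illustration only.

```python
from itertools import product


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def quasi_primes(sequences, k):
    """Map each species to the k-mers found in that species and in no other."""
    per_species = {name: kmers(seq, k) for name, seq in sequences.items()}
    result = {}
    for name, own in per_species.items():
        # Union of the k-mers of every other species in the collection.
        others = set().union(*(s for n, s in per_species.items() if n != name))
        result[name] = own - others
    return result


def primes(sequences, k, alphabet="ACGT"):
    """Return the k-mers over alphabet that are absent from every sequence."""
    seen = set().union(*(kmers(seq, k) for seq in sequences.values()))
    return {"".join(p) for p in product(alphabet, repeat=k)} - seen


# Toy input: two made-up "genomes" (assumed data, for illustration only).
toy = {"species_a": "ACGTAC", "species_b": "GTACGG"}
print(quasi_primes(toy, 2))   # species_b keeps only "GG"; species_a keeps nothing
print(len(primes(toy, 2)))    # 11 of the 16 possible 2-mers never occur
```

Enumerating every possible kmer, as `primes` does, grows exponentially with `k`, so this approach is only practical for short lengths.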

2.
Cancer Gene Ther ; 2024 Feb 14.
Article in English | MEDLINE | ID: mdl-38351138

ABSTRACT

Early detection of cancer can significantly improve patient outcomes; however, sensitive and highly specific biomarkers for cancer detection are currently missing. Nullomers are the shortest sequences that are absent from the human genome but can emerge due to somatic mutations in cancer. We examine over 10,000 whole exome sequencing matched tumor-normal samples to characterize nullomer emergence across exonic regions of the genome. We also identify nullomer emerging mutational hotspots within tumor genes. Finally, we provide evidence for the identification of nullomers in cell-free RNA from peripheral blood samples, enabling detection of multiple tumor types. We show multiple tumor classification models with an AUC greater than 0.9, including a hepatocellular carcinoma classifier with an AUC greater than 0.99.
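
The idea of a nullomer emerging from a somatic mutation can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' method: the reference is a short made-up string, "absent" means absent at a fixed length `k` on one strand only, and the function name `emerging_nullomers` is hypothetical.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def emerging_nullomers(reference, pos, alt, k):
    """List k-mers created by a substitution at pos that are absent from reference."""
    present = kmers(reference, k)
    mutant = reference[:pos] + alt + reference[pos + 1:]
    # Only windows that overlap the mutated position can contain new k-mers.
    start = max(0, pos - k + 1)
    end = min(pos, len(mutant) - k)
    window = [mutant[i:i + k] for i in range(start, end + 1)]
    return [kmer for kmer in window if kmer not in present]


# Toy reference and a single T>G substitution at 0-based position 3.
print(emerging_nullomers("ACGTACGT", 3, "G", 3))  # ['CGG', 'GGA', 'GAC']
```

A single substitution can affect at most `k` overlapping windows, which is why only those windows need to be rescanned rather than the whole sequence.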

3.
Eur J Cancer ; 196: 113421, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37952501

ABSTRACT

Early diagnosis of cancer can significantly improve survival of cancer patients; however, sensitive and highly specific biomarkers for cancer detection are currently lacking for most cancer types. Nullpeptides are short peptides that are absent from the human proteome. Here, we examined the emergence of nullpeptides during cancer development. We analyzed 3,600,964 somatic mutations across 10,064 whole exome sequencing tumor samples spanning 32 cancer types. We analyzed RNA-seq data from primary tumor samples to identify the subset of nullpeptides that emerge in highly expressed genes. We show that nullpeptides, and particularly the subset that is highly recurrent across cancer patients, can be identified in tumor biopsy samples. We find that cancer genes show an excess of nullpeptides and detect nullpeptide hotspots in specific loci of oncogenes and tumor suppressors. We also observe that recurrent nullpeptides are more likely to be found in neoantigens, which have been shown to be effective targets for immunotherapy, suggesting that they can be used to prioritize candidates. Our findings provide evidence for the utility of nullpeptides as cancer detection and therapeutic biomarkers.


Subject(s)
Neoplasms, Humans, Neoplasms/therapy, Oncogenes, Peptides/genetics, Immunotherapy, Biomarkers, Mutation, Neoplasm Antigens
4.
BMC Genomics ; 24(1): 768, 2023 Dec 12.
Article in English | MEDLINE | ID: mdl-38087204

ABSTRACT

Early detection of human disease is associated with improved clinical outcomes. However, many diseases are often detected at an advanced, symptomatic stage, when patients are past the window of efficacious treatment, which can result in less favorable outcomes. Therefore, methods that can accurately detect human disease at a presymptomatic stage are urgently needed. Here, we introduce "frequentmers": short sequences that are specific to, and recurrently observed in, either patient or healthy control samples, but not both. We showcase the utility of frequentmers for the detection of liver cirrhosis using metagenomic next-generation sequencing data from stool samples of patients and controls. We develop classification models for the detection of liver cirrhosis and achieve an AUC score of 0.91 using ten-fold cross-validation. A small subset of 200 frequentmers can achieve comparable results in detecting liver cirrhosis. Finally, we identify the microbial organisms in liver cirrhosis samples that are associated with the most predictive frequentmer biomarkers.


Subject(s)
High-Throughput Nucleotide Sequencing, Liver Cirrhosis, Humans, Liver Cirrhosis/diagnosis, Liver Cirrhosis/genetics, Health Status, Metagenome, Metagenomics, Sensitivity and Specificity
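
The frequentmer definition in the abstract above (recurrent in one group, absent from the other) can be sketched on toy data as follows. This is an illustrative simplification, not the published pipeline: each sample is represented as a single string rather than sequencing reads, and the recurrence threshold `min_samples` and the function name `frequentmers` are assumptions made for this example.

```python
from collections import Counter


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def frequentmers(cases, controls, k, min_samples=2):
    """Return (case_specific, control_specific) sets of recurrent k-mers."""

    def sample_counts(samples):
        counts = Counter()
        for seq in samples:
            counts.update(kmers(seq, k))  # each k-mer counts once per sample
        return counts

    case_counts = sample_counts(cases)
    control_counts = sample_counts(controls)
    case_specific = {m for m, n in case_counts.items()
                     if n >= min_samples and m not in control_counts}
    control_specific = {m for m, n in control_counts.items()
                        if n >= min_samples and m not in case_counts}
    return case_specific, control_specific


# Made-up samples: two "patients" and two "controls".
case_only, control_only = frequentmers(
    ["AAACCC", "TAAACG"], ["CCCGGG", "ACCCGT"], k=3
)
print(sorted(case_only))     # ['AAA', 'AAC']
print(sorted(control_only))  # ['CCG']
```

The resulting presence/absence of each frequentmer per sample could then serve as a feature vector for a classifier, which is the role the abstract describes for them.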
5.
Nat Genet ; 55(10): 1735-1744, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37735198

ABSTRACT

Candidate cis-regulatory elements (cCREs) in microglia demonstrate the most substantial enrichment for Alzheimer's disease (AD) heritability compared to other brain cell types. However, whether and how these genome-wide association studies (GWAS) variants contribute to AD remains elusive. Here we prioritize 308 previously unreported AD risk variants at 181 cCREs by integrating genetic information with microglia-specific 3D epigenome annotation. We further establish the link between functional variants and target genes by single-cell CRISPRi screening in microglia. In addition, we show that AD variants exhibit allelic imbalance on target gene expression. In particular, rs7922621 is the effective variant in controlling TSPAN14 expression among other nominated variants in the same cCRE and exerts multiple physiological effects including reduced cell surface ADAM10 and altered soluble TREM2 (sTREM2) shedding. Our work represents a systematic approach to prioritize and characterize AD-associated variants and provides a roadmap for advancing genetic association to experimentally validated cell-type-specific phenotypes and mechanisms.


Subject(s)
Alzheimer Disease, Humans, Alzheimer Disease/genetics, Alzheimer Disease/metabolism, Microglia/metabolism, Genome-Wide Association Study, Cell Membrane/metabolism, Phenotype
6.
NAR Genom Bioinform ; 5(2): lqad039, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37101657

ABSTRACT

Determining the organisms present in a biosample has many important applications in agriculture, wildlife conservation, and healthcare. Here, we develop a universal fingerprint based on the identification of short peptides that are unique to a specific organism. We define quasi-prime peptides as sequences that are found in only one species, and we analyzed proteomes from 21 875 species, from viruses to humans, and annotated the smallest peptide kmer sequences that are unique to a species and absent from all other proteomes. We also perform simulations across all reference proteomes and observe a lower than expected number of peptide kmers across species and taxonomies, indicating an enrichment for nullpeptides, sequences absent from a proteome. For humans, we find that quasi-primes are found in genes enriched for specific gene ontology terms, including proteasome and ATP and GTP catalysis. We also provide a set of quasi-prime peptides for a number of human pathogens and model organisms and further showcase its utility via two case studies for Mycobacterium tuberculosis and Vibrio cholerae, where we identify quasi-prime peptides in two transmembrane and extracellular proteins with relevance for pathogen detection. Our catalog of quasi-prime peptides provides the smallest unit of information that is specific to a single organism at the protein level, providing a versatile tool for species identification.

7.
Nat Commun ; 14(1): 2333, 2023 Apr 22.
Article in English | MEDLINE | ID: mdl-37087538

ABSTRACT

The gene regulatory code and grammar remain largely unknown, precluding our ability to link phenotype to genotype in regulatory sequences. Here, using a massively parallel reporter assay (MPRA) of 209,440 sequences, we examine all possible pair and triplet combinations, permutations and orientations of eighteen liver-associated transcription factor binding sites (TFBS). We find that TFBS orientation and order have a major effect on gene regulatory activity. Corroborating these results with genomic analyses, we find clear human promoter TFBS orientation biases and similar TFBS orientation and order transcriptional effects in an MPRA that tested 164,307 liver candidate regulatory elements. Additionally, by adding TFBS orientation to a model that predicts expression from sequence we improve performance by 7.7%. Collectively, our results show that TFBS orientation and order have a significant effect on gene regulatory activity and need to be considered when analyzing the functional effect of variants on the activity of these sequences.


Subject(s)
Gene Expression Regulation, Transcription Factors, Humans, Transcription Factors/genetics, Transcription Factors/metabolism, Binding Sites/genetics, Genetic Promoter Regions/genetics, Protein Binding
9.
Genome Biol ; 23(1): 159, 2022 Jul 18.
Article in English | MEDLINE | ID: mdl-35851062

ABSTRACT

The most stable structure of DNA is the canonical right-handed double helix termed B-DNA. However, certain environments and sequence motifs favor alternative conformations, termed non-canonical secondary structures. The roles of DNA and RNA secondary structures in transcriptional regulation remain incompletely understood. However, advances in high-throughput assays have enabled genome-wide characterization of some secondary structures. Here, we describe their regulatory functions in promoters and 3'UTRs, providing insights into key mechanisms through which they regulate gene expression. We discuss their implication in human disease, and how advances in molecular technologies and emerging high-throughput experimental methods could provide additional insights.


Subject(s)
DNA, Gene Expression Regulation, 3' Untranslated Regions, Humans, Genetic Promoter Regions
10.
J Biol Chem ; 298(4): 101674, 2022 Apr.
Article in English | MEDLINE | ID: mdl-35148987

ABSTRACT

Adeno-associated viruses (AAVs) targeting specific cell types are powerful tools for studying distinct cell types in the central nervous system (CNS). Cis-regulatory modules (CRMs), e.g., enhancers, are highly cell-type-specific and can be integrated into AAVs to render cell type specificity. Chromatin accessibility has been commonly used to nominate CRMs, which have then been incorporated into AAVs and tested for cell type specificity in the CNS. However, chromatin accessibility data alone cannot accurately annotate active CRMs, as many chromatin-accessible CRMs are not active and fail to drive gene expression in vivo. Using available large-scale datasets on chromatin accessibility, such as those published by the ENCODE project, here we explored strategies to increase efficiency in identifying active CRMs for AAV-based cell-type-specific labeling and manipulation. We found that prescreening of chromatin-accessible putative CRMs based on the density of cell-type-specific transcription factor binding sites (TFBSs) can significantly increase efficiency in identifying active CRMs. In addition, generation of synthetic CRMs by stitching chromatin-accessible regions flanking cell-type-specific genes can render cell type specificity in many cases. Using these straightforward strategies, we generated AAVs that can target the extensively studied interneuron and glial cell types in the retina and brain. Both strategies utilize available genomic datasets and can be employed to generate AAVs targeting specific cell types in CNS without conducting comprehensive screening and sequencing experiments, making a step forward in cell-type-specific research.


Subject(s)
Brain, Dependovirus, Retina, Staining and Labeling, Transcription Factors, Animals, Binding Sites, Brain/cytology, Brain/metabolism, Chromatin/genetics, Chromatin/metabolism, Dependovirus/genetics, Dependovirus/metabolism, Mice, Retina/cytology, Retina/metabolism, Staining and Labeling/methods, Transcription Factors/metabolism
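
The prescreening step described in the abstract above, ranking chromatin-accessible candidate CRMs by the density of cell-type-specific TFBSs, might look roughly like the sketch below. It is a simplified illustration, not the authors' procedure: binding sites are matched as exact strings on both strands (real analyses typically score position weight matrices), and the motif, region names, and function names are all hypothetical.

```python
def revcomp(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def count_sites(seq, motif):
    """Count (possibly overlapping) exact motif matches on either strand."""
    targets = {motif, revcomp(motif)}
    k = len(motif)
    return sum(seq[i:i + k] in targets for i in range(len(seq) - k + 1))


def tfbs_density(seq, motifs):
    """Return binding sites per kilobase, summed over all motifs."""
    total = sum(count_sites(seq, m) for m in motifs)
    return 1000 * total / len(seq)


def rank_candidates(regions, motifs):
    """Return region names sorted from highest to lowest TFBS density."""
    return sorted(regions, key=lambda name: tfbs_density(regions[name], motifs),
                  reverse=True)


# Made-up candidate regions and a made-up motif, for illustration only.
regions = {"r1": "TAATCCAAGGATTA", "r2": "ACGTACGTACGTAC"}
print(rank_candidates(regions, ["TAATCC"]))  # ['r1', 'r2']
```

Normalizing by region length keeps long accessible regions from outranking short ones simply because they contain more sequence.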
11.
Development ; 147(14), 2020 Jul 26.
Article in English | MEDLINE | ID: mdl-32631829

ABSTRACT

Transcription factors (TFs) are often used repeatedly during development and homeostasis to control distinct processes in the same and/or different cellular contexts. Considering the limited number of TFs in the genome and the tremendous number of events that need to be regulated, re-use of TFs is necessary. We analyzed how the expression of the homeobox TF, orthodenticle homeobox 2 (Otx2), is regulated in a cell type- and stage-specific manner during development in the mouse retina. We identified seven Otx2 cis-regulatory modules (CRMs), among which the O5, O7 and O9 CRMs mark three distinct cellular contexts of Otx2 expression. We discovered that Otx2, Crx and Sox2, which are well-known TFs regulating retinal development, bind to and activate the O5, O7 or O9 CRMs, respectively. The chromatin status of these three CRMs was found to be distinct in vivo in different retinal cell types and at different stages. We conclude that retinal cells use a cohort of TFs with different expression patterns and multiple CRMs with different chromatin configurations to regulate the expression of Otx2 precisely.


Subject(s)
Otx Transcription Factors/metabolism, Transcriptional Regulatory Elements/genetics, Retina/metabolism, Transcription Factors/metabolism, Animals, Chromatin/metabolism, G2 Phase, HEK293 Cells, Homeodomain Proteins/genetics, Homeodomain Proteins/metabolism, Humans, Mice, Mutagenesis, Otx Transcription Factors/antagonists & inhibitors, Otx Transcription Factors/genetics, Vertebrate Photoreceptor Cells/cytology, Vertebrate Photoreceptor Cells/metabolism, Protein Binding, RNA Interference, Small Interfering RNA/metabolism, Retina/growth & development, SOXB1 Transcription Factors/genetics, SOXB1 Transcription Factors/metabolism, Trans-Activators/genetics, Trans-Activators/metabolism, Transcription Factors/genetics