Results 1 - 6 of 6
1.
Dev Cell ; 58(21): 2206-2216.e5, 2023 11 06.
Article in English | MEDLINE | ID: mdl-37848026

ABSTRACT

Transcriptional enhancers direct precise gene expression patterns during development and harbor the majority of variants associated with phenotypic diversity, evolutionary adaptations, and disease. Pinpointing which enhancer variants contribute to changes in gene expression and phenotypes is a major challenge. Here, we find that suboptimal or low-affinity binding sites are necessary for precise gene expression during heart development. Single-nucleotide variants (SNVs) can optimize the affinity of ETS binding sites, causing gain-of-function (GOF) gene expression, cell migration defects, and phenotypes as severe as extra beating hearts in the marine chordate Ciona robusta. In human induced pluripotent stem cell (iPSC)-derived cardiomyocytes, a SNV within a human GATA4 enhancer increases ETS binding affinity and causes GOF enhancer activity. The prevalence of suboptimal-affinity sites within enhancers creates a vulnerability whereby affinity-optimizing SNVs can lead to GOF gene expression, changes in cellular identity, and organismal-level phenotypes that could contribute to the evolution of novel traits or diseases.


Subjects
Enhancer Elements, Genetic , Induced Pluripotent Stem Cells , Humans , Enhancer Elements, Genetic/genetics , Myocytes, Cardiac/metabolism , Binding Sites , Nucleotides
2.
bioRxiv ; 2023 Jul 07.
Article in English | MEDLINE | ID: mdl-37461500

ABSTRACT

Cancer is a highly heterogeneous disease caused by genetic and epigenetic alterations in normal cells. A recent study uncovered methylation quantitative trait loci (meQTLs) associated with different levels of local DNA methylation in cancers. Here, we investigated whether the distribution of cancer meQTLs reflected functional organization of the genome in the form of chromatin topologically associated domains (TADs), and evaluated whether cancer meQTLs near known driver genes have the potential to influence cancer risk or progression. At TAD boundaries, we observed differences in the distribution of meQTLs when one or both of the adjacent TADs was transcriptionally active, with higher densities near inactive TADs. Furthermore, we found differences in cancer meQTL distributions in active versus inactive TADs and observed an enrichment of meQTLs in active TADs near tumor suppressors, whereas there was a depletion of such meQTLs near oncogenes. Several meQTLs were associated with cancer risk in the UKBioBank, and we were able to reproduce breast cancer risk associations in the DRIVE cohort. Survival analysis in TCGA implicated a number of meQTLs in 13 tumor types. In 10 of these, polygenic meQTL scores were associated with increased hazard in a CoxPH analysis. Risk and survival-associated meQTLs tended to affect cancer genes involved in DNA damage repair and cellular adhesion and reproduced cancer-specific associations reported in prior literature. In summary, this study provides evidence that genetic variants that influence local DNA methylation are affected by chromatin structure and can impact tumor evolution.

3.
Nat Comput Sci ; 3(11): 946-956, 2023 Nov.
Article in English | MEDLINE | ID: mdl-38177592

ABSTRACT

Deep learning has become a popular tool to study cis-regulatory function. Yet efforts to design software for deep-learning analyses in regulatory genomics that are findable, accessible, interoperable and reusable (FAIR) have fallen short of fully meeting these criteria. Here we present elucidating the utility of genomic elements with neural nets (EUGENe), a FAIR toolkit for the analysis of genomic sequences with deep learning. EUGENe consists of a set of modules and subpackages for executing the key functionality of a genomics deep learning workflow: (1) extracting, transforming and loading sequence data from many common file formats; (2) instantiating, initializing and training diverse model architectures; and (3) evaluating and interpreting model behavior. We designed EUGENe as a simple, flexible and extensible interface for streamlining and customizing end-to-end deep-learning sequence analyses, and illustrate these principles through application of the toolkit to three predictive modeling tasks. We hope that EUGENe represents a springboard towards a collaborative ecosystem for deep-learning applications in genomics research.
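Step (1) of the workflow described above, transforming sequence data into model input, is commonly done by one-hot encoding each base. The following is a minimal sketch in plain Python and is not EUGENe's API; the function name `one_hot_encode` and the choice to map unknown bases to an all-zero row are assumptions made for illustration.

```python
# Hypothetical helper for illustration; not part of EUGENe.
ALPHABET = "ACGT"


def one_hot_encode(seq):
    """Return one 4-element row per base, in A, C, G, T order.

    Bases outside the alphabet (for example N) become an all-zero row.
    """
    index = {base: i for i, base in enumerate(ALPHABET)}
    encoded = []
    for base in seq.upper():
        row = [0, 0, 0, 0]
        if base in index:
            row[index[base]] = 1
        encoded.append(row)
    return encoded


if __name__ == "__main__":
    # Print each base next to its encoded row.
    for base, row in zip("ACGTN", one_hot_encode("ACGTN")):
        print(base, row)
```

The resulting list of rows has shape (sequence length, 4) and can be converted to an array type of choice before being passed to a model.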


Subjects
Genomics , Genome , Software , Workflow
4.
Database (Oxford) ; 2021, 2021 04 29.
Article in English | MEDLINE | ID: mdl-33914028

ABSTRACT

High-quality metadata annotations for data hosted in large public repositories are essential for research reproducibility and for conducting fast, powerful and scalable meta-analyses. Currently, a majority of sequencing samples in the National Center for Biotechnology Information's Sequence Read Archive (SRA) are missing metadata across several categories. In an effort to improve the metadata coverage of these samples, we leveraged almost 44 million attribute-value pairs from SRA BioSample to train a scalable, recurrent neural network that predicts missing metadata via named entity recognition (NER). The network was first trained to classify short text phrases according to 11 metadata categories and achieved an overall accuracy and area under the receiver operating characteristic curve of 85.2% and 0.977, respectively. We then applied our classifier to predict 11 metadata categories from the longer TITLE attribute of samples, evaluating performance on a set of samples withheld from model training. Prediction accuracies were high when extracting sample Genus/Species (94.85%), Condition/Disease (95.65%) and Strain (82.03%) from TITLEs, with lower accuracies and lack of predictions for other categories highlighting multiple issues with the current metadata annotations in BioSample. These results indicate the utility of recurrent neural networks for NER-based metadata prediction and the potential for models such as the one presented here to increase metadata coverage in BioSample while minimizing the need for manual curation. Database URL: https://github.com/cartercompbio/PredictMEE.


Subjects
Deep Learning , Metadata , High-Throughput Nucleotide Sequencing , Reproducibility of Results , Software
5.
Annu Int Conf IEEE Eng Med Biol Soc ; 2020: 5459-5463, 2020 07.
Article in English | MEDLINE | ID: mdl-33019215

ABSTRACT

Fungemia is a life-threatening infection, but predictive models of in-patient mortality in this infection are few. In this study, we developed models predicting all-cause in-hospital mortality among 265 fungemic patients in the Medical Information Mart for Intensive Care (MIMIC-III) database using both structured and unstructured data. Structured data models included multivariable logistic regression, extreme gradient boosting, and stacked ensemble models. Unstructured data models were developed using Amazon Comprehend Medical and BioWordVec embeddings in logistic regression, convolutional neural networks (CNNs), and recurrent neural networks (RNNs). We evaluated models trained on all notes, notes from only the first three days of hospitalization, and models trained on only physician notes. The best-performing structured data model was a multivariable logistic regression model that achieved an accuracy of 0.74 and AUC of 0.76. Liver disease, acute renal failure, and intubation were some of the top features driving prediction in multiple models. CNNs using unstructured data achieved similar performance even when trained with notes from only the first three days of hospitalization. The best-performing unstructured data models used the Amazon Comprehend Medical document classifier and CNNs, achieving accuracy ranging from 0.99 to 1.00, and AUCs of 1.00. Therefore, unstructured data - particularly notes composed by physicians - offer added predictive value over models based on structured data alone.
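The two metrics reported above, accuracy and AUC, can both be computed from a model's predicted probabilities. The sketch below is a generic illustration rather than the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy labels and probabilities are all assumptions. AUC is computed as the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as half.

```python
def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of cases whose thresholded prediction equals the label."""
    preds = [1 if p >= threshold else 0 for p in y_prob]
    return sum(int(p == t) for p, t in zip(preds, y_true)) / len(y_true)


def roc_auc(y_true, y_prob):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [p for p, t in zip(y_prob, y_true) if t == 1]
    neg = [p for p, t in zip(y_prob, y_true) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up labels (1 = died in hospital) and predicted probabilities.
    y_true = [1, 0, 1, 0, 1, 0]
    y_prob = [0.9, 0.2, 0.6, 0.7, 0.8, 0.1]
    print("accuracy:", accuracy(y_true, y_prob))
    print("AUC:", roc_auc(y_true, y_prob))
```

The pairwise form is quadratic in the number of cases, which is acceptable for a cohort of a few hundred patients; a rank-based formulation would scale better.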


Subjects
Fungemia , Area Under Curve , Critical Care , Humans , Logistic Models , Neural Networks, Computer
6.
J Med Internet Res ; 22(8): e18855, 2020 08 14.
Article in English | MEDLINE | ID: mdl-32795984

ABSTRACT

BACKGROUND: Fungal ocular involvement can develop in patients with fungal bloodstream infections and can be vision-threatening. Ocular involvement has become less common in the current era of improved antifungal therapies. Retrospectively determining the prevalence of fungal ocular involvement is important for informing clinical guidelines, such as the need for routine ophthalmologic consultations. However, manual retrospective record review to detect cases is time-consuming. OBJECTIVE: This study aimed to determine the prevalence of fungal ocular involvement in a critical care database using both structured and unstructured electronic health record (EHR) data. METHODS: We queried microbiology data from 46,467 critical care patients over 12 years (2000-2012) from the Medical Information Mart for Intensive Care III (MIMIC-III) to identify 265 patients with culture-proven fungemia. For each fungemic patient, demographic data, fungal species present in blood culture, and risk factors for fungemia (eg, presence of indwelling catheters, recent major surgery, diabetes, immunosuppressed status) were ascertained. All structured diagnosis codes and free-text narrative notes associated with each patient's hospitalization were also extracted. Screening for fungal endophthalmitis was performed using two approaches: (1) by querying a wide array of eye- and vision-related diagnosis codes, and (2) by utilizing a custom regular expression pipeline to identify and collate relevant text matches pertaining to fungal ocular involvement. Both approaches were validated using manual record review. The main outcome measure was the documentation of any fungal ocular involvement. RESULTS: In total, 265 patients had culture-proven fungemia, with Candida albicans (n=114, 43%) and Candida glabrata (n=74, 28%) being the most common fungal species in blood culture. The in-hospital mortality rate was 46% (121/265). In total, 7 patients were identified as having eye- or vision-related diagnosis codes, none of whom had fungal endophthalmitis based on record review. There were 26,830 free-text narrative notes associated with these 265 patients. A regular expression pipeline based on relevant terms yielded possible matches in 683 notes from 108 patients. Subsequent manual record review again demonstrated that no patients had fungal ocular involvement. Therefore, the prevalence of fungal ocular involvement in this cohort was 0%. CONCLUSIONS: MIMIC-III contained no cases of ocular involvement among fungemic patients, consistent with prior studies reporting low rates of ocular involvement in fungemia. This study demonstrates an application of natural language processing to expedite the review of narrative notes. This approach is highly relevant for ophthalmology, where diagnoses are often based on physical examination findings that are documented within clinical notes.
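The second screening approach above, a regular expression pipeline that collates text matches per patient for manual review, can be sketched as follows. This is an illustration under stated assumptions and not the study's pipeline: the term list in `OCULAR_PATTERN`, the function name `screen_notes`, the context width, and the example notes are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical term list; the study's actual expressions are not given here.
OCULAR_PATTERN = re.compile(
    r"\b(endophthalmitis|chorioretinitis|vitritis|ophthalm\w*)\b",
    re.IGNORECASE,
)


def screen_notes(notes, context=30):
    """Collect matching snippets per patient for later manual review.

    notes: iterable of (patient_id, note_id, text) tuples.
    Returns {patient_id: [(note_id, snippet), ...]} for patients with a match.
    """
    hits = defaultdict(list)
    for patient_id, note_id, text in notes:
        for match in OCULAR_PATTERN.finditer(text):
            # Keep some surrounding text so a reviewer can judge the match.
            start = max(0, match.start() - context)
            end = min(len(text), match.end() + context)
            hits[patient_id].append((note_id, text[start:end]))
    return dict(hits)


if __name__ == "__main__":
    # Invented example notes, not MIMIC-III data.
    example_notes = [
        (1, "a", "Ophthalmology consulted; no evidence of endophthalmitis."),
        (2, "b", "Patient afebrile, lines removed."),
    ]
    for patient_id, snippets in screen_notes(example_notes).items():
        print(patient_id, snippets)
```

A pattern like this flags negated mentions such as "no evidence of endophthalmitis" as well, which is why the matches are treated as candidates and confirmed by manual record review.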


Subjects
Critical Care/methods , Endophthalmitis/diagnosis , Eye/pathology , Mycoses/diagnostic imaging , Natural Language Processing , Cross-Sectional Studies , Female , Humans , Male , Middle Aged , Retrospective Studies , Risk Factors