1.
Bioinform Adv ; 3(1): vbad034, 2023.
Article in English | MEDLINE | ID: mdl-37250111

ABSTRACT

Motivation: The application of machine learning (ML) techniques in the medical field has demonstrated both successes and challenges in the precision medicine era. The ability to accurately classify a subject as a potential responder versus a nonresponder to a given therapy is still an active area of research, pushing the field to create new approaches for applying machine-learning techniques. In this study, we leveraged publicly available data through the BeatAML initiative. Specifically, we used gene count data, generated via RNA-seq, from 451 individuals matched with ex vivo data generated from treatment with RTK-type-III inhibitors. Three feature selection techniques were tested: principal component analysis, the Shapley Additive Explanation (SHAP) technique and differential gene expression analysis. Each was paired with three different classifiers: XGBoost, LightGBM and random forest (RF). Sensitivity versus specificity was analyzed using the area under the receiver operating characteristic (ROC) curve (AUC) for every model developed. Results: Our work demonstrated that the feature selection technique, rather than the classifier, had the greatest impact on model performance. The SHAP technique outperformed the other feature selection techniques and predicted outcome response with high accuracy; the highest-performing model, for Foretinib, reached an AUC of 89% using the SHAP technique and the RF classifier. Our ML pipelines demonstrate that at the time of diagnosis, a transcriptomics signature exists that can potentially predict response to treatment, demonstrating the potential of using ML applications in precision medicine efforts. Availability and implementation: https://github.com/UD-CRPL/RCDML. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
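
As a rough illustration of the AUC metric named in the abstract above, the sketch below computes the area under the ROC curve as the probability that a randomly chosen responder is scored higher than a randomly chosen nonresponder. It is not taken from the RCDML repository; the function name `roc_auc` and the toy labels and scores are assumptions made for this example only.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for responder, 0 for nonresponder (hypothetical encoding).
    scores: model scores, higher meaning "more likely responder".
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    # Each (positive, negative) pair counts 1 if ranked correctly, 0.5 on a tie.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration; not results from the study.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
```

The pairwise form is quadratic in the number of samples, which is acceptable for a sketch; a library routine would normally be used on real cohorts.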

2.
Front Cell Infect Microbiol ; 12: 816601, 2022.
Article in English | MEDLINE | ID: mdl-35310842

ABSTRACT

Background: Different feeding regimens in infancy alter the gastrointestinal (gut) microbial environment. The fecal microbiota in turn influences gastrointestinal homeostasis including metabolism, immune function, and extra-/intra-intestinal signaling. Advances in next generation sequencing (NGS) have enhanced our ability to study the gut microbiome of breast-fed (BF) and formula-fed (FF) infants with a data-driven hypothesis approach. Methods: Next generation sequencing libraries were constructed from fecal samples of BF (n=24) and FF (n=10) infants and sequenced on an Illumina HiSeq 2500. Taxonomic classification of the NGS data was performed using the Sunbeam/Kraken pipeline and a functional analysis at the gene level was performed using publicly available algorithms, including BLAST, and custom scripts. Differentially represented genera, genes, and NCBI Clusters of Orthologous Genes (COG) were determined between cohorts using count data and R (statistical packages edgeR and DESeq2). Results: Thirty-nine genera were found to be differentially represented between the BF and FF cohorts (FDR ≤ 0.01) including Parabacteroides, Enterococcus, Haemophilus, Gardnerella, and Staphylococcus. A Welch t-test of the Shannon diversity index for BF and FF samples approached significance (p=0.061). Bray-Curtis and Jaccard distance analyses demonstrated clustering and overlap in each analysis. Sixty COGs were significantly overrepresented, and those most significantly represented in BF vs. FF samples showed a dichotomy of categories representing gene functions. Over 1,700 genes were found to be differentially represented (abundance) between the BF and FF cohorts. Conclusions: Fecal samples analyzed from BF and FF infants demonstrated differences in microbiota genera. The BF cohort showed a greater presence of the beneficial genus Bifidobacterium. Several genes were identified as present at different abundances between cohorts, indicating differences in functional pathways such as cellular defense mechanisms and carbohydrate metabolism influenced by feeding. Confirmation of gene level NGS data via PCR and electrophoresis analysis revealed distinct differences in gene abundances associated with important biologic pathways.


Subjects
Gastrointestinal Microbiome; Microbiota; Breast Feeding; Feces/microbiology; Female; Gastrointestinal Microbiome/genetics; Humans; Infant; Infant Formula; Metagenomics
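
The Shannon diversity index and Bray-Curtis distance mentioned in the abstract above can be sketched from per-genus read counts as follows. This is a minimal illustration, not the authors' pipeline: the function names and the toy count vectors are assumptions, and the natural logarithm is used for the Shannon index (other bases are also common).

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive value")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two count vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("count vectors must have the same length")
    denominator = sum(a) + sum(b)
    if denominator == 0:
        raise ValueError("at least one sample must have nonzero counts")
    return sum(abs(x - y) for x, y in zip(a, b)) / denominator


# Hypothetical genus-level counts for two samples (invented, not study data).
sample_bf = [40, 30, 20, 10]
sample_ff = [25, 25, 25, 25]
h_bf = shannon_index(sample_bf)
h_ff = shannon_index(sample_ff)
distance = bray_curtis(sample_bf, sample_ff)
```

A perfectly even community maximizes the Shannon index at ln(number of taxa), and a Bray-Curtis value of 0 means identical count profiles while 1 means no shared taxa.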
3.
PLoS One ; 17(1): e0262573, 2022.
Article in English | MEDLINE | ID: mdl-35045124

ABSTRACT

The use of next generation sequencing is critical for the surveillance of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) transmission, as single base mutations have been identified with differences in infectivity. A total of 1,459 high quality samples were collected, sequenced, and analyzed in the state of Delaware, a location that offers a unique perspective on transmission given its proximity to large international airports on the east coast. Pangolin and Nextclade were used to classify these sequences into 16 unique clades and 88 lineages. A total of 411 samples belonging to the Alpha 20I/501Y.V1 (B.1.1.7) strain of concern were identified, as well as 1 sample belonging to Beta 20H/501Y.V2 (B.1.351), 13 belonging to Epsilon 20C/S:452R (B.1.427/B.1.429), 2 belonging to Delta 20A/S:478K (B.1.617.2), and 15 belonging to Gamma 20J/501Y.V3 (P.1). A total of 2,217 unique coding mutations were observed, with an average of 17.7 coding mutations per genome. These data, paired with continued sample collection and sequencing, will give a deeper understanding of the spread of SARS-CoV-2 strains within Delaware and its surrounding areas.


Subjects
COVID-19/pathology; Genome, Viral; SARS-CoV-2/genetics; COVID-19/epidemiology; COVID-19/virology; Delaware/epidemiology; Genetic Linkage; High-Throughput Nucleotide Sequencing; Humans; Phylogeny; RNA, Viral/chemistry; RNA, Viral/metabolism; SARS-CoV-2/classification; SARS-CoV-2/isolation & purification
4.
Genomics Inform ; 18(1): e10, 2020 Mar.
Article in English | MEDLINE | ID: mdl-32224843

ABSTRACT

Advancements in next generation sequencing (NGS) technologies have significantly increased the translational use of genomics data in the medical field as well as the demand for computational infrastructure capable of processing that data. To enhance the current understanding of software and hardware used to compute large scale human genomic datasets (NGS), the performance and accuracy of optimized versions of GATK algorithms, including Parabricks and Sentieon, were compared to the results of the original application (GATK v4.1.0, Intel x86 CPUs). Parabricks was able to process a 50× whole-genome sequencing library in under 3 h and Sentieon finished in under 8 h, whereas GATK v4.1.0 needed nearly 24 h. These results were achieved while maintaining greater than 99% accuracy and precision compared to stock GATK. Sentieon's somatic pipeline achieved similar results, also greater than 99%. Additionally, the IBM POWER9 CPU performed well on bioinformatic workloads when tested with 10 different tools for alignment/mapping.

5.
BMC Genomics ; 19(1): 547, 2018 Jul 20.
Article in English | MEDLINE | ID: mdl-30029591

ABSTRACT

BACKGROUND: Since the proposal of Brachypodium distachyon as a model for the grasses, over 500 Bdi-miRNAs have been annotated in miRBase, making Brachypodium second in number only to rice. Other monocots, such as switchgrass, are completely absent from the miRBase database. While a significant number of miRNAs have been identified which are highly conserved across plants, little research has been done with respect to the conservation of miRNA targets. Plant responses to abiotic stresses are regulated by diverse pathways, many of which involve miRNAs; however, it can be difficult to identify miRNA-guided gene regulation when the miRNA is not the primary regulator of the target mRNA. RESULTS: To investigate miRNA target conservation and stress response involvement, a set of PARE (Parallel Analysis of RNA Ends) libraries totaling over two billion reads was constructed and sequenced from Brachypodium, switchgrass, and sorghum, representing the first report of RNA degradome data from the latter two species. Analysis of this data provided not only PARE evidence for miRNA-guided cleavage of over 7000 predicted target mRNAs in Brachypodium, but also evidence for miRNA-guided cleavage of over 1000 homologous transcripts in sorghum and switchgrass. A pipeline was constructed to compare RNA-seq and PARE data made from Brachypodium plants exposed to various abiotic stress conditions. This resulted in the identification of 44 miRNA targets which exhibit stress-regulated cleavage. Time course experiments were performed to reveal the relationship between miR393ab, miR169a, miR394ab, and their respective targets throughout the first 36 h of the cold stress response in Brachypodium. CONCLUSIONS: Knowledge gained from this study provides considerable insight into the RNA degradomes and the breadth of miRNA target conservation among these three species. Additionally, associations of a number of miRNAs and target mRNAs with the stress responses have been revealed which could aid in the development of stress-tolerant transgenic crops.


Subjects
Brachypodium/genetics; MicroRNAs/metabolism; RNA, Messenger/metabolism; Brachypodium/metabolism; Cold Temperature; Crops, Agricultural/genetics; Gene Expression Regulation, Plant; Panicum/genetics; RNA Cleavage; Sequence Analysis, RNA; Sorghum/genetics; Stress, Physiological/genetics