Results 1 - 20 of 199
1.
Bioinformatics ; 40(Supplement_1): i39-i47, 2024 Jun 28.
Article in English | MEDLINE | ID: mdl-38940175

ABSTRACT

MOTIVATION: The World Health Organization estimates that there were over 10 million cases of tuberculosis (TB) worldwide in 2019, resulting in over 1.4 million deaths, with a worrisome increasing trend yearly. The disease is caused by Mycobacterium tuberculosis (MTB) through airborne transmission. Treatment of TB is estimated to be 85% successful; however, this drops to 57% if MTB exhibits multiple antimicrobial resistance (AMR), for which fewer treatment options are available. RESULTS: We develop a robust machine-learning classifier using both linear and nonlinear models (i.e. LASSO logistic regression (LR) and random forests (RF)) to predict the phenotypic resistance of MTB for a broad range of antibiotic drugs. We use data from the CRyPTIC consortium to train our classifier; the data consist of whole genome sequencing and antibiotic susceptibility testing (AST) phenotypic data for 13 different antibiotics. To train our model, we assemble the sequence data into genomic contigs, identify all unique 31-mers in the set of contigs, and build a feature matrix M, where M[i, j] is equal to the number of times the ith 31-mer occurs in the jth genome. Due to the size of this feature matrix (over 350 million unique 31-mers), we build and use a sparse matrix representation. Our method, which we refer to as MTB++, leverages compact data structures and iterative methods to allow for the screening of all the 31-mers in the development of both LASSO LR and RF. MTB++ is able to achieve high discrimination (F-1 >80%) for the first-line antibiotics. Moreover, MTB++ had the highest F-1 score in all but three classes and was the most comprehensive since it had an F-1 score >75% in all but four (rare) antibiotic drugs. We use our feature selection to contextualize the 31-mers that are used for the prediction of phenotypic resistance, leading to some insights about sequence similarity to genes in MEGARes. Lastly, we give an estimate of the amount of data that is needed in order to provide accurate predictions. AVAILABILITY: The models and source code are publicly available on GitHub at https://github.com/M-Serajian/MTB-Pipeline.
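
The feature matrix described in this abstract can be illustrated with a small sketch. This is not the MTB++ implementation: the function names (kmer_counts, build_sparse_matrix) are hypothetical, the dictionary-of-keys layout is just one possible sparse representation, and the sketch assumes k-mers are counted on the forward strand only, which the abstract does not specify. It only shows how M[i, j] could be stored without materializing zero entries.

```python
from collections import Counter


def kmer_counts(contigs, k=31):
    """Count every k-mer (forward strand only) across one genome's contigs."""
    counts = Counter()
    for contig in contigs:
        for start in range(len(contig) - k + 1):
            counts[contig[start:start + k]] += 1
    return counts


def build_sparse_matrix(genomes, k=31):
    """Return (kmer_index, entries), where entries[(i, j)] holds M[i, j].

    kmer_index maps each distinct k-mer to its row i; j is the genome's
    position in the input list. Only nonzero counts are stored.
    """
    kmer_index = {}
    entries = {}
    for j, contigs in enumerate(genomes):
        for kmer, count in kmer_counts(contigs, k).items():
            # Assign the next free row to a k-mer seen for the first time.
            i = kmer_index.setdefault(kmer, len(kmer_index))
            entries[(i, j)] = count
    return kmer_index, entries


# Toy data with k=3 so the result is easy to inspect by hand.
genomes = [["ACGTAC"], ["ACGACG", "TTT"]]
kmer_index, entries = build_sparse_matrix(genomes, k=3)
print(entries[(kmer_index["ACG"], 1)])  # count of "ACG" in the second genome
```

With hundreds of millions of distinct 31-mers, a plain Python dictionary would itself be too large; the abstract states that the authors rely on compact data structures, so the sketch conveys the indexing scheme rather than a scalable design.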


Subjects
Machine Learning , Mycobacterium tuberculosis , Mycobacterium tuberculosis/genetics , Mycobacterium tuberculosis/drug effects , Drug Resistance, Bacterial/genetics , Microbial Sensitivity Tests , Anti-Bacterial Agents/pharmacology , Whole Genome Sequencing/methods , Genome, Bacterial , Humans
2.
PLoS Comput Biol ; 20(4): e1011351, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38598563

ABSTRACT

In the midst of an outbreak or sustained epidemic, reliable prediction of transmission risks and patterns of spread is critical to inform public health programs. Projections of transmission growth or decline among specific risk groups can aid in optimizing interventions, particularly when resources are limited. Phylogenetic trees have been widely used in the detection of transmission chains and high-risk populations. Moreover, tree topology and the incorporation of population parameters (phylodynamics) can be useful in reconstructing the evolutionary dynamics of an epidemic across space and time among individuals. We now demonstrate the utility of phylodynamic trees for transmission modeling and forecasting, developing a phylogeny-based deep learning system, referred to as DeepDynaForecast. Our approach leverages a primal-dual graph learning structure with shortcut multi-layer aggregation, which is suited for the early identification and prediction of transmission dynamics in emerging high-risk groups. We demonstrate the accuracy of DeepDynaForecast using simulated outbreak data and the utility of the learned model using empirical, large-scale data from the human immunodeficiency virus epidemic in Florida between 2012 and 2020. Our framework is available as open-source software (MIT license) at github.com/lab-smile/DeepDynaForcast.


Subjects
Computational Biology , Deep Learning , Epidemics , Phylogeny , Humans , Epidemics/statistics & numerical data , Computational Biology/methods , HIV Infections/transmission , HIV Infections/epidemiology , Software , Florida/epidemiology , Algorithms , Computer Simulation , Disease Outbreaks/statistics & numerical data
3.
Nucleic Acids Res ; 51(D1): D744-D752, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36382407

ABSTRACT

Antimicrobial resistance (AMR) is considered a critical threat to public health, and genomic/metagenomic investigations featuring high-throughput analysis of sequence data are increasingly common and important. We previously introduced MEGARes, a comprehensive AMR database with an acyclic hierarchical annotation structure that facilitates high-throughput computational analysis, as well as AMR++, a customized bioinformatic pipeline specifically designed to use MEGARes in high-throughput analysis for characterizing AMR genes (ARGs) in metagenomic sequence data. Here, we present MEGARes v3.0, a comprehensive database of published ARG sequences for antimicrobial drugs, biocides, and metals, and AMR++ v3.0, an update to our customized bioinformatic pipeline for high-throughput analysis of metagenomic data (available at MEGLab.org). Database annotations have been expanded to include information regarding specific genomic locations for single-nucleotide polymorphisms (SNPs) and insertions and/or deletions (indels) when required by specific ARGs for resistance expression, and the updated AMR++ pipeline uses this information to check for presence of resistance-conferring genetic variants in metagenomic sequenced reads. This new information encompasses 337 ARGs, whose resistance-conferring variants could not previously be confirmed in such a manner. In MEGARes 3.0, the nodes of the acyclic hierarchical ontology include 4 antimicrobial compound types, 59 resistance classes, 233 mechanisms and 1448 gene groups that classify the 8733 accessions.


Subjects
Anti-Bacterial Agents , Anti-Infective Agents , Anti-Bacterial Agents/pharmacology , Drug Resistance, Bacterial/genetics , Software , High-Throughput Nucleotide Sequencing
4.
Brief Bioinform ; 23(2)2022 03 10.
Article in English | MEDLINE | ID: mdl-35212354

ABSTRACT

Antimicrobial resistance (AMR) is a growing threat to public health and farming at large. In clinical and veterinary practice, timely characterization of the antibiotic susceptibility profile of bacterial infections is a crucial step in optimizing treatment. High-throughput sequencing is a promising option for clinical point-of-care and ecological surveillance, opening the opportunity to develop genotyping-based AMR determination as a possibly faster alternative to phenotypic testing. In the present work, we compare the performance of state-of-the-art methods for detection of AMR using high-throughput sequencing data from clinical settings. We consider five computational approaches based on alignment (AMRPlusPlus), deep learning (DeepARG), k-mer genomic signatures (KARGA, ResFinder) or hidden Markov models (Meta-MARC). We use an extensive collection of 585 isolates with available AMR profiles determined by phenotypic tests across nine antibiotic classes. We show that the prediction landscape of AMR classifiers is highly heterogeneous, with balanced accuracy varying from 0.40 to 0.92. Although some algorithms (ResFinder, KARGA and AMRPlusPlus) exhibit overall better balanced accuracy than others, the high per-AMR-class variance and related findings suggest that: (1) all algorithms might be subject to sampling bias both in data repositories used for training and in experimental/clinical settings; and (2) a portion of clinical samples might contain uncharacterized AMR genes to which the algorithms, mostly trained on known AMR genes, fail to generalize. These results lead us to formulate practical advice for software configuration and application, and to give suggestions for future study designs to further develop AMR prediction tools from proof-of-concept to bedside.
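
For readers unfamiliar with the metric reported above, balanced accuracy for a binary resistant/susceptible call is commonly defined as the mean of sensitivity and specificity. The sketch below assumes that standard definition (the abstract does not spell out the exact formula used), and the function name and toy labels are illustrative only.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = resistant)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)  # fraction of resistant isolates detected
    specificity = tn / (tn + fp)  # fraction of susceptible isolates cleared
    return (sensitivity + specificity) / 2


# Toy phenotypes (truth) and genotype-based predictions, not data from the study.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
score = balanced_accuracy(y_true, y_pred)
print(score)  # 0.625: sensitivity 0.75, specificity 0.5
```

Unlike plain accuracy, this score is not inflated when one class dominates, which matters when resistant isolates are rare for a given antibiotic class.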


Subjects
Anti-Bacterial Agents , Drug Resistance, Bacterial , Anti-Bacterial Agents/pharmacology , Drug Resistance, Bacterial/genetics , Employment , High-Throughput Nucleotide Sequencing , Microbial Sensitivity Tests
5.
AIDS Behav ; 28(7): 2286-2295, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38551720

ABSTRACT

Substance use disorder (SUD), a common comorbidity among people with HIV (PWH), adversely affects HIV clinical outcomes and HIV-related comorbidities. However, less is known about the incidence of different chronic conditions, changes in overall comorbidity burden, and health care utilization by SUD status and patterns among PWH in Florida, an area disproportionately affected by the HIV epidemic. We used electronic health records (EHR) from a large southeastern US consortium, the OneFlorida+ clinical research data network. We identified a cohort of PWH with 3+ years of EHRs after the first visit with HIV diagnosis. International Classification of Diseases (ICD) codes were used to identify SUD and comorbidity conditions listed in the Charlson comorbidity index (CCI). A total of 42,271 PWH were included (mean age 44.5, 52% Black, 45% female). The prevalence of SUD among PWH was 45.1%. Having a SUD diagnosis among PWH was associated with a higher incidence of most of the conditions listed on the CCI and a faster increase in CCI score over time (rate ratio = 1.45, 95% CI 1.42, 1.49). SUD in PWH was associated with a higher mean number of any care visits (21.7 vs. 14.8) and more frequent emergency department (ED, 3.5 vs. 2.0) and inpatient (8.5 vs. 24.5) visits compared to those without SUD. SUD among PWH was associated with a higher comorbidity burden and more frequent ED and inpatient visits than PWH without a diagnosis of SUD. The high SUD prevalence and comorbidity burden call for improved SUD screening, treatment, and integrated care among PWH.


Subjects
Comorbidity , HIV Infections , Patient Acceptance of Health Care , Substance-Related Disorders , Humans , Female , Florida/epidemiology , Male , HIV Infections/epidemiology , Adult , Substance-Related Disorders/epidemiology , Middle Aged , Patient Acceptance of Health Care/statistics & numerical data , Prevalence , Incidence , Electronic Health Records , Cost of Illness
6.
AIDS Care ; 36(2): 248-254, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37939211

ABSTRACT

HIV-related stigma is a key contributor to poor HIV-related health outcomes. The purpose of this study is to explore implementing a stigma measure into routine HIV care focusing on the 10-item Medical Monitoring Project measure as a proposed measure. Healthcare providers engaged in HIV-related care in Florida were recruited. Participants completed an interview about their perceptions of measures to assess stigma during clinical care. The analysis followed a directed content approach. Fifteen participants completed the interviews (87% female, 47% non-Hispanic White, case manager 40%). Most providers thought that talking about stigma would be helpful (89%). Three major themes emerged from the analysis: acceptability, subscales of interest, and utility. In acceptability, participants mentioned that assessing stigma could encourage patient-centered care and serve as a conversation starter, but some mentioned not having enough time. Participants thought that the disclosure concerns and negative self-image subscales were most relevant. Some worried they would not have resources for patients or that some issues were beyond their influence. Participants were generally supportive of routinely addressing HIV-related stigma in clinical care, but were concerned that resources, especially to address concerns about disclosure and negative self-image, were not available.


Subjects
HIV Infections , Humans , Female , Male , Florida , Social Stigma , Anxiety , Disclosure
7.
BMC Public Health ; 24(1): 749, 2024 Mar 09.
Article in English | MEDLINE | ID: mdl-38459461

ABSTRACT

BACKGROUND: Racial/ethnic disparities in the HIV care continuum have been well documented in the US, with especially striking inequalities in viral suppression rates between White and Black persons with HIV (PWH). The South is considered an epicenter of the HIV epidemic in the US, with the largest population of PWH living in Florida. It is unclear whether any disparities in viral suppression or immune reconstitution (a clinical outcome highly correlated with overall prognosis) have changed over time or are homogeneous geographically. In this analysis, we 1) investigate longitudinal trends in viral suppression and immune reconstitution among PWH in Florida, 2) examine the impact of socio-ecological factors on the association between race/ethnicity and clinical outcomes, and 3) explore spatial and temporal variations in disparities in clinical outcomes. METHODS: Data were obtained from the Florida Department of Health for 42,369 PWH enrolled in the Ryan White program during 2008-2020. We linked the data to county-level socio-ecological variables available from County Health Rankings. GEE models were fit to assess the effect of race/ethnicity on immune reconstitution and viral suppression longitudinally. Poisson Bayesian hierarchical models were fit to analyze geographic variations in racial/ethnic disparities while adjusting for socio-ecological factors. RESULTS: Proportions of PWH who experienced viral suppression and immune reconstitution rose by 60% and 45%, respectively, from 2008 to 2020. Odds of immune reconstitution and viral suppression were significantly higher among White [odds ratio = 2.34, 95% credible interval = 2.14-2.56; 1.95 (1.85-2.05)] and Hispanic [1.70 (1.54-1.87); 2.18 (2.07-2.31)] PWH, compared with Black PWH. These findings remained unchanged after accounting for socio-ecological factors. Rural and urban counties in north-central Florida saw the largest racial/ethnic disparities. CONCLUSIONS: There is persistent, spatially heterogeneous, racial/ethnic disparity in HIV clinical outcomes in Florida. This disparity could not be explained by socio-ecological factors, suggesting that further research on modifiable factors that can improve HIV outcomes among Black and Hispanic PWH in Florida is needed.


Subjects
Ethnicity , HIV Infections , Humans , Bayes Theorem , Florida/epidemiology , Healthcare Disparities , Hispanic or Latino , HIV Infections/epidemiology , White People , Black or African American
8.
Bioinformatics ; 38(3): 856-860, 2022 01 12.
Article in English | MEDLINE | ID: mdl-34672334

ABSTRACT

SUMMARY: TARDiS is a novel phylogenetic tool for optimal genetic subsampling. It optimizes both genetic diversity and temporal distribution through a genetic algorithm. AVAILABILITY AND IMPLEMENTATION: TARDiS, along with example datasets and a user manual, is available at https://github.com/smarini/tardis-phylogenetics.


Subjects
Genome, Viral , Software , Phylogeny , Genetic Variation
9.
BMC Med Inform Decis Mak ; 23(1): 181, 2023 09 13.
Article in English | MEDLINE | ID: mdl-37704994

ABSTRACT

BACKGROUND: Prognostic models of hospital-induced delirium that include potential predisposing and precipitating factors may be used to identify vulnerable patients and inform the implementation of tailored preventive interventions. It is recommended that, in prediction model development studies, candidate predictors are selected on the basis of existing knowledge, including knowledge from clinical practice. The purpose of this article is to describe the process of identifying and operationalizing candidate predictors of hospital-induced delirium for application in a prediction model development study using a practice-based approach. METHODS: This study is part of a larger, retrospective cohort study that is developing prognostic models of hospital-induced delirium for medical-surgical older adult patients using structured data from administrative and electronic health records. First, we conducted a review of the literature to identify clinical concepts that had been used as candidate predictors in prognostic model development-and-validation studies of hospital-induced delirium. Then, we consulted a multidisciplinary task force of nine members who independently judged whether each clinical concept was associated with hospital-induced delirium. Finally, we mapped the clinical concepts to the administrative and electronic health records and operationalized our candidate predictors. RESULTS: In the review of 34 studies, we identified 504 unique clinical concepts. Two-thirds of the clinical concepts (337/504) were used as candidate predictors only once. The most common clinical concepts included age (31/34), sex (29/34), and alcohol use (22/34). 96% of the clinical concepts (484/504) were judged to be associated with the development of hospital-induced delirium by at least two members of the task force. All of the task force members agreed that 47 (9%) of the 504 clinical concepts were associated with hospital-induced delirium. CONCLUSIONS: Heterogeneity among candidate predictors of hospital-induced delirium in the literature suggests a still-evolving list of factors that contribute to the development of this complex phenomenon. We demonstrated a practice-based approach to variable selection for our model development study of hospital-induced delirium. Expert judgement of variables enabled us to categorize the variables based on the amount of agreement among the experts and plan for the development of different models, including an expert model and a data-driven model.


Subjects
Advisory Committees , Delirium , Humans , Aged , Retrospective Studies , Alcohol Drinking , Hospitals , Delirium/diagnosis
10.
Clin Infect Dis ; 75(9): 1618-1627, 2022 10 29.
Article in English | MEDLINE | ID: mdl-35271704

ABSTRACT

BACKGROUND: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) Delta variant has caused a dramatic resurgence in infections in the United States, raising questions regarding potential transmissibility among vaccinated individuals. METHODS: Between October 2020 and July 2021, we sequenced 4439 SARS-CoV-2 full genomes, 23% of all known infections in Alachua County, Florida, including 109 vaccine breakthrough cases. Univariate and multivariate regression analyses were conducted to evaluate associations between viral RNA burden and patient characteristics. Contact tracing and phylogenetic analysis were used to investigate direct transmissions involving vaccinated individuals. RESULTS: The majority of breakthrough sequences with lineage assignment were classified as Delta variants (74.6%) and occurred, on average, about 3 months (104 ± 57.5 days) after full vaccination, at the same time (June-July 2021) as the exponential spread of the Delta variant within the county. Six Delta variant transmission pairs between fully vaccinated individuals were identified through contact tracing, 3 of which were confirmed by phylogenetic analysis. Delta breakthroughs exhibited broad viral RNA copy number values during acute infection (interquartile range, 1.2-8.64 Log copies/mL), on average 38% lower than matched unvaccinated patients (3.29-10.81 Log copies/mL, P < .00001). Nevertheless, 49% to 50% of all breakthroughs, and 56% to 60% of Delta-infected breakthroughs, exhibited viral RNA levels above the transmissibility threshold (4 Log copies/mL) irrespective of time after vaccination. CONCLUSIONS: Delta infection transmissibility and general viral RNA quantification patterns in vaccinated individuals suggest limited levels of sterilizing immunity that need to be considered by public health policies. In particular, ongoing evaluation of vaccine boosters should specifically address whether extra vaccine doses curb breakthrough contribution to epidemic spread.


Subjects
COVID-19 , Viral Vaccines , Humans , SARS-CoV-2/genetics , RNA, Viral/genetics , Phylogeny , Florida/epidemiology , COVID-19/epidemiology , COVID-19/prevention & control , Vaccination
11.
AIDS Behav ; 26(10): 3164-3173, 2022 Oct.
Article in English | MEDLINE | ID: mdl-35362911

ABSTRACT

HIV care engagement is a dynamic process. We employed group-based trajectory modeling to examine longitudinal patterns in care engagement among people who were newly diagnosed with HIV and enrolled in the Ryan White program in Florida (n = 9,755) between 2010 and 2015. Five trajectories were identified (47.9% "in care" with 1-2 care visit(s) per 6 months, 18.0% "frequent care" with 3 or more care visits per 6 months, 11.0% "re-engage", 11.0% "gradual drop out", 12.6% "early dropout") based on the number of care attendances (including outpatient/case management visits, viral load or CD4 test) in each six-month period during the first five years after diagnosis. Relative to "in care", people in the "frequent care" trajectory were more likely to be Hispanic/Latino and older at HIV diagnosis, whereas people in the three suboptimal care retention trajectories were more likely to be younger. Area deprivation index, rurality, and county health rankings were also strongly associated with care trajectories. Individual- and community-level factors associated with the three suboptimal care retention trajectories, if confirmed to be causative and actionable, could be prioritized to improve HIV care engagement.


Subjects
HIV Infections , Retention in Care , Case Management , Florida/epidemiology , HIV Infections/diagnosis , HIV Infections/drug therapy , HIV Infections/epidemiology , Humans , Viral Load
12.
J Infect Dis ; 223(5): 866-875, 2021 03 03.
Article in English | MEDLINE | ID: mdl-32644119

ABSTRACT

BACKGROUND: Persons living with human immunodeficiency virus (HIV) with resistance to antiretroviral therapy are vulnerable to adverse HIV-related health outcomes and can contribute to transmission of HIV drug resistance (HIVDR) when not virally suppressed. The degree to which HIVDR contributes to disease burden in Florida, the US state with the highest HIV incidence, is unknown. METHODS: We explored sociodemographic, ecological, and spatiotemporal associations of HIVDR. HIV-1 sequences (n = 34 447) collected during 2012-2017 were obtained from the Florida Department of Health. HIVDR was categorized by resistance class, including resistance to nucleoside reverse-transcriptase, nonnucleoside reverse-transcriptase, protease, and integrase inhibitors. Multidrug resistance and transmitted drug resistance were also evaluated. Multivariable fixed-effects logistic regression models were fitted to associate individual- and county-level sociodemographic and ecological health indicators with HIVDR. RESULTS: The HIVDR prevalence was 19.2% (nucleoside reverse-transcriptase inhibitor resistance), 29.7% (nonnucleoside reverse-transcriptase inhibitor resistance), 6.6% (protease inhibitor resistance), 23.5% (transmitted drug resistance), 13.2% (multidrug resistance), and 8.2% (integrase strand transfer inhibitor resistance), with significant variation by Florida county. Individuals who were older, black, or acquired HIV through mother-to-child transmission had significantly higher odds of HIVDR. HIVDR was linked to counties with lower socioeconomic status, higher rates of unemployment, and poor mental health. CONCLUSIONS: Our findings indicate that HIVDR prevalence is higher in Florida than aggregate North American estimates, with significant geographic and socioecological heterogeneity.


Subjects
Anti-HIV Agents , Drug Resistance, Viral , HIV Infections , HIV-1 , Anti-HIV Agents/therapeutic use , DNA-Directed RNA Polymerases , Florida/epidemiology , HIV Infections/drug therapy , HIV Infections/epidemiology , HIV-1/drug effects , HIV-1/genetics , Humans , Infectious Disease Transmission, Vertical , Mutation , Nucleosides/therapeutic use , Retrospective Studies , Reverse Transcriptase Inhibitors/therapeutic use , Sociodemographic Factors , Spatio-Temporal Analysis
13.
BMC Bioinformatics ; 22(1): 445, 2021 Sep 18.
Article in English | MEDLINE | ID: mdl-34537012

ABSTRACT

BACKGROUND: Identification of motifs and quantification of their occurrences are important for the study of genetic diseases, gene evolution, transcription sites, and other biological mechanisms. Exact formulae for estimating count distributions of motifs under Markovian assumptions have high computational complexity and are impractical to use on large motif sets. Approximated formulae, e.g. based on compound Poisson, are faster, but reliable p value calculation remains challenging. Here, we introduce 'motif_prob', a fast implementation of an exact formula for motif count distribution through progressive approximation with arbitrary precision. Our implementation speeds up the exact calculation, usually impractical, making it feasible and positioning it to replace currently employed heuristics. RESULTS: We implement motif_prob in both Perl and C++, using an efficient error-bound iterative process for the exact formula, and provide a comparison with state-of-the-art tools (e.g. MoSDi) in terms of precision and run time benchmarks, along with a real-world use case on bacterial motif characterization. Our software is able to process a million motifs (13-31 bases) over genome lengths of 5 million bases within a minute on a regular laptop, and the run times for both the Perl and C++ code are several orders of magnitude smaller (50-1000× faster) than MoSDi, even when using their fast compound Poisson approximation (60-120× faster). In the real-world use cases, we first show the consistency of motif_prob with MoSDi, and then how the p-value quantification is crucial for enrichment quantification when bacteria have different GC content, using motifs found in antimicrobial resistance genes. The software and the code sources are available under the MIT license at https://github.com/DataIntellSystLab/motif_prob. CONCLUSIONS: The motif_prob software is a multi-platform and efficient open source solution for calculating exact frequency distributions of motifs. It can be integrated with motif discovery/characterization tools for quantifying enrichment and deviation from expected frequency ranges with exact p values, without loss in data processing efficiency.


Subjects
Algorithms , Software
14.
Bioinformatics ; 36(16): 4399-4405, 2020 08 15.
Article in English | MEDLINE | ID: mdl-32277811

ABSTRACT

MOTIVATION: Oxford Nanopore technologies (ONT) add miniaturization and real-time operation to high-throughput sequencing. All available software for ONT data analytics runs on cloud/clusters or personal computers. Instead, a linchpin to true portability is software that works on mobile devices independently of internet connections. Smartphones' and tablets' chipset/memory/operating systems differ from desktop computers, but software can be recompiled. We sought to understand how portable current ONT analysis methods are. RESULTS: Several tools, from base-calling to genome assembly, were ported and benchmarked on an Android smartphone. Out of 23 programs, 11 succeeded. Recompilation failures included lack of standard headers and unsupported instruction sets. Only DSK, BCALM2 and Kraken were able to process files up to 16 GB, with linearly scaling CPU-times. However, peak CPU temperatures were high. In conclusion, the portability scenario is not favorable. Given the fast market growth, attention of developers to ARM chipsets and Android/iOS is warranted, as well as initiatives to implement mobile-specific libraries. AVAILABILITY AND IMPLEMENTATION: The source code is freely available at: https://github.com/marco-oliva/portable-nanopore-analytics.


Subjects
Nanopores , Benchmarking , High-Throughput Nucleotide Sequencing , Sequence Analysis, DNA , Software
15.
J Neurovirol ; 27(1): 101-115, 2021 02.
Article in English | MEDLINE | ID: mdl-33405206

ABSTRACT

Despite improvements in antiretroviral therapy, human immunodeficiency virus type 1 (HIV-1)-associated neurocognitive disorders (HAND) remain prevalent in subjects undergoing therapy. HAND significantly affects individuals' quality of life, as well as adherence to therapy, and, despite the increasing understanding of neuropathogenesis, no definitive diagnostic or prognostic marker has been identified. We investigated transcriptomic profiles in frontal cortex tissues of Simian immunodeficiency virus (SIV)-infected Rhesus macaques sacrificed at different stages of infection. Gene expression was compared among SIV-infected animals (n = 11), with or without CD8+ lymphocyte depletion, based on detectable (n = 6) or non-detectable (n = 5) presence of the virus in frontal cortex tissues. Significant enrichment in activation of monocyte and macrophage cellular pathways was found in animals with detectable brain infection, independently from CD8+ lymphocyte depletion. In addition, transcripts of four poly (ADP-ribose) polymerases (PARPs) were up-regulated in the frontal cortex, which was confirmed by real-time polymerase chain reaction. Our results shed light on involvement of PARPs in SIV infection of the brain and their role in SIV-associated neurodegenerative processes. Inhibition of PARPs may provide an effective novel therapeutic target for HIV-related neuropathology.


Subjects
Cognition Disorders/virology , Frontal Lobe/metabolism , Frontal Lobe/virology , Poly(ADP-ribose) Polymerases/metabolism , Simian Acquired Immunodeficiency Syndrome/metabolism , Animals , Cognition Disorders/metabolism , Macaca mulatta , Male , Simian Acquired Immunodeficiency Syndrome/virology
16.
J Biomed Inform ; 115: 103689, 2021 03.
Article in English | MEDLINE | ID: mdl-33548542

ABSTRACT

Learning causal effects from observational data, e.g. estimating the effect of a treatment on survival by data-mining electronic health records (EHRs), can be biased due to unmeasured confounders, mediators, and colliders. When the causal dependencies among features/covariates are expressed in the form of a directed acyclic graph, using do-calculus it is possible to identify one or more adjustment sets for eliminating the bias on a given causal query under certain assumptions. However, prior knowledge of the causal structure might be only partial; algorithms for causal structure discovery often provide ambiguous solutions, and their computational complexity becomes practically intractable when the feature sets grow large. We hypothesize that the estimation of the true causal effect of a causal query on an outcome can be approximated as an ensemble of lower complexity estimators, namely bagged random causal networks. A bagged random causal network is an ensemble of subnetworks constructed by sampling the feature subspaces (with the query, the outcome, and a random number of other features), drawing conditional dependencies among the features, and inferring the corresponding adjustment sets. The causal effect can then be estimated by any regression function of the outcome by the query paired with the adjustment sets. Through simulations and a real-world clinical dataset (class III malocclusion data), we show that the bagged estimator is, in most cases, consistent with the true causal effect if the structure is known, has a good variance/bias trade-off when the structure is unknown (estimated using heuristics), has lower computational complexity than learning a full network, and outperforms boosted regression. In conclusion, the bagged random causal network is well-suited to estimate query-target causal effects from observational studies on EHR and other high-dimensional biomedical databases.


Subjects
Algorithms , Bias , Causality
17.
Environ Res ; 197: 111185, 2021 06.
Article in English | MEDLINE | ID: mdl-33901445

ABSTRACT

An individual's health and conditions are associated with a complex interplay between the individual's genetics and his or her exposures to both internal and external environments. Much attention has been placed on characterizing the genome in the past; nevertheless, genetics accounts for only about 10% of an individual's health conditions, while the remainder appears to be determined by environmental factors and gene-environment interactions. To comprehensively understand the causes of diseases and prevent them, environmental exposures, especially the external exposome, need to be systematically explored. However, the heterogeneity of the external exposome data sources (e.g., the same exposure variable named differently in different data sources or, conversely, two variables with the same or similar name that measure different exposures) increases the difficulty of analyzing and understanding the associations between environmental exposures and health outcomes. To address this issue, semantic standards developed with an ontology-driven approach are essential because ontologies can (1) provide an unambiguous and consistent understanding of the variables in heterogeneous data sources, and (2) explicitly express and model the context of the variables and the relationships between those variables. We conducted a review of existing ontologies for the external exposome and found only four relevant ontologies. Further, the four existing ontologies are limited: they (1) often ignored the spatiotemporal characteristics of external exposome data, and (2) were developed in isolation from other conceptual frameworks (e.g., the socioecological model and the social determinants of health). Moving forward, the combination of multi-domain and multi-scale data (i.e., genome, phenome, and exposome at different granularities) and different conceptual frameworks will be the basis of future health outcomes research.


Subjects
Exposome , Causality , Environmental Exposure , Female , Humans , Male , Semantics
18.
Bioinformatics ; 35(11): 1963-1965, 2019 06 01.
Article in English | MEDLINE | ID: mdl-30358807

ABSTRACT

SUMMARY: UGENE is a free, open-source, cross-platform bioinformatics software package. UGENE provides pre-defined pipelines and a flexible tool for designing new workflows and visually building multi-step analytics pipelines. The new UGENE v.1.31 release offers graphical, user-friendly wrapping of a number of popular command-line metagenomics classification programs (Kraken, CLARK, DIAMOND), combinable serially and in parallel through the workflow designer, with multiple, customizable reference databases. Ensemble classification voting is available through the WEVOTE algorithm, with augmented output in the form of detailed table reports. Pre-built workflows (which include all steps from data cleaning to summaries) are included with the installation, and a tutorial is available on the UGENE website. Further expansion with multiple visualization tools for reports is planned. AVAILABILITY AND IMPLEMENTATION: UGENE is available at http://ugene.net/, implemented in C++ and Qt, and released under the GNU General Public License (GPL) version 2.


Subjects
Metagenomics , Software , Algorithms , Computational Biology , Workflow
19.
J Asthma ; 57(11): 1155-1167, 2020 11.
Article in English | MEDLINE | ID: mdl-31288571

ABSTRACT

Objectives: To identify prodromal correlates of asthma as compared to chronic obstructive pulmonary disease and allied conditions (COPDAC) using a multi-domain analysis of socio-ecological, clinical, and demographic domains. Methods: This is a retrospective case-risk-control study using data from Florida's statewide Healthcare Cost and Utilization Project (HCUP). Patients were grouped into three groups: asthma, COPDAC (without asthma), and neither asthma nor COPDAC. To identify socio-ecological, clinical, and demographic predictors of asthma and COPDAC, we used univariate analysis, feature ranking by bootstrapped information gain ratio, multivariable logistic regression with LogitBoost selection, decision trees, and random forests. Results: A total of 141,729 patients met inclusion criteria, of whom 56,052 were diagnosed with asthma, 85,677 with COPDAC, and 84,737 with neither asthma nor COPDAC. The multi-domain approach proved superior in distinguishing asthma versus COPDAC and non-asthma/non-COPDAC controls (area under the receiver operating characteristic curve (AUROC) 84%). The best domain to distinguish asthma from COPDAC without controls was prior clinical diagnoses (AUROC 82%). Ranking variables from all the domains showed that the most important predictors for asthma versus COPDAC and controls were primarily socio-ecological variables, while for asthma versus COPDAC without controls, demographic and clinical variables such as age, CCI, and prior clinical diagnoses scored better. Conclusions: In this large statewide study using a machine learning approach, we found that a multi-domain approach with demographic, clinical, and socio-ecological variables best predicted an asthma diagnosis. Future work should focus on integrating machine learning-generated predictive models into clinical practice to improve early detection of these common respiratory diseases.
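One of the methods named in this abstract, feature ranking by bootstrapped information gain ratio, can be sketched as follows. This is a generic illustration and not the study's code: the synthetic data, the feature names (`informative`, `noise`), the class labels, and the function names are hypothetical, and the sketch assumes categorical features. The gain ratio is the information gain of a feature divided by the entropy of the feature itself (its split information), and the ranking averages that score over bootstrap resamples.

```python
import math
import random
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def gain_ratio(feature, labels):
    """Information gain of `feature` about `labels`, divided by the feature's split information."""
    n = len(labels)
    groups = {}
    for f, y in zip(feature, labels):
        groups.setdefault(f, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    split_info = entropy(feature)
    if split_info == 0:  # feature is constant in this sample
        return 0.0
    return (entropy(labels) - conditional) / split_info

def bootstrap_rank(features, labels, n_boot=200, seed=0):
    """Rank features by their mean gain ratio over bootstrap resamples of the rows."""
    rng = random.Random(seed)
    n = len(labels)
    totals = {name: 0.0 for name in features}
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        for name, col in features.items():
            totals[name] += gain_ratio([col[i] for i in idx], ys)
    means = {name: total / n_boot for name, total in totals.items()}
    return sorted(means.items(), key=lambda kv: kv[1], reverse=True)

def make_toy_data(n=300, seed=42):
    """Synthetic data (assumed): one binary feature related to the label, one unrelated."""
    rng = random.Random(seed)
    informative, noise, labels = [], [], []
    for _ in range(n):
        flag = rng.random() < 0.4
        informative.append(flag)
        noise.append(rng.random() < 0.5)
        labels.append("B" if rng.random() < (0.8 if flag else 0.2) else "A")
    return {"informative": informative, "noise": noise}, labels

if __name__ == "__main__":
    features, labels = make_toy_data()
    for name, score in bootstrap_rank(features, labels):
        print(f"{name}: {score:.3f}")
```

Averaging over resamples makes the ranking less sensitive to a single draw of the data. Continuous variables would first need to be discretized, a step this sketch leaves out.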


Subjects
Asthma/diagnosis , Machine Learning , Biological Models , Healthcare Administrative Claims/statistics & numerical data , Adult , Asthma/epidemiology , Big Data , Case-Control Studies , Early Diagnosis , Female , Florida/epidemiology , Humans , Longitudinal Studies , Male , Middle Aged , ROC Curve , Retrospective Studies , Risk Assessment/methods , Risk Assessment/statistics & numerical data , Risk Factors , Socioeconomic Factors
20.
Environ Res ; 183: 109275, 2020 04.
Article in English | MEDLINE | ID: mdl-32105887

ABSTRACT

Environment-wide association studies (EWAS) are an untargeted, agnostic, and hypothesis-generating approach to exploring environmental factors associated with health outcomes, akin to genome-wide association studies (GWAS). While design, methodology, and replicability standards for GWAS are established, EWAS pose many challenges. We systematically reviewed published literature on EWAS to categorize scope, impact, types of analytical approaches, and open challenges in designs and methodologies. The Web of Science and PubMed databases were searched through multiple queries to identify EWAS articles published between January 2010 and December 2018, and a systematic review was conducted following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) reporting standard. Twenty-three articles met our inclusion criteria and were included. For each study, we categorized the data sources, the definitions of study outcomes, the sets of environmental variables, and the data engineering/analytical approaches, e.g. neighborhood definition, variable standardization, handling of multiple hypothesis testing, model selection, and validation. We identified limited exploitation of data sources, high heterogeneity in analytical approaches, and lack of replication. Despite the promising utility of EWAS, their further development will require improved data sources, standardization of study designs, and rigorous testing of methodologies.


Subjects
Environmental Exposure , Environmental Health , Adolescent , Adult , Aged , Aged 80 and over , Child , Preschool Child , China , Cohort Studies , Female , Genome-Wide Association Study , Humans , Newborn Infant , Male , Middle Aged , Nutrition Surveys , Pregnancy , Prospective Studies , Young Adult