Results 1 - 20 of 143
1.
PLoS Genet ; 20(5): e1011273, 2024 May.
Article in English | MEDLINE | ID: mdl-38728357

ABSTRACT

Existing imaging genetics studies have been mostly limited in scope by using imaging-derived phenotypes defined by human experts. Here, leveraging new breakthroughs in self-supervised deep representation learning, we propose a new approach, image-based genome-wide association study (iGWAS), for identifying genetic factors associated with phenotypes discovered from medical images using contrastive learning. Using retinal fundus photos, our model extracts a 128-dimensional vector representing features of the retina as phenotypes. After training the model on 40,000 images from the EyePACS dataset, we generated phenotypes from 130,329 images of 65,629 British White participants in the UK Biobank. We conducted GWAS on these phenotypes and identified 14 loci with genome-wide significance (p < 5×10⁻⁸ and intersection of hits from left and right eyes). We also conducted GWAS on retina color, the average color of the center region of the retinal fundus photos. The GWAS of retina color identified 34 loci, 7 of which overlap with the GWAS of the raw image phenotypes. Our results establish the feasibility of this new framework of genomic study based on self-supervised phenotyping of medical images.
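The retina color phenotype is described only as the average color of the center region of a fundus photo. A minimal sketch of that idea follows, assuming the image is a grid of (R, G, B) tuples and that the "center region" is a centered rectangle covering a hypothetical `fraction` of each dimension; the abstract does not give the actual region size or the authors' implementation.

```python
def center_mean_color(image, fraction=0.5):
    """Mean (R, G, B) of a centered rectangle of the image.

    image: list of rows, each row a list of (r, g, b) tuples.
    fraction: share of height and width kept (an assumption, not from the abstract).
    """
    height, width = len(image), len(image[0])
    crop_h = max(1, int(height * fraction))
    crop_w = max(1, int(width * fraction))
    top = (height - crop_h) // 2
    left = (width - crop_w) // 2
    pixels = [image[y][x]
              for y in range(top, top + crop_h)
              for x in range(left, left + crop_w)]
    count = len(pixels)
    # Average each channel separately over the cropped pixels.
    return tuple(sum(p[c] for p in pixels) / count for c in range(3))
```

The three channel means could then serve as quantitative traits in a standard GWAS.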


Subjects
Fundus Oculi , Genome-Wide Association Study , Phenotype , Retina , Humans , Genome-Wide Association Study/methods , Retina/diagnostic imaging , Male , Single Nucleotide Polymorphism , Female , Computer-Assisted Image Processing/methods
2.
Genome Res ; 33(7): 1007-1014, 2023 07.
Article in English | MEDLINE | ID: mdl-37316352

ABSTRACT

The Li and Stephens (LS) hidden Markov model (HMM) models the process of reconstructing a haplotype as a mosaic copy of haplotypes in a reference panel. For small panels, the probabilistic parameterization of LS enables modeling the uncertainties of such mosaics. However, LS becomes inefficient when sample size is large, because of its linear time complexity. Recently the PBWT, an efficient data structure capturing the local haplotype matching among haplotypes, was proposed to offer a fast method for finding an optimal (Viterbi) solution to the LS HMM. Previously, we introduced the minimal positional substring cover (MPSC) problem as an alternative formulation of LS whose objective is to cover a query haplotype by a minimum number of segments from haplotypes in a reference panel. The MPSC formulation allows the generation of a haplotype threading in time that is constant with respect to sample size (O(N)). This allows haplotype threading on very large biobank-scale panels on which the LS model is infeasible. Here, we present new results on the solution space of the MPSC. In addition, we derived a number of optimal algorithms for MPSC, including solution enumerations, the length maximal MPSC, and h-MPSC solutions. In doing so, our algorithms reveal the solution space of LS for large panels. We show that our method is informative in terms of revealing the characteristics of biobank-scale data sets and can improve genotype imputation.
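To make the MPSC objective concrete, here is a naive sketch that covers a query with as few panel segments as possible by always taking the longest match from the current site. It scans every panel haplotype at each step, so it does not have the sample-size-independent running time the abstract reports; the function name and the string encoding of haplotypes are assumptions made for illustration.

```python
def greedy_cover(query, panel):
    """Cover `query` with a minimum number of positional matches to panel haplotypes.

    query: string of alleles; panel: list of equal-length allele strings.
    Returns a list of (start, end, hap_index) with `end` exclusive,
    or None if some site of the query matches no haplotype.
    """
    n = len(query)
    cover = []
    start = 0
    while start < n:
        best_end, best_hap = start, None
        for h, hap in enumerate(panel):
            end = start
            # Extend the match with haplotype h as far as it goes.
            while end < n and hap[end] == query[end]:
                end += 1
            if end > best_end:
                best_end, best_hap = end, h
        if best_hap is None:
            return None  # no haplotype matches the query at this site
        cover.append((start, best_end, best_hap))
        start = best_end
    return cover
```

Taking the longest extension at each step yields a minimum-size cover because any suffix of a positional match is itself a match, so jumping further never rules out a later segment.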


Subjects
Algorithms , Software , Humans , Haplotypes , Genotype , Ethnicity
3.
Genome Res ; 33(7): 1015-1022, 2023 07.
Article in English | MEDLINE | ID: mdl-37349109

ABSTRACT

Although rates of recombination events across the genome (genetic maps) are fundamental to genetic research, the majority of current studies only use one standard map. There is evidence suggesting population differences in genetic maps, and thus estimating population-specific maps is of interest. Although the recent availability of biobank-scale data offers such opportunities, current methods are not efficient at leveraging very large sample sizes. The most accurate methods are still linkage disequilibrium (LD)-based methods that are only tractable for a few hundred samples. In this work, we propose a fast and memory-efficient method for estimating genetic maps from population genotyping data. Our method, FastRecomb, leverages the efficient positional Burrows-Wheeler transform (PBWT) data structure for counting IBD segment boundaries as potential recombination events. We used PBWT blocks to avoid redundant counting of pairwise matches. Moreover, we used a panel-smoothing technique to reduce the noise from errors and recent mutations. Using simulation, we found that FastRecomb achieves state-of-the-art performance at 10-kb resolution, in terms of correlation coefficients between the estimated map and the ground truth. This is mainly because FastRecomb can effectively take advantage of large panels comprising hundreds of thousands of haplotypes, whereas other methods lack the efficiency to handle such data. We believe further refinement of FastRecomb would deliver more accurate genetic maps for the genetics community.
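The core counting idea, treating IBD segment boundaries as potential recombination events, can be sketched as a histogram over fixed-width bins. This is only an illustration under stated assumptions: segments arrive as precomputed (start_bp, end_bp) pairs, boundaries at the chromosome ends are ignored, and the PBWT-block deduplication and panel smoothing mentioned in the abstract are not modeled.

```python
def boundary_counts(segments, chrom_length, bin_size=10_000):
    """Count IBD segment boundaries per bin along one chromosome.

    segments: iterable of (start_bp, end_bp) pairs (hypothetical input format).
    Boundaries at position 0 or chrom_length are skipped, since they mark
    the end of the chromosome rather than a recombination event.
    """
    num_bins = (chrom_length + bin_size - 1) // bin_size
    counts = [0] * num_bins
    for start, end in segments:
        for boundary in (start, end):
            if 0 < boundary < chrom_length:
                counts[boundary // bin_size] += 1
    return counts
```

Normalizing these counts (for example, to proportions of the chromosome total) gives a relative recombination rate per bin.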


Subjects
Biological Specimen Banks , Genome , Haplotypes , Linkage Disequilibrium , Single Nucleotide Polymorphism , Genetic Recombination
4.
PLoS Genet ; 19(12): e1011057, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38039339

ABSTRACT

Although genome-wide association studies (GWAS) have identified tens of thousands of genetic loci, the genetic architecture is still not fully understood for many complex traits. Most GWAS and sequencing association studies have focused on single nucleotide polymorphisms or copy number variations, including common and rare genetic variants. However, phased haplotype information is often ignored in GWAS or variant set tests for rare variants. Here we leverage the identity-by-descent (IBD) segments inferred from a random projection-based IBD detection algorithm in the mapping of genetic associations with complex traits, to develop a computationally efficient statistical test for IBD mapping in biobank-scale cohorts. We used sparse linear algebra and random matrix algorithms to speed up the computation, and a genome-wide IBD mapping scan of more than 400,000 samples finished within a few hours. Simulation studies showed that our new method had well-controlled type I error rates under the null hypothesis of no genetic association in large biobank-scale cohorts, and outperformed traditional GWAS single-variant tests when the causal variants were untyped and rare, or in the presence of haplotype effects. We also applied our method to IBD mapping of six anthropometric traits using the UK Biobank data and identified a total of 3,442 associations, 2,131 (62%) of which remained significant after conditioning on suggestive tag variants in the ± 3 centimorgan flanking regions from GWAS.


Subjects
Biological Specimen Banks , Genome-Wide Association Study , Humans , Genome-Wide Association Study/methods , DNA Copy Number Variations , Haplotypes/genetics , Phenotype , Single Nucleotide Polymorphism/genetics
5.
Bioinformatics ; 39(1)2023 01 01.
Article in English | MEDLINE | ID: mdl-36440908

ABSTRACT

MOTIVATION: The positional Burrows-Wheeler transform (PBWT) has led to tremendous strides in haplotype matching on biobank-scale data. For genetic genealogical search, PBWT-based methods have optimized the asymptotic runtime of finding long matches between a query haplotype and a predefined panel of haplotypes. However, to enable fast query searches, the full-sized panel and PBWT data structures must be kept in memory, preventing existing algorithms from scaling up to modern biobank panels consisting of millions of haplotypes. In this work, we propose a space-efficient variation of PBWT named Syllable-PBWT, which divides every haplotype into syllables, builds the PBWT positional prefix arrays on the compressed syllabic panel, and leverages the polynomial rolling hash function for positional substring comparison. With the Syllable-PBWT data structures, we then present a long match query algorithm named Syllable-Query. RESULTS: Compared to the most time- and space-efficient previously published solution to the long match query problem, Syllable-Query reduced the memory use by a factor of over 100 on both the UK Biobank genotype data and the 1000 Genomes Project sequence data. Surprisingly, the smaller size of our syllabic data structures allows for more efficient iteration and CPU cache usage, granting Syllable-Query even faster runtime than existing solutions. AVAILABILITY AND IMPLEMENTATION: https://github.com/ZhiGroup/Syllable-PBWT. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
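The two ingredients named above, splitting haplotypes into syllables and comparing positional substrings with a polynomial rolling hash, can be sketched as follows. The syllable width, the zero padding of the last syllable, and the hash constants `BASE` and `MOD` are illustrative assumptions, not values taken from the Syllable-PBWT repository.

```python
MOD = (1 << 61) - 1   # large prime modulus (assumed)
BASE = 1_000_003      # hash base (assumed)

def to_syllables(hap, width):
    """Pack a 0/1 haplotype string into integers of `width` sites each."""
    syllables = []
    for i in range(0, len(hap), width):
        chunk = hap[i:i + width].ljust(width, "0")  # pad the last syllable
        syllables.append(int(chunk, 2))
    return syllables

def prefix_hashes(syllables):
    """hashes[i] is the polynomial hash of the first i syllables."""
    hashes = [0]
    for s in syllables:
        hashes.append((hashes[-1] * BASE + s + 1) % MOD)
    return hashes

def range_hash(hashes, lo, hi):
    """Hash of syllables lo..hi-1, computed from the prefix hashes."""
    return (hashes[hi] - hashes[lo] * pow(BASE, hi - lo, MOD)) % MOD
```

Two haplotypes agree on a syllable range with high probability when their `range_hash` values over that range are equal, which replaces a site-by-site comparison with a constant number of arithmetic operations.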


Subjects
Algorithms , Genome , Haplotypes , Genotype , Software , DNA Sequence Analysis/methods
6.
Bioinformatics ; 39(6)2023 06 01.
Article in English | MEDLINE | ID: mdl-37166451

ABSTRACT

MOTIVATION: Due to the rapid growth of genetic databases, genealogical search, the process of inferring familial relatedness by identifying DNA matches, has become a viable approach to help individuals find missing family members or law enforcement agencies locate suspects. A fast and accurate method is needed to search an out-of-database individual against millions of individuals. Most existing approaches only offer all-versus-all matching within a panel. Some prototype algorithms offer one-versus-all queries for an out-of-panel individual, but they do not tolerate errors. RESULTS: A new method, random projection-based identity-by-descent (IBD) detection (RaPID) query, is introduced to make fast genealogical search possible. RaPID-Query identifies IBD segments between a query haplotype and a panel of haplotypes. By integrating matches over multiple PBWT indexes, RaPID-Query manages to locate IBD segments quickly with a given cutoff length while allowing mismatched sites. A single query against all UK Biobank autosomal chromosomes was completed within 2.76 seconds on average, with a minimum length of 7 cM and 700 markers. RaPID-Query achieved a 0.016 false negative rate and a 0.012 false positive rate simultaneously on a chromosome 20 sequencing panel having 86 265 sites. This is comparable to the state-of-the-art IBD detection methods TPBWT (out-of-sample) and Hap-IBD. The high-quality IBD segments yielded by RaPID-Query were able to distinguish up to the fourth degree of familial relatedness for a given individual pair, and the area under the receiver operating characteristic curve values are at least 97.28%. AVAILABILITY AND IMPLEMENTATION: The RaPID-Query program is available at https://github.com/ucfcbb/RaPID-Query.


Subjects
Algorithms , Chromosomes , Humans , Haplotypes , Sequence Analysis
7.
PLoS Genet ; 17(1): e1009315, 2021 01.
Article in English | MEDLINE | ID: mdl-33476339

ABSTRACT

Inference of relationships from whole-genome genetic data of a cohort is a crucial prerequisite for genome-wide association studies. Typically, relationships are inferred by computing the kinship coefficients (ϕ) and the genome-wide probability of zero IBD sharing (π0) among all pairs of individuals. Current leading methods are based on pairwise comparisons, which may not scale up to very large cohorts (e.g., sample size >1 million). Here, we propose an efficient relationship inference method, RAFFI. RAFFI leverages the efficient RaPID method to call IBD segments first, then estimates ϕ and π0 from the detected IBD segments. This inference is achieved by a data-driven approach that adjusts the estimation based on phasing quality and genotyping quality. Using simulations, we showed that RAFFI is robust against phasing/genotyping errors, admixture events, and varying marker densities, and achieves higher accuracy compared to KING, the current leading method, especially for more distant relatives. When applied to the phased UK Biobank data with ~500K individuals, RAFFI is approximately 18 times faster than KING. We expect RAFFI will offer fast and accurate relatedness inference for even larger cohorts.
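As a reference point for how ϕ and π0 relate to IBD sharing, the textbook relation is ϕ = k1/4 + k2/2 and π0 = 1 - k1 - k2, where k1 and k2 are the genome fractions shared IBD1 and IBD2. The sketch below applies it to segments that are assumed to be already labeled with their IBD state; it does not reproduce RAFFI's data-driven adjustment for phasing and genotyping quality.

```python
def kinship_from_ibd(segments, genome_length_cm):
    """Estimate (phi, pi0) for one pair of individuals from labeled IBD segments.

    segments: iterable of (start_cm, end_cm, state) with state 1 (IBD1) or 2 (IBD2).
    The labeled-segment input format is an assumption made for illustration.
    """
    ibd1 = sum(end - start for start, end, state in segments if state == 1)
    ibd2 = sum(end - start for start, end, state in segments if state == 2)
    k1 = ibd1 / genome_length_cm
    k2 = ibd2 / genome_length_cm
    pi0 = 1.0 - k1 - k2            # share of the genome with no IBD
    phi = 0.25 * k1 + 0.5 * k2     # kinship coefficient
    return phi, pi0
```

For example, a parent-child pair shares the whole genome IBD1, which gives ϕ = 0.25 and π0 = 0.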


Subjects
Genome-Wide Association Study/statistics & numerical data , Genotyping Techniques/statistics & numerical data , Genetic Models , Biological Specimen Banks , Human Genome/genetics , Haplotypes/genetics , Humans , Pedigree , Single Nucleotide Polymorphism/genetics
8.
Comput Inform Nurs ; 42(5): 377-387, 2024 May 01.
Article in English | MEDLINE | ID: mdl-39248448

ABSTRACT

We are in a booming era of artificial intelligence, particularly with the increased availability of technologies that can help generate content, such as ChatGPT. Healthcare institutions are discussing or have started utilizing these innovative technologies within their workflow. Major electronic health record vendors have begun to leverage large language models to process and analyze vast amounts of clinical natural language text, performing a wide range of tasks in healthcare settings to help alleviate clinicians' burden. Although such technologies can be helpful in applications such as patient education, drafting responses to patient questions and emails, medical record summarization, and medical research facilitation, there are concerns about the tools' readiness for use within the healthcare domain and acceptance by the current workforce. The goal of this article is to provide nurses with an understanding of the currently available foundation models and artificial intelligence tools, enabling them to evaluate the need for such tools and assess how they can impact current clinical practice. This will help nurses efficiently assess, implement, and evaluate these tools to ensure these technologies are ethically and effectively integrated into healthcare systems, while also rigorously monitoring their performance and impact on patient care.


Subjects
Artificial Intelligence , Electronic Health Records , Humans , Natural Language Processing , Nursing Informatics
9.
BMC Bioinformatics ; 23(Suppl 6): 281, 2022 Jul 14.
Article in English | MEDLINE | ID: mdl-35836130

ABSTRACT

BACKGROUND: Model card reports aim to provide informative and transparent descriptions of machine learning models to stakeholders. This report document is of interest to the National Institutes of Health's Bridge2AI initiative to address the FAIR challenges with artificial intelligence-based machine learning models for biomedical research. We present our early undertaking in developing an ontology for capturing the conceptual-level information embedded in model card reports. RESULTS: Sourcing from existing ontologies and developing the core framework, we generated the Model Card Report Ontology. Our development efforts yielded an OWL2-based artifact that represents and formalizes model card report information. The current release of this ontology utilizes standard concepts and properties from OBO Foundry ontologies. Also, the software reasoner indicated no logical inconsistencies with the ontology. With sample model cards of machine learning models for bioinformatics research (HIV social networks and adverse outcome prediction for stent implantation), we showed the coverage and usefulness of our model in transforming static model card reports to a computable format for machine-based processing. CONCLUSIONS: The benefit of our work is that it utilizes the expansive, standard terminologies and scientific rigor promoted by biomedical ontologists, and that it opens an avenue for making model cards machine-readable using semantic web technology. Our future goal is to assess the veracity of our model and later expand the model to include additional concepts to address terminological gaps. We discuss tools and software that will utilize our ontology for potential application services.


Subjects
Biological Ontologies , Semantics , Artificial Intelligence , Computational Biology , Machine Learning , Software
10.
Bioinformatics ; 37(16): 2390-2397, 2021 Aug 25.
Article in English | MEDLINE | ID: mdl-33624749

ABSTRACT

MOTIVATION: Durbin's positional Burrows-Wheeler transform (PBWT) is a scalable data structure for haplotype matching. It has been successfully applied to identical by descent (IBD) segment identification and genotype imputation. Once the PBWT of a haplotype panel is constructed, it supports efficient retrieval of all shared long segments among all individuals (long matches) and efficient query between an external haplotype and the panel. However, the standard PBWT is an array-based static data structure and does not support dynamic updates of the panel. RESULTS: Here, we generalize the static PBWT to a dynamic data structure, d-PBWT, where the reverse prefix sorting at each position is stored with linked lists. We also developed efficient algorithms for insertion and deletion of individual haplotypes. In addition, we verified that d-PBWT can support all algorithms of PBWT. In doing so, we systematically investigated variations of set maximal match and long match query algorithms: while they all have average case time complexity independent of database size, they have different worst case complexities and dependencies on additional data structures. AVAILABILITY AND IMPLEMENTATION: The benchmarking code is available at genome.ucf.edu/d-PBWT. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
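For orientation, the reverse prefix sorting that the static PBWT stores in arrays can be built with one stable partition per site. The sketch below covers that static, array-based construction only; the linked-list storage and the insertion and deletion algorithms of d-PBWT are not shown, and the string encoding of haplotypes is an assumption.

```python
def build_prefix_arrays(panel):
    """Return orderings a_0..a_N of haplotype indices for a bi-allelic panel.

    panel: list of equal-length strings over "0"/"1".
    a_k lists the haplotypes sorted by their reversed prefixes ending
    just before site k (a_0 is the input order).
    """
    num_haps = len(panel)
    num_sites = len(panel[0]) if num_haps else 0
    order = list(range(num_haps))
    arrays = [order]
    for k in range(num_sites):
        # Stable partition: haplotypes with allele 0 keep their relative
        # order and come first, followed by those with allele 1.
        zeros = [h for h in order if panel[h][k] == "0"]
        ones = [h for h in order if panel[h][k] == "1"]
        order = zeros + ones
        arrays.append(order)
    return arrays
```

Because every ordering is derived from the previous one in a single pass, haplotypes that share a long match ending at a site end up adjacent in the ordering for that site.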

11.
J Biomed Inform ; 133: 104166, 2022 09.
Article in English | MEDLINE | ID: mdl-35985620

ABSTRACT

Vancomycin is a commonly used antimicrobial in hospitals, and therapeutic drug monitoring (TDM) is required to optimize its efficacy and avoid toxicities. Bayesian models are currently recommended to predict the antibiotic levels. These models, however, although using carefully designed lab observations, were often developed in limited patient populations. The increasing availability of electronic health record (EHR) data offers an opportunity to develop TDM models for real-world patient populations. Here, we present a deep learning-based pharmacokinetic prediction model for vancomycin (PK-RNN-V E) using a large EHR dataset of 5,483 patients with 55,336 vancomycin administrations. PK-RNN-V E takes the patient's real-time sparse and irregular observations and offers dynamic predictions. Our results show that PK-RNN-V E offers a root mean squared error (RMSE) of 5.39 and outperforms the traditional Bayesian model (VTDM model), which has an RMSE of 6.29. We believe that PK-RNN-V E can provide a pharmacokinetic model for vancomycin and other antimicrobials that require TDM.
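The comparison above rests on root mean squared error between predicted and measured drug levels. A minimal sketch of the metric follows; the function name is illustrative and no study data are reproduced.

```python
import math

def rmse(predicted, observed):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

A lower value means predictions sit closer to the measured levels on average, with large misses penalized more heavily than small ones.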


Subjects
Deep Learning , Vancomycin , Bayes Theorem , Drug Monitoring/methods , Electronic Health Records , Humans , Vancomycin/therapeutic use
12.
BMC Biol ; 19(1): 32, 2021 02 16.
Article in English | MEDLINE | ID: mdl-33593342

ABSTRACT

BACKGROUND: The genealogical histories of individuals within populations are of interest to studies aiming both to uncover detailed pedigree information and overall quantitative population demographic histories. However, the analysis of quantitative details of individual genealogical histories has faced challenges from incomplete available pedigree records and an absence of objective and quantitative details in pedigree information. Although complete pedigree information for most individuals is difficult to track beyond a few generations, it is possible to describe a person's genealogical history using their genetic relatives revealed by identity by descent (IBD) segments: long genomic segments shared by two individuals within a population, which are identical due to inheritance from common ancestors. When modern biobanks collect genotype information for a significant fraction of a population, dense genetic connections of a person can be traced using such IBD segments, offering opportunities to characterize individuals in the context of the underlying populations. Here, we conducted an individual-centric analysis of IBD segments among the UK Biobank participants that represent 0.7% of the UK population. RESULTS: We made a high-quality call set of IBD segments over 5 cM among all 500,000 UK Biobank participants. On average, one UK individual shares IBD segments with 14,000 UK Biobank participants, which we refer to as "relatives." Using these segments, approximately 80% of a person's genome can be imputed. We subsequently propose genealogical descriptors based on the genetic connections of relative cohorts of individuals sharing at least one IBD segment and show that such descriptors offer important information about one's genetic makeup, personal genealogical history, and social behavior. Through analysis of relative counts sharing segments at different lengths, we identified a group, potentially British Jews, that has a distinct pattern of familial expansion history. Finally, using the enrichment of relatives in one's neighborhood, we identified regional variations of personal preference favoring living closer to one's extended families. CONCLUSIONS: Our analysis revealed genetic makeup, personal genealogical history, and social behaviors at the population scale, opening possibilities for further studies of individuals' genetic connections in biobank data.


Subjects
Biological Specimen Banks/statistics & numerical data , Genealogy and Heraldry , Genetic Variation , Pedigree , Humans , United Kingdom
13.
Bioinformatics ; 35(14): i233-i241, 2019 07 15.
Article in English | MEDLINE | ID: mdl-31510689

ABSTRACT

MOTIVATION: With the wide availability of whole-genome genotype data, there is an increasing need for conducting genetic genealogical searches efficiently. Computationally, this task amounts to identifying shared DNA segments between a query individual and a very large panel containing millions of haplotypes. The celebrated Positional Burrows-Wheeler Transform (PBWT) data structure is a pre-computed index of the panel that enables constant time matching at each position between one haplotype and an arbitrarily large panel. However, the existing algorithm (Durbin's Algorithm 5) can only identify set-maximal matches, the longest matches ending at any location in a panel, while in real genealogical search scenarios, multiple 'good enough' matches are desired. RESULTS: In this work, we developed two algorithmic extensions of Durbin's Algorithm 5 that can find all L-long matches, matches longer than or equal to a given length L, between a query and a panel. In the first algorithm, PBWT-Query, we introduce 'virtual insertion' of the query into the PBWT matrix of the panel, and then scan up and down for the PBWT match blocks with length greater than L. In our second algorithm, L-PBWT-Query, we further speed up PBWT-Query by introducing additional data structures that allow us to avoid iterating through blocks of incomplete matches. The efficiency of PBWT-Query and L-PBWT-Query is demonstrated using simulated data and the UK Biobank data. Our results show that our proposed algorithms can detect related individuals for a given query efficiently in very large cohorts, which enables fast online query search. AVAILABILITY AND IMPLEMENTATION: genome.ucf.edu/pbwt-query. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
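The output that PBWT-Query and L-PBWT-Query are described as producing, all L-long matches between a query and a panel, can be pinned down with a brute-force reference. The sketch below scans every haplotype site by site, so its cost grows with panel size and it is not either of the published algorithms; it is useful mainly as a correctness check on small inputs.

```python
def long_matches(query, panel, min_len):
    """All maximal matches of at least `min_len` sites between query and panel.

    Returns (hap_index, start, end) triples with `end` exclusive.
    """
    matches = []
    num_sites = len(query)
    for h, hap in enumerate(panel):
        start = 0
        for k in range(num_sites + 1):
            # A run of agreement ends at a mismatch or at the end of the query.
            if k == num_sites or hap[k] != query[k]:
                if k - start >= min_len:
                    matches.append((h, start, k))
                start = k + 1
    return matches
```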


Subjects
Haplotypes , Algorithms , Genome , Genotype , Software
14.
J Am Coll Nutr ; 39(1): 47-53, 2020 01.
Article in English | MEDLINE | ID: mdl-31498715

ABSTRACT

Objective: To investigate gut microbial composition in Latino infants in relation to breastfeeding, obesity, and antibiotic exposure. Method: We analyzed the gut microbiome in 6-month-old Latino infants from an ongoing urban mother-child cohort. Alpha and beta diversity were assessed in relation to infants' early dietary exposure and anthropometrics including obesity. Results: Infants exclusively breastfed at 4 to 6 weeks had lower alpha diversity and less bacterial abundance compared with those who were not. Breastfeeding status at 4 to 6 weeks and 6 months of age accounted for differences in alpha and beta diversity. Infants who were obese at 6 months of age had higher levels of alpha diversity compared with non-obese infants. Conclusions: Early exclusive breastfeeding and obesity impact microbial diversity by 6 months of age in Latino infants, a group at high risk for future obesity.


Subjects
Feeding Behavior/physiology , Gastrointestinal Microbiome/genetics , Hispanic or Latino/statistics & numerical data , Pediatric Obesity/ethnology , Pediatric Obesity/microbiology , Anthropometry , Anti-Bacterial Agents/adverse effects , Breast Feeding , Dietary Exposure/adverse effects , Feces/microbiology , Female , Humans , Infant , Linear Models , Male , 16S Ribosomal RNA/analysis
15.
Neurosurg Focus ; 48(5): E4, 2020 05 01.
Article in English | MEDLINE | ID: mdl-32357322

ABSTRACT

OBJECTIVE: Subarachnoid hemorrhage (SAH) is a devastating cerebrovascular condition, not only due to the effect of initial hemorrhage, but also due to the complication of delayed cerebral ischemia (DCI). While hypertension facilitated by vasopressors is often initiated to prevent DCI, which vasopressor is most effective in improving outcomes is not known. The objective of this study was to determine associations between initial vasopressor choice and mortality in patients with nontraumatic SAH. METHODS: The authors conducted a retrospective cohort study using a large, national electronic medical record data set from 2000-2014 to identify patients with a new diagnosis of nontraumatic SAH (based on ICD-9 codes) who were treated with the vasopressors dopamine, phenylephrine, or norepinephrine. The relationship between the initial choice of vasopressor therapy and the primary outcome, which was defined as in-hospital death or discharge to hospice care, was examined. RESULTS: In total, 2634 patients were identified with nontraumatic SAH who were treated with a vasopressor. In this cohort, the average age was 56.5 years, 63.9% were female, and 36.5% of patients developed the primary outcome. The incidence of the primary outcome was higher in those initially treated with either norepinephrine (47.6%) or dopamine (50.6%) than with phenylephrine (24.5%). After adjusting for possible confounders using propensity score methods, the adjusted OR of the primary outcome was higher with dopamine (OR 2.19, 95% CI 1.70-2.81) and norepinephrine (OR 2.24, 95% CI 1.80-2.80) compared with phenylephrine. Sensitivity analyses using different variable selection procedures, causal inference models, and machine-learning methods confirmed the main findings. CONCLUSIONS: In patients with nontraumatic SAH, phenylephrine was significantly associated with reduced mortality compared to dopamine or norepinephrine. Prospective randomized clinical studies are warranted to confirm this finding.


Subjects
Dopamine/therapeutic use , Electronic Health Records , Norepinephrine/therapeutic use , Phenylephrine/therapeutic use , Subarachnoid Hemorrhage/drug therapy , Vasoconstrictor Agents/therapeutic use , Adult , Aged , Female , Glasgow Coma Scale , Hospital Mortality , Humans , Logistic Models , Male , Middle Aged , Patient Discharge/statistics & numerical data , Retrospective Studies , Subarachnoid Hemorrhage/complications , Subarachnoid Hemorrhage/mortality
16.
J Med Internet Res ; 22(7): e16981, 2020 07 31.
Article in English | MEDLINE | ID: mdl-32735224

ABSTRACT

BACKGROUND: Asthma exacerbation is an acute or subacute episode of progressive worsening of asthma symptoms and can have a significant impact on patients' quality of life. However, efficient methods that can help identify personalized risk factors and make early predictions are lacking. OBJECTIVE: This study aims to use advanced deep learning models to better predict the risk of asthma exacerbations and to explore potential risk factors involved in progressive asthma. METHODS: We proposed a novel time-sensitive, attentive neural network to predict asthma exacerbation using clinical variables from large electronic health records. The clinical variables were collected from the Cerner Health Facts database between 1992 and 2015, including 31,433 adult patients with asthma. Interpretations on both patient and cohort levels were investigated based on the model parameters. RESULTS: The proposed model obtained an area under the curve value of 0.7003 through five-fold cross-validation, which outperformed the baseline methods. The results also demonstrated that the addition of elapsed time embeddings considerably improved the prediction performance. Further analysis revealed diverse distributions of contributing factors across patients as well as some possible cohort-level risk factors, such as respiratory diseases and esophageal reflux, for which supporting evidence could be found in the peer-reviewed literature. CONCLUSIONS: The proposed neural network model performed better than previous methods for the prediction of asthma exacerbation. We believe that personalized risk scores and analyses of contributing factors can help clinicians better assess the individual's level of disease progression and afford the opportunity to adjust treatment, prevent exacerbation, and improve outcomes.


Subjects
Asthma/physiopathology , Deep Learning/standards , Neural Networks (Computer) , Quality of Life/psychology , Disease Progression , Female , Humans , Male , Retrospective Studies , Risk Assessment , Risk Factors
17.
BMC Bioinformatics ; 20(Suppl 11): 279, 2019 Jun 06.
Article in English | MEDLINE | ID: mdl-31167638

ABSTRACT

BACKGROUND: Recent advances in whole-genome sequencing and SNP array technology have led to the generation of a large amount of genotype data. Large volumes of genotype data will require faster and more efficient methods for storing and searching the data. Positional Burrows-Wheeler Transform (PBWT) provides an appropriate data structure for bi-allelic data. With the increasing sample sizes, more multi-allelic sites are expected to be observed. Hence, there is a necessity to handle multi-allelic genotype data. RESULTS: In this paper, we introduce a multi-allelic version of the Positional Burrows-Wheeler Transform (mPBWT) based on the bi-allelic version for compression and searching. The time-complexity for constructing the data structure and searching within a panel containing t-allelic sites increases by a factor of t. CONCLUSION: Considering the small value for the possible alleles t, the time increase for the multi-allelic PBWT will be negligible and comparable to the bi-allelic version of PBWT.
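One plausible way to picture the multi-allelic extension is to replace the two-way split of the bi-allelic PBWT with a stable partition into t allele buckets at each site. The sketch below shows a single such update step under that assumption; it is not taken from the mPBWT implementation, and it omits the divergence arrays needed for searching.

```python
def mpbwt_step(order, column, num_alleles):
    """Advance a positional prefix ordering by one multi-allelic site.

    order: current ordering of haplotype indices.
    column: column[h] is the allele (0..num_alleles-1) of haplotype h at this site.
    Returns the new ordering after a stable partition by allele.
    """
    buckets = [[] for _ in range(num_alleles)]
    for h in order:
        buckets[column[h]].append(h)  # keeps relative order within each allele
    return [h for bucket in buckets for h in bucket]
```

With `num_alleles` set to 2 this reduces to the usual bi-allelic update.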


Subjects
Algorithms , Alleles , Data Compression , Genes , Haplotypes/genetics , Humans , Time Factors
18.
BMC Genomics ; 20(Suppl 1): 82, 2019 Feb 04.
Article in English | MEDLINE | ID: mdl-30712510

ABSTRACT

BACKGROUND: Existing functional descriptions of genes are categorical, discrete, and mostly produced through manual processes. In this work, we explore the idea of gene embedding, a distributed representation of genes, in the spirit of word embedding. RESULTS: In a purely data-driven fashion, we trained a 200-dimension vector representation of all human genes, using gene co-expression patterns in 984 data sets from the GEO databases. These vectors capture functional relatedness of genes in terms of recovering known pathways: the average inner product (similarity) of genes within a pathway is 1.52X greater than that of random genes. Using t-SNE, we produced a gene co-expression map that shows local concentrations of tissue-specific genes. We also illustrated the usefulness of the embedded gene vectors, laden with rich information on gene co-expression patterns, in tasks such as gene-gene interaction prediction. CONCLUSIONS: We proposed a machine learning method that utilizes transcriptome-wide gene co-expression to generate a distributed representation of genes. We further demonstrated the utility of this representation by predicting gene-gene interaction based solely on gene names. The distributed representation of genes could be useful for more bioinformatics applications.
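The pathway-recovery statistic quoted above compares the average inner product among genes in a pathway with that among random genes. A minimal sketch of the averaging step follows, using plain lists as stand-ins for the 200-dimension vectors; how the authors sampled random genes is not stated in the abstract.

```python
from itertools import combinations

def mean_pairwise_inner(vectors):
    """Average inner product over all unordered pairs of vectors."""
    pairs = list(combinations(vectors, 2))
    if not pairs:
        raise ValueError("need at least two vectors")
    total = sum(sum(a * b for a, b in zip(u, v)) for u, v in pairs)
    return total / len(pairs)
```

Dividing the value for a pathway's gene vectors by the value for a random gene set of the same size gives a ratio of the kind reported.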


Subjects
Computational Biology/methods , Software , Algorithms , Computational Biology/standards , Genetic Epistasis , Gene Expression Profiling/methods , Gene Expression Regulation , Humans , ROC Curve , Transcriptome , User-Computer Interface
19.
BMC Genomics ; 20(Suppl 1): 80, 2019 Feb 04.
Article in English | MEDLINE | ID: mdl-30712512

ABSTRACT

The sixth International Conference on Intelligent Biology and Medicine (ICIBM) took place in Los Angeles, California, USA on June 10-12, 2018. This conference featured eleven regular scientific sessions, four tutorials, one poster session, four keynote talks, and four eminent scholar talks. The scientific program covered a wide range of topics from bench to bedside, including 3D Genome Organization, reconstruction of large scale evolution of genomes and gene functions, artificial intelligence in biological and biomedical fields, and precision medicine. Both method development and application in genomic research continued to be a main component in the conference, including studies on genetic variants, regulation of transcription, genetic-epigenetic interaction at both single cell and tissue level and artificial intelligence. Here, we write a summary of the conference and also briefly introduce the four high quality papers selected to be published in BMC Genomics that cover novel methodology development or innovative data analysis.


Subjects
Artificial Intelligence , Biology , Medicine , Biology/methods , Humans , Medicine/methods
20.
Pharmacogenomics J ; 19(1): 97-108, 2019 02.
Article in English | MEDLINE | ID: mdl-29855607

ABSTRACT

We evaluated interactions of SNP-by-ACE-I/ARB and SNP-by-TD on serum potassium (K+) among users of antihypertensive treatments (anti-HTN). Our study included seven European-ancestry (EA) (N = 4835) and four African-ancestry (AA) cohorts (N = 2016). We performed race-stratified, fixed-effect, inverse-variance-weighted meta-analyses of 2.5 million SNP-by-drug interaction estimates; race-combined meta-analysis; and trans-ethnic fine-mapping. Among EAs, we identified 11 significant SNPs (P < 5 × 10⁻⁸) for SNP-ACE-I/ARB interactions on serum K+ that were located between NR2F1-AS1 and ARRDC3-AS1 on chromosome 5 (top SNP rs6878413 P = 1.7 × 10⁻⁸; ratio of serum K+ in ACE-I/ARB exposed compared to unexposed is 1.0476, 1.0280, 1.0088 for the TT, AT, and AA genotypes, respectively). Trans-ethnic fine mapping identified the same group of SNPs on chromosome 5 as genome-wide significant for the ACE-I/ARB analysis. In conclusion, SNP-by-ACE-I/ARB interaction analyses uncovered loci that, if replicated, could have future implications for the prevention of arrhythmias due to anti-HTN treatment-related hyperkalemia. Before these loci can be identified as clinically relevant, future validation studies of equal or greater size in comparison to our discovery effort are needed.
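The fixed-effect, inverse-variance-weighted meta-analysis named above has a standard closed form: each cohort estimate is weighted by the inverse of its squared standard error. The sketch below applies that formula to one SNP; the function name and inputs are illustrative and no study estimates are reproduced.

```python
import math

def fixed_effect_meta(betas, standard_errors):
    """Pool per-cohort estimates with inverse-variance weights.

    Returns (pooled_beta, pooled_se, z_score) for a single SNP.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se, pooled_beta / pooled_se
```

Cohorts with smaller standard errors, typically the larger ones, therefore contribute more to the pooled interaction estimate.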


Subjects
Black or African American/genetics , Peptidyl-Dipeptidase A/genetics , Single Nucleotide Polymorphism/genetics , Potassium/blood , Sodium Chloride Symporter Inhibitors/therapeutic use , White People/genetics , Aged , Antihypertensive Agents/therapeutic use , Human Chromosome Pair 5/genetics , Europe , Female , Genome-Wide Association Study/methods , Genotype , Humans , Male , Middle Aged