Results 1 - 20 of 32
1.
J Chem Inf Model ; 63(7): 1882-1893, 2023 04 10.
Article in English | MEDLINE | ID: mdl-36971750

ABSTRACT

Drug-induced gene expression profiling provides useful information covering various aspects of drug discovery and development. Most importantly, this knowledge can be used to discover drugs' mechanisms of action. Recently, deep learning-based drug design methods have been in the spotlight due to their ability to explore huge chemical space and design property-optimized target-specific drug molecules. Recent advances in accessibility of open-source drug-induced transcriptomic data along with the ability of deep learning algorithms to understand hidden patterns have opened opportunities for designing drug molecules based on desired gene expression signatures. In this study, we propose a deep learning model, Gex2SGen (Gene Expression 2 SMILES Generation), to generate novel drug-like molecules based on desired gene expression profiles. The model accepts desired gene expression profiles in a cell-specific manner as input and designs drug-like molecules which can elicit the required transcriptomic profile. The model was first tested against individual gene-knocked-out transcriptomic profiles, where the newly designed molecules showed high similarity with known inhibitors of the knocked-out target genes. The model was next applied to a triple negative breast cancer signature profile, where it could generate novel molecules highly similar to known anti-breast cancer drugs. Overall, this work provides a generalized method that first learns the molecular signature of a given cell under a specific condition and then designs new small molecules with drug-like properties.
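
The abstract reports that generated molecules showed high similarity to known inhibitors but does not name the similarity measure. As a minimal sketch, assuming a Tanimoto coefficient over binary fingerprint bits (a common choice in cheminformatics, not confirmed by the source), the comparison could look like the following. The bit sets are invented for illustration.

```python
def tanimoto(a: set[int], b: set[int]) -> float:
    """Tanimoto (Jaccard) coefficient between two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


# Hypothetical fingerprints: indices of bits set for each molecule.
generated = {1, 4, 7, 9, 15}
known_inhibitor = {1, 4, 9, 15, 22, 30}

# 4 shared bits out of 7 distinct bits in total.
similarity = tanimoto(generated, known_inhibitor)
```

In practice the bit sets would come from a fingerprinting toolkit; only the set arithmetic is shown here.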


Subject(s)
Drug Discovery; Transcriptome; Gene Expression Profiling; Algorithms
2.
J Chem Inf Model ; 63(16): 5066-5076, 2023 08 28.
Article in English | MEDLINE | ID: mdl-37585609

ABSTRACT

Generative artificial intelligence algorithms have been shown to be successful in exploring large chemical spaces and designing novel and diverse molecules. There has been considerable interest in developing predictive models using artificial intelligence for drug-like properties, which can potentially reduce the late-stage attrition of drug candidates or predict the properties of novel AI-designed molecules. Concurrently, it is important to understand the contribution of functional groups toward these properties and modify them to obtain property-optimized lead compounds. As a result, there is an increasing interest in the development of explainable property prediction models. However, current explainable approaches are mostly atom-based, where, often, only a fraction of a fragment is shown to be significant. To address the above challenges, we have developed a novel domain-aware molecular fragmentation approach termed post-processing of BRICS (pBRICS), which can fragment small molecules into their functional groups. Multitask models were developed to predict various properties, including the absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties. The fragment importance was explained using the gradient-weighted class activation mapping (Grad-CAM) approach. The method was validated on data sets of experimentally available matched molecular pairs (MMPs). The explanations from the model can be useful for medicinal chemists to identify the fragments responsible for poor drug-like properties and optimize the molecule. The explainability approach was also used to identify the reason behind false positive and false negative MMP predictions. Based on evidence from the existing literature and our analysis, some of these mispredictions were justified. We propose that the quantity, quality, and diversity of the training data will improve the accuracy of property prediction algorithms for novel molecules.


Subject(s)
Algorithms; Artificial Intelligence
3.
J Chem Inf Model ; 62(21): 5100-5109, 2022 Nov 14.
Article in English | MEDLINE | ID: mdl-34792338

ABSTRACT

In recent years, deep learning-based methods have emerged as promising tools for de novo drug design. Most of these methods are ligand-based, where an initial target-specific ligand data set is necessary to design potent molecules with optimized properties. Although there have been attempts to develop alternative ways to design target-specific ligand data sets, availability of such data sets remains a challenge while designing molecules against novel target proteins. In this work, we propose a deep learning-based method, where the knowledge of the active site structure of the target protein is sufficient to design new molecules. First, a graph attention model was used to learn the structure and features of the amino acids in the active site of proteins that are experimentally known to form protein-ligand complexes. Next, the learned active site features were used along with a pretrained generative model for conditional generation of new molecules. A bioactivity prediction model was then used in a reinforcement learning framework to optimize the conditional generative model. We validated our method against two well-studied proteins, Janus kinase 2 (JAK2) and dopamine receptor D2 (DRD2), where we produce molecules similar to the known inhibitors. The graph attention model could identify the probable key active site residues, which influenced the conditional molecule generator to design new molecules with pharmacophoric features similar to the known inhibitors.


Subject(s)
Deep Learning; Ligands; Models, Molecular; Drug Design; Proteins
4.
Hum Mutat ; 40(9): 1314-1320, 2019 09.
Article in English | MEDLINE | ID: mdl-31140652

ABSTRACT

Genetics plays a key role in venous thromboembolism (VTE) risk; however, established risk factors in European populations do not translate to individuals of African descent because of the differences in allele frequencies between populations. As part of the fifth iteration of the Critical Assessment of Genome Interpretation, participants were asked to predict VTE status in exome data from African American subjects. Participants were provided with 103 unlabeled exomes from patients treated with warfarin for non-VTE causes or VTE and asked to predict which disease each subject had been treated for. Given the lack of training data, many participants opted to use unsupervised machine learning methods, clustering the exomes by variation in genes known to be associated with VTE. The best performing method using only VTE-related genes achieved an area under the ROC curve of 0.65. Here, we discuss the range of methods used in the prediction of VTE from sequence data and explore some of the difficulties of conducting a challenge with known confounders. In addition, we show that an existing genetic risk score for VTE that was developed in European subjects works well in African Americans.
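
The headline metric in this abstract is the area under the ROC curve. As a rough illustration of what that number measures, the sketch below computes it as the probability that a randomly chosen case receives a higher score than a randomly chosen control, with ties counted as half. The labels and scores are invented and are not data from the challenge.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (case, control) pairs ranked correctly; ties count 0.5."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical predictions: 1 = treated for VTE, 0 = treated for another cause.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]

# 7 of the 9 case/control pairs are ordered correctly.
auc = roc_auc(labels, scores)
```

An AUC of 0.5 corresponds to random ranking, which puts the reported 0.65 in context.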


Subject(s)
Exome Sequencing/methods; Venous Thromboembolism/genetics; Warfarin/administration & dosage; Cluster Analysis; Computational Biology/methods; Congresses as Topic; Female; Genetic Predisposition to Disease; Humans; Male; ROC Curve; Unsupervised Machine Learning; Venous Thromboembolism/drug therapy; Warfarin/therapeutic use
5.
N Engl J Med ; 375(22): 2165-2176, 2016 12 01.
Article in English | MEDLINE | ID: mdl-27959755

ABSTRACT

BACKGROUND: Severe combined immunodeficiency (SCID) is characterized by arrested T-lymphocyte production and by B-lymphocyte dysfunction, which result in life-threatening infections. Early diagnosis of SCID through population-based screening of newborns can aid clinical management and help improve outcomes; it also permits the identification of previously unknown factors that are essential for lymphocyte development in humans. METHODS: SCID was detected in a newborn before the onset of infections by means of screening of T-cell-receptor excision circles, a biomarker for thymic output. On confirmation of the condition, the affected infant was treated with allogeneic hematopoietic stem-cell transplantation. Exome sequencing in the patient and parents was followed by functional analysis of a prioritized candidate gene with the use of human hematopoietic stem cells and zebrafish embryos. RESULTS: The infant had "leaky" SCID (i.e., a form of SCID in which a minimal degree of immune function is preserved), as well as craniofacial and dermal abnormalities and the absence of a corpus callosum; his immune deficit was fully corrected by hematopoietic stem-cell transplantation. Exome sequencing revealed a heterozygous de novo missense mutation, p.N441K, in BCL11B. The resulting BCL11B protein had dominant negative activity, which abrogated the ability of wild-type BCL11B to bind DNA, thereby arresting development of the T-cell lineage and disrupting hematopoietic stem-cell migration; this revealed a previously unknown function of BCL11B. The patient's abnormalities, when recapitulated in bcl11ba-deficient zebrafish, were reversed by ectopic expression of functionally intact human BCL11B but not mutant human BCL11B. CONCLUSIONS: Newborn screening facilitated the identification and treatment of a previously unknown cause of human SCID. Coupling exome sequencing with an evaluation of candidate genes in human hematopoietic stem cells and in zebrafish revealed that a constitutional BCL11B mutation caused human multisystem anomalies with SCID and also revealed a prethymic role for BCL11B in hematopoietic progenitors. (Funded by the National Institutes of Health and others.)


Subject(s)
Abnormalities, Multiple/genetics; Hematopoietic Stem Cells/physiology; Mutation, Missense; Repressor Proteins/genetics; Severe Combined Immunodeficiency/genetics; Tumor Suppressor Proteins/genetics; Animals; Brain/diagnostic imaging; Cell Movement; Disease Models, Animal; Gene Expression Regulation; Hematopoietic Stem Cell Transplantation; Hematopoietic Stem Cells/metabolism; Humans; In Vitro Techniques; Infant, Newborn; Magnetic Resonance Imaging; Male; Neonatal Screening/methods; Receptors, Antigen, T-Cell; Repressor Proteins/deficiency; Repressor Proteins/metabolism; Tumor Suppressor Proteins/deficiency; Tumor Suppressor Proteins/metabolism; Zebrafish/growth & development
6.
J Clin Immunol ; 35(2): 227-33, 2015 Feb.
Article in English | MEDLINE | ID: mdl-25677497

ABSTRACT

PURPOSE: Severe combined immunodeficiency (SCID) encompasses a group of disorders characterized by reduced or absent T-cell number and function and identified by newborn screening utilizing T-cell receptor excision circles (TRECs). This screening has also identified infants with T lymphopenia who lack mutations in typical SCID genes. We report an infant with low TRECs and non-SCID T lymphopenia, who proved upon whole exome sequencing to have Nijmegen breakage syndrome (NBS). METHODS: Exome sequencing of DNA from the infant and his parents was performed. Genomic analysis revealed deleterious variants in the NBN gene. Confirmatory testing included Sanger sequencing and immunoblotting and radiosensitivity testing of patient lymphocytes. RESULTS: Two novel nonsense mutations in NBN were identified in genomic DNA from the family. Immunoblotting showed absence of nibrin protein. A colony survival assay demonstrated radiosensitivity comparable to patients with ataxia telangiectasia. CONCLUSIONS: Although TREC screening was developed to identify newborns with SCID, it has also identified T lymphopenic disorders that may not otherwise be diagnosed until later in life. Timely identification of an infant with T lymphopenia allowed for prompt pursuit of underlying etiology, making possible a diagnosis of NBS, genetic counseling, and early intervention to minimize complications.


Subject(s)
Neonatal Screening; Nijmegen Breakage Syndrome/diagnosis; Nijmegen Breakage Syndrome/genetics; Receptors, Antigen, T-Cell/genetics; T-Lymphocytes/metabolism; Cell Cycle Proteins/genetics; Cell Cycle Proteins/metabolism; DNA, Circular; Exome; Gene Rearrangement, T-Lymphocyte; High-Throughput Nucleotide Sequencing; Humans; Infant; Infant, Newborn; Male; Nijmegen Breakage Syndrome/immunology; Nuclear Proteins/genetics; Nuclear Proteins/metabolism; T-Lymphocytes/immunology
7.
J Clin Immunol ; 35(2): 135-46, 2015 Feb.
Article in English | MEDLINE | ID: mdl-25627829

ABSTRACT

PURPOSE: A male infant developed generalized rash, intestinal inflammation and severe infections including persistent cytomegalovirus. Family history was negative, T cell receptor excision circles were normal, and engraftment of maternal cells was absent. No defects were found in multiple genes associated with severe combined immunodeficiency. A 9/10 HLA matched unrelated hematopoietic cell transplant (HCT) led to mixed chimerism with clinical resolution. We sought an underlying cause for this patient's immune deficiency and dysregulation. METHODS: Clinical and laboratory features were reviewed. Whole exome sequencing and analysis of genomic DNA from the patient, parents and 2 unaffected siblings was performed, revealing 2 MALT1 variants. With a host-specific HLA-C antibody, we assessed MALT1 expression and function in the patient's post-HCT autologous and donor lymphocytes. Wild type MALT1 cDNA was added to transformed autologous patient B cells to assess functional correction. RESULTS: The patient had compound heterozygous DNA variants affecting exon 10 of MALT1 (isoform a, NM_006785.3), a maternally inherited splice acceptor c.1019-2A > G, and a de novo deletion of c.1059C leading to a frameshift and premature termination. Autologous lymphocytes failed to express MALT1 and lacked NF-κB signaling dependent upon the CARMA1, BCL-10 and MALT1 signalosome. Transduction with wild type MALT1 cDNA corrected the observed defects. CONCLUSIONS: Our nonconsanguineous patient with early onset profound combined immunodeficiency and immune dysregulation due to compound heterozygous MALT1 mutations extends the clinical and immunologic phenotype reported in 2 prior families. Clinical cure was achieved with mixed chimerism after nonmyeloablative conditioning and HCT.


Subject(s)
Caspases/genetics; Hematopoietic Stem Cell Transplantation; Mutation; Neoplasm Proteins/genetics; Severe Combined Immunodeficiency/genetics; Severe Combined Immunodeficiency/therapy; Adult; Amino Acid Sequence; B-Lymphocytes/metabolism; B-Lymphocytes/virology; Base Sequence; Caspases/metabolism; Cell Line, Transformed; Child; Child, Preschool; DNA Mutational Analysis; Female; Gene Expression; Humans; Immunophenotyping; Infant; Infant, Newborn; Leukocytes, Mononuclear/immunology; Leukocytes, Mononuclear/metabolism; Male; Mucosa-Associated Lymphoid Tissue Lymphoma Translocation 1 Protein; NF-kappa B/metabolism; Neoplasm Proteins/metabolism; RNA, Messenger/genetics; RNA, Messenger/metabolism; Severe Combined Immunodeficiency/diagnosis; Severe Combined Immunodeficiency/metabolism; Signal Transduction; Skin/pathology; Transplantation Chimera; Transplantation, Homologous
8.
BMC Med Inform Decis Mak ; 14: 13, 2014 Feb 24.
Article in English | MEDLINE | ID: mdl-24559132

ABSTRACT

BACKGROUND: Pharmacovigilance aims to uncover and understand harmful side-effects of drugs, termed adverse events (AEs). Although the current process of pharmacovigilance is very systematic, the increasing amount of information available in specialized health-related websites as well as the exponential growth in medical literature presents a unique opportunity to supplement traditional adverse event gathering mechanisms with new-age ones. METHOD: We present a semi-automated pipeline to extract associations between drugs and side effects from traditional structured adverse event databases, enhanced by potential drug-adverse event pairs mined from user-comments from health-related websites and MEDLINE abstracts. The pipeline was tested using a set of 12 drugs representative of two previous studies of adverse event extraction from health-related websites and MEDLINE abstracts. RESULTS: Testing the pipeline shows that mining non-traditional sources helps substantiate the adverse event databases. The non-traditional sources not only contain the known AEs, but also suggest some unreported AEs for drugs which can then be analyzed further. CONCLUSION: A semi-automated pipeline to extract the AE pairs from adverse event databases as well as potential AE pairs from non-traditional sources such as text from MEDLINE abstracts and user-comments from health-related websites is presented.
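
The abstract does not describe how drug and adverse-event pairs are mined from free text. Purely as an illustration, and assuming a simple lexicon lookup with per-document co-occurrence counting (an assumption, not the authors' pipeline), the idea of flagging pairs that are absent from a structured database can be sketched as follows. The drug names, event terms, comments and the `known_pairs` set are all hypothetical.

```python
from collections import Counter
from itertools import product


def mine_pairs(texts, drugs, events):
    """Count (drug, event) pairs that co-occur within the same text."""
    counts = Counter()
    for text in texts:
        lowered = text.lower()
        # Naive substring matching; a real system would need proper entity recognition.
        found_drugs = sorted(d for d in drugs if d in lowered)
        found_events = sorted(e for e in events if e in lowered)
        counts.update(product(found_drugs, found_events))
    return counts


# Hypothetical user comments and lexicons (lower-cased terms).
comments = [
    "Started drugA last week and now I have a headache.",
    "drugA gave me nausea and a headache.",
    "No problems so far on drugB.",
]
drug_lexicon = {"druga", "drugb"}
event_lexicon = {"headache", "nausea"}

# Pairs assumed to be already present in a structured adverse event database.
known_pairs = {("druga", "headache")}

pair_counts = mine_pairs(comments, drug_lexicon, event_lexicon)
unreported = {pair for pair in pair_counts if pair not in known_pairs}
```

Pairs in `unreported` are only candidates for further analysis, which matches the abstract's framing of a semi-automated process.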


Subject(s)
Adverse Drug Reaction Reporting Systems/standards; Algorithms; Data Mining/methods; Drug-Related Side Effects and Adverse Reactions; Humans; Natural Language Processing
9.
J Mol Graph Model ; 129: 108734, 2024 06.
Article in English | MEDLINE | ID: mdl-38442440

ABSTRACT

Application of artificial intelligence (AI) in drug discovery has led to several success stories in recent times. While traditional methods mostly relied upon screening large chemical libraries for early-stage drug design, de novo design can help identify novel target-specific molecules by sampling from a much larger chemical space. Although this has increased the possibility of finding diverse and novel molecules from previously unexplored chemical space, this has also posed a great challenge for medicinal chemists to synthesize at least some of the de novo designed novel molecules for experimental validation. To address this challenge, in this work, we propose a novel forward synthesis-based generative AI method, which is used to explore the synthesizable chemical space. The method uses a structure-based drug design framework, where the target protein structure and a target-specific seed fragment from co-crystal structures can be the initial inputs. A random fragment from a purchasable fragment library can also be the input if a target-specific fragment is unavailable. Then template-based forward synthesis route prediction and molecule generation are performed in parallel using the Monte Carlo Tree Search (MCTS) method, where the subsequent fragments for molecule growth can again be obtained from a purchasable fragment library. The rewards for each iteration of MCTS are computed using a drug-target affinity (DTA) model based on the docking pose of the generated reaction intermediates at the binding site of the target protein of interest. With the help of the proposed method, it is now possible to overcome one of the major obstacles posed to the AI-based drug design approaches through the ability of the method to design novel target-specific synthesizable molecules.
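
The abstract names Monte Carlo Tree Search but gives no detail on its selection rule. A minimal sketch of the selection and backpropagation steps, assuming the standard UCB1 rule and a reward supplied externally (standing in for the DTA score), might look like this. The `Node` class, fragment labels and reward values are hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    fragment: str                      # hypothetical label for the fragment added at this step
    visits: int = 0
    total_reward: float = 0.0
    children: list = field(default_factory=list)


def ucb1(parent_visits: int, child: Node, c: float = 1.4) -> float:
    """UCB1 score: mean reward plus an exploration bonus; unvisited children come first."""
    if child.visits == 0:
        return math.inf
    mean = child.total_reward / child.visits
    return mean + c * math.sqrt(math.log(parent_visits) / child.visits)


def select_child(parent: Node, c: float = 1.4) -> Node:
    """Pick the child with the highest UCB1 score."""
    return max(parent.children, key=lambda child: ucb1(parent.visits, child, c))


def backpropagate(path: list, reward: float) -> None:
    """Add one visit and the observed reward to every node on the path."""
    for node in path:
        node.visits += 1
        node.total_reward += reward


# Hypothetical tree: a seed fragment with three candidate growth steps.
root = Node("seed", visits=10, total_reward=5.4)
root.children = [
    Node("A", visits=6, total_reward=3.0),
    Node("B", visits=4, total_reward=2.4),
    Node("C"),  # never tried, so it is selected first
]

chosen = select_child(root)
backpropagate([root, chosen], reward=0.9)  # 0.9 stands in for a DTA-based reward
```

Expansion and rollout are omitted; in the method described above they would involve reaction templates and a purchasable fragment library.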


Subject(s)
Artificial Intelligence; Drug Discovery; Drug Discovery/methods; Drug Design; Proteins/chemistry; Small Molecule Libraries/chemistry
10.
bioRxiv ; 2024 Jun 17.
Article in English | MEDLINE | ID: mdl-38798479

ABSTRACT

Continued advances in variant effect prediction are necessary to demonstrate the ability of machine learning methods to accurately determine the clinical impact of variants of unknown significance (VUS). Towards this goal, the ARSA Critical Assessment of Genome Interpretation (CAGI) challenge was designed to characterize progress by utilizing 219 experimentally assayed missense VUS in the Arylsulfatase A (ARSA) gene to assess the performance of community-submitted predictions of variant functional effects. The challenge involved 15 teams, and evaluated additional predictions from established and recently released models. Notably, a model developed by participants of a genetics and coding bootcamp, trained with standard machine-learning tools in Python, demonstrated superior performance among submissions. Furthermore, the study observed that state-of-the-art deep learning methods provided small but statistically significant improvement in predictive performance compared to less elaborate techniques. These findings underscore the utility of variant effect prediction, and the potential for models trained with modest resources to accurately classify VUS in genetic and clinical research.

11.
J Clin Immunol ; 33(3): 540-9, 2013 Apr.
Article in English | MEDLINE | ID: mdl-23264026

ABSTRACT

PURPOSE: Severe combined immunodeficiency (SCID) is characterized by failure of T lymphocyte development and absent or very low T cell receptor excision circles (TRECs), DNA byproducts of T cell maturation. Newborn screening for TRECs to identify SCID is now performed in several states using PCR of DNA from universally collected dried blood spots (DBS). In addition to infants with typical SCID, TREC screening identifies infants with T lymphocytopenia who appear healthy and in whom a SCID diagnosis cannot be confirmed. Deep sequencing was employed to find causes of T lymphocytopenia in such infants. METHODS: Whole exome sequencing and analysis were performed in infants and their parents. Upon finding deleterious mutations in the ataxia telangiectasia mutated (ATM) gene, we confirmed the diagnosis of ataxia telangiectasia (AT) in two infants and then tested archival newborn DBS of additional AT patients for TREC copy number. RESULTS: Exome sequencing and analysis led to 2 unsuspected gene diagnoses of AT. Of 13 older AT patients for whom newborn DBS had been stored, 7 samples tested positive for SCID under the criteria of California's newborn screening program. AT children with low neonatal TRECs had low CD4 T cell counts subsequently detected (R = 0.64). CONCLUSIONS: T lymphocytopenia in newborns can be a feature of AT, as revealed by TREC screening and exome sequencing. Although there is no current cure for the progressive neurological impairment of AT, early detection permits avoidance of infectious complications, while providing information for families regarding reproductive recurrence risks and increased cancer risks in patients and carriers.


Subject(s)
Ataxia Telangiectasia/diagnosis; Neonatal Screening; Severe Combined Immunodeficiency/diagnosis; Amino Acid Sequence; Ataxia Telangiectasia/complications; Ataxia Telangiectasia/genetics; Ataxia Telangiectasia Mutated Proteins; Base Sequence; Cell Cycle Proteins/genetics; Child; Child, Preschool; DNA-Binding Proteins/genetics; Exome; Female; Humans; Infant; Infant, Newborn; Lymphopenia/genetics; Male; Mutation; Phenotype; Polymorphism, Single Nucleotide; Protein Serine-Threonine Kinases/genetics; Severe Combined Immunodeficiency/complications; Severe Combined Immunodeficiency/genetics; Tumor Suppressor Proteins/genetics
12.
J Mol Graph Model ; 118: 108361, 2023 01.
Article in English | MEDLINE | ID: mdl-36257148

ABSTRACT

Mycobacterium tuberculosis (Mtb) is a pathogen of major concern due to its ability to withstand both first- and second-line antibiotics, leading to drug resistance. Thus, there is a critical need for identification of novel anti-tuberculosis agents targeting Mtb-specific proteins. The ceaseless search for novel antimicrobial agents to combat drug-resistant bacteria can be accelerated by the development of advanced deep learning methods to explore both existing and uncharted regions of the chemical space. The adaptation of deep learning methods to under-explored pathogens such as Mtb is challenging, as most of the existing methods rely on the availability of sufficient target-specific ligand data to design novel small molecules with optimized bioactivity. In this work, we report the design of novel anti-tuberculosis agents targeting the Mtb chorismate mutase protein using a structure-based drug design algorithm. The structure-based deep learning method relies on the knowledge of the target protein's binding site structure alone for conditional generation of novel small molecules. The method eliminates the need for curation of a high-quality target-specific small molecule dataset, which remains a challenge even for many druggable targets, including Mtb chorismate mutase. Novel molecules are proposed that show high complementarity to the target binding site. The graph attention model could identify the probable key binding site residues, which influenced the conditional molecule generator to design new molecules with pharmacophoric features similar to the known inhibitors.


Subject(s)
Deep Learning; Mycobacterium tuberculosis; Antitubercular Agents/chemistry; Mycobacterium tuberculosis/metabolism; Chorismate Mutase/metabolism; Drug Design
13.
Res Sq ; 2023 Aug 02.
Article in English | MEDLINE | ID: mdl-37577579

ABSTRACT

In the context of the Critical Assessment of Genome Interpretation, 6th edition (CAGI6), the Genetics of Neurodevelopmental Disorders Lab in Padua proposed a new ID-challenge to provide an opportunity to develop computational methods for predicting patients' phenotypes and causal variants. Eight research teams, with 30 models, had access to the phenotype details and real genetic data, based on the sequences of 74 genes (VCF format) in 415 pediatric patients affected by neurodevelopmental disorders (NDDs). NDDs are clinically and genetically heterogeneous conditions with onset in infancy. In this study we evaluate the ability and accuracy of computational methods to predict comorbid phenotypes based on the clinical features described in each patient and the causal variants. Finally, we asked participants to develop a method to find new possible genetic causes for patients without a genetic diagnosis. As in CAGI5, seven clinical features (ID, ASD, ataxia, epilepsy, microcephaly, macrocephaly, hypotonia) and variants (causative, putative pathogenic and contributing factors) were provided. Considering the overall clinical manifestation of our cohort, we released the variant data and phenotypic traits of the 150 patients from the CAGI5 ID-Challenge as training and validation data for the development of prediction methods.

14.
Cell Syst ; 12(8): 810-826.e4, 2021 08 18.
Article in English | MEDLINE | ID: mdl-34146472

ABSTRACT

The recent advent of CRISPR and other molecular tools enabled the reconstruction of cell lineages based on induced DNA mutations and promises to resolve the lineages of more complex organisms. To date, no lineage reconstruction algorithms have been rigorously examined for their performance and robustness across dataset types and number of cells. To benchmark such methods, we decided to organize a DREAM challenge using in vitro experimental intMEMOIR recordings and in silico data for a C. elegans lineage tree of about 1,000 cells and a Mus musculus tree of 10,000 cells. Some of the 22 approaches submitted had excellent performance, but structural features of the trees prevented optimal reconstructions. Using smaller sub-trees as training sets proved to be a good approach for tuning algorithms to reconstruct larger trees. The simulation and reconstruction methods generated here delineate a potential way forward for solving larger cell lineage trees, such as that of the mouse.


Subject(s)
Benchmarking; Caenorhabditis elegans; Algorithms; Animals; Caenorhabditis elegans/genetics; Cell Lineage/genetics; Computer Simulation; Mice
15.
PLoS One ; 15(4): e0231728, 2020.
Article in English | MEDLINE | ID: mdl-32315351

ABSTRACT

INTRODUCTION: Phenotype-driven rare disease gene prioritization relies on high quality curated resources containing disease, gene and phenotype annotations. However, the effectiveness of gene prioritization tools is constrained by the incomplete coverage of rare disease, phenotype and gene annotations in such curated resources. METHODS: We extracted rare disease correlation pairs involving diseases, phenotypes and genes from MEDLINE abstracts and used the information propagation algorithm GCAS to build an association network. We built a tool called PRIORI-T for rare disease gene prioritization that uses this network for phenotype-driven rare disease gene prioritization. The quality of disease-gene associations in PRIORI-T was compared with resources such as DisGeNET and Open Targets in the context of rare diseases. The gene prioritization performance of PRIORI-T was evaluated using phenotype descriptions of 230 real-world rare disease clinical cases collated from recent publications, as well as compared to other gene prioritization tools such as HANRD and Orphamizer. RESULTS: PRIORI-T contains qualitatively better associations than DisGeNET and Open Targets. Furthermore, the causal genes were captured within Top-50 for more than 40% of the real-world clinical cases and within Top-300 for more than 72% of the cases when PRIORI-T was used for gene prioritization. It outperformed other gene prioritization tools such as HANRD and Orphamizer that primarily rely on curated resources. CONCLUSIONS: PRIORI-T exhibited improved gene prioritization performance without requiring high quality curated data. Thus, it holds great promise in phenotype-driven gene prioritization for rare disease studies.
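
The Top-50 and Top-300 figures in this abstract are capture rates: the share of cases whose causal gene appears within the first k ranked candidates. A small sketch of that calculation is given below; the rankings and gene names are placeholders and not output from PRIORI-T.

```python
def top_k_rate(rankings, causal_genes, k):
    """Fraction of cases whose causal gene appears among the first k ranked genes."""
    if len(rankings) != len(causal_genes):
        raise ValueError("one causal gene is expected per ranking")
    hits = sum(
        1 for ranking, gene in zip(rankings, causal_genes) if gene in ranking[:k]
    )
    return hits / len(causal_genes)


# Hypothetical ranked gene lists for four clinical cases, best candidate first.
rankings = [
    ["GENE_A", "GENE_B", "GENE_C"],
    ["GENE_D", "GENE_E", "GENE_F"],
    ["GENE_G", "GENE_H", "GENE_I"],
    ["GENE_B", "GENE_A", "GENE_C"],
]
# Hypothetical causal gene for each case; GENE_J is never ranked.
causal = ["GENE_B", "GENE_F", "GENE_J", "GENE_B"]

rate_top2 = top_k_rate(rankings, causal, k=2)  # cases 1 and 4 are captured
rate_top3 = top_k_rate(rankings, causal, k=3)  # case 2 is captured as well
```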


Subject(s)
Algorithms; Computational Biology/methods; MEDLINE; Rare Diseases/genetics; Humans; Phenotype
16.
Int J Neonatal Screen ; 6(2)2020 Jun.
Article in English | MEDLINE | ID: mdl-32802992

ABSTRACT

Short-chain acyl-CoA dehydrogenase deficiency (SCADD) is a rare autosomal recessive disorder of β-oxidation caused by pathogenic variants in the ACADS gene. Analyte testing for SCADD in blood and urine, including newborn screening (NBS) using tandem mass spectrometry (MS/MS) on dried blood spots (DBSs), is complicated by the presence of two relatively common ACADS variants (c.625G>A and c.511C>T). Individuals homozygous for these variants or compound heterozygous do not have clinical disease but do have reduced short-chain acyl-CoA dehydrogenase (SCAD) activity, resulting in elevated blood and urine metabolites. As part of a larger study of the potential role of exome sequencing in NBS in California, we reviewed ACADS sequence and MS/MS data from DBSs from a cohort of 74 patients identified to have SCADD. Of this cohort, approximately 60% had one or more of the common variants and did not have the two rare variants, and thus would need no further testing. Retrospective analysis of ethylmalonic acid, glutaric acid, 2-hydroxyglutaric acid, 3-hydroxyglutaric acid, and methylsuccinic acid demonstrated that second-tier testing applied before the release of the newborn screening result could reduce referrals by over 50% and improve the positive predictive value for SCADD to above 75%.

17.
Nat Med ; 26(9): 1392-1397, 2020 09.
Article in English | MEDLINE | ID: mdl-32778825

ABSTRACT

Public health newborn screening (NBS) programs provide population-scale ascertainment of rare, treatable conditions that require urgent intervention. Tandem mass spectrometry (MS/MS) is currently used to screen newborns for a panel of rare inborn errors of metabolism (IEMs) [1-4]. The NBSeq project evaluated whole-exome sequencing (WES) as an innovative methodology for NBS. We obtained archived residual dried blood spots and data for nearly all IEM cases from the 4.5 million infants born in California between mid-2005 and 2013 and from some infants who screened positive by MS/MS, but were unaffected upon follow-up testing. WES had an overall sensitivity of 88% and specificity of 98.4%, compared to 99.0% and 99.8%, respectively, for MS/MS, although effectiveness varied among individual IEMs. Thus, WES alone was insufficiently sensitive or specific to be a primary screen for most NBS IEMs. However, as a secondary test for infants with abnormal MS/MS screens, WES could reduce false-positive results, facilitate timely case resolution and in some instances even suggest more appropriate or specific diagnosis than that initially obtained. This study represents the largest sequencing effort to date of an entire population of IEM-affected cases, allowing unbiased assessment of current capabilities of WES as a tool for population screening.
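
Sensitivity and specificity, as quoted in this abstract, are ratios taken from a confusion matrix. The sketch below shows the arithmetic, together with positive predictive value. The counts are invented so that the ratios reproduce the 88% and 98.4% quoted for WES; they are not the study's actual case numbers, and the resulting predictive value says nothing about a real screening population.

```python
def screening_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Sensitivity, specificity and positive predictive value from confusion counts."""
    return {
        "sensitivity": tp / (tp + fn),  # affected infants correctly flagged
        "specificity": tn / (tn + fp),  # unaffected infants correctly cleared
        "ppv": tp / (tp + fp),          # flagged infants who are truly affected
    }


# Hypothetical counts chosen only to match the quoted percentages.
metrics = screening_metrics(tp=88, fn=12, tn=984, fp=16)
```

Positive predictive value depends on how common the condition is among those tested, which is why a secondary test applied only to abnormal first-tier screens can behave very differently from a primary screen.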


Subject(s)
Exome Sequencing/methods , Exome/genetics , Metabolism, Inborn Errors/diagnosis , Metabolism, Inborn Errors/genetics , Neonatal Screening/methods , Genetic Testing , Humans , Infant, Newborn , Metabolism, Inborn Errors/epidemiology , Tandem Mass Spectrometry
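
As a rough illustration of why a secondary test can cut false positives at some cost in sensitivity, the sketch below combines two tests in series (a case is called positive only if both tests are positive). It plugs in the sensitivity and specificity figures quoted in the abstract, but it assumes the two tests err independently of each other. That assumption is not made in the abstract and may well not hold in practice, so the outputs are illustrative and are not results from the study. The function name `serial_screen` is hypothetical.

```python
def serial_screen(sens_1: float, spec_1: float, sens_2: float, spec_2: float):
    """Combined sensitivity and specificity when test 2 is applied only to test-1 positives.

    ASSUMPTION: the two tests are conditionally independent given true disease status.
    """
    combined_sensitivity = sens_1 * sens_2
    # A false positive now requires both tests to be falsely positive.
    combined_specificity = 1.0 - (1.0 - spec_1) * (1.0 - spec_2)
    return combined_sensitivity, combined_specificity


# Figures quoted in the abstract: MS/MS first, WES as the secondary test.
sensitivity, specificity = serial_screen(0.990, 0.998, 0.88, 0.984)
print(f"combined sensitivity: {sensitivity:.4f}")
print(f"combined specificity: {specificity:.6f}")
```

Under the independence assumption, specificity rises because both tests must be wrong together, while sensitivity falls because either test can miss a true case.
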
18.
In Silico Biol ; 9(4): 195-202, 2009.
Article in English | MEDLINE | ID: mdl-20109149

ABSTRACT

Plasmodium falciparum is the parasite responsible for more than 90% of deaths that occur due to malaria. Organization and mining of 'omic' (genomic, proteomic, transcriptomic, interactomic) data can improve our understanding of P. falciparum biology and help in the fight against malaria. PlasmoID (Plasmodium Information Discovery) is a tool developed for the dynamic exploration of the parasite's 'omic' landscape. Diverse computational and curated P. falciparum protein-protein interaction datasets, as well as binary relationships involving protein-small molecule entities, manually curated protein-protein relationships derived from the published literature and protein-protein interactions based on metabolic pathways are included in the PlasmoID database. The graphical user interface is designed as a plug-in to Cytoscape, an open-source network visualization tool. Important features of this plug-in include a synchronized tabular representation of any network loaded on the canvas, ability to find the shortest path between a pair of nodes in the database, search and expansion of entities from the database, and the ability to add new entities to the database through the interface. Malaria researchers can now seamlessly interrogate heterogeneous 'omic' datasets as well as add proprietary data to generate and visualize P. falciparum pathway and cell process network models. PlasmoID can be downloaded from http://pfalciparum.atc.tcs.com/PlasmoID.


Subject(s)
Databases, Factual , Plasmodium falciparum , Protein Interaction Mapping/methods , Protozoan Proteins/metabolism , Software , Genome , Humans , Metabolome , Plasmodium falciparum/genetics , Plasmodium falciparum/metabolism , Proteome , Protozoan Proteins/genetics , User-Computer Interface
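
The shortest-path feature mentioned in the abstract can be pictured with a generic breadth-first search over an unweighted interaction network. This is a minimal sketch and not PlasmoID's implementation, which the abstract does not describe; the function name, the adjacency-list representation, and the placeholder node names `P1` to `P5` are all assumptions.

```python
from collections import deque


def shortest_path(adjacency, source, target):
    """Return one path with the fewest edges from source to target, or None if unreachable."""
    parent = {source: None}  # doubles as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the parent links back to the source, then reverse.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbor in adjacency.get(node, ()):
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return None


# Toy undirected network with placeholder protein identifiers (hypothetical).
network = {
    "P1": ["P2", "P3"],
    "P2": ["P1", "P4"],
    "P3": ["P1"],
    "P4": ["P2", "P5"],
    "P5": ["P4"],
}
print(shortest_path(network, "P3", "P5"))
```

Breadth-first search suits this case because every interaction counts as one step; a weighted network would call for a different algorithm such as Dijkstra's.
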
19.
J Comput Biol ; 26(1): 53-67, 2019 01.
Article in English | MEDLINE | ID: mdl-30204489

ABSTRACT

Genomic variations in a reference collection are naturally represented as genome variation graphs. Such graphs encode common subsequences as vertices and the variations are captured using additional vertices and directed edges. The resulting graphs are directed graphs possibly with cycles. Existing algorithms for aligning sequences on such graphs make use of partial order alignment (POA) techniques that work on directed acyclic graphs (DAGs). To achieve this, acyclic extensions of the input graphs are first constructed through expensive loop unrolling steps (DAGification). Furthermore, such graph extensions could have considerable blowup in their size and in the worst case the blow-up factor is proportional to the input sequence length. We provide a novel alignment algorithm V-ALIGN that aligns the input sequence directly on the input graph while avoiding such expensive DAGification steps. V-ALIGN is based on a novel dynamic programming (DP) formulation that allows gapped alignment directly on the input graph. It supports affine and linear gaps. We also propose refinements to V-ALIGN for better performance in practice. With the proposed refinements, the time to fill the DP table has linear dependence on the sizes of the sequence, the graph, and its feedback vertex set. We conducted experiments to compare the proposed algorithm against the existing POA-based techniques. We also performed alignment experiments on the genome variation graphs constructed from the 1000 Genomes data. For aligning short sequences, standard approaches restrict the expensive gapped alignment to small filtered subgraphs having high similarity to the input sequence. In such cases, the performance of V-ALIGN for gapped alignment on the filtered subgraph depends on the subgraph sizes.


Subject(s)
Sequence Alignment/methods , Algorithms , Sequence Analysis, DNA
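
For orientation, the sketch below shows the kind of DAG-based dynamic programming that the abstract contrasts V-ALIGN with: a query is aligned, with linear gap penalties, to the best-scoring path in a directed acyclic graph whose vertices each carry one character. It is not V-ALIGN, which works directly on graphs with cycles; this version requires an acyclic input. The function name, the scoring values, and the one-character-per-vertex encoding are assumptions made for illustration.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def align_to_dag(query, labels, edges, match=1, mismatch=-1, gap=-1):
    """Best score for aligning all of `query` to some path in a DAG (linear gaps).

    labels: dict vertex -> single character; edges: iterable of (u, v) pairs.
    The path may start and end at any vertex.
    """
    preds = {v: set() for v in labels}
    for u, v in edges:
        preds[v].add(u)

    m = len(query)
    # Row of a virtual source: the first j query characters are aligned to gaps.
    start = [gap * j for j in range(m + 1)]
    rows = {}
    best = start[m]  # empty path: every query character is a gap

    # static_order() yields predecessors before successors (raises on cycles).
    for v in TopologicalSorter(preds).static_order():
        prev_rows = [start] + [rows[u] for u in preds[v]]
        row = [max(p[0] for p in prev_rows) + gap]  # j = 0: v is aligned to a gap
        for j in range(1, m + 1):
            sub = match if labels[v] == query[j - 1] else mismatch
            diag = max(p[j - 1] for p in prev_rows) + sub  # v aligned to query[j-1]
            up = max(p[j] for p in prev_rows) + gap        # v aligned to a gap
            left = row[j - 1] + gap                        # query[j-1] aligned to a gap
            row.append(max(diag, up, left))
        rows[v] = row
        best = max(best, row[m])
    return best


# A small "bubble": A -> (C or G) -> T, as a single-nucleotide variant would produce.
labels = {0: "A", 1: "C", 2: "G", 3: "T"}
edges = [(0, 1), (0, 2), (1, 3), (2, 3)]
print(align_to_dag("AGT", labels, edges))
print(align_to_dag("AT", labels, edges))
```

Because each vertex row depends only on the rows of its predecessors, the table can be filled in a single topological pass, which is exactly what breaks down once the graph contains cycles.
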
20.
Indian J Biochem Biophys ; 45(6): 365-73, 2008 Dec.
Article in English | MEDLINE | ID: mdl-19239121

ABSTRACT

Protein trafficking in the malarial parasite Plasmodium falciparum is dictated by a complex life-cycle that involves a variety of intra-cellular and host cell destinations, such as the mitochondrion, apicoplast, rhoptries and micronemes. Of these, the apicoplast and mitochondrion are believed to account for more than 10% of this traffic. Studies have shown that mechanisms for mitochondrion and apicoplast targeting are distinct, despite their close physical proximity. The heme biosynthesis pathway spans both these organelles, making trafficking studies crucial for the spatial demarcation of the constituent interactions. This minireview highlights the challenges in identifying the possible sub-cellular destinations of the heme pathway enzymes, drawing on a literature survey as well as focused bioinformatic analysis.


Subject(s)
Heme/biosynthesis , Mitochondria/metabolism , Plasmodium falciparum/enzymology , Protozoan Proteins/metabolism , Amino Acid Sequence , Animals , Heme/metabolism , Molecular Sequence Data , Protein Transport