Results 1 - 19 of 19
1.
Comput Biol Chem ; 112: 108118, 2024 Jun 10.
Article in English | MEDLINE | ID: mdl-38878606

ABSTRACT

Mitochondrial disorders are a class of heterogeneous disorders caused by genetic variations in the mitochondrial genome (mtDNA) as well as the nuclear genome. The spectrum of mtDNA variants remains unexplored in the Indian population. In the present study, we have cataloged 2689 high-confidence single nucleotide variants, small insertions and deletions in mtDNA in 1029 healthy Indian individuals. We found that a major proportion (76.5%) of the variants are rare (AF ≤ 0.005) in the studied population. Intriguingly, we found two 'confirmed' pathogenic variants (m.1555 A>G and m.14484 T>C) with a frequency of ∼1 in 250 individuals in our dataset. The high carrier frequency underscores the need for screening of the mtDNA pathogenic mutations in newborns in India. Interestingly, our analysis also revealed 202 variants in our dataset which have been 'reported' in disease cases as per the MITOMAP database. Additionally, we found the frequency of haplogroup M (52.2%) to be the highest among all the 18 top-level haplogroups found in our dataset. In comparison to the global population datasets, 20 unique mtDNA variants are found in the Indian population. We hope the whole genome sequencing based compendium of mtDNA variants along with their allele frequencies and heteroplasmy levels in the Indian population will drive additional genome scale studies for mtDNA. Furthermore, the identification of clinically relevant variants in our dataset will aid in better clinical interpretation of the variants in mitochondrial disorders.

2.
Microsc Microanal ; 2024 May 17.
Article in English | MEDLINE | ID: mdl-38758983

ABSTRACT

Traditionally, materials discovery has been driven more by evidence and intuition than by systematic design. However, the advent of "big data" and an exponential increase in computational power have reshaped the landscape. Today, we use simulations, artificial intelligence (AI), and machine learning (ML) to predict materials characteristics, which dramatically accelerates the discovery of novel materials. For instance, combinatorial megalibraries, where millions of distinct nanoparticles are created on a single chip, have spurred the need for automated characterization tools. This paper presents an ML model specifically developed to perform real-time binary classification of grayscale high-angle annular dark-field images of nanoparticles sourced from these megalibraries. Given the high costs associated with downstream processing errors, a primary requirement for our model was to minimize false positives while maintaining efficacy on unseen images. We elaborate on the computational challenges and our solutions, including managing memory constraints, optimizing training time, and utilizing Neural Architecture Search tools. The final model outperformed our expectations, achieving over 95% precision and a weighted F-score of more than 90% on our test data set. This paper discusses the development, challenges, and successful outcomes of this significant advancement in the application of AI and ML to materials discovery.
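
The abstract above reports precision and a "weighted F-score" without giving the weighting. As a minimal sketch, assuming the weighted score is an F-beta score with beta below 1 (which favours precision, in line with the stated goal of minimizing false positives), the two metrics can be computed from confusion-matrix counts as follows. The function name, the choice beta = 0.5, and the counts are all hypothetical and are not taken from the paper.

```python
def precision_and_fbeta(tp, fp, fn, beta=0.5):
    """Return (precision, F-beta) from true-positive, false-positive
    and false-negative counts. beta < 1 weights precision more heavily."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, 0.0
    b2 = beta * beta
    fbeta = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, fbeta


# Hypothetical counts for illustration only (not results from the paper).
p, f = precision_and_fbeta(tp=96, fp=4, fn=16)
```

With these made-up counts the precision is 0.96, and because beta is below 1 the F-beta score stays closer to precision than to the lower recall.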

3.
J Cheminform ; 16(1): 17, 2024 Feb 16.
Article in English | MEDLINE | ID: mdl-38365691

ABSTRACT

Modern data mining techniques using machine learning (ML) and deep learning (DL) algorithms have been shown to excel in the regression-based task of materials property prediction using various materials representations. In an attempt to improve the predictive performance of the deep neural network model, researchers have tried to add more layers as well as develop new architectural components to create sophisticated and deep neural network models that can aid in the training process and improve the predictive ability of the final model. However, these modifications usually require substantial computational resources, further increasing the already long model training time, which is often infeasible and limits their use by most researchers. In this paper, we study and propose a deep neural network framework for regression-based problems comprising fully connected layers that can work with any numerical vector-based materials representations as model input. We present a novel deep regression neural network, iBRNet, with branched skip connections and multiple schedulers, which can reduce the number of parameters used to construct the model, improve the accuracy, and decrease the training time of the predictive model. We perform the model training using composition-based numerical vectors representing the elemental fractions of the respective materials and compare their performance against other traditional ML and several known DL architectures. Using multiple datasets with varying data sizes for training and testing, we show that the proposed iBRNet models outperform the state-of-the-art ML and DL models for all data sizes. We also show that the branched structure and usage of multiple schedulers lead to fewer parameters and faster model training time with better convergence than other neural networks.
Scientific contribution: The combination of multiple callback functions in deep neural networks minimizes training time and maximizes accuracy in a controlled computational environment with parametric constraints for the task of materials property prediction.

4.
Mitochondrion ; 75: 101844, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38237647

ABSTRACT

Genomic investigations on an infant who presented with a putative mitochondrial disorder led to the identification of a compound heterozygous deletion with an overlapping region of ∼142 kb encompassing two nuclear-encoded genes, ERCC8 and NDUFAF2. Investigations on fetal-derived fibroblast culture demonstrated impaired bioenergetics and mitochondrial dysfunction, which explains the phenotype and observed infant mortality in the present study. The genetic findings from this study extended the utility of whole-genome sequencing, as they led to the development of an MLPA-based assay for carrier screening in the extended family and for prenatal testing, aiding in the birth of two healthy children.


Subjects
Infant Mortality , Mitochondria , Infant , Child , Pregnancy , Female , Humans , Mitochondria/genetics , Whole Genome Sequencing , Energy Metabolism , Genomics , Transcription Factors/genetics , DNA Repair Enzymes/genetics , Molecular Chaperones/genetics , Mitochondrial Proteins/genetics
6.
Sci Rep ; 13(1): 9128, 2023 Jun 05.
Article in English | MEDLINE | ID: mdl-37277456

ABSTRACT

Modern machine learning (ML) and deep learning (DL) techniques using high-dimensional data representations have helped accelerate the materials discovery process by efficiently detecting hidden patterns in existing datasets and linking input representations to output properties for a better understanding of the scientific phenomenon. While a deep neural network comprised of fully connected layers has been widely used for materials property prediction, simply creating a deeper model with a large number of layers often runs into the vanishing gradient problem, which degrades performance and thereby limits usage. In this paper, we study and propose architectural principles to address the question of improving the performance of model training and inference under fixed parametric constraints. Here, we present a general deep-learning framework based on branched residual learning (BRNet) with fully connected layers that can work with any numerical vector-based representation as input to build accurate models to predict materials properties. We perform model training for materials properties using numerical vectors representing different composition-based attributes of the respective materials and compare the performance of the proposed models against traditional ML and existing DL architectures. We find that the proposed models are significantly more accurate than the ML/DL models for all data sizes by using different composition-based attributes as input. Further, branched learning requires fewer parameters and results in faster model training due to better convergence during the training phase than existing neural networks, thereby efficiently building accurate models for predicting materials properties.

7.
J Chem Inf Model ; 63(7): 1865-1871, 2023 Apr 10.
Article in English | MEDLINE | ID: mdl-36972592

ABSTRACT

The applications of artificial intelligence, machine learning, and deep learning techniques in the field of materials science are becoming increasingly common due to their promising abilities to extract and utilize data-driven information from available data and accelerate materials discovery and design for future applications. In an attempt to assist with this process, we deploy predictive models for multiple material properties, given the composition of the material. The deep learning models described here are built using a cross-property deep transfer learning technique, which leverages source models trained on large data sets to build target models on small data sets with different properties. We deploy these models in an online software tool that takes a number of material compositions as input, performs preprocessing to generate composition-based attributes for each material, and feeds them into the predictive models to obtain up to 41 different material property values. The material property predictor is available online at http://ai.eecs.northwestern.edu/MPpredictor.
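
The tool described above takes material compositions as input and derives composition-based attributes from them. As a rough sketch of the simplest such preprocessing step, assuming a plain formula string without parentheses, the following converts a composition into elemental fractions. The function name and the parsing rules are illustrative assumptions and do not describe the tool's actual preprocessing.

```python
import re
from collections import defaultdict


def elemental_fractions(formula):
    """Map a simple formula such as 'Fe2O3' to {'Fe': 0.4, 'O': 0.6}.

    Assumes element symbols followed by optional numeric amounts;
    parentheses and hydrates are not handled in this sketch.
    """
    counts = defaultdict(float)
    # Each match is (element symbol, amount or '' when the amount is omitted).
    for symbol, amount in re.findall(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?", formula):
        counts[symbol] += float(amount) if amount else 1.0
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no elements found in formula")
    return {element: n / total for element, n in counts.items()}


fractions = elemental_fractions("Fe2O3")
```

The resulting fractions sum to 1 and can be laid out as a fixed-length numerical vector, one slot per element, before being passed to a predictive model.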


Subjects
Artificial Intelligence , Software , Machine Learning
8.
J Hum Genet ; 68(6): 409-417, 2023 Jun.
Article in English | MEDLINE | ID: mdl-36813834

ABSTRACT

Structural variants contribute to genetic variability in human genomes and they can be present in population-specific patterns. We aimed to understand the landscape of structural variants in the genomes of healthy Indian individuals and explore their potential implications in genetic disease conditions. For the identification of structural variants, a whole genome sequencing dataset of 1029 self-declared healthy Indian individuals from the IndiGen project was analysed. Further, these variants were evaluated for potential pathogenicity and their associations with genetic diseases. We also compared our identified variations with the existing global datasets. We generated a compendium of 38,560 high-confidence structural variants in total, comprising 28,393 deletions, 5030 duplications, 5038 insertions, and 99 inversions. In particular, around 55% of these variants were found to be unique to the studied population. Further analysis revealed 134 deletions with predicted pathogenic/likely pathogenic effects, and their affected genes were mainly enriched for neurological disease conditions, such as intellectual disability and neurodegenerative diseases. The IndiGenomes dataset helped us to understand the unique spectrum of structural variants in the Indian population. More than half of the identified variants were not present in the publicly available global dataset on structural variants. Clinically important deletions identified in IndiGenomes might aid in improving the diagnosis of unsolved genetic diseases, particularly in neurological conditions. Along with basal allele frequency data and clinically important deletions, IndiGenomes data might serve as a baseline resource for future studies on genomic structural variant analysis in the Indian population.


Subjects
Asian People , Human Genome , Humans , Gene Frequency , Whole Genome Sequencing , Human Genome/genetics
9.
Sci Rep ; 12(1): 11953, 2022 Jul 13.
Article in English | MEDLINE | ID: mdl-35831344

ABSTRACT

While experiments and DFT-computations have been the primary means for understanding the chemical and physical properties of crystalline materials, experiments are expensive and DFT-computations are time-consuming and show significant discrepancies against experiments. Currently, predictive modeling based on DFT-computations has provided a rapid screening method for materials candidates for further DFT-computations and experiments; however, such models inherit the large discrepancies from the DFT-based training data. Here, we demonstrate how AI can be leveraged together with DFT to compute materials properties more accurately than DFT itself by focusing on the critical materials science task of predicting "formation energy of a material given its structure and composition". On an experimental hold-out test set containing 137 entries, AI can predict formation energy from materials structure and composition with a mean absolute error (MAE) of 0.064 eV/atom; comparing this against DFT-computations, we find that AI can significantly outperform DFT computations for the same task (discrepancies of [Formula: see text] eV/atom) for the first time.
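
The comparison in this abstract rests on the mean absolute error (MAE) between predicted and experimentally measured formation energies. A minimal sketch of that metric follows; the function name and the three value pairs are hypothetical placeholders in eV/atom and are not data from the study.

```python
def mean_absolute_error(predicted, reference):
    """Average of |predicted - reference| over paired values."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


# Hypothetical formation energies in eV/atom (illustration only).
predicted = [-1.50, -0.20, -2.10]
experimental = [-1.45, -0.30, -2.05]
mae = mean_absolute_error(predicted, experimental)
```

The same function applied to DFT-computed values against the same experimental references would give the DFT discrepancy that the abstract compares against.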


Subjects
Artificial Intelligence
10.
Hum Immunol ; 83(4): 335-345, 2022 Apr.
Article in English | MEDLINE | ID: mdl-35074268

ABSTRACT

X-linked agammaglobulinemia (XLA) is an X-linked recessive primary immunodeficiency disorder caused by a pathogenic variant in the Bruton tyrosine kinase (BTK) gene, with an incidence of 1:379,000 live births and 1:190,000 male births. Patients affected with XLA present with recurrent infections of the gastrointestinal and respiratory tracts. Here we report the first case series of 17 XLA patients from 10 South Indian families with a wide spectrum of clinical and genetic features. In our cohort, patients presented mainly with recurrent pneumonia, gastrointestinal infection, otitis media, pyoderma, abscesses, empyema, arthritis, and osteomyelitis. Using next-generation and Sanger sequencing we have identified 10 unique pathogenic and likely pathogenic variants in 17 patients. These comprise three nonsynonymous, two stop-gain, two frameshift, two structural, and one splicing variant, of which two are novel. Based on the type of variant, patients had variable clinical features and treatment responses. We have also evaluated Btk protein expression for six patients in comparison to healthy individuals and determined mosaic Btk expression patterns in four mothers. We have also performed family screening in 6 families using Sanger sequencing and identified 19 carriers for the variant. The diagnosis led to proper treatment for the patients: 15 patients were on intravenous immunoglobulin (IVIG) and the other two had successful hematopoietic stem cell transplantation (HSCT). Unfortunately, two of our patients died due to sepsis while on IVIG. We envision that the present study could help in better understanding of patients with XLA and help in family screening and prenatal diagnosis. To the best of our knowledge, this is the largest case series of patients affected with XLA from South India.


Subjects
Agammaglobulinemia , X-Linked Genetic Diseases , Agammaglobulinaemia Tyrosine Kinase/genetics , Agammaglobulinemia/diagnosis , Agammaglobulinemia/genetics , Child , X-Linked Genetic Diseases/diagnosis , X-Linked Genetic Diseases/genetics , X-Linked Genetic Diseases/therapy , Humans , Intravenous Immunoglobulins/therapeutic use , Male , Mutation
12.
J Genet Eng Biotechnol ; 19(1): 183, 2021 Dec 14.
Article in English | MEDLINE | ID: mdl-34905135

ABSTRACT

BACKGROUND: Autoinflammatory disorders are a group of inherited inflammatory disorders caused by genetic defects in the genes that regulate the innate immune system. These have been clinically characterized based on the duration and occurrence of unprovoked fever, skin rash, and the patient's ancestry. There are several autoinflammatory disorders that are found to be prevalent in a specific population and whose genetic epidemiology within the population has been well understood. However, India has a limited number of genetic studies reported for autoinflammatory disorders to date. The whole genome sequencing and analysis of 1029 Indian individuals performed under the IndiGen project prompted us to study the genetic epidemiology of the autoinflammatory disorders in India. RESULTS: We have systematically annotated the genetic variants of 56 genes implicated in autoinflammatory disorders. These genetic variants were reclassified into five categories (i.e., pathogenic, likely pathogenic, benign, likely benign, and variant of uncertain significance (VUS)) according to the American College of Medical Genetics and Association for Molecular Pathology (ACMG-AMP) guidelines. Our analysis revealed 20 pathogenic and likely pathogenic variants with significant differences in the allele frequency compared with the global population. We also found six causal founder variants in the IndiGen dataset belonging to different ancestries. We have performed haplotype prediction analysis for the founder mutation haplotypes, which reveals the admixture of the South Asian population with other populations. The cumulative carrier frequency of autoinflammatory disorders in India was found to be 3.5%, which is much higher than previously reported. CONCLUSION: With such a frequency in the Indian population, there is a great need for awareness among clinicians as well as the general public regarding autoinflammatory disorders.
To the best of our knowledge, this is the first and most comprehensive population scale genetic epidemiological study being reported from India.

13.
Nat Commun ; 12(1): 6595, 2021 Nov 15.
Article in English | MEDLINE | ID: mdl-34782631

ABSTRACT

Artificial intelligence (AI) and machine learning (ML) have been increasingly used in materials science to build predictive models and accelerate discovery. For selected properties, availability of large databases has also facilitated application of deep learning (DL) and transfer learning (TL). However, unavailability of large datasets for a majority of properties prohibits widespread application of DL/TL. We present a cross-property deep-transfer-learning framework that leverages models trained on large datasets to build models on small datasets of different properties. We test the proposed framework on 39 computational and two experimental datasets and find that the TL models with only elemental fractions as input outperform ML/DL models trained from scratch even when they are allowed to use physical attributes as input, for 27/39 (≈ 69%) computational and both the experimental datasets. We believe that the proposed framework can be widely useful to tackle the small data challenge in applying AI/ML in materials science.

14.
PLoS One ; 16(7): e0254407, 2021.
Article in English | MEDLINE | ID: mdl-34252140

ABSTRACT

X-linked agammaglobulinemia (XLA, OMIM #300755) is a primary immunodeficiency disorder caused by pathogenic variations in the BTK gene, characterized by failure of development and maturation of B lymphocytes. The estimated prevalence worldwide is 1 in 190,000 male births. Recently, genome sequencing has been widely used in difficult-to-diagnose and familial cases. We report a large Indian family suffering from XLA with five affected individuals. We performed complete blood count, immunoglobulin assay, and lymphocyte subset analysis for all patients and analyzed Btk expression for one patient and his mother. Whole exome sequencing (WES) was performed for four patients and whole genome sequencing (WGS) for two patients. Carrier screening was done for 17 family members using Multiplex Ligation-dependent Probe Amplification (MLPA), and haplotype ancestry mapping was performed using fineSTRUCTURE. All patients had hypogammaglobulinemia and low CD19+ B cells. One patient who underwent Btk estimation had low expression and his mother showed a mosaic pattern. We could not identify any single nucleotide variants or small insertions/deletions from the WES dataset that correlated with the clinical features of the patients. Structural variant analysis of the WGS data identified a novel large deletion of 5,296 bp at locus chrX:100,624,323-100,629,619 encompassing exons 3-5 of the BTK gene. Family screening revealed seven carriers for the deletion. Two patients had a successful HSCT. Haplotype mapping revealed a South Asian ancestry. WGS led to identification of the accurate genetic mutation, which could help in early diagnosis leading to improved outcomes, prevention of permanent organ damage and improved quality of life, as well as enabling genetic counselling and prenatal diagnosis in the family.
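
The deletion size quoted in this abstract can be checked against the genomic interval it gives. The sketch below parses the region string and takes the difference of the two coordinates; the helper name is hypothetical, and reading the length as end minus start (rather than an inclusive count, which would be one base longer) is an assumption about the coordinate convention, which the abstract does not state.

```python
def parse_region(region):
    """Split 'chrX:100,624,323-100,629,619' into (chrom, start, end)."""
    chrom, span = region.split(":")
    start, end = (int(part.replace(",", "")) for part in span.split("-"))
    return chrom, start, end


chrom, start, end = parse_region("chrX:100,624,323-100,629,619")
# Assumed convention: length = end - start, matching the reported size.
deletion_length = end - start
```

Under that convention the interval spans 5,296 bp, which agrees with the size reported in the abstract.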


Subjects
Agammaglobulinemia/genetics , DNA Mutational Analysis/methods , Exome Sequencing/methods , Exome/genetics , Exons/genetics , Flow Cytometry , Haplotypes/genetics , Hematopoietic Stem Cell Transplantation , Humans , Male , Mutation/genetics
15.
Pharmacogenomics ; 22(10): 603-618, 2021 Jul.
Article in English | MEDLINE | ID: mdl-34142560

ABSTRACT

Aim: Numerous drugs are being widely prescribed for COVID-19 treatment without any direct evidence for the drug safety/efficacy in patients across diverse ethnic populations. Materials & methods: We analyzed whole genomes of 1029 Indian individuals (IndiGen) to understand the extent of drug-gene (pharmacogenetic), drug-drug and drug-drug-gene interactions associated with COVID-19 therapy in the Indian population. Results: We identified 30 clinically significant pharmacogenetic variants and 73 predicted deleterious pharmacogenetic variants. COVID-19-associated pharmacogenes overlapped substantially with those of metabolic disorder therapeutics. CYP3A4, ABCB1 and ALB are the most shared pharmacogenes. Fifteen COVID-19 therapeutics were predicted as likely drug-drug interaction candidates when used with four CYP inhibitor drugs. Conclusion: Our findings provide actionable insights for future validation studies and improved clinical decisions for COVID-19 therapy in Indians.


Subjects
COVID-19 Drug Treatment , COVID-19/genetics , Antiviral Agents/therapeutic use , Asian People , Drug Interactions/genetics , Genome/genetics , Genotype , Humans , India , Pharmacogenetics/methods , Pharmacogenomic Testing/methods , Pharmacogenomic Variants/genetics , SARS-CoV-2/drug effects
17.
Adv Genet ; 107: 121-152, 2021.
Article in English | MEDLINE | ID: mdl-33641745

ABSTRACT

Human migration and community-specific cultural practices have contributed to founder events and enrichment of the variants associated with genetic diseases. While many founder events in isolated populations have remained uncharacterized, the application of genomics in clinical settings as well as for population scale studies in recent years has provided an unprecedented push towards identification of founder variants associated with human health and disease. The discovery and characterization of founder variants could have far-reaching implications not only in understanding the history or genealogy of the disease, but also in implementing evidence-based policies and genetic testing frameworks. This further enables precise diagnosis and prevention in an attempt towards precision medicine. This review provides an overview of founder variants along with methods and resources cataloging them. We have also discussed the public health implications and examples of prevalent disease-associated founder variants in specific populations.


Subjects
Genetic Databases , Founder Effect , Mutation , Finland , Inborn Genetic Diseases/genetics , Genetic Markers , Population Genetics , Human Genome , Humans , Precision Medicine/methods , Public Health
18.
Sci Rep ; 11(1): 4244, 2021 Feb 19.
Article in English | MEDLINE | ID: mdl-33608599

ABSTRACT

The application of machine learning (ML) techniques in materials science has attracted significant attention in recent years, due to their impressive ability to efficiently extract data-driven linkages from various input materials representations to their output properties. While the application of traditional ML techniques has become quite ubiquitous, there have been limited applications of more advanced deep learning (DL) techniques, primarily because big materials datasets are relatively rare. Given the demonstrated potential and advantages of DL and the increasing availability of big materials datasets, it is attractive to go for deeper neural networks in a bid to boost model performance, but in reality, it leads to performance degradation due to the vanishing gradient problem. In this paper, we address the question of how to enable deeper learning for cases where big materials data is available. Here, we present a general deep learning framework based on Individual Residual learning (IRNet) composed of very deep neural networks that can work with any vector-based materials representation as input to build accurate property prediction models. We find that the proposed IRNet models can not only successfully alleviate the vanishing gradient problem and enable deeper learning, but also lead to significantly (up to 47%) better model accuracy as compared to plain deep neural networks and traditional ML techniques for a given input materials representation in the presence of big data.
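
The residual learning idea this abstract relies on can be illustrated with a single fully connected layer wrapped in a shortcut connection, so that the layer's input is added back to its output. The sketch below is a pure-Python illustration under simplifying assumptions (square weight matrix, no batch normalization, no training loop); the function names are hypothetical and this is not the authors' implementation.

```python
def relu(vector):
    """Element-wise max(0, x)."""
    return [max(0.0, x) for x in vector]


def dense(vector, weights, bias):
    """Fully connected layer: one output per weight row, W v + b."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def residual_block(vector, weights, bias):
    """Return ReLU(W v + b) + v. The added input is the shortcut that
    lets signal (and gradients, during training) bypass the layer."""
    transformed = relu(dense(vector, weights, bias))
    return [h + x for h, x in zip(transformed, vector)]


x = [1.0, -2.0]
zero_w = [[0.0, 0.0], [0.0, 0.0]]
identity_w = [[1.0, 0.0], [0.0, 1.0]]
passthrough = residual_block(x, zero_w, [0.0, 0.0])
combined = residual_block(x, identity_w, [0.0, 0.0])
```

With all-zero weights the block returns its input unchanged, which shows why stacking many such blocks does not by itself degrade the signal in the way a plain deep stack of layers can.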

19.
Nucleic Acids Res ; 49(D1): D1225-D1232, 2021 Jan 08.
Article in English | MEDLINE | ID: mdl-33095885

ABSTRACT

With the advent of next-generation sequencing, large-scale initiatives for mining whole genomes and exomes have been employed to better understand global or population-level genetic architecture. India encompasses more than 17% of the world population with extensive genetic diversity, but is under-represented in the global sequencing datasets. This gave us the impetus to perform and analyze the whole genome sequencing of 1029 healthy Indian individuals under the pilot phase of the 'IndiGen' program. We generated a compendium of 55,898,122 single allelic genetic variants from geographically distinct Indian genomes and calculated the allele frequency, allele count, allele number, along with the number of heterozygous or homozygous individuals. In the present study, these variants were systematically annotated using publicly available population databases and can be accessed through a browsable online database named 'IndiGenomes' http://clingen.igib.res.in/indigen/. The IndiGenomes database will help clinicians and researchers in exploring the genetic component underlying medical conditions. To date, this is the most comprehensive genetic variant resource for the Indian population and is made freely available for academic utility. The resource has also been accessed extensively by the worldwide community since its launch.
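
The per-variant statistics named in this abstract (allele count, allele number, allele frequency, and heterozygous or homozygous individuals) follow directly from genotype calls. The sketch below assumes diploid, biallelic genotypes encoded as pairs of 0 (reference) and 1 (alternate), with None for missing calls; the function name and the example genotypes are hypothetical and do not reproduce the IndiGenomes pipeline.

```python
def allele_stats(genotypes):
    """Summarise one biallelic site from diploid genotype pairs.

    genotypes: list of (a1, a2), each allele 0, 1, or None when missing.
    Samples with any missing allele are excluded from all counts.
    """
    called = [g for g in genotypes if None not in g]
    allele_number = 2 * len(called)              # AN: alleles observed
    allele_count = sum(a + b for a, b in called)  # AC: alternate alleles
    het = sum(1 for a, b in called if a != b)
    hom_alt = sum(1 for a, b in called if a == b == 1)
    frequency = allele_count / allele_number if allele_number else 0.0
    return {"AC": allele_count, "AN": allele_number, "AF": frequency,
            "het": het, "hom_alt": hom_alt}


# Five hypothetical individuals, one with a missing call.
stats = allele_stats([(0, 0), (0, 1), (1, 1), (None, None), (0, 1)])
```

In this made-up example four individuals are called, giving eight alleles of which four are alternate, so the allele frequency is 0.5 with two heterozygotes and one alternate homozygote.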


Subjects
Genetic Databases , Genetic Variation , Human Genome , Human Genome Project , Software , Adult , Exome , Female , Population Genetics/statistics & numerical data , Humans , India , Internet , Male , Molecular Sequence Annotation , Whole Genome Sequencing
...