1.
Complex Psychiatry ; 8(1-2): 35-46, 2022 Sep.
Article En | MEDLINE | ID: mdl-36407771

Introduction: Genome-wide association studies (GWAS) have played a critical role in identifying many thousands of loci associated with complex phenotypes and diseases. This has led to several translations of novel disease susceptibility genes into drug targets and care. This, however, has not been the case for analyses where sample sizes are small, which suffer from multiple comparisons testing. The present study examined the statistical impact of combining a burden test methodology, PrediXcan, with a multimodel meta-analysis, cross phenotype association (CPASSOC). Methods: The analysis was conducted on 5 addiction traits (family alcoholism, cannabis craving, and alcohol, nicotine, and cannabis dependence) and 10 brain tissues (anterior cingulate cortex BA24, cerebellar hemisphere, cortex, hippocampus, nucleus accumbens basal ganglia, caudate basal ganglia, cerebellum, frontal cortex BA9, hypothalamus, and putamen basal ganglia). Our sample consisted of 1,640 participants from the University of California, San Francisco (UCSF) Family Alcoholism Study. Genotypes were obtained through low-pass whole-genome sequencing and the use of Thunder, a linkage disequilibrium variant caller. Results: The post-PrediXcan gene-phenotype association without aggregation yielded 2 significant results, HCG27 and SPPL2B. Aggregating across phenotypes resulted in no significant findings. Aggregating across tissues resulted in 15 significant and 5 suggestive associations: PPIE, RPL36AL, FOXN2, MTERF4, SEPTIN2, CIAO3, RPL36AL, ZNF304, CCDC66, SSPOP, SLC7A9, LY75, MTRF1L, COA5, and RRP7A; RPS23, GNMT, ERV3-1, APIP, and HLA-B, respectively. Discussion: Given the relatively small size of the cohort, this multimodel approach was able to find over a dozen significant associations between predicted gene expression and addiction traits. Of our findings, 8 had prior associations with similar phenotypes through investigation of the GWAS Atlas. With the advent of improved transcriptome data, this approach should increase in efficacy.

2.
J Phys Condens Matter ; 34(34), 2022 Jun 24.
Article En | MEDLINE | ID: mdl-35705073

Relativistic calculations of the structural and spectral properties of the PbO molecule can provide fundamental information about the importance of a proper treatment of angular momentum coupling among electrons in order to achieve accurate computational results for spectral properties. Specifically, the nature of these couplings in PbO is expected to be intermediate between the LS- and jj-coupling limits because of its light/heavy element composition. This article reports potential energy curves, transition energies, electric dipole transition moments, permanent dipole moments and spectroscopic constants of PbO calculated using a multireference single plus double excitations spin-orbit configuration interaction approach in the context of relativistic effective core potentials and their concomitant spin-orbit coupling operators. The calculated results are in general agreement with both available experimental results and earlier calculations. New values for properties of excited states are also reported. It is noteworthy that certain properties show larger deviations from previous calculations. These deviations are attributed to direct and indirect relativistic effects resulting from diatomic electron-electron angular momentum coupling effects, which are included consistently in the calculations reported herein.

3.
Sci Rep ; 12(1): 5440, 2022 03 31.
Article En | MEDLINE | ID: mdl-35361850

Regularized regression analysis is a mature analytic approach to identify weighted sums of variables predicting outcomes. We present a novel Coarse Approximation Linear Function (CALF) to frugally select important predictors and build simple but powerful predictive models. CALF is a linear regression strategy applied to normalized data that uses nonzero weights +1 or -1. Qualitative (linearly invariant) metrics to be optimized can be (for binary response) Welch (Student) t-test p-value or area under curve (AUC) of receiver operating characteristic, or (for real response) Pearson correlation. Predictor weighting is critically important when developing risk prediction models. While counterintuitive, it is a fact that qualitative metrics can favor CALF with ±1 weights over algorithms producing real number weights. Moreover, while regression methods may be expected to change most or all weight values upon even small changes in input data (e.g., discarding a single subject of hundreds), CALF weights generally do not so change. Similarly, some regression methods applied to collinear or nearly collinear variables yield unpredictable magnitude or direction (in p-space) of the weights as a vector. In contrast, if some predictors are linearly dependent or nearly so, CALF simply chooses at most one (the most informative, if any) and ignores the others, thus avoiding the inclusion of two or more collinear variables in the model.
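
The idea of restricting weights to +1 or -1 can be sketched as follows. This is a minimal illustration, assuming a greedy forward search that maximizes Pearson correlation for a real-valued response. The search order, the stopping rule, the `calf_greedy` name, and the `max_terms` parameter are assumptions made for illustration and are not taken from the abstract or from the authors' software.

```python
import numpy as np

def calf_greedy(X, y, max_terms=5):
    """Hypothetical sketch: pick predictors with weights +1/-1 to maximize Pearson r."""
    # Normalize each column (z-scores) so unit weights act on comparable scales.
    # Assumes no column is constant (nonzero standard deviation).
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    weights = np.zeros(X.shape[1], dtype=int)  # 0 means "predictor not used"
    score = np.zeros(X.shape[0])               # running weighted sum
    best = 0.0                                 # best correlation found so far
    for _ in range(max_terms):
        choice = None
        # Try every unused predictor with both signs.
        for j in np.flatnonzero(weights == 0):
            for w in (1, -1):
                r = np.corrcoef(score + w * Z[:, j], y)[0, 1]
                # Accept only a strict improvement (small tolerance for rounding).
                if r > best + 1e-12:
                    best, choice = r, (j, w)
        if choice is None:
            break  # no remaining predictor improves the metric
        j, w = choice
        weights[j] = w
        score = score + w * Z[:, j]
    return weights, best

# Toy usage on synthetic data (assumed, for demonstration only).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 4))
y_demo = X_demo[:, 0] - X_demo[:, 2] + 0.1 * rng.normal(size=200)
w_demo, r_demo = calf_greedy(X_demo, y_demo)
print(w_demo, round(r_demo, 3))
```

Because a candidate is accepted only when it strictly improves the metric, an exact duplicate of an already informative predictor tends to be left out in this sketch, which loosely mirrors the collinearity behavior described above.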


Algorithms , Area Under Curve , Humans , Linear Models , ROC Curve
4.
Bioinformatics ; 36(11): 3522-3527, 2020 06 01.
Article En | MEDLINE | ID: mdl-32176244

MOTIVATION: Low-dimensional representations of high-dimensional data are routinely employed in biomedical research to visualize, interpret and communicate results from different pipelines. In this article, we propose a novel procedure to directly estimate t-SNE embeddings that are not driven by batch effects. Without correction, interesting structure in the data can be obscured by batch effects. The proposed algorithm can therefore significantly aid visualization of high-dimensional data. RESULTS: The proposed methods are based on linear algebra and constrained optimization, leading to efficient algorithms and fast computation in many high-dimensional settings. Results on artificial single-cell transcription profiling data show that the proposed procedure successfully removes multiple batch effects from t-SNE embeddings, while retaining fundamental information on cell types. When applied to single-cell gene expression data to investigate mouse medulloblastoma, the proposed method successfully removes batch effects related to mouse identifiers and the date of the experiment, while preserving clusters of oligodendrocytes, astrocytes, and endothelial cells and microglia, which are expected to lie in the stroma within or adjacent to the tumours. AVAILABILITY AND IMPLEMENTATION: Source code implementing the proposed approach is available as an R package at https://github.com/emanuelealiverti/BC_tSNE, including a tutorial to reproduce the simulation studies. CONTACT: aliverti@stat.unipd.it.


Endothelial Cells , Software , Algorithms , Animals , Gene Expression , Gene Expression Profiling , Mice
5.
PLoS One ; 10(12): e0142360, 2015.
Article En | MEDLINE | ID: mdl-26625115

Although 24 Alzheimer's disease (AD) risk loci have been reliably identified, a large portion of the predicted heritability for AD remains unexplained. It is expected that additional loci of small effect will be identified with an increased sample size. However, the cost of a significant increase in case-control sample size is prohibitive. The current study tests whether exploring the genetic basis of endophenotypes, in this case based on putative blood biomarkers for AD, can accelerate the identification of susceptibility loci using modest sample sizes. Each endophenotype was used as the outcome variable in an independent GWAS. Endophenotypes were based on circulating concentrations of proteins that contributed significantly to a published blood-based predictive algorithm for AD. Endophenotypes included Monocyte Chemoattractant Protein 1 (MCP1), Vascular Cell Adhesion Molecule 1 (VCAM1), Pancreatic Polypeptide (PP), Beta2 Microglobulin (B2M), Factor VII (F7), Adiponectin (ADN) and Tenascin C (TN-C). Across the seven endophenotypes, 47 SNPs were associated with outcome with a p-value ≤ 1×10⁻⁷. Each signal was further characterized with respect to known genetic loci associated with AD. Signals for several endophenotypes were observed in the vicinity of CR1, MS4A6A/MS4A4E, PICALM, CLU, and PTK2B. The strongest signal was observed in association with Factor VII levels and was located within the F7 gene. Additional signals were observed in MAP3K13, ZNF320, ATP9B and TREM1. Conditional regression analyses suggested that the SNPs contributed to variation in protein concentration independent of AD status. The identification of two putatively novel AD loci (in the Factor VII and ATP9B genes), which have not been located in previous studies despite massive sample sizes, highlights the benefits of an endophenotypic approach for resolving the genetic basis for complex diseases. The coincidence of several of the endophenotypic signals with known AD loci may point to novel genetic interactions and should be further investigated.


Alzheimer Disease/blood , Alzheimer Disease/genetics , Genetic Loci/genetics , Genetic Predisposition to Disease/genetics , Aged , Alzheimer Disease/diagnosis , Biomarkers/blood , Endophenotypes , Female , Genomics , Humans , Male , Polymorphism, Single Nucleotide , Regression Analysis
6.
Channels (Austin) ; 5(4): 325-43, 2011.
Article En | MEDLINE | ID: mdl-21918370

Identification of bacterial and archaeal counterparts to eukaryotic ion channels has greatly facilitated studies of structural biophysics of the channels. Often, searches based only on sequence alignment tools are inadequate for discovering such distant bacterial and archaeal counterparts. We address the discovery of bacterial and archaeal members of the Pentameric Ligand-Gated Ion Channel (pLGIC) family by a combination of four computational methods. One domain-based method involves retrieval of proteins with pLGIC-relevant domains by matching those domains to previously established domain templates in the InterPro family of databases. The second domain-based method involves searches using ungapped de novo motifs discovered by MEME, which were trained with well-characterized members of the pLGIC family. The third and fourth methods involve the use of two sequence alignment search algorithms, BLASTp and psiBLAST, respectively. The sequences returned from all methods were screened by having the correct topology for pLGICs, and by returning an annotated member of this family as one of the first ten hits using BLASTp against a comprehensive database of eukaryotic proteins. We found the domain-based searches to have high specificity but low sensitivity, while the sequence alignment methods have higher sensitivity but lower specificity. The four methods together discovered 69 putative bacterial and archaeal members of the pLGIC family. We ranked and divided the 69 proteins into groups according to the similarity of their domain compositions with known eukaryotic pLGICs. One especially notable group is more closely related to eukaryotic pLGICs than to any other known protein family, and has the overall topology of pLGICs, but the functional domains they contain are sufficiently different from those found in known pLGICs that they do not score very well against the pLGIC domain templates. We conclude that multiple methods used in a coordinated fashion outperform any single method for identifying likely distant bacterial and archaeal proteins that may provide useful models for important eukaryotic channel function. We note also that the methods used here are largely standard and readily accessible. The novelty is in the effectiveness of a strategy that combines these methods for identifying bacterial and archaeal relatives of this family. Therefore the paper may serve as a template for a broad group of workers to reliably identify bacterial and archaeal counterparts to eukaryotic proteins.
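
As a rough illustration of the combine-then-screen strategy described in this abstract, the sketch below pools candidate sequences from four search methods and keeps only those that pass both screens. The `combine_and_screen` helper, the sequence identifiers, and the toy data are all hypothetical and are not taken from the paper.

```python
def combine_and_screen(hits_by_method, has_plgic_topology, top10_has_family_member):
    """Pool hits from every method, then keep those passing both screens."""
    # Union of the candidates returned by each search method.
    pooled = set().union(*hits_by_method.values())
    # Screen 1: correct topology. Screen 2: an annotated family member
    # appears among the first ten BLASTp hits against eukaryotic proteins.
    return {
        seq for seq in pooled
        if has_plgic_topology(seq) and top10_has_family_member(seq)
    }

# Hypothetical toy inputs: method name -> candidate sequence IDs.
hits = {
    "interpro": {"seqA", "seqB"},
    "meme": {"seqB", "seqC"},
    "blastp": {"seqC", "seqD", "seqE"},
    "psiblast": {"seqD", "seqF"},
}
# Hypothetical screen results, stored as sets of passing IDs.
topology_ok = {"seqA", "seqB", "seqC", "seqD", "seqF"}
family_in_top10 = {"seqA", "seqB", "seqD", "seqE"}

kept = combine_and_screen(
    hits, topology_ok.__contains__, family_in_top10.__contains__
)
print(sorted(kept))
```

Pooling before screening lets the more sensitive alignment searches contribute candidates that the domain-based searches miss, while the two screens filter out candidates that the less specific methods let through.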


Archaea/genetics , Archaeal Proteins/genetics , Bacteria/genetics , Bacterial Proteins/genetics , Animals , Databases, Protein , Humans , Sequence Alignment , Sequence Analysis, Protein , Sequence Homology, Amino Acid
7.
PLoS One ; 5(10): e12827, 2010 Oct 06.
Article En | MEDLINE | ID: mdl-20949136

Voltage-gated and ligand-gated ion channels are used in eukaryotic organisms for the purpose of electrochemical signaling. There are prokaryotic homologues to major eukaryotic channels of these sorts, including voltage-gated sodium, potassium, and calcium channels, ACh-receptor and glutamate-receptor channels. The prokaryotic homologues have been less well characterized functionally than their eukaryotic counterparts. In this study we identify likely prokaryotic functional counterparts of eukaryotic glutamate receptor channels by comprehensive analysis of the prokaryotic sequences in the context of known functional domains present in the eukaryotic members of this family. In particular, we searched the nonredundant protein database for all proteins containing the following motif: the two sections of the extracellular glutamate binding domain flanking two transmembrane helices. We discovered 100 prokaryotic sequences containing this motif, with a wide variety of functional annotations. Two groups within this family have the same topology as eukaryotic glutamate receptor channels. Group 1 has a potassium-like selectivity filter. Group 2 is most closely related to eukaryotic glutamate receptor channels. We present analysis of the functional domain architecture for the group of 100, a putative phylogenetic tree, comparison of the protein phylogeny with the corresponding species phylogeny, consideration of the distribution of these proteins among classes of prokaryotes, and orthologous relationships between prokaryotic and human glutamate receptor channels. We introduce a construct called the Evolutionary Domain Network, which represents a putative pathway of domain rearrangements underlying the domain composition of present channels. We believe that scientists interested in ion channels in general, and ligand-gated ion channels in particular, will be interested in this work. The work should also be of interest to bioinformatics researchers who are interested in the use of functional domain-based analysis in evolutionary and functional discovery.


Ion Channels/metabolism , Receptors, Glutamate/metabolism , Amino Acid Sequence , Ion Channels/chemistry , Ion Channels/genetics , Molecular Sequence Data , Phylogeny , Prokaryotic Cells , Receptors, Glutamate/chemistry , Receptors, Glutamate/genetics
8.
J Phys Chem A ; 109(5): 807-15, 2005 Feb 10.
Article En | MEDLINE | ID: mdl-16838951

Large molecular clusters can be considered as intermediate states between gas and condensed phases, and information about them can help us understand condensed phases. In this paper, ab initio quantum mechanical methods have been used to examine clusters formed of methanol and water molecules. The main goal was to obtain information about the intermolecular interactions and the structure of methanol/water clusters at the molecular level. The large clusters (CH₄O···(H₂O)₁₂ and H₂O···(CH₄O)₁₀) containing one molecule of one component (methanol or water) and many (12, 10) molecules of the other component were considered. Møller-Plesset perturbation theory (MP2) was used in the calculations. Several representative cluster geometries were optimized, and nearest-neighbor interaction energies were calculated for the geometries obtained in the first step. The results of the calculations were compared to the available experimental information regarding the liquid methanol/water mixtures and to the molecular dynamics and Monte Carlo simulations, and good agreement was found. For the CH₄O···(H₂O)₁₂ cluster, it was shown that the molecules of water can be subdivided into two classes: (i) H bonded to the central methanol molecule and (ii) not H bonded to the central methanol molecule. As expected, these two classes exhibited striking energy differences. Although they are located almost the same distance from the carbon atom of the central methanol molecule, they possess very different intermolecular interaction energies with the central molecule. The H bonding constitutes a dominant factor in the hydration of methanol in dilute aqueous solutions. For the H₂O···(CH₄O)₁₀ cluster, it was shown that the central molecule of water has almost three H bonds with the methanol molecules; this result differs from those in the literature that concluded that the average number of H bonds between a central water molecule and methanol molecules in dilute solutions of water in methanol is about two, with the water molecules being incorporated into the chains of methanol. In contrast, the present predictions revealed that the central water molecule is not incorporated into a chain of methanol molecules, but it can be the center of several (2-3) chains of methanol molecules. The molecules of methanol, which are not H bonded to the central water molecule, have characteristics similar to those of the methane molecules around a central water molecule in the H₂O···(CH₄)₁₀ cluster. The ab initio quantum mechanical methods employed in this paper have provided detailed information about the H bonds in the clusters investigated. In particular, they provided full information about two types of H bonds between water and methanol molecules (in which the water or the methanol molecule is the proton donor), including information about their energies and lengths. The average numbers of the two types of H bonds in the CH₄O···(H₂O)₁₂ and H₂O···(CH₄O)₁₀ clusters have been calculated. Such information could hardly be obtained with the simulation methods.


Methanol/chemistry , Water/chemistry , Computer Simulation , Hydrogen Bonding , Indicator Dilution Techniques , Models, Molecular , Molecular Conformation
9.
J Chem Phys ; 121(12): 5661-75, 2004 Sep 22.
Article En | MEDLINE | ID: mdl-15366990

A valence full configuration interaction study with a polarized double-zeta quality basis set has been carried out for the lowest 49 electronic states of AmCl⁺. The calculations use a pseudopotential treatment for the core electrons and incorporate a one-electron spin-orbit interaction operator. Electrons in the valence s, p, d, and f subshells were included in the active space. The resulting electronic potential energy curves are largely repulsive. The chemical bonding is ionic in character with negligible participation of 5f electrons. The molecular f-f spectroscopy of AmCl⁺ arises essentially from an in situ Am²⁺ core with states slightly redshifted by the presence of chloride ion. Am⁺ + Cl asymptotes which give rise to the few attractive potential energy curves can be predicted by analysis of the f-f spectroscopy of isolated Am⁺ and Am²⁺. The attractive curves have substantial binding energies, on the order of 75-80 kcal/mol, and are noticeably lower than recent indirect measurements on the isovalent EuCl⁺. An independent empirical correlation supports the predicted reduction in AmCl⁺ binding energy. The energies of the repulsive curves are strongly dependent on the selection of the underlying atomic orbitals while the energies of the attractive curves do not display this sensitivity. The calculations were carried out using our recently developed parallel spin-orbit configuration interaction software.

...