Results 1 - 20 of 41
1.
Brief Bioinform ; 25(3), 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38557677

ABSTRACT

Protein design is central to nearly all protein engineering problems, as it can enable the creation of proteins with new biological functions, such as improving the catalytic efficiency of enzymes. One key facet of protein design, fixed-backbone protein sequence design, seeks to design new sequences that will conform to a prescribed protein backbone structure. Nonetheless, existing sequence design methods have limitations, such as low sequence diversity and shortcomings in experimental validation of the designed functional proteins. These inadequacies obstruct the goal of functional protein design. To address these limitations, we developed the Graphormer-based Protein Design (GPD) model. This model applies the Transformer to a graph-based representation of three-dimensional protein structures and adds Gaussian noise and random sequence masks to the node features, thereby enhancing sequence recovery and diversity. The performance of the GPD model was significantly better than that of the state-of-the-art ProteinMPNN model on multiple independent tests, especially for sequence diversity. We employed GPD to design CalB hydrolase and generated nine artificially designed CalB proteins. The results show a 1.7-fold increase in catalytic activity compared to that of the wild-type CalB and strong substrate selectivity on p-nitrophenyl acetate with different carbon chain lengths (C2-C16). Thus, the GPD method could be used for the de novo design of industrial enzymes and protein drugs. The code was released at https://github.com/decodermu/GPD.


Subject(s)
Protein Engineering , Proteins , Proteins/chemistry , Amino Acid Sequence , Protein Engineering/methods
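
The GPD abstract above mentions adding Gaussian noise and random masks to node features. The sketch below illustrates that general idea only; it is not the authors' implementation. The function name `perturb_node_features`, the parameters `noise_std` and `mask_rate`, and the choice to zero out masked residues are all assumptions made for illustration.

```python
import numpy as np

def perturb_node_features(features, noise_std=0.1, mask_rate=0.15, rng=None):
    """Return a noisy copy of per-residue features plus the random mask used.

    features: (n_residues, n_features) array of node features.
    noise_std, mask_rate: hypothetical hyperparameters, not from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Additive Gaussian noise on every feature (creates a new array)
    noisy = features + rng.normal(0.0, noise_std, size=features.shape)
    # Pick residues at random and blank out their features
    mask = rng.random(features.shape[0]) < mask_rate  # True = masked residue
    noisy[mask] = 0.0
    return noisy, mask
```

In a training loop, a fresh mask and fresh noise would typically be drawn for every batch, so the model never sees exactly the same input twice. How GPD actually applies these perturbations is documented in the linked repository, not here.
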
2.
Brief Bioinform ; 25(1), 2023 11 22.
Article in English | MEDLINE | ID: mdl-38018910

ABSTRACT

The biological function of proteins is determined not only by their static structures but also by the dynamic properties of their conformational ensembles. Numerous high-accuracy static structure prediction tools have recently been developed based on deep learning; however, there remains a lack of efficient and accurate methods for exploring protein dynamic conformations. Traditionally, studies of protein dynamics have relied on molecular dynamics (MD) simulations, which incur significant computational costs for all-atom precision and struggle to adequately sample conformational spaces with high energy barriers. To overcome these limitations, various enhanced sampling techniques have been developed to accelerate sampling in MD. Traditional enhanced sampling approaches such as replica exchange molecular dynamics (REMD) and frontier expansion sampling (FEXS) often follow the MD simulation approach and still require substantial computational resources and time. Variational autoencoders (VAEs), a classic deep generative model, are not restricted by potential energy landscapes and can explore conformational spaces more efficiently than traditional methods. However, VAEs often struggle to generate reasonable conformations for complex proteins, especially intrinsically disordered proteins (IDPs), which limits their application as an enhanced sampling method. In this study, we present a novel deep learning model (named Phanto-IDP) that uses a graph-based encoder to extract protein features and a transformer-based decoder combined with variational sampling to generate highly accurate protein backbones. Ten IDPs and four structured proteins were used to evaluate the sampling ability of Phanto-IDP. The results demonstrate that Phanto-IDP achieves high fidelity and diversity in the generated conformational ensembles, making it a suitable tool for enhancing the efficiency of MD simulation and for generating a broader protein conformational space and a continuous protein transition path.


Subject(s)
Intrinsically Disordered Proteins , Intrinsically Disordered Proteins/chemistry , Protein Conformation , Molecular Dynamics Simulation , Protein Domains
3.
Biophys J ; 123(10): 1253-1263, 2024 May 21.
Article in English | MEDLINE | ID: mdl-38615193

ABSTRACT

Disordered proteins are conformationally flexible proteins that are biologically important and have been implicated in devastating diseases such as Alzheimer's disease and cancer. Unlike stably folded structured proteins, disordered proteins sample a range of different conformations that need to be accounted for. Here, we treat disordered proteins as polymer chains and compute a dimensionless quantity called the instantaneous shape ratio ($R_s$), defined as $R_s = R_{ee}^2 / R_g^2$, where $R_{ee}$ is the end-to-end distance and $R_g$ is the radius of gyration. Extended protein conformations tend to have high $R_{ee}$ compared with $R_g$, and thus have high $R_s$ values, whereas compact conformations have smaller $R_s$ values. We use a scatter plot of $R_s$ (representing shape) against $R_g$ (representing size) as a simple map of conformational landscapes. We first examine the conformational landscape of simple polymer models such as Random Walk, Self-Avoiding Walk, and Gaussian Walk (GW), and we notice that all protein/polymer maps lie within the boundaries of the GW map. We thus use the GW map as a reference and, to assess conformational diversity, we compute the fraction of the GW conformations ($f_C$) covered by each protein/polymer. Disordered proteins all have high $f_C$ scores, consistent with their disordered nature. Each disordered protein accesses a different region of the reference map, revealing differences in their conformational ensembles. We additionally examine the conformational maps of the nonviral gene delivery vector polyethyleneimine at various protonation states, and find that they resemble disordered proteins, with coverage of the reference map decreasing with increasing protonation state, indicating decreasing conformational diversity. We propose that our method of combining $R_s$ and $R_g$ in a scatter plot generates a simple, meaningful map of the conformational landscape of a disordered protein, which in turn can be used to assess conformational diversity of disordered proteins.


Subject(s)
Intrinsically Disordered Proteins , Protein Conformation , Intrinsically Disordered Proteins/chemistry , Models, Molecular , Polymers/chemistry
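
The shape ratio defined in the abstract above (squared end-to-end distance divided by squared radius of gyration) is straightforward to compute for a single conformation. The following is a minimal sketch, assuming each conformation is given as an array of bead coordinates with equal masses; the function name `shape_ratio` and the rod test case are illustrative choices, not taken from the paper.

```python
import numpy as np

def shape_ratio(coords):
    """Return (Rs, Rg) for one conformation given as an (n_beads, 3) array."""
    coords = np.asarray(coords, dtype=float)
    # Squared end-to-end distance between the first and last bead
    ree2 = np.sum((coords[-1] - coords[0]) ** 2)
    # Squared radius of gyration about the centroid (equal bead masses assumed)
    centered = coords - coords.mean(axis=0)
    rg2 = np.mean(np.sum(centered ** 2, axis=1))
    return ree2 / rg2, np.sqrt(rg2)

# Fully extended rod: 11 beads on a line with unit spacing
rod = np.column_stack([np.arange(11.0), np.zeros(11), np.zeros(11)])
rs, rg = shape_ratio(rod)  # rs is 10 for this rod (100 / 10)
```

Applying `shape_ratio` to every frame of an ensemble and scattering the resulting pairs gives the kind of shape-versus-size map the abstract describes. As a sanity check on a Gaussian-walk reference, the ratio of the ensemble averages of the two squared quantities approaches 6 for long ideal chains, although individual frames scatter widely around that value.
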
4.
Arterioscler Thromb Vasc Biol ; 43(10): 1867-1886, 2023 10.
Article in English | MEDLINE | ID: mdl-37589134

ABSTRACT

BACKGROUND: Tertiary lymphoid organs (TLOs) are ectopic lymphoid organs that develop in nonlymphoid tissues with chronic inflammation, but little is known about their existence in different types of vascular diseases or the mechanisms that mediate their development. METHODS: To take advantage of single-cell RNA sequencing techniques, we integrated 28 single-cell RNA sequencing data sets covering 5 vascular disease models (atherosclerosis, abdominal aortic aneurysm, intimal hyperplasia, isograft, and allograft) to systematically explore the existence of TLOs and the environment supporting their growth. We also searched Medline, Embase, PubMed, and Web of Science from inception to January 2022 for published histological images of vascular remodeling as histological evidence of TLO genesis. RESULTS: Accumulation and infiltration of innate and adaptive immune cells were observed in various remodeling vessels. Interestingly, the proportion of such immune cells increases incrementally from atherosclerosis to intimal hyperplasia, abdominal aortic aneurysm, isograft, and allograft. Importantly, we found that TLO structure cells, such as follicular helper T cells and germinal center B cells, are present in all remodeled vessels. Among myeloid cells and lymphocytes, inflammatory macrophages and T helper 17 cells are the major lymphoid tissue inducer cells, and they were positively associated with the numbers of TLO structural cells in remodeled vessels. Vascular stromal cells also actively participate in vascular TLO genesis by communicating with myeloid cells and lymphocytes via CCLs (C-C motif chemokine ligands), CXCL (C-X-C motif ligand), lymphotoxin, BMP (bone morphogenetic protein) chemotactic, FGF-2 (fibroblast growth factor-2), and IGF (insulin-like growth factor) proliferation mechanisms, particularly for lymphoid tissue inducer cell aggregation. Additionally, the interaction between stromal cells and immune cells modulates extracellular matrix remodeling. Among TLO structure cells, follicular helper T cells and germinal center B cells interact strongly via TCR (T-cell receptor), CD40 (cluster of differentiation 40), and CXCL signaling to promote the development and maturation of the germinal center in TLOs. Consistently, a review of histological images from the literature found TLO genesis in those vascular remodeling models. CONCLUSIONS: Our analysis showed the existence of TLOs across 5 models of vascular disease. The mechanisms that support TLO formation in different models are heterogeneous. This study could be a valuable resource for understanding and discovering new therapeutic targets for various forms of vascular disease.


Subject(s)
Atherosclerosis , Vascular Remodeling , Humans , Hyperplasia/pathology , Single-Cell Gene Expression Analysis , Lymphoid Tissue/metabolism , Atherosclerosis/pathology
5.
J Chem Inf Model ; 63(8): 2456-2468, 2023 04 24.
Article in English | MEDLINE | ID: mdl-37057817

ABSTRACT

Allosteric modulators are important regulatory elements that bind an allosteric site distinct from the active site, leading to changes in the dynamic and/or thermodynamic properties of the protein. Allosteric modulators have attracted considerable interest as potential drugs with high selectivity and safety. However, current experimental methods are limited in their ability to identify allosteric sites. Molecular dynamics simulation based on empirical force fields has therefore become an important complement to experimental methods. However, the precision and efficiency of current force fields need improvement. Deep learning and reweighting methods were used to train an allosteric protein-specific precise force field (named APSF). Multiple allosteric proteins were used to evaluate the performance of APSF. The results indicate that APSF can capture different types of allosteric pockets and sample multiple energy-minimum reference conformations of allosteric proteins. At the same time, the efficiency of conformation sampling for APSF is higher than that for ff14SB. These findings confirm that the newly developed force field APSF can be effectively used to identify allosteric pockets, which can then be used to screen potential allosteric drugs.


Subject(s)
Deep Learning , Proteins/chemistry , Allosteric Site , Molecular Dynamics Simulation , Catalytic Domain , Allosteric Regulation
6.
Langmuir ; 38(21): 6638-6646, 2022 May 31.
Article in English | MEDLINE | ID: mdl-35588476

ABSTRACT

Chemical reactions in small droplets are extensively explored to accelerate the discovery of new materials and increase the efficiency and specificity in catalytic biphasic conversion and high-throughput analytics. In this work, we investigate the local rate of the gas-evolution reaction within femtoliter droplets immobilized on a solid surface. The growth rate of hydrogen microbubbles (≥500 nm in radius) produced from the reaction was measured online with high-resolution confocal microscopic images. The growth rate of bubbles was faster in smaller droplets and near the droplet rim in the same droplet. The results were consistent for both pure and binary reacting droplets and on substrates of different wettability. Our theoretical analysis based on diffusion, chemical reaction, and bubble growth predicted that the concentration of the reactant depended on the droplet size and the bubble location inside the droplet, in good agreement with experimental results. Our results reveal that the reaction rate may be spatially nonuniform in the reacting microdroplets. The findings may have implications for formulating the chemical properties and uses of these droplets.

7.
Langmuir ; 38(37): 11227-11235, 2022 09 20.
Article in English | MEDLINE | ID: mdl-36067516

ABSTRACT

Liquid-liquid extraction based on surface nanodroplets can be a green and sustainable technique to extract and concentrate analytes from a sample flow. However, because of the extremely small volume of each droplet (<10 fL, tens of micrometers in base radius and a few or less than 1 µm in height), only a few in situ analytical techniques, such as surface-enhanced Raman spectroscopy, were applicable for the online detection and analysis based on nanodroplet extraction. To demonstrate the versatility of surface nanodroplet-based extraction, in this work, the formation of octanol surface nanodroplets and extraction were performed inside a 3 m Teflon capillary tube. After extraction, surface nanodroplets were collected by injecting air into the tube, by which the contact line of surface droplets was collected by the capillary force. As the capillary allows for the formation of ∼$10^{12}$ surface nanodroplets on the capillary wall, ≥2 mL of octanol can be collected after extraction. The volume of the collected octanol was enough for the analysis of offline analytical techniques such as UV-vis, GC-MS, and others. Coupled with UV-vis, reliable extraction and detection of two common water pollutants, triclosan and chlorpyrifos, was shown by a linear relationship between the analyte concentration in the sample solution and UV-vis absorbance. Moreover, the limit of detection (LOD) as low as 2 × $10^{-9}$ M for triclosan (∼0.58 µg/L) and 3 × $10^{-9}$ M for chlorpyrifos (∼1.05 µg/L) could be achieved. The collected surface droplets were also analyzed via gas chromatography (GC) and fluorescence microscopy. Our work shows that surface nanodroplet extraction may potentially streamline the process in sample pretreatment for sensitive chemical detection and quantification by using common analytic tools.


Subject(s)
Chlorpyrifos , Triclosan , Water Pollutants, Chemical , Water Pollutants , Octanols , Polytetrafluoroethylene , Water Pollutants/analysis , Water Pollutants, Chemical/analysis
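
The unit conversions quoted in the nanodroplet abstract above can be checked with a few lines of arithmetic. This is a small sketch; the molar masses are standard reference values rather than figures from the abstract, and the helper name `molar_to_ug_per_l` is made up for the example.

```python
# Molar masses in g/mol (standard reference values, not stated in the abstract)
MOLAR_MASS = {"triclosan": 289.54, "chlorpyrifos": 350.59}

def molar_to_ug_per_l(conc_molar, molar_mass):
    """Convert a concentration in mol/L to micrograms per litre."""
    return conc_molar * molar_mass * 1e6

# Limits of detection reported in the abstract
lod_triclosan = molar_to_ug_per_l(2e-9, MOLAR_MASS["triclosan"])        # about 0.58 µg/L
lod_chlorpyrifos = molar_to_ug_per_l(3e-9, MOLAR_MASS["chlorpyrifos"])  # about 1.05 µg/L

# Mean droplet volume if roughly 1e12 droplets yield 2 mL of octanol:
# 2e-3 L shared among 1e12 droplets, expressed in femtolitres (1 fL = 1e-15 L)
mean_droplet_fl = 2e-3 / 1e12 / 1e-15  # about 2 fL
```

Both converted detection limits agree with the values in parentheses in the abstract, and a mean volume of about 2 fL per droplet is consistent with the stated upper bound of 10 fL.
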
8.
J Chem Inf Model ; 62(2): 372-385, 2022 01 24.
Article in English | MEDLINE | ID: mdl-35021622

ABSTRACT

RNA plays a key role in a variety of cell activities. However, it is difficult to capture its structural dynamics with traditional experimental methods because of their inherent limitations. Molecular dynamics simulation has become a valuable complement to the experimental methods. Previous studies have indicated that the current force fields cannot accurately reproduce the conformations and structural dynamics of RNA. Therefore, an RNA-specific force field was developed to improve the conformation sampling of RNA. The distribution of ζ/α dihedrals of tetranucleotides was optimized by a reweighting method, and the grid-based energy correction map (CMAP) term was first introduced into the Amber RNA force field of ff99bsc0χOL3, named ff99OL3_CMAP1. Extensive validations of tetranucleotides and tetraloops show that ff99OL3_CMAP1 can significantly decrease the population of an incorrect structure, increase the consistency between the simulation results and experimental values for tetranucleotides, and improve the stability of tetraloops. ff99OL3_CMAP1 can also precisely reproduce the conformation of a duplex and riboswitches. These findings confirm that the newly developed force field ff99OL3_CMAP1 can improve the conformer sampling of RNA.


Subject(s)
Molecular Dynamics Simulation , RNA , Nucleic Acid Conformation , RNA/chemistry
9.
J Phys Chem A ; 124(23): 4583-4593, 2020 Jun 11.
Article in English | MEDLINE | ID: mdl-32427477

ABSTRACT

Defects naturally abound in semiconductor crystal structures and their presence either debilitates or improves device functionality. The increasing trend to strategically implant or remove specific defects to tailor the properties in materials via defect engineering has made it imperative to not only quantify these defects in nanostructures but to do so via efficient contactless techniques. Here we report the use of an ultrafast Kerr-gated microscope system to quantify the defect density at different locations on a single nanowire. By measuring the evolution of nonlinear luminescence dynamics from a nanowire, we are able to extract the individual nonradiative recombination constants and obtain the defect density at locations along the nanowire length. This new method promises fast, reliable, and contactless characterization of single nanoparticles.

10.
Sensors (Basel) ; 19(5), 2019 Mar 11.
Article in English | MEDLINE | ID: mdl-30862037

ABSTRACT

In the field of array signal processing, distributed sources can be regarded as an assembly of point sources within a spatial distribution. In this study, a two-dimensional (2D) non-symmetric incoherently distributed (ID) source model is proposed; we explore the estimation of a 2D non-symmetric ID source using L-shape arrays. The 2D non-symmetric ID source is established by modeling the angular power density function (APDF) as a Gaussian mixture model. Estimation of the non-symmetric distributed source is proposed based on the expectation maximization (EM) framework. The proposed EM iterative framework contains three steps in each cycle. First, the nominal azimuth and nominal elevation of each Gaussian component are obtained from the phase parts of elements in sample covariance matrices. Then the angular spreads can be solved through a one-dimensional (1D) search by the original generalized Capon estimator. Finally, the weights of each Gaussian component are obtained by solving the least-squares estimator. Simulations are conducted to verify the effectiveness of the estimation technique.

11.
Langmuir ; 34(3): 961-969, 2018 01 23.
Article in English | MEDLINE | ID: mdl-28968498

ABSTRACT

Vertically aligned ZnO nanowire-based tree-like structures with CuO branches were synthesized on the basis of a multistep seed-mediated hydrothermal approach. The nanotrees form a p-n junction at the branch/stem interface that facilitates charge separation upon illumination. Photoelectrochemical measurements in different solvents show that ZnO/CuO hierarchical nanostructures have enhanced photocatalytic activity compared to that of the nonhierarchical structure of ZnO/CuO, pure ZnO, and pure CuO nanoparticles. The combination of ZnO and CuO in tree-like nanostructures provides opportunities for the design of photoelectrochemical sensors, photocatalytic synthesis, and solar energy conversion.

12.
J Vac Sci Technol A ; 36(4): 041404, 2018 Jul.
Article in English | MEDLINE | ID: mdl-29983480

ABSTRACT

Recent advances in preservation of the morphology of ZnO nanostructures during dye sensitization required the use of a two-step preparation procedure. The first step was the key for preserving ZnO materials morphology. It required exposing clean ZnO nanostructures to gas-phase prop-2-ynoic acid (propiolic acid) in vacuum. This step resulted in the formation of a robust and stable surface-bound carboxylate with ethynyl groups available for further modification, for example, with click chemistry. This paper utilizes spectroscopic and microscopic investigations to answer several questions about this modification and to determine if the process can be performed under medium vacuum conditions instead of the high vacuum procedures reported earlier. Comparing the results of the preparation process at medium vacuum of 0.5 Torr base pressure with the previously reported investigations of the same process in high vacuum of $10^{-5}$ Torr suggests that both processes lead to the formation of the same surface species, confirming that the proposed modification scheme can be widely applicable for ZnO sensitization procedures and does not require the use of high vacuum. Additional analysis comparing the computationally predicted surface structures with the results of spectroscopic investigations yields a more complete description of the surface species resulting from this approach.

13.
Opt Lett ; 41(11): 2462-5, 2016 Jun 01.
Article in English | MEDLINE | ID: mdl-27244389

ABSTRACT

A Kerr-gated microscope capable of imaging ultraviolet luminescence with femtosecond time resolution has been developed. The system allows the spatial, spectral, and temporal measurement of UV-emitting samples. The instrumentation was optimized for emission collection in the UV, resulting in sub-90 fs time resolution of gated signals. ZnO nanowires were used to demonstrate the performance of the instrument. The evolution of the emission from a single nanowire was tracked via ultrafast transient spectroscopy and through sequential imaging. Transient dynamics were extracted from a region of intense emission on a single ZnO nanowire. This technique is a powerful tool capable of contactless ultrafast measurements of charge carrier dynamics in single nanoparticles.

14.
Nanotechnology ; 27(13): 135401, 2016 Apr 01.
Article in English | MEDLINE | ID: mdl-26894995

ABSTRACT

A new tree-like ZnO/CdSSe nanocomposite with CdSSe branches grown on ZnO nanowires prepared via a two-step chemical vapor deposition is presented. The nanotrees (NTs) are vertically aligned on a substrate. The CdSSe branches result in strong visible light absorption and form a type-II heterojunction with the ZnO stem that facilitates efficient electron transfer. A combination of photoluminescence spectroscopy and lifetime measurements indicates that the NTs are promising materials for applications that benefit from a Z-scheme charge transfer mechanism. Vertically aligned branched ZnO nanowires can provide direct electron transport pathways to substrates and allow for efficient charge separation. These advantages of nanoscale hierarchical heterostructures make ZnO/CdSSe NTs a promising semiconductor material for solar cells, and other opto-electronic devices.

15.
Article in English | MEDLINE | ID: mdl-38381645

ABSTRACT

Linear discriminant analysis (LDA) is a classic tool for supervised dimensionality reduction. Because the projected samples can be classified effectively, LDA has been successfully applied in many applications. Among the variants of LDA, trace ratio LDA (TR-LDA) is a classic form due to its explicit meaning. Unfortunately, when the sample size is much smaller than the data dimension, the algorithm for solving TR-LDA does not converge. This so-called small sample size (SSS) problem severely limits the application of TR-LDA. To solve this problem, we propose a revised formulation of TR-LDA, which can be applied to datasets of different sizes in a unified form. Then, we present an optimization algorithm to solve the proposed method, explain why it can avoid the SSS problem, and analyze the convergence and computational complexity of the optimization algorithm. Next, based on the introduced theorems, we quantitatively elaborate on when the SSS problem will occur in TR-LDA. Finally, the experimental results on real-world datasets demonstrate the effectiveness of the proposed method.
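
For orientation, the trace ratio objective referred to above maximizes tr(WᵀS_bW) / tr(WᵀS_wW) over orthonormal projections W. The sketch below shows the classic iterative scheme for the standard problem, not the revised formulation proposed in the paper. The function names, the identity-based initialization, and the stopping rule are assumptions made for illustration.

```python
import numpy as np

def scatter_matrices(X, y):
    """Between-class (Sb) and within-class (Sw) scatter for X of shape (n, d)."""
    mean = X.mean(axis=0)
    d = X.shape[1]
    Sb, Sw = np.zeros((d, d)), np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        diff = (Xc.mean(axis=0) - mean)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
        centered = Xc - Xc.mean(axis=0)
        Sw += centered.T @ centered
    return Sb, Sw

def trace_ratio_lda(Sb, Sw, k, n_iter=50, tol=1e-10):
    """Classic trace-ratio iteration: W <- top-k eigenvectors of Sb - lam * Sw."""
    d = Sb.shape[0]
    W = np.eye(d)[:, :k]  # arbitrary orthonormal start (an assumption)
    lam = np.trace(W.T @ Sb @ W) / np.trace(W.T @ Sw @ W)
    for _ in range(n_iter):
        _, vecs = np.linalg.eigh(Sb - lam * Sw)  # eigenvalues in ascending order
        W = vecs[:, -k:]                         # keep the k largest
        new_lam = np.trace(W.T @ Sb @ W) / np.trace(W.T @ Sw @ W)
        if new_lam - lam < tol:                  # ratio has stopped improving
            lam = new_lam
            break
        lam = new_lam
    return W, lam
```

The sketch assumes the within-class scatter is well conditioned, which holds when there are many more samples than features. When S_w is singular, as can happen with few samples in high dimension, the denominator may approach zero; the code does not guard against that case.
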

16.
J Chem Theory Comput ; 20(6): 2676-2688, 2024 Mar 26.
Article in English | MEDLINE | ID: mdl-38447040

ABSTRACT

Molecular dynamics simulations play a pivotal role in elucidating the dynamic behaviors of RNA structures, offering a valuable complement to traditional methods such as nuclear magnetic resonance or X-ray. Despite this, the current precision of RNA force fields lags behind that of protein force fields. In this work, we systematically compared the performance of four RNA force fields (ff99bsc0χOL3, AMBERDES, ff99OL3_CMAP1, AMBERMaxEnt) across diverse RNA structures. Our findings highlight significant challenges in maintaining stability, particularly with regard to cross-strand and cross-loop hydrogen bonds. Furthermore, we observed limitations of the tested RNA force fields in accurately describing the conformations of nonhelical structural motifs and terminal nucleotides, as well as base pairing and base stacking interactions. The identified deficiencies in existing RNA force fields provide valuable insights for subsequent force field development. These findings also offer recommendations for selecting appropriate force fields in RNA simulations.


Subject(s)
Molecular Dynamics Simulation , RNA , Nucleic Acid Conformation , RNA/chemistry , Base Pairing , Magnetic Resonance Spectroscopy
17.
IEEE Trans Cybern ; 54(4): 2420-2433, 2024 Apr.
Article in English | MEDLINE | ID: mdl-37126629

ABSTRACT

Classification is a fundamental task in the field of data mining. Unfortunately, high-dimensional data often degrade the performance of classification. To solve this problem, dimensionality reduction is usually adopted as an essential preprocessing technique, which can be divided into feature extraction and feature selection. Due to its ability to capture category discrimination, linear discriminant analysis (LDA) is recognized as a classic feature extraction method for classification. Compared with feature extraction, feature selection has plenty of advantages in many applications. If we can integrate the discrimination of LDA and the advantages of feature selection, the result is bound to play an important role in the classification of high-dimensional data. Motivated by this idea, we propose a supervised feature selection method for classification. It combines trace ratio LDA with $\ell_{2,p}$-norm regularization and imposes an orthogonal constraint on the projection matrix. The learned row-sparse projection matrix can be used to select discriminative features. Then, we present an optimization algorithm to solve the proposed method. Finally, extensive experiments on both synthetic and real-world datasets indicate the effectiveness of the proposed method.
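
To make the role of the row-sparse projection matrix concrete, the sketch below computes the ℓ2,p norm under its usual definition, (Σᵢ ‖wᵢ‖₂ᵖ)^(1/p) over the rows wᵢ of W, and ranks features by row norm. It illustrates only the selection step, under the assumption that a projection matrix has already been learned; the function names and the toy matrix are hypothetical.

```python
import numpy as np

def l2p_norm(W, p):
    """(sum_i ||w_i||_2^p)^(1/p), where w_i is the i-th row of W."""
    row_norms = np.linalg.norm(W, axis=1)
    return np.sum(row_norms ** p) ** (1.0 / p)

def select_features(W, n_select):
    """Indices of the n_select rows of W with the largest l2 norms."""
    row_norms = np.linalg.norm(W, axis=1)
    return np.argsort(row_norms)[::-1][:n_select]

# Toy projection matrix: 3 features projected to 2 dimensions.
# The all-zero second row marks a feature the projection ignores.
W = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [0.6, 0.8]])
```

Each row of W corresponds to one original feature, so a row driven to zero by the regularizer contributes nothing to the projected data and can be dropped. With p = 2 the norm reduces to the Frobenius norm, which does not favor sparse rows; smaller p penalizes the number of nonzero rows more strongly.
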

18.
Materials (Basel) ; 17(6), 2024 Mar 12.
Article in English | MEDLINE | ID: mdl-38541465

ABSTRACT

Concurrently achieving high growth rate and high quality in single-crystal diamonds (SCDs) is significantly challenging. The growth rate of SCDs synthesized by microwave plasma chemical vapor deposition (MPCVD) was enhanced by introducing N2 into the typical CH4-H2 gas mixtures. The impact of nitrogen vacancy (NV) center concentration on growth rate, surface morphology, and lattice binding structure was investigated. The SCDs were characterized through Raman spectroscopy, photoluminescence (PL) spectroscopy, and X-ray photoelectron spectroscopy. It was found that the saturation growth rate was increased up to 45 µm/h by incorporating 0.8-1.2% N2 into the gas atmosphere, which is 4.5 times higher than the case without nitrogen addition. Nitrogen addition altered the growth mode from step-flow to bidimensional nucleation, leading to clustered steps and a rough surface morphology, followed by macroscopically pyramidal hillock formation. The elevation of nitrogen content results in a simultaneous escalation of internal stress and defects. XPS analysis confirmed chemical bonding between nitrogen and carbon, as well as non-diamond carbon phase formation at 0.8% of nitrogen doping. Furthermore, the emission intensity of NV-related defects from PL spectra changed synchronously with N2 concentrations (0-1.5%) during diamond growth, indicating that the formation of NV centers activated the diamond lattice and facilitated nitrogen incorporation into it, thereby accelerating chemical reaction rates for achieving high-growth-rate SCDs.

19.
IEEE Trans Pattern Anal Mach Intell ; 45(4): 5322-5328, 2023 Apr.
Article in English | MEDLINE | ID: mdl-34665722

ABSTRACT

In the field of data mining, how to deal with high-dimensional data is an inevitable topic. Since it does not rely on labels, unsupervised feature selection has attracted a lot of attention. The performance of spectral-based unsupervised methods depends on the quality of the constructed similarity matrix, which is used to depict the intrinsic structure of data. However, real-world data often contain plenty of noise features, so a similarity matrix constructed from the original data cannot be completely reliable. Worse still, the size of a similarity matrix expands rapidly as the number of samples rises, making the computational cost increase significantly. To solve this problem, a simple and efficient unsupervised model is proposed to perform feature selection. We formulate PCA as a reconstruction error minimization problem and incorporate an $\ell_{2,p}$-norm regularization term to make the projection matrix sparse. The learned row-sparse and orthogonal projection matrix is used to select discriminative features. Then, we present an efficient optimization algorithm to solve the proposed unsupervised model, and analyze the convergence and computational complexity of the algorithm theoretically. Finally, experiments on both synthetic and real-world data sets demonstrate the effectiveness of our proposed method.

20.
IEEE Trans Cybern ; 53(2): 1260-1271, 2023 Feb.
Article in English | MEDLINE | ID: mdl-34343100

ABSTRACT

In the field of data mining, how to deal with high-dimensional data is a fundamental problem. If they are used directly, it is not only computationally expensive but also difficult to obtain satisfactory results. Unsupervised feature selection is designed to reduce the dimension of data by finding a subset of features in the absence of labels. Many unsupervised methods perform feature selection by exploring spectral analysis and manifold learning, such that the intrinsic structure of data can be preserved. However, most of these methods ignore a fact: due to the existence of noise features, the intrinsic structure directly built from original data may be unreliable. To solve this problem, a new unsupervised feature selection model is proposed. The graph structure, feature weights, and projection matrix are learned simultaneously, such that the intrinsic structure is constructed by the data that have been feature weighted and projected. For each data point, its nearest neighbors are acquired in the process of graph construction. Therefore, we call them adaptive neighbors. Besides, an additional constraint is added to the proposed model. It requires that a graph, corresponding to a similarity matrix, should contain exactly c connected components. Then, we present an optimization algorithm to solve the proposed model. Next, we discuss the method of determining the regularization parameter γ in our proposed method and analyze the computational complexity of the optimization algorithm. Finally, experiments are implemented on both synthetic and real-world datasets to demonstrate the effectiveness of the proposed method.
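
The constraint that the learned graph contain exactly c connected components can be checked through a standard spectral fact: the number of zero eigenvalues of the graph Laplacian equals the number of connected components. The sketch below verifies this on a toy similarity matrix. It illustrates only the check, not the paper's optimization procedure, and the function name `count_components` and the tolerance are assumptions.

```python
import numpy as np

def count_components(S, tol=1e-8):
    """Connected components of the graph with nonnegative similarity matrix S."""
    S = (S + S.T) / 2.0                 # symmetrize
    L = np.diag(S.sum(axis=1)) - S      # unnormalized graph Laplacian
    eigvals = np.linalg.eigvalsh(L)     # real eigenvalues, ascending
    return int(np.sum(eigvals < tol))   # multiplicity of the zero eigenvalue

# Two groups of nodes with no similarity between them: {0, 1} and {2, 3, 4}
S = np.zeros((5, 5))
S[0, 1] = S[1, 0] = 1.0
S[2, 3] = S[3, 2] = 0.5
S[3, 4] = S[4, 3] = 0.7
n_components = count_components(S)  # 2 for this block structure
```

Methods of this kind commonly turn the component constraint into a rank condition on the Laplacian, so that the c smallest eigenvalues are pushed to zero during optimization. Whether the paper does exactly this is not stated in the abstract.
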
