1.
Brief Bioinform ; 22(3)2021 05 20.
Article in English | MEDLINE | ID: mdl-32533167

ABSTRACT

Pan-cancer categories are now widely recognized as significant in cancer research. A pan-cancer approach categorizes a cancer by its molecular pathology rather than by its organ of origin. Molecular similarities in multi-omics data across different cancer types can play several roles in both biological processes and therapeutic development. Integrated analysis of various genomic data is therefore frequently used to reveal novel genetic and molecular mechanisms. Although a variety of multi-omics clustering algorithms have been proposed in different fields, how these computational clustering methods compare in pan-cancer analysis remains unclear. To increase the utilization of current integrative methods in pan-cancer analysis, we first provide an overview of five popular computational integrative tools: similarity network fusion, integrative clustering of multiple genomic data types (iCluster), cancer integration via multi-kernel learning (CIMLR), perturbation clustering for data integration and disease subtyping (PINS) and low-rank clustering (LRACluster). Then, a priori interactions in multi-omics data were incorporated to detect prominent molecular patterns in pan-cancer data sets. Finally, we present comparative assessments of these methods and discuss key issues in applying these algorithms. We found that all five methods can identify distinct tumor compositions. The pan-cancer samples can be reclassified into several groups in different proportions. Interestingly, each method can classify the tumors into categories that differ from the original cancer types or subtypes, especially for ovarian serous cystadenocarcinoma (OV) and breast invasive carcinoma (BRCA) tumors. In addition, the clusters from all five computational methods show notable prognostic value. Furthermore, 9 recurrent differential genes and 15 common pathway characteristics were identified across all the methods. The results and discussion can help the community select appropriate integrative tools according to different research tasks or aims in pan-cancer analysis.


Subjects
Breast Neoplasms , Serous Cystadenocarcinoma , Genetic Databases , Gene Regulatory Networks , Genomics , Machine Learning , Breast Neoplasms/genetics , Breast Neoplasms/metabolism , Computational Biology , Serous Cystadenocarcinoma/genetics , Serous Cystadenocarcinoma/metabolism , Female , Humans , Neoplasms , Ovarian Neoplasms/genetics , Ovarian Neoplasms/metabolism
2.
Brief Bioinform ; 22(3)2021 05 20.
Article in English | MEDLINE | ID: mdl-32591780

ABSTRACT

Accurately identifying the interactions between genomic factors and the response to cancer drugs plays an important role in drug discovery, drug repositioning and cancer treatment. A number of studies have revealed that interactions between genes and drugs are 'many-genes-to-many-drugs' interactions, i.e. common modules, as opposed to 'one-gene-to-one-drug' interactions. Such modules fully explain the interactions between complex biological regulatory mechanisms and cancer drugs. However, strategies for effectively and robustly identifying the underlying common modules among pharmacogenomics data remain to be improved. In this paper, we aim to provide a detailed evaluation of three categories of state-of-the-art common module identification techniques from a machine learning perspective, including non-negative matrix factorization (NMF), partial least squares (PLS) and network analyses. We first evaluate the performance of six methods, namely SNMNMF, NetNMF, SNPLS, O2PLS, NSBM and HOGMMNC, using two series of simulated data sets with different noise levels and outlier ratios. Then, we conduct experiments using a real-world data set of 2091 genes and 101 drugs in 392 cancer cell lines and compare the experimental results in terms of biological process term enrichment, gene-drug interactions and drug-drug interactions. Finally, we present interesting findings from our evaluation study and discuss the advantages and drawbacks of each method. Supplementary information: Supplementary file is available at Briefings in Bioinformatics online.
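
As an illustration of the NMF family named in this abstract, the following is a minimal sketch of plain non-negative matrix factorization with multiplicative updates. It is not SNMNMF or NetNMF, which add network constraints and joint factorization; the function names (`nmf`, `frobenius_error`) and the toy matrix are assumptions made for this example only.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Frobenius norm of V - W H (reconstruction error)."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rw in zip(v, wh) for x, y in zip(rv, rw)) ** 0.5


def nmf(v, rank, iters=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (n x m) as W (n x rank) times H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h
```

In a gene-drug setting, the rows of V could hold genes and the columns drugs, so that each of the `rank` factors groups genes and drugs into one candidate module; that mapping is an interpretation for illustration, not a description of the evaluated methods.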


Subjects
Pharmacogenetics , Algorithms , Antineoplastic Agents/pharmacology , Computational Biology/methods , Drug Discovery , Drug Repositioning , Gene Regulatory Networks , Humans , Machine Learning
3.
Brief Bioinform ; 22(6)2021 11 05.
Article in English | MEDLINE | ID: mdl-34021302

ABSTRACT

Genomic data alignment, a fundamental operation in sequencing, can be utilized to map reads into a reference sequence, query on a genomic database and perform genetic tests. However, with the reduction of sequencing cost and the accumulation of genome data, privacy-preserving genomic sequencing data alignment is becoming unprecedentedly important. In this paper, we present a comprehensive review of secure genomic data comparison schemes. We discuss the privacy threats, including adversaries and privacy attacks. The attacks can be categorized into inference, membership, identity tracing and completion attacks and have been applied to obtaining the genomic privacy information. We classify the state-of-the-art genomic privacy-preserving alignment methods into three different scenarios: large-scale reads mapping, encrypted genomic datasets querying and genetic testing to ease privacy threats. A comprehensive analysis of these approaches has been carried out to evaluate the computation and communication complexity as well as the privacy requirements. The survey provides the researchers with the current trends and the insights on the significance and challenges of privacy issues in genomic data alignment.


Subjects
Algorithms , Human Genome , Genomics , Sequence Alignment , Humans
4.
PLoS Pathog ; 17(3): e1009328, 2021 03.
Article in English | MEDLINE | ID: mdl-33657135

ABSTRACT

A key step to the SARS-CoV-2 infection is the attachment of its Spike receptor-binding domain (S RBD) to the host receptor ACE2. Considerable research has been devoted to the development of neutralizing antibodies, including llama-derived single-chain nanobodies, to target the receptor-binding motif (RBM) and to block ACE2-RBD binding. Simple and effective strategies to increase potency are desirable for such studies when antibodies are only modestly effective. Here, we identify and characterize a high-affinity synthetic nanobody (sybody, SR31) as a fusion partner to improve the potency of RBM-antibodies. Crystallographic studies reveal that SR31 binds to RBD at a conserved and 'greasy' site distal to RBM. Although SR31 distorts RBD at the interface, it does not perturb the RBM conformation, hence displaying no neutralizing activities itself. However, fusing SR31 to two modestly neutralizing sybodies dramatically increases their affinity for RBD and neutralization activity against SARS-CoV-2 pseudovirus. Our work presents a tool protein and an efficient strategy to improve nanobody potency.


Subjects
Angiotensin-Converting Enzyme 2/immunology , Neutralizing Antibodies/immunology , Viral Antibodies/immunology , SARS-CoV-2/immunology , Single-Domain Antibodies/immunology , Neutralizing Antibodies/chemistry , Neutralizing Antibodies/genetics , Viral Antibodies/chemistry , Viral Antibodies/genetics , Antibody Affinity , Binding Sites , X-Ray Crystallography , HEK293 Cells , Humans , Molecular Models , Recombinant Fusion Proteins/chemistry , Recombinant Fusion Proteins/genetics , Recombinant Fusion Proteins/immunology , Single-Domain Antibodies/chemistry , Single-Domain Antibodies/genetics
5.
Hum Brain Mapp ; 43(13): 3970-3986, 2022 09.
Article in English | MEDLINE | ID: mdl-35538672

ABSTRACT

Functional neural activities manifest geometric patterns, as evidenced by the evolving network topology of functional connectivities (FC) even in the resting state. In this work, we propose a novel manifold-based geometric neural network for functional brain networks (called "Geo-Net4Net" for short) to learn the intrinsic low-dimensional feature representations of resting-state brain networks on the Riemannian manifold. This tool allows us to answer the scientific question of how the spontaneous fluctuation of FC supports behavior and cognition. We deploy a set of positive maps and rectified linear unit (ReLU) layers to uncover the intrinsic low-dimensional feature representations of functional brain networks on the Riemannian manifold, taking advantage of the symmetric positive-definite (SPD) form of the correlation matrices. Due to the lack of well-defined ground truth in the resting state, existing learning-based methods are limited to unsupervised methodologies. To go beyond this boundary, we propose to self-supervise the feature representation learning of resting-state functional networks by leveraging the task-based counterparts occurring before and after the underlying resting state. With this extra heuristic, our Geo-Net4Net allows us to establish a more reasonable understanding of resting-state FCs by capturing the geometric patterns (a.k.a. spectral/shape signature) associated with resting states on the Riemannian manifold. We have conducted extensive experiments on both simulated data and task-based functional magnetic resonance imaging (fMRI) data from the Human Connectome Project (HCP) database, where our Geo-Net4Net not only achieves more accurate change detection results than other state-of-the-art counterpart methods but also yields ubiquitous geometric patterns that manifest putative insights into brain function.


Subjects
Connectome , Deep Learning , Brain/diagnostic imaging , Cognition , Connectome/methods , Humans , Magnetic Resonance Imaging/methods
6.
Chembiochem ; 23(8): e202100534, 2022 04 20.
Article in English | MEDLINE | ID: mdl-34862721

ABSTRACT

Small open reading frames (sORFs) are an important class of genes with fewer than 100 codons. They were historically annotated as noncoding or even junk sequences. In recent years, accumulating evidence has suggested that sORFs could encode a considerable number of polypeptides, many of which play important roles in both physiology and disease pathology. However, it has been technically challenging to directly detect sORF-encoded peptides (SEPs). Here, we discuss the latest advances in methodologies for identifying SEPs with mass spectrometry, as well as the progress on functional studies of SEPs.


Subjects
Peptides , Codon , Mass Spectrometry , Open Reading Frames , Peptides/chemistry
7.
BMC Med Inform Decis Mak ; 22(1): 190, 2022 07 23.
Article in English | MEDLINE | ID: mdl-35870923

ABSTRACT

BACKGROUND: Patient subgroups are important for easily understanding a disease and for providing precise yet personalized treatment through multiple omics dataset integration. Multiomics datasets are produced daily. Thus, the fusion of heterogeneous big data into intrinsic structures is an urgent problem. Novel mathematical methods are needed to process these data in a straightforward way. RESULTS: We developed a novel method for subgrouping patients with distinct survival rates via the integration of multiple omics datasets and by using principal component analysis to reduce the high data dimensionality. Then, we constructed similarity graphs for patients, merged the graphs in a subspace, and analyzed them on a Grassmann manifold. The proposed method could identify patient subgroups that had not been reported previously by selecting the most critical information during the merging at each level of the omics dataset. Our method was tested on empirical multiomics datasets from The Cancer Genome Atlas. CONCLUSION: Through the integration of microRNA, gene expression, and DNA methylation data, our method accurately identified patient subgroups and achieved superior performance compared with popular methods.


Subjects
MicroRNAs , Neoplasms , DNA Methylation , Genome , Humans , Neoplasms/genetics , Survival Rate
8.
Sheng Wu Yi Xue Gong Cheng Xue Za Zhi ; 39(4): 672-678, 2022 Aug 25.
Article in Chinese | MEDLINE | ID: mdl-36008330

ABSTRACT

This study aims to analyze the biomechanical stability of Magic screw in the treatment of acetabular posterior column fractures by finite element analysis. A three-dimensional finite element model of the pelvis was established based on the computed tomography (CT) and magnetic resonance imaging (MRI) data of a volunteer and its effectiveness was verified. Then, the posterior column fracture model of the acetabulum was generated. The biomechanical stability of the four internal fixation models was compared. The 500 N force was applied to the upper surface of the sacrum to simulate human gravity. The maximum implant stresses of retrograde screw fixation, single-plate fixation, double-plate fixation and Magic screw fixation model in standing and sitting position were as follows: 114.10, 113.40 MPa; 58.93, 55.72 MPa; 58.76, 47.47 MPa; and 24.36, 27.50 MPa, respectively. The maximum stresses at the fracture end were as follows: 72.71, 70.51 MPa; 48.18, 22.80 MPa; 52.38, 27.14 MPa; and 34.05, 30.78 MPa, respectively. The fracture end displacement of the retrograde tension screw fixation model was the largest in both states, and the Magic screw had the smallest displacement variation in the standing state, but it was significantly higher than the two plate fixations in the sitting state. Magic screw can satisfy the biomechanical stability of posterior column fracture. Compared with traditional fixations, Magic screw has the advantages of more uniform stress distribution and less stress, and should be recommended.


Subjects
Bone Fractures , Spinal Fractures , Biomechanical Phenomena , Bone Plates , Bone Screws , Finite Element Analysis , Internal Fracture Fixation/methods , Bone Fractures/diagnostic imaging , Bone Fractures/surgery , Humans
9.
BMC Bioinformatics ; 22(1): 326, 2021 Jun 15.
Article in English | MEDLINE | ID: mdl-34130622

ABSTRACT

BACKGROUND: With the development of high-throughput sequencing technology, a huge amount of multi-omics data has been accumulated. Although there are many software tools for statistical analysis and visualization of omics data, these tools are not suitable for private data and non-technical users. Besides, most of these tools specialize in only one or perhaps a few data types, without combining clinical information. Moreover, users cannot flexibly choose data processing steps and models when using these tools. RESULTS: To help non-technical users understand and analyze private multi-omics data while ensuring data security, we developed an interactive desktop tool for statistical analysis and visualization of omics and clinical data (IOAT for short). The tool mainly targets CSV-format data and combines clinical data with high-dimensional multi-omics data. It also contains various operations, such as data preprocessing, feature selection, risk assessment, clustering, and survival analysis. By using this tool, users can safely and conveniently try a combination of various methods on their private multi-omics data to find a model suitable for their data, conduct risk assessment and determine their cancer subtypes. At the same time, the tool can also provide them with references to genes that are closely related to tumor staging, facilitating the development of precision oncology. We review IOAT's main features and demonstrate its analysis capabilities on a lung cancer dataset from TCGA. CONCLUSIONS: IOAT is a local desktop tool that provides a set of multi-omics data integration solutions. It can quickly perform a complete analysis of cancer genome data for subtype discovery and biomarker identification without security concerns and without writing any code. Thus, our tool can enable cancer biologists and biomedicine researchers to analyze their data more easily and safely. IOAT can be downloaded for free from https://github.com/WlSunshine/IOAT-software .


Subjects
Neoplasms , Cluster Analysis , High-Throughput Nucleotide Sequencing , Humans , Neoplasms/genetics , Precision Medicine , Software
10.
Retina ; 41(5): 1110-1117, 2021 May 01.
Article in English | MEDLINE | ID: mdl-33031250

ABSTRACT

PURPOSE: To develop a deep learning (DL) model to detect morphologic patterns of diabetic macular edema (DME) based on optical coherence tomography (OCT) images. METHODS: In the training set, 12,365 OCT images were extracted from a public data set and an ophthalmic center. A total of 656 OCT images were extracted from another ophthalmic center for external validation. The presence or absence of three OCT patterns of DME, including diffused retinal thickening, cystoid macular edema, and serous retinal detachment, was labeled with 1 or 0, respectively. A DL model was trained to detect three OCT patterns of DME. The occlusion test was applied for the visualization of the DL model. RESULTS: Applying 5-fold cross-validation method in internal validation, the area under the receiver operating characteristic curve for the detection of three OCT patterns (i.e., diffused retinal thickening, cystoid macular edema, and serous retinal detachment) was 0.971, 0.974, and 0.994, respectively, with an accuracy of 93.0%, 95.1%, and 98.8%, respectively, a sensitivity of 93.5%, 94.5%, and 96.7%, respectively, and a specificity of 92.3%, 95.6%, and 99.3%, respectively. In external validation, the area under the receiver operating characteristic curve was 0.970, 0.997, and 0.997, respectively, with an accuracy of 90.2%, 95.4%, and 95.9%, respectively, a sensitivity of 80.1%, 93.4%, and 94.9%, respectively, and a specificity of 97.6%, 97.2%, and 96.5%, respectively. The occlusion test showed that the DL model could successfully identify the pathologic regions most critical for detection. CONCLUSION: Our DL model demonstrated high accuracy and transparency in the detection of OCT patterns of DME. These results emphasized the potential of artificial intelligence in assisting clinical decision-making processes in patients with DME.
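
The four figures reported per OCT pattern (accuracy, sensitivity, specificity and area under the ROC curve) follow from standard definitions. The sketch below shows how they could be computed for one pattern from 0/1 labels; the function names and toy inputs are assumptions for illustration and do not come from the study.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity and specificity for 0/1 labels (1 = pattern present)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Sensitivity: fraction of true positives among all actual positives.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Specificity: fraction of true negatives among all actual negatives.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Because the three patterns are labeled independently, these functions would be applied once per pattern rather than to a single multi-class prediction.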


Subjects
Artificial Intelligence , Deep Learning , Diabetic Retinopathy/diagnosis , Macular Edema/diagnosis , Optical Coherence Tomography/methods , Visual Acuity , Diabetic Retinopathy/complications , Diabetic Retinopathy/physiopathology , Follow-Up Studies , Humans , Macular Edema/etiology , Macular Edema/physiopathology , ROC Curve , Retrospective Studies
11.
Bioinformatics ; 35(4): 602-610, 2019 02 15.
Article in English | MEDLINE | ID: mdl-30052773

ABSTRACT

MOTIVATION: The emergence of large amounts of genomic, chemical, and pharmacological data provides new opportunities and challenges. Identifying gene-drug associations is not only crucial in providing a comprehensive understanding of the molecular mechanisms of drug action, but is also important in the development of effective treatments for patients. However, accurately determining the complex associations among pharmacogenomic data remains challenging. We propose a higher order graph matching with multiple network constraints (HOGMMNC) model to accurately identify gene-drug modules. The HOGMMNC model aims to capture the inherent structural relations within data drawn from multiple sources by hypergraph matching. The proposed technique seamlessly integrates prior constraints to enhance the accuracy and reliability of the identified relations. An effective numerical solution is combined with a novel sampling strategy to solve the problem efficiently. RESULTS: The superiority and effectiveness of our proposed method are demonstrated through a comparison with four state-of-the-art techniques using synthetic and empirical data. The experiments on synthetic data show that the proposed method clearly outperforms other methods, especially in the presence of noise and irrelevant samples. The HOGMMNC model identifies eighteen gene-drug modules in the empirical data. The modules are validated to have significant associations via pathway analysis. Significance: The modules identified by HOGMMNC provide new insights into the molecular mechanisms of drug action and provide patients with more effective treatments. Our proposed method can be applied to the study of other biological correlated module identification problems (e.g. miRNA-gene, gene-methylation, and gene-disease). AVAILABILITY AND IMPLEMENTATION: A matlab package of HOGMMNC is available at https://github.com/scutbioinformatics/HOGMMNC/. 
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Drug Interactions/genetics , Gene Regulatory Networks , Genomics , Humans , Reproducibility of Results
12.
Protein Expr Purif ; 164: 105463, 2019 12.
Article in English | MEDLINE | ID: mdl-31381990

ABSTRACT

Recombinant expression of human membrane proteins in large quantities remains a major challenge. Expression host is an important variable to screen for high-level production of membrane proteins. Using the green fluorescent protein (GFP) as a reporter, we screened the expression of a human multi-pass membrane protein called sterol Δ8-Δ7 isomerase in three different hosts: Escherichia coli, Saccharomyces cerevisiae, and Pichia pastoris. The expression of the His-tagged isomerase was exceptionally high in P. pastoris, reaching ~200 mg L-1 in standard flasks, and ~1,000 mg L-1 in condensed culture that mimics fermentation. The heterogeneously expressed isomerase could be extracted fully with dodecyl maltoside, and the solubilized protein in the form of GFP fusion showed a sharp and symmetric peak on fluorescence-detection size exclusion chromatography. Our work provides a useful source for the purification of the recombinant isomerase.


Subjects
Pichia/genetics , Steroid Isomerases/chemistry , Steroid Isomerases/genetics , Gel Chromatography , Gene Expression , Humans , Recombinant Proteins/chemistry , Recombinant Proteins/genetics , Solubility
13.
Eur Radiol ; 29(10): 5590-5599, 2019 Oct.
Article in English | MEDLINE | ID: mdl-30874880

ABSTRACT

OBJECTIVES: To explore and evaluate the feasibility of radiomics in stratifying nasopharyngeal carcinoma (NPC) into distinct survival subgroups through multi-modality MRI. METHODS: A total of 658 patients (training cohort: 424; validation cohort: 234) with non-metastatic NPC were enrolled in the retrospective analysis. Each slice was considered as a sample, and 4863 radiomics features of the tumor region were extracted from T1-weighted, T2-weighted, and contrast-enhanced T1-weighted MRI. Consensus clustering and manual aggregation were performed on the training cohort to generate a baseline model and classification reference used to train a support vector machine classifier. The risk of each patient was defined as the maximum risk among the slices. Each patient in the validation cohort was assigned to the risk model using the trained classifier. Harrell's concordance index (C-index) was used to measure the prognostic performance, and differences between subgroups were compared using the log-rank test. RESULTS: The training cohort was clustered into four groups with distinct survival patterns. Each patient was assigned to one of the four groups according to the estimated risk. In the training cohort and the validation cohort, respectively, our method performed (C-index = 0.827, p < .004 and C-index = 0.814, p < .002) better than the T-stage (C-index = 0.815, p = .002 and C-index = 0.803, p = .024) and was competitive with and more stable than the TNM staging system (C-index = 0.842, p = .003 and C-index = 0.765, p = .050). CONCLUSIONS: In a large single-institution cohort, quantitative multi-modality MRI image phenotypes reveal distinct survival subtypes.
KEY POINTS:
• Radiomics phenotype of MRI revealed the subtype of nasopharyngeal carcinoma (NPC) patients with distinct survival patterns.
• The slice-wise analysis method on MRI helps to stratify patients and provides superior prognostic performance over the TNM staging method.
• Risk estimation using the highest risk among slices performed better than using the majority risk in prognosis.
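
Two steps in this abstract lend themselves to a short sketch: taking a patient's risk as the maximum over that patient's slices, and scoring the result with Harrell's C-index. The code below is a minimal illustration under assumed inputs (survival times, event indicators with 1 meaning the event was observed, and numeric risk scores); the function names are hypothetical and this is not the authors' pipeline.

```python
from typing import Sequence


def patient_risk(slice_risks: Sequence[float]) -> float:
    """Aggregate slice-level risks into one patient-level risk by taking the maximum."""
    return max(slice_risks)


def harrell_c_index(times: Sequence[float], events: Sequence[int],
                    risks: Sequence[float]) -> float:
    """Fraction of comparable patient pairs whose risk ordering matches survival.

    A pair (i, j) is comparable when patient i had an observed event and a
    shorter time than patient j. Tied risks count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # censored patients cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A C-index of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives a scale for reading the reported values of roughly 0.77 to 0.84.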


Subjects
Nasopharyngeal Carcinoma/diagnostic imaging , Nasopharyngeal Neoplasms/diagnostic imaging , Adult , Cohort Studies , Feasibility Studies , Female , Humans , Computer-Assisted Image Interpretation/methods , Kaplan-Meier Estimate , Magnetic Resonance Imaging/methods , Male , Middle Aged , Nasopharyngeal Carcinoma/pathology , Nasopharyngeal Neoplasms/pathology , Neoplasm Staging , Prognosis , Retrospective Studies , Support Vector Machine
15.
BMC Bioinformatics ; 17: 384, 2016 Sep 17.
Article in English | MEDLINE | ID: mdl-27639558

ABSTRACT

BACKGROUND: Variations in DNA copy number make an important contribution to the development of several diseases, including autism, schizophrenia and cancer. Single-cell sequencing technology allows the dissection of genomic heterogeneity at the single-cell level, thereby providing important evolutionary information about cancer cells. In contrast to traditional bulk sequencing, single-cell sequencing requires the amplification of the whole genome of a single cell to accumulate enough samples for sequencing. However, the amplification process inevitably introduces amplification bias, resulting in an over-dispersed portion of the sequencing data. A recent study has shown that the over-dispersed portion of single-cell sequencing data can be well modelled by negative binomial distributions. RESULTS: We developed a read-depth-based method, nbCNV, to detect copy number variants (CNVs). The nbCNV method uses two constraints, sparsity and smoothness, to fit the CNV patterns under the assumption that the read signals follow a negative binomial distribution. The problem of CNV detection was formulated as a quadratic optimization problem and was solved by an efficient numerical solution based on the classical alternating direction minimization method. CONCLUSIONS: Extensive experiments to compare nbCNV with existing benchmark models were conducted on both simulated data and empirical single-cell sequencing data. The results of those experiments demonstrate that nbCNV achieves superior performance and high robustness for the detection of CNVs in single-cell sequencing data.
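
To make the over-dispersion argument concrete: a Poisson model forces the variance of read counts to equal the mean, whereas a negative binomial with mean mu and dispersion r has variance mu + mu^2 / r. The sketch below estimates r by the method of moments from a list of per-bin read counts. It only illustrates the distributional assumption and is not the nbCNV optimization; the function name and the sample counts are assumptions for this example.

```python
import math
from typing import Sequence, Tuple


def nb_moment_fit(counts: Sequence[int]) -> Tuple[float, float, float]:
    """Return (mean, sample variance, dispersion r) for read counts.

    Uses the negative binomial relation var = mean + mean**2 / r, so
    r = mean**2 / (var - mean). When var <= mean there is no over-dispersion
    and r is reported as infinity (the Poisson limit).
    """
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two counts")
    mean = sum(counts) / n
    var = sum((c - mean) ** 2 for c in counts) / (n - 1)
    if var <= mean:
        return mean, var, math.inf
    return mean, var, mean ** 2 / (var - mean)
```

A small r signals strong over-dispersion; as r grows, the negative binomial approaches the Poisson distribution.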


Subjects
DNA Copy Number Variations/genetics , High-Throughput Nucleotide Sequencing/methods , Single-Cell Analysis/methods , Software , Binomial Distribution , Cluster Analysis , Computer Simulation , Humans , DNA Sequence Analysis
16.
BMC Bioinformatics ; 16: 219, 2015 Jul 10.
Article in English | MEDLINE | ID: mdl-26159165

ABSTRACT

BACKGROUND: Classifying cancers by gene selection is among the most important and challenging procedures in biomedicine. A major challenge is to design an effective method that eliminates irrelevant, redundant, or noisy genes from the classification, while retaining all of the highly discriminative genes. RESULTS: We propose a gene selection method called local hyperplane-based discriminant analysis (LHDA). LHDA adopts two central ideas. First, it uses a local approximation rather than global measurement; second, it embeds a recently reported classification model, the K-Local Hyperplane Distance Nearest Neighbor (HKNN) classifier, into its discriminator. Through classification accuracy-based iterations, LHDA obtains the feature weight vector and finally extracts the optimal feature subset. The performance of the proposed method is evaluated in extensive experiments on synthetic and real microarray benchmark datasets. Eight classical feature selection methods, four classification models and two popular embedded learning schemes, including k-nearest neighbor (KNN), hyperplane k-nearest neighbor (HKNN), Support Vector Machine (SVM) and Random Forest, are employed for comparisons. CONCLUSION: The proposed method yielded performance comparable or superior to that of seven state-of-the-art models. This performance demonstrates the advantage of combining feature weighting and model learning in a unified framework to achieve the two tasks simultaneously.


Subjects
Cluster Analysis , Discriminant Analysis , Machine Learning/standards , Neoplasms/classification , Neoplasms/genetics , Support Vector Machine , Gene Expression Profiling , Gene Regulatory Networks , Humans
17.
BMC Bioinformatics ; 15: 70, 2014 Mar 14.
Article in English | MEDLINE | ID: mdl-24625071

ABSTRACT

BACKGROUND: Modeling high-dimensional data involving thousands of variables is particularly important for gene expression profiling experiments; nevertheless, it remains a challenging task. One of the challenges is to implement an effective method for selecting a small set of relevant genes buried in high-dimensional irrelevant noise. RELIEF is a popular and widely used approach for feature selection owing to its low computational cost and high accuracy. However, RELIEF-based methods suffer from instability, especially in the presence of noisy and/or high-dimensional outliers. RESULTS: We propose an innovative feature weighting algorithm, called LHR, to select informative genes from highly noisy data. LHR is based on RELIEF for feature weighting using classical margin maximization. The key idea of LHR is to estimate the feature weights through local approximation rather than global measurement, which is typically used in existing methods. The weights obtained by our method are very robust to degradation by noisy features, even those with vast dimensions. To demonstrate the performance of our method, extensive experiments involving classification tests have been carried out on both synthetic and real microarray benchmark datasets by combining the proposed technique with standard classifiers, including the support vector machine (SVM), k-nearest neighbor (KNN), hyperplane k-nearest neighbor (HKNN), linear discriminant analysis (LDA) and naive Bayes (NB). CONCLUSION: Experiments on both synthetic and real-world datasets demonstrate the superior performance of the proposed feature selection method combined with supervised learning in three aspects: 1) high classification accuracy, 2) excellent robustness to noise and 3) good stability across various classification algorithms.
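
For readers unfamiliar with the baseline, the following is a minimal sketch of classical RELIEF feature weighting, the method LHR builds on. It is not LHR itself, which replaces the single nearest hit and miss with local hyperplane approximations. The function name and toy data are assumptions; the sketch also assumes every class has at least two samples and at least two classes are present.

```python
from typing import List, Sequence


def relief_weights(X: Sequence[Sequence[float]], y: Sequence[int]) -> List[float]:
    """Classical RELIEF: reward features that differ at the nearest miss
    (closest sample of another class) and penalize features that differ at
    the nearest hit (closest sample of the same class)."""
    n, d = len(X), len(X[0])
    # Per-feature range, used to put all feature differences on a 0..1 scale.
    spans = [(max(r[f] for r in X) - min(r[f] for r in X)) or 1.0 for f in range(d)]

    def diff(a: Sequence[float], b: Sequence[float], f: int) -> float:
        return abs(a[f] - b[f]) / spans[f]

    def dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum(diff(a, b, f) for f in range(d))

    weights = [0.0] * d
    for i in range(n):
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        hit = min(hits, key=lambda j: dist(X[i], X[j]))
        miss = min(misses, key=lambda j: dist(X[i], X[j]))
        for f in range(d):
            weights[f] += (diff(X[i], X[miss], f) - diff(X[i], X[hit], f)) / n
    return weights
```

Because each update depends on one nearest hit and one nearest miss, a single noisy sample can swing the weights, which is the instability the abstract attributes to RELIEF-based methods.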


Subjects
Computational Biology/methods , Gene Expression Profiling/methods , Support Vector Machine , Algorithms , Bayes Theorem , Cluster Analysis , Genetic Databases , Discriminant Analysis , Humans , Neoplasms/genetics , Neoplasms/metabolism , Oligonucleotide Array Sequence Analysis
18.
BMC Cancer ; 14: 366, 2014 May 24.
Article in English | MEDLINE | ID: mdl-24885156

ABSTRACT

BACKGROUND: The apparent diffusion coefficient (ADC) is a highly diagnostic factor in discriminating malignant and benign breast masses in diffusion-weighted magnetic resonance imaging (DW-MRI). The combination of ADC and other pictorial characteristics has improved lesion type identification accuracy. The objective of this study was to reassess the findings on an independent patient group by changing the magnetic field from 1.5-Tesla to 3.0-Tesla. METHODS: This retrospective study consisted of a training group of 234 female patients, including 85 benign and 149 malignant lesions, imaged using 1.5-Tesla MRI, and a test group of 95 female patients, including 19 benign and 85 malignant lesions, imaged using 3.0-Tesla MRI. The lesion of interest was segmented from the raw image and four sets of measurements describing the morphology, kinetics, DW-MRI, and texture of the pictorial properties of each lesion were obtained. Each lesion was characterized by 28 features in total. Three classical machine-learning algorithms were used to build prediction models on the training group, which evaluated the prognostic performance of the multi-sided features in three scenarios. To reduce information redundancy, five highly diagnostic factors were selected to obtain a compact yet informative characterization of the lesion status. RESULTS: Three classification models were built on the 1.5-Tesla training group and were tested on the independent 3.0-Tesla test group. The following results were found. i) Characterization of breast masses in a multi-sided way dramatically increased prediction performance. Using all features gave higher sensitivity and specificity than any individual feature group or combination of groups. ii) ADC was a highly effective factor in improving the sensitivity in discriminating malignant from benign masses. iii) Five features, namely ADC, Sum Average, Entropy, Elongation, and Sum Variance, were selected to achieve the highest performance in diagnosis of the 3.0-Tesla patient group. CONCLUSIONS: The combination of ADC and other multi-sided characteristics can increase the capability of discriminating malignant and benign breast lesions, even under different imaging protocols. The selected compact feature subsets achieved a high diagnostic performance and thus are promising in clinical applications for discriminating lesion type and for personalized treatment planning.
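
The abstract above reports model performance as sensitivity and specificity. As a minimal sketch of how those two figures are computed from predicted and true lesion labels (the function name and the label encoding, 1 = malignant and 0 = benign, are assumptions made for illustration and are not taken from the study):

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Assumed encoding (hypothetical): 1 = malignant, 0 = benign.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # malignant, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # malignant, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign, cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign, flagged

    # Sensitivity: fraction of malignant lesions correctly identified.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: fraction of benign lesions correctly identified.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy labels for illustration only; not data from the study.
sens, spec = sensitivity_specificity([1, 1, 1, 0, 0], [1, 1, 0, 0, 1])
```

With a test group as imbalanced as the one described, reporting both values separately is more informative than overall accuracy.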


Subjects
Breast Neoplasms/pathology , Contrast Media , Diffusion Magnetic Resonance Imaging , Gadolinium DTPA , Adolescent , Adult , Aged , Algorithms , Artificial Intelligence , Diagnosis, Differential , Female , Humans , Image Interpretation, Computer-Assisted , Middle Aged , Predictive Value of Tests , Prognosis , Retrospective Studies , Young Adult
19.
IEEE J Biomed Health Inform ; 28(2): 1134-1143, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37963003

ABSTRACT

Cancer is one of the most challenging health problems worldwide. Accurate cancer survival prediction is vital for clinical decision making. Many deep learning methods have been proposed to understand the association between patients' genomic features and survival time. In most cases, the gene expression matrix is fed directly to the deep learning model. However, this approach completely ignores the interactions between biomolecules, and the resulting models can only learn the expression levels of genes to predict patient survival. In essence, the interaction between biomolecules is the key to determining the direction and function of biological processes. Proteins are the building blocks and principal agents of life activities, and as such, their complex interaction network is potentially informative for deep learning methods. Therefore, a more reliable approach is to have the neural network learn both gene expression data and protein interaction networks. We propose a new computational approach, termed CRESCENT, which is a graph convolutional neural network (GCN) that uses protein-protein interaction (PPI) networks as prior knowledge to improve cancer survival prediction. CRESCENT relies on gene expression networks rather than gene expression levels to predict patient survival. The performance of CRESCENT is evaluated on a large-scale pan-cancer dataset consisting of 5991 patients from 16 different types of cancers. Extensive benchmarking experiments demonstrate that our proposed method is competitive in terms of the time-dependent concordance index (Ctd) when compared with several existing state-of-the-art approaches. Experiments also show that incorporating the network structure between genomic features effectively improves cancer survival prediction.
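
The evaluation metric named in the abstract above is a time-dependent concordance index. As a rough illustration of what a concordance index measures, the sketch below computes the simpler, time-independent Harrell's C-index for right-censored data. This is a swapped-in simplification and not the Ctd used in the paper; the function name and argument layout are assumptions made for illustration.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data (simplified sketch).

    times  : observed follow-up time per patient
    events : 1 if the event (death) was observed, 0 if censored
    risks  : predicted risk score; higher should mean shorter survival
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                # Patient i failed before patient j was last observed,
                # so the ordering of the pair is known.
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0  # model ranks the pair correctly
                elif risks[i] == risks[j]:
                    concordant += 0.5  # ties count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data for illustration only; not data from the paper.
c = concordance_index([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.7, 0.5, 0.1])
```

A value of 1.0 means every comparable pair is ranked correctly, while 0.5 corresponds to random ranking.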


Subjects
Neoplasms , Protein Interaction Maps , Humans , Protein Interaction Maps/genetics , Algorithms , Neural Networks, Computer , Genomics , Neoplasms/genetics
20.
J Bone Joint Surg Am ; 106(2): 129-137, 2024 Jan 17.
Article in English | MEDLINE | ID: mdl-37992198

ABSTRACT

BACKGROUND: Sacral dysmorphism is not uncommon and complicates S1 iliosacral screw placement partially because of the difficulty of determining the starting point accurately on the sacral lateral view. We propose a method of specifying the starting point. METHODS: The starting point for the S1 iliosacral screw into the dysmorphic sacrum was specifically set at a point where the ossification of the S1/S2 intervertebral disc (OSID) intersected the posterior vertebral cortical line (PVCL) on the sacral lateral view, followed by guidewire manipulation and screw placement on the pelvic outlet and inlet views. Computer-simulated virtual surgical procedures based on pelvic computed tomography (CT) data on 95 dysmorphic sacra were performed to determine whether the starting point was below the iliac cortical density (ICD) and in the S1 oblique osseous corridor and to evaluate the accuracy of screw placement (with 1 screw being used, in the left hemipelvis). Surgical procedures on 17 patients were performed to verify the visibility of the OSID and PVCL, to check the location of the starting point relative to the ICD, and to validate the screw placement safety as demonstrated with postoperative CT scans. RESULTS: In the virtual surgical procedures, the starting point was consistently below the ICD and in the oblique osseous corridor in all patients and all screws were Grade 1. In the clinical surgical procedures, the OSID and PVCL were consistently visible and the starting point was always below the ICD in all patients; overall, 21 S1 iliosacral screws were placed in these 17 patients without malpositioning or iatrogenic injury. CONCLUSIONS: On the lateral view of the dysmorphic sacrum, the OSID and PVCL are visible and intersect at a point that is consistently below the ICD and in the oblique osseous corridor, and thus they can be used to identify the starting point. LEVEL OF EVIDENCE: Therapeutic Level III. See Instructions for Authors for a complete description of levels of evidence.


Subjects
Fractures, Bone , Pelvic Bones , Humans , Sacrum/diagnostic imaging , Sacrum/surgery , Pelvic Bones/surgery , Ilium/diagnostic imaging , Ilium/surgery , Fracture Fixation, Internal/methods , Bone Screws , Fractures, Bone/surgery