Search | BVS Educação Profissional em Saúde

1.

BiPSTP: Sequence feature encoding method for identifying different RNA modifications with bidirectional position-specific trinucleotides propensities.

Wang, Mingzhao; Ali, Haider; Xu, Yandi; Xie, Juanying; Xu, Shengquan.

J Biol Chem ; 300(4): 107140, 2024 Apr.

Article in English | MEDLINE | ID: mdl-38447795

ABSTRACT

RNA modification, a posttranscriptional regulatory mechanism, significantly influences RNA biogenesis and function. The accurate identification of modification sites is paramount for investigating their biological implications. Methods for encoding RNA sequence into numerical data play a crucial role in developing robust models for predicting modification sites. However, existing techniques suffer from limitations, including inadequate information representation, challenges in effectively integrating positional and sequential information, and the generation of irrelevant or redundant features when combining multiple approaches. These deficiencies hinder the effectiveness of machine learning models in addressing the performance challenges associated with predicting RNA modification sites. Here, we introduce a novel RNA sequence feature representation method, named BiPSTP, which utilizes bidirectional trinucleotide position-specific propensities. We employ the parameter ξ to denote the interval between the current nucleotide and its adjacent forward or backward dinucleotide, enabling the extraction of positional and sequential information from RNA sequences. Leveraging the BiPSTP method, we have developed the prediction model mRNAPred using a support vector machine classifier to identify multiple types of RNA modification sites. We evaluate the performance of our BiPSTP method and mRNAPred model across 12 distinct RNA modification types. Our experimental results demonstrate the superiority of the mRNAPred model compared to state-of-the-art models in the domain of RNA modification site identification. Importantly, our BiPSTP method enhances the robustness and generalization performance of prediction models. Notably, it can be applied to feature extraction from DNA sequences to predict other biological modification sites.
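As a rough illustration of the idea described in this abstract, the following sketch encodes each position of a sequence by the difference between the frequency of a gapped trinucleotide in positive and in negative training sequences. It is an assumption-laden sketch, not the authors' BiPSTP implementation: the function names, the convention that `xi = 0` means "immediately adjacent", the use of a plain frequency difference as the propensity, and the value 0.0 for positions without a valid trinucleotide are all hypothetical choices made for illustration.

```python
from collections import Counter


def gapped_trinucleotide(seq, i, xi, forward=True):
    """Join nucleotide i with the dinucleotide xi positions away (None if out of range).

    Assumption: xi = 0 means the dinucleotide is immediately adjacent.
    """
    if forward:
        start = i + 1 + xi
        if start + 2 > len(seq):
            return None
        return seq[i] + seq[start:start + 2]
    end = i - xi          # exclusive end of the backward dinucleotide
    start = end - 2
    if start < 0:
        return None
    return seq[start:end] + seq[i]


def position_frequencies(seqs, xi, forward=True):
    """Relative frequency of each gapped trinucleotide at every position."""
    tables = []
    for i in range(len(seqs[0])):   # all sequences are assumed to share one length
        counts = Counter()
        for s in seqs:
            tri = gapped_trinucleotide(s, i, xi, forward)
            if tri is not None:
                counts[tri] += 1
        total = sum(counts.values())
        tables.append({k: v / total for k, v in counts.items()} if total else {})
    return tables


def encode(seq, pos_tables, neg_tables, xi, forward=True):
    """One feature per position: positive-set frequency minus negative-set frequency."""
    feats = []
    for i in range(len(seq)):
        tri = gapped_trinucleotide(seq, i, xi, forward)
        if tri is None:
            feats.append(0.0)       # hypothetical padding value
        else:
            feats.append(pos_tables[i].get(tri, 0.0) - neg_tables[i].get(tri, 0.0))
    return feats


# Toy data, invented for illustration only.
positives = ["ACGUA", "ACGUA"]
negatives = ["UUUUU", "UUUUU"]
fwd_pos = position_frequencies(positives, xi=0)
fwd_neg = position_frequencies(negatives, xi=0)
bwd_pos = position_frequencies(positives, xi=0, forward=False)
bwd_neg = position_frequencies(negatives, xi=0, forward=False)

# A bidirectional vector is taken here as forward features followed by backward ones.
features = (encode("ACGUA", fwd_pos, fwd_neg, 0)
            + encode("ACGUA", bwd_pos, bwd_neg, 0, forward=False))
```

Feature vectors built this way could then be passed to any classifier; the abstract names a support vector machine, which is not shown here.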

Subjects

RNA Processing, Post-Transcriptional; RNA; Support Vector Machine; Computational Biology/methods; RNA/chemistry; RNA/genetics; RNA/metabolism; Sequence Analysis, RNA/methods; Nucleotides/chemistry; Nucleotides/metabolism

2.

M6A-BiNP: predicting N⁶-methyladenosine sites based on bidirectional position-specific propensities of polynucleotides and pointwise joint mutual information.

Wang, Mingzhao; Xie, Juanying; Xu, Shengquan.

RNA Biol ; 18(12): 2498-2512, 2021 12.

Article in English | MEDLINE | ID: mdl-34161188

ABSTRACT

N6-methyladenosine (m6A) plays an important role in various biological processes. Identifying m6A sites is a key step in exploring its biological functions. One of the biggest challenges in identifying m6A sites is how to extract features comprising rich categorical information to distinguish m6A and non-m6A sites. To address this challenge, we propose bidirectional dinucleotide and trinucleotide position-specific propensities, respectively, in this paper. Based on this, we propose two feature-encoding algorithms: Position-Specific Propensities and Pointwise Mutual Information (PSP-PMI) and Position-Specific Propensities and Pointwise Joint Mutual Information (PSP-PJMI). PSP-PMI is based on the bidirectional dinucleotide propensity and the pointwise mutual information, while PSP-PJMI is based on the bidirectional trinucleotide position-specific propensity and the pointwise joint mutual information proposed in this paper. We introduce parameters α and β in PSP-PMI and PSP-PJMI, respectively, to represent the distance from the nucleotide to its forward or backward adjacent nucleotide or dinucleotide, so as to extract features containing local and global classification information. Finally, we propose the M6A-BiNP predictor based on PSP-PMI or PSP-PJMI and an SVM classifier. The 10-fold cross-validation experimental results on the benchmark datasets of non-single-base resolution and single-base resolution demonstrate that PSP-PMI and PSP-PJMI can extract features with strong capabilities to identify m6A and non-m6A sites. The M6A-BiNP predictor based on our proposed feature encoding algorithm PSP-PJMI is better than the state-of-the-art predictors, and it is so far the best model to identify m6A and non-m6A sites.

Subjects

Adenosine/analogs & derivatives; Algorithms; Computational Biology/methods; Polynucleotides/chemistry; RNA Processing, Post-Transcriptional; RNA/chemistry; Adenosine/analysis; Adenosine/chemistry; Adenosine/metabolism; Humans; Polynucleotides/metabolism; RNA/metabolism; Sequence Analysis, RNA/methods

3.

A novel method detecting the key clinic factors of portal vein system thrombosis of splenectomy & cardia devascularization patients for cirrhosis & portal hypertension.

Wang, Mingzhao; Ding, Linglong; Xu, Meng; Xie, Juanying; Wu, Shengli; Xu, Shengquan; Yao, Yingmin; Liu, Qingguang.

BMC Bioinformatics ; 20(Suppl 22): 720, 2019 Dec 30.

Article in English | MEDLINE | ID: mdl-31888439

ABSTRACT

BACKGROUND: Portal vein system thrombosis (PVST) is potentially fatal for patients if the diagnosis is not timely or the treatment is not proper. There has been no available technique for detecting clinical risk factors to predict PVST after splenectomy in cirrhotic patients. The aim of this study is to detect the clinical risk factors of PVST in patients undergoing splenectomy and cardia devascularization for liver cirrhosis and portal hypertension, and to build an efficient predictive model of PVST from the detected risk factors using machine learning. We collected 92 clinical indexes of splenectomy plus cardia devascularization patients for cirrhosis and portal hypertension, and proposed a novel algorithm named RFA-PVST (Risk Factor Analysis for PVST) to detect clinical risk indexes of PVST, then built an SVM (support vector machine) predictive model from the detected risk factors. The accuracy, sensitivity, specificity, precision, F-measure, FPR (false positive rate), FNR (false negative rate), FDR (false discovery rate), AUC (area under ROC curve) and MCC (Matthews correlation coefficient) were adopted to evaluate the predictive power of the detected risk factors. The proposed RFA-PVST algorithm was compared to mRMR, SVM-RFE, Relief, S-weight and LLEScore. A statistical test was performed to verify the significance of RFA-PVST. RESULTS: Anticoagulant therapy and antiplatelet aggregation therapy are the top two clinical risk factors for PVST, followed by D-D (D-dimer), CHOL (cholesterol) and Ca (calcium). The SVM model built on the clinical indexes including anticoagulant therapy, antiplatelet aggregation therapy, RBC (red blood cell), D-D, CHOL, Ca, TT (thrombin time) and weight has good predictive capability for PVST. It achieved the highest PVST predictive accuracy of 0.89, and the best sensitivity, specificity, precision, F-measure, FNR, FPR, FDR and MCC of 1, 0.75, 0.85, 0.92, 0, 0.25, 0.15 and 0.8, respectively, and a comparably good AUC value of 0.84. The statistical test results demonstrate a strongly significant difference between RFA-PVST and the compared algorithms (mRMR, SVM-RFE, Relief, S-weight and LLEScore); that is, the risk indicators detected by RFA-PVST are statistically significant. CONCLUSIONS: The proposed RFA-PVST algorithm can detect the clinical risk factors of PVST effectively and easily. Its main contribution is that it can display all the clinical factors in a 2-dimensional space with independence and discernibility as the y-axis and x-axis, respectively. Clinical indexes in the top-right corner of the 2-dimensional space are detected automatically as risk indicators. The predictive SVM model is powerful with the detected clinical risk factors of PVST. Our study can help medical doctors make proper treatments or early diagnoses for PVST patients. This study also brings a new idea to the study of clinical treatment for other diseases.
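Most of the evaluation measures listed in this abstract follow directly from the four cells of a binary confusion matrix. The sketch below shows the standard formulas; the function name and the example counts are hypothetical and are not taken from the study. AUC is omitted because it needs ranked prediction scores rather than counts.

```python
from math import sqrt


def binary_metrics(tp, fp, tn, fn):
    """Standard metrics from a 2x2 confusion matrix.

    Assumes every denominator below is non-zero.
    """
    sensitivity = tp / (tp + fn)      # true positive rate (recall)
    specificity = tn / (tn + fp)      # true negative rate
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_measure": 2 * precision * sensitivity / (precision + sensitivity),
        "fpr": fp / (fp + tn),        # false positive rate = 1 - specificity
        "fnr": fn / (fn + tp),        # false negative rate = 1 - sensitivity
        "fdr": fp / (fp + tp),        # false discovery rate = 1 - precision
        "mcc": (tp * tn - fp * fn)
               / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }


# Hypothetical counts for illustration only; not data from the paper.
metrics = binary_metrics(tp=8, fp=3, tn=7, fn=2)
```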

Subjects

Cardia/pathology; Hypertension, Portal/complications; Liver Cirrhosis/complications; Portal Vein/pathology; Splenectomy/adverse effects; Venous Thrombosis/diagnosis; Venous Thrombosis/etiology; Algorithms; Area Under Curve; Humans; Liver Cirrhosis/pathology; Postoperative Complications/diagnosis; Postoperative Complications/etiology; Reproducibility of Results; Risk Factors

4.

Analysis of synonymous codon usage pattern of genes in unique non-blood-sucking leech Whitmania pigra.

Khan, Muhammad Salabat; Guan, De-Long; Ma, Li-Bin; Xie, Juan-Ying; Xu, Sheng-Quan.

J Cell Biochem ; 120(6): 9850-9858, 2019 06.

Article in English | MEDLINE | ID: mdl-30681200

ABSTRACT

Whitmania pigra is a unique, fluid-sucking ectoparasite and an anticoagulant medical leech. The codon usage bias (CUB) is the nonuniform usage of synonymous codons in which some codons are more preferred than others. Here, we performed a comprehensive analysis of CUB of genes in W. pigra, analyzing 140 780 transcripts, 59 553 unigenes, and 20 304 qualified coding sequences (CDSs) from the transcriptomic data of W. pigra. The effective number of codons values suggested that the CUB was low in these genes. We recognized profoundly favored codons in W. pigra that have a G/C-ending. Parity rule two-bias plots suggested that both mutation pressure and natural selection might have influenced the CUB. However, neutrality plots revealed that natural selection might have played a major role while mutation pressure might have played a minor role in shaping the CUB. We applied principal component analysis to relative synonymous codon usage values for divided CDSs based on GC content and codon-ending bases. Codon usage in W. pigra had a general inclination toward C-ending codons and natural selection rather than mutation pressure is the dominant force in the genetic evolution of W. pigra. To our knowledge, this is the first study to describe a complete codon usage analysis of W. pigra; this will increase the understanding of CUB and evolution in W. pigra. The analysis of codon usage patterns in W. pigra aids in understanding its evolution and genetic architecture.
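The relative synonymous codon usage (RSCU) values mentioned in this abstract are conventionally defined as the observed count of a codon divided by the mean count of all codons that encode the same amino acid. The sketch below applies that textbook definition with the standard genetic code; the function name and the toy sequence are hypothetical, and the authors' actual pipeline is not described in the abstract.

```python
from collections import Counter
from itertools import product

# Standard genetic code in TCAG order ("*" marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds_list):
    """RSCU per codon: observed count / mean count across its synonymous family."""
    counts = Counter()
    for cds in cds_list:
        cds = cds.upper().replace("U", "T")
        for i in range(0, len(cds) - 2, 3):     # step through complete codons
            codon = cds[i:i + 3]
            if codon in CODON_TABLE:            # skips codons with ambiguous bases
                counts[codon] += 1

    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values = {}
    for aa, codons in families.items():
        if aa == "*" or len(codons) == 1:       # stops, Met and Trp carry no bias
            continue
        total = sum(counts[c] for c in codons)
        if total == 0:                          # amino acid absent from the data
            continue
        for c in codons:
            values[c] = counts[c] * len(codons) / total
    return values


# Toy coding sequence, invented for illustration: three GCC and one GCT (alanine).
example = rscu(["GCCGCCGCCGCT"])
```

A value above 1 marks a codon used more often than expected under uniform synonymous usage, which is how a preference for G/C-ending codons would show up.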

Subjects

Codon Usage; Evolution, Molecular; Leeches/genetics; Mutation; Selection, Genetic; Animals

5.

Analysis of codon usage patterns in Hirudinaria manillensis reveals a preference for GC-ending codons caused by dominant selection constraints.

Guan, De-Long; Ma, Li-Bin; Khan, Muhammad Salabat; Zhang, Xiu-Xiu; Xu, Sheng-Quan; Xie, Juan-Ying.

BMC Genomics ; 19(1): 542, 2018 Jul 17.

Article in English | MEDLINE | ID: mdl-30016953

ABSTRACT

BACKGROUND: Hirudinaria manillensis is an ephemeral, blood-sucking ectoparasite, possessing anticoagulant capacities with potential medical applications. Analysis of codon usage patterns would contribute to our understanding of the evolutionary mechanisms and genetic architecture of H. manillensis, which in turn would provide insight into the characteristics of other leeches. We analysed codon usage and related indices using 18,000 coding sequences (CDSs) retrieved from H. manillensis RNA-Seq data. RESULTS: We identified four highly preferred codons in H. manillensis that have G/C-endings. Points generated in an effective number of codons (ENC) plot distributed below the standard curve and the slope of a neutrality plot was less than 1. Highly expressed CDSs had lower ENC content and higher GC content than weakly expressed CDSs. Principal component analysis conducted on relative synonymous codon usage (RSCU) values divided CDSs according to GC content and divided codons according to ending bases. Moreover, by determining codon usage, we found that the majority of blood-diet related genes have undergone less adaptive evolution in H. manillensis, except for those with homologous sequences in the host species. CONCLUSIONS: Codon usage in H. manillensis had an overall preference toward C-endings and indicated that codon usage patterns are mediated by differential expression, GC content, and biological function. Although mutation pressure effects were also notable, the majority of genetic evolution in H. manillensis was driven by natural selection.

Subjects

Codon; Evolution, Molecular; Leeches/genetics; Selection, Genetic; Animals; Base Composition; Gene Expression; Genome; Leeches/metabolism; Nucleotides/analysis; Principal Component Analysis; Proteins/chemistry; Proteins/genetics

6.

Clustering analysis for the evolutionary relationships of SARS-CoV-2 strains.

Chen, Xiangzhong; Wang, Mingzhao; Liu, Xinglin; Zhang, Wenjie; Yan, Huan; Lan, Xiang; Xu, Yandi; Tang, Sanyi; Xie, Juanying.

Sci Rep ; 14(1): 6428, 2024 03 18.

Article in English | MEDLINE | ID: mdl-38499639

ABSTRACT

To explore the differences and relationships between the available SARS-CoV-2 strains and predict the potential evolutionary direction of these strains, we employ hierarchical clustering analysis to investigate the evolutionary relationships between the SARS-CoV-2 strains, using the genomic sequences collected in China until January 7, 2023. We encode the sequences of the existing SARS-CoV-2 strains into numerical data through the k-mer algorithm, then propose four methods to select the representative sample from each type of strain to make up the dataset for clustering analysis. Three hierarchical clustering algorithms named Ward-Euclidean, Ward-Jaccard, and Average-Euclidean are introduced by combining the Euclidean and Jaccard distances with the Ward and Average linkage clustering algorithms embedded in the OriginPro software. Experimental results reveal that the BF.28, BE.1.1.1, BA.5.3, and BA.5.6.4 strains exhibit distinct characteristics which are not observed in other types of SARS-CoV-2 strains, suggesting that they are the main potential sources from which future SARS-CoV-2 strains may evolve. Moreover, the BA.2.75, CH.1.1, BA.2, BA.5.1.3, BF.7, and B.1.1.214 strains demonstrate enhanced abilities in terms of immune evasion, transmissibility, and pathogenicity. Hence, closely monitoring the evolutionary trends of these strains is crucial to mitigate their impact on public health and society as far as possible.
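The encoding and distance steps described in this abstract can be sketched as follows: each sequence becomes a vector of k-mer counts, and pairs of vectors are compared with a Euclidean or Jaccard distance. This is a minimal sketch under stated assumptions: k = 3, the DNA alphabet, and a Jaccard distance computed on k-mer presence or absence are hypothetical choices, and the linkage step itself (done in OriginPro according to the abstract) is not reproduced.

```python
from itertools import product
from math import sqrt


def kmer_vector(seq, k=3, alphabet="ACGT"):
    """Count every k-mer over the alphabet; windows with other symbols are skipped."""
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    index = {kmer: j for j, kmer in enumerate(kmers)}
    vec = [0] * len(kmers)
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])
        if j is not None:
            vec[j] += 1
    return vec


def euclidean(u, v):
    """Euclidean distance between two count vectors of equal length."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def jaccard_distance(u, v):
    """1 - |A ∩ B| / |A ∪ B|, where A and B are the sets of k-mers present."""
    a = {i for i, x in enumerate(u) if x}
    b = {i for i, x in enumerate(v) if x}
    union = a | b
    return 1 - len(a & b) / len(union) if union else 0.0


# Toy sequences, invented for illustration.
u = kmer_vector("AAAA", k=2)
v = kmer_vector("ACGT", k=2)
d_euclid = euclidean(u, v)
d_jaccard = jaccard_distance(u, v)
```

A pairwise distance matrix built from such functions could be handed to any agglomerative clustering routine that supports Ward or average linkage, for example `scipy.cluster.hierarchy.linkage` in SciPy.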

Subjects

COVID-19; Humans; COVID-19/epidemiology; SARS-CoV-2/genetics; Cluster Analysis; Algorithms; China/epidemiology

7.

The nnU-Net based method for automatic segmenting fetal brain tissues.

Peng, Ying; Xu, Yandi; Wang, Mingzhao; Zhang, Huiquan; Xie, Juanying.

Health Inf Sci Syst ; 11(1): 17, 2023 Dec.

Article in English | MEDLINE | ID: mdl-36998806

ABSTRACT

Magnetic resonance (MR) images of fetuses make it possible for doctors to detect pathological fetal brains at early stages. Brain tissue segmentation is a prerequisite for brain morphology and volume analyses. nnU-Net is an automatic segmentation method based on deep learning. It can adaptively configure itself for a specific task through preprocessing, network architecture, training, and post-processing. Therefore, we adapt nnU-Net to segment seven types of fetal brain tissues, including external cerebrospinal fluid, gray matter, white matter, ventricle, cerebellum, deep gray matter, and brainstem. To account for the characteristics of the FeTA 2021 data, some adjustments are made to the original nnU-Net so that it can segment the seven types of fetal brain tissues as precisely as possible. The average segmentation results on FeTA 2021 training data demonstrate that our advanced nnU-Net is superior to its peers, including SegNet, CoTr, AC U-Net and ResUnet. Its average segmentation results are 0.842, 11.759 and 0.957 in terms of the Dice, HD95 and VS criteria, respectively. Moreover, the experimental results on FeTA 2021 test data further demonstrate that our advanced nnU-Net obtained good segmentation performance of 0.774, 14.699 and 0.875 in terms of Dice, HD95 and VS, ranking third in the FeTA 2021 challenge. Our advanced nnU-Net accomplished the task of segmenting fetal brain tissues using MR images of different gestational ages, which can help doctors make correct and timely diagnoses.

8.

Fetal brain tissue annotation and segmentation challenge results.

Payette, Kelly; Li, Hongwei Bran; de Dumast, Priscille; Licandro, Roxane; Ji, Hui; Siddiquee, Md Mahfuzur Rahman; Xu, Daguang; Myronenko, Andriy; Liu, Hao; Pei, Yuchen; Wang, Lisheng; Peng, Ying; Xie, Juanying; Zhang, Huiquan; Dong, Guiming; Fu, Hao; Wang, Guotai; Rieu, ZunHyan; Kim, Donghyeon; Kim, Hyun Gi; Karimi, Davood; Gholipour, Ali; Torres, Helena R; Oliveira, Bruno; Vilaça, João L; Lin, Yang; Avisdris, Netanell; Ben-Zvi, Ori; Bashat, Dafna Ben; Fidon, Lucas; Aertsen, Michael; Vercauteren, Tom; Sobotka, Daniel; Langs, Georg; Alenyà, Mireia; Villanueva, Maria Inmaculada; Camara, Oscar; Fadida, Bella Specktor; Joskowicz, Leo; Weibin, Liao; Yi, Lv; Xuesong, Li; Mazher, Moona; Qayyum, Abdul; Puig, Domenec; Kebiri, Hamza; Zhang, Zelin; Xu, Xinyi; Wu, Dan; Liao, Kuanlun.

Med Image Anal ; 88: 102833, 2023 08.

Article in English | MEDLINE | ID: mdl-37267773

ABSTRACT

In-utero fetal MRI is emerging as an important tool in the diagnosis and analysis of the developing human brain. Automatic segmentation of the developing fetal brain is a vital step in the quantitative analysis of prenatal neurodevelopment both in the research and clinical context. However, manual segmentation of cerebral structures is time-consuming and prone to error and inter-observer variability. Therefore, we organized the Fetal Tissue Annotation (FeTA) Challenge in 2021 in order to encourage the development of automatic segmentation algorithms on an international level. The challenge utilized FeTA Dataset, an open dataset of fetal brain MRI reconstructions segmented into seven different tissues (external cerebrospinal fluid, gray matter, white matter, ventricles, cerebellum, brainstem, deep gray matter). 20 international teams participated in this challenge, submitting a total of 21 algorithms for evaluation. In this paper, we provide a detailed analysis of the results from both a technical and clinical perspective. All participants relied on deep learning methods, mainly U-Nets, with some variability present in the network architecture, optimization, and image pre- and post-processing. The majority of teams used existing medical imaging deep learning frameworks. The main differences between the submissions were the fine tuning done during training, and the specific pre- and post-processing steps performed. The challenge results showed that almost all submissions performed similarly. Four of the top five teams used ensemble learning methods. However, one team's algorithm performed significantly superior to the other submissions, and consisted of an asymmetrical U-Net network architecture. This paper provides a first of its kind benchmark for future automatic multi-tissue segmentation algorithms for the developing human brain in utero.

Subjects

Image Processing, Computer-Assisted; White Matter; Pregnancy; Female; Humans; Image Processing, Computer-Assisted/methods; Brain/diagnostic imaging; Head; Fetus/diagnostic imaging; Algorithms; Magnetic Resonance Imaging/methods

9.

Head and neck tumor segmentation in PET/CT: The HECKTOR challenge.

Oreiller, Valentin; Andrearczyk, Vincent; Jreige, Mario; Boughdad, Sarah; Elhalawani, Hesham; Castelli, Joel; Vallières, Martin; Zhu, Simeng; Xie, Juanying; Peng, Ying; Iantsen, Andrei; Hatt, Mathieu; Yuan, Yading; Ma, Jun; Yang, Xiaoping; Rao, Chinmay; Pai, Suraj; Ghimire, Kanchan; Feng, Xue; Naser, Mohamed A; Fuller, Clifton D; Yousefirizi, Fereshteh; Rahmim, Arman; Chen, Huai; Wang, Lisheng; Prior, John O; Depeursinge, Adrien.

Med Image Anal ; 77: 102336, 2022 04.

Article in English | MEDLINE | ID: mdl-35016077

ABSTRACT

This paper relates the post-analysis of the first edition of the HEad and neCK TumOR (HECKTOR) challenge. This challenge was held as a satellite event of the 23rd International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2020, and was the first of its kind focusing on lesion segmentation in combined FDG-PET and CT image modalities. The challenge's task is the automatic segmentation of the Gross Tumor Volume (GTV) of Head and Neck (H&N) oropharyngeal primary tumors in FDG-PET/CT images. To this end, the participants were given a training set of 201 cases from four different centers and their methods were tested on a held-out set of 53 cases from a fifth center. The methods were ranked according to the Dice Score Coefficient (DSC) averaged across all test cases. An additional inter-observer agreement study was organized to assess the difficulty of the task from a human perspective. 64 teams registered to the challenge, among which 10 provided a paper detailing their approach. The best method obtained an average DSC of 0.7591, showing a large improvement over our proposed baseline method and the inter-observer agreement, associated with DSCs of 0.6610 and 0.61, respectively. The automatic methods proved to successfully leverage the wealth of metabolic and structural properties of combined PET and CT modalities, significantly outperforming human inter-observer agreement level, semi-automatic thresholding based on PET images as well as other single modality-based methods. This promising performance is one step forward towards large-scale radiomics studies in H&N cancer, obviating the need for error-prone and time-consuming manual delineation of GTVs.
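The ranking criterion named in this abstract, the Dice Score Coefficient averaged over test cases, is 2|A ∩ B| / (|A| + |B|) for a predicted mask A and a reference mask B. The sketch below works on flattened binary masks; the function names are hypothetical, and returning 1.0 when both masks are empty is an assumed convention rather than something the abstract specifies.

```python
def dice_score(pred, truth):
    """Dice coefficient between two flattened binary masks of equal length."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    size = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if size == 0:
        return 1.0      # assumed convention: two empty masks agree perfectly
    return 2 * intersection / size


def mean_dice(cases):
    """Average Dice over (prediction, reference) pairs, one pair per test case."""
    return sum(dice_score(p, t) for p, t in cases) / len(cases)


# Toy masks, invented for illustration: one voxel overlaps out of two in each mask.
score = dice_score([1, 1, 0, 0], [1, 0, 1, 0])
```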

Subjects

Head and Neck Neoplasms; Positron Emission Tomography Computed Tomography; Fluorodeoxyglucose F18; Head and Neck Neoplasms/diagnostic imaging; Humans; Positron Emission Tomography Computed Tomography/methods; Positron-Emission Tomography/methods; Tumor Burden

10.

The Unsupervised Feature Selection Algorithms Based on Standard Deviation and Cosine Similarity for Genomic Data Analysis.

Xie, Juanying; Wang, Mingzhao; Xu, Shengquan; Huang, Zhao; Grant, Philip W.

Front Genet ; 12: 684100, 2021.

Article in English | MEDLINE | ID: mdl-34054930

ABSTRACT

To tackle the challenges in genomic data analysis caused by tens of thousands of dimensions combined with a small number of examples and unbalanced examples between classes, a technique of unsupervised feature selection based on standard deviation and cosine similarity is proposed in this paper. We refer to this idea as SCFS (Standard deviation and Cosine similarity based Feature Selection). It defines the discernibility and independence of a feature to evaluate its capability to distinguish between classes and its redundancy with respect to other features, respectively. A 2-dimensional space is constructed using discernibility as the x-axis and independence as the y-axis to represent all features, where the features in the upper right corner have both comparatively high discernibility and independence. The importance of a feature is defined as the product of its discernibility and its independence (i.e., the area of the rectangle enclosed by the feature's coordinate lines and the axes). The upper right corner features are by far the most important, comprising the optimal feature subset. Based on different definitions of independence using cosine similarity, three feature selection algorithms are derived from SCFS. These are SCEFS (Standard deviation and Exponent Cosine similarity based Feature Selection), SCRFS (Standard deviation and Reciprocal Cosine similarity based Feature Selection) and SCAFS (Standard deviation and Anti-Cosine similarity based Feature Selection), respectively. The KNN and SVM classifiers are built based on the optimal feature subsets detected by these feature selection algorithms, respectively. The experimental results on 18 genomic datasets of cancers demonstrate that the proposed unsupervised feature selection algorithms SCEFS, SCRFS and SCAFS can detect stable biomarkers with strong classification capability. This shows that the idea proposed in this paper is powerful. The functional analysis of these biomarkers shows that the occurrence of the cancer is closely related to the biomarker gene regulation level. This fact will benefit cancer pathology research, drug development, early diagnosis, treatment and prevention.
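The scoring scheme outlined in this abstract (discernibility from the standard deviation, independence from cosine similarity, importance as their product) can be sketched as below. The abstract does not give the exact independence formulas used by SCEFS, SCRFS and SCAFS, so the definition here, one minus the largest absolute cosine similarity to any other feature, is a hypothetical stand-in chosen only to make the product-of-two-scores idea concrete.

```python
from math import sqrt
from statistics import pstdev


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    if nu == 0 or nv == 0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (nu * nv)


def feature_scores(columns):
    """Return (discernibility, independence, importance) for each feature column.

    columns: list of features, each a list of values across samples.
    Independence is a hypothetical definition, not the paper's.
    """
    scores = []
    for i, ci in enumerate(columns):
        discernibility = pstdev(ci)     # population standard deviation
        sims = [abs(cosine(ci, cj)) for j, cj in enumerate(columns) if j != i]
        independence = 1.0 - max(sims) if sims else 1.0
        scores.append((discernibility, independence, discernibility * independence))
    return scores


# Toy feature columns, invented for illustration: the first two are redundant copies.
columns = [[1, 0, 1, 0], [2, 0, 2, 0], [0, 1, 0, 3]]
scores = feature_scores(columns)
best = max(range(len(columns)), key=lambda i: scores[i][2])
```

Plotting discernibility against independence for every feature would place the highest-scoring ones toward the upper right corner, matching the 2-dimensional view described in the abstract.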

11.

Colon cancer data analysis by chameleon algorithm.

Xie, Juanying; Wang, Yuchen; Wu, Zhaozhong.

Health Inf Sci Syst ; 7(1): 23, 2019 Dec.

Article in English | MEDLINE | ID: mdl-31656596

ABSTRACT

Detecting the key differential genes of colon cancers is very important to tell colon cancer patients from normal people. A gene selection algorithm for colon cancers is proposed by using the dynamic modeling properties of chameleon algorithm and its capability to discover any arbitrary shape clusters. This chameleon algorithm based gene selection algorithm comprises three steps. The first step is to select those genes with higher Fisher function values as candidate genes. The second step is to detect gene groups by using chameleon algorithm based on Euclidean distance. The third step is to select the most important gene from each gene cluster to comprise the gene subset by using the information index to classification of each gene. After that the chameleon algorithm is used to detect groups of colon cancer patients and normal people only with genes in gene subset. The final clustering accuracy of chameleon algorithm with the selected genes is up to 85.48%. The clustering analysis to colon cancer data and the comparisons to the other related studies demonstrate that the proposed algorithm is effective in detecting the differential genes of colon cancers.

12.

Deep Learning Based Analysis of Histopathological Images of Breast Cancer.

Xie, Juanying; Liu, Ran; Luttrell, Joseph; Zhang, Chaoyang.

Front Genet ; 10: 80, 2019.

Article in English | MEDLINE | ID: mdl-30838023

ABSTRACT

Breast cancer is associated with the highest morbidity rates for cancer diagnoses in the world and has become a major public health issue. Early diagnosis can increase the chance of successful treatment and survival. However, it is a very challenging and time-consuming task that relies on the experience of pathologists. The automatic diagnosis of breast cancer by analyzing histopathological images plays a significant role for patients and their prognosis. However, traditional feature extraction methods can only extract some low-level features of images, and prior knowledge is necessary to select useful features, which can be greatly affected by humans. Deep learning techniques can extract high-level abstract features from images automatically. Therefore, we introduce it to analyze histopathological images of breast cancer via supervised and unsupervised deep convolutional neural networks. First, we adapted Inception_V3 and Inception_ResNet_V2 architectures to the binary and multi-class issues of breast cancer histopathological image classification by utilizing transfer learning techniques. Then, to overcome the influence from the imbalanced histopathological images in subclasses, we balanced the subclasses with Ductal Carcinoma as the baseline by turning images up and down, right and left, and rotating them counterclockwise by 90 and 180 degrees. Our experimental results of the supervised histopathological image classification of breast cancer and the comparison to the results from other studies demonstrate that Inception_V3 and Inception_ResNet_V2 based histopathological image classification of breast cancer is superior to the existing methods. Furthermore, these findings show that Inception_ResNet_V2 network is the best deep learning architecture so far for diagnosing breast cancers by analyzing histopathological images. 
Therefore, we used Inception_ResNet_V2 to extract features from breast cancer histopathological images to perform unsupervised analysis of the images. We also constructed a new autoencoder network to transform the features extracted by Inception_ResNet_V2 to a low dimensional space to do clustering analysis of the images. The experimental results demonstrate that using our proposed autoencoder network results in better clustering results than those based on features extracted only by Inception_ResNet_V2 network. All of our experimental results demonstrate that Inception_ResNet_V2 network based deep transfer learning provides a new means of performing analysis of histopathological images of breast cancer.

13.

Transcriptomics and differential gene expression in Whitmania pigra (Annelida: Clitellata: Hirudinida: Hirudinidae): Contrasting feeding and fasting modes.

Khan, Muhammad Salabat; Guan, De-Long; Kvist, Sebastian; Ma, Li-Bin; Xie, Juan-Ying; Xu, Sheng-Quan.

Ecol Evol ; 9(8): 4706-4719, 2019 Apr.

Article in English | MEDLINE | ID: mdl-31031937

ABSTRACT

The medicinal utility of leeches has been demonstrated through decades of use in modern hospital settings, mainly as relievers of venous congestion following flap or digit replantation surgery. In the present study, we sequence and annotate (through BLAST- and Gene Ontology-based approaches) the salivary transcriptome of the nonblood feeding hirudinid Whitmania pigra and assess the differential gene expression of anticoagulation factors (through both quantitative real-time PCR [qRT-PCR] and in silico-based methods) during feeding and fasting conditions. This was done in order to evince the diversity of putative anticoagulation factors, as well as estimate the levels of upregulation of genes immediately after feeding. In total, we found sequences with demonstrated orthology (via both phylogenetic analyses and BLAST-based approaches) to seven different proteins that have previously been linked to anticoagulatory capabilities: eglin C, bdellin, granulin, guamerin, hyaluronidase, destabilase I, and lipocalin. All of these were recovered from leeches both in the fasting and in the feeding conditions, but all show signs of upregulation in the feeding leeches. Interestingly, our RNA-seq effort, coupled with a hypergeometric test, indicated that the differentially expressed genes were disproportionately involved in three main immunological pathways (endocytosis, peroxisome regulation, and lysosome regulation). The results and implications of the finding of anticoagulants in this nonblood feeding leech and the putative upregulation of anticoagulation factors after feeding are briefly discussed in an evolutionary context.

14.

Draft Genome of a Blister Beetle Mylabris aulica.

Guan, De-Long; Hao, Xiao-Qian; Mi, Da; Peng, Jiong; Li, Yuan; Xie, Juan-Ying; Huang, Huateng; Xu, Sheng-Quan.

Front Genet ; 10: 1281, 2019.

Article in English | MEDLINE | ID: mdl-32010178

ABSTRACT

Mylabris aulica is a widely distributed blister beetle of the Meloidae family. It has the ability to synthesize a potent defensive secretion that includes cantharidin, a toxic compound used to treat many major illnesses. However, owing to the lack of genetic studies on cantharidin biosynthesis in M. aulica, the commercial use of this species is less extensive than that of other blister beetle species in China. This study reports a draft assembly and possible genes and pathways related to cantharidin biosynthesis for the M. aulica blister beetle using nanopore sequencing data. The draft genome assembly size was 288.5 Mb with a 467.8 Kb N50, and a repeat content of 50.62%. An integrated gene finding pipeline performed for assembly obtained 16,500 protein coding genes. Benchmarking universal single-copy orthologs assessment showed that this gene set included 94.4% complete Insecta universal single-copy orthologs. Over 99% of these genes were assigned functional annotations in the gene ontology, Kyoto Encyclopedia of Genes and Genomes, or Genbank non-redundant databases. Comparative genomic analysis showed that the completeness and continuity of our assembly was better than those of Hycleus cichorii and Hycleus phaleratus blister beetle genomes. The analysis of homologous orthologous genes and inference from evolutionary history imply that the Mylabris and Hycleus genera are genetically close, have a similar genetic background, and have differentiated within one million years. This M. aulica genome assembly provides a valuable resource for future blister beetle studies and will contribute to cantharidin biosynthesis.
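The N50 figure quoted in this abstract is a standard contiguity statistic: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. A minimal sketch of that definition follows; the function name and the contig lengths are hypothetical and unrelated to the M. aulica assembly.

```python
def n50(lengths):
    """Length L such that contigs of length >= L cover at least half the assembly."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:    # reached at least 50% of all assembled bases
            return length
    return 0                        # only reached for an empty list


# Toy contig lengths, invented for illustration: total 54, and 10 + 9 + 8 = 27.
value = n50([2, 3, 4, 5, 6, 7, 8, 9, 10])
```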

15.

Geographic variation in wing size and shape of the grasshopper Trilophidia annulata (Orthoptera: Oedipodidae): morphological trait variations follow an ecogeographical rule.

Bai, Yi; Dong, Jia-Jia; Guan, De-Long; Xie, Juan-Ying; Xu, Sheng-Quan.

Sci Rep ; 6: 32680, 2016 09 06.

Article in English | MEDLINE | ID: mdl-27597437

ABSTRACT

A quantitative analysis of wing variation in grasshoppers can help us understand how environmental heterogeneity affects the phenotypic patterns of insects. In this study, geometric morphometric methods were used to measure differences in wing shape and size of Trilophidia annulata among 39 geographical populations in China, and a regression analysis was applied to identify the major environmental factors contributing to the observed morphological variation. The results showed that forewing and hindwing size differed significantly among populations. Forewing shape separated the populations into geographical groups, whereas hindwing shapes overlapped geographically and did not separate the populations into such groups. Environmental PCA and thin-plate spline analysis suggested that smaller individuals with shorter, blunter-tipped forewings were mainly distributed at lower latitudes and in mountainous areas, where temperatures are higher and precipitation is greater. Correspondingly, larger-bodied grasshoppers, which have longer forewings with a longer radial sector, are distributed under the opposite conditions. We conclude that size variation in the body, forewing, and hindwing of T. annulata apparently follows Bergmann clines. Climatic variables are important in shaping morphological variation among populations, and the forewing shape of T. annulata varies along an environmental gradient.

Subjects

Ecology, Geography, Grasshoppers/anatomy & histology, Wings, Animal/anatomy & histology, Animals, Body Size

16.

Two-stage hybrid feature selection algorithms for diagnosing erythemato-squamous diseases.

Xie, Juanying; Lei, Jinhu; Xie, Weixin; Shi, Yong; Liu, Xiaohui.

Health Inf Sci Syst ; 1: 10, 2013.

Article in English | MEDLINE | ID: mdl-26042184

ABSTRACT

This paper proposes two-stage hybrid feature selection algorithms for building stable and efficient diagnostic models, and introduces a new accuracy measure to assess the models. The two-stage hybrid algorithms adopt Support Vector Machines (SVM) as the classification tool; the extended Sequential Forward Search (SFS), Sequential Forward Floating Search (SFFS), and Sequential Backward Floating Search (SBFS), respectively, as search strategies; and the generalized F-score (GF) to evaluate the importance of each feature. The new accuracy measure is used as the criterion to evaluate the performance of a temporary SVM that directs the feature selection algorithms. These hybrid methods combine the advantages of filters and wrappers to select the optimal feature subset from the original feature set and build stable and efficient classifiers. To obtain stable, statistically sound, and optimal classifiers, we conduct 10-fold cross-validation experiments in the first stage; for each algorithm, we then merge the 10 feature subsets selected in those experiments into a new full feature set for feature selection in the second stage. In the second stage, we repeat each hybrid feature selection algorithm on the fold that achieved the best result in the first stage. Experimental results show that the proposed two-stage hybrid feature selection algorithms construct efficient diagnostic models with better accuracy than those built by the corresponding hybrid feature selection algorithms without the second-stage feature selection procedure. Furthermore, our methods achieve better classification accuracy than the available algorithms for diagnosing erythemato-squamous diseases.
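
The filter-plus-wrapper idea in this abstract can be sketched in a few lines. This is not the authors' algorithm: the abstract does not state the GF formula, so the form used here (between-class scatter over summed within-class sample variances) is an assumption, a nearest-centroid classifier stands in for the SVM, plain training accuracy stands in for the new accuracy measure, and the cross-validated second stage is omitted. All function names are hypothetical.

```python
from statistics import mean


def generalized_f_score(column, labels):
    """Assumed multi-class F-score of one feature (higher = more discriminative)."""
    overall_mean = mean(column)
    between, within = 0.0, 0.0
    for cls in set(labels):
        values = [x for x, y in zip(column, labels) if y == cls]
        cls_mean = mean(values)
        between += (cls_mean - overall_mean) ** 2
        # Sample variance inside this class (each class needs >= 2 samples).
        within += sum((x - cls_mean) ** 2 for x in values) / (len(values) - 1)
    return between / within if within > 0 else float("inf")


def centroid_accuracy(X, y, features):
    """Training accuracy of a nearest-centroid classifier (stand-in for the SVM)."""
    classes = sorted(set(y))
    centroids = {
        c: [mean(row[j] for row, label in zip(X, y) if label == c) for j in features]
        for c in classes
    }

    def predict(row):
        return min(
            classes,
            key=lambda c: sum((row[j] - m) ** 2 for j, m in zip(features, centroids[c])),
        )

    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


def forward_search(X, y, accuracy_of):
    """Rank features by the filter score, then keep each one only if the
    wrapper accuracy strictly improves (a simple sequential forward search)."""
    n_features = len(X[0])
    columns = [[row[j] for row in X] for j in range(n_features)]
    ranked = sorted(
        range(n_features),
        key=lambda j: generalized_f_score(columns[j], y),
        reverse=True,
    )
    selected, best = [], 0.0
    for j in ranked:
        score = accuracy_of(selected + [j])
        if score > best:
            selected, best = selected + [j], score
    return selected, best
```

A call such as `forward_search(X, y, lambda f: centroid_accuracy(X, y, f))` returns the selected feature indices and the accuracy reached. Passing the evaluator as a callable keeps the search independent of the classifier, so a cross-validated SVM score could be substituted without changing the search loop.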

