Results 1 - 20 of 372
1.
Genome Res ; 33(8): 1284-1298, 2023 08.
Article in English | MEDLINE | ID: mdl-37714713

ABSTRACT

Chinese indicine cattle harbor a much higher genetic diversity compared with other domestic cattle, but their genome architecture remains uninvestigated. Using PacBio HiFi sequencing data from 10 Chinese indicine cattle across southern China, we assembled 20 high-quality partially phased genomes and integrated them into a multiassembly graph containing 148.5 Mb (5.6%) of novel sequence. We identified 156,009 high-confidence nonredundant structural variants (SVs) and 206 SV hotspots spanning ∼195 Mb of gene-rich sequence. We detected 34,249 archaic introgressed fragments in Chinese indicine cattle covering 1.93 Gb (73.3%) of the genome. We inferred an average of 3.8%, 3.2%, 1.4%, and 0.5% of introgressed sequence originating, respectively, from banteng-like, kouprey-like, gayal-like, and gaur-like Bos species, as well as 0.6% of unknown origin. Introgression from multiple donors might have contributed to the genetic diversity of Chinese indicine cattle. Altogether, this study highlights the contribution of interspecies introgression to the genomic architecture of an important livestock population and shows how exotic genomic elements can contribute to the genetic variation available for selection.


Subjects
Cattle; Ruminants; Animals; Cattle/genetics; China; Genome; Genomics; Ruminants/genetics
2.
PLoS Genet ; 19(2): e1010615, 2023 02.
Article in English | MEDLINE | ID: mdl-36821549

ABSTRACT

The worldwide sheep population comprises more than 1,000 breeds. Together, these exhibit a considerable morphological diversity, which has not been extensively investigated at the molecular level. Here, we analyze whole-genome sequences of 1,098 domestic sheep from 154 breeds, and 69 wild sheep from seven Ovis species. On average, we detected 6.8%, 1.0% and 0.2% introgressed sequence in domestic sheep originating from Iranian mouflon, urial and argali, respectively, with rare introgressions from other wild species. Interestingly, several introgressed haplotypes contributed to the morphological differentiation across sheep breeds, such as an RXFP2 haplotype from Iranian mouflon conferring the spiral horn trait, an MSRB3 haplotype from argali strongly associated with ear morphology, and a VPS13B haplotype probably originating from urial and mouflon possibly associated with facial traits. Our results reveal that introgression events from wild Ovis species contributed to the high rate of morphological differentiation in sheep breeds, but also to individual variation within breeds. We propose that long divergent haplotypes are a ubiquitous source of phenotypic variation that allows adaptation to a variable environment, and that these remain intact in the receiving population probably due to reduced recombination.


Subjects
Acclimatization; Sheep, Domestic; Sheep/genetics; Animals; Sheep, Domestic/genetics; Haplotypes/genetics; Iran; Phenotype
3.
Proteomics ; 24(12-13): e2300371, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38643379

ABSTRACT

Forecasting alterations in protein stability caused by variations holds immense importance. Improving the thermal stability of proteins is important for biomedical and industrial applications. This review discusses the latest methods for predicting the effects of mutations on protein stability, databases containing protein mutations and thermodynamic parameters, and experimental techniques for efficiently assessing protein stability in high-throughput settings. Various publicly available databases for protein stability prediction are introduced. Furthermore, state-of-the-art computational approaches for anticipating protein stability changes due to variants are reviewed. Each method's types of features, base algorithm, and prediction results are also detailed. Additionally, some experimental approaches for verifying the prediction results of computational methods are introduced. Finally, the review summarizes the progress and challenges of protein stability prediction and discusses potential models for future research directions.


Subjects
Protein Stability; Proteins; Thermodynamics; Proteins/chemistry; Proteins/metabolism; Computational Biology/methods; Databases, Protein; Algorithms; Mutation; Humans
4.
Proteomics ; : e2300302, 2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38258387

ABSTRACT

Small proteins (SPs) are a unique group of proteins that play crucial roles in many important biological processes. Exploring the biological function of SPs is necessary. In this study, the InterPro tool and the maximum correlation method were utilized to analyze functional domains of SPs. The purpose was to identify important functional domains that can indicate the essential differences between small and large protein sequences. First, the small and large proteins were represented by their functional domains via a one-hot scheme. Then, the MaxRel method was adopted to evaluate the relationships between each domain and the target variable, indicating small or large protein. The top 36 domain features were selected for further investigation. Among them, 14 were deemed to be highly related to SPs because they were annotated to SPs more frequently than large proteins. We found the involvement of functional domains, such as ubiquitin-conjugating enzyme/RWD-like, nuclear transport factor 2 domain, and alpha subunit of guanine nucleotide-binding protein (G-protein) in regulating the biological function of SPs. The involvement of these domains has been confirmed by other recent studies. Our findings indicate that protein functional domains may regulate small protein-related functions and predict their biological activity.
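
The one-hot representation and relevance ranking described in this abstract can be illustrated with a minimal sketch. It assumes that relevance is scored by mutual information between each binary domain feature and the small/large label, which is a common choice for max-relevance ranking but is not stated in the abstract. The domain names (`dom_A`, `dom_B`, `dom_C`), the toy proteins, and all function names are hypothetical.

```python
import math
from collections import Counter

def one_hot(domain_sets, vocabulary):
    """Encode each protein as a 0/1 vector over the domain vocabulary."""
    return [[1 if d in s else 0 for d in vocabulary] for s in domain_sets]

def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * math.log2(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi

def rank_by_relevance(matrix, labels, vocabulary):
    """Score every domain against the label and sort by decreasing relevance."""
    scores = []
    for j, name in enumerate(vocabulary):
        column = [row[j] for row in matrix]
        scores.append((name, mutual_information(column, labels)))
    return sorted(scores, key=lambda item: item[1], reverse=True)

# Hypothetical toy data: label 1 = small protein, 0 = large protein.
proteins = [{"dom_A"}, {"dom_A", "dom_B"}, {"dom_B"}, {"dom_C"}]
labels = [1, 1, 0, 0]
vocabulary = ["dom_A", "dom_B", "dom_C"]

matrix = one_hot(proteins, vocabulary)
ranking = rank_by_relevance(matrix, labels, vocabulary)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy set, `dom_A` occurs only in small proteins and therefore ranks first, while `dom_B` is split evenly between the classes and carries no information about the label.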

5.
Biochem Genet ; 2024 Feb 21.
Article in English | MEDLINE | ID: mdl-38383836

ABSTRACT

Breast cancer remains the most prevalent cancer in women. To date, its underlying molecular mechanisms have not been fully uncovered. The determination of gene factors is important to improve our understanding of breast cancer, as it can link specific gene expression to tumor staging. However, the knowledge in this regard is still far from complete. Thus, this study aimed to explore these knowledge gaps by analyzing existing gene expression profile data from 3149 breast cancer samples, where each sample was represented by the expression of 19,644 genes and classified into Nottingham histological grade (NHG) classes (Grade 1, 2, and 3). To this end, a machine learning-based framework was designed. First, the profile data were analyzed by using seven feature ranking algorithms to evaluate the importance of features (genes). Seven feature lists were generated, each of which sorted features in accordance with feature importance evaluated from a particular perspective. Then, the incremental feature selection method was applied to each list to determine essential features for classification and to build efficient classifiers. Consequently, overlapping genes, such as AURKA, CBX2, and MYBL2, were deemed as potentially related to breast cancer malignancy and prognosis, because such genes were identified as important by multiple feature ranking algorithms. In addition, the study formulated classification rules to reflect specific gene expression patterns for the three NHG classes. Some genes and rules were analyzed and supported by recent literature, providing new references for studying breast cancer.
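
Incremental feature selection over a ranked feature list, as mentioned in this abstract, can be sketched as follows. This is an illustrative outline only: the abstract does not name the classifiers used, so a leave-one-out nearest-centroid classifier stands in here, and the data, function names, and ranking are hypothetical.

```python
def loo_accuracy(samples, labels, feature_idx):
    """Leave-one-out accuracy of a nearest-centroid classifier on chosen features."""
    correct = 0
    for i, held_out in enumerate(samples):
        sums, counts = {}, {}
        for k, (row, label) in enumerate(zip(samples, labels)):
            if k == i:
                continue  # exclude the held-out sample from the centroids
            vector = [row[j] for j in feature_idx]
            if label not in sums:
                sums[label] = [0.0] * len(vector)
                counts[label] = 0
            sums[label] = [a + b for a, b in zip(sums[label], vector)]
            counts[label] += 1
        query = [held_out[j] for j in feature_idx]

        def squared_distance(label):
            return sum((q - s / counts[label]) ** 2
                       for q, s in zip(query, sums[label]))

        predicted = min(sums, key=squared_distance)
        correct += predicted == labels[i]
    return correct / len(samples)

def incremental_feature_selection(samples, labels, ranked_features):
    """Evaluate the top-1, top-2, ... feature subsets and keep the first best one."""
    best_k, best_accuracy = 0, -1.0
    for k in range(1, len(ranked_features) + 1):
        accuracy = loo_accuracy(samples, labels, ranked_features[:k])
        if accuracy > best_accuracy:
            best_k, best_accuracy = k, accuracy
    return ranked_features[:best_k], best_accuracy

# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
samples = [[0.0, 5.0], [0.2, 1.0], [0.1, 9.0],
           [1.0, 2.0], [1.1, 8.0], [0.9, 4.0]]
labels = [0, 0, 0, 1, 1, 1]
ranked_features = [0, 1]  # assumed output of some feature ranking algorithm

selected, accuracy = incremental_feature_selection(samples, labels, ranked_features)
print(selected, accuracy)
```

Running the same procedure on each of several ranked lists and intersecting the selected subsets would give the kind of overlapping genes the abstract reports.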

6.
Mol Biol Evol ; 39(12), 2022 12 05.
Article in English | MEDLINE | ID: mdl-36382357

ABSTRACT

Understanding the genetic mechanism of how animals adapt to extreme conditions is fundamental to determining the relationship between molecular evolution and changing environments. Goat is one of the first domesticated species and has evolved rapidly to adapt to diverse environments, including harsh high-altitude conditions with low temperature and poor oxygen supply but strong ultraviolet radiation. Here, we analyzed 331 genomes of domestic goats and wild caprid species living at varying altitudes (high > 3000 m above sea level and low < 1200 m), along with a reference-guided chromosome-scale assembly (contig-N50: 90.4 Mb) of a female Tibetan goat genome based on PacBio HiFi long reads, to dissect the genetic determinants underlying their adaptation to harsh conditions on the Qinghai-Tibetan Plateau (QTP). Population genomic analyses combined with genome-wide association studies (GWAS) revealed a genomic region harboring the 3'-phosphoadenosine 5'-phosphosulfate synthase 2 (PAPSS2) gene showing strong association with high-altitude adaptability (P_GWAS = 3.62 × 10⁻²⁵) in Tibetan goats. Transcriptomic data from 13 tissues revealed that PAPSS2 was implicated in hypoxia-related pathways in Tibetan goats. We further verified the potential functional role of PAPSS2 in response to hypoxia in PAPSS2-deficient cells. Introgression analyses suggested that the PAPSS2 haplotype conferring the high-altitude adaptability in Tibetan goats originated from a recent hybridization between goats and a wild caprid species, the markhor (Capra falconeri). In conclusion, our results uncover a hitherto unknown contribution of PAPSS2 to high-altitude adaptability in Tibetan goats on the QTP, following interspecific introgression and natural selection.


Subjects
Genome-Wide Association Study; Goats; Animals; Goats/genetics; Ultraviolet Rays; Genomics
7.
Cell Tissue Res ; 393(1): 149-161, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37221302

ABSTRACT

The formation of skeletal muscle is a complex process that is coordinated by many regulatory factors, such as myogenic factors and noncoding RNAs. Numerous studies have proved that circRNA is an indispensable part of muscle development. However, little is known about circRNAs in bovine myogenesis. In this study, we discovered a novel circRNA, circ2388, formed by reverse splicing of the fourth and fifth exons of the MYL1 gene. The expression of circ2388 was different between fetal and adult cattle muscle. This circRNA is 99% homologous between cattle and buffalo and is localized in the cytoplasm. We further showed that circ2388 had no effect on cattle and buffalo myoblast proliferation but promoted myoblast differentiation and myotube fusion. Furthermore, circ2388 stimulated skeletal muscle regeneration in vivo in a mouse muscle injury model. Taken together, our findings suggest that circ2388 promotes myoblast differentiation as well as the recovery and regeneration of damaged muscles.


Subjects
Myoblasts; RNA, Circular; Mice; Animals; Cattle; Myoblasts/metabolism; RNA, Circular/genetics; RNA, Circular/metabolism; Buffaloes; Cell Proliferation/genetics; Muscle Fibers, Skeletal/metabolism; Muscle, Skeletal/injuries; Muscle Development/genetics; Cell Differentiation
8.
Opt Express ; 31(26): 43891-43907, 2023 Dec 18.
Article in English | MEDLINE | ID: mdl-38178474

ABSTRACT

Polarization 3D imaging has been a research hotspot in the field of 3D facial reconstruction because of its biosafety, high efficiency, and simplicity. However, the application of this technology is limited by the multi-valued problem of the azimuth angle of the normal vector. Currently, the most common method to overcome this limitation is to introduce additional depth techniques at the cost of reducing its applicability. This study presents a passive 3D polarization facial imaging method that does not require additional depth-capturing devices. It addresses the issue of azimuth ambiguity based on prior information about the target image's features. Specifically, by statistically analyzing the probability distribution of real azimuth angles, it is found that their quadrant distribution is closely related to the positions of facial feature points. Therefore, through facial feature detection, the polarized normal azimuth angle of each pixel can be accurately assigned to the corresponding quadrant, thus determining a precise unique normal vector and achieving accurate 3D facial reconstruction. Finally, our azimuth angle correction method was validated by simulated polarization imaging results, and it achieved accurate correction for over 75% of the global pixels without using additional depth techniques. Experimental results further indicate that this method can achieve polarization 3D facial imaging under natural conditions without extra depth devices, and the 3D results preserve edge details and texture information.
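
The azimuth ambiguity mentioned in this abstract can be illustrated with a small sketch. It relies on the standard relations for linear Stokes parameters measured at polarizer angles of 0°, 45°, 90° and 135°: the angle of linear polarization is only defined modulo π, so each pixel has two candidate azimuths, φ and φ + π. The quadrant prior below is a hypothetical stand-in for the facial-feature-based assignment described in the abstract, whose details are not given here; all function names are assumptions.

```python
import math

def angle_of_linear_polarization(i0, i45, i90, i135):
    """Angle of linear polarization in [0, pi) from four polarizer images."""
    s1 = i0 - i90      # Stokes parameter S1
    s2 = i45 - i135    # Stokes parameter S2
    return (0.5 * math.atan2(s2, s1)) % math.pi

def resolve_azimuth(aolp, expected_quadrant):
    """Pick phi or phi + pi, whichever lies closer to the expected quadrant.

    expected_quadrant is 1..4 (hypothetical prior, e.g. from facial landmarks).
    """
    center = (expected_quadrant - 0.5) * (math.pi / 2)  # quadrant 1 -> pi/4

    def circular_distance(angle):
        d = abs(angle - center) % (2 * math.pi)
        return min(d, 2 * math.pi - d)

    return min((aolp, aolp + math.pi), key=circular_distance)

def simulated_intensity(theta, s0=2.0, dolp=0.5, phi=math.radians(30)):
    """Intensity behind an ideal linear polarizer at angle theta (toy model)."""
    return 0.5 * s0 * (1 + dolp * math.cos(2 * theta - 2 * phi))

# Toy pixel whose true polarization angle is 30 degrees.
readings = [simulated_intensity(math.radians(a)) for a in (0, 45, 90, 135)]
aolp = angle_of_linear_polarization(*readings)
azimuth = resolve_azimuth(aolp, expected_quadrant=3)
print(math.degrees(aolp), math.degrees(azimuth))
```

With the prior set to the third quadrant, the sketch selects the φ + π candidate; with the first quadrant it would keep φ unchanged.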

9.
Opt Lett ; 48(19): 5053-5056, 2023 Oct 01.
Article in English | MEDLINE | ID: mdl-37773383

ABSTRACT

The shape from polarization is a noncontact 3D imaging method that shows great potential, but its application is limited by the monocular camera system and surface integration algorithm. This Letter proposes a novel, to the best of our knowledge, method that employs deep neural networks to enhance multi-target 3D reconstruction, making a significant advancement in the field. By constructing the relationship between targets' blur, distance, and clarity, the proposed method provides accurate spatial information while mitigating inaccuracies arising from the continuous model. Experiments show that the constructed neural network can help improve the multi-target 3D reconstruction quality compared with conventional methods.

10.
Appl Opt ; 62(21): 5627-5635, 2023 Jul 20.
Article in English | MEDLINE | ID: mdl-37707178

ABSTRACT

The traditional polarization three-dimensional (3D) imaging technology has limited applications in the field of vision because it can only obtain the relative depth information of the target. Based on the principle of polarization stereo vision, this study combines camera calibration with a monocular ranging model to achieve high-precision recovery of the target's absolute depth information in multi-target scenes. Meanwhile, an adaptive camera intrinsic matrix prediction method is proposed to overcome changes in the camera intrinsic matrix caused by focusing on fuzzy targets outside the depth of field in multi-target scenes, thereby realizing monocular polarized 3D absolute depth reconstruction under dynamic focusing of targets at different depths. Experimental results indicate that the recovery error of monocular polarized 3D absolute depth information for the clear target is less than 10%, and the detail error is only 0.19 mm. Also, the precision of absolute depth reconstruction remains above 90% after dynamic focusing on the blurred target. The proposed monocular polarized 3D absolute depth reconstruction technology for multi-target scenes can broaden application scenarios of the polarization 3D imaging technology in the field of vision.

11.
Proteomics ; 22(15-16): e2100190, 2022 08.
Article in English | MEDLINE | ID: mdl-35567424

ABSTRACT

Protein-protein interactions (PPIs) form the basis of a myriad of biological pathways and mechanisms, such as the formation of protein complexes or the components of signaling cascades. Here, we reviewed experimental methods for identifying PPI pairs, including yeast two-hybrid (Y2H), mass spectrometry (MS), co-localization, and co-immunoprecipitation. Furthermore, a range of computational methods leveraging biochemical properties, evolutionary history, protein structures and more have enabled identification of additional PPIs. Given the wealth of known PPIs, we reviewed important network methods to construct and analyze networks of PPIs. These methods aid biological discovery through identifying hub genes and dynamic changes in the network, and have been thoroughly applied in various fields of biological research. Lastly, we discussed the challenges and future directions of research utilizing the power of PPI networks.


Subjects
Protein Interaction Mapping; Protein Interaction Maps; Protein Interaction Mapping/methods; Proteins/metabolism; Saccharomyces cerevisiae/metabolism
12.
Mol Genet Genomics ; 297(5): 1301-1313, 2022 Sep.
Article in English | MEDLINE | ID: mdl-35780439

ABSTRACT

The lung is the most important organ in the human respiratory system, and its normal function is essential for human beings. Under certain pathological conditions, normal lung function can no longer be maintained, and lung transplantation is generally applied to ease patients' breathing and prolong their lives. However, several risk factors exist during and after lung transplantation, including bleeding, infection, and transplant rejection. In particular, transplant rejection is difficult to predict or prevent and leads to the most dangerous complications and severe conditions in patients undergoing lung transplantation. Given that the most common monitoring and validation methods for lung transplant rejection may take a long time and have low reproducibility, new technologies and methods are required to improve the efficacy and accuracy of rejection monitoring after lung transplantation. A recent study established the gene expression profiles of patients who underwent lung transplantation. However, it did not provide a tool to predict lung transplantation responses. Here, an in-depth investigation was conducted on these profiling data. A computational framework, incorporating several machine learning algorithms, such as feature selection methods and classification algorithms, was built to establish an effective prediction model classifying patients into different clinical subgroups, corresponding to different rejection responses after lung transplantation. Furthermore, the framework also screened essential genes with functional enrichments and created quantitative rules for distinguishing patients with different rejection responses to lung transplantation. The outcome of this contribution could provide guidelines for the clinical treatment of each rejection subtype and help reveal the complicated rejection mechanisms of lung transplantation.


Subjects
Lung Transplantation; Graft Rejection; Humans; Lung; Reproducibility of Results; Transcriptome
13.
Mol Ecol ; 31(16): 4364-4380, 2022 08.
Article in English | MEDLINE | ID: mdl-35751552

ABSTRACT

By their paternal transmission, Y-chromosomal haplotypes are sensitive markers of population history and male-mediated introgression. Previous studies identified biallelic single-nucleotide variants in the SRY, ZFY and DDX3Y genes, which in domestic goats identified four major Y-chromosomal haplotypes, Y1A, Y1B, Y2A and Y2B, with a marked geographical partitioning. Here, we extracted goat Y-chromosomal variants from whole-genome sequences of 386 domestic goats (75 breeds) and seven wild goat species, which were generated by the VarGoats goat genome project. Phylogenetic analyses indicated domestic haplogroups corresponding to Y1B, Y2A and Y2B, respectively, whereas Y1A is split into Y1AA and Y1AB. All five haplogroups were detected in 26 ancient DNA samples from southeast Europe or Asia. Haplotypes from present-day bezoars are not shared with domestic goats and are attached to deep nodes of the trees and networks. Haplogroup distributions for 186 domestic breeds indicate ancient paternal population bottlenecks and expansions during migrations into northern Europe, eastern and southern Asia, and Africa south of the Sahara. In addition, sharing of haplogroups indicates male-mediated introgressions, most notably an early gene flow from Asian goats into Madagascar and the crossbreeding that in the 19th century resulted in the popular Boer and Anglo-Nubian breeds. More recent introgressions are those from European goats into the native Korean goat population and from Boer goat into Uganda, Kenya, Tanzania, Malawi and Zimbabwe. This study illustrates the power of the Y-chromosomal variants for reconstructing the history of domestic species with a wide geographical range.


Subjects
DNA, Mitochondrial; Genetic Variation; Animals; DNA, Mitochondrial/genetics; Goats/genetics; Haplotypes/genetics; Phylogeny; Y Chromosome/genetics
14.
Appl Opt ; 61(21): 6228-6233, 2022 Jul 20.
Article in English | MEDLINE | ID: mdl-36256236

ABSTRACT

Diffuse polarization-based 3D imaging has flourished with the ability to obtain the 3D shapes of objects without multiple detectors, active mode lighting, or complex mechanical structures, which are major drawbacks of other methods for 3D imaging in natural scenes. However, traditional polarization-based 3D imaging technology introduces color distortion when reconstructing the surface of multi-colored targets. We propose a polarization-based 3D imaging model to recover the 3D geometry of multi-colored Lambertian objects. In particular, chromaticity-based color removal theory is used to restore the intrinsic intensity, which is modulated only by the target shape, and we apply the recovered intrinsic intensity to address the orientation uncertainty of target normals due to azimuth ambiguity. Finally, we integrate the corrected normals to reconstruct high-precision 3D shapes. Experimental results demonstrate that the proposed model has the ability to reconstruct multi-colored Lambertian objects exhibiting non-uniform reflectance from single views under natural light conditions.

15.
Mol Biol Evol ; 37(7): 2099-2109, 2020 07 01.
Article in English | MEDLINE | ID: mdl-32324877

ABSTRACT

Goats are one of the most widespread farmed animals across the world; however, their migration route to East Asia and local evolutionary history remain poorly understood. Here, we sequenced 27 ancient Chinese goat genomes dating from the Late Neolithic period to the Iron Age. We found close genetic affinities between ancient and modern Chinese goats, demonstrating their genetic continuity. We found that Chinese goats originated from the eastern regions around the Fertile Crescent, and we estimated that the ancestors of Chinese goats diverged from this population in the Chalcolithic period. Modern Chinese goats were divided into a northern and a southern group, coinciding with the most prominent climatic division in China, and two genes related to hair follicle development, FGF5 and EDA2R, were highly divergent between these populations. We identified a likely causal de novo deletion near FGF5 in northern Chinese goats that increased to high frequency over time, whereas EDA2R harbored standing variation dating to the Neolithic. Our findings add to our understanding of the genetic composition and local evolutionary process of Chinese goats.


Subjects
Biological Evolution; DNA, Ancient/chemistry; Genome; Goats/genetics; Adaptation, Biological; Animals; China; Selection, Genetic
16.
Mol Genet Genomics ; 296(4): 905-918, 2021 Jul.
Article in English | MEDLINE | ID: mdl-33914130

ABSTRACT

Phenotype is one of the most significant concepts in genetics and is used to describe all the observable characteristics of a research object. Considering that phenotype reflects the integrated features of genotype and environmental factors, it is hard to define phenotype characteristics and even harder to predict unknown phenotypes. Restricted by current biological techniques, it is still quite expensive and time-consuming to obtain sufficient structural information on large-scale phenotype-associated genes/proteins. Various bioinformatics methods have been presented to solve this problem, and researchers have confirmed the efficacy and prediction accuracy of functional network-based prediction. However, general functional descriptions have highly complicated inner structures for phenotype prediction. To further address this issue and improve the efficacy of phenotype prediction on more than ten kinds of phenotypes, we first extract functional enrichment features from GO and KEGG, and then use node2vec to learn functional embedding features of genes from a gene-gene network. All these features are analyzed by feature selection methods (Boruta, minimum redundancy maximum relevance) to generate a feature list. This list is fed into incremental feature selection, incorporating multi-label classifiers built by RAkEL and some classic base classifiers, to build an optimum multi-label multi-class classification model for phenotype prediction. According to recent research, our method has indeed identified many literature-supported genes/proteins and their associated phenotypes, and even some candidate genes with newly assigned phenotypes, providing a new computational tool for accurate and effective phenotypic prediction.


Subjects
Algorithms; Computational Biology/methods; Genetic Association Studies/methods; Datasets as Topic; Gene Regulatory Networks/physiology; Metabolic Networks and Pathways/genetics; Phenotype; Proteins/chemistry; Proteins/genetics; Proteins/physiology; Saccharomyces cerevisiae/genetics; Saccharomyces cerevisiae/metabolism; Saccharomyces cerevisiae Proteins/chemistry; Saccharomyces cerevisiae Proteins/genetics; Saccharomyces cerevisiae Proteins/physiology; Structure-Activity Relationship
17.
Genet Sel Evol ; 53(1): 74, 2021 Sep 10.
Article in English | MEDLINE | ID: mdl-34507524

ABSTRACT

BACKGROUND: Goat, one of the first domesticated livestock, is a worldwide important species both culturally and economically. The current goat reference genome, known as ARS1, is reported as the first nonhuman genome assembly using 69× PacBio sequencing. However, ARS1 suffers from incomplete X chromosome and highly fragmented Y chromosome scaffolds. RESULTS: Here, we present a very high-quality de novo genome assembly, Saanen_v1, from a male Saanen dairy goat, with the first goat Y chromosome scaffold based on 117× PacBio long-read sequencing and 118× Hi-C data. Saanen_v1 displays a high level of completeness thanks to the presence of centromeric and telomeric repeats at the proximal and distal ends of two-thirds of the autosomes, and a much reduced number of gaps (169 vs. 773). The completeness and accuracy of the Saanen_v1 genome assembly are also evidenced by more assembled sequences on the chromosomes (2.63 Gb for Saanen_v1 vs. 2.58 Gb for ARS1), a slightly increased mapping ratio for transcriptomic data, and more genes anchored to chromosomes. The eight putative large assembly errors (1 to ~ 7 Mb each) found in ARS1 were amended, and for the first time, the substitution rate of this ruminant Y chromosome was estimated. Furthermore, sequence improvement in Saanen_v1, compared with ARS1, enables us to assign the likely correct positions for 4.4% of the single nucleotide polymorphism (SNP) probes in the widely used GoatSNP50 chip. CONCLUSIONS: The updated goat genome assembly including both sex chromosomes (X and Y) and the autosomes with high-resolution quality will serve as a valuable resource for goat genetic research and applications.


Subjects
Genome/genetics; Genomics; Goats/genetics; Animals; Chromosomes, Mammalian/genetics; Dairying; Male; Polymorphism, Single Nucleotide
18.
Genomics ; 112(6): 4945-4958, 2020 11.
Article in English | MEDLINE | ID: mdl-32919019

ABSTRACT

Coronary artery disease (CAD) is the most common cardiovascular disease. CAD research has greatly progressed during the past decade. mRNA profiling is a traditional and popular approach to investigating various diseases, including CAD. Compared with mRNA, lncRNA has better stability and thus may serve as a better disease indicator in blood. Investigating potential CAD-related lncRNAs and mRNAs will greatly contribute to the diagnosis and treatment of CAD. In this study, a computational analysis was conducted on patients with CAD by using a comprehensive transcription dataset with combined mRNA and lncRNA expression data. Several machine learning algorithms, including feature selection methods and classification algorithms, were applied to screen for the most CAD-related RNA molecules. Decision rules were also reported to provide a quantitative description of the effect of these RNA molecules on CAD progression. These new findings (CAD-related RNA molecules and rules) can help understand mRNA and lncRNA expression levels in CAD.


Subjects
Coronary Artery Disease/genetics; RNA, Long Noncoding/metabolism; RNA, Messenger/metabolism; Coronary Artery Disease/metabolism; Gene Expression Profiling; Humans; Machine Learning
19.
Genomics ; 112(3): 2524-2534, 2020 05.
Article in English | MEDLINE | ID: mdl-32045671

ABSTRACT

The development of embryonic cells involves several continuous stages, and some genes are related to embryogenesis. To date, few studies have systematically investigated changes in gene expression profiles during mammalian embryogenesis. In this study, a computational analysis using machine learning algorithms was performed on the gene expression profiles of mouse embryonic cells at seven stages. First, the profiles were analyzed through a powerful Monte Carlo feature selection method for the generation of a feature list. Second, incremental feature selection was applied to the list by incorporating two classification algorithms: support vector machine (SVM) and repeated incremental pruning to produce error reduction (RIPPER). Through SVM, we extracted several latent gene biomarkers, indicating the stages of embryonic cells, and constructed an optimal SVM classifier that produced a nearly perfect classification of embryonic cells. Furthermore, some interesting rules were obtained with the RIPPER algorithm, suggesting different expression patterns for different stages.


Subjects
Embryo, Mammalian/metabolism; Embryonic Development/genetics; Machine Learning; Transcriptome; Animals; Gene Expression Profiling; Mice; Single-Cell Analysis; Support Vector Machine
20.
Gene Ther ; 26(12): 465-478, 2019 12.
Article in English | MEDLINE | ID: mdl-31455874

ABSTRACT

Oral cancer (OC) is one of the most common cancers threatening human lives. However, OC pathogenesis has yet to be fully uncovered, and thus designing effective treatments remains difficult. Identifying genes related to OC is an important step toward this goal. In this study, we proposed three computational models for inferring novel OC-related genes. In contrast to previously proposed computational methods, which lacked learning procedures, each proposed model adopted a one-class learning algorithm, which can provide a deep insight into features of validated OC-related genes. A network embedding algorithm (i.e., node2vec) was applied to the protein-protein interaction network to produce the representation of genes. The features of the OC-related genes were used in the training of the one-class algorithm, and the performance of the final inferring model was improved through a feature selection procedure. Then, candidate genes were produced by applying the trained inferring model to other genes. Three tests were performed to screen out the important candidate genes. Accordingly, we obtained three inferred gene sets, any two of which were different. The inferred genes were also different from previously reported genes, and some of them have been included in the public Oral Cancer Gene Database. Finally, we analyzed several inferred genes to confirm whether they are novel OC-related genes.


Subjects
Computational Biology/methods; Gene Regulatory Networks; Mouth Neoplasms/genetics; Databases, Genetic; Genetic Predisposition to Disease; Humans; Machine Learning; Protein Interaction Maps
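
The one-class learning idea in the abstract above (training only on validated disease-related genes and then scoring other genes) can be sketched in a few lines. The abstract does not name the one-class algorithm, so a simple centroid-and-radius model stands in here; the two-dimensional vectors are hypothetical placeholders for node2vec embeddings, and all function names are assumptions.

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    dims = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dims)]

def fit_one_class(positive_vectors, quantile=1.0):
    """Fit a center and a radius that covers the given share of positives."""
    center = centroid(positive_vectors)
    distances = sorted(math.dist(v, center) for v in positive_vectors)
    index = max(0, math.ceil(quantile * len(distances)) - 1)
    return center, distances[index]

def is_candidate(vector, model):
    """A gene is a candidate if its embedding falls inside the fitted radius."""
    center, radius = model
    return math.dist(vector, center) <= radius

# Hypothetical embeddings of validated disease-related genes.
known_genes = [[0.0, 0.0], [0.0, 2.0], [2.0, 0.0], [2.0, 2.0]]
model = fit_one_class(known_genes)

print(is_candidate([1.0, 1.5], model))  # inside the radius
print(is_candidate([5.0, 5.0], model))  # far from the known genes
```

Lowering `quantile` tightens the radius, which trades recall on the known genes for a shorter and more conservative candidate list.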