1.
Brief Bioinform ; 23(1), 2022 Jan 17.
Article in English | MEDLINE | ID: mdl-34953464

ABSTRACT

Antibodies specifically bind to antigens and are an essential part of the immune system. Hence, antibodies are powerful tools in research and diagnostics. High-throughput sequencing technologies have promoted comprehensive profiling of the immune repertoire, which has resulted in large amounts of antibody sequences that remain to be further analyzed. In this study, antibodies were downloaded from IMGT/LIGM-DB and Sequence Read Archive databases. Contributing features from antibody heavy chains were formulated as numerical inputs and fed into an ensemble machine learning classifier to classify the antigen specificity of six classes of antibodies, namely anti-HIV-1, anti-influenza virus, anti-pneumococcal polysaccharide, anti-citrullinated protein, anti-tetanus toxoid and anti-hepatitis B virus. The classifier was validated using cross-validation and a testing dataset. The ensemble classifier achieved a macro-average area under the receiver operating characteristic curve (AUC) of 0.9246 from the 10-fold cross-validation, and 0.9264 for the testing dataset. Among the contributing features, the contribution of the complementarity-determining regions was 53.1% and that of framework regions was 46.9%, and the amino acid mutation rates occupied the first and second ranks among the top five contributing features. The classifier and insights provided in this study could promote the mechanistic study, isolation and utilization of potential therapeutic antibodies.
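The macro-average AUC reported above can be illustrated with a minimal sketch. It assumes a one-vs-rest scheme, which the abstract does not specify, and uses made-up scores for three of the six classes; the function names and all numbers are hypothetical and are not taken from the study.

```python
from typing import Dict, List, Sequence


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def macro_average_auc(true_classes: Sequence[str],
                      class_scores: Dict[str, List[float]]) -> float:
    """Unweighted mean of the one-vs-rest AUCs, one per class."""
    aucs = []
    for cls, scores in class_scores.items():
        one_vs_rest = [1 if t == cls else 0 for t in true_classes]
        aucs.append(binary_auc(one_vs_rest, scores))
    return sum(aucs) / len(aucs)


# Hypothetical toy data: six antibodies, three specificity classes.
toy_truth = ["HIV", "HIV", "flu", "flu", "tetanus", "tetanus"]
toy_scores = {
    "HIV": [0.9, 0.6, 0.2, 0.7, 0.1, 0.1],
    "flu": [0.05, 0.3, 0.7, 0.2, 0.3, 0.1],
    "tetanus": [0.05, 0.1, 0.1, 0.1, 0.6, 0.8],
}
toy_macro_auc = macro_average_auc(toy_truth, toy_scores)
```

Macro-averaging weights every class equally, so a rare specificity class influences the summary as much as a common one.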


Subjects
Amino Acid Sequence , Antibodies/chemistry , Machine Learning , Antibody Specificity , Complementarity Determining Regions , High-Throughput Nucleotide Sequencing , Humans , ROC Curve
2.
BMC Public Health ; 24(1): 1454, 2024 May 30.
Article in English | MEDLINE | ID: mdl-38816699

ABSTRACT

BACKGROUND: Various measures taken against the COVID-19 pandemic are not only effective in reducing the spread of the disease, but also lead to some unexpected results. This article regarded these measures as an intervention and explored their impact on the incidence of tuberculosis in Shantou, China. METHODS: The incidence rate and the surveillance data of tuberculosis from January 1st, 2018 to December 31st, 2021 were provided by the Shantou Tuberculosis Prevention and Control Institute. Data were divided into a pre-pandemic period (January 1st, 2018 - December 31st, 2019) and a pandemic period (January 1st, 2020 - December 31st, 2021). Interrupted time series (ITS) analysis was used to analyze the trend of tuberculosis incidence prior to and during the COVID-19 epidemic. RESULTS: The incidence of tuberculosis in Shantou decreased significantly (p < 0.05) during the pandemic compared with that prior to the pandemic. The declines in the 45-64 age group and the 65+ age group were statistically significant. When patients were stratified by occupation, the unemployed and those working in agriculture showed the largest reductions. CONCLUSIONS: In response to the pandemic, measures like lockdowns and quarantines seem to have reduced tuberculosis incidence. However, this does not imply a true decrease. Underlying causes for the reduced true incidence need further scrutiny. The findings offer a preliminary exploration of interventions designed for one disease that have unexpected effects on another.
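An interrupted time series analysis is commonly fitted as a segmented regression with a level change and a slope change at the interruption. The sketch below assumes that standard form, which the abstract does not confirm as the authors' exact model, and uses a noise-free synthetic monthly series; `fit_its`, `solve_linear_system` and all numbers are hypothetical.

```python
from typing import List


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_its(y: List[float], break_index: int) -> List[float]:
    """Least-squares fit of y_t = b0 + b1*t + b2*post_t + b3*post_t*(t - break_index).

    Returns [b0, b1, b2, b3]: baseline level, baseline trend,
    level change at the interruption, and trend change after it.
    """
    rows = []
    for t in range(len(y)):
        post = 1.0 if t >= break_index else 0.0
        rows.append([1.0, float(t), post, post * (t - break_index)])
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic series: 24 pre-interruption months, 24 post-interruption months.
BREAK = 24
synthetic_incidence = [
    50.0 - 0.2 * t + ((-8.0 + 0.1 * (t - BREAK)) if t >= BREAK else 0.0)
    for t in range(48)
]
its_coefficients = fit_its(synthetic_incidence, BREAK)
```

In this parameterization a negative third coefficient indicates an immediate drop at the interruption, while the fourth captures how the trend changes afterwards.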


Subjects
COVID-19 , Tuberculosis , Humans , China/epidemiology , COVID-19/epidemiology , COVID-19/prevention & control , Incidence , Tuberculosis/epidemiology , Tuberculosis/prevention & control , Adult , Middle Aged , Male , Female , Aged , Young Adult , Adolescent , Quarantine , Pandemics , Interrupted Time Series Analysis , SARS-CoV-2 , Communicable Disease Control/methods
3.
BMC Infect Dis ; 22(1): 331, 2022 Apr 04.
Article in English | MEDLINE | ID: mdl-35379168

ABSTRACT

BACKGROUND: A range of strict nonpharmaceutical interventions (NPIs) were implemented in many countries to combat the coronavirus disease 2019 (COVID-19) pandemic. These NPIs may also be effective at controlling seasonal influenza virus infections, as influenza viruses have the same transmission path as severe acute respiratory syndrome coronavirus 2. The aim of this study was to evaluate the effects of different NPIs on the control of seasonal influenza. METHODS: Data for 14 NPIs implemented in 33 countries and the corresponding influenza virological surveillance data were collected. The influenza suppression index was calculated as the difference between the influenza positivity rate during its period of decline from 2019 to 2020 and during the influenza epidemic seasons in the previous 9 years. A machine learning model was developed using an extreme gradient boosting tree regressor to fit the NPI and influenza suppression index data. The SHapley Additive exPlanations tool was used to characterize the NPIs that suppressed the transmission of influenza. RESULTS: Of all NPIs tested, gathering limitations had the greatest contribution (37.60%) to suppressing influenza transmission during the 2019-2020 influenza season. The three most effective NPIs were gathering limitations, international travel restrictions, and school closures. For these three NPIs, the intensity thresholds required to generate an effect were restrictions on gatherings of fewer than 1000 people, a ban on travel to all regions or total border closures, and closing only some categories of schools, respectively. There was a strong positive interaction effect between mask-wearing requirements and gathering limitations, whereas implementing a mask-wearing requirement alone, without other NPIs, diluted the effectiveness of mask-wearing requirements at suppressing influenza transmission. CONCLUSIONS: Gathering limitations, a ban on travel to all regions or total border closures, and closing some levels of schools were found to be the most effective NPIs at suppressing influenza transmission. It is recommended that the mask-wearing requirement be combined with gathering limitations and other NPIs. Our findings could facilitate the precise control of future influenza epidemics and other potential pandemics.
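The attribution idea behind SHapley Additive exPlanations can be shown with a brute-force computation of exact Shapley values. This is a sketch of the underlying game-theoretic definition only, not the SHAP library or the authors' pipeline; the value function `toy_suppression` and its numbers are invented to mimic an interaction between mask requirements and gathering limits.

```python
from itertools import permutations
from typing import Callable, Dict, FrozenSet, Sequence


def shapley_values(features: Sequence[str],
                   value: Callable[[FrozenSet[str]], float]) -> Dict[str, float]:
    """Average marginal contribution of each feature over all orderings."""
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        present: FrozenSet[str] = frozenset()
        for f in order:
            with_f = present | {f}
            totals[f] += value(with_f) - value(present)
            present = with_f
    return {f: total / len(orderings) for f, total in totals.items()}


def toy_suppression(active: FrozenSet[str]) -> float:
    """Hypothetical suppression score for a set of active NPIs."""
    score = 0.0
    if "gathering_limits" in active:
        score += 3.0
    if "travel_ban" in active:
        score += 2.0
    # Masks help only in combination with gathering limits (interaction term).
    if "gathering_limits" in active and "mask_mandate" in active:
        score += 2.0
    return score


npi_names = ["gathering_limits", "travel_ban", "mask_mandate"]
npi_attributions = shapley_values(npi_names, toy_suppression)
```

The interaction term is split evenly between the two interacting NPIs, and the attributions sum to the score of the full NPI set, which is the property that makes percentage contributions meaningful. Enumerating all orderings is only feasible for a handful of features; practical tools approximate these values.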


Subjects
COVID-19 , Influenza A Virus, H1N1 Subtype , Influenza, Human , COVID-19/epidemiology , COVID-19/prevention & control , Humans , Influenza, Human/epidemiology , Influenza, Human/prevention & control , Pandemics/prevention & control , Seasons
4.
BMC Bioinformatics ; 22(Suppl 3): 340, 2021 Jun 23.
Article in English | MEDLINE | ID: mdl-34162327

ABSTRACT

BACKGROUND: Antifreeze proteins (AFPs) are a group of proteins that prevent body fluids from forming ice crystals and thus improve biological antifreeze ability. They are vital to the survival of living organisms in extremely cold environments. However, little research has addressed sequence feature extraction and selection for antifreeze protein classification in structure and function prediction, although it is of great significance. RESULTS: In this paper, to predict antifreeze proteins, a feature representation of weighted generalized dipeptide composition (W-GDipC) and an ensemble feature selection method based on two stages and multiple regressions (LRMR-Ri) are proposed. Specifically, four feature selection algorithms (Lasso regression, Ridge regression, maximal information coefficient and Relief) are each used to select a feature set, which is the first stage of the LRMR-Ri method. If a common feature subset exists among the four sets, it is the optimal subset; otherwise, Ridge regression is used to select the optimal subset from the public set pooled from the four sets, which is the second stage of LRMR-Ri. The LRMR-Ri method combined with W-GDipC was applied both to the antifreeze protein dataset (binary classification) and to the membrane protein dataset (multiclass classification). Experimental results show that this method performs well with support vector machine (SVM), decision tree (DT) and stochastic gradient descent (SGD) classifiers. The ACC, RE and MCC values of LRMR-Ri and W-GDipC on the antifreeze protein dataset with the SVM classifier reached 95.56%, 97.06% and 0.9105, respectively, much higher than those of each single method (Lasso, Ridge, MIC and Relief) and nearly 13% higher than single Lasso for ACC. CONCLUSION: The experimental results show that the proposed LRMR-Ri and W-GDipC method can significantly improve the accuracy of antifreeze protein prediction compared with other similar single-feature methods. In addition, the method also achieved good results in the classification and prediction of membrane proteins, which verifies its wide reliability to a certain extent.
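The two-stage selection logic described above can be sketched as follows. The four candidate sets are assumed to come from already-run selectors, and `ridge_weight` is a hypothetical stand-in for coefficients from a fitted Ridge model; how many features the second stage keeps (`top_k`) is also an assumption, since the abstract does not state it.

```python
from typing import Dict, List, Sequence, Set


def two_stage_select(candidate_sets: Sequence[Set[str]],
                     ridge_weight: Dict[str, float],
                     top_k: int) -> List[str]:
    """Return the common subset if one exists, else a Ridge-ranked pick from the pool."""
    # Stage 1: features chosen by every selector form the optimal subset.
    common = set.intersection(*candidate_sets)
    if common:
        return sorted(common)
    # Stage 2: pool all candidates and keep the top_k by absolute Ridge weight.
    pooled = set.union(*candidate_sets)
    ranked = sorted(pooled, key=lambda f: (-abs(ridge_weight[f]), f))
    return ranked[:top_k]


# Hypothetical outputs of Lasso, Ridge, MIC and Relief (dipeptide feature names).
overlapping_sets = [{"AA", "AC", "GG"}, {"AC", "GG"}, {"AC", "KL"}, {"AC"}]
disjoint_sets = [{"AA"}, {"AC"}, {"GG"}, {"KL"}]
hypothetical_weights = {"AA": 0.1, "AC": -0.9, "GG": 0.4, "KL": 0.05}

stage_one_result = two_stage_select(overlapping_sets, hypothetical_weights, top_k=2)
stage_two_result = two_stage_select(disjoint_sets, hypothetical_weights, top_k=2)
```

Ranking by absolute weight treats strongly negative coefficients as informative too; ties are broken alphabetically so the result is deterministic.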


Subjects
Dipeptides , Support Vector Machine , Algorithms , Antifreeze Proteins/genetics , Reproducibility of Results
5.
BMC Genomics ; 21(1): 597, 2020 Aug 28.
Article in English | MEDLINE | ID: mdl-32859150

ABSTRACT

BACKGROUND: Antimicrobial resistance is one of our most serious health threats. Antimicrobial peptides (AMPs), effector molecules of the innate immune system, can defend host organisms against microbes, and most have shown a lower likelihood of inducing bacterial resistance than many conventional drugs. Thus, AMPs are gaining popularity as a better substitute for antibiotics. To aid researchers in the discovery of novel AMPs, we design computational approaches to screen promising candidates. RESULTS: In this work, we design a deep learning model that can learn amino acid embedding patterns, automatically extract sequence features, and fuse heterogeneous information. Results show that the proposed model outperforms state-of-the-art methods on recognition of AMPs. By visualizing data in some layers of the model, we overcome the black-box nature of deep learning, explain the working mechanism of the model, and find some important motifs in sequences. CONCLUSIONS: The ACEP model can capture similarity between amino acids, calculate attention scores for different parts of a peptide sequence in order to spot important parts that significantly contribute to final predictions, and automatically fuse a variety of heterogeneous information or features. For high-throughput AMP recognition, open source software and datasets are made freely available at https://github.com/Fuhaoyi/ACEP .


Subjects
Amino Acids , Antimicrobial Cationic Peptides , Anti-Bacterial Agents , Pore Forming Cytotoxic Proteins , Software
6.
J Med Internet Res ; 22(11): e23853, 2020 Nov 11.
Article in English | MEDLINE | ID: mdl-33098287

ABSTRACT

BACKGROUND: The novel COVID-19 disease has spread worldwide, resulting in a new pandemic. The Chinese government implemented strong intervention measures in the early stage of the epidemic, including strict travel bans and social distancing policies. Prioritizing the analysis of different contributing factors to outbreak outcomes is important for the precise prevention and control of infectious diseases. We proposed a novel framework for resolving this issue and applied it to data from China. OBJECTIVE: This study aimed to systematically identify national-level and city-level contributing factors to the control of COVID-19 in China. METHODS: Daily COVID-19 case data and related multidimensional data, including travel-related, medical, socioeconomic, environmental, and influenza-like illness factors, from 343 cities in China were collected. A correlation analysis and interpretable machine learning algorithm were used to evaluate the quantitative contribution of factors to new cases and COVID-19 growth rates during the epidemic period (ie, January 17 to February 29, 2020). RESULTS: Many factors correlated with the spread of COVID-19 in China. Travel-related population movement was the main contributing factor for new cases and COVID-19 growth rates in China, and its contributions were as high as 77% and 41%, respectively. There was a clear lag effect for travel-related factors (previous vs current week: new cases, 45% vs 32%; COVID-19 growth rates, 21% vs 20%). Travel from non-Wuhan regions was the single factor with the most significant impact on COVID-19 growth rates (contribution: new cases, 12%; COVID-19 growth rate, 26%), and its contribution could not be ignored. City flow, a measure of outbreak control strength, contributed 16% and 7% to new cases and COVID-19 growth rates, respectively. Socioeconomic factors also played important roles in COVID-19 growth rates in China (contribution, 28%). 
Other factors, including medical, environmental, and influenza-like illness factors, also contributed to new cases and COVID-19 growth rates in China. Based on our analysis of individual cities, compared to Beijing, population flow from Wuhan and internal flow within Wenzhou were driving factors for increasing the number of new cases in Wenzhou. For Chongqing, the main contributing factor for new cases was population flow from Hubei, beyond Wuhan. The high COVID-19 growth rates in Wenzhou were driven by population-related factors. CONCLUSIONS: Many factors contributed to the COVID-19 outbreak outcomes in China. The differential effects of various factors, including specific city-level factors, emphasize the importance of precise, targeted strategies for controlling the COVID-19 outbreak and future infectious disease outbreaks.


Subjects
COVID-19/epidemiology , Disease Outbreaks/statistics & numerical data , China/epidemiology , Factor Analysis, Statistical , Humans
7.
BMC Bioinformatics ; 20(Suppl 25): 700, 2019 Dec 24.
Article in English | MEDLINE | ID: mdl-31874615

ABSTRACT

BACKGROUND: Membrane proteins play an important role in the life activities of organisms. Knowing membrane protein types provides clues for understanding the structure and function of proteins. Though various computational methods for predicting membrane protein types have been developed, the results still do not meet the expectations of researchers. RESULTS: We propose two deep learning models to process sequence information and evolutionary information, respectively. Both models obtained better results than traditional machine learning models. Furthermore, to improve the performance of the sequence information model, we also provide a new vector representation method to replace the one-hot encoding, whose overall success rate improved by 3.81% and 6.55% on two datasets. Finally, a more effective model is obtained by fusing the above two models, whose overall success rate reached 95.68% and 92.98% on two datasets. CONCLUSION: The final experimental results show that our method is more effective than existing methods for predicting membrane protein types, which can help laboratory researchers to identify the type of novel membrane proteins.


Subjects
Deep Learning , Membrane Proteins/chemistry , Computational Biology , Sequence Analysis
8.
J Basic Microbiol ; 59(10): 1040-1048, 2019 Oct.
Article in English | MEDLINE | ID: mdl-31469176

ABSTRACT

Denitrification is a key nitrogen removal process that involves many denitrifying bacteria. In this study, the denitrification performance was estimated for soil samples from different land use types including farmland soil, restored wetland soil, and wetland soil. The quantitative real-time polymerase chain reaction results showed that the average abundance of nirS and nirK genes was notably affected by seasonal changes, increasing from 2.34 × 10^6 and 2.81 × 10^6 to 1.97 × 10^6 and 4.55 × 10^6 gene copies/g of dry soil, respectively, from autumn to spring. This suggests that the abundance of nirS and nirK denitrifiers in spring is higher than that in autumn. Furthermore, the abundance of nirS and nirK genes was higher in the farmland soil than in restored wetland soil and wetland soil in both seasons. According to the analyses of MiSeq sequencing of nirS and nirK genes, Halobacteriaceae could be used as a special strain to distinguish wetland soil from farmland soil and restored wetland soil. Furthermore, redundancy analysis indicated that the soil environmental variables of total carbon, total nitrogen, moisture content, and organic matter were the main factors affecting the community structures of nirS and nirK denitrifiers existing in wetland soil. These findings could contribute to understanding the differences in nirS and nirK denitrifiers between different land use types during seasonal changes.


Subjects
Bacteria/classification , Bacteria/metabolism , Denitrification/genetics , Soil Microbiology , Bacteria/genetics , Bacteria/isolation & purification , Biodiversity , China , Farms , Genes, Bacterial/genetics , Nitrite Reductases/genetics , Seasons , Soil/chemistry , Wetlands
9.
Child Adolesc Psychiatry Ment Health ; 18(1): 116, 2024 Sep 12.
Article in English | MEDLINE | ID: mdl-39267097

ABSTRACT

BACKGROUND: Poly-victimization (PV) not only threatens physical and mental health but also causes a range of social problems. Left-behind children in rural areas are more likely to experience PV. However, there have been few studies on PV among rural children, and even fewer intervention studies. OBJECTIVE: The difference-in-differences method was employed to analyze the impact of intervention measures, based on the theory of planned behavior, on PV among left-behind children in rural areas. METHODS: The study subjects were left-behind children from six middle schools in two cities in southern China, who completed the baseline survey from 2020 to 2021. They were divided by school into a control group and an intervention group, each consisting of 228 cases. Before and after the intervention, a self-designed victimization-related knowledge, attitude, and practice questionnaire, the Poly-victimization scale, and the Middle school students' coping style scale were used to evaluate the victimization-related KAP (knowledge, attitude, and practice), victimization occurrence, and coping styles of left-behind children, respectively. Stata 15.0 was used to establish a difference-in-differences regression model to analyze the impact of the intervention measures on poly-victimization and coping styles. RESULTS: Mixed ANOVA revealed that after the intervention, the KAP scores of the intervention group were significantly higher than those of the control group (p < 0.05). After the intervention, the incidence of child victimization in the intervention group dropped to 9.60% (n = 22), lower than in the baseline survey, with a statistically significant difference (p < 0.01). The incidence of PV among children in the intervention group was lower than that in the control group, with the difference being statistically significant (p < 0.01). The net reduction in the incidence of PV among children was 21.20%. After the intervention, the protection rate for preventing PV among children was 73.33%, and the effect index was 3.75. The intervention improved children's coping styles, problem-solving, and help-seeking, while reducing negative coping styles such as avoidance and venting, with the differences being statistically significant (p < 0.05). CONCLUSION: Intervention measures based on the theory of planned behavior reduce the occurrence of PV among left-behind children, and the intervention effects on different types of victimization are also different.
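The quantities reported above can be related with a small sketch. It uses the simple 2x2 form of difference-in-differences rather than the regression model fitted in Stata, and the textbook definitions of protection rate and effect index, which the abstract does not spell out. The 36% control-group incidence is an assumed value, chosen only because it makes these formulas reproduce the reported 73.33% and 3.75 from the reported 9.60%; the pre/post proportions in the DiD call are purely illustrative.

```python
def did_estimate(treat_pre: float, treat_post: float,
                 control_pre: float, control_post: float) -> float:
    """2x2 difference-in-differences: change in treated minus change in control."""
    return (treat_post - treat_pre) - (control_post - control_pre)


def protection_rate(control_incidence: float, intervention_incidence: float) -> float:
    """Share of control-group incidence avoided in the intervention group."""
    return (control_incidence - intervention_incidence) / control_incidence


def effect_index(control_incidence: float, intervention_incidence: float) -> float:
    """Ratio of control-group incidence to intervention-group incidence."""
    return control_incidence / intervention_incidence


# Reported post-intervention incidence (9.60%) and an ASSUMED control incidence (36%).
ASSUMED_CONTROL_INCIDENCE = 0.36
REPORTED_INTERVENTION_INCIDENCE = 0.096

example_protection = protection_rate(ASSUMED_CONTROL_INCIDENCE,
                                     REPORTED_INTERVENTION_INCIDENCE)
example_effect_index = effect_index(ASSUMED_CONTROL_INCIDENCE,
                                    REPORTED_INTERVENTION_INCIDENCE)

# Purely illustrative proportions for the DiD contrast (not from the study).
example_did = did_estimate(treat_pre=0.30, treat_post=0.10,
                           control_pre=0.32, control_post=0.34)
```

Under these definitions the two summary measures carry the same information, since the effect index equals 1 / (1 - protection rate).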

10.
J Infect Public Health ; 17(6): 1086-1094, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38705061

ABSTRACT

BACKGROUND: The prevalence of different types/subtypes varies across seasons and countries for seasonal influenza viruses, indicating underlying interactions between types/subtypes. The global interaction patterns and determinants for seasonal influenza types/subtypes need to be explored. METHODS: Influenza epidemiological surveillance data, as well as multidimensional data that include population-related, environment-related, and virus-related factors from 55 countries worldwide were used to explore type/subtype interactions based on Spearman correlation coefficient. The machine learning method Extreme Gradient Boosting (XGBoost) and interpretable framework SHapley Additive exPlanation (SHAP) were utilized to quantify contributing factors and their effects on interactions among influenza types/subtypes. Additionally, causal relationships between types/subtypes were also explored based on Convergent Cross-mapping (CCM). RESULTS: A consistent globally negative correlation exists between influenza A/H3N2 and A/H1N1. Meanwhile, interactions between influenza A (A/H3N2, A/H1N1) and B show significant differences across countries, primarily influenced by population-related factors. Influenza A has a stronger driving force than influenza B, and A/H3N2 has a stronger driving force than A/H1N1. CONCLUSION: The research elucidated the globally complex and heterogeneous interaction patterns among influenza type/subtypes, identifying key factors shaping their interactions. This sheds light on better seasonal influenza prediction and model construction, informing targeted prevention strategies and ultimately reducing the global burden of seasonal influenza.
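The Spearman correlation used above to measure type/subtype interactions is the Pearson correlation of rank-transformed data. The sketch below implements that definition with average ranks for ties; the two weekly positivity series are invented to show a perfectly negative relationship and do not come from the surveillance data.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical positivity rates for two subtypes over five periods.
h3n2_positivity = [0.10, 0.35, 0.60, 0.20, 0.05]
h1n1_positivity = [0.50, 0.20, 0.05, 0.30, 0.55]
subtype_rho = spearman(h3n2_positivity, h1n1_positivity)
```

Because only ranks matter, the coefficient captures any monotonic relationship between two subtypes, not just a linear one.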


Subjects
Global Health , Influenza A Virus, H1N1 Subtype , Influenza A Virus, H3N2 Subtype , Influenza B virus , Influenza, Human , Seasons , Humans , Influenza, Human/epidemiology , Influenza, Human/virology , Machine Learning , Epidemiological Monitoring , Prevalence
11.
Front Surg ; 10: 1047558, 2023.
Article in English | MEDLINE | ID: mdl-36936651

ABSTRACT

Objective: Postoperative red blood cell (RBC) transfusion is widely used during the perioperative period but is often associated with a high risk of infection and complications. However, prediction models for RBC transfusion in patients undergoing orthopedic surgery have not yet been developed. We aimed to identify predictors and construct prediction models for RBC transfusion after orthopedic surgery using interpretable machine learning algorithms. Methods: This retrospective cohort study reviewed a total of 59,605 patients undergoing orthopedic surgery from June 2013 to January 2019 across 7 tertiary hospitals in China. Patients were randomly split into training (80%) and test (20%) subsets. Recursive feature elimination (RFE) was used to identify an optimal feature subset from thirty preoperative variables, and six machine learning algorithms were applied to develop prediction models. The SHapley Additive exPlanations (SHAP) value was employed to evaluate the contribution of each predictor to the prediction of postoperative RBC transfusion. To simplify clinical use, a risk score system was further established using the top risk factors identified by the machine learning models. Results: Of the 59,605 patients undergoing orthopedic surgery, 19,921 (33.40%) received postoperative RBC transfusion. The CatBoost model exhibited an AUC of 0.831 (95% CI: 0.824-0.836) on the test subset, significantly outperforming the five other prediction models. The risk of RBC transfusion was associated with old age (>60 years) and low RBC count (<4.0 × 10^12/L), with clear threshold effects. Extremes of BMI, low albumin, prolonged activated partial thromboplastin time, and repair and plastic operations on joint structures were additional top predictors of RBC transfusion. The risk score system derived from six risk factors performed well, with an AUC of 0.801 (95% CI: 0.794-0.807) on the test subset. Conclusion: By applying an interpretable machine learning framework in a large-scale multicenter retrospective cohort, we identified novel modifiable risk factors and developed prediction models with good performance for postoperative RBC transfusion in patients undergoing orthopedic surgery. Our findings may allow more precise identification of high-risk patients for optimal control of risk factors and achieve personalized RBC transfusion for orthopedic patients.

12.
Front Immunol ; 13: 1048774, 2022.
Article in English | MEDLINE | ID: mdl-36713410

ABSTRACT

Introduction: Differences in influenza susceptibility are widespread, and understanding them has great practical significance for the accurate prevention and control of influenza. Methods: Here, we focused on the susceptibility of healthy adults to seasonal influenza A/H3N2 at the baseline level. Whole-blood expression data for influenza A/H3N2 susceptibility were first collected from GEO (30 symptomatic and 19 asymptomatic). Then, to explore the differences at baseline, a suite of systems biology approaches (differential expression analysis, co-expression network analysis, and immune cell frequency analysis) was used. Results: We found that the baseline condition, especially the immune condition, differed between symptomatic and asymptomatic individuals. The co-expression module that is positively related to the asymptomatic state is also related to the naïve B cell immune cell type. Function enrichment analysis showed significant correlation with "B cell receptor signaling pathway", "immune response-activating cell surface receptor signaling pathway" and so on. Also, modules that are positively related to the symptomatic state are correlated with the neutrophil immune cell type, with function enrichment analysis showing significant correlations with "response to bacterium", "inflammatory response", "cAMP-dependent protein kinase complex" and so on. Responses of symptomatic and asymptomatic hosts after virus exposure differ in resisting the virus, with a more effective frontline defense in asymptomatic hosts. A prediction model was also built based only on baseline transcription information to differentiate symptomatic and asymptomatic populations, with an accuracy of 0.79. Discussion: The results not only improve our understanding of the immune system and influenza susceptibility, but also provide a new direction for precise and targeted prevention and therapy of influenza.


Subjects
Influenza, Human , Adult , Humans , Influenza A Virus, H3N2 Subtype/genetics , Transcriptome , Seasons
13.
Transbound Emerg Dis ; 69(5): e1584-e1594, 2022 Sep.
Article in English | MEDLINE | ID: mdl-35192224

ABSTRACT

Coronavirus disease 2019 (COVID-19) has become a global pandemic and continues to prevail with multiple rebound waves in many countries. The driving factors for the spread of COVID-19 and their quantitative contributions, especially to rebound waves, are not well studied. Multidimensional time-series data, including policy, travel, medical, socioeconomic, environmental, mutant and vaccine-related data, were collected from 39 countries up to 30 June 2021, and an interpretable machine learning framework (XGBoost model with Shapley Additive explanation interpretation) was used to systematically analyze the effect of multiple factors on the spread of COVID-19, using the daily effective reproduction number as an indicator. Based on a model of the pre-vaccine era, policy-related factors were shown to be the main drivers of the spread of COVID-19, with a contribution of 60.81%. In the post-vaccine era, the contribution of policy-related factors decreased to 28.34%, accompanied by an increase in the contribution of travel-related factors, such as domestic flights, and contributions emerged for mutant-related (16.49%) and vaccine-related (7.06%) factors. For single-peak countries, the dominant ones were policy-related factors during both the rising and fading stages, with overall contributions of 33.7% and 37.7%, respectively. For double-peak countries, factors from the rebound stage contributed 45.8% and policy-related factors showed the greatest contribution in both the rebound (32.6%) and fading (25.0%) stages. For multiple-peak countries, the Delta variant, domestic flights (current month) and the daily vaccination population are the three greatest contributors (8.12%, 7.59% and 7.26%, respectively). Forecasting models to predict the rebound risk were built based on these findings, with accuracies of 0.78 and 0.81 for the pre- and post-vaccine eras, respectively. 
These findings quantitatively demonstrate the systematic drivers of the spread of COVID-19, and the framework proposed in this study will facilitate the targeted prevention and control of the ongoing COVID-19 pandemic.


Subjects
COVID-19 , Pandemics , Animals , COVID-19/epidemiology , COVID-19/veterinary , Machine Learning , Pandemics/prevention & control , SARS-CoV-2 , Travel , Travel-Related Illness
14.
Article in English | MEDLINE | ID: mdl-31352350

ABSTRACT

Cell-penetrating peptides (CPPs) are functional short peptides with high carrying capacity. CPP sequences with targeting functions enable the highly efficient delivery of drugs to target cells. This paper focuses on predicting the cargo category of CPPs: a biocomputational model is constructed to efficiently distinguish, among the seven known deliverable cargo categories, the category of cargo carried by CPPs acting as macromolecular carriers. Based on dipeptide composition (DipC), an improved feature representation method, general dipeptide composition (G-DipC), is proposed for short peptide sequences and can effectively increase the abundance of features represented. Linear discriminant analysis (LDA) is then applied to mine important low-dimensional features of G-DipC, and a predictive model is built with the XGBoost algorithm. Experimental results with five-fold cross-validation show that G-DipC improves accuracy by 25 and 5 percent compared with amino acid composition (AAC) and DipC, respectively. G-DipC is even found to be better than tripeptide composition (TipC). Thus, the proposed model provides a novel resource for the study of cell-penetrating peptides, and the improved dipeptide composition G-DipC can be widely adapted to determine the feature representation of other biological sequences.
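As a point of reference for the feature representations compared above, the following sketch computes plain dipeptide composition (DipC): the relative frequency of each of the 400 ordered pairs of adjacent standard amino acids. It does not implement G-DipC, whose definition the abstract does not give; the function name and the choice to skip pairs containing non-standard residues are assumptions.

```python
from itertools import product
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence: str) -> List[float]:
    """400-dimensional vector of adjacent-pair frequencies, ordered as DIPEPTIDES."""
    counts = {pair: 0 for pair in DIPEPTIDES}
    total = 0
    for i in range(len(sequence) - 1):
        pair = sequence[i:i + 2]
        if pair in counts:  # skip pairs with non-standard residues
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[pair] / total for pair in DIPEPTIDES]


# Toy peptide: the adjacent pairs are AC, CA, AC.
toy_vector = dipeptide_composition("ACAC")
```

For short peptides most of the 400 entries are zero, which is the sparsity that richer representations aim to reduce.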


Subjects
Cell-Penetrating Peptides , Computational Biology/methods , Machine Learning , Algorithms , Cell-Penetrating Peptides/chemistry , Cell-Penetrating Peptides/metabolism , Discriminant Analysis , Models, Statistical , Support Vector Machine
15.
Viruses ; 12(10), 2020 Oct 03.
Article in English | MEDLINE | ID: mdl-33022948

ABSTRACT

Characterizing the spatial transmission pattern is critical for better surveillance and control of human influenza. Here, we propose a mutation network framework that utilizes network theory to study the transmission of human influenza H3N2. On the basis of the mutation network, the transmission analysis captured the circulation pattern from a global simulation of human influenza H3N2. Furthermore, this method was applied to explore, in detail, the transmission patterns within Europe, the United States, and China, revealing the regional spread of human influenza H3N2. The mutation network framework proposed here could facilitate the understanding, surveillance, and control of other infectious diseases.


Subjects
Influenza A Virus, H3N2 Subtype/genetics , Influenza, Human/transmission , Influenza, Human/virology , Mutation , China , Europe , Humans , Influenza A Virus, H3N2 Subtype/classification , Phylogeny , United States
16.
Emerg Microbes Infect ; 9(1): 988-990, 2020 Dec.
Article in English | MEDLINE | ID: mdl-32321369

ABSTRACT

Since Dec 2019, China has experienced an outbreak caused by a novel coronavirus, 2019-nCoV. A travel ban was implemented for Wuhan, Hubei on Jan 23 to slow down the outbreak. We found a significant positive correlation between population influx from Wuhan and confirmed cases in other cities across China (R^2 = 0.85, P < 0.001), especially cities in Hubei (R^2 = 0.88, P < 0.001). Removing the travel restriction would have increased overall cases for the coming week by 118% (91%-172%), and a travel ban imposed three days or a week earlier would have reduced early cases by 47% (26%-58%) and 83% (78%-89%), respectively. We would expect a 61% (48%-92%) increase in overall cumulative cases without any restrictions on returning residents, and an 11% (8%-16%) increase if the travel ban stays in place for Hubei. Cities in the Yangtze River Delta, Pearl River Delta, and Capital Economic Circle regions are at higher risk.


Subjects
Coronavirus Infections/epidemiology ; Pneumonia, Viral/epidemiology ; Travel/statistics & numerical data ; Betacoronavirus/isolation & purification ; COVID-19 ; China/epidemiology ; Coronavirus Infections/transmission ; Humans ; Pandemics ; Pneumonia, Viral/transmission ; SARS-CoV-2 ; Travel/legislation & jurisprudence
17.
Med Biol Eng Comput ; 57(12): 2553-2565, 2019 Dec.
Article in English | MEDLINE | ID: mdl-31621050

ABSTRACT

Apoptosis proteins are related to many diseases. Obtaining subcellular localization information for apoptosis proteins helps in understanding disease mechanisms and in developing new drugs. At present, researchers focus mainly on primary protein sequences, so there is still room to improve the prediction accuracy of the subcellular localization of apoptosis proteins. In this paper, a new method named ERT-ECT-PSSM-IS is proposed to predict the subcellular localization of apoptosis proteins based on the position-specific scoring matrix (PSSM). First, local and global features in different directions are extracted by evolutionary row transformation (ERT) and cross-covariance of evolutionary column transformation (ECT) based on the PSSM (ERT-ECT-PSSM). Second, an improved isometric mapping algorithm (I-SMA) is used to eliminate redundant features. Finally, we adopt a support vector machine (SVM) for classification, and the prediction accuracy is evaluated by jackknife cross-validation tests. The experimental results show that the proposed method not only extracts a richer feature representation but also has better predictive performance and robustness for the subcellular localization of apoptosis proteins in the ZD98, ZW225, and CL317 databases. Graphical abstract: Framework of the proposed prediction model.
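
The jackknife (leave-one-out) evaluation mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: a simple nearest-centroid classifier is swapped in for the SVM so that the example needs no external libraries, and the feature vectors, labels, and function names are hypothetical.

```python
def nearest_centroid_predict(train_x, train_y, query):
    """Predict the label whose class centroid is closest to `query`."""
    sums, counts = {}, {}
    for features, label in zip(train_x, train_y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    best_label, best_dist = None, float("inf")
    for label, acc in sums.items():
        centroid = [v / counts[label] for v in acc]
        dist = sum((q - c) ** 2 for q, c in zip(query, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def jackknife_accuracy(samples, labels, predict=nearest_centroid_predict):
    """Hold out each sample once, train on the rest, score the held-out one."""
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if predict(train_x, train_y, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)

# Hypothetical two-dimensional feature vectors and localization labels.
toy_x = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [1.0, 0.9], [0.9, 1.1], [1.1, 1.0]]
toy_y = ["cyto", "cyto", "cyto", "mito", "mito", "mito"]
accuracy = jackknife_accuracy(toy_x, toy_y)
```

Because every sample is held out exactly once, the jackknife estimate is deterministic for a given dataset and classifier, which is one reason it is often used on small benchmark sets.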


Subjects
Apoptosis Regulatory Proteins/metabolism ; Apoptosis/physiology ; Algorithms ; Computational Biology/methods ; Position-Specific Scoring Matrices ; Support Vector Machine
18.
Comput Biol Chem ; 81: 9-15, 2019 Aug.
Article in English | MEDLINE | ID: mdl-31472418

ABSTRACT

The Position-Specific Scoring Matrix (PSSM) is an excellent feature extraction method that was proposed early in protein classification, but because of constraints on the feature shape of the PSSM, researchers have made many attempts to process it so that it can be fed to traditional machine learning algorithms. This processing discards some of the information the PSSM provides, which limits the feature representation. Moreover, the high-dimensional feature representation of the PSSM makes it incompatible with other feature extraction methods. We use the PSSM as the input to a Recurrent Neural Network (RNN) without any post-processing, treating the amino acids in a protein sequence as the time steps of the RNN. This approach takes full advantage of the information that the PSSM provides. In this study, the PSSM is input to the model directly so that its internal information is fully utilized; we propose an end-to-end solution and achieve state-of-the-art performance. Finally, we explore how to combine the PSSM with traditional feature extraction methods and achieve slightly improved performance. Our network architecture is implemented in Python and is available at https://github.com/YellowcardD/RNN-for-membrane-protein-types-prediction.
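
The idea of treating each residue as one RNN time step can be sketched in plain Python. This is a minimal illustration, not the architecture in the linked repository: it assumes a single-layer vanilla (tanh) RNN, an arbitrary hidden size, untrained random weights, and random numbers standing in for real PSSM scores. All names below are hypothetical.

```python
import math
import random

def rnn_final_state(pssm, w_in, w_rec, bias):
    """Run a vanilla RNN over PSSM rows and return the last hidden state."""
    hidden = [0.0] * len(bias)
    for row in pssm:  # one residue (one 20-value PSSM row) per time step
        new_hidden = []
        for j in range(len(bias)):
            total = bias[j]
            total += sum(w_in[j][k] * row[k] for k in range(len(row)))
            total += sum(w_rec[j][k] * hidden[k] for k in range(len(hidden)))
            new_hidden.append(math.tanh(total))
        hidden = new_hidden
    return hidden

def random_matrix(rows, cols, rng, scale=0.1):
    """Matrix of uniform random values in [-scale, scale]."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
HIDDEN, ALPHABET = 8, 20  # hidden size is arbitrary; a PSSM has 20 columns
w_in = random_matrix(HIDDEN, ALPHABET, rng)
w_rec = random_matrix(HIDDEN, HIDDEN, rng)
bias = [0.0] * HIDDEN

# Two hypothetical PSSMs of different lengths (random scores, not real profiles).
short_pssm = random_matrix(5, ALPHABET, rng, scale=5.0)
long_pssm = random_matrix(12, ALPHABET, rng, scale=5.0)
state_short = rnn_final_state(short_pssm, w_in, w_rec, bias)
state_long = rnn_final_state(long_pssm, w_in, w_rec, bias)
```

Both proteins yield a hidden state of the same size even though their PSSMs have different numbers of rows, which is why a recurrent model can consume the matrix directly without first compressing it to a fixed-length vector. A classifier layer and training loop would sit on top of this state in a real model.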


Subjects
Membrane Proteins/classification ; Neural Networks, Computer ; Position-Specific Scoring Matrices ; Computational Biology/methods ; Databases, Protein/statistics & numerical data ; Membrane Proteins/chemistry
19.
Sci Rep ; 7(1): 4503, 2017 07 03.
Article in English | MEDLINE | ID: mdl-28674446

ABSTRACT

The in-situ stress state in the Tarim Basin, Northwest China, down to 7 km depth is constrained using the anelastic strain recovery (ASR) method and wellbore failure analysis. Results are consistent between the two methods, and indicate that the maximum principal stresses (σ1) are close to vertical and the intermediate and minimum principal stresses (σ2 and σ3) are approximately horizontal. The state of stress at the studied wellbore is in the normal faulting stress regime within the Tarim Basin rather than in the compressional tectonic stress regime found at the periphery of the Tarim Basin, which explains the presence of the normal faults interpreted in 3-D seismic profiles collected from adjacent areas. Our results demonstrate that the ASR method can be used on rocks recovered from depths as great as 7 km to obtain reliable stress state information. The in-situ stress measurement results presented in this paper will help future development of petroleum resources and kinematic studies in the Tarim Basin.
