1.
Nucleic Acids Res ; 51(15): 8005-8019, 2023 08 25.
Article in English | MEDLINE | ID: mdl-37283060

ABSTRACT

Broad-host-range (BHR) plasmids in human gut bacteria are of considerable interest for their ability to mediate horizontal gene transfer (HGT) across large phylogenetic distances. However, the human gut plasmids, especially the BHR plasmids, remain largely unknown. Here, we identified the plasmids in the draft genomes of gut bacterial isolates from Chinese and American donors, resulting in 5372 plasmid-like clusters (PLCs), of which 820 PLCs (comPLCs) were estimated to be > 60% complete and only 155 (18.9%) were classified to known replicon types (n = 37). We observed that 175 comPLCs had a broad host range across distinct bacterial genera, of which 71 were detected in at least two of the Chinese, American, Spanish, and Danish human populations, and 13 were highly prevalent (>10%) in at least one human population. Haplotype analyses of two widespread PLCs demonstrated their spreading and evolutionary trajectory, suggesting frequent and recent exchanges of the BHR plasmids in environments. In conclusion, we obtained a large collection of plasmid sequences in human gut bacteria and demonstrated that a subset of the BHR plasmids can be transmitted globally, thus facilitating extensive HGT events (e.g. of antibiotic resistance genes). This study highlights the potential implications of the plasmids for global human health.


Subject(s)
Gastrointestinal Microbiome , Humans , Gastrointestinal Microbiome/genetics , Phylogeny , Host Specificity , Plasmids/genetics , Bacteria/genetics , Gene Transfer, Horizontal/genetics
2.
Brief Bioinform ; 23(1)2022 01 17.
Article in English | MEDLINE | ID: mdl-34953464

ABSTRACT

Antibodies specifically bind to antigens and are an essential part of the immune system. Hence, antibodies are powerful tools in research and diagnostics. High-throughput sequencing technologies have promoted comprehensive profiling of the immune repertoire, which has resulted in large amounts of antibody sequences that remain to be further analyzed. In this study, antibodies were downloaded from IMGT/LIGM-DB and Sequence Read Archive databases. Contributing features from antibody heavy chains were formulated as numerical inputs and fed into an ensemble machine learning classifier to classify the antigen specificity of six classes of antibodies, namely anti-HIV-1, anti-influenza virus, anti-pneumococcal polysaccharide, anti-citrullinated protein, anti-tetanus toxoid and anti-hepatitis B virus. The classifier was validated using cross-validation and a testing dataset. The ensemble classifier achieved a macro-average area under the receiver operating characteristic curve (AUC) of 0.9246 from the 10-fold cross-validation, and 0.9264 for the testing dataset. Among the contributing features, the contribution of the complementarity-determining regions was 53.1% and that of framework regions was 46.9%, and the amino acid mutation rates occupied the first and second ranks among the top five contributing features. The classifier and insights provided in this study could promote the mechanistic study, isolation and utilization of potential therapeutic antibodies.


Subject(s)
Amino Acid Sequence , Antibodies/chemistry , Machine Learning , Antibody Specificity , Complementarity Determining Regions , High-Throughput Nucleotide Sequencing , Humans , ROC Curve
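
The macro-average AUC reported in the abstract above can be illustrated with a minimal sketch. It assumes the usual one-vs-rest definition (one AUC per class, then an unweighted mean); the abstract does not state how the authors computed it, and the function names and toy inputs below are hypothetical.

```python
def binary_auc(scores, labels):
    """AUC as the probability that a random positive outranks a random
    negative; ties count as half a win (Mann-Whitney formulation)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_auc(prob_rows, true_classes, classes):
    """Unweighted mean of the one-vs-rest AUCs over all classes.

    prob_rows[i][k] is the predicted score of sample i for classes[k].
    """
    aucs = []
    for k, cls in enumerate(classes):
        scores = [row[k] for row in prob_rows]
        labels = [t == cls for t in true_classes]
        aucs.append(binary_auc(scores, labels))
    return sum(aucs) / len(aucs)


# Hypothetical toy data: three antigen classes, four antibodies.
classes = ["HIV-1", "influenza", "tetanus"]
probs = [[0.8, 0.1, 0.1], [0.2, 0.7, 0.1], [0.1, 0.2, 0.7], [0.5, 0.3, 0.2]]
truth = ["HIV-1", "influenza", "tetanus", "influenza"]
print(macro_auc(probs, truth, classes))
```

The pairwise comparison is quadratic in the number of samples, which is acceptable for a sketch; a rank-based implementation would be preferable for large repertoires.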
3.
BMC Bioinformatics ; 17: 142, 2016 Mar 23.
Article in English | MEDLINE | ID: mdl-27006077

ABSTRACT

BACKGROUND: High-throughput bio-OMIC technologies are producing high-dimension data from bio-samples at an ever increasing rate, whereas the training sample number in a traditional experiment remains small due to various difficulties. This "large p, small n" paradigm in the area of biomedical "big data" may be at least partly solved by feature selection algorithms, which select only features significantly associated with phenotypes. Feature selection is an NP-hard problem. Due to the exponentially increased time requirement for finding the globally optimal solution, all the existing feature selection algorithms employ heuristic rules to find locally optimal solutions, and their solutions achieve different performances on different datasets. RESULTS: This work describes a feature selection algorithm based on a recently published correlation measurement, Maximal Information Coefficient (MIC). The proposed algorithm, McTwo, aims to select features associated with phenotypes, independently of each other, and achieving high classification performance of the nearest neighbor algorithm. Based on the comparative study of 17 datasets, McTwo performs about as well as or better than existing algorithms, with significantly reduced numbers of selected features. The features selected by McTwo also appear to have particular biomedical relevance to the phenotypes from the literature. CONCLUSION: McTwo selects a feature subset with very good classification performance, as well as a small feature number. So McTwo may represent a complementary feature selection algorithm for the high-dimensional biomedical datasets.


Subject(s)
Algorithms , Databases, Factual , Humans , Software
4.
Adv Exp Med Biol ; 827: 261-74, 2015.
Article in English | MEDLINE | ID: mdl-25387969

ABSTRACT

All cell types are under strict control of how their genes are transcribed into expressed transcripts by the temporally dynamic orchestration of transcription factor binding activities. Given a set of known binding sites (BSs) of a given transcription factor (TF), computational TFBS screening represents a cost-efficient and large-scale strategy to complement the experimental ones. There are two major classes of computational TFBS prediction algorithms, based on the tertiary and primary structures, respectively. A tertiary structure based algorithm tries to calculate the binding affinity between a query DNA fragment and the tertiary structure of the given TF. Due to the limited number of available TF tertiary structures, primary structure based TFBS prediction is a necessary complementary technique for large-scale TFBS screening. This study proposes a novel evolutionary algorithm that randomly mutates the weights of different positions in the binding motif of a TF, so that the overall TFBS prediction accuracy is optimized. The comparison with the most widely used algorithm, Position Weight Matrix (PWM), suggests that our algorithm performs better than or at the same level as PWM in all the performance measurements, including sensitivity, specificity, accuracy and Matthews correlation coefficient. Our data also suggest that it is necessary to remove the widely used assumption of independence between motif positions. The supplementary material may be found at: http://www.healthinformaticslab.org/supp/ .


Subject(s)
Biological Evolution , Transcription Factors/metabolism , Algorithms , Binding Sites
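
For readers unfamiliar with the PWM baseline named in the abstract above, the following is a minimal sketch of a standard log-odds position weight matrix. It is not the evolutionary algorithm the study proposes; the pseudocount, the uniform background, the function names, and the example sites are all assumptions made for illustration.

```python
import math
from collections import Counter

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM from equal-length known binding sites."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all binding sites must have the same length")
    total = len(sites) + pseudocount * len(BASES)
    pwm = []
    for pos in range(length):
        counts = Counter(site[pos] for site in sites)
        pwm.append({
            base: math.log2(((counts[base] + pseudocount) / total) / background)
            for base in BASES
        })
    return pwm


def score(pwm, window):
    """Sum the per-position log-odds; this is where positions are
    assumed to be independent of each other."""
    return sum(column[base] for column, base in zip(pwm, window))


def best_hit(pwm, sequence):
    """Return (score, offset) of the highest-scoring window."""
    width = len(pwm)
    return max(
        (score(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )


# Hypothetical TATA-like sites, used only to exercise the functions.
sites = ["TATAAA", "TATAAT", "TATATA", "TATAAA"]
pwm = build_pwm(sites)
print(best_hit(pwm, "GGCGTATAAAGGC"))
```

The sketch accepts only the letters A, C, G and T; ambiguous bases such as N would need explicit handling. The additive `score` function makes the independence assumption visible, which is the assumption the abstract argues should be dropped.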
5.
Genomics ; 103(1): 48-55, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24239985

ABSTRACT

Psoriasis is an autoimmune disease whose symptoms can significantly impair the patient's quality of life. It is mainly diagnosed through visual inspection of the lesional skin by experienced dermatologists. Currently no cure for psoriasis is available due to limited knowledge about its pathogenesis and development mechanisms. Previous studies have profiled hundreds of differentially expressed genes related to psoriasis, but no robust psoriasis prediction model is available. This study integrated the knowledge of three feature selection algorithms that revealed 21 features belonging to 18 genes as candidate markers. The final psoriasis classification model was established using the novel Incremental Feature Selection algorithm and utilizes only 3 features from 2 unique genes, IGFL1 and C10orf99. This model has demonstrated highly stable prediction accuracy (averaged at 99.81%) over three independent validation strategies. The two marker genes, IGFL1 and C10orf99, were revealed as the upstream components of the growth signal transduction pathway of psoriatic pathogenesis.


Subject(s)
Models, Genetic , Psoriasis/diagnosis , Psoriasis/genetics , Transcriptome , Algorithms , Artificial Intelligence , Case-Control Studies , Cell Proliferation , Databases, Factual , Gene Expression Profiling , Genetic Markers , Humans , Microarray Analysis , Psoriasis/classification , ROC Curve , Signal Transduction/genetics , Skin/cytology , Skin/pathology
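
The general idea behind incremental feature selection can be sketched as follows: features are added one at a time in ranked order, and the prefix with the best evaluation score is kept. This is a generic sketch under that assumption, not the specific algorithm of the study above; the `evaluate` callback and the toy scoring rule are hypothetical stand-ins for a real cross-validated classifier.

```python
def incremental_feature_selection(ranked_features, evaluate):
    """Grow the feature set in ranked order and keep the best prefix.

    ranked_features: features sorted from most to least promising.
    evaluate: callable taking a list of features and returning a score
              (higher is better), e.g. cross-validated accuracy.
    """
    best_subset, best_score = [], float("-inf")
    current = []
    for feature in ranked_features:
        current.append(feature)
        current_score = evaluate(current)
        # Strict '>' keeps the smaller subset when scores tie.
        if current_score > best_score:
            best_subset, best_score = list(current), current_score
    return best_subset, best_score


# Hypothetical scoring rule: reward informative features, penalize size.
informative = {"f1", "f2"}


def toy_evaluate(subset):
    return len(informative.intersection(subset)) - 0.1 * len(subset)


print(incremental_feature_selection(["f1", "f2", "f3", "f4"], toy_evaluate))
```

Because ties are resolved in favor of the earlier, smaller prefix, the procedure tends toward compact models such as the 3-feature model described in the abstract.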
6.
Mol Biol Rep ; 41(9): 5883-9, 2014 Sep.
Article in English | MEDLINE | ID: mdl-25108673

ABSTRACT

Mycobacterium tuberculosis (M. tuberculosis) is one of the most widely spread human pathogenic bacteria, and it frequently exchanges pathogenesis genes among its strains or with other pathogenic microbes. The purpose of this study was to screen the pathogenicity islands (PAIs) in M. tuberculosis using the genomic barcode visualization technique and to characterize the functions of the detected PAIs. By visually screening the barcode image of the M. tuberculosis chromosomes, three candidate PAIs were detected as MPI-1, MPI-2 and MPI-3, among which MPI-2 and MPI-3 were known to harbor pathogenesis genes, and MPI-1 represents a novel candidate. Based on the functional annotations of Pfam domains and GO categories, both MPI-2 and MPI-3 carry genes encoding PE/PPE family proteins, MPI-2 encodes the type VII secretion system, and MPI-3 encodes genes for mycolic acid synthesis in the cell wall. Some of these genes were already widely used in early diagnosis or treatment of M. tuberculosis. The novel candidate PAI MPI-1 encodes CRISPR-Cas family proteins, which are known to be associated with persistent infection of M. tuberculosis. Our data represent a molecular basis and protocol for comprehensively annotating the pathogenic systems of M. tuberculosis, and will also facilitate the development of diagnosis and vaccination techniques for M. tuberculosis.


Subject(s)
Bacterial Proteins/genetics , Genome, Bacterial , Genomic Islands , Mycobacterium tuberculosis/genetics , DNA Barcoding, Taxonomic , Databases, Genetic , Genomics , Mycobacterium tuberculosis/pathogenicity
7.
Front Immunol ; 14: 1195533, 2023.
Article in English | MEDLINE | ID: mdl-37654488

ABSTRACT

Background: Pre-existing cross-reactive immunity among different coronaviruses, also termed immune imprinting, may have a comprehensive impact on subsequent SARS-CoV-2 infection and COVID-19 vaccination effectiveness. Here, we aim to explore the interplay between pre-existing seasonal coronavirus (sCoV) antibodies and the humoral immunity induced by COVID-19 vaccination. Methods: We first collected serum samples from healthy donors prior to the COVID-19 pandemic and from individuals who had received COVID-19 vaccination post-pandemic in China, and the levels of IgG antibodies against sCoVs and SARS-CoV-2 were detected by ELISA. The Wilcoxon rank sum test and chi-square test were used to compare the differences in magnitude and seropositivity rate between the two groups. Then, we recruited a longitudinal cohort to collect serum samples before and after COVID-19 vaccination. The levels of IgG antibodies against SARS-CoV-2 S, S1, S2 and N antigens were monitored. Associations between pre-existing sCoV antibodies and COVID-19 vaccination-induced antibodies were analyzed by Spearman rank correlation. Results: 96.0% of samples (339/353) showed the presence of IgG antibodies against at least one subtype of sCoVs. 229E and OC43 exhibited the highest seroprevalence rates at 78.5% and 72.0%, respectively, followed by NL63 (60.9%) and HKU1 (52.4%). The levels of IgG antibodies against the two β-coronaviruses (OC43 and HKU1) were significantly higher in donors who had been inoculated with COVID-19 vaccines than in pre-pandemic healthy donors. However, we found that COVID-19 vaccine-induced antibody levels were not significantly different between the two groups with high or low levels of pre-existing sCoV antibodies in the longitudinal cohort. Conclusion: We found a high prevalence of antibodies against sCoVs in the Chinese population. The immune imprinting by sCoVs could be reactivated by COVID-19 vaccination, but it did not appear to be a major factor affecting the immunogenicity of COVID-19 vaccine. These findings will provide insights into understanding the impact of immune imprinting on subsequent multiple shots of COVID-19 vaccines.


Subject(s)
COVID-19 Vaccines , COVID-19 , Humans , Pandemics , Seasons , Seroepidemiologic Studies , COVID-19/epidemiology , COVID-19/prevention & control , SARS-CoV-2 , Immunoglobulin G
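
The Spearman rank correlation named in the Methods above is the Pearson correlation of the ranks, with tied values sharing their average rank. The sketch below shows that definition using only the standard library; the function names and the antibody readings are hypothetical, and no p-value is computed.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical optical-density readings for five donors.
pre_existing_scov_igg = [0.21, 0.85, 0.40, 1.30, 0.62]
vaccine_induced_igg = [1.10, 0.95, 1.40, 1.05, 1.20]
print(spearman(pre_existing_scov_igg, vaccine_induced_igg))
```

Working on ranks rather than raw values makes the measure insensitive to monotone transformations such as log-scaling of titers.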
8.
Emerg Microbes Infect ; 12(2): 2245931, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37542407

ABSTRACT

Yearly epidemics of seasonal influenza cause an enormous disease burden around the globe. Understanding the rules behind the immune response to repeated vaccination remains a significant challenge, and meeting it would help optimize vaccination strategy. In this study, 34 healthy volunteers, 16 of whom were vaccinated, were recruited, and the dynamics of the BCR repertoire for consecutive vaccinations in two seasons were tracked. In terms of diversity, length, network, V and J gene segment usage, somatic hypermutation (SHM) rate and isotype, it was found that the overall changes were stronger in the acute phase of the first vaccination than in that of the second vaccination. However, the V gene segments IGHV4-39, IGHV3-9, IGHV3-7 and IGHV1-69 were amplified in the acute phase of the first vaccination, with IGHV3-7 dominant. On the other hand, for the second vaccination, the changes were dominated by IGHV1-69, which has the potential to encode broadly neutralizing antibodies. Additional analysis indicates that the amplification of the IGHV3-7 V gene segment in the acute phase of the first vaccination was due to the elevated usage of isotypes IgM and IgG3, whereas for IGHV1-69 in the second vaccination, it was contributed by isotypes IgG1 and IgG2. Finally, 41 public BCR clusters were identified in the vaccine group, with both IGHV3-7 and IGHV1-69 involved, and representative complementarity determining region 3 (CDR3) motifs were characterized. This study provides insights into the immune response dynamics following repeated influenza vaccination in humans and can inform universal vaccine design and vaccine strategies in the future.


Subject(s)
Immunoglobulin Heavy Chains , Influenza, Human , Humans , Immunoglobulin Heavy Chains/genetics , Influenza, Human/prevention & control , Influenza, Human/genetics , Complementarity Determining Regions/genetics , Multigene Family , Vaccination
9.
Emerg Microbes Infect ; 11(1): 2007-2020, 2022 Dec.
Article in English | MEDLINE | ID: mdl-35899581

ABSTRACT

Dynamic changes of the paired heavy and light chain B cell receptor (BCR) repertoire provide an essential insight into understanding the humoral immune response post-SARS-CoV-2 infection and vaccination. However, differences between the endogenous paired BCR repertoire kinetics in SARS-CoV-2 infection and previously recovered/naïve subjects treated with the inactivated vaccine remain largely unknown. We performed single-cell V(D)J sequencing of B cells from six healthy donors with three shots of inactivated SARS-CoV-2 vaccine (BBIBP-CorV), five people who received the BBIBP-CorV vaccine after having recovered from COVID-19, and five unvaccinated COVID-19 recovered patients, and then integrated the results with public data of B cells from four SARS-CoV-2-infected subjects. We discovered that BCR variable (V) genes were more prominently used in the SARS-CoV-2 exposed groups (both in the group with active infection and in the group that had recovered) than in the vaccinated groups. The VH gene that expanded the most after SARS-CoV-2 infection was IGHV3-33, whereas it was IGHV3-23 in the vaccinated groups. The SARS-CoV-2-infected group showed greater BCR clonal expansion and somatic hypermutation than the vaccinated healthy group. A small proportion of public clonotypes were shared between the SARS-CoV-2 infected, vaccinated healthy, and recovered groups. Moreover, several public antibodies against the SARS-CoV-2 spike protein were identified. We comprehensively characterize the paired heavy and light chain BCR repertoire from SARS-CoV-2 infection to vaccination, providing further guidance for the development of the next-generation precision vaccine.


Subject(s)
COVID-19 , Viral Vaccines , Antibodies, Viral , COVID-19/prevention & control , COVID-19 Vaccines , Humans , Receptors, Antigen, B-Cell/genetics , SARS-CoV-2/genetics , Spike Glycoprotein, Coronavirus , Vaccination
10.
Biomed Res Int ; 2019: 4824909, 2019.
Article in English | MEDLINE | ID: mdl-31321235

ABSTRACT

Recent studies have shown that microorganisms may be associated with the onset and development of bladder cancer. The purpose of this study is to identify the common core bacteria associated with bladder cancer. We characterized the urinary microbial profile of the individuals with bladder cancer by 16S rRNA gene sequencing, and the results of 24 bladder cancer samples collected in our laboratory reveal 31 common core bacteria at genera level. In addition, the abundance of four common core bacteria is significantly higher in bladder cancer samples than in samples from nondiseased people analyzed by LEfSe, based on two previous datasets. In particular, the abundance of Acinetobacter is much higher in bladder cancer samples. It has been reported that Acinetobacter is involved not only in biofilm formation but also in the adhesion and invasion of epithelial cells, the spread of bacteria caused by the degradation of phospholipids in the mucosal barrier, and the escape of the host immune response. Thus, Acinetobacter may be related to bladder cancer and is a potential microbial marker of bladder cancer. However, due to the limited number of participants, further studies are needed to better understand the role of microorganisms in bladder cancer to provide novel biomarkers for diagnosis, prognosis, and therapy.


Subject(s)
Acinetobacter/isolation & purification , Bacteria/genetics , Biomarkers, Tumor/urine , Urinary Bladder Neoplasms/urine , Acinetobacter/genetics , Adult , Aged , Aged, 80 and over , Bacteria/classification , Bacteria/isolation & purification , Epithelial Cells/microbiology , Female , High-Throughput Nucleotide Sequencing , Humans , Male , Microbiota/genetics , Middle Aged , Neoplasm Invasiveness/genetics , Neoplasm Invasiveness/pathology , Phylogeny , RNA, Ribosomal, 16S/genetics , Urinary Bladder Neoplasms/microbiology , Urinary Bladder Neoplasms/pathology , Urinary Tract/microbiology , Urinary Tract/pathology
11.
Front Microbiol ; 10: 618, 2019.
Article in English | MEDLINE | ID: mdl-30984144

ABSTRACT

BACKGROUND: Cellulose is the most abundant organic polymer in nature and is mainly produced by plants. It is insoluble and highly resistant to enzymatic hydrolysis. Cellulolytic microorganisms that are capable of producing a battery of related enzymes play an important role in recycling cellulose-rich plant biomass. Effective cellulose degradation by multiple synergic microorganisms has been observed within a defined microbial consortium in lab culture. Metagenomic analysis may enable us to understand how microbes cooperate in cellulose degradation in a more complex free-living microbial ecosystem in nature. RESULTS: Here we investigated a typical cellulose-rich and alkaline niche where constituent microbes survive through inter-genera cooperation in cellulose utilization. The niche has been generated in an ancient paper-making plant, which has served as an isolated habitat for over 7 centuries. Combining amplicon-based sequencing of 16S rRNA genes and metagenomic sequencing, our analyses showed a microbial composition with 6 dominant genera including Cloacibacterium, Paludibacter, Exiguobacterium, Acetivibrio, Tolumonas, and Clostridium in this cellulose-rich niche; the composition is distinct from other cellulose-rich niches including a modern paper mill, bamboo soil, wild giant panda guts, and termite hindguts. In total, 11,676 genes of 96 glycoside hydrolase (GH) families, as well as 1,744 genes of carbohydrate transporters were identified, and modeling analysis of two representative genes suggested that these glycoside hydrolases likely evolved to adapt to alkaline environments. Further reconstruction of the microbial draft genomes by binning the assembled contigs predicted a mutualistic interaction between the dominant microbes regarding the cellulolytic process in the niche, with Paludibacter and Clostridium acting as helpers that produce endoglucanases, and Cloacibacterium, Exiguobacterium, Acetivibrio, and Tolumonas being beneficiaries that cross-feed on the cellodextrins by oligosaccharide uptake. CONCLUSION: The analysis of the key genes involved in cellulose degradation and reconstruction of the microbial draft genomes by binning the assembled contigs predicted a mutualistic interaction based on public goods regarding the cellulolytic process in the niche, suggesting that in the studied microbial consortium, free-living bacteria likely survive on each other by acquisition and exchange of metabolites. Knowledge gained from this study will facilitate the design of complex microbial communities with a better performance in industrial bioprocesses.

12.
Sci Rep ; 9(1): 734, 2019 01 24.
Article in English | MEDLINE | ID: mdl-30679786

ABSTRACT

Increasing evidence has revealed a close interaction between the intestinal microbes and host growth performance. The shrimp (Litopenaeus vannamei) gut harbors a diverse microbial community, yet its associations with diet, body weight and weaning age remain a matter of debate. In this study, we analyzed the effects of different diets (fishmeal group (NC), krill meal group (KM)) and different growth stages (from 42 to 98 days of age) of the shrimp on the intestinal microbiota. High-throughput sequencing of the 16S rRNA genes of shrimp intestinal microbes determined the novelty of bacteria in the shrimp gut microbiota, and a core of 58 Operational Taxonomic Units (OTUs) was present among the shrimp gut samples. The analysis indicated that the development of the shrimp gut microbiota is a dynamic process with three age-related stages, defined by the gut microbiota compositions. Furthermore, the diet of the KM group did not significantly change the intestinal microbiota of the shrimp compared with the NC group. Intriguingly, compared to the NC group, we observed in the KM group that a fluctuation of the shrimp gut microbiota coincided with the shrimp body weight gain between weeks 6-7. Six OTUs associated with the microbiota change in the KM group were identified. This finding strongly suggests that the shrimp gut microbiota may be correlated with the shrimp body weight, likely by influencing nutrient uptake in the gut. The results obtained from this study may serve as guidelines for manipulating the gut microbiota and provide novel shrimp feed management approaches.


Subject(s)
Bacteria/genetics , Gastrointestinal Microbiome/genetics , Penaeidae/microbiology , Animal Feed/microbiology , Animals , Aquaculture , Bacteria/classification , Body Weight , Humans , Penaeidae/genetics , RNA, Ribosomal, 16S/genetics
13.
Microbiome ; 6(1): 24, 2018 02 01.
Article in English | MEDLINE | ID: mdl-29391057

ABSTRACT

BACKGROUND: Substantial efforts have been made to link the gut bacterial community to many complex human diseases. Nevertheless, the gut phages are often neglected. RESULTS: In this study, we used multiple bioinformatic methods to catalog gut phages from whole-community metagenomic sequencing data of fecal samples collected from both type II diabetes (T2D) patients (n = 71) and normal Chinese adults (n = 74). The definition of phage operational taxonomic units (pOTUs) and identification of large phage scaffolds (n = 2567, ≥ 10 k) revealed a comprehensive human gut phageome with a substantial number of novel sequences encoding genes that were unrelated to those in known phages. Interestingly, we observed a significant increase in the number of gut phages in the T2D group and, in particular, identified 7 pOTUs specific to T2D. This finding was further validated in an independent dataset of 116 T2D and 109 control samples. Co-occurrence/exclusion analysis of the bacterial genera and pOTUs identified a complex core interaction between bacteria and phages in the human gut ecosystem, suggesting that the significant alterations of the gut phageome cannot be explained simply by co-variation with the altered bacterial hosts. CONCLUSIONS: Alterations in the gut bacterial community have been linked to the chronic disease T2D, but the role of gut phages therein is not well understood. This is the first study to identify a T2D-specific gut phageome, indicating the existence of other mechanisms that might govern the gut phageome in T2D patients. These findings suggest the importance of the phageome in T2D risk, which warrants further investigation.


Subject(s)
Bacteria/virology , Bacteriophages/classification , Diabetes Mellitus, Type 2/microbiology , Gastrointestinal Tract/microbiology , Bacteriophages/genetics , Bacteriophages/isolation & purification , Case-Control Studies , China , Computational Biology , Feces/microbiology , Humans , Phylogeny
14.
Front Microbiol ; 9: 1476, 2018.
Article in English | MEDLINE | ID: mdl-30034378

ABSTRACT

As an alternative approach against multidrug-resistant bacterial infections, phages are now being increasingly investigated as effective therapeutic agents. Here, aiming to design an efficient phage cocktail against Aeromonas salmonicida infections, we isolated and characterized five lytic A. salmonicida phages, AS-szw, AS-yj, AS-zj, AS-sw, and AS-gz. The results of morphological and genomic analysis suggested that all these phages are affiliated with the T4virus genus of the Caudovirales order. Their heterogeneous lytic capacities against A. salmonicida strains were demonstrated by experiments. A series of phage cocktails were prepared and investigated in vitro. We observed that the cocktail combining AS-gz and AS-yj showed significantly higher antimicrobial activity than other cocktails and individual phages. Given the divergent genomes of the phages AS-yj and AS-gz, our results highlight that the heterogeneous mechanisms that phages use to infect their hosts likely lead to phage synergy in killing the host. In conclusion, our study described a strategy to develop an effective and promising phage cocktail as a therapeutic agent to combat A. salmonicida infections, and thereby to control the outbreak of relevant fish diseases. Our study suggests that in vitro investigations into phages are a prerequisite for obtaining satisfactory phage cocktails prior to application in practice.

15.
J Integr Bioinform ; 14(3)2017 Aug 10.
Article in English | MEDLINE | ID: mdl-28796642

ABSTRACT

Background: A miniature inverted repeat transposable element (MITE) is a short transposable element carrying no protein-coding regions. However, its high proliferation rate and sequence-specific insertion preference render it a good genetic tool for both natural evolution and experimental insertion mutagenesis. Recently active MITE copies are those with clear signals of Terminal Inverted Repeats (TIRs) and Direct Repeats (DRs) that have recently translocated into their current sites. Their proliferation ability renders them good candidates for the investigation of genomic evolution. Results: This study optimizes the C++ code and running pipeline of the MITE Uncovering SysTem (MUST) so that no prior knowledge of MITEs is required from the users. The current version, MUSTv2, shows significantly increased detection accuracy for recently active MITEs compared with similar programs. The running speed is also significantly increased compared with MUSTv1. We prepared a benchmark dataset, a simulated genome with 150 MITE copies, for researchers who may be interested. Conclusions: MUSTv2 is an accurate detection program for recently active MITE copies, which is complementary to the existing template-based MITE mapping programs. We believe that the release of MUSTv2 will greatly facilitate genome annotation and structural analysis for researchers working with bioOMIC big data.


Subject(s)
DNA Transposable Elements/genetics , Inverted Repeat Sequences/genetics , Software , Genomics/methods , Molecular Sequence Annotation
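
The two structural signals named in the abstract above, Terminal Inverted Repeats and flanking Direct Repeats, can be checked with a few lines of code. This is a minimal sketch of those two tests only, not the MUSTv2 pipeline; the repeat lengths, the mismatch tolerance, the function names, and the example sequence are all assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase ACGT sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def hamming(a, b):
    """Number of mismatching positions between equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def has_tir(element, tir_len=10, max_mismatch=1):
    """True if the element's first tir_len bases match the reverse
    complement of its last tir_len bases within max_mismatch."""
    if len(element) < 2 * tir_len:
        return False
    left, right = element[:tir_len], element[-tir_len:]
    return hamming(left, reverse_complement(right)) <= max_mismatch


def has_direct_repeat(genome, start, end, dr_len=2):
    """True if the dr_len bases just before genome[start:end] equal
    the dr_len bases just after it."""
    if start - dr_len < 0 or end + dr_len > len(genome):
        return False
    return genome[start - dr_len:start] == genome[end:end + dr_len]


# Hypothetical element: a 10-bp TIR pair around an 8-bp core,
# inserted at a duplicated "TA" target site.
tir = "GGCATTACGA"
element = tir + "TTTTTTTT" + reverse_complement(tir)
genome = "CCCTA" + element + "TAGGG"
start, end = 5, 5 + len(element)
print(has_tir(element), has_direct_repeat(genome, start, end))
```

A real detector would also have to search for candidate boundaries across the genome; here the coordinates are given, so only the verification step is shown.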
16.
Gene ; 602: 1-7, 2017 Feb 20.
Article in English | MEDLINE | ID: mdl-27845204

ABSTRACT

BACKGROUND: Similar to regular enzymatic glycosylation, glycation also attaches a sugar molecule to a peptide, but does not need the help of an enzyme. Glycation may occur both inside and outside the host body, and will compete with the glycosylation procedure for functional regulation of mature protein products. The glycated residues do not show significant patterns, which makes both in silico sequence-level prediction and wet-lab validation a major challenge. This study hypothesizes that a better feature set formulated from the glycated flanking peptides may lead to a good glycation prediction program. RESULTS: We explored the application of sequence order information and position specific amino acid propensity (PSAAP) in the glycation residue prediction problem. The PSAAP demonstrated its ability to discriminate the glycated residues from the background control peptides. A Support Vector Machine (SVM) model was constructed from the training dataset and achieved an overall accuracy of 68.91%. The model also achieved 0.7258 and 0.3198 in the area under the ROC curve and the Matthews correlation coefficient, respectively. The user-friendly online version of the proposed algorithm may be found on the web server Gly-PseAAC at http://app.aporc.org/Gly-PseAAC/. CONCLUSION: The feature set PSAAP was calculated and led to a useful classification of glycation residues.


Subject(s)
Glycopeptides/chemistry , Glycopeptides/metabolism , Lysine/metabolism , Algorithms , Amino Acid Sequence , Computer Simulation , Databases, Protein , Glycosylation , Protein Processing, Post-Translational , Support Vector Machine
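
Accuracy and the Matthews correlation coefficient, both reported in the abstract above, are functions of the four confusion-matrix counts. The sketch below shows the standard formulas; the counts are hypothetical and are not taken from the study, whose confusion matrix is not given in the abstract.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient, in [-1, 1].

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for a glycated vs. non-glycated lysine classifier.
tp, tn, fp, fn = 60, 75, 25, 40
print(accuracy(tp, tn, fp, fn), matthews_corrcoef(tp, tn, fp, fn))
```

Unlike accuracy, the coefficient stays near zero for a classifier that merely predicts the majority class, which is why the two numbers are usually reported together.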
17.
Biomed Res Int ; 2016: 7237053, 2016.
Article in English | MEDLINE | ID: mdl-27195295

ABSTRACT

Motivation. Clustered regularly interspaced short palindromic repeat (CRISPR) is a genetic element with active regulation roles for foreign invasive genes in the prokaryotic genomes and has been engineered to work with the CRISPR-associated sequence (Cas) gene Cas9 as one of the modern genome editing technologies. Due to inconsistent definitions, the existing CRISPR detection programs seem to have missed some weak CRISPR signals. Results. This study manually curates all the currently annotated CRISPR elements in the prokaryotic genomes and proposes 95 updates to the annotations. A new definition is proposed to cover all the CRISPRs. The comprehensive comparison of CRISPR numbers on the taxonomic levels of both domains and genus shows high variations for closely related species even in the same genus. The detailed investigation of how CRISPRs are evolutionarily manipulated in the 8 completely sequenced species in the genus Thermoanaerobacter demonstrates that transposons act as a frequent tool for splitting long CRISPRs into shorter ones along a long evolutionary history.


Subject(s)
CRISPR-Cas Systems/genetics , Data Curation/methods , Evolution, Molecular , Prokaryotic Cells/metabolism , DNA, Intergenic/genetics , Databases, Nucleic Acid , Genome, Bacterial , Repetitive Sequences, Nucleic Acid/genetics
18.
Comput Biol Med ; 77: 16-22, 2016 10 01.
Article in English | MEDLINE | ID: mdl-27494091

ABSTRACT

Different therapeutic methods have been developed for the B-cell and T-cell subtypes of acute lymphoblastic leukemia (ALL). The identification of molecular biomarkers that can accurately discriminate between B-cell and T-cell ALLs will facilitate the quick determination of therapeutic plans, as well as reveal the intrinsic mechanisms underlying the two ALL subtypes. This study computationally screened a high-throughput transcriptome dataset for multiple candidate biomarkers and verified their discrimination abilities in an independent sample set using quantitative real-time polymerase chain reaction (PCR) technology. Both technologies suggest that the two genes CD3D and PRKCQ together provide a good model for classifying B-cell and T-cell ALLs, whereas the individual genes did not show consistent discrimination between the two ALL subtypes. Supplementary material is available at http://healthinformaticslab.org/supp/.


Subject(s)
Biomarkers, Tumor/genetics , CD3 Complex/genetics , Precursor B-Cell Lymphoblastic Leukemia-Lymphoma , Precursor T-Cell Lymphoblastic Leukemia-Lymphoma , Protein Kinase C-theta/genetics , Diagnosis, Differential , Gene Expression Profiling , Humans , Precursor B-Cell Lymphoblastic Leukemia-Lymphoma/diagnosis , Precursor B-Cell Lymphoblastic Leukemia-Lymphoma/genetics , Precursor T-Cell Lymphoblastic Leukemia-Lymphoma/diagnosis , Precursor T-Cell Lymphoblastic Leukemia-Lymphoma/genetics , Real-Time Polymerase Chain Reaction , Transcriptome/genetics
19.
Sci Rep ; 6: 32942, 2016 09 06.
Article in English | MEDLINE | ID: mdl-27596864

ABSTRACT

Clustered regularly interspaced short palindromic repeats (CRISPRs) are important genetic elements in many bacterial and archaeal genomes, and play a key role in the prokaryotic immune defense against invasive foreign elements. The CRISPR system has also been engineered to facilitate target gene editing in eukaryotic genomes. Using the common features of mis-annotated CRISPRs in prokaryotic genomes, this study proposed an accurate de novo CRISPR annotation program, CRISPRdigger, which can take a partially assembled genome as its input. A comprehensive comparison with three existing programs demonstrated that CRISPRdigger can recover more Direct Repeats (DRs) for CRISPRs and achieve a higher accuracy for a query genome. The program was implemented in Perl and all the parameters have default values, so that a user can annotate CRISPRs in a query genome by supplying only a genome sequence in the FASTA format. All the supplementary data are available at http://www.healthinformaticslab.org/supp/.


Subject(s)
CRISPR-Cas Systems , Clostridium/genetics , Clustered Regularly Interspaced Short Palindromic Repeats , Methanocaldococcus/genetics , Chromosome Mapping , Databases, Nucleic Acid , Genome, Archaeal , Genome, Bacterial , Molecular Sequence Annotation , Software
20.
Interdiscip Sci ; 7(2): 194-9, 2015 Jun.
Article in English | MEDLINE | ID: mdl-26245277

ABSTRACT

Protein posttranslational modification (PTM) represents a major dynamic regulation of protein functions after the translation of polypeptide chains from mRNA molecules. Compared with the costly and labor-intensive wet-laboratory characterization of PTMs, the computer-based detection of PTM residues has been a major complementary technique in recent years. Previous studies demonstrated that the PTM-flanking positions contribute differently to the computational detection of PTM residues, but did not directly translate this observation into in silico PTM prediction. We propose a weight vector to represent the varying contributions of the PTM-flanking positions and use an evolutionary algorithm to optimize the vector. Even a simple nearest neighbor algorithm that incorporates the optimal weight vector outperforms the currently available algorithms. The algorithm is implemented as an easy-to-use computer program, jEcho version 1.0. The implementation language, Java, makes jEcho platform-independent and visually interactive. The predicted results may be directly exported as publication-quality images or text files. jEcho may be downloaded from http://www.healthinformaticslab.org/supp/.
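
The idea of weighting PTM-flanking positions inside a nearest neighbor classifier can be sketched as follows. This is only an illustration under stated assumptions, not the jEcho implementation: the weighted Hamming distance, the function names, the 7-residue windows, and both weight vectors are hypothetical, and the evolutionary optimization of the weights is not shown.

```python
def weighted_distance(a, b, weights):
    """Weighted Hamming distance: sum the weights of mismatching positions."""
    if not (len(a) == len(b) == len(weights)):
        raise ValueError("peptides and weight vector must have equal length")
    return sum(w for x, y, w in zip(a, b, weights) if x != y)


def predict_nearest(query, training, weights):
    """Return the label of the training peptide closest to the query."""
    _, label = min(
        training,
        key=lambda item: weighted_distance(query, item[0], weights),
    )
    return label


# Hypothetical 7-residue windows centered on the candidate residue (index 3).
# Label 1 marks a modified site, label 0 an unmodified one.
training = [("AAKSPEE", 1), ("GGLSAVV", 0)]

uniform = [1.0] * 7                                   # every position counts equally
near_site = [0.1, 0.5, 2.0, 0.0, 2.0, 0.5, 0.1]       # hypothetical, hand-picked weights

query = "GGKSPVV"
print(predict_nearest(query, training, uniform))
print(predict_nearest(query, training, near_site))
```

With uniform weights the query is closer to the second training peptide, while the weights that emphasize positions next to the central residue make the first peptide the nearest neighbor, which is the effect a position weight vector is meant to capture.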


Subject(s)
Amino Acid Motifs , Data Mining/methods , Protein Processing, Post-Translational , Support Vector Machine , Databases, Protein , Phosphorylation , Software Design