Results 1 - 7 of 7
1.
BMC Bioinformatics ; 19(1): 475, 2018 Dec 12.
Article in English | MEDLINE | ID: mdl-30541438

ABSTRACT

BACKGROUND: Sequence similarity networks are useful for classifying and characterizing biologically important proteins. Threshold-based approaches to similarity network construction using exact distance measures are prohibitively slow to compute and rely on the difficult task of selecting an appropriate threshold, while similarity networks based on approximate distance calculations compromise useful structural information. RESULTS: We present an alternative network representation for a set of sequence data that overcomes these drawbacks. In our model, called the Directed Weighted All Nearest Neighbors (DiWANN) network, each sequence is represented by a node and is connected via a directed edge to only the closest sequence, or sequences in the case of ties, in the dataset. Our contributions span several aspects. Specifically, we: (i) Apply an all nearest neighbors network model to protein sequence data from three different applications and examine the structural properties of the networks; (ii) Compare the model against threshold-based networks to validate their semantic equivalence, and demonstrate the relative advantages the model offers; (iii) Demonstrate the model's resilience to missing sequences; and (iv) Develop an efficient algorithm for constructing a DiWANN network from a set of sequences. We find that the DiWANN network representation attains similar semantic properties to threshold-based graphs, while avoiding weaknesses of both high and low threshold graphs. Additionally, we find that approximate distance networks, using BLAST bitscores in place of exact edit distances, can cause significant loss of structural information. We show that the proposed DiWANN network construction algorithm provides a fourfold speedup over a standard threshold-based approach to network construction. We also identify a relationship between the centrality of a sequence in a similarity network of an Anaplasma marginale short sequence repeat dataset and how broadly that sequence is dispersed geographically. CONCLUSION: We demonstrate that using approximate distance measures to rapidly construct similarity networks may lead to significant deficiencies in the structure of that network in terms of centrality and clustering analyses. We present a new network representation that maintains the structural semantics of threshold-based networks while increasing connectedness, and an algorithm for constructing the network using exact distance measures in a fraction of the time it would take to build a threshold-based equivalent.
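
The network model described in this abstract (each sequence linked by a directed, weighted edge to its closest sequence, with ties kept) can be illustrated with a minimal Python sketch. This is a brute-force version that compares every pair of sequences; it is not the efficient construction algorithm the authors report, and the function names `edit_distance` and `nearest_neighbor_edges` are hypothetical names chosen for this example.

```python
from typing import Dict, List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Exact Levenshtein distance using the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def nearest_neighbor_edges(seqs: List[str]) -> Dict[int, List[Tuple[int, int]]]:
    """Map each sequence index to its outgoing edges [(neighbor index, distance)].

    Each node points only to its closest sequence(s); ties are all kept.
    Brute force over all pairs (assumption: acceptable for small inputs).
    """
    edges: Dict[int, List[Tuple[int, int]]] = {}
    for i, s in enumerate(seqs):
        dists = [(edit_distance(s, t), j) for j, t in enumerate(seqs) if j != i]
        if not dists:  # a single sequence has no neighbors
            edges[i] = []
            continue
        best = min(d for d, _ in dists)
        edges[i] = [(j, d) for d, j in dists if d == best]
    return edges
```

Because edges are directed, a sequence can be the nearest neighbor of many others without pointing back at them, which is why the resulting graph differs from an undirected threshold-based network built from the same distances.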


Subjects
Amino Acid Sequence/genetics; Proteins/chemistry; Cluster Analysis; Genotype; Network Meta-Analysis
2.
BMC Genomics ; 17: 422, 2016 06 03.
Article in English | MEDLINE | ID: mdl-27260942

ABSTRACT

BACKGROUND: Short-sequence repeats (SSRs) occur in both prokaryotic and eukaryotic DNA, inter- and intragenically, and may be exact or inexact copies. When heterogeneous SSRs are present in a given locus, we can take advantage of the pattern of different repeats to genotype strains based on the SSRs. Cataloguing and tracking these repeats can be difficult as diverse groups of researchers are involved in the identification of the repeats. Additionally, the task is error-prone when done manually. RESULTS: We developed RepeatAnalyzer, a new software tool capable of tracking, managing, analysing and cataloguing SSRs and genotypes using Anaplasma marginale as a model species. RepeatAnalyzer's analysis capability includes novel metrics for measuring regional genetic diversity (corresponding to variety and regularity of SSR occurrence). As a part of its visualization capabilities, RepeatAnalyzer produces high quality maps of the geographic distribution of genotypes or SSRs over a region of interest. RepeatAnalyzer's repeat identification functionality was validated for all SSRs and genotypes reported in 21 publications, using 380 A. marginale isolates gathered from the five publications within that list that provided access to their isolates. The tool produced accurate genotyping results in every case. In addition, it uncovered a number of errors in the published literature: 11 cases where SSRs were misreported, 5 cases where two different SSRs had been given the same name, and 16 cases where two or more names had been given to a single SSR. The analysis and visualization functionalities of the tool are demonstrated using several examples. CONCLUSIONS: RepeatAnalyzer is a robust software tool that can be used for storing, managing, and analysing short-sequence repeats for the purpose of strain identification. The tool can be used for any set of SSRs regardless of species. When applied to A. marginale, our test case, we show that genotype lengths for a given region follow a normal distribution, while SSR frequencies follow a power-law-like distribution. Further, we find that over 90% of repeats are 28 to 29 amino acids long, which is in agreement with conventional wisdom. Lastly, our analysis reveals that the most common edit distance is five or six, which is counter-intuitive since we expected that result to be closer to one, resulting from the simplest change from one repeat to another.
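
The two kinds of naming error mentioned in this abstract (one name used for different SSRs, and several names used for one SSR) can be detected with a simple cross-index. The Python sketch below is only an illustration of that idea; it is not RepeatAnalyzer's code, and `find_naming_conflicts` and its record format are assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def find_naming_conflicts(
    records: Iterable[Tuple[str, str]],
) -> Tuple[Dict[str, Set[str]], Dict[str, Set[str]]]:
    """Find inconsistent repeat names in (name, sequence) records.

    Returns two dictionaries:
      shared_names: name -> sequences, for names attached to more than one sequence
      synonyms:     sequence -> names, for sequences reported under more than one name
    """
    seqs_by_name: Dict[str, Set[str]] = defaultdict(set)
    names_by_seq: Dict[str, Set[str]] = defaultdict(set)
    for name, seq in records:
        seqs_by_name[name].add(seq)
        names_by_seq[seq].add(name)
    shared_names = {n: s for n, s in seqs_by_name.items() if len(s) > 1}
    synonyms = {s: n for s, n in names_by_seq.items() if len(n) > 1}
    return shared_names, synonyms
```

The sketch compares sequences by exact string equality, so records that differ only in case or whitespace would need to be normalized first.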


Subjects
Computational Biology/methods; Genomics/methods; Microsatellite Repeats; Software; Anaplasma marginale/genetics; Genetic Variation; Genotype; Reproducibility of Results; Streptococcus pneumoniae/genetics
3.
J Clin Microbiol ; 54(10): 2503-12, 2016 10.
Article in English | MEDLINE | ID: mdl-27440819

ABSTRACT

Bovine anaplasmosis caused by the intraerythrocytic rickettsial pathogen Anaplasma marginale is endemic in South Africa. Anaplasma marginale subspecies centrale also infects cattle; however, it causes a milder form of anaplasmosis and is used as a live vaccine against A. marginale. There has been less interest in the epidemiology of A. marginale subsp. centrale, and, as a result, there are few reports detecting natural infections of this organism. When detected in cattle, it is often assumed that it is due to vaccination, and in most cases, it is reported as coinfection with A. marginale without characterization of the strain. A total of 380 blood samples from wild ruminant species and cattle collected from biobanks, national parks, and other regions of South Africa were used in duplex real-time PCR assays to simultaneously detect A. marginale and A. marginale subsp. centrale. PCR results indicated high occurrence of A. marginale subsp. centrale infections, ranging from 25 to 100% in national parks. Samples positive for A. marginale subsp. centrale were further characterized using the msp1aS gene, a homolog of msp1α of A. marginale, which contains repeats at the 5' ends that are useful for genotyping strains. A total of 47 Msp1aS repeats were identified, which corresponded to 32 A. marginale subsp. centrale genotypes detected in cattle, buffalo, and wildebeest. RepeatAnalyzer was used to examine strain diversity. Our results demonstrate a diversity of A. marginale subsp. centrale strains from cattle and wildlife hosts from South Africa and indicate the utility of msp1aS as a genotypic marker for A. marginale subsp. centrale strain diversity.


Subjects
Anaplasma marginale/classification; Anaplasma marginale/isolation & purification; Anaplasmosis/epidemiology; Anaplasmosis/microbiology; Animals, Wild; Genetic Variation; Genotyping Techniques/methods; Africa; Anaplasma marginale/genetics; Animals; Cattle; Cattle Diseases/epidemiology; Cattle Diseases/microbiology; Genes, Bacterial; Multiplex Polymerase Chain Reaction; Prevalence; Real-Time Polymerase Chain Reaction; South Africa/epidemiology
4.
Viruses ; 14(8)2022 07 29.
Article in English | MEDLINE | ID: mdl-36016294

ABSTRACT

Severe acute respiratory syndrome-related coronavirus (SARS-CoV-2), which still infects hundreds of thousands of people globally each day despite various countermeasures, has been mutating rapidly. Mutations in the spike (S) protein seem to play a vital role in viral stability, transmission, and adaptability. Therefore, to control the spread of the virus, it is important to gain insight into the evolution and transmission of the S protein. This study deals with the temporal and geographical distribution of mutant S proteins from sequences gathered across the US over a period of 19 months in 2020 and 2021. The S protein sequences are studied using two approaches: (i) multiple sequence alignment is used to identify prominent mutations and highly mutable regions and (ii) sequence similarity networks are subsequently employed to gain further insight and study mutation profiles of concerning variants across the defined time periods and states. Additionally, we tracked the variants using visualizations on geographical maps. The visualizations produced using the Directed Weighted All Nearest Neighbors (DiWANN) networks and maps provided insights into the transmission of the virus that reflect well the statistics reported for the time periods studied. We found that the networks created using DiWANN are superior to commonly used approximate distance networks created using BLAST bitscores. The study offers a richer computational approach to analyze the transmission profile of the prominent S protein mutations in SARS-CoV-2 and can be extended to other proteins and viruses.
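
The first approach in this abstract, identifying prominent mutations from aligned sequences, can be sketched as a per-column comparison against a reference. The Python example below assumes the sequences have already been aligned to equal length by a separate alignment tool; the function name `mutation_counts` and the choice to skip gap characters are assumptions for illustration, not details taken from the study.

```python
from collections import Counter
from typing import Dict, List


def mutation_counts(reference: str, aligned: List[str]) -> Dict[str, int]:
    """Count substitutions relative to `reference` in pre-aligned sequences.

    Labels use the form <reference residue><1-based alignment column><new residue>.
    Columns where either sequence has a gap ('-') are skipped, not counted.
    """
    counts: Counter = Counter()
    for seq in aligned:
        if len(seq) != len(reference):
            raise ValueError("all sequences must have the alignment length")
        for pos, (r, s) in enumerate(zip(reference, seq), start=1):
            if r != s and r != "-" and s != "-":
                counts[f"{r}{pos}{s}"] += 1
    return dict(counts)
```

Positions here are alignment columns, so they match reference coordinates only when the aligned reference contains no gaps. Sorting the resulting counts in descending order would surface the most frequent substitutions in a given time period or state.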


Subjects
COVID-19; Spike Glycoprotein, Coronavirus; Humans; Mutation; SARS-CoV-2/genetics; Spike Glycoprotein, Coronavirus/genetics; Spike Glycoprotein, Coronavirus/metabolism
5.
Pathogens ; 9(8)2020 Jul 28.
Article in English | MEDLINE | ID: mdl-32731487

ABSTRACT

Ticks and the vast array of pathogens they transmit, including bacteria, viruses, protozoa, and helminths, constitute a growing burden for human and animal health worldwide. In Cuba, the major tropical island in the Caribbean, ticks are an important cause of vector-borne diseases affecting livestock production, pet animal health and, to a lesser extent, human health. Most tick species in the country belong to the Argasidae family; less well known is the presence on the island of an autochthonous tick species, Ixodes capromydis. Herein, we provide a comprehensive review of the ticks and tick-borne pathogens (TBPs) affecting animal and human health in Cuba. The review covers research results including the ecophysiology of ticks, the epidemiology of TBPs, and the diagnostic tools currently used in the country for the surveillance of TBPs. We also introduce the programs implemented in the country for tick control and the biotechnology research applied to the development of anti-tick vaccines.

6.
J Geriatr Oncol ; 10(1): 120-125, 2019 01.
Article in English | MEDLINE | ID: mdl-30017733

ABSTRACT

PURPOSE: Gait speed in older patients with cancer is associated with mortality risk. One approach to assess gait speed is with the 'Timed Up and Go' (TUG) test. We utilized machine learning algorithms to automatically predict the results of the TUG test and their association with survival, using patient-generated responses. METHODS: A decision tree classifier was trained based on functional status data, obtained from preoperative geriatric assessment, and TUG test performance of older patients with cancer. The functional status data were used as input features to the decision tree, and the actual TUG data were used as ground truth labels. The decision tree was constructed to assign each patient to one of three categories: "TUG < 10 s", "TUG ≥ 10 s", and "uncertain." RESULTS: In total, 1901 patients (49% women) with a mean age of 80 years were assessed. The most commonly performed operations were urologic, colorectal, and head and neck. The machine learning algorithm identified three features (cane/walker use, ability to walk outside, and ability to perform housework) for predicting TUG results with the decision tree classifier. The overall accuracy, specificity, and sensitivity of the prediction were 78%, 90%, and 66%, respectively. Furthermore, survival rates in each predicted TUG category differed by approximately 1% from the survival rates obtained by categorizing the patients based on their actual TUG results. CONCLUSIONS: Machine learning algorithms can accurately predict the gait speed of older patients with cancer, based on their response to questions addressing other aspects of functional status.
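
The accuracy, specificity, and sensitivity figures quoted in this abstract follow the usual confusion-matrix definitions, which the short Python sketch below spells out. It assumes a plain two-class comparison; the abstract does not say which TUG category was treated as the positive class or how "uncertain" predictions entered the calculation, so those details are left out. The function name `binary_metrics` is hypothetical.

```python
from typing import Dict


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard two-class metrics from confusion-matrix counts.

    tp/fn: positive cases predicted correctly / missed
    tn/fp: negative cases predicted correctly / wrongly flagged
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,      # share of all predictions that are right
        "sensitivity": tp / (tp + fn),      # share of true positives that are found
        "specificity": tn / (tn + fp),      # share of true negatives that are found
    }
```

High specificity combined with lower sensitivity, as reported here, means the classifier rarely mislabels negatives but misses a larger share of positives.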


Subjects
Cancer Survivors/statistics & numerical data; Gait; Neoplasms/surgery; Aged; Aged, 80 and over; Algorithms; Decision Trees; Female; Humans; Machine Learning; Male; Neoplasms/mortality; Predictive Value of Tests; Recovery of Function; Survival Analysis
7.
Parasit Vectors ; 11(1): 5, 2018 01 03.
Article in English | MEDLINE | ID: mdl-29298712

ABSTRACT

BACKGROUND: Only a few studies have examined the presence of Anaplasma marginale and Anaplasma centrale in South Africa, and no studies have comprehensively examined these species across the whole country. To undertake this country-wide study we adapted a duplex quantitative real-time PCR (qPCR) assay for use in South Africa but found that one of the genes on which the assay was based was variable. Therefore, we sequenced a variety of field samples and tested the assay on the variants detected. We used the assay to screen 517 cattle samples sourced from all nine provinces of South Africa, and subsequently examined A. marginale positive samples for msp1α genotype to gauge strain diversity. RESULTS: Although the A. marginale msp1β gene is variable, the qPCR functions at an acceptable efficiency. The A. centrale groEL gene was not variable within the qPCR assay region. Of the cattle samples screened using the assay, 57% and 17% were found to be positive for A. marginale and A. centrale, respectively. Approximately 15% of the cattle were co-infected. Msp1α genotyping revealed 36 novel repeat sequences. Together with data from previous studies, we analysed the Msp1a repeats from South Africa where a total of 99 repeats have been described that can be attributed to 190 msp1α genotypes. While 22% of these repeats are also found in other countries, only two South African genotypes are also found in other countries; otherwise, the genotypes are unique to South Africa. CONCLUSIONS: Anaplasma marginale was prevalent in the Western Cape, KwaZulu-Natal and Mpumalanga and absent in the Northern Cape. Anaplasma centrale was prevalent in the Western Cape and KwaZulu-Natal and absent in the Northern Cape and Eastern Cape. None of the cattle in the study were known to be vaccinated with A. centrale, so finding positive cattle indicates that this organism appears to be naturally circulating in cattle. A diverse population of A. marginale strains is found in South Africa, with some msp1α genotypes widely distributed across the country, and others appearing only once in one province. This diversity should be taken into account in future vaccine development studies.


Subjects
Anaplasma centrale/classification; Anaplasma marginale/classification; Anaplasmosis/microbiology; Cattle Diseases/microbiology; Coinfection/veterinary; Genetic Variation; Genotype; Anaplasma centrale/genetics; Anaplasma centrale/isolation & purification; Anaplasma marginale/genetics; Anaplasma marginale/isolation & purification; Anaplasmosis/epidemiology; Animals; Bacterial Outer Membrane Proteins/genetics; Cattle; Cattle Diseases/epidemiology; Chaperonin 60/genetics; Coinfection/epidemiology; Coinfection/microbiology; Molecular Epidemiology; Multiplex Polymerase Chain Reaction; Prevalence; Real-Time Polymerase Chain Reaction; South Africa/epidemiology