1.
J Neurodev Disord ; 11(1): 21, 2019 09 13.
Article in English | MEDLINE | ID: mdl-31519145

ABSTRACT

BACKGROUND: Qualitatively atypical language development characterized by non-sequential skill acquisition within a developmental domain, which has been called developmental deviance or difference, is a common characteristic of autism spectrum disorder (ASD). We developed the Response Dispersion Index (RDI), a measure of this phenomenon based on intra-subtest scatter of item responses on standardized psychometric assessments, to assess the within-task variability among individuals with language impairment (LI) and/or ASD. METHODS: Standard clinical assessments of language were administered to 502 individuals from the New Jersey Language and Autism Genetics Study (NJLAGS) cohort. Participants were divided into four diagnostic groups: unaffected, ASD-only, LI-only, and ASD + LI. For each language measure, RDI was defined as the product of the total number of test items and the sum of the weights (based on item difficulty) of the test items missed. Group differences in RDI were assessed, and the relationship between RDI and ASD diagnosis among individuals with LI was investigated for each language assessment. RESULTS: Although standard scores were unable to distinguish the LI-only and ASD/ASD + LI groups, the ASD/ASD + LI groups had higher RDI scores than the LI-only group across all measures of expressive, pragmatic, and metalinguistic language. RDI was positively correlated with quantitative ASD traits across all subgroups and was an effective predictor of ASD diagnosis among individuals with LI. CONCLUSIONS: The RDI is an effective quantitative metric of developmental deviance/difference that correlates with ASD traits, supporting previous associations between ASD and non-sequential skill acquisition. The RDI can be adapted to other clinical measures to investigate the degree of difference that is not captured by standard performance summary scores.
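
The RDI definition in the METHODS sentence (number of test items multiplied by the summed weights of the missed items) can be sketched as below. The abstract does not say how the difficulty-based weights are derived, so the function simply takes them as input; the function name and the toy weights are hypothetical.

```python
def response_dispersion_index(passed, weights):
    """Sketch of the RDI as worded in the abstract.

    passed  -- list of booleans, True if the item was answered correctly
    weights -- per-item weights; ASSUMPTION: supplied by the caller, since
               the abstract only says they are based on item difficulty
    """
    if len(passed) != len(weights):
        raise ValueError("passed and weights must have the same length")
    # Sum the weights of the items that were missed.
    missed_weight = sum(w for ok, w in zip(passed, weights) if not ok)
    # RDI = total number of items * summed weight of missed items.
    return len(passed) * missed_weight


# Toy example: four items, the second and fourth missed.
rdi = response_dispersion_index([True, False, True, False], [4, 3, 2, 1])
# 4 items * (3 + 1) = 16
```

Under this reading, a participant who passes every item scores 0, and the score grows with both the number and the weight of the missed items.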


Subjects
Autism Spectrum Disorder/diagnosis, Language Development, Language Disorders/diagnosis, Language Tests, Psychometrics, Task Performance and Analysis, Adolescent, Adult, Autism Spectrum Disorder/complications, Cohort Studies, Female, Humans, Language Disorders/etiology, Male, Middle Aged, Pilot Projects, Retrospective Studies, Young Adult
2.
Article in English | MEDLINE | ID: mdl-24991213

ABSTRACT

Digital signal processing (DSP) techniques for biological sequence analysis continue to grow in popularity due to the inherent digital nature of these sequences. DSP methods have demonstrated early success in detecting coding regions in a gene. More recently, these methods have been used to establish DNA gene similarity. We present the inter-coefficient difference (ICD) transformation, a novel extension of the discrete Fourier transformation, which can be applied to any DNA sequence. The ICD method is a mathematical, alignment-free DNA comparison method that generates a genetic signature for any DNA sequence; the signature is then used to generate relative measures of similarity among DNA sequences. We demonstrate our method on a set of insulin genes obtained from an evolutionarily wide range of species, and on a set of avian influenza viral sequences, which represents a set of highly similar sequences. We compare phylogenetic trees generated using our technique against trees generated using traditional alignment techniques for similarity and demonstrate that the ICD method produces a highly accurate tree without requiring an alignment prior to establishing sequence similarity.
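
The abstract does not give the ICD formula, so the following is only an illustration of the general idea of a DFT-based, alignment-free signature, not the authors' transformation. Everything specific here is an assumption: the binary indicator encoding of each base, the use of length-normalized magnitudes, differencing adjacent coefficients, the `n_coeffs` parameter, and the Euclidean distance.

```python
import cmath
import math


def dft_magnitudes(signal):
    """Length-normalized magnitudes of the discrete Fourier transform."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(signal))) / n
        for k in range(n)
    ]


def signature(seq, n_coeffs=8):
    """Hypothetical signature: differences between adjacent DFT magnitudes.

    ASSUMPTION: one binary indicator sequence per base (A, C, G, T);
    the DC coefficient (k = 0) is skipped.
    """
    seq = seq.upper()
    if len(seq) < n_coeffs + 2:
        raise ValueError("sequence too short for the requested n_coeffs")
    sig = []
    for base in "ACGT":
        indicator = [1.0 if ch == base else 0.0 for ch in seq]
        mags = dft_magnitudes(indicator)[1:n_coeffs + 2]
        sig.extend(b - a for a, b in zip(mags, mags[1:]))
    return sig


def distance(sig_a, sig_b):
    """Euclidean distance between two signatures of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(sig_a, sig_b)))
```

Because each signature has a fixed length of `4 * n_coeffs`, sequences of different lengths can be compared without alignment; a pairwise distance matrix built this way could then be passed to any standard tree-building routine.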

3.
BMC Bioinformatics ; 14: 96, 2013 Mar 15.
Article in English | MEDLINE | ID: mdl-23496846

ABSTRACT

BACKGROUND: In protein sequence classification, identification of the sequence motifs or n-grams that can precisely discriminate between classes is a more interesting scientific question than the classification itself. A number of classification methods aim at accurate classification but fail to explain which sequence features actually contribute to the accuracy. We hypothesize that sequences in lower denominations (n-grams) can be used to explore the sequence landscape and to identify class-specific motifs that discriminate between classes during classification. Discriminative n-grams are short peptide sequences that are highly frequent in one class but are either minimally present or absent in other classes. In this study, we present a new substitution-based scoring function for identifying discriminative n-grams that are highly specific to a class. RESULTS: We present a scoring function based on discriminative n-grams that can effectively discriminate between classes. The scoring function initially harvests the entire set of 4- to 8-grams from the protein sequences of different classes in the dataset. Similar n-grams of the same size are combined to form new n-grams, where the similarity is defined by positive amino acid substitution scores in the BLOSUM62 matrix. Substitution resulted in a large increase in the number of discriminatory n-grams harvested. Due to the unbalanced nature of the dataset, the frequencies of the n-grams are normalized using a dampening factor, which gives more weight to n-grams that appear in fewer classes and vice versa. After the n-grams are normalized, the scoring function identifies discriminative 4- to 8-grams for each class that are frequent enough to be above a selection threshold. By mapping these discriminative n-grams back to the protein sequences, we obtained contiguous n-grams that represent short class-specific motifs in protein sequences. Our method fared well compared to an existing motif-finding method known as Wordspy. We have validated our enriched set of class-specific motifs against the functionally important motifs obtained from the NLSdb, Prosite and ELM databases. We demonstrate that this method is generic and can thus be widely applied to detect class-specific motifs in many protein sequence classification tasks. CONCLUSION: The proposed scoring function and methodology are able to identify class-specific motifs using discriminative n-grams derived from the protein sequences. The implementation of amino acid substitution scores for similarity detection, and the dampening factor to normalize the unbalanced datasets, have a significant effect on the performance of the scoring function. Our multipronged validation tests demonstrate that this method can detect class-specific motifs from a wide variety of protein sequence classes, with a potential application to detecting proteome-specific motifs of different organisms.
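
The harvesting, dampening, and thresholding steps described in RESULTS can be sketched roughly as follows. This is a simplified illustration, not the published scoring function: the BLOSUM62-based merging of similar n-grams is omitted, and the dampening factor is assumed here to be an IDF-style logarithmic weight, since the abstract does not state its form. The function names and the toy sequences are hypothetical.

```python
import math
from collections import Counter


def harvest_ngrams(sequences, n_min=4, n_max=8):
    """Count every contiguous n-gram (n_min..n_max) in a list of sequences."""
    counts = Counter()
    for seq in sequences:
        for n in range(n_min, n_max + 1):
            for i in range(len(seq) - n + 1):
                counts[seq[i:i + n]] += 1
    return counts


def discriminative_ngrams(class_to_seqs, threshold):
    """Return, per class, the n-grams whose dampened score meets the threshold."""
    per_class = {c: harvest_ngrams(seqs) for c, seqs in class_to_seqs.items()}
    n_classes = len(per_class)

    # Number of classes in which each n-gram occurs at least once.
    class_presence = Counter()
    for counts in per_class.values():
        class_presence.update(counts.keys())

    result = {}
    for c, counts in per_class.items():
        total = sum(counts.values()) or 1
        scored = {}
        for gram, k in counts.items():
            # ASSUMPTION: IDF-style dampening; n-grams seen in fewer
            # classes receive a larger weight.
            damp = math.log(1 + n_classes / class_presence[gram])
            score = (k / total) * damp
            if score >= threshold:
                scored[gram] = score
        result[c] = scored
    return result


toy = {"classA": ["MKKLLA", "MKKLLG"], "classB": ["GGSGGS"]}
motifs = discriminative_ngrams(toy, threshold=0.0)
```

Normalizing each count by the class total is one way to soften the effect of unbalanced class sizes; the selection threshold would need to be tuned on real data.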


Subjects
Amino Acid Motifs, Proteins/classification, Sequence Analysis, Protein/methods, Amino Acid Substitution, Data Mining, Enzymes/chemistry, Enzymes/classification
4.
BMC Res Notes ; 5: 351, 2012 Jul 10.
Article in English | MEDLINE | ID: mdl-22780965

ABSTRACT

BACKGROUND: Understanding protein subcellular localization is a necessary component toward understanding the overall function of a protein. Numerous computational methods have been published over the past decade, with varying degrees of success. Despite the large number of published methods in this area, only a small fraction of them are available for researchers to use in their own studies. Of those that are available, many are limited by predicting only a small number of organelles in the cell. Additionally, the majority of methods predict only a single location for a sequence, even though it is known that a large fraction of the proteins in eukaryotic species shuttle between locations to carry out their function. FINDINGS: We present a software package and a web server for predicting the subcellular localization of protein sequences based on the ngLOC method. ngLOC is an n-gram-based Bayesian classifier that predicts subcellular localization of proteins both in prokaryotes and eukaryotes. The overall prediction accuracy varies from 89.8% to 91.4% across species. This program can predict 11 distinct locations each in plant and animal species. ngLOC also predicts 4 and 5 distinct locations on gram-positive and gram-negative bacterial datasets, respectively. CONCLUSIONS: ngLOC is a generic method that can be trained by data from a variety of species or classes for predicting protein subcellular localization. The standalone software is freely available for academic use under GNU GPL, and the ngLOC web server is also accessible at http://ngloc.unmc.edu.


Subjects
Internet, Proteins/metabolism, Software, Subcellular Fractions/metabolism, Bayes Theorem, Eukaryotic Cells/metabolism, Prokaryotic Cells/metabolism
5.
PLoS One ; 4(3): e5096, 2009.
Article in English | MEDLINE | ID: mdl-19333396

ABSTRACT

Knowledge of specific domain-domain interactions (DDIs) is essential to understand the functional significance of protein interaction networks. Despite the availability of an enormous amount of data on protein-protein interactions (PPIs), very little is known about specific DDIs occurring in them. Here, we present a top-down approach to accurately infer functionally relevant DDIs from PPI data. We created a comprehensive, non-redundant dataset of 209,165 experimentally derived PPIs by combining datasets from five major interaction databases. We introduced an integrated scoring system that uses a novel combination of five orthogonal scoring features covering the probabilistic, evolutionary, evidence-based, spatial and functional properties of interacting domains, which can map the interacting propensity of two domains in many dimensions. This method outperforms similar existing methods both in the accuracy of prediction and in the coverage of domain interaction space. We predicted a set of 52,492 high-confidence DDIs to carry out cross-species comparison of DDI conservation in eight model species including human, mouse, Drosophila, C. elegans, yeast, Plasmodium, E. coli and Arabidopsis. Our results show that only 23% of these DDIs are conserved in at least two species and only 3.8% in at least four species, indicating rather low conservation across species. Pair-wise analysis of DDI conservation revealed a 'sliding conservation' pattern between evolutionarily neighboring species. Our methodology and the high-confidence DDI predictions generated in this study can help to better understand the functional significance of PPIs at the modular level, and can thus significantly inform further experimental investigations in systems biology research.
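
The conservation percentages reported above are fractions of predicted DDIs found in at least a given number of species. A minimal sketch of that arithmetic, assuming each DDI is stored with the set of species in which it is predicted, is shown below; the data layout, the function name, and the toy domain identifiers are hypothetical.

```python
def conservation_fraction(ddi_species, min_species):
    """Fraction of DDIs predicted in at least `min_species` species.

    ddi_species -- dict mapping a DDI (here a pair of domain names)
                   to the set of species in which it is predicted
    """
    if not ddi_species:
        return 0.0
    conserved = sum(
        1 for species in ddi_species.values() if len(species) >= min_species
    )
    return conserved / len(ddi_species)


# Toy data only; these are not predictions from the study.
toy_ddis = {
    ("domA", "domB"): {"human"},
    ("domA", "domC"): {"human", "mouse"},
    ("domB", "domD"): {"human", "mouse", "yeast", "E. coli"},
    ("domC", "domD"): {"yeast"},
}
at_least_two = conservation_fraction(toy_ddis, 2)   # 2 of 4 DDIs
at_least_four = conservation_fraction(toy_ddis, 4)  # 1 of 4 DDIs
```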


Subjects
Computational Biology/methods, Protein Interaction Mapping/methods, Animals, Humans, Proteomics/methods
6.
Genome Biol ; 8(5): R68, 2007.
Article in English | MEDLINE | ID: mdl-17472741

ABSTRACT

We present a method called ngLOC, an n-gram-based Bayesian classifier that predicts the localization of a protein sequence over ten distinct subcellular organelles. A tenfold cross-validation result shows an accuracy of 89% for sequences localized to a single organelle, and 82% for those localized to multiple organelles. An enhanced version of ngLOC was developed to estimate the subcellular proteomes of eight eukaryotic organisms: yeast, nematode, fruitfly, mosquito, zebrafish, chicken, mouse, and human.
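
To make the phrase "n-gram-based Bayesian classifier" concrete, here is a minimal sketch of a multinomial naive Bayes classifier over sequence n-grams. It is not the ngLOC implementation: the n-gram length of 3, the add-one smoothing, the class name, and the toy training sequences are all assumptions, and multi-location prediction is not modeled.

```python
import math
from collections import Counter, defaultdict


class NgramNaiveBayes:
    """Toy multinomial naive Bayes over fixed-length n-grams."""

    def __init__(self, n=3):
        self.n = n                               # ASSUMPTION: n-gram length
        self.class_counts = Counter()            # sequences seen per class
        self.ngram_counts = defaultdict(Counter) # n-gram counts per class
        self.vocab = set()

    def _ngrams(self, seq):
        return [seq[i:i + self.n] for i in range(len(seq) - self.n + 1)]

    def fit(self, sequences, labels):
        for seq, label in zip(sequences, labels):
            self.class_counts[label] += 1
            grams = self._ngrams(seq)
            self.ngram_counts[label].update(grams)
            self.vocab.update(grams)
        return self

    def log_scores(self, seq):
        """Unnormalized log posterior for each class."""
        total_seqs = sum(self.class_counts.values())
        vocab_size = len(self.vocab) + 1  # one extra slot for unseen n-grams
        scores = {}
        for label, n_seqs in self.class_counts.items():
            counts = self.ngram_counts[label]
            total = sum(counts.values())
            score = math.log(n_seqs / total_seqs)  # class prior
            for gram in self._ngrams(seq):
                # Add-one (Laplace) smoothed likelihood.
                score += math.log((counts[gram] + 1) / (total + vocab_size))
            scores[label] = score
        return scores

    def predict(self, seq):
        scores = self.log_scores(seq)
        return max(scores, key=scores.get)


# Toy sequences and labels; not real localization data.
model = NgramNaiveBayes(n=3).fit(
    ["MKKLLAAA", "MKKLLGAA", "GGSGGSGG", "SGGSGGSG"],
    ["nucleus", "nucleus", "cytoplasm", "cytoplasm"],
)
```

Calling `model.predict(sequence)` returns the single highest-scoring class; ranking the values from `log_scores` would be one simple starting point for flagging sequences that score well in more than one location.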


Subjects
Bayes Theorem, Eukaryotic Cells/ultrastructure, Organelles/chemistry, Proteome/analysis, Amino Acid Sequence, Animals, Classification, Humans, Proteins/analysis, Tissue Distribution, Yeasts