ABSTRACT
BACKGROUND: Administrative claims data are a valuable source for clinical studies; however, validated algorithms for identifying patients are essential to minimize bias. We evaluated the validity of diagnostic coding algorithms for identifying patients with colorectal cancer from a hospital's administrative claims data. METHODS: This validation study used administrative claims data from a Japanese university hospital between April 2017 and March 2019. We developed diagnostic coding algorithms, based primarily on International Classification of Diseases, 10th Revision (ICD-10) codes C18-20 and Japanese disease codes, to identify patients with colorectal cancer. For random samples of patients identified by our algorithms, case ascertainment was performed using chart review as the gold standard. The positive predictive value (PPV) was calculated to evaluate the accuracy of the algorithms. RESULTS: Of a random sample of 249 patients identified as having colorectal cancer by our coding algorithms, 215 were confirmed cases, yielding a PPV of 86.3% (95% confidence interval [CI], 81.5-90.1%). When the diagnostic codes were restricted to site-specific (right colon, left colon, transverse colon, or rectum) cancer codes, 94 of 100 randomly sampled patients were true cases of colorectal cancer, and the PPV increased to 94.0% (95% CI, 87.2-97.4%). CONCLUSION: Our diagnostic coding algorithms based on ICD-10 codes and Japanese disease codes were highly accurate in detecting patients with colorectal cancer from this hospital's claims data. The exclusive use of site-specific cancer codes improved the PPV from 86.3% to 94.0%, suggesting that these codes are preferable for identifying such patients more precisely.
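The PPV figures above follow directly from the reported counts. The sketch below recomputes them and attaches a 95% Wilson score interval. The abstract does not say which interval method the authors used, so the Wilson interval is an assumption here and the bounds it prints may differ slightly from the published ones. The function name `ppv_with_wilson_ci` is illustrative only.

```python
import math


def ppv_with_wilson_ci(true_positives, sampled, z=1.96):
    """Return (ppv, lower, upper) for a chart-review validation sample.

    Assumption: a Wilson score interval with z = 1.96 (about 95% coverage).
    The source abstract does not state the interval method.
    """
    if sampled <= 0 or not 0 <= true_positives <= sampled:
        raise ValueError("need 0 <= true_positives <= sampled and sampled > 0")
    p = true_positives / sampled
    denom = 1 + z ** 2 / sampled
    center = (p + z ** 2 / (2 * sampled)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / sampled + z ** 2 / (4 * sampled ** 2)) / denom
    )
    return p, center - half_width, center + half_width


# Counts taken from the abstract: confirmed cases / sampled patients.
for label, confirmed, n in [
    ("all codes", 215, 249),
    ("site-specific codes", 94, 100),
]:
    ppv, lower, upper = ppv_with_wilson_ci(confirmed, n)
    print(f"{label}: PPV {ppv:.1%} (95% CI {lower:.1%} to {upper:.1%})")
```

A PPV computed this way only describes patients the algorithm flagged; it says nothing about sensitivity, which would require reviewing charts of patients the algorithm did not flag.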
Subjects
Colorectal Neoplasms; East Asian People; Humans; Algorithms; Colorectal Neoplasms/diagnosis; Colorectal Neoplasms/epidemiology; Databases, Factual; Hospitals, University; International Classification of Diseases; Predictive Value of Tests
ABSTRACT
G1 and G2 fluorene dendrimers with naphthalene termini were synthesized as fluorescence turn-on chemical sensors for vitamin K4. The fluorene dendrimers were prepared by a convergent method, using a Williamson ether reaction between the fluorene core with dihydroxy groups and dendritic naphthalene segments with methylene chloride. We investigated the relationship between dendrimer generation and vitamin K4 recognition by the fluorene dendrimers with naphthalene termini in CHCl3. Addition of vitamin K4 enhanced the fluorescence intensity of the fluorene dendrimers. In particular, the G2 fluorene dendrimer was found to be an effective chemical sensor for vitamin K4 and performed better than the G1 fluorene dendrimer.
Subjects
Dendrimers/chemical synthesis; Fluorenes/chemical synthesis; Molecular Probes/chemical synthesis; Vitamin K/analysis; Chloroform; Dendrimers/chemistry; Fluorenes/chemistry; Methylene Chloride/chemistry; Molecular Probes/chemistry; Naphthalenes/chemistry; Solutions; Spectrometry, Fluorescence
ABSTRACT
The purpose of this study was to examine the use of non-contrast-enhanced MR angiography (MRA) for assessing recanalization of uterine arteries (UAs) after uterine artery embolization (UAE) for symptomatic fibroids. Pre-procedural and follow-up unenhanced MRA images of 30 patients were reviewed, and the extent to which the UAs could be visualized was classified on a 4-point scale. An increase in the score between consecutive time points indicated that a previously inconspicuous segment of the UA had become visible on follow-up images. Patients were divided into two groups according to the presence or absence of recanalization. The median UA visualization score at each follow-up was significantly lower than that at baseline (p < 0.01), but there was no significant difference between the scores of the follow-up images. Recanalization was detected in 63% (19/30) of patients. In these patients, the mean decrease in uterine and largest fibroid volume at 12 months after UAE was not inferior to the mean decrease in patients for whom recanalization was not detected. Based on MRA assessment, recanalization after UAE occurred in 63% of patients but did not compromise the reduction in uterine and dominant fibroid volumes within 12 months after UAE.
ABSTRACT
TMPDB is a database of experimentally characterized transmembrane (TM) topologies. TMPDB release 6.2 contains a total of 302 TM protein sequences, of which 276 are alpha-helical sequences, 17 are beta-stranded, and 9 are alpha-helical sequences with short pore-forming helices buried in the membrane. The TM topologies in TMPDB were determined experimentally by means of X-ray crystallography, NMR, gene fusion techniques, the substituted cysteine accessibility method, N-linked glycosylation experiments and other biochemical methods. TMPDB should be useful as a test and/or training dataset for improving existing TM topology prediction methods or developing novel methods with higher performance, and as a guide to help both bioinformaticians and biologists better understand TM proteins. TMPDB and its subsets are freely available at the following web site: http://bioinfo.si.hirosaki-u.ac.jp/~TMPDB/.
Subjects
Databases, Protein; Membrane Proteins/chemistry; Animals; Eukaryotic Cells; Prokaryotic Cells; Protein Structure, Secondary
ABSTRACT
ConPred II (http://bioinfo.si.hirosaki-u.ac.jp/~ConPred2/) is a server for the prediction of transmembrane (TM) topology [i.e. the number of TM segments (TMSs), TMS positions and N-tail location] based on a consensus approach by combining the results of several proposed methods. The ConPred II system is constructed from ConPred_elite and ConPred_all (previously named ConPred), proposed earlier by our group. The prediction accuracy of ConPred_elite is almost 100%, which is achieved by sacrificing the prediction coverage (20-30%). ConPred_all predicts TM topologies for all the input sequences with accuracies improved by up to 11% over individual proposed methods. In the ConPred II system, the TM topology prediction of input TM protein sequences is executed following a two-step process: (i) input sequences are first run through the ConPred_elite program; (ii) sequences for which ConPred_elite does not give the TM topology are delivered to the ConPred_all program for TM topology prediction. Users can get access to the ConPred II system automatically by submitting sequences to the server. The ConPred II server will return the predicted TM topology models and graphical representations of their contents (hydropathy plots, helical wheel diagrams of predicted TMSs and snake-like diagrams).
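The two-step process described above is a cascade: a high-accuracy, low-coverage predictor runs first, and a full-coverage predictor handles whatever the first one declines. A minimal sketch of that control flow follows. The predictor functions, the `Topology` type, and the stub rules are all hypothetical stand-ins. They are not the real ConPred_elite or ConPred_all programs, whose interfaces are not described in the abstract.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical representation of a predicted topology.
Topology = Dict[str, object]
Predictor = Callable[[str], Optional[Topology]]


def predict_two_step(
    sequence: str, elite: Predictor, fallback: Predictor
) -> Tuple[str, Optional[Topology]]:
    """Run the selective predictor first; fall back only when it abstains."""
    result = elite(sequence)
    if result is not None:
        return "elite", result
    return "all", fallback(sequence)


# Stand-in predictors for illustration only (not real prediction logic).
def elite_stub(sequence: str) -> Optional[Topology]:
    # Pretend the selective method only answers for long inputs.
    if len(sequence) >= 40:
        return {"tms_count": 2, "n_tail": "in"}
    return None


def fallback_stub(sequence: str) -> Optional[Topology]:
    # Pretend the full-coverage method always returns some answer.
    return {"tms_count": 1, "n_tail": "out"}


if __name__ == "__main__":
    for seq in ["M" * 50, "M" * 10]:
        source, topology = predict_two_step(seq, elite_stub, fallback_stub)
        print(len(seq), source, topology)
```

Returning the name of the step alongside the result lets a caller tell high-confidence predictions from fallback ones, which matters when the two steps differ in accuracy as much as the abstract reports.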
Subjects
Membrane Proteins/chemistry; Models, Molecular; Software; Algorithms; Internet; Membrane Proteins/classification; Membrane Proteins/physiology; Protein Conformation; Reproducibility of Results; User-Computer Interface
ABSTRACT
We investigated the evolution of transmembrane (TM) topology by detecting partial sequence repeats in TM protein sequences and analyzing them in detail. A total of 377 sequences that appear to have evolved by internal gene duplication events were found among 38,124 predicted TM protein sequences (excluding single-spannings) from 87 prokaryotic genomes. Various types of internal duplication patterns were identified in these sequences. The majority are diploid-type (including quasi-diploid-type) duplications, in which a primordial protein sequence was duplicated internally to become an extant TM protein with twice as many TM segments as the primordial one; the remainder are partial duplications, including the triploid type. The diploid-type repeats are recognized in many 8-tms, 10-tms and 12-tms TM protein sequences, suggesting that diploid-type duplication was a principal mechanism in the evolutionary development of these types of TM proteins. The "positive-inside" rule is satisfied in whole sequences of both 10-tms and 8-tms TM proteins and in both halves of 10-tms proteins, but not necessarily in the second half of 8-tms proteins, providing fitting examples of "internal divergent topology evolution" that likely occurred after a diploid-type internal duplication event. From the analysis of partial duplication patterns, several evolutionary pathways were recognized for 6-tms TM proteins, i.e. from primordial 2-tms, 3-tms and 4-tms TM proteins to extant 6-tms proteins. Similarly, the duplication pattern analysis revealed plausible evolutionary scenarios in which 7-tms TM proteins arose from 3-tms, 4-tms and 5-tms TM protein precursors via partial internal gene duplications.
Subjects
Biological Evolution; Gene Duplication; Genome, Archaeal; Genome, Bacterial; Membrane Proteins/genetics; Protein Structure, Secondary; Proteins/genetics; Amino Acid Sequence; Archaea/genetics; Archaea/metabolism; Bacteria/genetics; Bacteria/metabolism; Databases, Protein; Membrane Proteins/isolation & purification; Membrane Proteins/metabolism; Models, Biological; Molecular Sequence Data; Protein Sorting Signals/genetics; Proteome/genetics; Proteome/isolation & purification; Proteome/metabolism; Sequence Analysis, Protein; Sequence Homology, Amino Acid; Software
ABSTRACT
We performed a proteome-wide survey of the domain architectures in single-spanning transmembrane (TM) proteins (single-spannings) from 87 sequenced prokaryotic (bacterial and archaeal) genomes by assigning Pfam domains to their N-tail and C-tail loops. Out of 14,625 single-spannings, 3,516 sequences have at least one domain assigned, 7,850 have no domain assigned, and the remaining 3,259 have less reliable assignments. Among the domain-assigned sequences, 3,116 have at most two domains and the other 400 have more than two. The assigned domains are distributed over 651 Pfam families, which account for 11.4% of the total Pfam-A families. Most of the 651 families originate from soluble proteins, and only 21 families are unique to TM proteins. The occurrence frequency of the individual domain families follows a power law: 264 families occur only once, 106 occur just twice, and only 39 families appear more than 30 times. The great majority of the sequences with one or two domains have the type II topology, with the domains located on the C-tail loop. In contrast, the N-tail loop of the same topology type seldom carries domains. Importantly, the assigned domains are always found on tail loops longer than 60 residues, even for small domains with fewer than 30 residues. As many as 5,800 sequences still have no assigned domain despite having at least one long tail, and no fewer than 1,000 novel, as yet unknown domain families are expected to lie concealed on these tails. We also investigated the domain arrangement preference and the domain family combination patterns in 'singlets' (single-spannings with one assigned domain) and 'doublets' (with two domains).
Subjects
Membrane Proteins/chemistry; Protein Structure, Tertiary; Proteome/chemistry; Computational Biology; Databases, Genetic; Databases, Protein; Genome, Archaeal; Genome, Bacterial; Prokaryotic Cells/chemistry; Proteome/analysis
ABSTRACT
We report the spontaneous generation of a cell-like morphology in an environment crowded with the polymers dextran and polyethylene glycol (PEG) in the presence of DNA. DNA molecules were selectively located in the interior of dextran-rich micro-droplets, when the composition of an aqueous two-phase system (ATPS) was near the critical condition of phase-segregation. The resulting micro-droplets could be controlled by the use of optical tweezers. As an example of laser manipulation, the dynamic fusion of two droplets is reported, which resembles the process of cell division in time-reverse. A hypothetical scenario for the emergence of a primitive cell with DNA is briefly discussed.
ABSTRACT
We propose a new method for classifying and identifying transmembrane (TM) protein functions at the proteome scale by applying a single-linkage clustering method based on TM topology similarity, which is calculated simply by comparing the lengths of loop regions. In this study, we focused on 87 prokaryotic TM proteomes consisting of 31 proteobacteria, 22 gram-positive bacteria, 19 other bacteria, and 15 archaea. Before performing the clustering, we first categorized individual TM protein sequences as "known," "putative" (similar to "known" sequences), or "unknown" by using a homology search and a sequence similarity comparison against SWISS-PROT, in order to assess the current status of the functional annotation of the TM proteomes based on sequence similarity only. More than three-quarters (75.7%) of the TM protein sequences are functionally "unknown," with only 3.8% and 20.5% of them being classified as "known" and "putative," respectively. Using our clustering approach based on TM topology similarity, we succeeded in increasing the rate of TM protein sequences functionally classified and identified from 24.3% to 60.9%. The clusters obtained correspond well to functional superfamilies or families, and the functional classification and identification are successfully achieved by this approach. For example, in one cluster of TM proteins with six TM segments, 109 of the 119 sequences annotated as "ATP-binding cassette transporter" are properly included, and 122 "unknown" sequences are also contained.
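Single-linkage clustering merges two groups whenever any one member of the first is close enough to any one member of the second, so chains of pairwise-similar proteins end up in one cluster. The sketch below illustrates this on loop-length profiles. The abstract does not define the similarity measure or a cutoff, so `loop_distance` (largest difference between corresponding loop lengths), the threshold value, and the example loop lengths are all hypothetical and chosen only for illustration.

```python
def loop_distance(a, b):
    """Hypothetical dissimilarity between two loop-length profiles.

    Profiles with different numbers of loops are treated as incomparable.
    """
    if len(a) != len(b):
        return float("inf")
    return max((abs(x - y) for x, y in zip(a, b)), default=0)


def single_linkage(loops_by_id, threshold):
    """Group proteins so that any pair within `threshold` shares a cluster."""
    ids = sorted(loops_by_id)
    parent = {i: i for i in ids}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for index, first in enumerate(ids):
        for second in ids[index + 1:]:
            if loop_distance(loops_by_id[first], loops_by_id[second]) <= threshold:
                parent[find(first)] = find(second)

    clusters = {}
    for i in ids:
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Made-up loop lengths (residues) for five made-up proteins.
example = {
    "A": [10, 5, 20],
    "B": [12, 6, 18],
    "C": [15, 8, 16],
    "D": [40, 5, 20],
    "E": [10, 5],
}

if __name__ == "__main__":
    print(single_linkage(example, threshold=4))
```

In this toy input, A and C differ by more than the threshold but still share a cluster because each is close to B, which is the chaining behavior that distinguishes single linkage from complete or average linkage.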
Subjects
Algorithms; Membrane Proteins/chemistry; Membrane Proteins/classification; Prokaryotic Cells/chemistry; Proteome/chemistry; Sequence Homology, Amino Acid; Computational Biology; Databases, Genetic; Protein Structure, Tertiary
ABSTRACT
We comprehensively analysed the transmembrane (TM) topologies of TM proteins from 50 selected prokaryotic genomes by discriminating between TM and soluble proteins using SOSUI, then detecting and removing signal peptides using 'DetecSig', and finally predicting TM topologies using 'ConPred'. The estimated fraction of TM proteins in the proteome, averaged over the 50 genomes, is approximately 22%. About 13% of TM proteins were predicted to have a signal peptide, and the fraction of soluble proteins with a signal peptide (secretory proteins) ranges from 8 to 18% for the great majority of the genomes. The N(in)-type TM proteins with 2-, 4-, 6- and 12-tms (number of transmembrane segments) are predominant among multi-spanning TM proteins, and correspondingly, significantly higher fractions of N(out)-type TM proteins with 1-, 3-, 5- and 11-tms have a signal peptide. TM proteins with a signal peptide also tend to have a long N-tail loop. The average sequence length of TM proteins increases linearly with the number of TM segments, at a rate of about 35 residues per segment, suggesting that TM topologies might have evolved by the 'internal gene duplication' mechanism. Datasets of TM topologies predicted in this study are available at http://bioinfo.si.hirosaki-u.ac.jp/~TMPinGS/.
Subjects
Genome, Archaeal; Genome, Bacterial; Membrane Proteins/genetics; Archaea/genetics; Archaea/metabolism; Bacteria/genetics; Bacteria/metabolism; Membrane Proteins/isolation & purification; Membrane Proteins/metabolism; Protein Sorting Signals/genetics; Proteome/genetics; Proteome/isolation & purification; Proteome/metabolism; Subcellular Fractions
ABSTRACT
Structure prediction of membrane proteins could be constrained, and thereby improved, by introducing data on the observed molecular shape. We studied a coarse-grained molecular model that relies on residue-based dummy atoms to fold the transmembrane helices of a protein within the observed molecular shape. Based on the inter-residue potential, the α-helices were folded to contact each other in a simulated annealing protocol that searches for an optimized conformation. Fitting the model into a three-dimensional volume was tested for proteins with known structures and resulted in a fairly reasonable arrangement of helices. In addition, constraining the packing of transmembrane helices to a two-dimensional region was tested and found to work as a very similar folding guide. The obtained models represented α-helices well, with the desired slight bend. Our structure prediction method for membrane proteins produced reasonable folding results using a low-resolution structural constraint of the kind provided by recent cell-surface imaging techniques.
Subjects
Membrane Proteins/chemistry; Protein Folding; Membrane Proteins/metabolism; Membrane Proteins/ultrastructure; Microscopy, Electron; Models, Molecular; Molecular Dynamics Simulation; Protein Structure, Secondary
ABSTRACT
We selected 10 transmembrane (TM) prediction methods (KKD, TMpred, TopPred II, DAS, TMAP, MEMSAT 2, SOSUI, PRED-TMR2, TMHMM 2.0 and HMMTOP 2.0) and re-assessed their prediction performance using a reliable dataset of 122 entries with experimentally characterized TM topologies. We then improved prediction performance with a consensus prediction method. Prediction performance in both the re-assessment and the consensus prediction was evaluated on four attributes: (i) the number of transmembrane segments (TMSs), (ii) the number of TMSs plus TMS position, (iii) N-tail location and (iv) TM topology. Hidden Markov model-based methods outperformed the other methods in individual prediction performance for all four attributes. In addition, the top-performing methods were generally model-based. Among prokaryotic sequences, HMMTOP 2.0 alone ranked first, with prediction accuracies ranging from 64% to 86% across all attributes. Among eukaryotic sequences, however, prediction performance for all attributes was relatively poor compared with prokaryotic ones. Our proposed consensus prediction method improved prediction performance by at least nine percentage points, particularly among prokaryotic sequences, for the number of TMSs (84%), the number of TMSs plus position (80%), and TM topology (74%). Although the consensus prediction method also improved prediction performance among eukaryotic sequences, the accuracies obtained for all attributes were lower than those obtained for their prokaryotic counterparts, particularly for TM topology.
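A consensus prediction combines the outputs of several individual methods into one answer. The abstract does not describe the combination rule, so the sketch below uses a plain majority vote on the predicted number of TMSs as an assumed, simplified stand-in. The function name `consensus_tms_count` is hypothetical, and the per-method counts in the example are made-up values for illustration, not results from the study.

```python
from collections import Counter
from typing import Dict, Optional


def consensus_tms_count(predictions: Dict[str, int]) -> Optional[int]:
    """Return the TMS count predicted by the most methods.

    Assumption: simple majority vote; returns None when there is no input
    or when the top two counts tie, so the caller can handle that case.
    """
    ranked = Counter(predictions.values()).most_common()
    if not ranked:
        return None
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


if __name__ == "__main__":
    # Illustrative per-method TMS counts for one sequence (not real output).
    votes = {"TMHMM 2.0": 7, "HMMTOP 2.0": 7, "SOSUI": 6, "TMpred": 7, "DAS": 8}
    print(consensus_tms_count(votes))
```

Voting on the TMS count covers only the first of the four attributes listed above; a consensus on TMS positions or N-tail location would need its own rule for reconciling overlapping segment boundaries.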
Subjects
Cell Membrane/metabolism; Models, Molecular; Proteins/chemistry; Proteins/metabolism; Software; Cell Membrane/chemistry; Databases, Protein
ABSTRACT
Genome-wide sequence analysis in the invertebrate chordate, Ciona intestinalis, has provided a comprehensive picture of immune-related genes in an organism that occupies a key phylogenetic position in vertebrate evolution. The pivotal genes for adaptive immunity, such as the major histocompatibility complex (MHC) class I and II genes, T-cell receptors, or dimeric immunoglobulin molecules, have not been identified in the Ciona genome. Many genes involved in innate immunity have been identified, including complement components, Toll-like receptors, and the genes involved in intracellular signal transduction of immune responses, and these show both expansion and unexpected diversity in comparison with the vertebrates. In addition, a number of genes were identified that are predicted to encode integral membrane proteins with extracellular C-type lectin or immunoglobulin domains and intracellular immunoreceptor tyrosine-based inhibitory motifs (ITIMs) and immunoreceptor tyrosine-based activation motifs (ITAMs) (plus their associated signal transduction molecules), suggesting that activating and inhibitory receptors have an MHC-independent function and an early evolutionary origin. A crucial component of vertebrate adaptive immunity is somatic diversification, and the recombination activating genes (RAG) and activation-induced cytidine deaminase (AID) genes responsible for the generation of diversity are not present in Ciona. However, there are key V regions, the essential feature of an immunoglobulin superfamily VC1-like core, and possible proto-MHC regions scattered throughout the genome waiting for Godot.