1.
BMC Bioinformatics ; 20(1): 456, 2019 Sep 06.
Article in English | MEDLINE | ID: mdl-31492094

ABSTRACT

BACKGROUND: In the search for therapeutic peptides for disease treatments, many efforts have been made to identify various functional peptides from large numbers of peptide sequence databases. In this paper, we propose an effective computational model that uses deep learning and word2vec to predict therapeutic peptides (PTPD). RESULTS: Representation vectors of all k-mers were obtained through word2vec based on k-mer co-existence information. The original peptide sequences were then divided into k-mers using the windowing method. The peptide sequences were mapped to the input layer by the embedding vector obtained by word2vec. Three types of filters in the convolutional layers, as well as dropout and max-pooling operations, were applied to construct feature maps. These feature maps were concatenated into a fully connected dense layer, and rectified linear units (ReLU) and dropout operations were included to avoid over-fitting of PTPD. The classification probabilities were generated by a sigmoid function. PTPD was then validated using two datasets: an independent anticancer peptide dataset and a virulent protein dataset, on which it achieved accuracies of 96% and 94%, respectively. CONCLUSIONS: PTPD identified novel therapeutic peptides efficiently, and it is suitable for application as a useful tool in therapeutic peptide design.
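
The windowing step described in this abstract can be illustrated with a minimal Python sketch. It is not the PTPD implementation: the function names, the default k = 3, the stride of 1, and the choice to reserve index 0 for padding are all assumptions made for illustration. The resulting k-mer tokens are the kind of input that a word2vec-style embedding would consume.

```python
def peptide_to_kmers(sequence, k=3, stride=1):
    """Slide a window of width k over the peptide and collect each k-mer.

    k and stride are illustrative defaults, not values taken from the paper.
    """
    if k <= 0 or stride <= 0:
        raise ValueError("k and stride must be positive")
    return [sequence[i:i + k] for i in range(0, len(sequence) - k + 1, stride)]


def build_vocabulary(peptides, k=3):
    """Map each distinct k-mer to an integer index, in order of first appearance.

    Index 0 is left unused so it can serve as a padding value (an assumption).
    """
    vocabulary = {}
    for peptide in peptides:
        for kmer in peptide_to_kmers(peptide, k):
            vocabulary.setdefault(kmer, len(vocabulary) + 1)
    return vocabulary


def encode(peptide, vocabulary, k=3):
    """Turn a peptide into the list of k-mer indices an embedding layer would read.

    k-mers that are not in the vocabulary are skipped.
    """
    return [vocabulary[kmer] for kmer in peptide_to_kmers(peptide, k) if kmer in vocabulary]
```

A peptide shorter than k yields no k-mers, so a real pipeline would need a padding or filtering rule for very short sequences.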


Subjects
Computational Biology/methods; Deep Learning; Peptides/therapeutic use; Databases, Nucleic Acid; Drug Discovery
2.
BMC Bioinformatics ; 20(1): 424, 2019 Aug 15.
Article in English | MEDLINE | ID: mdl-31416440

ABSTRACT

BACKGROUND: High throughput DNA/RNA sequencing has revolutionized biological and clinical research. Sequencing is widely used, and generates very large amounts of data, mainly due to reduced cost and advanced technologies. Quickly assessing the quality of giga-to-tera base levels of sequencing data has become a routine but important task. Identification and elimination of low-quality sequence data is crucial for reliability of downstream analysis results. There is a need for a high-speed tool that uses optimized parallel programming for batch processing and simply gauges the quality of sequencing data from multiple datasets independent of any other processing steps. RESULTS: FQStat is a stand-alone, platform-independent software tool that assesses the quality of FASTQ files using parallel programming. Based on the machine architecture and input data, FQStat automatically determines the number of cores and the amount of memory to be allocated per file for optimum performance. Our results indicate that in a core-limited case, core assignment overhead exceeds the benefit of additional cores. In a core-unlimited case, there is a saturation point reached in performance by increasingly assigning additional cores per file. We also show that memory allocation per file has a lower priority in performance when compared to the allocation of cores. FQStat's output is summarized in HTML web page, tab-delimited text file, and high-resolution image formats. FQStat calculates and plots read count, read length, quality score, and high-quality base statistics. FQStat identifies and marks low-quality sequencing data to suggest removal from downstream analysis. We applied FQStat on real sequencing data to optimize performance and to demonstrate its capabilities. We also compared FQStat's performance to similar quality control (QC) tools that utilize parallel programming and attained improvements in run time. 
CONCLUSIONS: FQStat is a user-friendly tool with a graphical interface that employs a parallel programming architecture and automatically optimizes its performance to generate quality control statistics for sequencing data. Unlike existing tools, these statistics are calculated for multiple datasets and separately at the "lane," "sample," and "experiment" level to identify subsets of the samples with low quality, thereby preventing the loss of complete samples when reliable data can still be obtained.
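
The kind of per-read statistics mentioned in this abstract (read count, read length, quality score, low-quality flagging) can be sketched in a few lines of Python. This is not FQStat's code or interface: the function names, the Phred+33 offset, and the mean-quality cutoff of 20 are assumptions, and the parser only handles the simple four-lines-per-record FASTQ layout.

```python
from statistics import mean


def read_fastq(handle):
    """Yield (read_id, sequence, quality_string) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of file (or a blank line) stops the iteration
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        quality = handle.readline().rstrip("\n")
        yield header[1:], sequence, quality


def summarize(handle, min_mean_quality=20.0, offset=33):
    """Compute length and mean Phred quality per read and flag low-quality reads.

    offset=33 assumes Phred+33 encoding; min_mean_quality is an illustrative cutoff.
    """
    stats = []
    for read_id, sequence, quality in read_fastq(handle):
        scores = [ord(char) - offset for char in quality]
        mean_quality = mean(scores) if scores else 0.0
        stats.append({
            "id": read_id,
            "length": len(sequence),
            "mean_quality": mean_quality,
            "low_quality": mean_quality < min_mean_quality,
        })
    return stats
```

Calling `summarize(open("sample.fastq"))` on a hypothetical file returns one dictionary per read; aggregating these per lane, sample, or experiment would follow the grouping the abstract describes.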


Subjects
High-Throughput Nucleotide Sequencing/methods; High-Throughput Nucleotide Sequencing/standards; Software; Databases, Nucleic Acid; Humans; Quality Control; Sequence Analysis, DNA; Sequence Analysis, RNA; Time Factors
3.
BMC Bioinformatics ; 20(1): 405, 2019 Jul 25.
Article in English | MEDLINE | ID: mdl-31345161

ABSTRACT

BACKGROUND: Next-generation sequencing technologies can produce tens of millions of reads, often paired-end, from transcripts or genomes. But few programs can align RNA on the genome and accurately discover introns, especially with long reads. We introduce Magic-BLAST, a new aligner based on ideas from the Magic pipeline. RESULTS: Magic-BLAST uses innovative techniques that include the optimization of a spliced alignment score and selective masking during seed selection. We evaluate the performance of Magic-BLAST to accurately map short or long sequences and its ability to discover introns on real RNA-seq data sets from PacBio, Roche and Illumina runs, and on six benchmarks, and compare it to other popular aligners. Additionally, we look at alignments of human idealized RefSeq mRNA sequences perfectly matching the genome. CONCLUSIONS: We show that Magic-BLAST is the best at intron discovery over a wide range of conditions and the best at mapping reads longer than 250 bases, from any platform. It is versatile and robust to high levels of mismatches or extreme base composition, and reasonably fast. It can align reads to a BLAST database or a FASTA file. It can accept a FASTQ file as input or automatically retrieve an accession from the SRA repository at the NCBI.


Subjects
RNA/genetics; Sequence Alignment; Sequence Analysis, RNA/methods; Software; Algorithms; Base Sequence; Databases, Nucleic Acid; Humans; Introns/genetics; ROC Curve; Time Factors
4.
J S Afr Vet Assoc ; 90(0): e1-e6, 2019 Jun 20.
Article in English | MEDLINE | ID: mdl-31291728

ABSTRACT

Genetic diversity within partial 18S rRNA sequences from Hepatozoon protozoan parasites from domestic cats in South Africa was assessed and compared against published data to assess global biogeographic patterns. Multiple distinct haplotypes of Hepatozoon felis were identified, as well as an unrelated Hepatozoon lineage. Hepatozoon felis genetic diversity globally is very high, indicating a likely complex of species. The recently described Hepatozoon apri from wild boars is closely related to some lineages of H. felis. Sarcocystis and Babesia parasites were also detected. Since Hepatozoon felis is apparently a species complex, potential differences between genetically distinct forms need to be assessed. The finding of an unrelated Hepatozoon indicates that felids can be infected by more species of Hepatozoon than currently known, and that trophic interactions may increase the number of Hepatozoon species found in carnivores. Genetic screening is again shown to identify previously unrecognised parasites from vertebrate hosts.


Subjects
Apicomplexa/genetics; Cats/parasitology; Animals; Animals, Domestic/parasitology; Bayes Theorem; Databases, Nucleic Acid; Genetic Variation; Haplotypes; RNA, Ribosomal, 18S/genetics; Sequence Analysis, DNA/veterinary; South Africa
5.
J Forensic Leg Med ; 66: 155-161, 2019 Aug.
Article in English | MEDLINE | ID: mdl-31306915

ABSTRACT

The simultaneous localisation and globalisation of 'terrorist threats' and cross-border criminality have led to increased expansion of surveillance activities and greater cross-border police and judicial cooperation, placing a greater priority on these activities within the political agenda of the EU. In this scenario, the expansion of technological systems for surveillance and monitoring, and the large-scale exchange of citizens' personal data play a pivotal role in the "fight against crime". This paper explores the multiplicity of data protection regimes in different EU Member States within the framework of the Prüm system. While EU regulations establish minimum standards for personal data flows at the transnational level, local and domestic practices are extremely heterogeneous. Based on analysis of 37 interviews conducted with professionals involved in the automated exchange of forensic genetic profiles, this paper provides empirical data that highlights the tensions between the local and the global within DNA data exchanges across the EU. These tensions relate to differentiated sociotechnical imaginaries regarding the protection of personal data flowing between Member-States. In sum, this paper analyses the potential threats to human rights created by the exchange of personal data with regards to issues of privacy and data protection.


Subjects
Computer Security/legislation & jurisprudence; Databases, Nucleic Acid; Information Dissemination/legislation & jurisprudence; International Cooperation; Privacy/legislation & jurisprudence; Crime/prevention & control; DNA Fingerprinting/legislation & jurisprudence; Dermatoglyphics; European Union; Humans; Terrorism/prevention & control
6.
BMC Bioinformatics ; 20(1): 298, 2019 Jun 03.
Article in English | MEDLINE | ID: mdl-31159722

ABSTRACT

BACKGROUND: Several standalone error correction tools have been proposed to correct sequencing errors in Illumina data in order to facilitate de novo genome assembly. However, in a recent survey, we showed that state-of-the-art assemblers often did not benefit from this pre-correction step. We found that many error correction tools introduce new errors in reads that overlap highly repetitive DNA regions such as low-complexity patterns or short homopolymers, ultimately leading to a more fragmented assembly. RESULTS: We propose BrownieCorrector, an error correction tool for Illumina sequencing data that focuses on the correction of only those reads that overlap short DNA patterns that are highly repetitive in the genome. BrownieCorrector extracts all reads that contain such a pattern and clusters them into different groups using a community detection algorithm that takes into account both the sequence similarity between overlapping reads and their respective paired-end reads. Each cluster holds reads that originate from the same genomic region and hence each cluster can be corrected individually, thus providing a consistent correction for all reads within that cluster. CONCLUSIONS: BrownieCorrector is benchmarked using six real Illumina datasets for different eukaryotic genomes. The prior use of BrownieCorrector improves assembly results over the use of uncorrected reads in all cases. In comparison with other error correction tools, BrownieCorrector leads to the best assembly results in most cases even though less than 2% of the reads within a dataset are corrected. Additionally, we investigate the impact of error correction on hybrid assembly where the corrected Illumina reads are supplemented with PacBio data. Our results confirm that BrownieCorrector improves the quality of hybrid genome assembly as well. BrownieCorrector is written in standard C++11 and released under GPL license. 
BrownieCorrector relies on multithreading to take advantage of multi-core/multi-CPU systems. The source code is available at https://github.com/biointec/browniecorrector .
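
The first step described in this abstract, pulling out the reads that contain a given short repetitive pattern, can be sketched in Python as follows. This is not BrownieCorrector's code (the tool itself is written in C++): the function names are hypothetical, the pattern is supplied by the caller rather than discovered from the genome, matching on both strands is an assumption, and the clustering and correction steps are not shown.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA string made of A, C, G and T."""
    return sequence.translate(_COMPLEMENT)[::-1]


def split_reads_by_pattern(reads, pattern):
    """Separate reads that contain the pattern (on either strand) from the rest.

    Only the selected reads would be passed on to clustering and correction;
    all other reads would be left untouched.
    """
    pattern_rc = reverse_complement(pattern)
    selected, others = [], []
    for read in reads:
        if pattern in read or pattern_rc in read:
            selected.append(read)
        else:
            others.append(read)
    return selected, others
```

For example, with the homopolymer pattern "AAAAAA", reads containing a run of six T bases are also selected, because that run is the same pattern read from the opposite strand.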


Subjects
Algorithms; DNA/genetics; Genome; Repetitive Sequences, Nucleic Acid/genetics; Sequence Analysis, DNA/methods; Animals; Databases, Nucleic Acid; Humans; Sequence Alignment; Time Factors
7.
Nat Commun ; 10(1): 2837, 2019 Jun 28.
Article in English | MEDLINE | ID: mdl-31253775

ABSTRACT

The diagnostic yield of exome and genome sequencing remains low (8-70%), due to incomplete knowledge on the genes that cause disease. To improve this, we use RNA-seq data from 31,499 samples to predict which genes cause specific disease phenotypes, and develop GeneNetwork Assisted Diagnostic Optimization (GADO). We show that this unbiased method, which does not rely upon specific knowledge on individual genes, is effective in both identifying previously unknown disease gene associations, and flagging genes that have previously been incorrectly implicated in disease. GADO can be run on www.genenetwork.nl by supplying HPO-terms and a list of genes that contain candidate variants. Finally, applying GADO to a cohort of 61 patients for whom exome-sequencing analysis had not resulted in a genetic diagnosis, yields likely causative genes for ten cases.


Subjects
Gene Expression Regulation/physiology; Genetic Predisposition to Disease; Sequence Analysis, RNA/methods; Transcriptome; Databases, Nucleic Acid; Humans; Models, Genetic; Principal Component Analysis; Software; User-Computer Interface
8.
Forensic Sci Int ; 301: 371-381, 2019 Aug.
Article in English | MEDLINE | ID: mdl-31212144

ABSTRACT

Different stakeholders use forensic DNA databases for different purposes; for example, law enforcement agencies use them as an investigative tool to identify suspects, and criminologists use them to study the offending patterns of unidentified suspects. A number of researchers have already studied their effectiveness, but none has performed an overview of the relevant literature. Such an overview could help future researchers and policymakers by evaluating their creation, use and expansion. Using a systematic review, this article synthesizes the most relevant research into the effectiveness of forensic DNA databases published between January 1985 and March 2018. We report the results of the selected studies and look deeper into the evidence by evaluating the relationship between the purpose, content, and effectiveness of DNA databases, three inseparable elements in this type of research. We classify the studies by purposes: (i) detection and clearance; (ii) deterrence; and (iii) criminological scientific knowledge. Each category uses different measurements to evaluate effectiveness. The majority of these studies report positive results, supporting the assumption that DNA databases are an effective tool for the police, society, and criminologists.


Subjects
Criminal Law; Databases, Nucleic Acid; DNA Fingerprinting; Humans
10.
Nat Commun ; 10(1): 2449, 2019 Jun 04.
Article in English | MEDLINE | ID: mdl-31164644

ABSTRACT

DNA base modifications, such as C5-methylcytosine (5mC) and N6-methyldeoxyadenosine (6mA), are important types of epigenetic regulation. Short-read bisulfite sequencing and long-read PacBio sequencing have inherent limitations in detecting DNA modifications. Here, using raw electric signals of Oxford Nanopore long-read sequencing data, we design DeepMod, a bidirectional recurrent neural network (RNN) with long short-term memory (LSTM) to detect DNA modifications. We sequence a human genome HX1 and a Chlamydomonas reinhardtii genome using Nanopore sequencing, and then evaluate DeepMod on three types of genomes (Escherichia coli, Chlamydomonas reinhardtii and human genomes). For 5mC detection, DeepMod achieves average precision up to 0.99 for both synthetically introduced and naturally occurring modifications. For 6mA detection, DeepMod achieves ~0.9 average precision on Escherichia coli data, and outperforms existing methods on Chlamydomonas reinhardtii data. In conclusion, DeepMod performs well for genome-scale detection of DNA modifications and will facilitate epigenetic analysis on diverse species.


Subjects
Chlamydomonas reinhardtii/genetics; DNA Methylation; Escherichia coli/genetics; Genome, Bacterial/genetics; Genome, Human/genetics; Genome, Plant/genetics; Neural Networks, Computer; Databases, Nucleic Acid; Epigenesis, Genetic; Humans; Nanopores
11.
BMC Genomics ; 20(1): 459, 2019 Jun 06.
Article in English | MEDLINE | ID: mdl-31170919

ABSTRACT

BACKGROUND: The most widely used human genome reference assembly, hg19, harbors minor alleles at 2.18 million positions, as revealed by the 1000 Genomes Phase 3 dataset. Although this is less than 2% of the 89 million variants reported, it has been shown that the minor alleles can result in 30% false positives in individual genomes, thus misleading and burdening downstream interpretation. More alarming is the fact that a significant percentage of variants that are homozygous recessive for these minor alleles, with potential disease implications, are masked from reporting. RESULTS: We have demonstrated that the false positives (FP) and false negatives (FN) can be corrected for by simply replacing nucleotides at the minor allele positions in hg19 with the corresponding major allele. Here, we have replaced 2.18 million minor alleles (single nucleotide polymorphisms (SNPs), insertions and deletions (INDELs), and multiple nucleotide polymorphisms (MNPs)) in hg19 with the corresponding major alleles to create an ethnically normalized reference genome called hg19KIndel. In doing so, hg19KIndel has both corrected for sequencing errors acknowledged to be present in hg19 and improved read alignment near the minor alleles in hg19. CONCLUSION: We have created and made available a new version of the human reference genome called hg19KIndel. It has been shown that variant calling using hg19KIndel significantly reduces false positive calls, which in turn reduces the burden on downstream analysis and validation. It also improves the calling of false negative variants, meaning that variants that were missed due to the presence of minor alleles in hg19 will now be called using hg19KIndel. Using hg19KIndel, one even gets a better mapping percentage when compared to the currently available human reference genome. The hg19KIndel reference genome and its auxiliary datasets are available at https://doi.org/10.5281/zenodo.2638113.
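
The core idea in this abstract, replacing the base at each minor-allele position with the major allele, can be shown for the single-nucleotide case with a minimal Python sketch. It is not the authors' pipeline: the function name and the input layout (0-based position mapped to an expected reference base and a major-allele base) are assumptions, and INDELs and MNPs are deliberately left out because they change coordinates and need extra bookkeeping.

```python
def apply_major_alleles(reference, substitutions):
    """Return a copy of the reference with minor-allele SNP bases replaced.

    substitutions maps a 0-based position to a pair
    (expected_reference_base, major_allele_base). The expected base is checked
    first so that a coordinate mix-up is reported instead of silently applied.
    """
    bases = list(reference)
    for position, (expected_base, major_base) in substitutions.items():
        if bases[position] != expected_base:
            raise ValueError(
                f"position {position}: expected {expected_base}, found {bases[position]}"
            )
        bases[position] = major_base
    return "".join(bases)
```

Because only same-length substitutions are applied, every downstream coordinate stays valid, which is why the SNP case is the simplest one to illustrate.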


Subjects
Ethnic Groups/genetics; Genetic Variation; Genome, Human; Alleles; Databases, Nucleic Acid; Humans; INDEL Mutation; Polymorphism, Single Nucleotide; Reference Standards; Sequence Analysis, DNA
12.
Nat Commun ; 10(1): 2557, 2019 Jun 11.
Article in English | MEDLINE | ID: mdl-31186421

ABSTRACT

Facial recognition from DNA refers to the identification or verification of unidentified biological material against facial images with known identity. One approach to establish the identity of unidentified biological material is to predict the face from DNA, and subsequently to match against facial images. However, DNA phenotyping of the human face remains challenging. Here, another proof of concept to biometric authentication is established by using multiple face-to-DNA classifiers, each classifying given faces by a DNA-encoded aspect (sex, genomic background, individual genetic loci), or by a DNA-inferred aspect (BMI, age). Face-to-DNA classifiers on distinct DNA aspects are fused into one matching score for any given face against DNA. In a globally diverse, and subsequently in a homogeneous cohort, we demonstrate preliminary, but substantial true (83%, 80%) over false (17%, 20%) matching in verification mode. Consequences of future efforts include forensic applications, necessitating careful consideration of ethical and legal implications for privacy in genomic databases.


Subjects
Biometric Identification; Face/anatomy & histology; Facial Recognition; Genotype; Adolescent; Adult; Aged; Aged, 80 and over; Body Height; Body Weight; Cohort Studies; Databases, Nucleic Acid; Female; Humans; Male; Middle Aged; Polymorphism, Single Nucleotide
13.
Nat Commun ; 10(1): 2569, 2019 Jun 12.
Article in English | MEDLINE | ID: mdl-31189880

ABSTRACT

Synonymous mutations have been viewed as silent mutations, since they only affect the DNA and mRNA, but not the amino acid sequence of the resulting protein. Nonetheless, recent studies suggest their significant impact on splicing, RNA stability, RNA folding, translation or co-translational protein folding. Hence, we compile 659194 synonymous mutations found in human cancer and characterize their properties. We provide the user-friendly, comprehensive resource for synonymous mutations in cancer, SynMICdb ( http://SynMICdb.dkfz.de ), which also contains orthogonal information about gene annotation, recurrence, mutation loads, cancer association, conservation, alternative events, impact on mRNA structure and a SynMICdb score. Notably, synonymous and missense mutations are depleted at the 5'-end of the coding sequence as well as at the ends of internal exons independent of mutational signatures. For patient-derived synonymous mutations in the oncogene KRAS, we indicate that single point mutations can have a relevant impact on expression as well as on mRNA secondary structure.


Subjects
Databases, Nucleic Acid; Gene Expression Regulation, Neoplastic/genetics; Neoplasms/genetics; Silent Mutation/genetics; Datasets as Topic; Humans; Mutation, Missense/genetics; Point Mutation/genetics; Proto-Oncogene Proteins p21(ras)/genetics; RNA Folding/genetics; RNA Splicing/genetics; RNA, Messenger/chemistry; RNA, Messenger/genetics
14.
Forensic Sci Int ; 301: 107-117, 2019 Aug.
Article in English | MEDLINE | ID: mdl-31153988

ABSTRACT

In the last year direct-to-consumer (DTC) genetic genealogy databases have been used to identify suspects and missing persons in over fifty cold cases, many of which have been unsolved for decades. Genealogists worked on these cases in collaboration with law enforcement agencies. Raw DNA data files were uploaded to the genealogy websites GEDmatch and FamilyTreeDNA, and identification was made by tracing the family trees of relatives who were predicted to be close genetic matches in the database. Such searches have far-reaching consequences because they affect not just those who have consented to upload their DNA results to these databases but also all of their relatives, regardless of whether or not they have taken a DNA test. This article provides an overview of the methods used, the potential privacy and security issues, and the wider implications for society. There is an urgent need for forensic scientists, bioethicists, law enforcement agencies, genetic genealogists and other interested parties to work together to produce international guidelines and policies to ensure that the techniques are used responsibly and effectively.


Subjects
Crime; DNA Fingerprinting; Databases, Nucleic Acid; Law Enforcement; Pedigree; Chromosomes, Human, Y; Genetic Privacy; Humans; Informed Consent; Microsatellite Repeats; Whole Genome Sequencing
15.
Forensic Sci Int ; 300: e13-e19, 2019 Jul.
Article in English | MEDLINE | ID: mdl-31056342

ABSTRACT

In the present case, because there were no database matches or available relatives, single-source DNA profiles from the unidentified deceased and the suspect could not establish their identities, and the investigation stalled. However, interpreting a simple mixture on the penile swab of the deceased and a complex mixture on the prayer flag wrapped around the ankles of the deceased provided a breakthrough for determining identity. Preliminary analysis using the separating method, or based on imbalanced peaks at the Amelogenin locus, revealed that each of the two DNA mixtures should have a female minor contributor; according to the investigation results, these contributors were likely to be sex workers. Consequently, blood samples from fifty-two women were collected for STR genotyping. Analysis of the two mixtures using LRmix Studio showed that the simple mixture is 4.7078 × 10^12 times more likely if it came from the deceased and the female numbered P0053 than if it came from the deceased and an unknown female, while the complex mixture is 8.1777 × 10^7 times more likely if it came from P0062, the deceased and the male suspect than if it came from the deceased and two unknowns. Subsequently, based on the clues provided by P0053 and P0062, the identities of the deceased and the suspect were successfully determined and the case was finally resolved. These results show the valuable evidence that can be obtained from mixtures and the high priority that should be placed on their analysis, especially for mixtures considered unlikely to yield complete single-source profiles by interpretation. In addition, the occurrence of a secondary DNA transfer was confirmed.


Subjects
DNA Fingerprinting; DNA/analysis; Forensic Genetics/methods; Microsatellite Repeats; Databases, Nucleic Acid; Female; Genotype; Humans; Male; Real-Time Polymerase Chain Reaction; Tibet
16.
BMC Bioinformatics ; 20(1): 216, 2019 Apr 29.
Article in English | MEDLINE | ID: mdl-31035936

ABSTRACT

BACKGROUND: The large biological databases such as GenBank contain vast numbers of records, the content of which is substantively based on external resources, including published literature. Manual curation is used to establish whether the literature and the records are indeed consistent. We explore in this paper an automated method for assessing the consistency of biological assertions, to assist biocurators, which we call BARC, Biocuration tool for Assessment of Relation Consistency. In this method a biological assertion is represented as a relation between two objects (for example, a gene and a disease); we then use our novel set-based relevance algorithm SaBRA to retrieve pertinent literature, and apply a classifier to estimate the likelihood that this relation (assertion) is correct. RESULTS: Our experiments on assessing gene-disease relations and protein-protein interactions using the PubMed Central collection show that BARC can be effective at assisting curators to perform data cleansing. Specifically, the results obtained showed that BARC substantially outperforms the best baselines, with an improvement of F-measure of 3.5% and 13%, respectively, on gene-disease relations and protein-protein interactions. We have additionally carried out a feature analysis that showed that all feature types are informative, as are all fields of the documents. CONCLUSIONS: BARC provides a clear benefit for the biocuration community, as there are no prior automated tools for identifying inconsistent assertions in large-scale biological databases.


Subjects
Algorithms; Data Mining/methods; Databases, Factual; Databases, Nucleic Acid; Humans; Protein Interaction Maps; Publishing
17.
Forensic Sci Int Genet ; 41: 83-92, 2019 Jul.
Article in English | MEDLINE | ID: mdl-31031230

ABSTRACT

For very serious crimes, reporting scientists often have to contend with complex cases where literally hundreds of items are submitted by investigators for analysis. In order to efficiently expedite the challenge of comparing reference profiles to evidence profiles, many of which are mixtures, we have developed an investigative open source expert system CaseSolver. We have analysed a real case based on GlobalFiler involving 119 evidence profiles and 3 reference profiles. To provide a demonstration of the power of the system we also added the three references to a fictive large database of 1 million individuals in order to test subsequent recovery of the presumed true contributors. CaseSolver was used on a Fusion 6C validation study involving 25 two- to four-person mixture profiles based on 14 reference profiles. The sequential use of simple allele comparison, the qualitative model (forensim) and the quantitative model (EuroForMix) makes the analysis very fast and accurate - and finally, the software generates a list of potential match candidates which can be exported as a report. From these two studies we found that the resolution of match candidates from CaseSolver was the same as that reported by a scientist who worked manually through the samples, except that CaseSolver highlighted two manual errors. For the validation study we found low template DNA samples giving negative results, which demonstrate the limitations of the tool; but overall our assessment shows that CaseSolver will benefit all analyses involving mixture interpretation and screening. Importantly, CaseSolver removes the very time-consuming aspect of manual comparison and gives improved quality by preventing manual errors.
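
The first screening stage named in this abstract, simple allele comparison, can be illustrated with a short Python sketch. It is not CaseSolver's code or API: the function names, the dictionary layout (locus name mapped to a list of allele labels), the scoring as a fraction of reference alleles seen in the evidence, and the 0.8 threshold are all assumptions. The qualitative and quantitative likelihood-ratio models that follow this stage are not shown.

```python
def matching_allele_fraction(reference, evidence):
    """Fraction of the reference profile's alleles that appear in the evidence profile.

    Both profiles map a locus name to a list of allele labels. Loci missing from
    the evidence profile are ignored, which is a simplifying assumption.
    """
    total = 0
    matched = 0
    for locus, reference_alleles in reference.items():
        if locus not in evidence:
            continue
        observed = set(evidence[locus])
        for allele in reference_alleles:
            total += 1
            if allele in observed:
                matched += 1
    return matched / total if total else 0.0


def candidate_matches(references, evidence, threshold=0.8):
    """Return, in sorted order, the names of references scoring at or above the threshold."""
    return sorted(
        name
        for name, profile in references.items()
        if matching_allele_fraction(profile, evidence) >= threshold
    )
```

A cheap filter of this kind narrows the list of reference profiles before the more expensive probabilistic models are run, which matches the sequential design the abstract describes.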


Subjects
DNA Fingerprinting; Expert Systems; Software; Databases, Nucleic Acid; Forensic Genetics; Gene Frequency; Humans; Likelihood Functions; Microsatellite Repeats
19.
Methods Mol Biol ; 1962: 215-226, 2019.
Article in English | MEDLINE | ID: mdl-31020563

ABSTRACT

DDBJ Fast Annotation and Submission Tool (DFAST) is a genome annotation pipeline for prokaryotes, which also assists data submission to the public sequence database. It is available both as a web service and as a stand-alone tool that runs on local machines. DFAST can annotate a typical-sized bacterial genome within 5 min. The default annotation workflow contains a gene prediction phase for protein coding sequence, rRNA, tRNA, and CRISPR, and a functional annotation phase to infer protein functions. DFAST generates result files in standard annotation formats and data files for submission to DNA Data Bank of Japan (DDBJ). In this chapter, the annotation workflow and applications of DFAST are introduced.


Subjects
Databases, Nucleic Acid; Molecular Sequence Annotation/methods; Prokaryotic Cells; Software; Clustered Regularly Interspaced Short Palindromic Repeats; Data Display; Genome, Bacterial; Internet; Proteins/genetics; Pseudogenes; Publishing; RNA, Ribosomal; RNA, Transfer; Workflow
20.
Methods Mol Biol ; 1970: 15-30, 2019.
Article in English | MEDLINE | ID: mdl-30963485

ABSTRACT

MicroRNA (miRNA) studies deliver numerous types of information including miRNA identification, sequence of miRNAs, target prediction, roles in diseases, and interactions in signaling pathways. Considering the different types of miRNA data, the number of miRNA databases has been increasing quickly. While resources have been planned to simplify miRNA analysis, scientists are facing the challenging task of choosing the most proper tool to retrieve related information. In this chapter, we introduce the use of miRandb, a resource that we have established to present an outline of different types of miRNA online resources and to simplify finding the right miRNA information that scientists need for their research. miRandb offers a user-friendly platform to find related information about any miRNA data among more than 188 present miRNA databases. miRandb has an easy procedure, and information can be retrieved by miRNA category resources. Each database comprises numerous kinds of information including database activity, description, main and unique features, organism, URL, publication, category, published year, citations per year, last update, and relative popularity. miRandb provides several opportunities and facilitates access to diverse classes of microRNA resources. miRandb is available at http://miRandb.ir .


Subjects
Computational Biology/methods; Databases, Nucleic Acid; Internet; MicroRNAs/genetics; RNA, Messenger/genetics; Software; Binding Sites; Gene Expression Regulation; Humans; MicroRNAs/metabolism; RNA, Messenger/metabolism