ABSTRACT
MOTIVATION: Simple tandem repeats, microsatellites in particular, have regulatory functions, links to several diseases, and applications in biotechnology. There is an immediate need for an accurate tool for detecting microsatellites in newly sequenced genomes. Currently available tools are either sensitive or specific, but not both; some require manual parameter adjustment. RESULTS: We propose Look4TRs, the first application of self-supervised hidden Markov models to discovering microsatellites. Look4TRs calibrates itself automatically, adapting to the input genome to balance high sensitivity against a low false positive rate. We evaluated Look4TRs on 26 eukaryotic genomes. Based on the F measure, which combines sensitivity and false positive rate, Look4TRs outperformed TRF and MISA, the most widely used tools, by 78% and 84%, respectively. Look4TRs outperformed the second and third best tools, MsDetector and Tantan, by 17% and 34%, respectively. On eight bacterial genomes, Look4TRs outperformed the second and third best tools by 27% and 137%, respectively. AVAILABILITY AND IMPLEMENTATION: https://github.com/TulsaBioinformaticsToolsmith/Look4TRs. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subjects
Genomics, Eukaryotes, Bacterial Genome, Microsatellite Repeats, Software
ABSTRACT
Gene expression measurement techniques such as quantitative reverse transcriptase (qRT)-PCR require a normalization strategy to allow meaningful comparisons across biological samples. Typically, this is accomplished by using an endogenous housekeeping gene that is presumed to show stable expression levels in the samples under study. There is concern regarding how precisely specific genes can be measured in limited amounts of mRNA, such as those from microdissected (MD) tissues. To address this issue, we evaluated three different approaches for qRT-PCR normalization of dissected samples: cell count during microdissection, total RNA measurement, and endogenous control genes. The data indicate that both cell count and total RNA are useful in calibrating input amounts at the outset of a study, but do not provide enough precision to serve as normalization standards. However, endogenous control genes can accurately determine the abundance of a target gene relative to the entire cellular transcriptome. Taken together, these results suggest that precise gene expression measurements can be made from MD samples if the appropriate normalization strategy is employed.
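The abstract does not state how the endogenous control is applied numerically. As a minimal sketch, assuming the widely used comparative Ct approach and roughly 100% amplification efficiency (a doubling per cycle), normalization against a control gene could look like the following. The function names and Ct values are hypothetical and are not taken from the study.

```python
def relative_expression(ct_target, ct_control, efficiency=2.0):
    """Abundance of a target gene relative to an endogenous control gene.

    Assumes both assays amplify with the same per-cycle efficiency
    (2.0 means perfect doubling); this is an assumption, not a study result.
    """
    delta_ct = ct_target - ct_control  # higher Ct means less starting template
    return efficiency ** (-delta_ct)


def fold_change(ct_target_sample, ct_control_sample,
                ct_target_calibrator, ct_control_calibrator,
                efficiency=2.0):
    """Fold change of the target in a sample versus a calibrator sample,
    each first normalized to its own endogenous control (delta-delta Ct)."""
    delta_sample = ct_target_sample - ct_control_sample
    delta_calibrator = ct_target_calibrator - ct_control_calibrator
    return efficiency ** (-(delta_sample - delta_calibrator))


if __name__ == "__main__":
    # Hypothetical Ct values for one microdissected sample and one calibrator.
    print(relative_expression(ct_target=25.0, ct_control=20.0))
    print(fold_change(24.0, 20.0, 26.0, 20.0))
```

Because each sample is normalized to its own control gene, differences in input amount (for example, the number of dissected cells) cancel out of the ratio, which is the property the abstract attributes to endogenous controls.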