ABSTRACT
Data from single-molecule DNA-protein experiments, such as those using optical traps or magnetic tweezers, typically contain steps, plateaus, or dwell regions that are obscured by thermal and other noise sources. We present a nonparametric method for detecting step-like features in noisy biological data sets. Our algorithm does not assume that the steps can be modeled as Heaviside functions or as any other particular parametric form. Nor does it make assumptions about the noise, such as whether it is Gaussian or colored. Instead, to detect plateaus, the algorithm uses the novel method of analyzing a probability distribution function of the data values. The vast majority of previously published methods for step detection rely on statistical fitting of step functions, in which flat segments are linked by vertical segments. Our approach is intended for data that cannot be modeled as a series of step functions, but it applies to step functions as a special case. Data traces of this type have, so far, been difficult to characterize effectively. We examine the performance of the algorithm through systematic simulation studies and illustrate its use by analyzing single-molecule DNA-protein micromanipulation experiments carried out in our laboratory. The simulation results and experimental validation suggest that our method is robust, avoids overfitting, and functions effectively in the presence of noise sources characteristic of single-molecule experiments.
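To make the central idea concrete, the following minimal Python sketch shows how plateaus can appear as peaks in the distribution of data values, independent of how the transitions between them look. It is an illustration only and not the algorithm presented in this work: the function name `candidate_plateau_levels`, the fixed-width histogram, and the parameters `bin_width` and `min_fraction` are all assumptions introduced here.

```python
import math
import random
from collections import Counter


def candidate_plateau_levels(trace, bin_width, min_fraction=0.05):
    """Return data values at which the histogram of `trace` has a prominent peak.

    Hypothetical helper for illustration: a plateau (dwell) contributes many
    samples near one value, so it shows up as a peak in the value distribution.
    """
    if not trace:
        return []
    lo = min(trace)
    # Histogram of the data values; the time ordering is deliberately ignored.
    counts = Counter(math.floor((x - lo) / bin_width) for x in trace)
    n_bins = max(counts) + 1
    hist = [counts.get(i, 0) for i in range(n_bins)]
    threshold = min_fraction * len(trace)
    levels = []
    for i, c in enumerate(hist):
        left = hist[i - 1] if i > 0 else 0
        right = hist[i + 1] if i + 1 < n_bins else 0
        # Keep bins that are well populated and not lower than their neighbours.
        if c >= threshold and c > left and c >= right:
            levels.append(lo + (i + 0.5) * bin_width)  # report the bin center
    return levels


if __name__ == "__main__":
    rng = random.Random(0)
    # Two noisy plateaus joined by a gradual (non-vertical) ramp.
    trace = [rng.gauss(0.0, 0.3) for _ in range(500)]
    trace += [rng.gauss(5.0 * k / 50, 0.3) for k in range(50)]
    trace += [rng.gauss(5.0, 0.3) for _ in range(500)]
    print(candidate_plateau_levels(trace, bin_width=0.5, min_fraction=0.1))
```

The sketch never fits a step function, so the ramp between the two plateaus does not need to be vertical. An unsmoothed histogram is sensitive to the choice of bin width, so a practical implementation would need a more careful estimate of the distribution.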