ABSTRACT
The degree to which unimodal circular data are concentrated around the mean direction can be quantified using the mean resultant length, a measure known under many alternative names, such as the phase locking value or the Kuramoto order parameter. For maximal concentration, achieved when all of the data take the same value, the mean resultant length attains its upper bound of one. However, for a random sample drawn from the circular uniform distribution, the expected value of the mean resultant length approaches its lower bound of zero only as the sample size tends to infinity. Moreover, because the expected value of the mean resultant length depends on the sample size, bias is induced when comparing the mean resultant lengths of samples of different sizes. To ameliorate this problem, we introduce a re-normalized version of the mean resultant length. Regardless of the sample size, the re-normalized measure has an expected value that is essentially zero for a random sample from the circular uniform distribution, takes intermediate values for partially concentrated unimodal data, and attains its upper bound of one for maximal concentration. The re-normalized measure retains the simplicity of the original mean resultant length and is, therefore, easy to implement and compute. We illustrate the relevance and effectiveness of the proposed re-normalized measure using mathematical models and electroencephalographic recordings of an epileptic seizure.
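The sample-size dependence described above can be seen in a small simulation. The following is a minimal sketch of the standard (not the re-normalized) mean resultant length, whose definition as the modulus of the average unit phasor is well established. The function names `mean_resultant_length` and `expected_uniform_mrl` and the Monte Carlo settings are illustrative assumptions, not taken from the paper, and the re-normalized measure itself is not reproduced here because the abstract does not state its formula.

```python
import cmath
import math
import random


def mean_resultant_length(angles):
    """Standard mean resultant length: |(1/n) * sum(exp(i*theta))|, angles in radians."""
    n = len(angles)
    return abs(sum(cmath.exp(1j * a) for a in angles)) / n


def expected_uniform_mrl(n, trials=2000, seed=0):
    """Monte Carlo estimate of E[R] for n angles drawn uniformly on the circle.

    Hypothetical helper for illustration; `trials` and `seed` are arbitrary choices.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        sample = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
        total += mean_resultant_length(sample)
    return total / trials


if __name__ == "__main__":
    # Identical angles give the upper bound of one.
    print(mean_resultant_length([0.7] * 10))
    # For uniform samples the estimate shrinks with n but stays above zero.
    for n in (5, 50, 200):
        print(n, expected_uniform_mrl(n, trials=500))
```

Running the script should show the estimated expectation under uniformity decreasing as the sample size grows while remaining strictly positive, which is the bias that arises when samples of different sizes are compared.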
ABSTRACT
Recently described stochastic models of protein evolution have demonstrated that including structural information in addition to amino acid sequences leads to more reliable estimation of evolutionary parameters. We present a generative, evolutionary model of protein structure and sequence that is valid on a local length scale. The model concerns the local dependencies between sequence and structure evolution in a pair of homologous proteins. The evolutionary trajectory between the two structures in the protein pair is treated as a random walk in dihedral angle space, which is modeled using a novel angular diffusion process on the two-dimensional torus. Coupling sequence and structure evolution in our model allows for modeling both "smooth" conformational changes and "catastrophic" conformational jumps, conditioned on the amino acid changes. The model has interpretable parameters and is more realistic than previous stochastic models, providing new insights into the relationship between sequence and structure evolution. For example, using the trained model we were able to identify an apparent sequence-structure evolutionary motif present in a large number of homologous protein pairs. The generative nature of our model enables us to evaluate its validity and its ability to simulate aspects of protein evolution conditioned on an amino acid sequence, a related amino acid sequence, a related structure or any combination thereof.
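To make the idea of a random walk in dihedral angle space concrete, the sketch below simulates a plain wrapped Gaussian random walk for a single pair of angles on the two-dimensional torus. This is a deliberately simplified stand-in and not the angular diffusion process introduced in the paper: it has no drift, no stationary distribution, and no coupling to sequence. The names `wrap` and `torus_random_walk` and the step size `sigma` are hypothetical choices made for illustration.

```python
import math
import random


def wrap(angle):
    """Map an angle in radians onto the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def torus_random_walk(phi0, psi0, steps, sigma=0.05, seed=0):
    """Wrapped Gaussian random walk for one (phi, psi) pair on the torus.

    Each step adds independent N(0, sigma^2) noise to both angles and wraps
    the result, so the path never leaves the torus. Assumed toy dynamics only.
    """
    rng = random.Random(seed)
    path = [(wrap(phi0), wrap(psi0))]
    for _ in range(steps):
        phi, psi = path[-1]
        path.append((wrap(phi + rng.gauss(0.0, sigma)),
                     wrap(psi + rng.gauss(0.0, sigma))))
    return path


if __name__ == "__main__":
    # Start near a typical alpha-helical region (values in radians, illustrative).
    trajectory = torus_random_walk(-1.0, -0.8, steps=1000)
    print(trajectory[0], trajectory[-1])
```

Small values of `sigma` give the kind of gradual drift that the abstract calls "smooth" change. The "catastrophic" jumps conditioned on amino acid changes would need an additional mechanism that this sketch does not attempt to model.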