ABSTRACT
MOTIVATION: Substrings of length k, commonly referred to as k-mers, play a vital role in sequence analysis. However, k-mers can only match sequences exactly, which has motivated alternative constructs. We recently introduced one such class of constructs, strobemers, which can match across substitutions and small insertions and deletions. Randstrobes, the most sensitive strobemer type proposed in Sahlin (Effective sequence similarity detection with strobemers. Genome Res 2021a;31:2080-94. https://doi.org/10.1101/gr.275648.121), have been used in several bioinformatics applications such as read classification, short-read mapping, and read overlap detection. Recently, we showed that the more pseudo-random the behavior of the construction (measured in entropy), the more efficient the seeds are for sequence similarity analysis. The level of pseudo-randomness depends on the construction operators, but no study has investigated the efficacy of these operators. RESULTS: In this study, we introduce novel construction methods, including a Binary Search Tree-based approach that improves time complexity over previous methods. To our knowledge, we are also the first to address biases in construction, and we design three metrics for measuring bias. Our evaluation shows that our methods have favorable speed and sampling uniformity compared to existing approaches. Lastly, guided by our results, we change the seed construction in strobealign, a short-read mapper, and find that the results change substantially. We suggest combining the two results to improve strobealign's accuracy for the shortest reads in our evaluated datasets. Our evaluation highlights sampling biases that can occur and provides guidance on which operators to use when implementing randstrobes. AVAILABILITY AND IMPLEMENTATION: All methods and evaluation benchmarks are available in a public GitHub repository at https://github.com/Moein-Karami/RandStrobes.
The scripts for running the strobealign analysis are found at https://github.com/NBISweden/strobealign-evaluation.
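For readers unfamiliar with the construct, the following is a minimal sketch of order-2 randstrobe construction, not the implementation evaluated in the study. It assumes a BLAKE2b-based k-mer hash and an XOR link operator purely for illustration; the function names `kmer_hash` and `randstrobes` and the parameters `w_min` and `w_max` are hypothetical and do not come from the repositories above.

```python
import hashlib


def kmer_hash(kmer: str) -> int:
    """Deterministic 64-bit hash of a k-mer (illustrative choice of hash)."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def randstrobes(seq: str, k: int, w_min: int, w_max: int) -> list[tuple[int, int]]:
    """Return (i, j) start positions of the two strobes of each seed.

    For the first strobe at position i, the second strobe is the k-mer
    starting in the window [i + w_min, i + w_max] that minimizes the link
    operator, here assumed to be hash(strobe1) XOR hash(strobe2).
    """
    n = len(seq) - k + 1  # number of k-mers in seq
    hashes = [kmer_hash(seq[p:p + k]) for p in range(n)]
    seeds = []
    for i in range(n):
        lo = i + w_min
        hi = min(i + w_max, n - 1)  # clip the window at the sequence end
        if lo > hi:
            break  # no room left for a second strobe
        j = min(range(lo, hi + 1), key=lambda p: hashes[i] ^ hashes[p])
        seeds.append((i, j))
    return seeds


if __name__ == "__main__":
    for i, j in randstrobes("ACGTACGTTGCAAGCTTAGC", k=3, w_min=3, w_max=6):
        print(i, j)
```

Swapping the expression inside `key=` is how a different link operator would be tried in this sketch; which operators sample the window most uniformly is the question the study evaluates.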
ABSTRACT
Fraud in the reward-based crowdfunding market has been of concern to regulators, but it is arguably of greater importance to the nascent industry itself. Despite its significance for entrepreneurial finance, our knowledge of the occurrence, determinants, and consequences of fraud in this market, as well as the implications for the business ethics literature, remains limited. In this study, we conduct an exhaustive search of all media reports on Kickstarter campaign fraud allegations from 2010 through 2015. We then follow up until 2018 to assess the ultimate outcome of each allegedly fraudulent campaign. First, we construct a sample of 193 fraud cases and categorize them into detected vs. suspected fraud, based on a set of well-defined criteria. Next, using multiple matched samples of non-fraudulent campaigns, we determine which features are associated with a higher probability of fraudulent behavior. Second, we document the short-term negative consequences of possible breaches of trust in the market, using a sample of more than 270,000 crowdfunding campaigns from 2010 through 2018 on Kickstarter. Our results show that crowdfunding projects launched around the public announcement of a late and significant misconduct detection (resulting in suspension) tend to have a lower probability of success, raise less funding, and attract fewer backers. Supplementary Information: The online version contains supplementary material available at 10.1007/s10551-021-04942-w.