Using Nonprobability Survey Samples Can Be a Dangerous Gamble
February 2026
Survey data is only as strong as the foundation on which it rests.
Nonprobability surveys have long been attractive for their low cost, speed, and reach. For a while, they seemed “good enough.” But today’s digital environment, shaped by shifting recruitment models and AI-driven fraud, has created new and significant uncertainties.
Leaders rely on survey data to guide policy, allocate resources, and understand the communities they serve. The question isn’t how fast you can gather responses. It’s whether you can trust the story those responses tell.
That’s where the divide between probability and nonprobability methods becomes critical. Nonprobability samples no longer offer a clear, measurable connection to the populations they aim to represent. Probability samples do. The difference affects accuracy, credibility, and every decision built on the data.
Two Approaches, Very Different Odds
Probability sampling begins with a verified sampling frame, such as a national address file, that gives every person in the target population a known and quantifiable chance of selection. Because recruitment is random, controlled, and transparent, researchers can measure sampling error, evaluate bias, and generalize results to the broader population with accuracy.
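To make the "known and quantifiable chance of selection" concrete, here is a minimal sketch in Python of the design-based margin of error for a proportion, assuming simple random sampling from a complete frame. The function name and figures are illustrative, not NORC's production methodology.

```python
import math

def srs_margin_of_error(p_hat, n, N=None, z=1.96):
    """Design-based margin of error for a proportion under simple random sampling.

    p_hat : estimated proportion from the sample
    n     : number of completed interviews
    N     : population size (optional finite-population correction)
    z     : critical value for the desired confidence level (1.96 ~ 95%)
    """
    var = p_hat * (1 - p_hat) / n
    if N is not None:
        var *= (N - n) / (N - 1)  # finite-population correction
    return z * math.sqrt(var)

# Illustrative only: a 1,000-person probability sample estimating 50% support
print(round(srs_margin_of_error(0.50, 1000), 3))  # ~0.031, i.e., +/- 3.1 points
```

Because the selection mechanism is known, an error bound like this can be stated before a single response arrives; no analogous calculation exists for an opt-in sample.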
Nonprobability sampling works fundamentally differently. Participation depends on who happens to see an ad, access a rewards program, or click a link online. There’s no way to know who had the chance to participate, or who never saw the invitation at all. People who volunteer tend to share similar motivations or behaviors, making it impossible to verify whether they reflect the full population.
While one can find isolated cases where nonprobability results resemble those from probability surveys, the overwhelming evidence shows that probability methods are consistently more accurate (Callegaro et al., 2014; Cornesse et al., 2020; Elliott & Valliant, 2017; Malhotra & Krosnick, 2007; Yeager et al., 2011). More concerning, when nonprobability surveys are wrong, they can be dramatically wrong. Using them is a blind gamble: Are your results valid, or fatally flawed? Before the data arrive, there’s no way to know.
The Players Matter
Who participates in a survey matters just as much as how the survey is conducted. Nonprobability participants often represent a narrow slice of the public. Because participation isn’t random, the sample typically tilts toward more engaged, more opinionated, and more digitally connected individuals.
Opt-in vendors recruit through whichever partner networks they maintain, so the sample inherits the biases of those networks. If a panel is tied to a travel rewards program, frequent flyers are overrepresented. If a study relies on online channels, historically underrepresented groups and people with low digital literacy can fall short of their population share by as much as 40 percentage points.
But the gamble goes deeper than demographic tilt. Without a master list of the target population, there is no way to draw a true random sample. Every entry flows through uncontrolled pathways, making it impossible to verify who, or what, is providing the responses. At best, you hope the data “look right.” At worst, insights, policies, and forecasts rest on systematically distorted information.
Fraud: When the Game Is Rigged
We are now in a moment where the very viability of nonprobability surveys is in question. Evidence shows that many nonprobability surveys receive more responses from AI agents and bots than from real people.
Today’s low-barrier opt-in environments are easily exploited. Bots and bot-human hybrids use automation, VPN/proxy rotation, and sophisticated large language models (LLMs) to blend in. Fraud rates of 15 to 30 percent are common, with many platforms seeing 45 percent or more (Imperva, 2024; McCarthy, 2022; Maimarides, 2025). In social media-recruited surveys, fraud can reach 90 percent (Pinzón et al., 2024).
Meanwhile, incentives magnify the problem. Monetary rewards reliably attract fraudsters (Teitcher et al., 2015; Bowen et al., 2008). Buchanan and Scofield (2018) found that some actors cheat not for money but simply to cause disruption. This fraud undermines data quality at its core.
LLM-driven agents now read survey questions, maintain coherent personas, generate polished open-ended responses, and pass attention checks at near-perfect rates, 99.8 percent in one proof-of-concept study (Westwood, 2025). Residential proxies and clean IP addresses further obscure detection. The resulting corruption can swing estimates by double digits. In some recent cases, probability surveys estimated a 5 percent prevalence on a metric while nonprobability samples reported 45 to 55 percent.
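The arithmetic behind such swings is straightforward: the observed estimate is a mixture of genuine and fraudulent responses. The sketch below uses the illustrative 5 percent true prevalence from the text; the assumed 50 percent bot endorsement rate and the fraud shares are hypothetical values chosen to show the scale of the distortion.

```python
def contaminated_estimate(p_true, p_bot, fraud_share):
    """Expected survey estimate when a share of responses comes from bots.

    p_true      : true prevalence among genuine respondents
    p_bot       : rate at which fraudulent responses endorse the item
    fraud_share : fraction of completes that are fraudulent
    """
    return (1 - fraud_share) * p_true + fraud_share * p_bot

# Illustrative only: a 5% true prevalence with increasing bot contamination
for f in (0.0, 0.3, 0.6, 0.9):
    print(f"fraud share {f:.0%}: estimate {contaminated_estimate(0.05, 0.50, f):.1%}")
# 0% -> 5.0%, 30% -> 18.5%, 60% -> 32.0%, 90% -> 45.5%
```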
Such corruption is not simple bias; it is outright data failure. Researchers increasingly discard entire data collections and re-field surveys to get reliable results.
Recent findings underscore the danger:
- Valid responses may have plummeted from 75 percent to just 10 percent in recent years (Pinzón et al., 2024).
- Multiple studies document identity falsification rates above 80 percent (Bell & Gift, 2023) and bot infiltration rates above 90 percent, including a case where only three of 981 responses were real (Krawczyk & Siek, 2024).
- In sector-specific work, the pattern holds: only 4 percent of 2,622 responses in a beekeeping survey were legitimate (Goodrich et al., 2023).
- Social media recruitment is especially vulnerable, with studies reporting fraud rates of 94 to 95 percent (Pozzar et al., 2020; Imes et al., 2024).
Bots and bot-human hybrids routinely bypass honeypots, CAPTCHAs, and attention checks, while overly strict screens risk excluding real respondents and introducing new biases (Pozzar et al., 2020; Teitcher et al., 2015; Carballo-Diéguez & Strecher, 2012). The Insights Association considers survey fraud an existential threat. While nonprobability vendors claim to have solutions, there is no independent evidence that their proprietary approaches consistently reduce fraud or improve accuracy.
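For illustration only, the sketch below shows the kind of generic post-hoc screens (speeding, shared IP addresses, honeypot fields) that studies like those cited above rely on. The column names and thresholds are hypothetical, not any vendor's method, and as the evidence shows, determined bots increasingly pass checks of this kind.

```python
import pandas as pd

def flag_suspect_responses(df, min_seconds=120, max_per_ip=3):
    """Generic, illustrative screening heuristics; column names are hypothetical.

    Flags completes that finish implausibly fast, share an IP address with many
    other completes, or fill a hidden 'honeypot' field that humans never see.
    """
    flags = pd.DataFrame(index=df.index)
    flags["speeder"] = df["duration_seconds"] < min_seconds
    flags["shared_ip"] = (
        df.groupby("ip_address")["ip_address"].transform("size") > max_per_ip
    )
    flags["honeypot"] = df["honeypot_field"].fillna("").astype(str).str.len() > 0
    flags["any_flag"] = flags.any(axis=1)
    return flags
```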
The most credible mitigation is the one nonprobability methods fundamentally lack: using verified frames, controlled recruitment, identity validation, throttling, and transparent response tracking, all inherent features of probability‑based approaches (Lawlor et al., 2021; Wardrop, 2025).
How AmeriSpeak Avoids These Pitfalls
In a landscape full of uncertainty and AI‑driven distortion, maintaining data quality requires structural safeguards. Two design features distinguish NORC’s AmeriSpeak® panel from nonprobability sources.
First, AmeriSpeak uses an area‑probability design drawn from the NORC National Sample Frame, a comprehensive list of verified U.S. households. Respondents do not self‑select; they are invited through scientific random sampling.
Second, recruitment involves direct, person‑to‑person contact. Through door‑to‑door outreach and telephone engagement, AmeriSpeak has met half of its panelists in person and spoken with another quarter by phone. Panel members are real, verified individuals recruited through controlled, high‑integrity processes.
What This Means for Your Research
Nonprobability sources invite fraud, amplify bias, and erode trust in surveys.
Vendors operate in a black box: recruitment channels are vague, fraud detection methods are proprietary, and evidence of effectiveness is lacking. While they highlight isolated "success stories," independent reviews from Pew Research Center, the American Association for Public Opinion Research (AAPOR), and leading scientific assessments consistently find larger bias in nonprobability samples than in probability samples, and most of that research was conducted before today's explosion of AI-driven fraud.
Probability samples aren’t perfect, but their errors are measurable, bounded, and understood. Nonprobability surveys are a gamble with unknown odds and catastrophic consequences.
The stakes are too high. If you care about credibility, confidence, and truth, there’s only one choice: probability sampling.
References
Bell, A. M. & Gift, T. (2023). Fraud in online surveys: Evidence from a nonprobability, subpopulation sample. Journal of Experimental Political Science, 10(1), 148–153. https://doi.org/10.1017/XPS.2022.8
Bowen, A. M., Daniel, C. M., Williams, M. L. & Baird, G. L. (2008). Identifying multiple submissions in Internet research: Preserving data integrity. AIDS and Behavior, 12(6), 964–973. https://doi.org/10.1007/s10461-007-9352-2
Buchanan, E. M. & Scofield, J. E. (2018). Methods to detect low quality data and its implication for psychological research. Behavior Research Methods, 50, 2586–2596. https://doi.org/10.3758/s13428-018-1035-6
Callegaro, M., Villar, A., Yeager, D. & Krosnick, J. A. (2014). A critical review of studies investigating the quality of data obtained with online panels based on probability and nonprobability samples. In M. Callegaro et al. (Eds.), Online panel research: A data quality perspective (pp. 23–53). Wiley.
Carballo-Diéguez, A. & Strecher, V. (2012). Data quality in web-based HIV/AIDS research: Handling invalid and suspicious data. Field Methods, 24(3), 272–291. https://doi.org/10.1177/1525822X12443097
Cornesse, C., Blom, A. G., Dutwin, D., Krosnick, J. A., De Leeuw, E. D., Legleye, S., Wenz, A., et al. (2020). A review of conceptual approaches and empirical evidence on probability and nonprobability sample survey research. Journal of Survey Statistics and Methodology, 8(1), 4–36. https://doi.org/10.1093/jssam/smz041
Elliott, M. N. & Valliant, R. (2017). Inference for nonprobability samples. Statistical Science, 32(2), 249–264. https://doi.org/10.1214/16-STS598
Goodrich, B., Fenton, M., Penn, J., Bovay, J. & Mountain, T. (2023). Battling bots: Experiences and strategies to mitigate fraudulent responses in online surveys. Applied Economic Perspectives and Policy, 45(2), 762–784. https://doi.org/10.1002/aepp.13353
Imes, C. C., Baniak, L. M., Luyster, F. S., Morris, J. L. & Orbell, S. L. (2024). Attack of the (survey) bots: How we determined that our anonymous survey was full of false data. Circulation, 149(Suppl_1), AP460. https://doi.org/10.1161/circ.149.suppl_1.P460
Imperva. (2024). 2024 Bad Bot Report. https://www.imperva.com/resources/resource-library/reports/2024-bad-bot-report/
Krawczyk, M. & Siek, K. A. (2024). When research becomes all about the bots: A case study on fraud prevention and participant validation. In Extended Abstracts of the CHI Conference on Human Factors in Computing Systems. https://doi.org/10.1145/3613905.3637109
Lawlor, J., Thomas, C., Guhin, A. T., Kenyon, K. & Lerner, M. D. (2021). Suspicious and fraudulent online survey participation: Introducing the REAL framework. Methodological Innovations, 14(3). https://doi.org/10.1177/20597991211050467
Malhotra, N. & Krosnick, J. A. (2007). The effect of survey mode and sampling on inferences about political attitudes and behavior. Political Analysis, 15, 286–323. https://www.jstor.org/stable/25791896
Maimarides, A. (2025, August 5). When bots take surveys: AI’s impact on the integrity of market research. Lab42. https://www.lab42.com/blog/when-bots-take-surveys-ais-impact-on-the-integrity-of-market-research
McCarthy, T. (2022, May 31). How to stop professional survey cheaters. Quirk’s Media. https://www.quirks.com/articles/how-to-stop-professional-survey-cheaters
Pinzón, N., Koundinya, V., Galt, R. E., Dowling, W. O. R., Baukloh, M., Taku-Forchu, N. C. & Pathak, T. B. (2024). AI-powered fraud and the erosion of online survey integrity. Frontiers in Research Metrics and Analytics, 9. https://doi.org/10.3389/frma.2024.1432774
Pozzar, R., Hammer, M. J., Underhill-Blazey, M., Wright, A. A., Tulsky, J. A. & Berry, D. L. (2020). Threats of bots and other bad actors to data quality. Journal of Medical Internet Research, 22(10), e23021. https://doi.org/10.2196/23021
Teitcher, J. E., Bockting, W. O., Bauermeister, J. A., Hoefer, C. J., Miner, M. H. & Klitzman, R. L. (2015). Detecting, preventing, and responding to fraudsters in internet research. Journal of Law, Medicine & Ethics, 43(1), 116–133. https://doi.org/10.1111/jlme.12200
Wardrop, B. (2025, March 17). Tackling survey fraud: Insights from CloudResearch & Toluna. CloudResearch. https://www.cloudresearch.com/resources/blog/survey-fraud-toluna-cloudresearch-quirks-la/
Westwood, S. J. (2025). The potential existential threat of large language models to online survey research. Proceedings of the National Academy of Sciences of the United States of America, 122, e2518075122. https://doi.org/10.1073/pnas.2518075122
Yeager, D. S., Krosnick, J. A., Chang, L., Javitz, H. S., Levendusky, M. S., Simpser, A. & Wang, R. (2011). Comparing the accuracy of RDD telephone surveys and internet surveys. Public Opinion Quarterly, 75(4), 709–747. https://doi.org/10.1093/poq/nfr020
Suggested Citation
Bilgen, I. & Dutwin, D. (2026, February 6). Using Nonprobability Survey Samples Can Be a Dangerous Gamble. NORC at the University of Chicago. Retrieved from www.norc.org.