Finding the Right Words: Improving How We Reach Spanish-Speaking Households
Author
Senior Research Methodologist
May 2025
By comparing methods for identifying Spanish-speaking households, we found that “Bayesian Improved Surname Geocoding” was the best single approach.
As a survey methodologist, I’ve always been fascinated by the intersection of mathematics and human behavior. In my recent study published in Survey Practice, I evaluated three different techniques for identifying Spanish-speaking households before sending out survey materials. This might seem like a small detail in the survey process, but it has significant implications for data quality, cost-effectiveness, and ensuring everyone has their voice heard.
Why We Need to Get Translation Right
When designing surveys, one of the biggest challenges is ensuring a high response rate from all participants—regardless of what language they speak at home. It’s crucial to accurately identify Spanish-speaking households so that we can effectively reach them and ensure that survey results reflect the full experience of the U.S. population.
There are practical reasons to identify Spanish-speaking households in advance, too. Printing and mailing bilingual materials to every household would be prohibitively expensive, especially for surveys that include lengthy paper questionnaires. By sending Spanish-language survey materials only to the households that need them, we can improve response rates while keeping costs manageable.
We Tested Three Approaches
My colleagues and I evaluated three different methods for identifying likely Spanish-speaking households:
- Vendor data: We used commercial data that indicate whether a household has previously received mail in Spanish or has been identified as Spanish-speaking. These data were available in all but three states, which had laws restricting this type of data collection.
- Census data: We used American Community Survey information to identify neighborhoods with high concentrations of Spanish speakers, allowing us to target entire areas.
- Bayesian Improved Surname Geocoding (BISG): This statistical technique combines geographic location and surnames to predict the likelihood that household members are Hispanic. We used this likelihood as a proxy for identifying likely Spanish-speaking households in states where vendor data weren’t available.
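To make the BISG idea concrete, here is a minimal sketch of the Bayesian update behind it, assuming the common formulation in which a surname-based probability is multiplied by a geography-based likelihood and then renormalized. The function name, the two-group split, and all of the numbers are hypothetical and chosen only for illustration; they are not values from our study.

```python
def bisg_posterior(p_group_given_surname, p_geo_given_group):
    """Combine surname and geography evidence with Bayes' rule.

    p_group_given_surname: P(group | surname), e.g. from a surname list.
    p_geo_given_group:     P(living in this area | group), e.g. from census counts.
    Returns P(group | surname, area), assuming surname and area are
    independent once the group is known.
    """
    unnormalized = {
        group: p_group_given_surname[group] * p_geo_given_group[group]
        for group in p_group_given_surname
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("No group has nonzero probability for this input.")
    return {group: value / total for group, value in unnormalized.items()}


# Made-up inputs: a surname that is 60 percent Hispanic nationally,
# in an area where Hispanic residents are relatively concentrated.
surname_prior = {"hispanic": 0.6, "non_hispanic": 0.4}
area_likelihood = {"hispanic": 0.002, "non_hispanic": 0.0005}

posterior = bisg_posterior(surname_prior, area_likelihood)
print(round(posterior["hispanic"], 3))
```

In this toy case the geographic evidence raises the estimated probability above what the surname alone would suggest. A household would then be flagged when that probability exceeds a chosen cutoff, and the cutoff itself is a design decision.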
What We Discovered
Our analysis found that each technique had its strengths and that none was best on every measure. The BISG method correctly identified 82.0 percent of households where respondents completed the survey in Spanish. Vendor data identified 73.6 percent of Spanish-speaking households. Census data had the lowest identification rate at 33.9 percent but the highest rate of correctly identifying English-speaking households.
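The two kinds of rates described here are often called sensitivity (the share of Spanish-language completes that a method flagged in advance) and specificity (the share of English-language completes that it correctly left unflagged). The sketch below shows one way to compute both, assuming each household has a yes/no flag from a method and a known completion language. The function names and the eight toy households are hypothetical and are not data from the study.

```python
def sensitivity(flagged, completed_in_spanish):
    """Share of Spanish-language completes that the method flagged."""
    spanish_total = sum(1 for s in completed_in_spanish if s)
    if spanish_total == 0:
        raise ValueError("No Spanish-language completes to evaluate.")
    hits = sum(1 for f, s in zip(flagged, completed_in_spanish) if f and s)
    return hits / spanish_total


def specificity(flagged, completed_in_spanish):
    """Share of English-language completes that the method left unflagged."""
    english_total = sum(1 for s in completed_in_spanish if not s)
    if english_total == 0:
        raise ValueError("No English-language completes to evaluate.")
    correct = sum(
        1 for f, s in zip(flagged, completed_in_spanish) if not f and not s
    )
    return correct / english_total


# Toy data: four Spanish-language and four English-language completes.
completed_in_spanish = [True, True, True, True, False, False, False, False]
method_flag = [True, True, True, False, True, False, False, False]

print(sensitivity(method_flag, completed_in_spanish))
print(specificity(method_flag, completed_in_spanish))
```

Looking at both numbers together matters because a method can score well on one by doing poorly on the other. Flagging every household, for example, would catch every Spanish speaker while saving nothing on printing and mailing.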
While the BISG method was the single best approach, we found that combining all three techniques further improved the accuracy of identifying Spanish-speaking households, achieving an identification rate of 82.9 percent. By using multiple approaches, we reached Spanish speakers who would have been missed by any single method.
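One simple way to combine methods is to flag a household whenever any single method flags it. The sketch below illustrates that rule. It is an assumption made for illustration, since this post does not spell out the exact combination rule, and the flags and outcomes are made-up toy values.

```python
def combine_any(*flag_lists):
    """Flag a household if at least one method flagged it."""
    return [any(flags) for flags in zip(*flag_lists)]


def share_identified(flagged, completed_in_spanish):
    """Share of Spanish-language completes that were flagged."""
    spanish_total = sum(1 for s in completed_in_spanish if s)
    hits = sum(1 for f, s in zip(flagged, completed_in_spanish) if f and s)
    return hits / spanish_total


# Toy data for five households; the first four completed in Spanish.
completed_in_spanish = [True, True, True, True, False]
vendor_flag = [True, False, False, False, False]
census_flag = [False, True, False, False, False]
bisg_flag = [True, False, True, False, False]

combined_flag = combine_any(vendor_flag, census_flag, bisg_flag)

print(share_identified(bisg_flag, completed_in_spanish))
print(share_identified(combined_flag, completed_in_spanish))
```

Under this rule the combined identification rate can never be lower than that of the best single method, but the number of English-speaking households flagged by mistake can only stay the same or grow, so the gain has to be weighed against extra mailing costs.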
Looking Beyond Spanish
While our study focused on Spanish speakers participating in a religious survey, these techniques have broader applications. Similar approaches can be applied to various demographic groups, enhancing the overall quality and reliability of survey research across populations.
At NORC, we’re already applying similar thinking to our AmeriSpeak® panel to help better reach Korean-, Japanese-, and Mandarin-speaking households.
Methodological approaches like these will be essential to ensure that public opinion research accurately reflects the views of all communities—allowing us to hear every voice in whatever language they speak.
Suggested Citation
McRoy, M. (2025, April 28). Finding the Right Words: Improving How We Reach Spanish-Speaking Households. [Web blog post]. NORC at the University of Chicago. Retrieved from www.norc.org.