Record linkage is the process of identifying and matching records that refer to the same person, organization, or entity across two or more data sources, then combining them into a single, integrated data set. Also called data matching or entity resolution, it can rely on deterministic rules, probabilistic models, or both, and it is essential whenever files share no reliable common identifier.
NORC specializes in linkages that demand accuracy, transparency, and strict protection of confidential data. Our statisticians and data scientists combine deterministic, probabilistic, and machine learning methods for entity resolution with purpose-built software, including NORCLink, our software-agnostic linkage tool. When data custodians cannot share complete files, our privacy-preserving record linkage (PPRL) methods let projects move forward without exposing sensitive identifiers.
Federal statistical and health agencies, including the National Center for Health Statistics and the National Center for Science and Engineering Statistics, rely on NORC to link survey, administrative, and claims data under demanding governance and confidentiality requirements. We assess linkage error and bias at every step, so the data we deliver support valid statistical inference and decisions clients can defend.
Our Capabilities
We link records using proven statistical techniques, purpose-built software, and rigorous quality control.
Deterministic, Probabilistic & Machine Learning Methods
We match records using exact-match (deterministic) rules, probabilistic models that weigh partial agreement, and machine learning trained on your data. We select and combine approaches based on data quality, file size, and the cost of a wrong match.
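As a rough illustration of how exact-match rules differ from probabilistic scoring, the sketch below compares two records both ways. It is not NORC's implementation: the field names, the m- and u-probabilities, and the helper functions are hypothetical, and the scoring follows the general Fellegi-Sunter style of summing log-likelihood ratios across fields.

```python
import math

# Hypothetical per-field parameters (illustrative values only):
#   m = P(field agrees | the two records are a true match)
#   u = P(field agrees | the two records are not a match)
FIELD_PARAMS = {
    "last_name": (0.95, 0.01),
    "birth_year": (0.98, 0.05),
    "zip_code": (0.90, 0.10),
}


def deterministic_match(a, b, keys=("last_name", "birth_year", "zip_code")):
    """Exact-match rule: every key field must agree exactly."""
    return all(a[k] == b[k] for k in keys)


def match_weight(a, b, params=FIELD_PARAMS):
    """Sum log2 likelihood ratios: agreement adds weight, disagreement subtracts."""
    total = 0.0
    for field, (m, u) in params.items():
        if a[field] == b[field]:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


rec_a = {"last_name": "garcia", "birth_year": 1980, "zip_code": "60603"}
rec_b = {"last_name": "garcia", "birth_year": 1980, "zip_code": "60637"}

print(deterministic_match(rec_a, rec_b))      # the ZIP codes differ
print(round(match_weight(rec_a, rec_b), 2))   # partial agreement still earns weight
```

The exact-match rule rejects this pair outright because one field differs, while the probabilistic weight stays positive. Where to set the acceptance threshold depends on the cost of a wrong match.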
Privacy-Preserving Record Linkage (PPRL)
Using both commercial and open-source tools, we link records through encrypted or hashed identifiers, so no party has to expose sensitive data. PPRL lets linkages proceed even when legal restrictions or custodial limits prevent sharing complete files, and we evaluate how it affects match quality and downstream analysis.
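One simple way to picture linkage on hashed identifiers is a keyed hash: each custodian normalizes its identifiers, hashes them with a shared secret, and only the resulting tokens are compared. The sketch below assumes a hypothetical shared key and hypothetical field names, and it handles exact agreement only. Production PPRL tools typically use encodings that tolerate typos, which this example does not attempt.

```python
import hashlib
import hmac

# Hypothetical secret agreed between custodians outside the linkage environment.
SHARED_KEY = b"example-key-agreed-out-of-band"


def normalize(value):
    """Lowercase and collapse whitespace so trivial differences do not break the hash."""
    return " ".join(value.strip().lower().split())


def tokenize(record, key=SHARED_KEY):
    """Return a keyed SHA-256 token built from the normalized identifiers."""
    message = "|".join(
        normalize(record[f]) for f in ("first_name", "last_name", "dob")
    )
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).hexdigest()


custodian_a = [{"first_name": "Ana ", "last_name": "Lopez", "dob": "1975-02-11"}]
custodian_b = [
    {"first_name": "ana", "last_name": "LOPEZ", "dob": "1975-02-11"},
    {"first_name": "Ben", "last_name": "Ortiz", "dob": "1990-07-30"},
]

# Each custodian shares only tokens; the linker intersects them.
tokens_a = {tokenize(r) for r in custodian_a}
tokens_b = {tokenize(r) for r in custodian_b}
shared = tokens_a & tokens_b

print(len(shared))
```

Because the hash is keyed, a party without the secret cannot rebuild tokens from a list of guessed names, which is the main reason to prefer a keyed hash over a plain one in this kind of sketch.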
NORCLink & Custom Software
NORCLink, our software-agnostic tool for advanced record linkage, supports large and complex matching jobs. When a project’s data or requirements call for it, we build custom solutions tailored to the work.
Beyond Individual-Level Matching
Our work extends past person-level matching to organization-level and concept-based linkage, connecting entities such as firms, providers, and programs across sources.
Data Standardization & Processing
Before matching, we standardize and clean data, resolving the inconsistencies in formatting, naming conventions, and quality that commonly undermine accurate linkages.
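A minimal sketch of what standardization can look like for two common fields, names and dates, follows. The function names and the list of accepted date formats are assumptions chosen for illustration; real projects usually need rules tailored to each source file.

```python
import re
import unicodedata
from datetime import datetime


def standardize_name(raw):
    """Strip accents, lowercase, replace punctuation with spaces, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", raw)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    letters_only = re.sub(r"[^a-z\s]", " ", without_accents.lower())
    return " ".join(letters_only.split())


def standardize_date(raw, formats=("%Y-%m-%d", "%m/%d/%Y")):
    """Return the date as YYYY-MM-DD, or None if no assumed format fits."""
    for fmt in formats:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


print(standardize_name("  José  GARCÍA-López "))
print(standardize_date("02/11/1975"))
print(standardize_date("n/a"))
```

Returning None for unparseable dates, rather than guessing, keeps bad values visible so they can be reviewed before matching.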
Linkage Quality, Error & Bias Assessment
We quantify match quality and analyze linkage error and bias, then tune matching parameters to balance false positives and false negatives. The result is linked data that support valid statistical inference and stand up to review.
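The trade-off between false positives and false negatives can be sketched with a small set of scored candidate pairs whose true status is known, for example from clerical review. The scores, labels, and thresholds below are made up for illustration, and the function name is hypothetical.

```python
def evaluate(scored_pairs, threshold):
    """Count errors and compute precision and recall at a given score threshold."""
    tp = fp = fn = 0
    for score, is_true_match in scored_pairs:
        predicted_match = score >= threshold
        if predicted_match and is_true_match:
            tp += 1
        elif predicted_match and not is_true_match:
            fp += 1  # false positive: linked, but not the same entity
        elif not predicted_match and is_true_match:
            fn += 1  # false negative: a true match that was missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"fp": fp, "fn": fn, "precision": precision, "recall": recall}


# Hypothetical (match score, true status) pairs from a reviewed sample.
REVIEWED = [
    (9.1, True), (7.7, True), (6.2, False),
    (5.0, True), (2.3, False), (-1.4, False),
]

for threshold in (3.0, 6.0, 8.0):
    print(threshold, evaluate(REVIEWED, threshold))
```

Raising the threshold removes false positives at the cost of more missed matches, so the right setting depends on which error is more damaging for the analysis at hand.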
Our Work
Our linkage projects span health, government, and research, connecting data that inform decisions at every level.
Our Experts
NORC’s record linkage work is led by statisticians and data scientists with deep experience linking federal survey, administrative, and claims data.
- Brenda Betancourt, Senior Statistician
- Scott Campbell, Senior Statistician
- Núria Adell Raventós, Statistician II
- Dean Resnick, Principal Data Scientist
- Emily R. Wiegand, Senior Research Methodologist