Technical Support to Conduct Privacy Preserving Record Linkage and to Link the National Hospital Care Survey Data to Centers for Medicare & Medicaid Services Data

The data linkage work was performed at NCHS under contract by NORC at the University of Chicago with funding from the Department of Health and Human Services’ Office of the Secretary Patient-Centered Outcomes Research Trust Fund (OS-PCORTF).

The contract has two essential goals:

  • Evaluate privacy-preserving record linkage techniques by re-performing the linkage conducted between the National Hospital Care Survey (NHCS, 2016) and the National Death Index (NDI) using encrypted identifiers with Datavant software.
  • Conduct linkage between NHCS (2014 and 2016) and Centers for Medicare & Medicaid Services (CMS) Transformed Medicaid Statistical Information System (T-MSIS) data containing enrollment and claims data from a range of years.

The first part of this project is a methodological assessment of the hashing algorithms used by groups such as Datavant and the Childhood Obesity Data Initiative (CODI). Linkage results based on the hashing algorithms will be compared with the results obtained using the probabilistic algorithms developed at NCHS for linking NHCS data with NDI data. The project will assess how the use of hashing algorithms may affect the quality of the linked data and, in turn, inference in secondary analyses of those data. By assessing the hashing algorithms against a “truth” source (the linked NHCS-NDI data), the uncertainty introduced by hashing-based linkage can be quantified, and mitigation strategies can potentially be incorporated into analyses that use linked data derived from privacy-preserving record linkage (PPRL) techniques.
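The comparison described above can be illustrated with a minimal sketch. It is not the Datavant or CODI algorithm, whose details are not given here; it assumes a generic keyed hash (HMAC-SHA256) over normalized identifiers, exact matching on the resulting tokens, and scoring against a set of "truth" links. The key, field names, and records are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical shared secret; a real PPRL deployment manages keys very differently.
SECRET_KEY = b"demo-key-shared-by-both-data-owners"


def normalize(value):
    """Uppercase and drop everything except letters and digits."""
    return "".join(ch for ch in value.upper() if ch.isalnum())


def make_token(record, fields, key=SECRET_KEY):
    """Build one keyed-hash token from the normalized identifier fields."""
    message = "|".join(normalize(record[f]) for f in fields)
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).hexdigest()


def link_by_token(left, right, fields):
    """Return the set of (left_id, right_id) pairs whose tokens are identical."""
    index = {}
    for rec in right:
        index.setdefault(make_token(rec, fields), []).append(rec["id"])
    pairs = set()
    for rec in left:
        for right_id in index.get(make_token(rec, fields), []):
            pairs.add((rec["id"], right_id))
    return pairs


def precision_recall(predicted, truth):
    """Score predicted links against a reference ('truth') set of links."""
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Made-up records: S2/D2 is the same person, but the first name is spelled differently.
survey = [
    {"id": "S1", "last": "O'Neil", "first": "Ann", "dob": "1950-01-02"},
    {"id": "S2", "last": "Smith", "first": "Jon", "dob": "1948-07-30"},
]
deaths = [
    {"id": "D1", "last": "ONEIL", "first": "ann", "dob": "19500102"},
    {"id": "D2", "last": "Smith", "first": "John", "dob": "1948-07-30"},
]
truth = {("S1", "D1"), ("S2", "D2")}

pairs = link_by_token(survey, deaths, ["last", "first", "dob"])
precision, recall = precision_recall(pairs, truth)
```

In this toy example the hashed linkage finds S1-D1 but misses S2-D2, because a single spelling difference produces a completely different token. Missed links of this kind are the sort of quality effect that a comparison against a truth source is meant to quantify.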

The second part of the project will integrate NHCS patient records with T-MSIS data, thus creating a database of detailed health insurance claims data for all NHCS patients receiving health insurance coverage from the two largest US public health insurance programs (Medicare and Medicaid). In previous work, detailed information from claims and electronic health records on care received in sampled hospitals was linked to the NDI and CMS Medicare data. This project expands data capacity for studies of HHS priority issues such as opioids, obesity, and infectious diseases, particularly among the Medicaid-covered population, in a way that no single source alone provides. The proposed linkage of the 2014 and 2016 NHCS to T-MSIS data will build on the previously funded linkages of the 2014 and 2016 NHCS to Medicare fee-for-service health care claims, Medicare Advantage (Part C) health care encounter data, and HUD housing assistance data. Together, these data resources will greatly increase the number of patients for whom supplemental contextual health outcomes information is available and will support a wide array of health outcomes research studies, such as examining differences in the efficiency and effectiveness of treatment protocols or in post-acute care utilization among patients covered by Medicare fee-for-service, Medicare Advantage, and Medicaid programs. The linked NHCS-T-MSIS data will also provide a rich data source for researchers examining the association between health and housing. A successful project will result in two new linked data sets that will be available via the NCHS and Federal Statistical Research Data Centers (RDCs).

This new data resource will broaden the robust data infrastructure to support patients, caregivers, and providers as they strive to improve health, prevent chronic disease, and improve the efficacy and quality of health care services.

For the linkage of NHCS to T-MSIS, we will use an enhanced version of the Fellegi-Sunter methodology to replace the existing methods used by the NCHS Special Projects Branch (SPB).
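For readers unfamiliar with the Fellegi-Sunter approach, the sketch below shows the classic form of the method only; the enhancements used in this project are not described here. Each compared field contributes a log-likelihood-ratio weight based on its m-probability (chance of agreement among true matches) and u-probability (chance of agreement among non-matches), and the summed weight is compared with two thresholds. The field names, probabilities, and thresholds are illustrative assumptions, not values from the project.

```python
import math

# Hypothetical (m, u) probabilities per field, chosen only for illustration.
PARAMS = {
    "last_name": (0.95, 0.01),
    "birth_date": (0.98, 0.003),
    "sex": (0.99, 0.5),
}


def field_weight(agrees, m, u):
    """Log2 likelihood ratio for agreement or disagreement on one field."""
    if agrees:
        return math.log2(m / u)
    return math.log2((1 - m) / (1 - u))


def match_weight(rec_a, rec_b, params=PARAMS):
    """Sum the field weights for a candidate record pair."""
    return sum(
        field_weight(rec_a[field] == rec_b[field], m, u)
        for field, (m, u) in params.items()
    )


def classify(weight, lower=0.0, upper=10.0):
    """Three-way decision using assumed lower and upper thresholds."""
    if weight >= upper:
        return "link"
    if weight <= lower:
        return "non-link"
    return "clerical review"


# Made-up records for illustration.
patient = {"last_name": "SMITH", "birth_date": "1948-07-30", "sex": "M"}
same_person = {"last_name": "SMITH", "birth_date": "1948-07-30", "sex": "M"}
date_typo = {"last_name": "SMITH", "birth_date": "1948-07-03", "sex": "M"}
stranger = {"last_name": "JONES", "birth_date": "1961-02-11", "sex": "F"}

decisions = [
    classify(match_weight(patient, other))
    for other in (same_person, date_typo, stranger)
]
```

With these assumed parameters, full agreement is classified as a link, disagreement on every field as a non-link, and agreement on name and sex with a differing birth date falls between the thresholds. Unlike exact matching on hashed tokens, this scoring tolerates partial disagreement between records.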

A white paper has been developed that describes the PPRL linkage analysis and compares its results with those of the previous linkage, which used standard (non-PPRL) methods. The project team intends to publish this white paper in a peer-reviewed journal.

The results of the NHCS-T-MSIS data linkage will be made available to outside researchers to use in the analysis of social determinants of health. Publication of documentation describing methods and providing guidance for data users is forthcoming.
