AI Framework Helps Federal Agencies Improve Data Quality & Integration
January 2026
NORC researchers created a framework for developing AI tools that could allow federal agencies to prepare complex data more efficiently.
NORC researchers developed a framework to guide the design of artificial intelligence (AI) tools that can improve data quality, standardization, and integration for federal statistical agencies like the Bureau of Labor Statistics and the Census Bureau.
The AI-Data Quality, Standardization, and Integration project, funded by the National Center for Science and Engineering Statistics as part of the National Secure Data Service-Demonstration, addresses persistent challenges that data managers face daily. Many federal agencies work with “organic” data sources not originally designed for statistical use, such as administrative records and other nontraditional formats, which often arrive with minimal documentation and require extensive processing.
NORC’s research team conducted expert interviews and reviewed current practices to understand how AI could enhance federal data management while maintaining the privacy and accuracy standards required for government statistics. The evaluation revealed opportunities to improve metadata documentation, streamline data harmonization, and support quality assessment workflows.
“AI can help explore patterns and reveal data quality issues that would be time-consuming to find otherwise,” said Zachary Seeskin, senior statistician and project team member. “With human-in-the-loop processes, AI can speed up the work for data analysts to identify and review data anomalies.”
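The workflow Seeskin describes can be made concrete. Below is a minimal, hypothetical Python sketch of a human-in-the-loop review step: rather than auto-correcting suspect values, the code surfaces them in a queue for an analyst to inspect. The column names, sample data, and the choice of a Tukey-fence (IQR) rule are illustrative assumptions, not details of NORC's actual tools.

```python
# Hypothetical sketch of human-in-the-loop anomaly review: flag suspect
# records for an analyst rather than changing them automatically.
import pandas as pd

def flag_anomalies(df: pd.DataFrame, column: str, k: float = 1.5) -> pd.DataFrame:
    """Return rows whose values fall outside the Tukey fences for `column`."""
    q1, q3 = df[column].quantile([0.25, 0.75])
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    mask = (df[column] < lower) | (df[column] > upper)
    return df.loc[mask].assign(flag_reason=f"{column} outside [{lower:.2f}, {upper:.2f}]")

# Illustrative data: the analyst reviews the flagged rows, not the full file.
records = pd.DataFrame({"agency_id": [1, 2, 3, 4],
                        "reported_hours": [38, 41, 40, 400]})
review_queue = flag_anomalies(records, "reported_hours")
print(review_queue)  # only the 400-hour record is surfaced for human review
```

The key design choice is that the function only flags records; deciding whether a flagged value is an error remains a human judgment.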
Through a literature scan and expert interviews, NORC researchers identified promising uses of AI for federal statistics.
The strongest use cases included speeding up data cleaning, flagging potential data quality issues, and enhancing data documentation.
“Documentation and metadata are critical to data quality and usability,” said Seeskin. “AI has particular promise for enhancing data documentation for a range of data types.”
Experts interviewed for the project emphasized the potential for AI to assist with metadata extraction and standardization. These capabilities help analysts better understand variable definitions, identify missing documentation, and improve dataset usability.
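As a rough illustration of what metadata extraction might look like in practice, the hypothetical Python sketch below drafts a data dictionary from a raw file and flags variables that lack labels, so an analyst knows where documentation is missing. The `labels` dictionary, column names, and output fields are assumptions for illustration, not NORC's implementation.

```python
# Hypothetical sketch: draft a data dictionary from a raw file and flag
# undocumented variables for an analyst to fill in.
import pandas as pd

def draft_data_dictionary(df: pd.DataFrame, labels: dict[str, str]) -> pd.DataFrame:
    """One row per variable: inferred type, missingness, and any known label."""
    rows = []
    for col in df.columns:
        rows.append({
            "variable": col,
            "inferred_type": str(df[col].dtype),
            "pct_missing": round(df[col].isna().mean() * 100, 1),
            "n_unique": df[col].nunique(dropna=True),
            "label": labels.get(col, ""),
            "needs_documentation": col not in labels,  # flag for human review
        })
    return pd.DataFrame(rows)

# Illustrative input: a partial codebook stands in for whatever documentation arrives.
raw = pd.DataFrame({"st_fips": ["17", "36", None], "q4_val": [1.2, 3.4, 5.6]})
print(draft_data_dictionary(raw, labels={"st_fips": "State FIPS code"}))
```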
“For input data sources that are updated over time, AI could help quickly identify new trends and recognize changes in variables or records needed to process the data,” Seeskin said.
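A hypothetical sketch of that idea: the Python function below compares a new data delivery against the prior snapshot and reports added, dropped, or retyped variables for analyst review before reprocessing. The sample frames and the specific checks are illustrative assumptions.

```python
# Hypothetical sketch: detect schema changes between two data deliveries
# and report them for human review before reprocessing.
import pandas as pd

def diff_schemas(old: pd.DataFrame, new: pd.DataFrame) -> list[str]:
    """List human-readable schema changes between two data snapshots."""
    changes = []
    for col in new.columns.difference(old.columns):
        changes.append(f"new variable: {col}")
    for col in old.columns.difference(new.columns):
        changes.append(f"dropped variable: {col}")
    for col in old.columns.intersection(new.columns):
        if old[col].dtype != new[col].dtype:
            changes.append(f"{col}: type changed {old[col].dtype} -> {new[col].dtype}")
    return changes

# Illustrative snapshots: a variable was renamed and recoded between deliveries.
prior = pd.DataFrame({"id": [1, 2], "status": ["A", "B"]})
latest = pd.DataFrame({"id": [1, 2], "status_code": [0, 1]})
for change in diff_schemas(prior, latest):
    print(change)  # flags the renamed/dropped status variable for review
```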
The framework emphasizes transparency and human oversight to ensure accuracy.
In the next phase of the project, NORC is developing a suite of AI tools for the National Secure Data Service to improve how federal agencies handle administrative records and other complex data sources. The goal is to support more efficient, reliable data preparation processes that help federal agencies generate timely insights for evidence-based policy decisions.
“A framework to guide the application of AI to process federal data used for policy decisions must address the importance of accuracy, the need for comparability in making comparisons among subgroups, requirements for transparency with statistical processes, and the risk for algorithmic biases from AI,” Seeskin said.
This article is from our flagship newsletter, NORC Now. NORC Now keeps you informed of the full breadth of NORC’s work, the questions we help our clients answer, and the issues we help them address.