Brandon Sepulvado

Pronouns: He/Him

Senior Research Methodologist
Brandon specializes in natural language processing, data infrastructure, and research on research.

Brandon is a senior research methodologist in the Methodology & Quantitative Social Sciences department at NORC at the University of Chicago. He leads work in natural language processing (NLP), research on research and science evaluation, and research data infrastructure. Brandon’s NLP work leverages advanced methods, such as deep learning and named-entity recognition, to generate meaningful insights from a wide range of text data, including social media, administrative records, and surveys. His considerable expertise in data infrastructure allows him to help clients make their data assets secure, accessible, and interoperable. As a sociologist, he uses a deep knowledge of research on research to help clients across sectors monitor and evaluate investments in science, technology, and innovation.

Brandon has delivered research solutions that help a wide range of government and private clients gain insights into R&D investments. As Co-Principal Investigator on a project with America’s DataHub Consortium, Brandon is uncovering new insights about technology transfer among foreign-born scientists and engineers in the U.S. Brandon has worked with the ASPE Office of Science and Data Policy to understand federal investments, regulatory decisions, and research outcomes surrounding pharmaceutical products that are subject to extended intellectual property claims. Brandon has helped the National Science Foundation assess the quality of data from automated data collection processes to reduce the time and effort involved in funding program evaluations, and he works on behalf of the National Institutes of Health to understand collaboration patterns that stem from its Justice Community Opioid Innovation Network.

Brandon’s data science work focuses on NLP and data infrastructure. NLP examples include using named-entity recognition to identify new e-cigarette brands and flavors on social media and developing text classifiers to monitor commercial smokeless tobacco campaigns across social media platforms. His work has entailed the use of deep learning to build a recommender system connecting synthetic biologists with information concerning the ethical and societal implications of their research as well as the development of tools—using algorithms such as topic models and stochastic blockmodels—to process open-ended responses to large-scale surveys. Brandon is helping the NCSES to establish a research data infrastructure integrating survey, administrative, and bibliometric data about the scientific workforce. He was Principal Investigator on a team of experts across the U.S. to develop the Synthetic Biology Knowledge System, which aims to provide a single interface that transforms the way researchers access diverse types of synthetic biology data.

Brandon’s work has been published in peer-reviewed journals and conference proceedings across disciplines and has been supported by a Fulbright fellowship and by prestigious awards from the National Science Foundation, the Government of France, and the Countway Library of Medicine (Harvard University/Boston Medical Library). He currently serves as Secretary of the Washington Statistical Society and as Program Chair for the American Statistical Association’s Section on Text Analysis. In the past, he has been elected to multiple section councils of the American Sociological Association, has served as assistant editor for the American Sociological Review, and has been a member of the South Big Data Hub’s Data Science Education & Workforce working group. He regularly serves on the program committees of conferences, including the Conference on Empirical Methods in Natural Language Processing (EMNLP) and Widening NLP (WiNLP). Brandon frequently speaks around the globe on research on research, NLP, and data infrastructure.

Project Contributions

America’s DataHub Consortium

Demonstrating replicable processes for acquiring and providing secure access to linked data sources

Client:

National Center for Science and Engineering Statistics

Early Childhood Training and Technical Assistance Cross-System Evaluation

A first-of-its-kind evaluation to maximize the effectiveness of TTA provided to early childhood grantees

Client:

Office of Head Start and Office of Child Care in the Administration for Children and Families, U.S. Department of Health and Human Services

Graduate Research Fellowship Program Pilot Project

Innovating data collection methods to track National Science Foundation research fellowship outcomes

Client:

National Science Foundation

Curriculum & Learning Improvement Project Consortium

A data ecosystem supporting middle school math instruction for historically marginalized students

Client:

Gates Foundation

Publications