Project Contacts

Kirk Wolter

wolter-kirk@norc.org

 

    Florida Ballots, Raw Data File Structure

    The raw data file contains one record for each ballot examined in the Florida Ballot Project. There are two types of ballots: undervotes and overvotes. An undervote is a ballot for which no presidential vote was recorded. An overvote is a ballot for which more than one presidential candidate was selected.


    For each undervote, the ballot was examined by three independent coders. Each chad on the ballot was coded and the coders' evaluations appear on one ballot-level record in the raw data file. Variables with the suffix "C1" refer to the evaluations made by the first of the three coders. Data for coders 2 and 3 are recorded in the variables with suffixes "C2" and "C3," respectively. In addition to the coder evaluations of the ballot, each record in the raw data file contains the county name and FIPS code, precinct number, ballot system (Votomatic, Datavote, or Optical Scan), and other identifying information pertaining to that ballot. The unique identifier for each record is recorded in the variable BALNUM, which is a sequential integer ranging from 1 to 175,037.
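    As a rough illustration of how the three coders' evaluations sit side by side on one ballot-level record, the sketch below measures agreement for a single chad. The stem "CHAD1" and the code values 1 and 0 are placeholders invented for this example; the actual variable names and code meanings are given in NORCLAY.XLS.

```python
from collections import Counter

# Suffixes named in the text: one set of variables per coder.
CODER_SUFFIXES = ("C1", "C2", "C3")


def agreement(record, stem):
    """Return (most common code, number of coders who gave it) for one chad variable."""
    codes = [record[stem + suffix] for suffix in CODER_SUFFIXES]
    code, votes = Counter(codes).most_common(1)[0]
    return code, votes


# Hypothetical record: "CHAD1" and the values 1/0 are illustrative only.
ballot = {"BALNUM": 1, "CHAD1C1": 1, "CHAD1C2": 1, "CHAD1C3": 0}
result = agreement(ballot, "CHAD1")
print(result)
```

    Here two of the three coders gave the same code, so the function reports that code with a count of 2. Whether a two-of-three or a unanimous standard is appropriate is an analytic choice that the file itself does not make.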


    For three Florida counties (Nassau, Pasco, and Polk), overvotes were also examined by three coders. For the remaining counties, overvotes were examined by only one coder. For these single-coder overvotes, the coder's evaluations are contained in the variables with the suffix "C1." The variables with the suffixes "C2" and "C3" were assigned the reserve code -8 to indicate that the ballot was examined by only one coder. (For more information concerning the meaning of individual codes in the raw data file, please refer to the raw data layout contained in NORCLAY.XLS.)
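    Because -8 is a reserve code rather than an evaluation, it should be screened out before coder data are compared or tallied. The following is a minimal sketch of that screening, assuming each record is held as a dictionary; the stem "CHAD1" and the non-reserve values are hypothetical and stand in for names and codes documented in NORCLAY.XLS.

```python
RESERVE_NOT_CODED = -8  # reserve code from the text: no data for that coder


def coders_present(record, stem):
    """Return the coder suffixes that hold a real evaluation for one variable."""
    present = []
    for suffix in ("C1", "C2", "C3"):
        value = record.get(stem + suffix)
        if value is not None and value != RESERVE_NOT_CODED:
            present.append(suffix)
    return present


# Hypothetical records; "CHAD1" is an illustrative stem, not a documented name.
three_coder_overvote = {"CHAD1C1": 1, "CHAD1C2": 1, "CHAD1C3": 0}
one_coder_overvote = {"CHAD1C1": 1, "CHAD1C2": -8, "CHAD1C3": -8}

print(coders_present(three_coder_overvote, "CHAD1"))
print(coders_present(one_coder_overvote, "CHAD1"))
```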


    The raw data file contains 175,010 records. Of these, 61,190 are undervotes and 113,820 are overvotes. In total, 138,037 ballots were from counties using Votomatic technology, 5,198 from counties using Datavote, and 31,775 from counties using Optical Scan technology. The complete breakdown of ballots by ballot type (undervotes/overvotes, Votomatic/Datavote/Optical Scan) follows:


    Ballot system   Undervotes    Overvotes      Total
    Votomatic           53,215       84,822    138,037
    Datavote               771        4,427      5,198
    Optical Scan         7,204       24,571     31,775
    Total               61,190      113,820    175,010
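    The breakdown above can be cross-checked with simple arithmetic. The short sketch below uses only the six cell counts reported in this section and recomputes the row, column, and grand totals; the dictionary layout is simply a convenient way to hold the figures.

```python
# Cell counts copied from the breakdown in this section.
counts = {
    "Votomatic": {"under": 53215, "over": 84822},
    "Datavote": {"under": 771, "over": 4427},
    "Optical Scan": {"under": 7204, "over": 24571},
}

# Row totals: ballots per ballot system.
system_totals = {name: c["under"] + c["over"] for name, c in counts.items()}

# Column totals: undervotes and overvotes across all systems.
total_under = sum(c["under"] for c in counts.values())
total_over = sum(c["over"] for c in counts.values())

# Grand total: should equal the number of records in the raw data file.
total_records = total_under + total_over

print(system_totals, total_under, total_over, total_records)
```

    Analysts who load the raw file can compare their own record counts against these totals as a first check that the data were read correctly.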


    Note #1: There are 30 undervote ballots that did not have three codings and 11 overvote ballots that should have had three codings (Nassau, Pasco, or Polk counties) but did not. For the 30 undervotes and 6 of the 11 overvotes, the first two sets of codings are in the data, and the third set of codings has been assigned the reserve code -8 (to indicate no data for that coder). For the remaining 5 of the 11 overvote cases, the first set of codings is in the data, and the second and third sets are assigned the reserve code -8 (to indicate no data for those coders).


    Note #2: Analysts should be aware of the presence of an unusual coder (Coder ID 75683). In Baker County (FIPS = 3), this coder's work on 79 undervote ballots indicates misunderstanding of instructions, bias, or other problems. For documentary purposes, and because of the relatively small number of ballots examined by this coder, the data were left in the database. Analysts are advised to conduct analyses without this coder's data.
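    One way to follow this advice is to work out, for each ballot, which coder slots do not belong to the flagged coder and to use only those. The sketch below assumes the coder ID is stored per coder in variables named with a stem plus the C1/C2/C3 suffix; the stem "CODERID" is hypothetical, and the real variable names should be taken from NORCLAY.XLS.

```python
UNUSUAL_CODER_ID = 75683  # coder flagged in Note #2


def usable_slots(record, id_stem="CODERID"):
    """Return coder suffixes whose coder-ID variable is not the flagged coder.

    "CODERID" is an assumed stem for illustration, not a documented name.
    """
    return [
        suffix
        for suffix in ("C1", "C2", "C3")
        if record.get(id_stem + suffix) != UNUSUAL_CODER_ID
    ]


# Hypothetical records; the other coder IDs are made up for the example.
affected = {"BALNUM": 10, "CODERIDC1": 11111, "CODERIDC2": 75683, "CODERIDC3": 22222}
unaffected = {"BALNUM": 11, "CODERIDC1": 11111, "CODERIDC2": 33333, "CODERIDC3": 22222}

print(usable_slots(affected))
print(usable_slots(unaffected))
```

    Keeping the record and ignoring only the flagged slot preserves the other two coders' evaluations; dropping the 79 ballots entirely is a simpler alternative when that loss is acceptable.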


    Ballots from absentee precincts are indicated with a 1 in the variable ABSENTEE (0 otherwise).


    Undervote and overvote ballots that have been contested are identified with a C (contested) in the variable PRECVERS (blank otherwise).
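    The two flags described above lend themselves to simple record filters. This is a minimal sketch, assuming records are dictionaries keyed by variable name and that values read from the ASCII file may arrive as strings; the sample records are invented for illustration.

```python
def is_absentee(record):
    """True when ABSENTEE is 1 (ballot from an absentee precinct)."""
    return int(record["ABSENTEE"]) == 1


def is_contested(record):
    """True when PRECVERS holds a C; a blank value means not contested."""
    return str(record["PRECVERS"]).strip() == "C"


# Hypothetical records for illustration only.
records = [
    {"BALNUM": 1, "ABSENTEE": "0", "PRECVERS": " "},
    {"BALNUM": 2, "ABSENTEE": "1", "PRECVERS": "C"},
    {"BALNUM": 3, "ABSENTEE": 0, "PRECVERS": "C"},
]

absentee_balnums = [r["BALNUM"] for r in records if is_absentee(r)]
contested_balnums = [r["BALNUM"] for r in records if is_contested(r)]

print(absentee_balnums, contested_balnums)
```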


    Downloading formats:
    ASCII
    SAS Program to read in ASCII data
    SAS (WinZip file)
    SPSS (WinZip file)
    FREQUENCIES