Title: Election Data Standards Requirements: Getting on with what we've got
1. Election Data Standards Requirements: Getting on with what we've got
- John L. McCarthy, volunteer
- Verified Voting Foundation
Common Elections Data Format Workshop, National Institute of Standards and Technology, Gaithersburg, Maryland, 29-30 October 2009
2. Overview: review and background for election data standards
- Who needs and uses election data? (clients)
- What kinds of election data are required?
- When are election data needed, and for what purposes?
- What objectives would data standards help meet?
- How are these needs currently being met?
  - In the United States?
  - In other countries?
- What characterizes good data format standards?
- Why can't we simply use EML (and extend it as necessary)?
  - OASIS Election Markup Language (dialect of W3C XML)
3. Who needs and uses election data (and how)? Potential clients for election data standards
- Voting systems vendors and system developers
  - component communications, system integration, testing, reporting
- Election officials: local, state, and national (EAC, ...)
  - ballot definition, testing, reporting, aggregation, auditing
- Election management consultants and contractors
  - systems integration, contract work for election officials
- News media (TV, radio, print, web)
  - reporting results, predicting outcomes, and analyzing trends
- Candidates, political parties, and organizations
  - deciding whether to concede, claim victory, or dispute results
- Citizens, citizen organizations, and academic researchers
  - pre- and post-election auditing, analyzing detailed results
4. What kinds of election data are required?
- Election districts and district boundaries
- Voter registration information and eligible voter lists
- Candidate nominations and approved candidate lists
- Referendum options and approved options lists
- Ballot definition information (for each jurisdiction)
- Election vote records, counts, results, and statistics
  - Cast Vote Records (CVR) for each individual ballot
    - including outcomes for each voting opportunity (choice)
    - e.g., vote recorded, blank, too many choices, unrecognized
- Logs from each individual piece of voting equipment
- Audit information pertinent to all the above categories
5. What detailed components are needed for vote tabulation audits?
- GEOGRAPHIC IDENTIFIERS
  - State, County
  - Sub-county jurisdiction(s), if any (e.g., city, township)
  - Precinct
  - Other Aggregation Unit Identifiers (e.g., state assembly district, water district)
- Voting Method (early, absentee, in-precinct, provisional, ...)
- Ballot Type and/or party (for primary elections)
- FOR EACH CONTEST
  - Contest (e.g., Governor, State Assembly, City Council, Water Board)
  - Choice (candidate or position Y/N)
- Summary records typically contain counts for each choice; in some systems, cast vote records for individual ballots may show how each choice was counted: vote, blank, too many choices (overvote), or unrecognized mark.
- Secretaries of State (SoSs) and others also need standards for various types of election audit reports
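The components listed on this slide can be sketched as a pair of record types. This is only an illustration, not a proposed standard: the class names, field names, and sample values below are hypothetical and do not come from EML or from any vendor format.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class Outcome(Enum):
    """How one voting opportunity on one ballot was counted."""
    VOTE = "vote"
    BLANK = "blank"
    OVERVOTE = "overvote"          # too many choices
    UNRECOGNIZED = "unrecognized"  # unrecognized mark


@dataclass(frozen=True)
class SummaryRecord:
    """One count for one choice in one contest in one reporting unit."""
    state: str
    county: str
    precinct: str
    voting_method: str                 # e.g. "early", "absentee", "in-precinct"
    contest: str
    choice: str                        # candidate, or Y/N position
    count: int
    sub_county: Optional[str] = None   # city, township, ... if any
    ballot_type: Optional[str] = None  # ballot type and/or party (primaries)
    other_units: Tuple[str, ...] = ()  # e.g. assembly district, water district


@dataclass(frozen=True)
class CastVoteEntry:
    """One contest on one individual cast vote record."""
    ballot_id: str
    contest: str
    outcome: Outcome
    choice: Optional[str] = None       # set only when outcome is VOTE


# Hypothetical sample values, for illustration only.
record = SummaryRecord(
    state="XX", county="Example County", precinct="Precinct 1",
    voting_method="early", contest="Water Board",
    choice="Candidate A", count=12,
)
entry = CastVoteEntry(ballot_id="ballot-0001", contest="Water Board",
                      outcome=Outcome.OVERVOTE)
```

Keeping each identifier in its own field is what lets an auditor regroup counts by precinct, voting method, or any other aggregation unit.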
6. When are election data needed?
- Preceding an election
  - system development and testing
  - logic and accuracy testing and test results
  - jurisdiction boundaries, ballot types, voting places
  - ballot design and contents (candidates, ballot measures, etc.)
  - registered and eligible voters
- During an election
  - problem reports
  - individuals who have voted
- Election night
  - detailed vote counts by polling place, type (in-person, absentee), candidate, ballot measure choices, overvotes, undervotes
  - Individual Cast Vote Records (CVR) for each ballot
- Before certification of final results
  - audit results, including resolution of any discrepancies found
7. Objectives that election data standards can help us achieve
- Timely and Transparent Reporting
  - aggregation within local jurisdictions and from local to state
  - to media, interested organizations, and the general public
  - to help support pre- and post-election auditing
- Lower costs and improved Accuracy
  - improve transparency and testing of ballot definition
  - connect registration, pollbooks, and reporting
  - facilitate transition to electronic record-keeping
- Interoperability
  - between components from a single vendor
  - among different components from different vendors
- Auditability
  - detailed data available immediately following each election
  - machine-readable reports broken down in arbitrary ways
8. How are these needs currently being met?
- In the United States
  - very little standardization
  - data exchange via poorly documented proprietary formats
  - election management systems produce human-readable reports
  - some exceptions
    - CA SoS media feed 2008, 2009
    - IL translation programs for EAC data collection grant program
- In other countries
  - Council of Europe recommends EML for interoperability (2004)
  - Australian Electoral Commission EML Media Feed (since 2007)
  - UK e-voting pilots and CORE registration project use EML
  - Belgium uses EML for local elections in Flanders (2006-7)
  - Others?
9. What kinds of data and metadata do current commercial vote tabulation systems provide?
Human Readable Reports, e.g., Hart-Intercivic (Crystal Reports)
10. What would characterize good election data format standards?
- Machine-readable, structured components
  - separate elements for each distinct type of information (e.g., state, county, precinct, type, contest, candidate, undervotes)
  - easy to render into different formats
  - modular structures/schemas for different kinds of data (e.g., ballot definition, geography, tabulation results, ...)
- Well-defined and documented data elements and structures
  - preferably defined by, and data verifiable via, a formal schema
- Quasi-human-readable
  - data volume does not require serious compression (e.g., ASN.1)
  - easy to render into different human-readable and machine formats
- Compatible with tools for translation, rendering, and storage
  - e.g., XML style sheets, schema, databases, web services, XSLT
- Developed through a standards consensus process
  - input and discussion from all stakeholders, trial use, etc.
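A minimal sketch of two of these properties, using only the Python standard library: data held in separate named fields can be checked mechanically and rendered into another format. The element name, attribute names, and sample values are hypothetical, and the required-field check is a simplified stand-in for validation against a formal schema, not an implementation of one.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical required attributes; a real standard would define
# these (and their types) in a formal schema.
REQUIRED = ("state", "county", "precinct", "type",
            "contest", "candidate", "votes")


def missing_fields(elem):
    """Return the required attribute names absent from one element."""
    return [name for name in REQUIRED if name not in elem.attrib]


def to_csv(elems):
    """Render structured elements as CSV text, one row per element."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(REQUIRED)
    for elem in elems:
        writer.writerow([elem.get(name) for name in REQUIRED])
    return buf.getvalue()


# Hypothetical sample data: the second record lacks a "votes" attribute.
doc = ET.fromstring(
    '<results>'
    '<result state="XX" county="Example" precinct="P1" type="early" '
    'contest="Water Board" candidate="Candidate A" votes="10"/>'
    '<result state="XX" county="Example" precinct="P2" type="early" '
    'contest="Water Board" candidate="Candidate A"/>'
    '</results>'
)
results = list(doc)
problems = {e.get("precinct"): missing_fields(e)
            for e in results if missing_fields(e)}
valid = [e for e in results if not missing_fields(e)]
csv_text = to_csv(valid)
```

Here `problems` flags the incomplete record before it reaches any report, and `csv_text` shows the same data rendered for spreadsheet users.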
11. Doesn't EML (Election Markup Language) meet most if not all of these requirements?
- Dialect of XML (current lingua franca for data exchange)
- Developed by OASIS Technical Committee (since 2001)
  - participation by vendors and election experts
  - currently completing work on version 6.0 (still time for feedback!)
  - OASIS will propose EML 6.0 as ISO standard early in 2010
- Flexible, extensible, modular framework
  - version 6.0 includes new elements and features to support US voting
  - V 6.0 meets most known election requirements
- Already used by a number of organizations and jurisdictions
  - California and Australia media feeds, etc.
  - ES&S, Hart-Intercivic (EDX XML variant), EDS, IBM, more in Europe
- For more info, see http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=election#expository
12. What are primary objections, barriers, and counter-arguments to use of EML?
- Too new?
  - development of multiple versions since 2001
  - used successfully in growing number of jurisdictions
- Competing approaches and standards?
  - IEEE Voting Systems Electronic Data Interchange Project 1622
    - temporarily deactivated because TC "failed to achieve balance"
  - Comma-delimited spreadsheet format
    - No schema to enforce data input requirements
    - Requires multiple tables to support nested repeating groups
    - Would have to develop table and column definitions, etc.
- Too complex and/or missing features?
  - can ignore modules that are not applicable
  - easy to extend and add new features using XML (e.g., audit reports)
- Implementation costs?
  - 3 major vendors already use EML or XML in significant ways
  - lots of tools to support XML development and use
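The point about nested repeating groups can be illustrated with a short sketch. One nested XML structure (precincts inside contests) has to become two flat tables linked by a key before it fits a comma-delimited format. The element names, attribute names, and values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical nested data: each contest repeats, and within each
# contest the precinct rows repeat again.
NESTED = """
<jurisdiction name="Example County">
  <contest id="1" name="Example Contest A">
    <precinct name="P1" total="10"/>
    <precinct name="P2" total="7"/>
  </contest>
  <contest id="2" name="Example Contest B">
    <precinct name="P1" total="4"/>
  </contest>
</jurisdiction>
"""


def flatten(xml_text):
    """Split nested contests/precincts into two flat tables.

    Returns (contest_rows, precinct_rows); each precinct row carries
    the contest id as a foreign key back to its parent.
    """
    root = ET.fromstring(xml_text)
    contest_rows, precinct_rows = [], []
    for contest in root.findall("contest"):
        contest_id = contest.get("id")
        contest_rows.append((contest_id, contest.get("name")))
        for precinct in contest.findall("precinct"):
            precinct_rows.append(
                (contest_id, precinct.get("name"), int(precinct.get("total")))
            )
    return contest_rows, precinct_rows


contest_table, precinct_table = flatten(NESTED)
```

Every additional level of nesting adds another table and another key column, each of which would need its own agreed definition.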
13. The need is urgent: Now is the time to act
- Election auditing requires a single standard set of formats
  - statement from last week's meeting on election auditing at ASA
- States are beginning to implement electronic reporting
  - California 6-county experiment plans to expand to statewide
  - Illinois plans statewide integrated voting and elections system
- Need for national archive of election data
  - for policy makers, legislators, academic researchers
  - current election day survey data is inadequate
    - not timely; detailed data not easily available in standard formats
  - EAC data collection grant project results can provide insights
- If EML is deficient, we can propose revisions for v6
  - but should do so in the next couple of months
14. Opportunities for participation
- Election Data Standards Email list (and Google Sites wiki)
  - electiondatastandards_at_verifiedvoting.org
- Try new election data software and help improve it
  - Auditing software from CO (McBurnett), UC Berkeley (Stark), ...
  - VTS translation software from IL?
- EML enhancements for version 6
  - OASIS Election and Voter Services Technical Committee
  - Joe Hall, David Webber, others
  - www.oasis-open.org/committees/election/
- NIST, TGDC, VVSG
  - Urge EAC and/or NIST to become active members of OASIS TC
  - create documentation and guidelines to facilitate adoption
15. Thanks to ...
- Verified Voting Foundation President Pam Smith
- Election Data Standards and Auditing Lists
- American Statistical Association Steve Pierson
- David Webber, OVS/OASIS
- John Sebes, Open Source Digital Voting Foundation
- Neal McBurnett, Boulder, Colorado
- Scott Hilkert, Catalyst Consulting associates, Chicago
- participants in last week's election auditing meeting
16. Example XML data fragment
<?xml version="1.0" encoding="utf-8" ?>
<election type="GE" name="General Election" date="11/4/2008">
  <state id="IL" name="Illinois">
    <jurisdiction id="2402" name="Alexander County" federalId="1700300000">
      <contest id="4" name="12TH CONGRESS" polling="3167" absentee="0" early="487" grace="0" provisional="0" total="3654">
        <specialCount type="blankVotes" polling="0" absentee="0" early="0" grace="0" provisional="0" total="0" />
        <specialCount type="underVotes" polling="283" absentee="0" early="88" grace="0" provisional="0" total="371">
          <precinct name="CAIRO 1" polling="57" absentee="0" early="14" grace="0" provisional="0" total="71" />
          <precinct name="CAIRO 2" polling="37" absentee="0" early="21" grace="0" provisional="0" total="58" />
          <precinct name="CAIRO 3" polling="16" absentee="0" early="14" grace="0" provisional="0" total="30" />
          <precinct name="CAIRO 4" polling="22" absentee="0" early="14" grace="0" provisional="0" total="36" />
          <precinct name="CAIRO 5" polling="19" absentee="0" early="9" grace="0" provisional="0" total="28" />
          <precinct name="CACHE" polling="9" absentee="0" early="5" grace="0" provisional="0" total="14" />
          <precinct name="SANDUSKY" polling="7" absentee="0" early="2" grace="0" provisional="0" total="9" />
          <precinct name="TAMMS" polling="39" absentee="0" early="1" grace="0" provisional="0" total="40" />
          <precinct name="MCCLURE" polling="29" absentee="0" early="2" grace="0" provisional="0" total="31" />
          <precinct name="THEBES" polling="19" absentee="0" early="1" grace="0" provisional="0" total="20" />
          <precinct name="OLIVE BRANCH" polling="29" absentee="0" early="5" grace="0" provisional="0" total="34" />
        </specialCount>
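As an illustration of what machine-readable data makes possible, the following sketch parses the underVotes element from the fragment above and checks that the precinct rows add up to the reported totals. The helper names and the list of count attributes are assumptions made for this example; only the XML values come from the slide.

```python
import xml.etree.ElementTree as ET

# Count attributes carried by each element in the fragment.
CHANNELS = ("polling", "absentee", "early", "grace", "provisional", "total")

# The underVotes element copied from the fragment above.
UNDERVOTES = """
<specialCount type="underVotes" polling="283" absentee="0" early="88" grace="0" provisional="0" total="371">
  <precinct name="CAIRO 1" polling="57" absentee="0" early="14" grace="0" provisional="0" total="71" />
  <precinct name="CAIRO 2" polling="37" absentee="0" early="21" grace="0" provisional="0" total="58" />
  <precinct name="CAIRO 3" polling="16" absentee="0" early="14" grace="0" provisional="0" total="30" />
  <precinct name="CAIRO 4" polling="22" absentee="0" early="14" grace="0" provisional="0" total="36" />
  <precinct name="CAIRO 5" polling="19" absentee="0" early="9" grace="0" provisional="0" total="28" />
  <precinct name="CACHE" polling="9" absentee="0" early="5" grace="0" provisional="0" total="14" />
  <precinct name="SANDUSKY" polling="7" absentee="0" early="2" grace="0" provisional="0" total="9" />
  <precinct name="TAMMS" polling="39" absentee="0" early="1" grace="0" provisional="0" total="40" />
  <precinct name="MCCLURE" polling="29" absentee="0" early="2" grace="0" provisional="0" total="31" />
  <precinct name="THEBES" polling="19" absentee="0" early="1" grace="0" provisional="0" total="20" />
  <precinct name="OLIVE BRANCH" polling="29" absentee="0" early="5" grace="0" provisional="0" total="34" />
</specialCount>
"""


def column_mismatches(parent):
    """Return {channel: (reported, summed)} where precinct sums differ."""
    bad = {}
    for channel in CHANNELS:
        reported = int(parent.get(channel))
        summed = sum(int(p.get(channel)) for p in parent.findall("precinct"))
        if reported != summed:
            bad[channel] = (reported, summed)
    return bad


def row_mismatches(parent):
    """Return names of precincts whose total differs from its parts."""
    parts = [c for c in CHANNELS if c != "total"]
    return [p.get("name") for p in parent.findall("precinct")
            if sum(int(p.get(c)) for c in parts) != int(p.get("total"))]


under = ET.fromstring(UNDERVOTES)
print(column_mismatches(under))  # prints {}
print(row_mismatches(under))     # prints []
```

The same check against a human-readable report would require retyping or scraping the numbers first.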