We are often asked to explain the impact of compromised voter data, and I realized today that, given some very relevant integrity engineering going on in the TrustTheVote Project, it's worth revisiting this topic for new citizens following our work, especially the growing #unhackthevote community. (Welcome, and thank you for following us, for taking an interest in our work, and, to several of you, for supporting us this week! We're humbled by your donations.) For elections professionals, this topic is well understood, so this may not be the most informative thing you read from me this week. However, this topic also came up this week at a San Francisco Elections Commission meeting during a discussion about technology and system integrity, so in part I want to relay some comments here to help.
There are three types of data in election administration:
- Voter Data -- information about a registered voter and their history of participation
- Ballot Data -- information about the composition of unmarked and cast ballots
- Election Data -- information comprising both voter data and ballot data and other administrative data that goes to managing an election.
Today, we're focusing on the first type: Voter Data. At first, you might not think that voter data could be a target of an attack or effort to compromise an election. After all, the media focus on election hacking is on tampering with voting machines and related nefarious activities. Actually, the most vulnerable aspect of the election system is the back office components, but that is for another day. It turns out that an election can just as easily be derailed by tampering with voter data. Below, with the help of the TrustTheVote CoreTeam and using remarks from and discussion related to this week's S.F. Elections Commission meeting, I try to explain why, and then offer a teaser of what we're working on to protect this equally vulnerable type of data from hacking (or other mischief).
NOTE: A special acknowledgment to David Jefferson (Lawrence Livermore Labs) and Dr. Douglas Jones (University of Iowa) for their comments in other venues that I've borrowed from and relied upon to help me put this together for our readers.
Why Compromised Voter Data is as Bad as Compromised Ballot Data
Set aside miscreant voter suppression activities on the voter roll. We know that is currently the hottest subject matter (so many links to articles, we can't even choose one, but you could Google it.) Let's just train on the miscreants (or foreign state actors if that perks the ears and straightens the tail ...remember those 24 databases Russian-sanctioned actors probed last year?)
If an attacker tampers with or hacks the registration data of a large number of voters in a Voter Registration Database (VRDB) before an election, the consequence is likely to be that many voters have to submit provisional ballots instead of real ones.
That's an inconvenience and will irritate the voters, who will realize that something is very wrong. It will also place an undue burden on the election officials who have to investigate and make decisions on counting the provisional ballots after the fact. Plus, requiring large numbers of voters to cast provisional ballots disrupts the polling place: it can quickly double the amount of time poll workers spend with each voter whose registration record has been tampered with, and that clogs the flow of voters.
Lines grow to very large lengths as the arrival rate approaches the service rate (and as Dr. Jones points out elsewhere, this is one of the more depressing theorems in queuing theory). It turns out that most polling places have a service rate that is only a little faster than the arrival rate. This means the line length and wait time experienced at polling places are typically variable and unpredictable. For instance, a small increase in the average per-voter service time at a check-in table can push the expected line to great lengths in a short time. The resulting experience is in essence a disenfranchisement that can result in voters abandoning the line (as we saw, for instance, in the Primary in Arizona last year).
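For readers who like to see the arithmetic, here is a rough sketch of that queuing effect. It assumes the simplest textbook model (a single check-in table with random arrivals and random service times, known as an M/M/1 queue), which is my simplification and not something measured at a real polling place. The function name and the rates below are hypothetical, chosen only for illustration.

```python
def mm1_stats(arrival_rate, service_rate):
    """Steady-state averages for an M/M/1 queue (an assumed, simplified model).

    Rates are in voters per minute. Returns a tuple of
    (average voters in the system, average minutes each voter spends).
    """
    if arrival_rate >= service_rate:
        # Arrivals keep pace with or outrun service: the line never settles.
        return float("inf"), float("inf")
    utilization = arrival_rate / service_rate
    avg_voters = utilization / (1 - utilization)
    avg_minutes = 1 / (service_rate - arrival_rate)
    return avg_voters, avg_minutes


ARRIVAL_RATE = 0.9  # hypothetical: voters arriving per minute

# Hypothetical per-voter check-in times, each only slightly longer than the last.
for service_minutes in (1.00, 1.05, 1.10):
    voters, minutes = mm1_stats(ARRIVAL_RATE, 1 / service_minutes)
    print(f"{service_minutes:.2f} min per voter: "
          f"about {voters:.0f} voters in the system, {minutes:.0f} min each")
```

Under these made-up numbers, stretching the check-in time from 1.00 to 1.10 minutes per voter takes the average from roughly 9 voters to roughly 99. Real polling places are messier than this model, but the shape of the curve is the point.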
So, if the goal of an attack is to disrupt an election and possibly derail outcomes, compromising the VRDB (Voter Registration Database) just prior to the election will accomplish the mission. However, there's another problem someone raised at the SF Election Commission meeting earlier this week.
A More Disturbing Observation
If the attack on the VRDB were to take place right after the election, it might go undetected for a while. This would unknowingly affect decisions on actual counting of absentee and provisional ballots, and worst case, it might not be noticed until the next election when large numbers of voters suddenly would have to use provisional ballots. And at this point, the use of old backups (assuming they exist) of the VRDB to restore it to a last known-good state would be a far more complicated task.
Unhacking the Voter Data
Well, no system is hack-proof. And the header above is really a nod to the growing social media community #unhackthevote (you can follow them on Twitter.) But the thing we want to share today is that the TrustTheVote Project CoreTeam may have come up with a solution in the course of other security engineering work this week.
So, ElectOS has in its service layer the equivalent of a flight data recorder ("FDR"). You have probably heard of such a device as a "black box" (it's actually orange) in airplanes or trains, used to capture all of the operating data of the vehicle in the unfortunate event of an accident. It's the same thing as an FDR, and given that we're working on glass box technology (i.e., transparent, true open source), we're eschewing the phrase "black box" for some obvious reasons. In any event, the ElectOS service layer has an FDR to track every single operation of any ElectOS component. Now, the FDR is a longer term, lower priority component, but it is an important piece of security-centric engineering underway. However, the rising conversation about voter data integrity, fueled by the voter fraud and voter suppression issues, got our CoreTeam thinking.
We're working with a variety of new innovations including something called "BlockChain" and we're engaging with some open source projects to explore how this technology can be used in aspects of election administration (and no, NOT for online voting; at least not until some real important research is finished). But BlockChain for the purposes of creating strong, secure, irrefutable ledgers of transaction information? That dog can hunt.
The idea is to create a high integrity transaction logging service (the flight data recorder idea) using a type of BlockChain technology to serve as an intrusion-detection/tripwire and an irrefutable transaction log that can verify any and all reads and writes to VRDBs. (Warning, that link above about BlockChain technology leads to some intensely geeky stuff involving math way beyond Ms. Voting Matters!) The CoreTeam will have more to share about this, as it comes together, over on its own blog in the not too distant future.
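To make the tripwire idea a little more concrete, here is a minimal sketch of a hash-chained log, the basic building block underneath BlockChain-style ledgers. To be clear, this is my own illustration and not the ElectOS design, which the CoreTeam has yet to publish. The function names and record fields are hypothetical. Each entry's hash covers the previous entry's hash, so quietly editing an old entry breaks every link after it.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the very first entry


def _entry_hash(prev_hash, record):
    # Serialize with sorted keys so the same record always hashes the same way.
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + payload).hexdigest()


def append_entry(log, record):
    """Append a record, chaining it to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    log.append({
        "record": record,
        "prev_hash": prev_hash,
        "hash": _entry_hash(prev_hash, record),
    })


def first_bad_entry(log):
    """Return the index of the first entry that fails verification, or None."""
    prev_hash = GENESIS_HASH
    for index, entry in enumerate(log):
        if (entry["prev_hash"] != prev_hash
                or entry["hash"] != _entry_hash(prev_hash, entry["record"])):
            return index
        prev_hash = entry["hash"]
    return None


# Hypothetical VRDB transactions, for illustration only.
log = []
append_entry(log, {"op": "read", "voter_id": "V-0001"})
append_entry(log, {"op": "write", "voter_id": "V-0001", "field": "address"})
append_entry(log, {"op": "write", "voter_id": "V-0002", "field": "status"})

print(first_bad_entry(log))             # None: the chain is intact
log[1]["record"]["field"] = "party"     # simulate someone editing the log
print(first_bad_entry(log))             # 1: the edited entry no longer matches
```

One caveat on the sketch: a chain kept in a single place is only tamper-evident if the most recent hash is also recorded somewhere the attacker cannot rewrite, since otherwise they could simply recompute the whole chain. Replicating the ledger across independent parties is the part that BlockChain-style technology adds on top of this.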
What's so interesting to me in trying to focus on what matters to we, the people, in protecting and preserving our right to vote, is that something like this transaction logging service could not only detect attacks, but seems to me it could also be a useful tool in determining if any transactions on the voter rolls occurred that maybe shouldn't have. Just thinking in print here. :-)
Ms. Voting Matters