We are often asked to explain the impact of compromised voter data, and given some very relevant integrity engineering going on in the TrustTheVote Project, it's worth revisiting that topic for new citizens following our work, especially the growing #unhackthevote community…
The TrustTheVote Project is proud to announce the launch of the Pennsylvania Voter Registration App, developed in partnership with Rock the Vote, Pennsylvania Voice, and the office of the Pennsylvania Secretary of State. This first-in-the-nation mobile App is the culmination of more than a year of work and marks a significant improvement in the voting experience for the citizens of Pennsylvania…
Last month San Francisco issued a fast-tracked Request For Information ("RFI") to obtain insight, knowledge, and a reality check on the potential for adopting, adapting, and deploying a next generation voting system that is based on open source software technology. We responded to the RFI. However, in the process, we unintentionally misrepresented the status of OSI review of our OSS license, which we've now corrected. Read on about our licensing to ensure adoption of OSS election technology, and some comments about San Francisco's thought leadership in researching open source opportunities for electoral technology innovation.
Today we provide another follow-up to our continuing report on our Repositories and source code development efforts. As others on the Core team have mentioned when contributing posts to the OSET blog (versus the TrustTheVote Project blog), we appreciate that the audience is diversifying over here, and want to forewarn you that parts of what follows get kinda geeky, but we try to provide links for those curious to learn more. (Also note: the TrustTheVote site is about to be re-launched within the next month, so we're trying to limit blog posts over there.) Anyway, we suspect what makes it geekish more than anything are the code names and acronyms. We'll try to minimize the alphabet soup. OK, here we go…
Even philanthropic efforts to produce public benefits in the form of civic technology have real costs associated with software development. The open source model, however, means the costs are significantly less than current proprietary commercial alternatives, while the innovative benefits, unconstrained by commercial mandates, can be significantly greater. More importantly, there is some reality distortion over the real costs of building civic engagement IT, such as election administration and voting systems. They are markedly different from many other civic engagement tools that require only APIs and interactive web services leveraging government data stores to better engage and serve citizens. Tuesday's post by Ms. Voting Matters on our Voter Services Portal ignited comments and questions about the real cost to build the Voter Services Portal. The VSP is not "yet another simple web site," but a collection of software that provides services to voters, integrates with back-end legacy systems, and sets the foundation for a series of voter service innovations as well as other election management tools in the near future. We break down the cost model and actual costs here…
The Voter Services Portal component of the Open Source Election Technology Framework is a freely available, highly extensible online voter registration platform that can cut the cost of States' and jurisdictions' custom development by as much as 75% and reduce the time to develop and deploy from months or more to merely a few weeks. Why wouldn't any jurisdiction moving to online voter services strongly consider this freely available source code, open for innovation? That's the whole point of our non-profit technology R&D effort: increase confidence in elections and their outcomes by offering technology innovations that can be easily adopted, adapted, and deployed. Sure, there are costs associated with adaptation and deployment; after all, open source does not necessarily mean free source. But the savings in time and taxpayer dollars should make this an easy decision…
Today, members of the Core Team are in Vail, Colorado at the IACREOT Conference to unveil the next phase of VoteStream, the election results reporting subsystem of our Open Source Election Technology Framework. This is an awesome day, and we owe a great deal of thanks to the Knight Foundation for continuing to support this important part of the Framework.
If you've read some of the ongoing thread about our VoteStream effort, it's been a lot about data and standards. Today is more of the same, but first with a nod that the software development is going fine as well. We've come up with a preliminary data model, gotten real results data from Ramsey County, Minnesota, and developed most of the key features in the VoteStream prototype, using the TrustTheVote Project's Election Results Reporting Platform. I'll have plenty to say about the data-wrangling as we move through several different counties' data. But today I want to focus on a key structuring principle that works both for the data and for the work that real local election officials (LEOs) do before an election, during election night, and thereafter.
Put simply, the basic structuring principle is that the election definition comes first, and the election results come later and refer to the election definition. This principle matches the work that LEOs do, using their election management system to define each contest in an upcoming election, define each candidate, and so on. The result of that work is a data set that both serves as an election definition and provides the context for the election by defining the jurisdiction in which the election will be held. The jurisdiction is typically a set of electoral districts (e.g. a congressional district, or a city council seat), and a county divided into precincts, each of which votes on a specific set of contests in the election.
Our shorthand term for this dataset is JEDI (jurisdiction election data interchange), which is all the data about an election that an independent system would need to know. Most current voting system products have an Election Management System (EMS) product that can produce a JEDI in a proprietary format, for use in reporting, or ballot counting devices. Several states and localities have already adopted the VIP standard for publishing a similar set of information.
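To make the structuring principle concrete, here is a minimal sketch in Python of what a JEDI might contain. The class and field names are our own illustration, not the project's actual schema or the VIP format; the point is simply that districts, precincts, and contests form the election definition, and everything else (such as a precinct's ballot) is derived by referring back to it.

```python
from dataclasses import dataclass

# Hypothetical model of a JEDI (jurisdiction election data interchange):
# the election definition comes first, and results refer back to it.

@dataclass
class District:
    district_id: str
    name: str
    level: str          # e.g. "federal", "state", or "local"

@dataclass
class Contest:
    contest_id: str
    district_id: str    # each contest belongs to one electoral district
    candidates: list

@dataclass
class Precinct:
    precinct_id: str
    district_ids: list  # the districts this precinct votes in

@dataclass
class ElectionDefinition:
    election_id: str
    districts: dict
    precincts: dict
    contests: dict

    def ballot_for(self, precinct_id):
        """The contests on a precinct's ballot, derived from its districts."""
        precinct = self.precincts[precinct_id]
        return [c for c in self.contests.values()
                if c.district_id in precinct.district_ids]

# A toy definition in the spirit of a small fictional county.
jedi = ElectionDefinition(
    election_id="2012-11-general",
    districts={"d1": District("d1", "Congressional District 1", "federal")},
    precincts={"p1": Precinct("p1", ["d1"])},
    contests={"c1": Contest("c1", "d1", ["Alice Adams", "Bob Burns"])},
)
```

Note how the dataset answers the question an independent system would ask: given a precinct, which contests does it vote on? That derivation is exactly the "context" the election definition provides.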
We've adopted the VIP format as the standard that we'll be using on the TrustTheVote Project. And we're developing a few modest extensions to it that are needed to represent a full JEDI that meets the needs of VoteStream, or really any system that consumes and displays election results. All extensions are optional and backwards compatible, and we'll be submitting them as suggestions when we think we have a full set. So far, it's pretty basic: the inclusion of geographic data that describes a precinct's boundaries, and a use of existing metadata to note whether a district is a federal, state, or local district.
So far, this is working well, and we expect to be able to construct a VIP-standard JEDI for each county in our VoteStream project, based on the extant source data that we have. The next step, which may be a bit more hairy, is a similar standard for election results with the detailed information that we want to present via VoteStream.
PS: If you want to look at a small artificial JEDI, it's right here: Arden County, a fictional county that has just 3 precincts, about a dozen districts, and Nov/2012 election. It's short enough that you can page through it and get a feel for what kinds of data are required.
Last time, I explained how our VoteStream work depends on the 3rd of 3 assumptions: loosely, that there might be a good way to get election results data (and other related data) out of their current hiding places, and into some useful software, connected by an election data standard that encompasses results data. But what are we actually doing about it? Answer: we are building prototypes of that connection, and the lynchpin is an election data standard that can express everything about the information that VoteStream needs. We've found that the VIP format is an existing, widely adopted standard that provides a good starting point. More details on that later, but for now the key words are "converters" and "connectors". We're developing technology that proves the concept that anyone with basic data modeling and software development skills can create a connector, or data converter, that transforms election data (including but most certainly not limited to vote counts) from one of a variety of existing formats, to the format of the election data standard.
And this is the central concept to prove -- because as we've been saying in various ways for some time, the data exists but is locked up in a variety of legacy and/or proprietary formats. These existing formats differ from one another quite a bit, and contain varying amounts of information beyond basic vote counts. There is good reason to be skeptical, to suppose that it is a hard problem to take these different shapes and sizes of square data pegs (and pentagonal, octahedral, and many other shaped pegs!) and put them in a single round hole.
But what we're learning -- and the jury is still out, promising as our experience is so far -- is that all these existing data sets have basically similar elements that correspond to a single standard, and that it's not hard to develop prototype software that uses those correspondences to convert to a single format. We'll get a better understanding of the tricky bits as we go along making 3 or 4 prototype converters.
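The converter idea can be sketched in a few lines. The example below is hypothetical: it assumes a legacy export that arrives as flat rows of precinct/contest/candidate/count fields, and the field names and target shape are illustrative stand-ins, not any vendor's format or the VIP standard itself. The point is that each converter only has to map its source's elements onto the common shape.

```python
# Hypothetical converter: maps one legacy results export (flat rows,
# as a stand-in for a proprietary format) onto a single common shape
# shared by all converters: contest -> candidate -> per-precinct counts.

def convert_legacy_results(legacy_rows):
    """Transform rows like {"pct": ..., "race": ..., "cand": ..., "votes": ...}
    into a nested dict keyed by contest, then candidate, then precinct."""
    results = {}
    for row in legacy_rows:
        contest = results.setdefault(row["race"], {})
        counts = contest.setdefault(row["cand"], {})
        # Accumulate, since some exports split a precinct across rows.
        counts[row["pct"]] = counts.get(row["pct"], 0) + int(row["votes"])
    return results

legacy = [
    {"pct": "P-01", "race": "Mayor", "cand": "Alice", "votes": "120"},
    {"pct": "P-02", "race": "Mayor", "cand": "Alice", "votes": "95"},
    {"pct": "P-01", "race": "Mayor", "cand": "Bob",   "votes": "110"},
]
converted = convert_legacy_results(legacy)
```

A converter for a different county would keep the same target shape and swap out only the row-reading logic, which is why differently shaped source "pegs" can still land in one round hole.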
Much of this feasibility rests on a structuring principle that we've adopted, which runs parallel to the existing data standard that we've adopted. Much more on that principle, the standard, its evolution, and so on … yet to come. As we get more experience with data-wrangling and converter-creation, there will certainly be a lot more to say.
It's time to finish -- in two parts -- the long-ish explanation of the assumptions behind our current "VoteStream" prototype stage of the TrustTheVote Project's Election Result Reporting Platform (ENRS) project. As I said before, it is an exercise in validating some key assumptions, and discovering their limits. Previously, I've described our assumptions about election results data, and the software that can present it. Today, I'll explain the 3rd of three basic assumptions, which in a nutshell is this:
- If the data has the characteristics that we assumed, and
- if the software (to present that data) is as feasible and useful as we assumed;
- then there is a method for getting the data from its source to the reporting software, and
- that method is practical for real-world election organizations, scalable, and feasible to adopt widely.
So, where are we today? Well, as previous postings have described, we made a good start on validating the first 2 assumptions during the previous design phase. And since starting this prototype phase, we've improved the designs and put them into action. So far so good: the data is richer than we assumed; the software is actually significantly more flexible than before, and effectively presents the data. We're pretty confident that our assumptions were valid on those two points.
But where did the 2012 election results data come from, and how did it get into the ENRS prototype? Invented elections, or small transcribed subsets of real results, were fine for design; but in this phase it needs to be real data, complete data, from real election officials, used in a regular and repeated way. That's the kind of connection between data source and ENRS software that we've been assuming.
Having stated this third of three assumptions, the next point is about what we're doing to prove that assumption and assess its limits. That will be part two of two, of this last segment of my account of our assumptions and progress to date.
As I often do, I had a thoughtful Martin Luther King Day -- as you can see from my still pondering a couple days later. But I think I now have something to share. Last time I wrote on MLK, I likened two unlikely things:
- King's demand for social justice and peace, using Isaiah's prophetic words that "Justice shall roll down like water, and righteousness like a mighty stream."
- My vision of really meaningful election transparency, stemming from a mighty torrent of data that details everything that happened in a county's conduct of an election, published in a form anyone can see, and can use to check whether the election outcomes are actually supported by the data.
Still a bit of a stretch, no doubt, because since my little moment by the waterfalls of the MLK memorial in San Francisco, I've had rather mixed success in explaining why this kind of transparency is so difficult. Among the reasons are the complexity of the data, and the very inconvenient way it is locked up inside voting system products and proprietary data formats.
But perhaps more important, it is just a vexingly detailed and complicated process to administer elections and conduct voting and counting -- paradoxically made even more complex by the addition of new technology. (Just ask a New York state election admin person about 2010.) In some cases, I am sure that local election officials would not take umbrage at the phrase "Rube Goldberg Machine" to describe the whole passel of people, process, and tools.
So, among my new year's resolutions, I am going to try to communicate, by example, a large part of the scope of data and transparency that is needed in U.S. elections. It will take some time to do in small digestible blogs, but I hope the example will serve to illustrate several things:
- What election administration is really like;
- What kinds of information and operations are used;
- How a regular process of capturing and exposing the information can prevent some of the mishaps, doubts, and litigation you've often read about here;
- Last but not least, how the resulting transparency connects directly to the nuts-and-bolts election technology work that we are doing on vote tabulation and on digital pollbooks.
One challenge will be keeping the example at an artificially small scale, for comprehensibility, while still providing meaningful examples of the data and the election officials' work to use it. On that point especially, feedback will be particularly welcome!