Representation of overvoted / invalid ballots #52
Replies: 9 comments 10 replies
-
Various jurisdictions may have laws that influence implementation here. The "Error" option would accommodate the widest range of implementations. In the "Error" scenario, adjudication boards could create a replacement ballot, with the overvotes omitted or corrected, to submit for scanning.
-
Right now ElectionGuard errors by design: we no longer have a well-formed ballot, so the proofs are undermined. We rely on the implementing system to handle overvotes. The "default" setting for ElectionGuard is the end-to-end verifiability use case, which assumes precinct scan, not central tabulation. In that context, the ballot is either held or rejected, and the voter is given the choice to correct it. If we were to assume a central tabulation use case, I'd favor the interpretation route as well; by default, though, I would not want those votes included in the initial tabulation, since adjudication or other separate processes are likely needed.
-
We've had the idea floating around for a while now that we need some auxiliary "conventionally encrypted" arbitrary text as a way to handle write-in votes. My proposal is that we generalize this a bit and make it a JSON structure of some sort that we can haggle around. For now, I'm assuming one text blob per contest, which means on a contest with three normal selections and one write-in selection, where the IDs are
Or, it could be something fancier, like:
The latter case supposes a voter who filled in bubbles for two normal candidates as well as the write-in candidate, and then put some text in the write-in field which the scanner somehow magically recognized. Of course, we could make this even fancier in several dimensions. Similarly, we might decide that we want to have one auxiliary text blob for the entire ballot, or for every contest, or even for every selection. And we could fancy it up even more, by representing the full generality of the election markup. I'm not particularly sure I'd recommend that, initially, but going with a JSON dictionary as the base datatype here allows for future extensibility.

Anyway, for the arlo-e2e use case, we're dealing with central tabulators which just directly output these overvotes, so this isn't just a fun hypothetical. It's a concrete problem we need to solve. Similarly, write-in votes on touch-screen voting machines are a very real problem. A general-purpose JSON structure, "conventionally encrypted", would seem to address all these needs.

But how do you do the conventional encryption so it doesn't leak anything by virtue of its length? We can certainly create a "session key", encrypted with the public key of the election, and then use that session key to run a conventional AES-GCM (or equivalent) machine. We can mitigate variable-length JSON strings by just making a decision, up front, that we'll require a specific number of characters for the write-in fields. That may mean that we output
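As a rough illustration of the fixed-length idea, here is a minimal Python sketch that pads a JSON blob to a size chosen up front, so every plaintext handed to the cipher has the same length. The 256-byte size, the field names, and the helper names `pad_blob` / `unpad_blob` are all assumptions made for this example, not anything ElectionGuard defines. The session-key and AES-GCM step is deliberately left out; it would be applied to the padded bytes.

```python
import json

BLOB_SIZE = 256  # assumed fixed plaintext size in bytes, agreed on up front


def pad_blob(data: dict, size: int = BLOB_SIZE) -> bytes:
    """Serialize to compact JSON and right-pad with spaces to exactly `size` bytes."""
    raw = json.dumps(data, separators=(",", ":"), sort_keys=True).encode("utf-8")
    if len(raw) > size:
        # Refuse rather than truncate: a cut-off blob would no longer parse.
        raise ValueError(f"blob is {len(raw)} bytes, limit is {size}")
    return raw.ljust(size, b" ")


def unpad_blob(padded: bytes) -> dict:
    """Recover the dictionary; the JSON parser ignores the trailing spaces."""
    return json.loads(padded.decode("utf-8"))


# Hypothetical field names, for illustration only.
short = pad_blob({"write_in": "Ann"})
long_ = pad_blob({"write_in": "Bartholomew Q. Example", "overvote": True})
assert len(short) == len(long_) == BLOB_SIZE
```

Since AES-GCM output is the plaintext length plus a fixed-size tag, equal-length plaintexts give equal-length ciphertexts. The cost is that every ballot carries the full padded size, even when the blob is nearly empty.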
-
Yuck. My first thought was to mark the ballot as spoiled. We currently say that all ballots -- including spoiled ballots -- should pass validation. But it would be OK to have invalid spoiled ballots in the system. However, this does not seem sufficient, since I presume that a cast ballot with an overvote in one contest should still be counted normally in other contests. My next thought is that we might need to add a one-bit flag to each contest on each ballot which indicates whether or not there is an overvote in the contest. Perhaps a more elegant way to do this would be to add an integer value to each contest on each ballot which indicates the total number of selections made. In either case, since this flag/integer would not be encrypted, the tallying logic could simply say that encrypted votes are not tallied when the number of votes in a contest exceeds the limit. This allows an overvote ballot to pass validation and still have the overvote not be tallied. The unfortunate thing is that this does require some code changes.
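To make the plaintext-integer idea concrete, here is a small Python sketch of the tally-side filter. The `ContestRecord` type and its field names are hypothetical, invented for this example, and the ciphertexts are treated as opaque values that the filter never inspects.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class ContestRecord:
    contest_id: str
    selection_count: int       # plaintext, NOT encrypted (hypothetical field)
    encrypted_selections: Any  # opaque ciphertexts; never inspected here


def tallyable(records: List[ContestRecord], limits: Dict[str, int]) -> List[ContestRecord]:
    """Keep only the contest records whose plaintext count is within the limit."""
    return [r for r in records if r.selection_count <= limits[r.contest_id]]


records = [
    ContestRecord("mayor", 1, "ciphertext-a"),
    ContestRecord("mayor", 3, "ciphertext-b"),    # overvote: excluded from the tally
    ContestRecord("council", 2, "ciphertext-c"),
]
kept = tallyable(records, {"mayor": 1, "council": 2})
```

Because the count is public, anyone reading the record can see which ballots were dropped from which contests. That is the trade-off of this approach.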
-
Also, one more thing: Consider the case where there are enough write-in candidates that we have at least the possibility that a write-in candidate is among the winners. (Murkowski famously won this way: https://www.reuters.com/article/us-usa-elections-murkowski/senator-lisa-murkowski-wins-alaska-write-in-campaign-idUSTRE6AG51C20101118) A fancy thing we could do is a reencryption mixnet of some sort, allowing us to shuffle up all the JSON strings while provably not damaging any of them. Then decrypt conventionally. Then tally conventionally. @benaloh should chime in here, but I think this would mean that we can't use conventional encryption any more, but rather that we need to shoehorn the write-in string into So, it's entirely possible that we should consider a per write-in
-
We've thus far chosen to punt on write-ins beyond just counting the number of write-ins. Doing this right involves a lot of new code to implement a MixNet -- as well as new verification steps to verify correct mixing. We would need to do entirely different encryption for write-ins. We couldn't even shoehorn things into the 31-32 bytes available mod q because we're using exponential ElGamal and need to compute discrete logs to decrypt. It's a bit strange, but even though we have a (nearly) 32-byte space to work with, we can't reasonably use more than about 4 bytes. This is ample for holding an integer tally, but not much more.

At one point, I started including details on how to use ordinary (non-exponential) ElGamal to convey a 32-byte key share in the key generation phase. It became increasingly painful and after several pages of ugly documentation, we decided to pull it all out and just use RSA. While exponential ElGamal works really nicely, ordinary integer ElGamal is so painful that I couldn't find any instances of its use. Ordinary elliptic-curve ElGamal is used widely, but that also requires lots more code and even more pain for verifiers.
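For readers who haven't seen why exponential ElGamal limits the usable message space, here is a toy Python sketch. The parameters `P`, `Q`, `G` are tiny and insecure, picked only so the example runs; they are not ElectionGuard's constants, and the function names are made up for illustration. Decryption recovers g^m rather than m, so the decryptor has to search for m, which is only practical when m is a small integer such as a tally.

```python
import secrets

# Toy group, for illustration only: P = 2*Q + 1, and G = 4 generates the
# subgroup of order Q. Real deployments use far larger parameters.
P, Q, G = 467, 233, 4


def keygen():
    s = secrets.randbelow(Q - 1) + 1           # secret key in [1, Q-1]
    return s, pow(G, s, P)                     # (secret, public)


def encrypt(m, public_key):
    r = secrets.randbelow(Q - 1) + 1           # fresh nonce
    return pow(G, r, P), (pow(G, m, P) * pow(public_key, r, P)) % P


def add(c1, c2):
    # Component-wise product of ciphertexts encrypts the SUM of the plaintexts.
    return (c1[0] * c2[0]) % P, (c1[1] * c2[1]) % P


def decrypt(c, secret_key, max_value):
    alpha, beta = c
    g_m = (beta * pow(alpha, -secret_key, P)) % P   # this is G**m, not m
    acc = 1
    for m in range(max_value + 1):             # brute-force discrete log
        if acc == g_m:
            return m
        acc = (acc * G) % P
    raise ValueError("plaintext exceeds search bound")


secret, public = keygen()
total = add(add(encrypt(1, public), encrypt(0, public)), encrypt(1, public))
tally = decrypt(total, secret, max_value=10)
```

The search loop is the bottleneck: it scales with the size of the plaintext, which is fine for vote counts but hopeless for an arbitrary 32-byte string. (The modular inverse via `pow` with a negative exponent needs Python 3.8 or later.)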
-
Yes. There is no harm in having a conventional encryption that is opened only if necessary. In that case, E2E verification covers only the number of write-ins, not their contents. For the Texas case, we could create an entry for each registered write-in. This doesn't imply that write-in candidates need to be included in the UI, but it gives us a place to mark a named write-in candidate in our structure whenever it is recognized as a voter selection, without our having to do anything special with the EG code.
-
currently we have the following on the
and the following in
perhaps one solution here would be to generate 2 sets of proofs, one for the in any case, we'll need more thought around this if we are going to support other types of voting that support values different from 0 or 1 for any individual contest. food for thought
-
So there may be a pretty easy solution if we take a step up. If a voter overvotes a contest, we can encode that as though the voter left that contest blank. Everything else will go just fine as long as it's clear to everyone that this is the expected behavior and not ElectionGuard somehow deleting votes.

In the alternative, it is certainly possible to encode the ballot as it is presented and count only votes in contests where the selection limit is not exceeded. But doing this without publicly revealing which ballots were excluded requires substantially more complex zero-knowledge proofs and computations. The problem is that right now everything is linear and we're just adding (encrypted) ballots. If we introduce what amounts to an encrypted flag or integer that is used to determine whether or not a vote is to be counted, we are introducing multiplication. (Instead of computing V1 + V2 + V3 + ..., we'd be computing C1·V1 + C2·V2 + C3·V3 + ..., where each C is either zero or one to indicate whether or not the corresponding V should be counted.) This would be A LOT of work -- perhaps even requiring an entirely different method of encryption.
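The "encode as blank" option can be sketched at the plaintext level, before any encryption happens. The Python below is only an illustration; `encode_contest` and `tally` are hypothetical names, and the marks are plain 0/1 integers standing in for what would later be encrypted.

```python
from typing import List


def encode_contest(marks: List[int], selection_limit: int) -> List[int]:
    """Return the marks unchanged, or all zeros if the contest is overvoted."""
    if sum(marks) > selection_limit:
        return [0] * len(marks)   # overvote: encode as if left blank
    return list(marks)


def tally(ballots: List[List[int]]) -> List[int]:
    """Plain column sums, mirroring the purely additive encrypted tally."""
    return [sum(column) for column in zip(*ballots)]


scanned = [
    [1, 0, 0],
    [1, 1, 0],   # two marks in a vote-for-one contest
    [0, 0, 1],
]
encoded = [encode_contest(b, selection_limit=1) for b in scanned]

naive_totals = tally(scanned)    # [2, 1, 1], counts the overvote
blank_totals = tally(encoded)    # [1, 0, 1], overvoted ballot contributes nothing
```

The tally itself stays a straight sum, so nothing changes on the homomorphic side. All of the overvote handling lives in the encoding step, which is why it needs to be documented as expected behavior.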
-
Consider the following: If a voter using an optical scan ballot enters a ballot with an overvote, what's supposed to happen next?
In at least some cases, including central tabulation scanners, the answer is apparently that the CVR for that ballot includes the overvote. That means that the proper way to tabulate those CVRs isn't simply to add up each column! This leads to some really unpleasant questions of how ElectionGuard should represent those ballots.
Options?
I'm kinda leaning toward one of the "interpret" solutions, possibly with auxiliary metadata that would only show up if you did an individual decryption.