Remove auto-generated schemata #69

Open · wants to merge 35 commits into base: master

Conversation


@dingxiangfei2009 commented May 17, 2018

  • Build schemata with protoc_rust in build.rs

Related to #67.

protoc_rust is added as a build dependency.
It is used to trans-compile .proto schemata into Rust source.
build.rs will put the trans-compiled schemata in place.
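
For reference, a minimal sketch of what such a `build.rs` can look like with `protoc_rust`. The schema paths and output directory below are placeholders, and the exact `Args` fields vary between `protoc-rust` versions, so treat this as illustrative only:

```rust
// build.rs — illustrative sketch; paths are placeholders, not the crate's actual layout.
extern crate protoc_rust;

fn main() {
    protoc_rust::run(protoc_rust::Args {
        out_dir: "src/proto",            // where the generated .rs files are written
        input: &["schema/share.proto"],  // .proto schemata to trans-compile
        includes: &["schema"],           // import search path passed to protoc
    }).expect("protoc_rust failed; is `protoc` installed and on $PATH?");
}
```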

I am not fully certain about the decision between leaving Cargo.lock out and checking it in, but I slightly prefer leaving it out, as per the suggestion from rust-lang.

Sorry. I am new to Travis. I will push a new change to make this build on Travis.

psivesely and others added 30 commits March 19, 2018 21:22
Implements barycentric Lagrange interpolation. Uses algorithm (3.1) from the
paper "Polynomial Interpolation: Lagrange vs Newton" by Wilhelm Werner to find
the barycentric weights, and then evaluates at `Gf256::zero()` using the second
or "true" form of the barycentric interpolation formula.

I also earlier implemented a variant of this algorithm, Algorithm 2, from "A new
efficient algorithm for polynomial interpolation," which uses fewer total
operations than Werner's version. However, because it uses many more
multiplications or divisions (depending on how you choose to write it), it runs
slower given the relative costs of subtraction/addition (equal) versus
multiplication, and especially division, in the Gf256 module.

The new algorithm takes n^2 / 2 divisions and n^2 subtractions to calculate the
barycentric weights, and another n divisions, n multiplications, and 2n
additions to evaluate the polynomial*. The old algorithm runs in n^2 - n
divisions, n^2 multiplications, and n^2 subtractions. Without knowing the exact
running time of each of these operations we can't say for sure, but a good
guess is that the new algorithm trends toward about 1/3 of the running time as
n -> infinity. It is also easy to see theoretically that for small n the
original Lagrange algorithm is faster. This is backed up by benchmarks, which
showed that for n >= 5 the new algorithm is faster. This is more or less what
we should expect given the running times in n of these algorithms.

To ensure we always run the faster algorithm, I've kept both versions and only
use the new one when 5 or more points are given.
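
For illustration, here is a rough sketch of evaluating at zero with the second ("true") barycentric form over GF(256). The `Gf256` methods used here (`zero`, `one`, the arithmetic operators, and `Copy`) are assumptions about the field type rather than the crate's exact API, and the weights are computed naively rather than with Werner's incremental scheme:

```rust
// Sketch: barycentric evaluation of the interpolating polynomial at x = 0.
// `points` are (x, y) pairs with distinct, nonzero x values.
fn interpolate_at_zero(points: &[(Gf256, Gf256)]) -> Gf256 {
    // Barycentric weights w_j = 1 / prod_{k != j} (x_j - x_k).
    // (Werner's algorithm (3.1) builds these incrementally; this loop is the naive version.)
    let weights: Vec<Gf256> = (0..points.len())
        .map(|j| {
            let mut denom = Gf256::one();
            for (k, &(x_k, _)) in points.iter().enumerate() {
                if k != j {
                    denom = denom * (points[j].0 - x_k);
                }
            }
            Gf256::one() / denom
        })
        .collect();

    // Second ("true") form of the barycentric formula, evaluated at 0:
    //   p(0) = (sum_j w_j * y_j / (0 - x_j)) / (sum_j w_j / (0 - x_j))
    let mut num = Gf256::zero();
    let mut den = Gf256::zero();
    for (&(x, y), &w) in points.iter().zip(weights.iter()) {
        let t = w / (Gf256::zero() - x); // x != 0 is required, as the tests now assert
        num = num + t * y;
        den = den + t;
    }
    num / den
}
```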

Previously the tests in the lagrange module were allowed to pass nodes to the
interpolation algorithms with x = 0. Genuine shares will not be evaluated at x =
0, since then they would just be the secret, so:

1. Now nodes in tests start at x = 1 like `scheme::secret_share` deals them out.
2. I have added assert statements to reinforce this fact and guard against
   division by 0 panics.

This meant getting rid of the `evaluate_at_works` test, but
`interpolate_evaluate_at_0_eq_evaluate_at` provides a similar test.

Further work will include the use of barycentric weights in the `interpolate`
function.

A couple more interesting things to note about barycentric weights:

* Barycentric weights can be partially computed if fewer than threshold
  shares are present. When additional shares come in, computation can resume
  with no penalty to the total runtime.
* They can be determined totally independently from the y values of our points,
  and the x value we want to evaluate for. We only need to know the x values of
  our interpolation points.
While this is a slight regression in performance in the case
where k < 5, in absolute terms it is small enough to be negligible.
Horner's method is an algorithm for calculating polynomials, which consists of
transforming the monomial form into a computationally efficient form. It is
pretty easy to understand:
https://en.wikipedia.org/wiki/Horner%27s_method#Description_of_the_algorithm

This implementation has resulted in a noticeable secret share generation speedup
as the RustySecrets benchmarks show, especially when calculating larger
polynomials:

Before:
test sss::generate_1kb_10_25        ... bench:   3,104,391 ns/iter (+/- 113,824)
test sss::generate_1kb_3_5          ... bench:     951,807 ns/iter (+/- 41,067)

After:
test sss::generate_1kb_10_25        ... bench:   2,071,655 ns/iter (+/- 46,445)
test sss::generate_1kb_3_5          ... bench:     869,875 ns/iter (+/- 40,246)
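
As a quick illustration of what Horner's rule buys us (not the crate's exact code; `Gf256` arithmetic operators are assumed), evaluating a_0 + a_1*x + ... + a_{k-1}*x^(k-1) becomes a single fold using k-1 multiplications and k-1 additions instead of recomputing each power of x:

```rust
// Sketch: Horner's rule over GF(256), folding from the highest coefficient down.
// coeffs[i] is the coefficient of x^i.
fn horner_eval(coeffs: &[Gf256], x: Gf256) -> Gf256 {
    coeffs
        .iter()
        .rev()
        .fold(Gf256::zero(), |acc, &a| acc * x + a)
}
```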
RustySecrets makes minimal use of the rand library. It only initializes
the `ChaChaRng` with a seed and `OsRng` in the standard way, and then calls
their `fill_bytes` methods, which are provided by the same trait and whose
function signatures have not changed. I have confirmed, by looking at the code
changes, that there have been no changes to the relevant interfaces this
library uses.
Since id is a `u8` it will never be greater than 255.
It's possible that two different points have the same data.

To give a concrete example consider the secret polynomial `x^2 + x + s`, where
`s` is the secret byte. Plugging in 214 and 215 (both elements of the cyclic
subgroup of order 2) for `x` will give the same result, `1 + s`.

More broadly, for any polynomial `b*x^t + b*x^(t-1) + ... + x + s`, where `t` is
the order of at least one subgroup of GF(256), for all subgroups of order `t`,
all elements of that subgroup, when chosen for `x`, will produce the same
result.

There are certainly other types of polynomials that have "share collisions."
This type was just easy to find because it exploits the nature of finite fields.
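
One way to see the 214/215 collision concretely: 215 = 214 XOR 1, and in characteristic 2 substituting x + 1 for x leaves x^2 + x + s unchanged:

```latex
p(x + 1) = (x + 1)^2 + (x + 1) + s
         = x^2 + 2x + 1 + x + 1 + s   % 2x = 0 and 1 + 1 = 0 in GF(2^8)
         = x^2 + x + s
         = p(x)
```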
Ensures that threshold > 2 during the parsing process, since we ensure the same
during the splitting process.
Since the validation already confirms `shares` is not empty, `k_sets` will never
match 0.
The arguments were provided in the wrong order.
* Pass a ref to `Vec<Shares>` instead of recreating and moving the object
  through several functions.
* Return `slen`/`data_len`, since we'll be using it anyway in `recover_secrets`.
I think that using hashmaps and hash sets was overkill and made the code much
longer and more complicated than it needed to be.

The new code also produces more useful error messages that will hopefully help
users identify which share(s) are causing the inconsistency.
The best place to catch share problems is immediately during parsing from
`&str`. However, because `validate_shares` takes any type that implements the
`IsShare` trait, and nothing about that trait guarantees that the share id,
threshold, and secret length will be valid, I thought it best to leave those
three checks in `validate_shares` as a defensive coding practice.
This should be useful when validating very large sets of shares. Wouldn't want
to print out up to 254 shares.
* Update rustfmt compliance

Looks like rustfmt has made some improvements recently, so wanted to bring the
code up to date.

* Add rustfmt to nightly item in Travis matrix

* Use Travis Cargo cache

* Allow fast_finish in Travis

Items that match the `allow_failures` predicate (right now, just Rust nightly),
will still finish, but Travis won't wait for them to report a result if the
other builds have already finished.

* Run kcov in a separate matrix build in Travis

* Rework allowed_failures logic

We don't want rustfmt to match `allow_failures` just because it needs to use
nightly, while we do want nightly to match `allow_failures`. Env vars provide a
solution.

* Add --all switch to rustfmt Travis

* Test building docs in Travis

* Use exact Ubuntu dependencies listed for kcov

Some of the dependencies we were installing were not listed on
https://github.com/SimonKagstrom/kcov/blob/master/INSTALL.md, and we were
missing one dependency that was listed there. When `sudo: true` Travis uses
Ubuntu Trusty.

* No need to build before running kcov

kcov builds its own test executables.

* Generate `Cargo.lock` w/ `cargo update` before running kcov

As noted in aeb3906 it is not necessary to
build the project before running kcov, but kcov does require a `Cargo.lock`
file, which can be generated with `cargo update`.
Although RustySecrets is a library, it is important that all
contributors to the library are using the very same version
of every package, as we cannot always trust downstream deps
to follow SemVer to the letter.
ebkalderon and others added 3 commits May 4, 2018 14:23
Fortunately, as MIN_SHARES and MIN_THRESHOLD are both set to 2 in errors.rs,
the typo had no impact on validation correctness.
* Build schemata with `protoc_rust` in `build.rs`

`protoc_rust` is added as a build dependency.
It is used to trans-compile `.proto` schemata into Rust source.
`build.rs` will put the trans-compiled schemata in place.
@psivesely
Contributor

Does this require that protoc is installed on the build machine and available in $PATH or does this package bring protoc (or a Rust rewrite) in? If the former, do we want to add this build dependency, and how do we ensure that the whole crate can still be built reproducibly (i.e., maybe we should lock in the protoc version)?

@dingxiangfei2009
Author

Yes, it does. I have taken the idea from here to build protoc on Travis CI. In this pull request, I cache the protoc build result to speed up future builds. The protoc used is locked to 3.5.1 for now.

@psivesely
Contributor

> The protoc used is locked to 3.5.1 for now.

This is referring to the Travis builds and to a PR in the merkle.rs repo, SpinResearch/merkle.rs#38.

Would it be possible to ensure in build.rs that if the protoc binary found by protoc_rust is other than 3.5.1, the build fails with an informative error?
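
One possible sketch (not part of this PR; it assumes we shell out to the `protoc` binary from build.rs and compare the version string it reports, rather than relying on anything protoc_rust provides):

```rust
// Possible addition to build.rs: fail fast with a clear message if the
// protoc found on $PATH is not the pinned 3.5.1.
use std::process::Command;

fn check_protoc_version() {
    let expected = "libprotoc 3.5.1";
    let output = Command::new("protoc")
        .arg("--version")
        .output()
        .expect("failed to run `protoc`; is it installed and on $PATH?");
    let version = String::from_utf8_lossy(&output.stdout);
    if version.trim() != expected {
        panic!(
            "protoc version mismatch: expected `{}`, found `{}`",
            expected,
            version.trim()
        );
    }
}
```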

@romac changed the title from "* Remove auto-generated schemata" to "Remove auto-generated schemata" on May 23, 2018