Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement EllipticCurve_with_prime_order() constructor #38341

Merged
merged 30 commits into from
Sep 15, 2024

Conversation

JosePisco
Copy link
Contributor

@JosePisco JosePisco commented Jul 9, 2024

Me and @grhkm21 suggest this diff against develop that implements the EllipticCurve_with_prime_order(N) constructor. Using the prime order N in input, this method finds another prime p and constructs an elliptic curve E/Fp with #E(Fp) = N.

It follows Algorithm 2.2 of the paper Constructing Elliptic Curves of prime order by Bröker and Stevenhagen. The running time is quite random depending on the input parameter but can turn out to be fast for some larger values (≃ 256 bits primes). It's also worth noticing that some values will make this function run for a very long time.

There had been a PR by @grhkm21 and @GiacomoPope that implements the EllipticCurve_with_order() method. This PR would intuitively fit nice into their work but I felt uncomfortable with it returning an iterator. I felt like returning a single curve was more handy so I implemented this method in a separate function that does so but I'm open to suggestions if this is of any interest to the community.

Fixes #38342

📝 Checklist

  • The title is concise and informative.
  • The description explains in detail what this PR is about.
  • I have linked a relevant issue or discussion.
  • I have created tests covering the changes.
  • I have updated the documentation and checked the documentation preview.

⌛ Dependencies

Copy link

github-actions bot commented Jul 9, 2024

Documentation preview for this PR (built with commit b60e81f; changes) is ready! 🎉
This preview will update shortly after each push to this PR.

@GiacomoPope
Copy link
Contributor

From my perspective the work was done during a Sage days and most probably can be improved but I would be interested in tracing that over flow error... I thought we tested it for larger values and it was fine and fast enough. Don't know...

@JosePisco
Copy link
Contributor Author

From my perspective the work was done during a Sage days and most probably can be improved but I would be interested in tracing that over flow error... I thought we tested it for larger values and it was fine and fast enough. Don't know...

Now that I read my comment again, I might have wrongfully expressed myself. I wanted to argue why this code that would intuitively fit inside your work is written separately.

Copy link
Contributor

@vincentmacri vincentmacri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I'm familiar enough with Sage or what you're doing to approve this, but hopefully this feedback helps make the review easier for someone who is more familiar.

src/sage/schemes/elliptic_curves/ell_finite_field.py Outdated Show resolved Hide resolved
# of elements of S than using a powerset. Here many multiplications are
# done multiple times.
for e in powerset(S):
D = product(e)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the paper, the algorithm takes a product of the p^* terms, not the p terms. Should there be a computation of the p^* terms somewhere? I also don't see where you do the square root mod N part of the algorithm.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The square roots are used to solve the diophantine equation for 4 * N and BinaryQF does it for us.

The p* are defined as p* = (-1)^(p-1 / 2) * p. Basically, it checks the parity of p-1 / 2 to negate or not p. This is a shady part for me because this wouldn't work and it kept yielding non-discriminants for the Hilbert polynomial. I found a hack around that seems to work but maybe I'm wrong here.

@grhkm21
Copy link
Contributor

grhkm21 commented Jul 10, 2024

This PR would intuitively fit nice into their work but I felt uncomfortable with it returning an iterator. I felt like returning a single curve was more handy so I implemented this method in a separate function that does so but I'm open to suggestions if this is of any interest to the community.

I disagree, I mean CM method inherently generates multiple curves, so it makes sense to return an iterator. If you want a single curve, just use the (builtin) next function.

@grhkm21
Copy link
Contributor

grhkm21 commented Jul 10, 2024

Also I will have to look closer at the paper and the implementation, but why don't use you just use EllipticCurve_with_order? It seems to implement the same CM stuff based on hilbert_polynomial.

Copy link
Contributor

@vincentmacri vincentmacri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, but someone who better understands the algorithm and paper you're working off of should take a look too.

src/sage/schemes/elliptic_curves/ell_finite_field.py Outdated Show resolved Hide resolved
src/sage/schemes/elliptic_curves/ell_finite_field.py Outdated Show resolved Hide resolved
@JosePisco
Copy link
Contributor Author

Looks good to me, but someone who better understands the algorithm and paper you're working off of should take a look too.

Agreed, I believe this work is fine but the ways around for the discriminant computation would require second, maybe deeper review.

Thanks a lot for the good review so far!

@JosePisco
Copy link
Contributor Author

I have tried to comply 100% with the algorithm from the paper Constructing Elliptic Curves of prime order by Bröker and Stevenhagen i.e.

  • Taking $p$ or $-p$ depending on the parity of $(p-1) / 2$
  • Checking if $D \equiv 5 \mod 8$ and not $-D$
  • Solving the diophantine equation $x^2 - Dy^2 = 4N$ and not $x^2 + Dy^2 = 4N$

I can't explain why the actual implementation works while the attempts at strictly following the algorithm don't.
I provide a patch that applies the changes described above if anybody is curious (but it fails the tests).
changes.diff

@grhkm21
Copy link
Contributor

grhkm21 commented Sep 3, 2024

Hi, I'm ready to review this. As @vincentmacri mentioned, you implemented the algorithm incorrectly. I am able to get an algorithm working using $p^*$ instead of $p$ terms. I also added a few performance tricks here and there, mostly the curve_has_order auxiliary function (it's a bit slow because of the try-except but maybe copy the code from .set_order that I modified a few months back), and also you can specify to use algorithm="cornacchia".

And also as I said above, please return an iterator instead of a single curve, just like many other algorithms do.

sage: def curve_has_order(E, N):
....:     if any(E.random_point() * N != 0 for _ in range(20)):
....:         return False
....:     try:
....:         E.set_order(N)
....:         return True
....:     except ValueError:
....:         return False
....: 
....: def f(N):
....:     assert is_prime(N)
....: 
....:     r = 0
....:     while r < 1:
....:         primes = []
....:         for p in prime_range(max(3, floor(r * log(N))), floor((r + 1) * log(N))):
....:             if legendre_symbol(N, p) == 1:
....:                 p_ = p if p % 4 == 1 else -p
....:                 primes.append((p_, GF(N)(p_).sqrt()))
....:         for subset in powerset(primes):
....:             D, sqrtD = prod(u[0] for u in subset), prod(u[1] for u in subset)
....:             if D % 8 != 5 or D >= 0:
....:                 continue
....:             Q = BinaryQF([1, 0, -D])
....:             sol = Q.solve_integer(4 * N, algorithm="cornacchia")
....:             if sol is None:
....:                 continue
....:             x, y = sol
....:             assert x**2 - D * y**2 == 4 * N
....:             for p in [N + 1 + x, N + 1 - x]:
....:                 if not is_prime(p):
....:                     continue
....:                 h = hilbert_class_polynomial(D)
....:                 K = GF(p)
....:                 for j in h.roots(ring=K, multiplicities=False):
....:                     E = EllipticCurve(K, j=j)
....:                     for E_ in E.twists():
....:                         if curve_has_order(E_, N):
....:                             yield E_
....:         r += 1
....: 
....: N = 123456789012345678901234567890123456789012345678901234568197
....: it = f(N)
....: [next(it) for _ in range(5)]
[Elliptic Curve defined by y^2 = x^3 + 70969083882016979692051040054897244304487459580456198765042*x + 23963839073844928337866754804687093774080822288916599802630 over Finite Field of size 123456789012345678901234567890654833374525085966737125236501,
 Elliptic Curve defined by y^2 = x^3 + 32301896591105181383197983392011966935739308240089212446888*x + 20440933360506659987728474271841544365948756313127780399770 over Finite Field of size 123456789012345678901234567890654833374525085966737125236501,
 Elliptic Curve defined by y^2 = x^3 + 49640879391385567455601737809134376710338058924528599296877*x + 57903173466042191227456380331494397476264372226969014648600 over Finite Field of size 123456789012345678901234567890654833374525085966737125236501,
 Elliptic Curve defined by y^2 = x^3 + 113468659391060172590889752354736732425162987733160523980142*x + 72243181080242569121408088391312469724962221744363796902853 over Finite Field of size 123456789012345678901234567890654833374525085966737125236501,
 Elliptic Curve defined by y^2 = x^3 + 75763109110881858499138647331042804275382035844950124504205*x + 24123673357232490706020024337399069487889990527522975347683 over Finite Field of size 123456789012345678901234567890654833374525085966737125236501]```

@JosePisco
Copy link
Contributor Author

Thanks for the review!

From your comment and code and my notes I was able to propose commit db67898 complying entirely with Algorithm 2.2 from the paper.

Few things I'm not convinced about:

  • Your curve_has_order() function is probabilistic? I get the method and believe it's a smart way to do it but is sampling 20 points enough? And do we not fall even with .order() while trying with E.set_order(N) after?
  • I'm against returning an iterator; simply from a pragmatic point of view we want a constructor for elliptic curves to return one instance of a curve, not a range of them.

@vincentmacri
Copy link
Contributor

vincentmacri commented Sep 4, 2024

* I'm against returning an iterator; simply from a pragmatic point of view we want a constructor for elliptic curves to return one instance of a curve, not a range of them.

I don't have a strong opinion either way, but unless there's some kind of canonical curve with the given prime order, what's wrong with an iterator? If a user is fine with an arbitrary one, they could use next(elliptic_curves_with_prime_order(N)) to easily get the first one. And if the "canonical" one is the first one returned, then I still think and iterator is fine and you could just add a note to the docstring saying that the first curve returned is the canonical/best/smallest or whatever, and that the iterator returns curves of increasing complexity as it goes on.

@JosePisco
Copy link
Contributor Author

JosePisco commented Sep 5, 2024

unless there's some kind of canonical curve with the given prime order

There isn't. As @grhkm21 mentionned: "CM method inherently generates multiple curves" and that makes it more logical to return an iterator on a theoretical point of view.

My point is on a programming and software point of view: we are defining a special constructor of EllipticCurve and that would make sense to align with other constructors that just return an EllipticCurve and not an iterator.

Maybe we could satisfy everybody by introducing a parameter iterator in EllipticCurve_with_prime_order(N, iterator=False). So that the user can choose if he wants an iterator or not. I get that people can just use next(EllipticCurve_with_prime_order(N)) at this point but it feels like we stray further from other elliptic curves constructors in sage.

Open debate, please take part :)

@JosePisco
Copy link
Contributor Author

We should also discuss how to efficiently check the order of the curve candidates.
@grhkm21 proposed to probabilistically check curves' orders instead of computing them and comparing naively. This method is smarter and faster only if set_order() is deterministic.

def curve_has_order(E, N):
    if any(E.random_point() * N != 0 for _ in range(20)):
        return False
    try:
        E.set_order(N)
        return True
    except ValueError:
        return False

However, grhkm found out while discussing this with me that it was deterministically incorrect in some rare cases (see #38617).

So I think it would be nice to discuss which method to adopt for this case (and given how the issue is addressed, maybe implement E.has_order(N) for elliptic curves).

@vincentmacri
Copy link
Contributor

My point is on a programming and software point of view: we are defining a special constructor of EllipticCurve and that would make sense to align with other constructors that just return an EllipticCurve and not an iterator.

I agree a constructor shouldn't return an iterator. But why does this need to be a constructor? We already have elliptic_curves.rank which returns a list of curves with the specified rank and torsion order. Instead of adding the constructor EllipticCurve_with_prime_order, why not add the function elliptic_curves.prime_order which returns an iterator?

@vincentmacri
Copy link
Contributor

why not add the function elliptic_curves.prime_order which returns an iterator?

Maybe not that specific name since it seems elliptic_curves is for stuff involving an elliptic curves database, but my point is why not view this as a utility function rather than a constructor.

@grhkm21
Copy link
Contributor

grhkm21 commented Sep 5, 2024

This should definitely be viewed as an utility function rather than a constructor. I still don't see any valid argument for returning a curve rather than an iterator.

@grhkm21
Copy link
Contributor

grhkm21 commented Sep 11, 2024

Hey @vincentmacri , I added a few changes to EllipticCurve_with_order (not with_prime_order) as well, mainly refactoring + reversing the order of D to start searching from smaller absolute value D. I am a coauthor of that method so I hope you can review that too ^_^ (it's very similar in nature)

@grhkm21
Copy link
Contributor

grhkm21 commented Sep 11, 2024

This PR fixes #38342 as well after reversing the order of searching $D$.

@vincentmacri
Copy link
Contributor

This PR fixes #38342 as well after reversing the order of searching D .

Add "Fixes #38342" to the PR description to make sure the issue gets closed after this is merged.

Copy link
Contributor

@vincentmacri vincentmacri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it looks fine otherwise, but I'll take another look tomorrow or Friday before approving.

to be fast for many purposes, and for most `N` we tested we are able to
find a suitable small `D` without increasing the size of `S`.

<<<<<<< HEAD
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
<<<<<<< HEAD

Guessing this made it in when doing some git operations?

(-D)y^2 = 4N` with `D < 0` and `N` prime, we actually need `|D| \leq 4N`,
so we terminate the algorithm when the primes in the table are larger than
that bound. This makes the iterator return all curves it can find in finite
time :)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
time :)
time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As much as I wish software documentation had more emojis, I don't think that's how Sage docs are written.

for Et in E.twists():
# `num_checks=1` is sufficient for prime order.
if Et.has_order(N, num_checks=1):
Et.set_order(N, check=False)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it's worth calling set_order in has_order, at least when you're in the situation where you know has_order will return the correct result?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah we can do that, I was worried because there's a bug report open right now, but I think it'll be fine...

@grhkm21
Copy link
Contributor

grhkm21 commented Sep 11, 2024

This PR fixes #38342 as well after reversing the order of searching D .

Add "Fixes #38342" to the PR description to make sure the issue gets closed after this is merged.

This PR fixes #38342 as well after reversing the order of searching D .

Add "Fixes #38342" to the PR description to make sure the issue gets closed after this is merged.

@JosePisco


- Gareth Ma (2024-01-21): Fix bug for small curves
"""
# This method does *not* use the cached value of `_order` even when
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's the point of the method, to check the order "independently" in case the order is wrong (?) I don't know if it makes sense. If you think it makes more sense to use the cached order (when available) I can change it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding of the point of this method is it's a faster version of E.order() == N meant to be used when E.order() would take too long to compute. So if we know the order then we should use it. The value of _order shouldn't be allowed to be wrong (unless the user has used proof=False or something in which case they should know to be aware of wrong answers).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you change it to use _order you might need to tweak the test showcasing the bug.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -1436,6 +1436,7 @@ def has_order(self, value, num_checks=8):
if not (value * G).is_zero():
return False

self._order = value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to make sure, there is no path to this line that is impacted by the false positive bug? I.e. if this line is reached, the order is 100% equal to value?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it's the other way around I think. It's possible for the code to reach this line with a wrong value.

Copy link
Contributor

@vincentmacri vincentmacri Sep 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can be reached in the bugged case. I think you were right before, let's not set it here for now until the issues are sorted out (sorry for changing my mind).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it's the other way around I think. It's possible for the code to reach this line with a wrong value.

Let's not set order here then, but maybe add a TODO comment to do so in the future when/if the bugs are sorted out.

@vincentmacri
Copy link
Contributor

You can put the set_order calls back in the two generator functions as well (and mention to remove those in the comment in has_order about after #38617 is fixed).

@JosePisco
Copy link
Contributor Author

Not a fan at all to address multiple problems (#38342) in a single PR.
It's starting to contain lots of changes and I think we should resolve the problems atomically.

@grhkm21
Copy link
Contributor

grhkm21 commented Sep 12, 2024

I don't see problems if they fit in the PR naturally, as it was required to create the example of failing has_order. If you want to, feel free to isolate it into a PR.

@vincentmacri
Copy link
Contributor

Not a fan at all to address multiple problems (#38342) in a single PR. It's starting to contain lots of changes and I think we should resolve the problems atomically.

It's an incredibly minor change and I've already reviewed it. Might not be the best example of best practices but splitting it up at this point seems like it would be a waste of time.

Comment on lines 1412 to 1413
# This method does *not* use the cached value of `_order` even when
# available.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# This method does *not* use the cached value of `_order` even when
# available.

It does use it now.


- ``num_checks``-- integer (default: `8`); the number of times to check
whether ``value`` times a random point on this curve equals the
identity
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
identity
identity. If the order of the base field is prime, it is sufficient
to pass in ``num_checks = 1`` because of the Hass bound.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct me if I'm wrong about why num_checks = 1 is sufficient here. Feel free to reword.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, not quite, it's if the order of the curve is prime. I will edit it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, the commented out code I asked to remove was checking if value is prime.

Comment on lines +1425 to +1428

# This might be slow
# if value.is_prime():
# num_checks = 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# This might be slow
# if value.is_prime():
# num_checks = 1

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I always keep these comments because they're valuable for contributors later.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's redundant with the added explanation but if you still think it's valuable that's fine.

Copy link
Contributor

@vincentmacri vincentmacri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I don't have permission to add the positive review label so feel free to do that yourself.

You don't happen to know what the process is to get that permission, do you?

@grhkm21
Copy link
Contributor

grhkm21 commented Sep 12, 2024

I think I can get someone to add you to the group. Thanks for the review.

@vincentmacri
Copy link
Contributor

I think I can get someone to add you to the group. Thanks for the review.

That would be great. Thanks for the contribution!

@vbraun vbraun merged commit 76fbb6c into sagemath:develop Sep 15, 2024
17 of 19 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

EllipticCurve_with_order() crashes with overflow error on values of 64 bits or more
6 participants