Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor arithmetic power #886

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from
Draft

refactor arithmetic power #886

wants to merge 1 commit into from

Conversation

mmatera
Copy link
Contributor

@mmatera mmatera commented Jul 21, 2023

This is another chuck of #766. This part is related to the way in which powers are evaluated.

Part of this code is dealing with the fact that the output from Sympy is not compatible with the way in which WL handles expressions involving powers. Tests were fixed here according to the behavior in WMA.

@mmatera mmatera marked this pull request as ready for review July 21, 2023 02:25
@mmatera mmatera mentioned this pull request Jul 21, 2023
@mmatera
Copy link
Contributor Author

mmatera commented Jul 21, 2023

@rocky, notice that this part was one of the most polemical in the discussion of #766. The reasons I have for insisting in this are:

  • Provides a closer behavior to the WMA, so I can build tests that are compatible with its outputs.
  • Despite the implementation of eval_Power_can be improved, I think that splitting it from the symbol definition would help to improve it in the future.
  • In particular, this could help to think about how to improve the sympy back and forward conversions, if we still want to use it for these arithmetic evaluations.

@rocky
Copy link
Member

rocky commented Jul 21, 2023

@rocky, notice that this part was one of the most polemical in the discussion of #766. The reasons I have for insisting in this are:

  • Provides a closer behavior to the WMA, so I can build tests that are compatible with its outputs.
  • Despite the implementation of eval_Power_can be improved, I think that splitting it from the symbol definition would help to improve it in the future.
  • In particular, this could help to think about how to improve the sympy back and forward conversions, if we still want to use it for these arithmetic evaluations.

Ok - noted. I note this and will try to take a look at over the weekend.

@@ -520,6 +528,8 @@ class Power(BinaryOperator, _MPMathFunction):
rules = {
"Power[]": "1",
"Power[x_]": "x",
"Power[I,-1]": "-I",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In master, this rule is not needed.: Power[I, -1] returns -I.

In this branch though I am seeing (-I) ^ (-1 / 2). Which one matches WMA. And why the discrepancy?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point is that certain rules implemented implicitly by Sympy are not available now because I wrote a different implementation that does not rely too much on Sympy. At this moment I do not remember exactly which test fails if we remove the new rule, but it is easy to check.

Copy link
Member

@rocky rocky Jul 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not consider your different implementation as a good thing but a bad thing. Given the choice of relying on Sympy code or mmatera code we should go with Sympy. It is more used, more maintained ,and generally more well thought out and more complete.

Our job should be to figure out how to get WMA things done in Sympy when it can.

@@ -319,48 +321,27 @@ def eval_complex_sign(n: BaseElement) -> Optional[BaseElement]:
sign = eval_RealSign(expr)
Copy link
Member

@rocky rocky Jul 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I now see that SymPy has a "Sign" function that works over reals and complexes.

I missed this fact before. We should be using that.

Copy link
Contributor Author

@mmatera mmatera Jul 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sympy.Sign works like WL Sign and, even if it is true to force sympy.Sign as WL RealSign, it would mean doing an extra work in conversion, by checking that the expression is a Real expression. This can be done by hacking to_sympy to handle RealSign in a special way, but that work is more or less equivalent to performing the actual computation in pure Python, with the overhead of the back-and-forth conversion.

For now, please you should not worry about the overhead of back-and-forth conversion this way.

I think we can reduce conversions without custom code and actually down the line we will get a better result if we can just have sympy work on larger chunks of sympy code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, Sign[3 - 4 I] can not be evaluated by using only sympy. The problems of using sympy.sign are shown in the tests by replacing the implementation of mathics.eval.arithmetic.eval_Sign by

def eval_Sign(expr: BaseElement) -> Optional[BaseElement]:
    """
    if expr is a number, return its sign.
    """
    sympy_expr = expr.to_sympy()
    if sympy_expr is not None:
        print("   sympy_expr", sympy_expr)
        result = from_sympy(sympy.sign(sympy_expr))
        if result is not None:
            return result
    return None

In a similar way, if we have expressions with products of negative rational powers, the work of convert back the product to Mathics is more or less the same than evaluating the product without sympy.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is what I see when I evaluate the Python/Sympy equivalent of Sign[3 - 4I]:

>>> from sympy import *
>>> sign(3 - 4j)
0.6 - 0.8*I
>>> 

It looks like Complex.to_sympy() is not doing the right thing.

@@ -153,8 +153,8 @@ def test_multiply(str_expr, str_expected, msg):
("a b DirectedInfinity[q]", "a b (q Infinity)", ""),
# Failing tests
# Problem with formatting. Parenthezise are missing...
# ("a b DirectedInfinity[-I]", "a b (-I Infinity)", ""),
# ("a b DirectedInfinity[-3]", "a b (-Infinity)", ""),
("a b DirectedInfinity[-I]", "a b (-I Infinity)", ""),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this branch I am not seeing parenthesis around the result. I assume WMA adds parenthesis?

I wonder if this is a formatting problem or a precedence problem

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In WMA

In[1]:= a b DirectedInfinity[-I]                                                

Out[1]= a b (-I Infinity)

In[2]:= %//FullForm                                                             

Out[2]//FullForm= Times[a, b, DirectedInfinity[Complex[0, -1]]]

On the other hand, in master,

In[1]:= a b DirectedInfinity[-I]
Out[1]= a b -I Infinity

In[2]:= %//FullForm
Out[2]//FullForm= Times[a, b, DirectedInfinity[Complex[0, -1]]]

Notice that the first output is wrong, and indeed, is a formatting problem.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the information. It looks like this is an operator precedence problem.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I see the problem now: in this branch, the test is marked as "skip". I solved this precedence issue in another branch, and for some reason, the change was lost. I will try to put this fix and tests in another PR.

# ("a b DirectedInfinity[-I]", "a b (-I Infinity)", ""),
# ("a b DirectedInfinity[-3]", "a b (-Infinity)", ""),
("a b DirectedInfinity[-I]", "a b (-I Infinity)", ""),
("a b DirectedInfinity[-3]", "a b (-Infinity)", ""),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same thing here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In any case, the test seems to be right. The problem is that if we activate it, it does not pass it. I can revert the change, but it will not have any effect until the underlying problem gets fixed, and the tests get activated.

@@ -197,7 +197,7 @@ def test_directed_infinity_precedence(str_expr, str_expected, msg):
("I^(2/3)", "(-1) ^ (1 / 3)", None),
# In WMA, the next test would return ``-(-I)^(2/3)``
# which is less compact and elegant...
# ("(-I)^(2/3)", "(-1) ^ (-1 / 3)", None),
("(-I)^(2/3)", "(-1) ^ (-1 / 3)", None),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. master returns -1 right now, so what we have is wrong.

Do you know what part of the code fixes this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is related to the new rules you mention before...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nop. The rule at the end seems to be useless. The computation is done by mathics.eval.arithmetic.eval_Power_number, which now is called from Power.eval_check ( mathics.builtin.arithfns.basic, line 591).

"System`TraditionalForm": "Sqrt[1 / (1+1 / (1+1 / a))]",
"System`InputForm": "Sqrt[1 / (1 + 1 / (1 + 1 / a))]",
"System`OutputForm": "Sqrt[1 / (1 + 1 / (1 + 1 / a))]",
"System`StandardForm": "1 / Sqrt[1+1 / (1+1 / a)]",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the change in test?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the effect of implementing the associativity of powers. In master,

Sqrt[1 / (1 + 1 / (1 + 1 / a))]//FullForm

produces

Power[Power[Plus[1, Power[Plus[1, Power[a, -1]], -1]], -1], Rational[1, 2]]

while here,

Power[Plus[1, Power[Plus[1, Power[a, -1]], -1]], Rational[-1, 2]]

On the other hand, looking at WMA, it seems that it does not implement the associativity in this case either.

# if the result can not be evaluated, or is trivial. For example, if we call eval_Power_number(Integer2, RationalOneHalf)
# it returns ``None`` instead of ``Expression(SymbolPower, Integer2, RationalOneHalf)``.
# The reason is that these functions are written to be part of replacement rules, to be applied during the evaluation process.
# In that process, a rule is considered applied if produces an expression that is different from the original one, or
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something is weird here in terminology.

In normal terminology, a rule is applied if it matches. period. It has nothing to do with whether things change. The WMA evaluation process has decided to look for fixed points for its own reasons.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, but this is not how Mathics builtin rules work. I guess that the reason to do this is to allow to use simpler patterns in BuiltinRules. For example, suppose you want to define a rule of the form

F[x___/;cond]:=(evaluation...)

Sometimes, to evaluate cond is expensive, and can be done as a part of the evaluation process. Then, to avoid duplicating an expensive computation, in a BuiltinRule we can define
F[x___]:=(evaluation...)
and eventually, if in the evaluation we determine that x1,x2,... does not satisfy the condition, we can return None, so the evaluator understands that the rule didn't match.

A different situation would be to have a rule like F[x___]->(eval) that for a certain input F[x1,x2,...] must return the same function (i.e. is invariant). The evaluator indeed can rebuild the expression, but as it is a new expression, it is very expensive to determine (from outside) that both expressions are the same, and we reach a fixed point. Returning None avoids the need of this comparison.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To simplify the discussion of this PR, I removed the comment, so we can go back to this in another round.

@@ -688,8 +694,88 @@ def eval_Times(*items: BaseElement) -> BaseElement:
)


# Here I used the convention of calling eval_* to functions that can produce a new expression, or None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this comment. Based on past experience, I have a sinking feeling this is going off into some misguided direction. (I think of the days when text substitutions on MathML boxes was proposed as a way of addressing formatting?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eval_ methods are used as the function attribute of a BuiltinRule. When the rule is applied, the eval_ method is filled with the parameters from the pattern, and produce an output. If the output is None, the rule is assumed to "fail" and the original expression is kept. Otherwise, if the eval_ method returns an Element, then this element is used as the result of the replacement, and the rule is considered "applied".

Here we have certain eval_ functions that are not eval_ methods but can be used as a part of them. If you are up, please help me to rephrase the comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative way to solve the same problem would be to add a parameter expression into the eval_ methods. Then, we can just check if the result of the replacement is the same as the original expression without any extra cost. However, it will take to change the code in many many places...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To simplify this PR, I am going to remove these comments, that still are available in #766.

#


def associate_powers(expr: BaseElement, power: BaseElement = Integer1) -> BaseElement:
Copy link
Member

@rocky rocky Jul 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure this kind of thing can't be done in Sympy. Have we tried this?

For another thing brought up and in a previous PR, with Sign, I see in that a problem there mentioned in StackOverflow was just about not tagging a variable as being in the right type (complex vs real).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point is not if we can find a way to use Sympy, but if we can do it "natively", without the cost of back-and-forth conversions. Even if we can make these conversions more efficient, avoiding them at all would always be a win, mostly if the native implementation is not too complicated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment above.

@rocky
Copy link
Member

rocky commented Jul 24, 2023

It feels to me there is more here than just

@rocky, notice that this part was one of the most polemical in the discussion of #766. The reasons I have for insisting in this are:

  • Provides a closer behavior to the WMA, so I can build tests that are compatible with its outputs.

I would like more detail here. Be specific about what behavior is not close. Elsewhere I see specific differences in output even with this PR. I think we need to understand in better detail what it is that is making the behavior different.

  • Despite the implementation of eval_Power_can be improved, I think that splitting it from the symbol definition would help to improve it in the future.

I am not opposed in theory to having something that can let us control how power works. But again, I'd appreciate understanding exactly what. Here I see the associativity of turning repeated powers into multiplications. But I am not convinced Sympy can't do this as well. Or that we couldn't do this via a rule.

  • In particular, this could help to think about how to improve the sympy back and forward conversions, if we still want to use it for these arithmetic evaluations.

Until a good cases are made for not using sympy for arithmetic evaluation let's use sympy for arithmetic evaluation. In this PR I see that for I^(2/3) where we get the wrong result, but it is not clear to me that this is the fault of Sympy versus how we feed stuff to Sympy.

So let's take some specific examples like this see what's up here for specific examples.

@rocky
Copy link
Member

rocky commented Jul 24, 2023

This code still seems to have bits of code that aren't associated with arithmetic power. Please let's split off anything that is not related to the association stuff and put that in a different PR so we can disentangle whatever issues those have from the association part.

@rocky rocky marked this pull request as draft July 30, 2023 02:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants