Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

First jets implementation #242

Merged
merged 154 commits into from
May 19, 2023
Merged

First jets implementation #242

merged 154 commits into from
May 19, 2023

Conversation

alecandido
Copy link
Member

@alecandido alecandido commented Apr 4, 2023

Supersedes #217

  • complete OperatorGrid migration
  • add inventories initialization
  • restore old runner
    • it will be dropped in the next PR, but let's use one last time to benchmark the new one
  • fix tests in the workflow
  • fix isolated benchmarks
  • fix benchmarks
  • check managed runner equivalence with legacy
  • revert default solve
    # TODO: drop this before merging #242
    solve = solve_jets
  • check and fix tutorials
  • src/eko still contains some q2 in names or docstrings, you might want to update
  • decide about normalize

@alecandido alecandido mentioned this pull request Apr 4, 2023
2 tasks
@felixhekhorn felixhekhorn added refactor Refactor code output Output format and management labels Apr 4, 2023
@alecandido
Copy link
Member Author

alecandido commented Apr 17, 2023

@felixhekhorn while propagating the new mu2grid I noticed a couple of things:

  1. we have a mostly unused parameter of some operator operations of EKO, i.e. compress - should we assume that we always compress our operators and simplify the implementation?
  2. since now we are specifying what I called "evolution points" (i.e. $(\mu^2, n_f)$, alternative proposals are welcome), maybe it would be more consistent to collapse the nf0 parameter with mu0, to have a single evolution point

compress is an optional argument of some internal functions, and I do not expect it to be used anywhere, so it should be non-breaking. nf0 is EKO specific (in the sense that is not included in the NNPDF theory), but it would break both theory card and operator card.

@felixhekhorn
Copy link
Contributor

  1. I agree - we can assume it to be always compressed (which in practice is always the case)
  2. indeed this a bit more tricky since it is another (more or less) big break ... I've no clear opinion ... maybe better to postpone this to after this PR?

@alecandido
Copy link
Member Author

Ok, let's keep 2. on hold, I will open an issue for that.

Since most of the support for "numerical FONLL" essentially comes from ThresholdAtlas[*], I was wondering whether we could improve/simplify it (we're not touching since long, but we have a lot more insight now than when we wrote it):

  1. masses and ratios are passed separately, but I'm pretty sure it only depends on the product, so I'd propose to just pass the matching_scales (if possible naming it scales, since matching is clear from the context)
    thresholds = self.build_area_walls(sorted_masses, thresholds_ratios, max_nf)

    eko/src/eko/thresholds.py

    Lines 164 to 169 in d0f6e7a

    thresholds = []
    for m, k in zip(masses, thresholds_ratios):
    thresholds.append(m * k)
    # cut array = simply reduce some thresholds
    thresholds = thresholds[: max_nf - 3]
    return thresholds
  2. I would drop the check on sorted masses: it is not incredibly useful (we should be able to solve anyhow), and we can assume this to be up to the user (at most we could lift to the runcard, but I would avoid even that)
    sorted_masses = sorted(masses)
    if not np.allclose(masses, sorted_masses):
    raise ValueError("masses need to be sorted")
  3. I would consider dropping the max_nf argument, since it is redundant: we can lift some matching scales to infinity, and in practice it's what is done (the only reason to keep it there was to propagate the NNPDF theory setting, but we already diverged, so this can be done in the converter)
    max_nf: Optional[int] = None,

    thresholds = thresholds[: max_nf - 3]
  4. Btw, if not 3., I would at least pad the thresholds, in order to have a fixed amount of them
  5. I would make also the reference point a single EPoint, and maybe even consider making it non-optional. At the moment we have two strategies: either specify an Atlas with a reference point, or pass it whenever you call the methods. This violates an important principle, i.e. There should be one-- and preferably only one --obvious way to do it., with the consequence of making the usage more involved (I was wondering myself why .path() had _from parameters a while ago).
    On the one hand, I'd rather keep the from/to in .path() and just store the scales in the Atlas, it seems more flexible and elegant. But on the other, I see how the reference point is mostly fixed, so it is much more suitable to carry it with the Atlas as well.
  6. Following 5., I'd propose to drop the related checks in the Atlas initialization:

    eko/src/eko/thresholds.py

    Lines 83 to 101 in d0f6e7a

    if nf_ref is not None:
    if q2_ref is None:
    raise ValueError(
    "Without a reference Q2 value a reference number of flavors "
    "does not make sense!"
    )
    # else self.q2_ref is not None
    nf_init = 2 + len(list(filter(lambda x: np.isclose(0, x), self.area_walls)))
    if nf_ref < nf_init:
    raise ValueError(
    f"The reference number of flavors is set to {nf_ref}, "
    f"but the atlas starts at {nf_init}"
    )
    nf_final = 2 + len(list(filter(lambda x: x < np.inf, self.area_walls)))
    if nf_ref > nf_final:
    raise ValueError(
    f"The reference number of flavors is set to {nf_ref}, "
    f"but the atlas stops at {nf_final}"
    )
    • if we specify an EPoint, we'll be giving both nf and the scale
    • I would simplify anyhow the initialization, dropping consistency checks: the Atlas would be initialized even with an inconsistent reference point (e.g. in an unreachable nf), but if you don't use it, nothing would happen; the error will be raised as late as possible, whenever we'll encounter some impossible computation (so make it overall lazier, mostly with the goal of simplifying the code)

[*]: at some point I'd like to rename thresholds.py -> matching.py/landscape.py, to avoid the misleading word, and ThresholdAtlas to simply Atlas, to avoid redundancy (Atlas is clear enough, we just need good documentation, and it is already good enough)

@felixhekhorn
Copy link
Contributor

Well, now you're suggesting a much more drastic change than before 🙃 (e.g. because I know this would also have a highly non-trivial impact on yadism) - while I agree we may want to update a bit, are you really sure you want to do this here? this PR will already be big and I think we could decouple this change ... let me comment on your points nevertheless:

at some point I'd like to rename

even fine (as long as we keep the name "Atlas" - which I like 😇 )

  1. Mmm ... in practice when you're crossing a threshold you're always interested in the ratio as this will enter your matching conditions as an explicit parameter, so not sure about dropping it here ...
  2. I guess we can (as long as tests keep passing) ...
  3. either is or
  4. this is fine, I guess ...
  5. then we should follow what we do in practice: have a fixed point at construction time and no "second" argument in path
  6. well, we should keep the possibility to specify 0 or -1 as nf_ref - when it will be auto determined

@alecandido
Copy link
Member Author

Mmm ... in practice when you're crossing a threshold you're always interested in the ratio as this will enter your matching conditions as an explicit parameter, so not sure about dropping it here ...

It is not used at all by the atlas. Let's decouple tasks: the Atlas is drawing the path, if we need the ratios information for something else, we will propagate on its own.

@alecandido
Copy link
Member Author

alecandido commented Apr 19, 2023

  • either is or

  • this is fine, I guess ...

Among the two, I prefer 3. (or both).

  • well, we should keep the possibility to specify 0 or -1 as nf_ref - when it will be auto determined

I already wrote a function for the automated determination:

eko/src/eko/io/runcards.py

Lines 388 to 406 in d0f6e7a

def flavored_mugrid(mugrid: list, masses: list, matching_ratios: list):
r"""Upgrade :math:`\mu^2` grid to contain also target number flavors.
It determines the number of flavors for the PDF set at the target scale,
inferring it according to the specified scales.
This method should not be used to write new runcards, but rather to have a
consistent default for comparison with other softwares and existing PDF
sets.
There is no one-to-one relation between number of running flavors and final
scales, unless matchings are all applied. But this is a custom choice,
since it is possible to have PDFs in different |FNS| at the same scales.
"""
tc = ThresholdsAtlas(
masses=(np.array(masses) ** 2).tolist(),
thresholds_ratios=(np.array(matching_ratios) ** 2).tolist(),
)
return [(mu, int(tc.nf(mu**2))) for mu in mugrid]

So, the method you're asking for is Atlas.nf().

If ever we will do something like this, we should use None, that is much more explicit, but I would avoid usage in the current runcards, and keep it:

  • for legacy upgrade (thus where it is)
  • or as a tool for writing runcards in ekobox

Following the spirit of the default-less strategy, we do not want to recommend a specific number of flavors at a given scale, since they are all available.
(remark: I'd keep following just the spirit of the default-less strategy, and instead being much milder on the strategy application itself, since some sensible defaults simplify a lot the usage, marking the options as "advanced" - but always preferring to keep the code simple, without complex parameter determinations happening behind the scenes)

@alecandido
Copy link
Member Author

I updated thresholds.py, and it is now much cleaner!

The amount of code left is minimal, and everything should be well documented. However, it broke EKO completely.

Since I already broke it, I decided to go all the way in and change even the name (still very open to alternatives, please propose), it will make it even simpler to detect all the places in which it is used.

src/eko/matchings.py Outdated Show resolved Hide resolved
src/eko/matchings.py Outdated Show resolved Hide resolved
@felixhekhorn
Copy link
Contributor

I wonder whether we should make a release before merging this one ... this is due to the current (=master) version being already propagated NNPDF/pineappl#227 and due to the bad experiences with #172

@alecandido
Copy link
Member Author

I wonder whether we should make a release before merging this one ... this is due to the current (=master) version being already propagated NNPDF/pineappl#227 and due to the bad experiences with #172

The actual merge is not going to happen immediately, so I would reconsider the situation in a while.
In case, if we are already breaking with something else, we could make a pre-release of the next minor, make also this one into a pre-release, and eventually stabilize with an actual minor when it will be proved to work.

@alecandido
Copy link
Member Author

@felixhekhorn in 7f8a628 I tried to simplify the xgrid reshape function, since that one and the corresponding flavor one are quite complex and very redundant.

Of course, I started by fixing the corresponding test, that was failing for a different reason...

However, there is still a lot to be improved: the similarity with the flavor one is now hidden and the new is still too complex, in order to account for the factorization of the two (xgrid and flavor) into some shared pieces.

  1. this function is almost trivial, but if we use over and over is better factored out:
    def rotation(new: Optional[Basis], old: Basis, check: Callable, compute: Callable):
    """Define grid rotation.
    This function returns the new grid to be assigned and the rotation computed,
    if the checks for a non-trivial new grid are passed.
    However, the check and the computation are delegated respectively to the
    callables `check` and `compute`.
    """
    if new is None:
    return old, None
    if check(new, old):
    warnings.warn("The new grid is close to the current one")
    return old, None
    return new, compute(new, old)
  2. I wonder if swap here is really needed: if what I wrote in the docstring is correct, I'd rather transpose the result, rather than adding a swap argument
    def xgrid_compute_rotation(new: XGrid, old: XGrid, interpdeg: int, swap: bool = False):
    """Compute rotation from old to new xgrid.
    By default, the roation is computed for a target xgrid. Whether the function
    should be used for an input xgrid, the `swap` argument should be set to
    `True`, in order to compute it in the other direction (i.e. the transposed).
    """
    if swap:
    new, old = old, new
    b = interpolation.InterpolatorDispatcher(old, interpdeg, False)
    return b.get_interpolation(new.raw)
  3. here I would have liked to immediately assign eko.rotations.xxxgrid on the same line, but I postponed in order to avoid setting trivial ones when no rotations will be performed
    # construct matrices
    newtarget, targetrot = rotation(
    targetgrid,
    eko.rotations.targetgrid,
    check,
    lambda new, old: crot(new, old, interpdeg),
    )
    newinput, inputrot = rotation(
    inputgrid,
    eko.rotations.inputgrid,
    check,
    lambda new, old: crot(new, old, interpdeg, swap=True),
    )
    # after the checks: if there is still nothing to do, skip
    if targetrot is None and inputrot is None:
    logger.debug("Nothing done.")
    return
    # if no rotation is done, the grids are not modified
    if targetrot is not None:
    eko.rotations.targetgrid = newtarget
    if targetrot is not None:
    eko.rotations.targetgrid = newinput

    I wonder if there is a way to further simplify it...
  4. instead about this part I'm pretty happy, and I believe that we could already factorize and apply also in flavor_reshape
    # build new grid
    for ep, elem in eko.items():
    assert elem is not None
    operands = [elem.operator]
    operands_errs = [elem.error]
    if targetrot is not None and inputrot is None:
    contraction = TARGETGRID_ROTATION
    elif inputrot is not None and targetrot is None:
    contraction = INPUTGRID_ROTATION
    else:
    contraction = SIMGRID_ROTATION
    if targetrot is not None:
    operands.insert(0, targetrot)
    operands_errs.insert(0, targetrot)
    if inputrot is not None:
    operands.append(inputrot)
    operands_errs.append(inputrot)
    elem.operator = np.einsum(contraction, *operands, optimize="optimal")
    if elem.error is not None:
    elem.error = np.einsum(contraction, *operands_errs, optimize="optimal")
    eko[ep] = elem
    eko.update()

Can you confirm I didn't break anything, and try to see if there are ways to further improve, especially the first part (about checks and rotations computation).

@felixhekhorn
Copy link
Contributor

All fine, more or less - but you're adding one giant break on an other ...

@alecandido
Copy link
Member Author

All fine, more or less - but you're adding one giant break on an other ...

Nope, this should not break anything :)

src/eko/io/struct.py Outdated Show resolved Hide resolved
@alecandido
Copy link
Member Author

I finally split struct.py in several modules, I started hating scrolling up and down so much...

There will be a few tests to be fixed :)

@felixhekhorn felixhekhorn linked an issue Apr 28, 2023 that may be closed by this pull request
@alecandido
Copy link
Member Author

@felixhekhorn I will try to maintain a TODO list in the OP, feel free to add some items you believe to be relevant.

If you wish, restoring the old runner and fixing some tests is open for help. At the moment, I will focus on completing the managed runner.

eko.evolution_operator can be greatly improved, including some functions of PhysicalOperator and much involved code. However, I decided to focus on OperatorGrid alone in this PR, to have the alternative runner and nothing else.
The improvement of the individual pieces of the computation will come in a further PR (and this gives a sense to the raw in raw-jets).

@alecandido
Copy link
Member Author

alecandido commented May 11, 2023

I rebased on master after the merge of #258, i.e. py3.11, and it seems that also the tests in this PR are passing even on that Python version.

I believe that we should be good to go. We're not in a special hurry to merge this PR, but we want it a bit sooner, rather than later, because we'd like to use to compute jets, and to finally close a big PR, reducing the level of branching in EKO (this is the last big PR left in EKO).

@felixhekhorn @giacomomagni @niclaurenti

@felixhekhorn
Copy link
Contributor

this is the last big PR left in EKO

I agree on everything, but on that one 😇

@alecandido
Copy link
Member Author

alecandido commented May 12, 2023

Atlas.normalize is only affecting the computation of couplings and MSbar masses.

The main issue is simply that we didn't move consistently to EvolutionPoints in the full eko package, so there are still some functions that make use of bare target scales (like in couplings), and because of this we're allowing nf=None, and this propagates down to the Atlas.

The part involved is not going to be immediately part of the public API, and the only places where this is directly used are ekobox (which we can consistently update in a single PR, living in this repo) and yadism, that is already lacking behind (and using an old version of EKO).

I will open a dedicated issue, and the upgrade will be done in future PRs, while refactoring EKO.

EDIT: issue opened #265

@alecandido alecandido merged commit 67fa988 into master May 19, 2023
@alecandido alecandido deleted the raw-jets branch May 19, 2023 08:42
This was referenced May 23, 2023
@felixhekhorn felixhekhorn mentioned this pull request Aug 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
output Output format and management refactor Refactor code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow user to choose also final scale FNS
6 participants