Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kfactor maintenance and documentation #161

Merged
merged 39 commits into from
Aug 21, 2024
Merged

kfactor maintenance and documentation #161

merged 39 commits into from
Aug 21, 2024

Conversation

giacomomagni
Copy link
Contributor

@giacomomagni giacomomagni commented Mar 15, 2024

Address #160.

I'd like to:

  • document the code
  • allow for nnpdf settings, ie drop the explicit yaml_db.
  • simplify the code, also making use of the new pineappl functions

@giacomomagni giacomomagni marked this pull request as draft March 15, 2024 13:26
@giacomomagni giacomomagni linked an issue Mar 15, 2024 that may be closed by this pull request
@felixhekhorn felixhekhorn added the refactor Refactor code label Mar 15, 2024
@andreab1997
Copy link
Contributor

Should I review this immediately or wait until at least the tests pass?

@giacomomagni
Copy link
Contributor Author

giacomomagni commented Mar 15, 2024

Should I review this immediately or wait until at least the tests pass?

Feel free to comment, I'll try to fix tests in the meantime.
And it's not yet working correctly.

src/pineko/kfactor.py Outdated Show resolved Hide resolved
@giacomomagni
Copy link
Contributor Author

giacomomagni commented Mar 15, 2024

@andreab1997 the benchmark is not passing because (3,0,0,0) is already in the grid ATLAS_TTB_FAKE,
is this correct?
How come this was not there before ?? Or maybe I'm messing up with the syntax and I should ask for (4,0,0,0) ?

@andreab1997
Copy link
Contributor

andreab1997 commented Mar 15, 2024

@andreab1997 the benchmark is not passing because (3,0,0,0) is already in the grid ATLAS_TTB_FAKE, is this correct? How come this was not there before ?? Or maybe I'm messing up with the syntax and I should ask for (4,0,0,0) ?

I do not remember by hearth. Did you change the grids being tested?

EDIT: Which are exactly the orders in the grid?

@giacomomagni
Copy link
Contributor Author

I do not remember by hearth. Did you change the grids being tested?

No I did not...
So the grids contains:

[array([2, 0, 0, 0]), array([3, 0, 0, 0]), array([3, 0, 1, 0]), array([3, 0, 0, 1])]

@andreab1997
Copy link
Contributor

I do not remember by hearth. Did you change the grids being tested?

No I did not... So the grids contains:

[array([2, 0, 0, 0]), array([3, 0, 0, 0]), array([3, 0, 1, 0]), array([3, 0, 0, 1])]

Okay then yes, of course in this case the benchmark was meant to compute an additional order, the (4,000). Instead you are using it to update (3,000) which however is already there, so it fails. So yes, you should update the (4,0,0,0). I think now that the name order_to_update is not that good, because in this case you are actually adding an order, not updating it. The update part is only when you use order_exist, but in general you want to create new orders, not updating them. I guess this was the reason of the confusion

@giacomomagni
Copy link
Contributor Author

I left 2 todos which I don't know if or how we want to solve:

# TODO: can we make this function simpler ??

# TODO: generalize for other type of kfactors ?
cfac_path = kfactor_folder / f"CF_QCD_{grid}.dat"
if "ATLASDY2D8TEV" in grid:
cfac_path = kfactor_folder / f"CF_QCDEWK_{grid}.dat"

But apart from that the rest should be a bit nicer now, if you want please review.

@giacomomagni giacomomagni marked this pull request as ready for review March 16, 2024 09:27
@andreab1997
Copy link
Contributor

I left 2 todos which I don't know if or how we want to solve:

# TODO: can we make this function simpler ??

# TODO: generalize for other type of kfactors ?
cfac_path = kfactor_folder / f"CF_QCD_{grid}.dat"
if "ATLASDY2D8TEV" in grid:
cfac_path = kfactor_folder / f"CF_QCDEWK_{grid}.dat"

But apart from that the rest should be a bit nicer now, if you want please review.

Ok thanks for this, I will review this ASAP.

Regarding the two TODOs:

# TODO: can we make this function simpler ??

This has already been simplified a lot in the past (it was even worst). Personally I would not know how to further simplify it, other than dividing it in unit functions (which is still not trivial).

# TODO: generalize for other type of kfactors ?
cfac_path = kfactor_folder / f"CF_QCD_{grid}.dat"
if "ATLASDY2D8TEV" in grid:
cfac_path = kfactor_folder / f"CF_QCDEWK_{grid}.dat"

This might be a good idea but the problem is that, while for QCD kfactor it is rather trivial to know what they do, other kfactors can be defined in an arbitrary way which makes particulary difficult to generalize the equations that we implement here. We can try to assume a general behaviour and put some options for the rest but I believe that it easily become ugly

@alecandido
Copy link
Member

This part is just creating an empty grid like the other, replacing orders:

bin_limits = [float(bin) for bin in range(ori_grid.bins() + 1)]
lumi_grid = [pineappl.lumi.LumiEntry(mylum) for mylum in ori_grid.lumi()]
subgrid_params = pineappl.subgrid.SubgridParams()
ori_grid_orders = orders_as_tuple(ori_grid)
new_orders = [
pineappl.grid.Order(*ord)
for ord in ori_grid_orders
if ord != to_construct_order
]
new_grid = pineappl.grid.Grid.create(
lumi_grid, new_orders, bin_limits, subgrid_params
)

and here you're iterating to filter the subgrids to export:

orders_indeces = [ori_grid_orders.index(order.as_tuple()) for order in new_orders]
for order_index in orders_indeces:
for lumi_index in range(len(lumi_grid)):
for bin_index in range(ori_grid.bins()):
extr_subgrid = ori_grid.subgrid(order_index, bin_index, lumi_index)
new_grid.set_subgrid(
orders_indeces.index(order_index),
bin_index,
lumi_index,
extr_subgrid,
)

This could be possible with a single iteration, by making a clever usage of the IndexedIter.

Moreover, constructing grids from filtering existing ones could be a feature for PineAPPL itself (if you filter with indices, instead of callables, there is no problem of exposing in the non-Rust APIs).

This is just a function of ori_grid and nothing else:

bin_dimension = ori_grid.bin_dimensions()
limits = []
for num_bin in range(ori_grid.bins()):
for dim in range(bin_dimension):
limits.append(
(
ori_grid.bin_left(dim)[num_bin],
ori_grid.bin_right(dim)[num_bin],
)
)
norma = ori_grid.bin_normalizations()
remap_obj = pineappl.bin.BinRemapper(norma, limits)

This is also a separate piece:

# propagate metadata
for k, v in ori_grid.key_values().items():
new_grid.set_key_value(k, v)

In any case, this function altogether is just the filtering one that I mentioned, and then taking care of propagating additional pieces of information. If you could delete subgrids, just a full copy and then drop would be simpler.
The nicest way I see to achieve the goal would be to have a GridView object, such that you could obtain a consistent filtered view without copying subgrids, and then copy-on-write (or just provide a way to dump the view).

However, most of these improvements would require to do something inside PineAPPL itself.

@felixhekhorn
Copy link
Contributor

I still have to have a look at the two most important files, but I'll wait until @felixhekhorn is happy with them

please go ahead @andreab1997 and start looking and also committing if you see things that you can be improved

src/pineko/kfactor.py Outdated Show resolved Hide resolved
@giacomomagni
Copy link
Contributor Author

giacomomagni commented Mar 26, 2024

Given #159 I'd remove the yamldb also from here and just pass the single grids and kfactor files one by one with the CLI. Is that okay @andreab1997 @felixhekhorn ?

@giacomomagni
Copy link
Contributor Author

giacomomagni commented Jun 10, 2024

Since we need to burn kfactors, and we need to do it for the STAR W asymmetry,
st this point also this PR is waiting for #181 to be closed and pineappl dependency to be updated.

@giacomomagni
Copy link
Contributor Author

Can we close this, before we forget what we have done ?

@felixhekhorn
Copy link
Contributor

felixhekhorn commented Aug 21, 2024

I opened #191 and let's not close the discussions here.

Then let's merge (now) - this is getting really old

@giacomomagni
Copy link
Contributor Author

@andreab1997 are u okay?

@andreab1997
Copy link
Contributor

@andreab1997 are u okay?

yes, please go on merging this

@giacomomagni giacomomagni merged commit 4452b48 into main Aug 21, 2024
6 checks passed
@giacomomagni giacomomagni deleted the issue_160 branch August 21, 2024 15:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
refactor Refactor code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

kfactor module maintenance
4 participants