
[Feature, Performance] Optimize HamEvo for commuting Hamiltonians #346

Closed
jpmoutinho opened this issue Feb 16, 2024 · 8 comments
Labels
digital-analog performance to_check To check in the development of the new expression system.

Comments

@jpmoutinho
Collaborator

jpmoutinho commented Feb 16, 2024

Currently, the generator of a HamEvo is always exponentiated fully, without checking for commutation relations. However, generators composed of several mutually commuting parts can be exponentiated separately. We can make some of those checks automatic in the instantiation of HamEvo and then optimize the calculation in the backend. Below is an example script doing it manually to showcase the potential:

from qadence import N, add
from qadence import HamEvo, run, chain, random_state
from qadence import equivalent_state

import time

n_qubits = 10

# Consider a generator composed of commuting terms, e.g. a neutral-atom interaction on a line register
gen_full = add(N(i) @ N(i + 1) for i in range(n_qubits - 1))

# Let's split it in two parts, such that gen_full = gen_part_1 + gen_part_2
gen_part_1 = add(N(i) @ N(i + 1) for i in range(n_qubits // 2))
gen_part_2 = add(N(i) @ N(i + 1) for i in range(n_qubits // 2, n_qubits - 1))

# Because the two generators commute, they can be exponentiated separately as smaller generators
op_split = chain(HamEvo(gen_part_1, 1.0), HamEvo(gen_part_2, 1.0))

# Here we exponentiate the two together
op_full = HamEvo(gen_full, 1.0)

# Some random initial state
state_init = random_state(n_qubits)

start = time.time()
state_split = run(n_qubits, op_split, state=state_init)
mid = time.time()
state_full = run(n_qubits, op_full, state=state_init)
end = time.time()

# Check that indeed we get the same result
assert equivalent_state(state_split, state_full, atol=1e-12)

# Check the time taken for both calculations
print("Split exponentiation: ", mid - start)
print("Full exponentiation: ", end - mid)
Split exponentiation:  0.010470867156982422
Full exponentiation:  0.6789381504058838

To implement this, there is already a block_is_commuting_hamiltonian function in qadence.blocks.utils that would be useful. It seems efficient, but should be reviewed. This function just returns True or False, so a first implementation could be based on it: essentially, if it returns True, every term in the AddBlock is separated into its own matrix to be exponentiated.
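As a hedged sketch of what that first implementation could look like, here is a minimal numpy mock-up. Dense matrices stand in for qadence blocks, and expm_herm, terms_commute, and evolve are hypothetical helper names, not qadence API:

```python
# Sketch only: dense numpy matrices stand in for qadence blocks.
import numpy as np

def expm_herm(h: np.ndarray, t: float) -> np.ndarray:
    """exp(-i t H) for a Hermitian H, via eigendecomposition."""
    evals, evecs = np.linalg.eigh(h)
    return evecs @ np.diag(np.exp(-1j * t * evals)) @ evecs.conj().T

def terms_commute(terms: list, atol: float = 1e-10) -> bool:
    """Pairwise commutation check, analogous to block_is_commuting_hamiltonian."""
    return all(
        np.allclose(a @ b, b @ a, atol=atol)
        for i, a in enumerate(terms)
        for b in terms[i + 1:]
    )

def evolve(terms: list, t: float) -> np.ndarray:
    """If all terms commute, exponentiate them separately and multiply;
    otherwise fall back to exponentiating the full sum."""
    if terms_commute(terms):
        u = np.eye(terms[0].shape[0], dtype=complex)
        for h in terms:
            u = expm_herm(h, t) @ u
        return u
    return expm_herm(sum(terms), t)
```

Note that in this mock-up every term is still a full-size matrix, so there is no actual speedup; the gain in qadence would come from exponentiating each commuting term as a smaller matrix on its own support, as in the example above.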

A next level would be to look for optimizations even when the function returns False, by aggregating the terms into mutually commuting groups and exponentiating each group separately. However, I suspect this would not be straightforward.
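A possible starting point for that grouping is a simple greedy pass. This is a sketch with numpy matrices standing in for blocks; group_commuting is a hypothetical name, and greedy grouping is generally not optimal:

```python
# Sketch only: greedily aggregate terms into groups of mutually commuting matrices.
import numpy as np

def commutes(a: np.ndarray, b: np.ndarray, atol: float = 1e-10) -> bool:
    return np.allclose(a @ b, b @ a, atol=atol)

def group_commuting(terms: list) -> list:
    """Place each term into the first existing group whose members all
    commute with it, opening a new group otherwise."""
    groups = []
    for h in terms:
        for g in groups:
            if all(commutes(h, other) for other in g):
                g.append(h)
                break
        else:
            groups.append([h])
    return groups
```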

Related to #134, since this would reduce to calling block_to_tensor on each of the smaller commuting terms.

@jpmoutinho
Collaborator Author

@vytautas-a as we discussed, I suspect a first implementation of this may only require dealing with the HamEvo class in qadence.operations, but I'm not sure about a clean way to do it. The digital_decomposition method is relevant.

HamEvo instantiates a TimeEvolutionBlock, which is then handled by convert_ops.py in the pyqtorch backend (and also the horqrux backend), so all the needed code should be in these files.

@rajaiitp
Collaborator

From what I can see, HamEvo checks whether the qubit Hamiltonian is made of commuting terms and generates a gate representation of it. So why does it take different execution times when the generator is specified as one chunk versus as separate commuting chunks, if both have an identical digital decomposition to implement in the end?

@jpmoutinho
Collaborator Author

From what I can see, HamEvo checks whether the qubit Hamiltonian is made of commuting terms and generates a gate representation of it. So why does it take different execution times when the generator is specified as one chunk versus as separate commuting chunks, if both have an identical digital decomposition to implement in the end?

Actually, it currently only does that if you call HamEvo.digital_decomposition(). Otherwise it builds the full matrix for the Hamiltonian and then exponentiates it, which is why the split version is faster in my example: it exponentiates two smaller matrices instead of one large one.

But the suggestion here is exactly to make the commutation check automatic and optimize based on it.

@rajaiitp
Collaborator

Cool, understood.
I would like to add a few points I thought about regarding the design of the algorithm:

  • If commuting operators act on the same set of qubits, it is good to keep them together, e.g. Z_1Z_2 + X_1X_2.
  • If a generator term has support that encompasses the support of other terms, those terms need to be included with it, e.g. Z_1Z_2Z_3 + Z_2 + Z_3.
  • Start with the Z basis; if two generators have only a small overlap, keep them separate, like Z_1Z_2 separate from Z_2Z_3. Use other non-commuting generators, like X_2, as the glue for these separate groups to come together.
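For Pauli-string generators like the ones in these points, the commutation check does not even require building matrices: two Pauli strings commute iff they disagree on an even number of qubits where both act non-trivially. A minimal sketch, assuming a plain string representation like "ZZI" (not qadence API):

```python
def pauli_strings_commute(p: str, q: str) -> bool:
    """Two equal-length Pauli strings commute iff the number of positions
    where both act non-trivially with different Paulis is even."""
    assert len(p) == len(q)
    anticommuting = sum(
        1 for a, b in zip(p, q)
        if a != "I" and b != "I" and a != b
    )
    return anticommuting % 2 == 0
```

For example, Z_1Z_2 and Z_2Z_3 commute (they agree on their single overlapping qubit), while X_2 anticommutes with both.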

@rajaiitp
Collaborator

Found a paper about partitioning a Pauli decomposition into collections of mutually commuting terms: https://arxiv.org/pdf/1908.06942.pdf

Apparently there is a lot of work on this, mostly in the direction of efficient observable estimation with minimal shots, but I think it can be used here as well.

@jpmoutinho
Collaborator Author

Nice @rajaiitp! Yes, this seems relevant.

@jpmoutinho
Collaborator Author

Also came across this one: https://arxiv.org/pdf/1907.09040.pdf

@jpmoutinho
Collaborator Author

Closing after opening in PyQTorch: pasqal-io/pyqtorch#177
