Skip to content

Network solvers redesign #242

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 10 commits into
base: main
Choose a base branch
from
Open

Network solvers redesign #242

wants to merge 10 commits into from

Conversation

emstoudenmire
Copy link
Contributor

@emstoudenmire emstoudenmire commented Jul 2, 2025

This PR is a rewrite of codes for "sweep solver" algorithms such as DMRG, TDVP, etc.
It introduces a simplified designs for the whole system, especially regarding the creation of "region plans", how the double loop comprising what is now called sweep_solve is coded, and the handling of keyword arguments.

The PR also adds a subspace expansion system with one currently implementation (based on a "projected density matrix perturbation" idea, basically McCulloch+White hybrid method). Having a subspace expansion is crucial to converge DMRG on certain tree lattices, not to mention other cases like 2D DMRG with QN conservation.

Internally, the core design revolves around two iterators: a sweep iterator and a region iterator. Each of these is in principle wrappable by "iteration adapters" (see for example the tuple adapter in adapters.jl). At each iteration of the RegionIterator type, it calls a region_iterator_action function which can be overloaded, but by default calls three "subactions": extracter, updater, and inserter. Currently extracter also calls down into a subspace_expand function which can be customized through multiple backends. These "action" functions all dispatch on a "problem" type which can hold arbitrary data, making these codes more flexible and future-proof toward cases like optimizing two tensor network states at once, working with sets of operators, etc.

Other improvements that may fit better into future PRs:

  • install the "fitting" implementations of truncate and apply already in the NetworkSolvers repo (there were some issues bringing them over)
  • replace the use of ProjTTN with a BP cache
  • more friendly interfaces to methods like dmrg which automatically propagate truncation arguments into more "expert" keyword argument packs
  • helper functions for making keyword parameter packs and propagating defaults
  • discuss what is the best strategy to optionally truncate the bonds in 1-site TDVP, perhaps during the orthogonalize/gauge_walk part of the extracter step
  • improve the subspace expansion code to expand all bonds around the current region and simplify the current code (the first change may already help to simplify the code somewhat)

@emstoudenmire
Copy link
Contributor Author

Fyi, At least one test was failing on my machine, but it didn't seem to be related at all to the code in this PR.

Copy link

codecov bot commented Jul 2, 2025

Codecov Report

Attention: Patch coverage is 0% with 454 lines in your changes missing coverage. Please review.

Project coverage is 0.00%. Comparing base (2d7db49) to head (0a4db77).

Files with missing lines Patch % Lines
src/solvers/subspace/densitymatrix.jl 0.00% 44 Missing ⚠️
src/solvers/applyexp.jl 0.00% 42 Missing ⚠️
src/solvers/iterators.jl 0.00% 42 Missing ⚠️
src/solvers/region_plans/euler_tour.jl 0.00% 37 Missing ⚠️
src/solvers/subspace/ortho_subspace.jl 0.00% 37 Missing ⚠️
src/solvers/region_plans.jl 0.00% 33 Missing ⚠️
src/solvers/region_plans/tdvp_region_plans.jl 0.00% 29 Missing ⚠️
src/solvers/subspace.jl 0.00% 28 Missing ⚠️
src/solvers/eigsolve.jl 0.00% 27 Missing ⚠️
src/solvers/inserter.jl 0.00% 18 Missing ⚠️
... and 12 more

❗ There is a different number of reports uploaded between BASE (2d7db49) and HEAD (0a4db77). Click for more details.

HEAD has 6 uploads less than BASE
Flag BASE (2d7db49) HEAD (0a4db77)
docs 2 1
5 0
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #242       +/-   ##
==========================================
- Coverage   77.39%   0.00%   -77.40%     
==========================================
  Files          77      86        +9     
  Lines        3805    4018      +213     
==========================================
- Hits         2945       0     -2945     
- Misses        860    4018     +3158     
Flag Coverage Δ
docs 0.00% <0.00%> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@emstoudenmire
Copy link
Contributor Author

Is the version check meaning I should just bump the version number? Which version number should I bump to?

@mtfishman
Copy link
Member

Is the version check meaning I should just bump the version number? Which version number should I bump to?

The version check is checking whether or not you've bumped the package version here:

version = "0.13.12"
. The reason for the check is that in general we are trying to enforce that we always bump the package version in every PR and register a new version after the PR is merged, so that versions don't drag out and accumulate too many changes, turnover time to release bug fixes and features for users is faster, etc. I'm sure this PR is breaking in some way so you should bump the version to v0.14.

@@ -0,0 +1,103 @@
using Printf: @printf
import ConstructionBase: setproperties
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
import ConstructionBase: setproperties
using ConstructionBase: setproperties

The difference between import and using in cases like this is that import would allow you to overload setproperties without prepending with the module ConstructionBase, which is a style I avoid now anyway. So in general you should use using instead of import.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a helpful reminder – I had become a bit fuzzy on the reason for preferring using. I'll switch these over and similarly in all other files.

@preserve_graph psi[v] = C
psi = set_orthogonal_region ? set_ortho_region(psi, [v]) : psi
normalize && @preserve_graph psi[v] = psi[v] / norm(psi[v])
return setproperties(problem; state=psi)
Copy link
Member

@mtfishman mtfishman Jul 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer using the style set_state(problem, psi) in general, setproperties doesn't seem to add much here. Though functions like set_state could be implemented using setproperties or as @set problem.state = psi using https://github.com/JuliaObjects/Accessors.jl.

I realize I suggested using setproperties but I was picturing it would be used to help implement more specialized setter functions like set_state, or used in cases where multiple fields are being set. If you are just setting one field I don't see it adding much value to use setproperties directly in the codebase.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. I think my reason was just to save a few lines of code (i.e. not even define the "setters") in the files with the problem objects, but it's only 2-3 lines of code and probably a good idea for those to have setters defined anyway.

end

eigenvalue(E::EigsolveProblem) = E.eigenvalue
ITensorNetworks.state(E::EigsolveProblem) = E.state
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ITensorNetworks.state(E::EigsolveProblem) = E.state
state(E::EigsolveProblem) = E.state

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was getting an error if I didn't prepend ITensorNetworks here, which is why I put it. I'll find out the exact error and maybe we can fix the issue more at the root of what's causing it.

# generates each tuple?
#

mutable struct TupleRegionIterator{RegionIter}
Copy link
Member

@mtfishman mtfishman Jul 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can come up with a name that is more descriptive, like RegionIteratorWithKwargs. Tuple is a bit vague.

current_time::Number = 0.0
end

ITensorNetworks.state(A::ApplyExpProblem) = A.state
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ITensorNetworks.state(A::ApplyExpProblem) = A.state
state(A::ApplyExpProblem) = A.state

since we are in the ITensorNetworks module.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll look into what error I was getting when I didn't prepend ITensorNetworks. I agree it shouldn't be needed in principle.


if !isnothing(which)
S.region_iter = region_iterator(
problem(S.region_iter); sweep=S.which_sweep, current_sweep_kws...
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
problem(S.region_iter); sweep=S.which_sweep, current_sweep_kws...
problem(S); sweep=S.which_sweep, current_sweep_kws...

I think, based on the definition of problem(::SweepIterator) above.

Copy link
Member

@mtfishman mtfishman Jul 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, this code pattern confused me a bit, what do you think of writing it like this:

    update_region_iterator!(S; current_sweep_kws...)

and hiding the implementation in update_region_iterator!?

Comment on lines +5 to +9
mutable struct SweepIterator
sweep_kws
region_iter
which_sweep::Int
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about parameterizing this struct by the types of the fields sweep_kws and region_iter? Is it meant to be dynamic, i.e. is it expected that those types might change?

Copy link
Member

@mtfishman mtfishman Jul 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems best to have a goal of having those be concrete and not changing type, but if they do you can still parameterize the type and then set the type parameter to an abstract type when it is being constructed (as needed), for example you can explicitly construct it as SweepIterator{Any,Any}(sweep_kws, region_iter, which_sweep).


problem(R::RegionIterator) = R.problem
current_region_plan(R::RegionIterator) = R.region_plan[R.which_region]
current_region(R::RegionIterator) = current_region_plan(R)[1]
Copy link
Member

@mtfishman mtfishman Jul 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the data structure for the output of current_region_plan(R)? Something doesn't feel right to me that the region is accessed with indexing by 1, maybe it should be a NamedTuple and the region could be accessed as current_region_plan(R).region or a struct where it is accessed with a function get_region(current_region_plan(R))?

current_region(R::RegionIterator) = current_region_plan(R)[1]
region_kwargs(R::RegionIterator) = current_region_plan(R)[2]
function previous_region(R::RegionIterator)
R.which_region==1 ? nothing : R.region_plan[R.which_region - 1][1]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like maybe we should be using function accessors rather than field accessors, i.e. which_region(R) instead of R.which_region.

for (sweep, region_iter) in enumerate(sweep_iterator)
for (region, region_kwargs) in region_tuples(region_iter)
region_callback(
problem(region_iter);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed this gets passed the problem while sweep_callback and sweep_printer get passed the region_iter, is there a logic behind that? I would lean towards all of them getting passed the region_iter, and there could be a layer that unwraps the problem as needed.

@@ -0,0 +1,64 @@
using Test: @test, @testset
using ITensors
using TensorOperations # Needed to use contraction order finding
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
using TensorOperations # Needed to use contraction order finding
using TensorOperations: TensorOperations # Needed to use contraction order finding

That only brings TensorOperations itself into the namespace.

Comment on lines +1 to +2
import Graphs as gr
import NamedGraphs as ng
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the sake of keeping the style consistent throughout the library I would prefer:

using Graphs: add_edge!, ...
using NamedGraphs: NamedGraph, ...

If we don't stay consistent with the style it leads to chaos where code is written in different styles and it is a lot of work to go back and make it consistent later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I meant to change these to the current style of ITensorNetworks. I will do that and in any other files I find.

@@ -0,0 +1,17 @@
import ConstructionBase: setproperties

function extracter(problem, region_iterator; sweep, trunc=(;), kws...)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't change this to extract as discussed elsewhere, which should change this to extractor since extracter isn't the correct spelling.

function applyexp(
init_prob,
exponents;
extracter_kwargs=(;),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
extracter_kwargs=(;),
extractor_kwargs=(;),

purely spelling, since extractor is a word but extracter isn't.

Comment on lines +4 to +8
@kwdef mutable struct ApplyExpProblem{State}
state::State
operator
current_time::Number = 0.0
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to my comment above about SweepIterator, I think we should parameterize the problem types. Also related to my other comment I think we should switch the argument ordering so the operator is first.

Also, I'm suspicious when I see 0.0 since that is a Float64. If the time is Float64 it might then promote the element type of other objects it interacts with to double precision, which would be bad on GPU where single precision is better. We should have a constructor that determines the number type from the inputs, such as: ApplyExpProblem(operator, state) = ApplyExpProblem(operator, state, zero(promote_type(scalartype(operator), scalartype(state))).

@@ -0,0 +1,14 @@
default_maxdim() = typemax(Int)
default_mindim() = 1
default_cutoff() = 0.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
default_cutoff() = 0.0
default_cutoff(elt::Type{<:Real}) = zero(elt)

so that we don't assume double precision (Float64).

Copy link
Member

@mtfishman mtfishman Jul 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could then define default_cutoff() = default_cutoff(Float64) but I would prefer not to right now and instead make sure to propagate specific element types that are determined from the inputs.

Comment on lines +65 to +66
(norm(Y) <= 1E-15 || expand_maxdim <= 0) && return local_tensor
Ux, S, V = svd(Y, basis_inds; cutoff=1E-14, maxdim=expand_maxdim, lefttags="ux,Link")
Copy link
Member

@mtfishman mtfishman Jul 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should determine an element type/scalar type elt from the inputs and replace 1E-15 and 1E-14 with something like eps(elt) * 10 or eps(elt) * 10^2 to be more agnostic about using single precision or double precision, to be friendlier for running on GPU.

using NDTensors.BackendSelection: Backend, @Backend_str
import ConstructionBase: setproperties

default_expansion_factor() = 1.5
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
default_expansion_factor() = 1.5
default_expansion_factor(elt::Type{<:Real}) = 3 * one(elt) / 2

include("utilities/simple_ed_methods.jl")
include("utilities/tree_graphs.jl")

@testset "Tree DMRG" begin
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
@testset "Tree DMRG" begin
@testset "Tree DMRG (eltype=elt)" for elt in (Float32, Float64, ComplexF32)

and then propagate elt throughout the tests, for example when constructing the operator, state, cutoff, etc. I like to set up tests where we loop over element types like this from the start to make sure the code is agnostic about the element types, it is easier to do that from the start rather than after a lot of library code and tests have been written already.

Even more advanced is then looping of array backends like CPU and GPU arrays, we can discuss how to do that. Again, it is nice to set up tests like that early on rather than trying to fix those issues and test them after the fact.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JoeyT1994
Copy link
Contributor

@emstoudenmire the reason for the failing test is that the apply(t1::ttn, t2::ttn; alg = "fit") is no longer in the code (it was in the old solvers code) but there is a test test/test_ttn_contract.jl that uses it. It is now passing to the apply() function in src/apply which should be made more type restricted, i.e. we should ust change apply(o, tn::AbstractITensorNetwork) to apply(o::Union{NamedEdge, ITensors}, tn::AbstractITensorNetwork) and reenable the test once we are settled on the apply interface with the new solver (which can handle this case).

)
end
S.which_sweep += 1
return S.region_iter, next
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit on the fence about this, but I would lean towards a design where iterate(::SweepIterator) actually performs the region iteration as opposed to returning the region iterator, so then for _ in sweep_iterator end actually runs the calculation. Then, we could have an iteration adapter such as region_iterators(::SweepIterator) returns an iterator that returns the region iterator at each iteration.

Copy link
Member

@mtfishman mtfishman Jul 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An interesting question is then, in that alternative design, what iterate(::SweepIterator) should output at each iteration. The DifferentialEquations.jl design is that it outputs the iterator itself, i.e. if you call:

for x in sweep_iterator
end

x will be the latest updated sweep_iterator, which I think makes sense since then you can access the information of the sweep_iterator inside the loop.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For reference, in DifferentialEquations.jl, iterate is just defined as:

function Base.iterate(integrator::DEIntegrator, state = 0)
    done(integrator) && return nothing
    state += 1
    step!(integrator)
    return integrator, state
end

which seems like a reasonable goal to aim for, and then the complexity of the implementation is in done(...) and step!(...).


mutable struct SweepIterator
sweep_kws
region_iter
Copy link
Member

@mtfishman mtfishman Jul 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this should be called current_region_iter or something like that to indicate it is the latest region iterator.

I have to say this part confused me a bit. I see the region iterator is recreated from scratch from the sweep_kws at each sweep, is there a reason to store it? It seems like it could just be made and used "on the fly" at each sweep anyway. Instead it seems like we could just store the problem in the SweepIterator.

#

mutable struct SweepIterator
sweep_kws
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this could be named each_sweep_kwargs or sweep_kwargs_iterator to indicate that it itself is an iterator that returns the keyword arguments for the sweep at each iteration (rather than just the keyword arguments of the latest sweep).

Better handling of zero and vector creation

Co-authored-by: Matt Fishman <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants