Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: unify Learner1D, Learner2D and LearnerND #220

Open
wants to merge 105 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 94 commits
Commits
Show all changes
105 commits
Select commit Hold shift + click to select a range
1e633cd
WIP: add new learner concept and example notebook
jbweston Oct 6, 2019
a5a8f5f
WIP: correct implementation of unfinished point removal
jbweston Oct 6, 2019
d7cb56c
add necessary information to scale data for loss functions that want to
jbweston Oct 6, 2019
a288b64
start internal queue attributes with underscore
jbweston Oct 6, 2019
f2d6125
recalculate losses when scale increases by a certain factor
jbweston Oct 6, 2019
5d3c2b1
blackify
jbweston Oct 6, 2019
9db175a
make 1D points use floats rather than length-1 tuples
jbweston Oct 6, 2019
cdb669d
correct point insertion logic
jbweston Oct 6, 2019
042eba0
add parallel example to proof of concept notebook
jbweston Oct 7, 2019
84cc671
remove superfluous method and docstrings from Interval class
jbweston Oct 7, 2019
9c565ce
import SortedList and SortedDict and update 'insert_points'
jbweston Oct 7, 2019
963fb07
fixup docstrings etc.
jbweston Oct 7, 2019
eb712b9
add first implementation of ConvexHull domain
jbweston Oct 7, 2019
18a88bc
make everything work
jbweston Oct 8, 2019
e3fd6cc
make everything work again
jbweston Oct 8, 2019
632faf9
update Interval class
jbweston Oct 8, 2019
bd574ab
add 'remove' method to Domain
jbweston Oct 8, 2019
3e27066
implement 'ask(tell_pending=False)'
jbweston Oct 8, 2019
772a5f4
flakify
jbweston Oct 8, 2019
053ba24
small refactors and comments
jbweston Oct 9, 2019
5e36e24
rename bound_points to boundary_points
jbweston Oct 9, 2019
fd6bce4
do not evaluate boundary points in __init__
jbweston Oct 9, 2019
be07de9
put pending points in separate data structure from data
jbweston Oct 9, 2019
d07b9c3
replace '_scaled_loss' with 'priority'
jbweston Oct 9, 2019
df667df
rename 'self.loss' to 'self.loss_function'
jbweston Oct 9, 2019
5377438
store priorities as (priority, loss)
jbweston Oct 9, 2019
fec432b
docstring update
jbweston Oct 9, 2019
61bf9df
correctly compute point priorities
jbweston Oct 9, 2019
ce4d9d2
refactor comments and fix minor logic bug
jbweston Oct 9, 2019
ccabce3
introduce a dict of losses as an optimization
jbweston Oct 9, 2019
1434f69
clarify the ordering of queried items and priorities
jbweston Oct 9, 2019
cdc6d49
don't store a tuple as priority, use 'self.losses' from with 'self.loss'
jbweston Oct 9, 2019
166a912
comments
jbweston Oct 9, 2019
ba5d207
make points a tuple as they should be
jbweston Oct 9, 2019
52fed32
correctly update pending points and data
jbweston Oct 9, 2019
5ea9b56
update example notebook
jbweston Oct 9, 2019
224d9ce
implement methods 'vertices' and 'subpoints' for 'Domain'
jbweston Oct 11, 2019
bbed8c6
abstract out choosing points in an interval
jbweston Oct 11, 2019
0f59184
catch interface bugs
jbweston Oct 11, 2019
3d88c2c
implement '__contains__' for 'ConvexHull'
jbweston Oct 11, 2019
c570cd5
use subdomain volume rather than constant for the loss before data is
jbweston Oct 11, 2019
207ac47
correct implementation of queue
jbweston Oct 12, 2019
f1fdfee
black
jbweston Oct 12, 2019
bc38a7b
fix neighbor detection when adding points
jbweston Oct 12, 2019
b3c394d
abstract out making new subtriangulations
jbweston Oct 12, 2019
f38e903
black
jbweston Oct 12, 2019
0cff467
add map from subpoints to subdomains to speed up triangulation splitting
jbweston Oct 12, 2019
5d5402b
separate Queue and Domain into separate modules
jbweston Oct 12, 2019
95c0381
add tests for priority queue and domains
jbweston Oct 12, 2019
d191e3c
add hypothesis to test requirements
jbweston Oct 12, 2019
b70eea8
rename 'sub_domains' to 'sub_triangulations'
jbweston Oct 12, 2019
4305536
flake
jbweston Oct 12, 2019
faf68cc
add __all__ to domain.py
jbweston Oct 12, 2019
5377e69
use math module
jbweston Oct 12, 2019
4b7f390
docstring formatting
jbweston Oct 12, 2019
9ed92e8
increase allowed time for tests to complete
jbweston Oct 12, 2019
6e95d68
remove Queue.priorities method
jbweston Oct 12, 2019
5710762
Queue.items() returns items in priority order
jbweston Oct 12, 2019
3b2d9cd
make LossFunction an ABC
jbweston Oct 12, 2019
fc348b2
allow the learner to take a ConvexHull as bounds
jbweston Oct 12, 2019
953f62d
add more comments to the learner
jbweston Oct 12, 2019
ebcd6f2
show that the new learner works for ND output
jbweston Oct 12, 2019
6c07070
typo
jbweston Oct 12, 2019
64a6d0f
correct sign in simplex boundary equations
jbweston Oct 12, 2019
9624f7f
correct abstract loss function interface
jbweston Oct 12, 2019
0055d57
rename need_loss_udate_factor
jbweston Oct 12, 2019
cc1f83f
correct boundary point choosing logic
jbweston Oct 12, 2019
19ab936
make point adding idempotent in learner
jbweston Oct 12, 2019
a0bea12
fix bugs in 1D domain
jbweston Oct 12, 2019
1a84089
fix bug in tell_pending
jbweston Oct 12, 2019
bf64362
catch edge case in loss
jbweston Oct 12, 2019
f171173
implement data saving
jbweston Oct 12, 2019
6111a7c
explicitly set inf
jbweston Oct 12, 2019
4ede935
add new learner to some tests (those that it passes)
jbweston Oct 12, 2019
6908ec2
add xfailing tests for new learner
jbweston Oct 12, 2019
484cebf
correctly set inf
jbweston Oct 12, 2019
460bd2e
replace Domain.__contains__ with Domain.contains_subdomain
jbweston Oct 12, 2019
929a9fd
correct implementation of points outside domain
jbweston Oct 12, 2019
3696379
implement method 'Domain.encloses' for finding points inside a domain
jbweston Oct 12, 2019
7ff9372
allow NewLearnerND to accept points not in the domain
jbweston Oct 13, 2019
3979872
make sure we can ask for more points after adding data outside the
jbweston Oct 13, 2019
bb38de3
add new learner to datasaving test
jbweston Oct 13, 2019
60e0a0c
correct conditional
jbweston Oct 13, 2019
ceaa87d
make 'points_inside' select points from more than 1 subdomain
jbweston Oct 13, 2019
a13711a
correct name of domain test
jbweston Oct 13, 2019
fbbe44a
rename point loss to point priority
jbweston Oct 13, 2019
d227e1b
do 1 operation per line only
jbweston Oct 13, 2019
9f8774a
correct docstring
jbweston Oct 13, 2019
6bdbe88
correct test that all internal points are reassigned when splitting
jbweston Oct 13, 2019
32518d1
run all domain tests over random domains, rather than hypercubes
jbweston Oct 13, 2019
5d3434d
update priority queue docstring
jbweston Oct 13, 2019
6efc250
neater ordering
jbweston Oct 13, 2019
06003cb
add clarifying comment about adding points to neighboring subdomains
jbweston Oct 14, 2019
b4a4814
add shortcuts when creating triangulations with exactly 1 simplex
jbweston Oct 14, 2019
52175f0
fix bug where we receive more points than we asked for
jbweston Oct 16, 2019
cdf26a7
add learner tests that simulate the runner and randomly asking/telling
jbweston Oct 16, 2019
011c5f3
create queue on initialization rather than calling 'insert'
jbweston Oct 16, 2019
51901cc
suppress context when raising when an item was not in a queue
jbweston Oct 16, 2019
d401cbb
fix bug in subdomain updating logic when telling data
jbweston Oct 16, 2019
445c6f1
improve comment
jbweston Oct 16, 2019
d41b355
blackify domain.py
jbweston Oct 16, 2019
7430143
check that points matrix is well conditioned
jbweston Oct 16, 2019
6c0cafa
remove stateful invariant checks
jbweston Oct 17, 2019
9ffd7bc
make random domains by sampling points uniformly in the unit hypercube
jbweston Oct 17, 2019
587d027
simplify utilities for generating domains and points inside/outside them
jbweston Oct 17, 2019
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
743 changes: 743 additions & 0 deletions adaptive/domain.py

Large diffs are not rendered by default.

510 changes: 510 additions & 0 deletions adaptive/learner/new_learnerND.py

Large diffs are not rendered by default.

29 changes: 17 additions & 12 deletions adaptive/learner/triangulation.py
Original file line number Diff line number Diff line change
Expand Up @@ -266,7 +266,7 @@ class Triangulation:
or more simplices in the
"""

def __init__(self, coords):
def __init__(self, coords, *, _check_vertices=True):
akhmerov marked this conversation as resolved.
Show resolved Hide resolved
if not is_iterable_and_sized(coords):
raise TypeError("Please provide a 2-dimensional list of points")
coords = list(coords)
Expand All @@ -287,23 +287,28 @@ def __init__(self, coords):
raise ValueError("Please provide at least one simplex")

coords = list(map(tuple, coords))
vectors = np.subtract(coords[1:], coords[0])
if np.linalg.matrix_rank(vectors) < dim:
raise ValueError(
"Initial simplex has zero volumes "
"(the points are linearly dependent)"
)
if _check_vertices:
vectors = np.subtract(coords[1:], coords[0])
if np.linalg.matrix_rank(vectors) < dim:
raise ValueError(
"Initial simplex has zero volumes "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not necessarily a single simplex at this point; this should read "Hull has a zero volume", right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. This error message was already wrong before I touched it. Maybe I should update it

"(the points are linearly dependent)"
)

self.vertices = list(coords)
self.simplices = set()
# initialise empty set for each vertex
self.vertex_to_simplices = [set() for _ in coords]

# find a Delaunay triangulation to start with, then we will throw it
# away and continue with our own algorithm
initial_tri = scipy.spatial.Delaunay(coords)
for simplex in initial_tri.simplices:
self.add_simplex(simplex)
if len(coords) == dim + 1:
# There is just a single simplex
self.add_simplex(tuple(range(dim + 1)))
else:
# find a Delaunay triangulation to start with, then we will throw it
# away and continue with our own algorithm
initial_tri = scipy.spatial.Delaunay(coords)
for simplex in initial_tri.simplices:
self.add_simplex(simplex)

def delete_simplex(self, simplex):
simplex = tuple(sorted(simplex))
Expand Down
115 changes: 115 additions & 0 deletions adaptive/priority_queue.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,115 @@
from sortedcontainers import SortedDict, SortedList

__all__ = ["Empty", "Queue"]


class Empty(KeyError):
pass


class Queue:
"""Priority queue supporting update and removal at arbitrary position.

Parameters
----------
entries : iterable of (item, priority)
The initial data in the queue. Providing this is faster than
calling 'insert' a bunch of times.
"""

def __init__(self, entries=()):
self._queue = SortedDict(
((priority, -n), item) for n, (item, priority) in enumerate(entries)
)
# 'self._queue' cannot be keyed only on priority, as there may be several
# items that have the same priority. To keep unique elements the key
# will be '(priority, self._n)', where 'self._n' is decremented whenever
# we add a new element. 'self._n' is negative so that elements with equal
# priority are sorted by insertion order.
self._n = -len(self._queue)
# To efficiently support updating and removing items if their priority
# is unknown we have to keep the reverse map of 'self._queue'. Because
# items may not be hashable we cannot use a SortedDict, so we use a
# SortedList storing '(item, key)'.
self._items = SortedList(((v, k) for k, v in self._queue.items()))

def __len__(self):
return len(self._queue)

def items(self):
"Return an iterator over the items in the queue in priority order."
return reversed(self._queue.values())
jbweston marked this conversation as resolved.
Show resolved Hide resolved

def peek(self):
"""Return the item and priority at the front of the queue.

Raises
------
Empty : if the queue is empty
"""
self._check_nonempty()
((priority, _), item) = self._queue.peekitem()
return item, priority

def pop(self):
"""Remove and return the item and priority at the front of the queue.

Raises
------
Empty : if the queue is empty
"""
self._check_nonempty()
(key, item) = self._queue.popitem()
i = self._items.index((item, key))
del self._items[i]
priority, _ = key
return item, priority

def insert(self, item, priority):
"Insert 'item' into the queue with the given priority."
key = (priority, self._n)
self._items.add((item, key))
self._queue[key] = item
self._n -= 1

def _check_nonempty(self):
if not self._queue:
raise Empty()

def _find_first(self, item):
self._check_nonempty()
i = self._items.bisect_left((item, ()))
try:
should_be, key = self._items[i]
except IndexError:
raise KeyError("item is not in queue")
if item != should_be:
raise KeyError("item is not in queue")
return i, key

def remove(self, item):
"""Remove the 'item' from the queue.

Raises
------
KeyError : if 'item' is not in the queue.
"""
i, key = self._find_first(item)
del self._queue[key]
del self._items[i]

def update(self, item, priority):
"""Update 'item' in the queue to have the given priority.

Raises
------
KeyError : if 'item' is not in the queue.
"""
i, key = self._find_first(item)
_, n = key
new_key = (priority, n)

del self._queue[key]
del self._items[i]
self._queue[new_key] = item
self._items.add((item, new_key))
175 changes: 175 additions & 0 deletions adaptive/tests/domain_utils.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,175 @@
import itertools

import numpy as np
import scipy.linalg
import scipy.spatial

import hypothesis.strategies as st
from adaptive.learner.new_learnerND import ConvexHull, Interval
from hypothesis.extra import numpy as hynp


def reflections(ndim):
return map(np.diag, itertools.product([1, -1], repeat=ndim))


reals = st.floats(min_value=-100, max_value=100, allow_nan=False, allow_infinity=False)
positive_reals = st.floats(
min_value=1e-3, max_value=100, allow_nan=False, allow_infinity=False
)


@st.composite
def point(draw, ndim):
return draw(reals if ndim == 1 else st.tuples(*[reals] * ndim))


def unique_vectors(xs):
xs = np.asarray(xs)
if len(xs.shape) == 1:
xs = xs[:, None]
c = np.max(np.linalg.norm(xs, axis=1))
if c == 0:
return False
d = scipy.spatial.distance_matrix(xs, xs)
d = np.extract(1 - np.identity(d.shape[0]), d)
return not np.any(d < 1e-3 / c)


@st.composite
def point_inside_simplex(draw, simplex):
simplex = draw(simplex)
simplex = np.asarray(simplex)
dim = simplex.shape[1]
# Set the numpy random seed
draw(st.random_module())
# Generate a point in the unit simplex.
# https://cs.stackexchange.com/questions/3227/uniform-sampling-from-a-simplex
# We avoid using Hypothesis to generate the points as it typically chooses
# very annoying points, which we want to avoid testing for now.
xb = np.random.rand(dim)
xb = np.array(sorted(xb))
xb[1:] = xb[1:] - xb[:-1]
# Transform into the simplex we need
v0, vecs = simplex[0], simplex[1:] - simplex[0]
x = tuple(v0 + (vecs.T @ xb))
return x


@st.composite
def points_inside(draw, domain, n):
kwargs = dict(
allow_nan=False, allow_infinity=False, exclude_min=True, exclude_max=True
)
if isinstance(domain, Interval):
a, b = domain.bounds
eps = (b - a) * 1e-2
x = st.floats(min_value=(a + eps), max_value=(b - eps), **kwargs)
else:
assert isinstance(domain, ConvexHull)
tri = domain.triangulation
simplices = list(tri.simplices)
simplex = st.sampled_from(simplices).map(
lambda simplex: [tri.vertices[s] for s in simplex]
)
x = point_inside_simplex(simplex)

xs = st.tuples(*[x] * n).filter(unique_vectors)
return draw(xs)


@st.composite
def point_inside(draw, domain):
return draw(points_inside(domain, 1))[0]


@st.composite
def a_few_points_inside(draw, domain):
n = draw(st.integers(3, 20))
return draw(points_inside(domain, n))


@st.composite
def points_outside(draw, domain, n):
kwargs = dict(allow_nan=False, allow_infinity=False)
if isinstance(domain, Interval):
a, b = domain.bounds
length = b - a
before_domain = st.floats(a - 10 * length, a, exclude_max=True, **kwargs)
after_domain = st.floats(b, b + 10 * length, exclude_min=True, **kwargs)
x = before_domain | after_domain
else:
assert isinstance(domain, ConvexHull)
hull = domain.bounds
# Generate point between bounding box and bounding box * 10
points = hull.points[hull.vertices]
x = st.tuples(
*[
(
st.floats(a - 10 * (b - a), a, exclude_max=True, **kwargs)
| st.floats(b, b + 10 * (b - a), exclude_min=True, **kwargs)
)
for a, b in zip(points.min(axis=0), points.max(axis=0))
]
)

xs = st.tuples(*[x] * n).filter(unique_vectors)
return draw(xs)


@st.composite
def point_outside(draw, domain):
return draw(points_outside(domain, 1))[0]


@st.composite
def point_on_shared_face(draw, domain, dim):
# Return a point that is shared by at least 2 subdomains
assert isinstance(domain, ConvexHull)
assert 0 < dim < domain.ndim

tri = domain.triangulation

for face in tri.faces(dim + 1):
containing_subdomains = tri.containing(face)
if len(containing_subdomains) > 1:
break

vertices = np.array([tri.vertices[i] for i in face])

f = st.floats(1e-3, 1 - 1e-3, allow_nan=False, allow_infinity=False)
xb = draw(st.tuples(*[f] * dim))

x = tuple(vertices[0] + xb @ (vertices[1:] - vertices[0]))

assert all(tri.point_in_simplex(x, s) for s in containing_subdomains)

return x


@st.composite
def make_random_domain(draw, ndim, fill=True):
if ndim == 1:
limits = draw(st.tuples(reals, reals).map(sorted).filter(lambda x: x[0] < x[1]))
domain = Interval(*limits)
else:
points = draw(
hynp.arrays(np.float, (10, ndim), elements=reals, unique=True).filter(
unique_vectors
)
)
domain = ConvexHull(points)
return domain


@st.composite
def make_hypercube_domain(draw, ndim, fill=True):
if ndim == 1:
limit = draw(positive_reals)
subdomain = Interval(-limit, limit)
else:
x = draw(positive_reals)
point = np.full(ndim, x)
boundary_points = [r @ point for r in reflections(ndim)]
subdomain = ConvexHull(boundary_points)
return subdomain
Loading