-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Nested Vectorization #1465
Comments
@pgkirsch firstly, I have little experience with as_view (pretty much didn't know of its existence until I checked the documentation a few minutes ago). |
Very simple MWE of working 3d vector subs to base discussion on: from gpkit import Variable, Model, Vectorize
import numpy as np
with Vectorize(2):
with Vectorize(3):
with Vectorize(4):
v = Variable()
u = Variable()
m = Model(np.prod(v), [v[:] >= u[:]])
m.substitutions.update({u: np.random.rand(4,3,2)})
sol = m.solve() |
@pgkirsch What about the substitutions most bothers you? Now that I have tried, they seem to work really well out of the box., and it's relatively clean from a practicality standpoint. As for good printing of solutions, that's a challenge that other models face too, since large vectorizations are flattened beyond 2d. But I'm of the opinion that such solutions should implement problem-specific visualizations instead of terminal outputs. But for terminal, you could try this: varkeys = [k for k in sol.program.gps[-1].varlocs if somekey in k.models] to only print variables of a certain model (indicating level of vectorization). |
obligatory citation, Tony Tao's thesis has a great exploration of multidimensional vectorization: https://dspace.mit.edu/handle/1721.1/120427?show=full He had 3d vectorization down pretty well: flight states sharing airplanes sharing tooling |
Sounds good! I tend to think of it was what variables are shared, i.e. not discretized at each level. So climb phases share a mission, missions share an aircraft. idk but it helps me keep track of it.
I'm pleased to tell you that it is not supposed to be the case! import numpy as np
from gpkit import *
with Vectorize(2): # payloads
payload = Payload()
desertgoosequail = Airplane(payload)
class Mission(Model):
def setup(self, airplane):
# writing constraints in here will be tricky, but
# I think it can be done without having to manually read
# input variable size by stretching those input variables
# across new dimensions as necessary
mo = MissionObjectives(airplane)
with Vectorize(10):
apm = airplane.performance_model(mo)
return [mo, apm]
with Vectorize(3):
missions = Mission(desertgoosequail)
m = Model(missions.fuel.sum(), [missions])
Welll it was designed with nested in mind (see my response on that issue) but that doesn't mean it's good at it.
This is really hard! Large design spaces like this are sometimes best understand by focal points, and clustering; we could cluster results together and do a kind of PCA analysis?
Been musing about something like this for evaluating "paragons" i.e. making it a first-class thing to solve a model with multiple objective functions, then compare those solutions in a table.
Unfortunately it doesn't make the tables and display any easier, just the code more readable! |
Oof sorry for the month-long delay in response. I'm not sure your example corresponds to what I'm experiencing. It might be easier to discuss with eyes on real code in the other repo, but here is my attempt at generalizing it, this time using aero drag as a perhaps slightly more accurate analog: ...
with Vectorize(numberofmissions):
...
with Vectorize(numberofaircraftconfigurations):
self.Drag()
... where class Drag(Model):
...
constraints = [
...
]
substitutions = {
'drag coefficient': np.tile(np.array([[1], [2]]), numberofmissions), # previously [1, 2] sufficed
...
}
return constraints, substitutions I assume Solution ArraysI have two proposals for solution arrays, one short term and one maybe longer term: 1. Array LegendA legend of sorts that can optionally be printed at the top of
The first two are readable but, without digging into GPkit source code or inferring from the values, it's impossible to tell which element is which (how values are tiled) for the latter two.
Note that this would require the concept of a named vectorization as described above. 2.
|
Modified my previous comment to remove the "Diff" section. The KeyError I mentioned seems to be a fairly narrow issue with sensitivity diffs only, so doesn't belong in this discussion. It would still be nice to be able to explicitly diff one mission from a multi-mission solution against the single-mission-only solution, but I think that can be discussed as part of the "Accessing Solutions" section. A nice-to-have feature for diffs might be: a_simple = AirlineSystem([mission1], [config1, config2])
sol_simple = a_simple.solve()
a_complex = AirlineSystem([mission1, mission2, mission3], [config1, config2])
sol_complex = a_complex.solve()
sol_simple.diff(sol_complex['mission1']) # use of `[]` is intended as pseudo-code for discussion purpose |
@pgkirsch these are great ideas! We definitely have the structure to store index "keys" like this already, between the global VECTORIZATION counter and varkey metadata...the tricks for (1) and (2) above would be more of interface questions, and would be much harder than Accessing Solutions. It seems to me like a good place to start would be a class (or metaclass) which serves as a "range" and one which serves as an "index". For example: class Dimension:
def __get__(self, idx):
name = None
if hasattr(self, "idxnames") and self.idxnames:
name = self.idxnames[idx]
return Index(self, idx, name)
class Index:
def __init__(self, dim, idx, name=None):
self.dim = dim
self.idx = idx
self.name = name
class ConstraintIter(Dimension):
def __init__(self, constraints):
self.constraints = constraints # each mission is a constraintset
self.shape = (len(constraints),)
if isinstance(constraints, dict):
self.idxnames = constraints.keys()
## declaration
missions = {"Short Hop": mission1, "Sprint": mission2, "Long Haul": mission3}
missiondim = ConstraintIter(missions)
## vectorization
with Vectorize(missiondim): # equivalent to Vectorize(3)
...
## variable slicing
x[missiondim[0]] # x_(:, Short Hop, :) |
@bqpd thanks for this and sorry for the slow response on my end. I have come back and looked at this probably a dozen times and am still struggling to follow a few things:
To explore 2., I have tried the follow: In [1]: from gpkit import Model, Variable, Vectorize
In [2]: %paste
class Dimension:
def __get__(self, idx):
name = None
if hasattr(self, "idxnames") and self.idxnames:
name = self.idxnames[idx]
return Index(self, idx, name)
class Index:
def __init__(self, dim, idx, name=None):
self.dim = dim
self.idx = idx
self.name = name
class ConstraintIter(Dimension):
def __init__(self, constraints):
self.constraints = constraints # each mission is a constraintset
self.shape = (len(constraints),)
if isinstance(constraints, dict):
self.idxnames = constraints.keys()
## -- End pasted text --
In [3]: x = Variable("x")
In [4]: y = Variable("y")
In [5]: from gpkit import ConstraintSet
In [6]: mission1 = ConstraintSet([x >= 1])
In [7]: mission2 = ConstraintSet([x >= 2])
In [8]: mission3 = ConstraintSet([x >= 3])
In [9]: missions = {"Short Hop": mission1, "Sprint": mission2, "Long Haul": mission3}
In [10]: missiondim = ConstraintIter(missions)
In [11]: with Vectorize(missiondim):
...: cs = ConstraintSet([y >= x])
...:
In [13]: m = Model(y, cs+mission1+mission2+mission3)
In [14]: sol = m.solve()
Using solver 'mosek_cli'
for 2 free variables
in 5 posynomial inequalities.
Solving took 0.167 seconds.
In [15]: print(sol.table())
Optimal Cost
------------
3
Free Variables
--------------
x : 3
y : 3
...
In [16]: sol(x[missiondim[0]])
...:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-16-fd8a35524b9e> in <module>
----> 1 sol(x[missiondim[0]])
TypeError: 'ConstraintIter' object is not subscriptable
In [17]: x["Short Hop"] # ??? How do mission names get used?
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-17-d586d69427aa> in <module>
----> 1 x["Short Hop"] # ??? How do mission names get used?
TypeError: 'Variable' object is not subscriptable Apologies if I'm missing the point or making some basic mistake. What am I not understanding? As for 3., I think it might be useful to start from the MWE @1ozturkbe provided and show the functionality I'm hoping for and you can tell me how feasible it is. from gpkit import Variable, Model, Vectorize
import numpy as np
with Vectorize(2):
with Vectorize(3):
v = Variable("v")
u = Variable("u")
m = Model(np.prod(v), [v[:] >= u[:]])
m.substitutions.update({u: np.random.rand(3,2)})
sol = m.solve()
print(sol.table())
# Optimal Cost
# ------------
# 0.02721
#
# Free Variables
# --------------
# v : [ 0.339 0.911 0.378
# 0.866 0.459 0.587 ]
#
# Fixed Variables
# ---------------
# u : [ 0.339 0.911 0.378
# 0.866 0.459 0.587 ]
sol1 = sol.slice([:,1]) # a method that returns a SolutionArray
sol2 = sol.slice([:,2])
print(sol1.table())
# Optimal Cost
# ------------
# (Excluded)
#
# Free Variables
# --------------
# v : [ 0.911
# 0.459 ]
#
# Fixed Variables
# ---------------
# u : [ 0.911
# 0.459 ]
print(sol1.diff(sol2)) # really want to be able to diff slices
sol1.save(...) Does this seem feasible? |
Starting a design discussion thread for nested vectorization.
Generic motivating example
A user wants to optimize the design of an aircraft. This aircraft has several different missions to complete, and the user wants to optimize over all of these missions simultaneously (the "outer vectorization"). For each mission, the aircraft has a max-payload configuration and an average-payload configuration (the "inner vectorization"). Of course the aircraft needs to be able to complete all missions with the max-payload configuration, but performance on each mission should be optimized around the average configuration. To further complicate things slightly, the climb and descent phases are discretized using VectorVariables. This means that some variables have as many as three dimensions.
Design challenges to address
Inputs
In the example given above, the payload is necessarily a 2-element substitution (average payload and maximum payload). But when the outer vectorization is introduced, GPkit currently requires that substitution to be 2xn, where n is the number of missions (the "degree(?)" of outer vectorization). Ideally this wouldn't be the case.
Slicing specific levels of vectorizations
As described in #1464,
as_view
(which works fine for one level of vectorization) seems to not be designed with nested vectorization in mind (please correct me if this is wrong). Maybe it's a simple fix, but would be good to know if there is perhaps something more scalable than the currentas_view
approach.Printing and accessing solutions
The trusty workhorse
SolutionArray.table
method starts to struggle with communicating information beyond the inner vectorization. It's hard to tell which values refer to which slices of the vectorization. Maybe this is just the natural limit of plaintext solution tables, but I'm interested in discussing ways the table could be made more readable (more explicit), even with three-dimensional variables.Even the output of
sol(somethreedimensionalvariable)
is hard to interpret.Diff-ing
Closely related to the above: how do we adapt
sol.diff
to effectively capture changes in the solution when some variables have three dimensions.Discuss!
I suspect one of the possible solutions could be something along the lines of Named Vectorizations, i.e. the idea of a key-oriented vectorization as opposed to an index-oriented vectorization, but I'm sure that would present lots of challenges with which I'm not yet familiar.
I'm sure we'll think of more challenges too, so this is just a start. Interested to hear people's thoughts!
The text was updated successfully, but these errors were encountered: