Merge branch 'main' into ms/gurobi-funcs

Bravos-Power · Dec 27, 2024 · b9be2f1
2 parents ce58ae5 + 3c967c9
commit b9be2f1
Showing 29 changed files with 1,252 additions and 179 deletions.
diff --git a/.gitignore b/.gitignore
@@ -1,7 +1,6 @@
 *.pyc
 __pycache__/
 build
-.vscode
 .venv
 .pytest_cache
 *.egg-info

diff --git a/.vscode/settings.json b/.vscode/settings.json
@@ -0,0 +1,7 @@
+{
+    "python.testing.pytestArgs": [
+        "."
+    ],
+    "python.testing.unittestEnabled": false,
+    "python.testing.pytestEnabled": true
+}
diff --git a/docs/learn/01_getting-started/01_installation.md b/docs/learn/01_getting-started/01_installation.md
@@ -1,17 +1,20 @@
 ## Install Pyoframe
-```
+
+```cmd
 pip install pyoframe
 ```
 
 ## Install a solver
 
-*[solver]: Solvers like HiGHS and Gurobi do the actual solving of your model. Pyoframe is a wrapper that makes it easy to build models but Pyoframe still needs a solver to work.
+*[solver]: Solvers like HiGHS and Gurobi do the actual solving of your model. Pyoframe is a layer on top of the solver that makes it easy to build models and switch between solvers.
 
-=== "HiGHS"
+=== "HiGHS (free)"
 
-    `pip install pyoframe[highs]`
+    ```cmd
+    pip install pyoframe[highs]
+    ```
 
-=== "Gurobi"
+=== "Gurobi (commercial)"
 
     1. [Install Gurobi](https://www.gurobi.com/downloads/gurobi-software/) from their website.
     2. Ensure you have a valid Gurobi license installed on your machine.

diff --git a/docs/learn/01_getting-started/02_build-simple-model.md b/docs/learn/01_getting-started/02_build-simple-model.md
@@ -5,33 +5,35 @@ Here's a simple model to show you Pyoframe's syntax. Click on the :material-plus
 ```python
 import pyoframe as pf
 
-m = pf.Model("max") # (1)!
+m = pf.Model()
 
 # You can buy tofu or chickpeas
-m.tofu = pf.Variable(lb=0)  # (2)!
+m.tofu = pf.Variable(lb=0)  # (1)!
 m.chickpeas = pf.Variable(lb=0)
 
-# Youd want to maximize your protein intake (10g for tofu, 8g for chickpeas)
-m.objective = 10 * m.tofu + 8 * m.chickpeas # (3)!
+# You want to maximize your protein intake (10g per tofu, 8g per chickpeas)
+m.maximize = 10 * m.tofu + 8 * m.chickpeas # (2)!
 
-# You have $10 and tofu costs $4 while chickpeas cost $2.
-m.budget_constraint = 4 * m.tofu + 2 * m.chickpeas <= 10 # (4)!
+# You must stay within your $10 budget ($4 per tofu, $2 per chickpeas)
+m.budget_constraint = 4 * m.tofu + 2 * m.chickpeas <= 10 # (3)!
 
-m.optimize()
+m.optimize()  # (4)!
 
 print("You should buy:")
 print(f"\t{m.tofu.solution} blocks of tofu")
 print(f"\t{m.chickpeas.solution} cans of chickpeas")
 ```
 
-1. Creating your model is always the starting point!
-2. `lb=0` sets the variable's lower bound to ensure you can't buy a negative quantity of tofu!
-3. Variables can be added and multiplied as you'd expect!
-4. Using `<=`, `>=` or `==` will automatically create a constraint.
+1. Create a variable with a lower bound of zero (`lb=0`) so that you can't buy a negative quantity of tofu!
+2. Define your objective by setting the reserved variables `.maximize` or `.minimize`.
+3. Create constraints by using `<=`, `>=`, or `==`.
+4. Pyoframe automatically detects your installed solver and optimizes your model!
 
 ## Use dimensions
 
-The above model would quickly become unworkable if we had more than just tofu and chickpeas. Let's create a `food` dimension to make this scalable. While were at it, let's also read our data from the following .csv file instead of hardcoding it.
+The above model would quickly become unworkable if we had more than just tofu and chickpeas. I'll walk you through how we can make a `food` dimension to make this scalable. You can also skip to the end to see the example in full!
+
+Note that instead of hardcoding our values, we'll be reading them from the following csv file.
 
 > `food_data.csv`
 >
@@ -64,7 +66,7 @@ Nothing special here. Load your data using your favourite dataframe library. We
 
 ```python
 import pyoframe as pf
-m = pf.Model("max")
+m = pf.Model()
 ```
 
 ### Create a dimensioned variable
@@ -84,13 +86,13 @@ If you print the variable, you'll see it actually contains a `tofu` and `chickpe
 ```
 
 !!! tip "Tip"
-    Naming your model's decision variables with an uppercase first letter (e.g. `m.Buy`) makes it to remember what's a variable and what isn't.
+    Naming your model's decision variables with an uppercase first letter (e.g. `m.Buy`) makes it easier to remember what's a variable and what isn't.
 
 ### Create the objective
 
 Previously we had:
 ```python
-m.objective = 10 * m.tofu + 8 * m.chickpeas
+m.maximize = 10 * m.tofu + 8 * m.chickpeas
 ```
 
 How do we make use of our dimensioned variable `m.Buy` instead?
@@ -119,7 +121,7 @@ Second, notice that our `Expression` still has a `food` dimension—it really co
 This works and since `food` is the only dimension we don't even need to specify it. Putting it all together:
 
 ```python
-m.objective = pf.sum(data[["food", "protein"]] * m.Buy)
+m.maximize = pf.sum(data[["food", "protein"]] * m.Buy)
 ```
 
 ### Adding the constraint
@@ -138,9 +140,9 @@ import pyoframe as pf
 
 data = pd.read_csv("food_data.csv")
 
-m = pf.Model("max")
+m = pf.Model()
 m.Buy = pf.Variable(data[["food"]], lb=0)
-m.objective = pf.sum(data[["food", "protein"]] * m.Buy)
+m.maximize = pf.sum(data[["food", "protein"]] * m.Buy)
 m.budget_constraint = pf.sum(data[["food", "cost"]] * m.Buy) <= 10
 
 m.optimize()
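
As a sanity check on the tofu and chickpeas model above, the expected answer can be worked out by hand. The sketch below does not use Pyoframe or any solver; it assumes only the numbers from the docs (10g and 8g of protein, $4 and $2 cost, a $10 budget) and relies on the fact that a bounded linear program attains its optimum at a vertex of the feasible region. The helper names `protein` and `feasible` are hypothetical and exist only for this illustration.

```python
from itertools import combinations

# Each constraint is written as a*tofu + b*chickpeas <= c.
constraints = [
    (4.0, 2.0, 10.0),  # budget: 4*tofu + 2*chickpeas <= 10
    (-1.0, 0.0, 0.0),  # tofu >= 0
    (0.0, -1.0, 0.0),  # chickpeas >= 0
]


def protein(tofu, chickpeas):
    """Objective from the docs: 10g per tofu, 8g per chickpeas."""
    return 10 * tofu + 8 * chickpeas


def feasible(tofu, chickpeas, tol=1e-9):
    """True if the point satisfies every constraint (within a tolerance)."""
    return all(a * tofu + b * chickpeas <= c + tol for a, b, c in constraints)


# Intersect every pair of constraint boundaries (Cramer's rule) and keep
# the intersection points that are feasible: these are the vertices.
vertices = []
for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
    det = a1 * b2 - a2 * b1
    if det == 0:
        continue  # parallel boundaries never intersect
    tofu = (c1 * b2 - c2 * b1) / det
    chickpeas = (a1 * c2 - a2 * c1) / det
    if feasible(tofu, chickpeas):
        vertices.append((tofu, chickpeas))

best = max(vertices, key=lambda v: protein(*v))
print(f"Buy {best[0]} tofu and {best[1]} chickpeas for {protein(*best)}g of protein")
```

Under these assumptions the best vertex is 0 blocks of tofu and 5 cans of chickpeas (40g of protein), which is what a solver should report for the model in the docs.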

diff --git a/docs/learn/03_concepts/03_quadratic_expressions.md b/docs/learn/03_concepts/03_quadratic_expressions.md
@@ -0,0 +1,32 @@
+# Quadratic Expressions
+
+Quadratic expressions work as you'd expect. Simply multiply two linear expressions together (or square an expression with `**2`) and you'll get a quadratic. The quadratic can then be used in constraints or the objective.
+
+## Example
+
+### Maximize area of box
+Here's a short example showing that, for a fixed perimeter, the box with the largest area is a square.
+
+```python3
+import pyoframe as pf
+model = pf.Model("max")
+model.w = pf.Variable(lb=0)
+model.h = pf.Variable(lb=0)
+model.limit_perimeter = 2 * (model.w + model.h) <= 20
+model.objective = model.w * model.h
+model.solve()
+print(f"It's a square: {model.w.solution==model.h.solution}")
+
+# Outputs: It's a square: True
+```
+### Facility Location Problem
+
+See [examples/facility_location](../tests/examples/facility_location/).
+
+## Note for Pyoframe developers: Internal Representation of Quadratics
+
+Internally, Pyoframe's `Expression` object is used for both linear and quadratic expressions. When the dataframe within an `Expression` object (i.e. `Expression.data`) contains an additional column (named `__quadratic_variable_id`) we know that the expression is a quadratic.
+
+This extra column stores the ID of the second variable in quadratic terms. For terms with only one variable, this column contains ID `0` (a reserved variable ID which can be thought of as meaning 'no variable'). The variables in a quadratic are rearranged such that the ID in the `__variable_id` column is always greater than or equal to the ID in the `__quadratic_variable_id` column (recall: `a*b = b*a`). This rearranging not only ensures that `a*b + b*a = 2*a*b` but also generates a useful property: if the variable ID in the first column (`__variable_id`) is `0`, we know the variable ID in the second must also be `0` and therefore the term must be a constant.
+
+The additional quadratic variable ID column is automatically dropped if the quadratic terms cancel out through arithmetic.
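
The internal representation described in the note above can be illustrated with a small pure-Python sketch. This is not Pyoframe's implementation (which operates on Polars dataframes); it only mimics the two ideas from the note, assuming each term is a hypothetical `(coefficient, variable_id, quadratic_variable_id)` tuple and that ID `0` means 'no variable'. The function names `normalize` and `is_quadratic` are made up for the illustration.

```python
from collections import defaultdict

CONST_TERM = 0  # reserved ID meaning "no variable", as described in the note


def normalize(terms):
    """Sum like terms given as (coef, var_id, quad_var_id) tuples."""
    summed = defaultdict(float)
    for coef, var_id, quad_var_id in terms:
        # Put the larger ID first so that a*b and b*a share the same key.
        key = (max(var_id, quad_var_id), min(var_id, quad_var_id))
        summed[key] += coef
    # Drop terms whose coefficients cancelled out.
    return {key: coef for key, coef in summed.items() if coef != 0}


def is_quadratic(terms):
    """An expression is quadratic if any term has a second variable."""
    return any(quad_var_id != CONST_TERM for (_, quad_var_id) in terms)


# Assume x1 has ID 1 and x2 has ID 2.
# 3*x1*x2 + 3*x2*x1 + 4*x1 + 7 collapses to 6*x2*x1 + 4*x1 + 7.
expr = normalize([(3, 1, 2), (3, 2, 1), (4, 1, 0), (7, 0, 0)])
print(expr, is_quadratic(expr))

# 3*x1*x2 - 3*x2*x1 + 4*x1: the quadratic terms cancel, leaving 4*x1.
cancelled = normalize([(3, 1, 2), (-3, 2, 1), (4, 1, 0)])
print(cancelled, is_quadratic(cancelled))
```

Because the larger ID always comes first, a key whose first entry is `0` can only be `(0, 0)`, i.e. a constant term, which is the property the note points out.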
diff --git a/docs/learn/03_concepts/03_troubleshooting.md b/docs/learn/03_concepts/04_troubleshooting.md
diff --git a/docs/learn/03_concepts/SUMMARY.md b/docs/learn/03_concepts/SUMMARY.md
@@ -1,3 +1,4 @@
 - [Pyoframe datastructure](./01_pyoframe-datastructure.md)
 - [Performance](./02_performance_tips.md)
-- [Troubleshooting](./03_troubleshooting.md)
+- [Quadratics](./03_quadratic_expressions.md)
+- [Troubleshooting](./04_troubleshooting.md)
diff --git a/pyproject.toml b/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 
 [project]
 name = "pyoframe"
-version = "0.0.11"
+version = "0.1.0"
 authors = [{ name = "Bravos Power", email = "[email protected]" }]
 description = "Blazing fast linear program interface"
 readme = "README.md"

diff --git a/src/pyoframe/_arithmetic.py b/src/pyoframe/_arithmetic.py
@@ -1,10 +1,17 @@
+"""
+Defines helper functions for doing arithmetic operations on expressions (e.g. addition).
+"""
+
 from typing import TYPE_CHECKING, List, Optional
 
 import polars as pl
 
 from pyoframe.constants import (
     COEF_KEY,
+    CONST_TERM,
+    KEY_TYPE,
     POLARS_VERSION,
+    QUAD_VAR_KEY,
     RESERVED_COL_KEYS,
     VAR_KEY,
     Config,
@@ -16,6 +23,41 @@
     from pyoframe.core import Expression
 
 
+def _multiply_expressions(self: "Expression", other: "Expression") -> "Expression":
+    """
+    Multiply two expressions together.
+
+    Examples:
+        >>> import pyoframe as pf
+        >>> m = pf.Model("min")
+        >>> m.x1 = pf.Variable()
+        >>> m.x2 = pf.Variable()
+        >>> m.x3 = pf.Variable()
+        >>> result = 5 * m.x1 * m.x2
+        >>> result
+        <Expression size=1 dimensions={} terms=1 degree=2>
+        5 x2 * x1
+        >>> result * m.x3
+        Traceback (most recent call last):
+        ...
+        pyoframe.constants.PyoframeError: Failed to multiply expressions:
+        <Expression size=1 dimensions={} terms=1 degree=2> * <Expression size=1 dimensions={} terms=1>
+        Due to error:
+        Cannot multiply a quadratic expression by a non-constant.
+    """
+    try:
+        return _multiply_expressions_core(self, other)
+    except PyoframeError as error:
+        raise PyoframeError(
+            "Failed to multiply expressions:\n"
+            + " * ".join(
+                e.to_str(include_header=True, include_data=False) for e in [self, other]
+            )
+            + "\nDue to error:\n"
+            + str(error)
+        ) from error
+
+
 def _add_expressions(*expressions: "Expression") -> "Expression":
     try:
         return _add_expressions_core(*expressions)
@@ -30,6 +72,98 @@ def _add_expressions(*expressions: "Expression") -> "Expression":
         ) from error
 
 
+def _multiply_expressions_core(self: "Expression", other: "Expression") -> "Expression":
+    self_degree, other_degree = self.degree(), other.degree()
+    if self_degree + other_degree > 2:
+        # We know one of the two must be a quadratic since 1 + 1 is not greater than 2.
+        raise PyoframeError("Cannot multiply a quadratic expression by a non-constant.")
+    if self_degree < other_degree:
+        self, other = other, self
+        self_degree, other_degree = other_degree, self_degree
+    if other_degree == 1:
+        assert (
+            self_degree == 1
+        ), "This should always be true since the sum of degrees must be <=2."
+        return _quadratic_multiplication(self, other)
+
+    assert (
+        other_degree == 0
+    ), "This should always be true since other cases have already been handled."
+    multiplier = other.data.drop(
+        VAR_KEY
+    )  # QUAD_VAR_KEY doesn't need to be dropped since we know it doesn't exist
+
+    dims = self.dimensions_unsafe
+    other_dims = other.dimensions_unsafe
+    dims_in_common = [dim for dim in dims if dim in other_dims]
+
+    data = (
+        self.data.join(
+            multiplier,
+            on=dims_in_common if len(dims_in_common) > 0 else None,
+            how="inner" if dims_in_common else "cross",
+        )
+        .with_columns(pl.col(COEF_KEY) * pl.col(COEF_KEY + "_right"))
+        .drop(COEF_KEY + "_right")
+    )
+
+    return self._new(data)
+
+
+def _quadratic_multiplication(self: "Expression", other: "Expression") -> "Expression":
+    """
+    Multiply two expressions of degree 1.
+
+    Examples:
+        >>> import polars as pl
+        >>> df = pl.DataFrame({"dim": [1, 2, 3], "value": [1, 2, 3]})
+        >>> m = pf.Model()
+        >>> m.x1 = pf.Variable()
+        >>> m.x2 = pf.Variable()
+        >>> expr1 = df * m.x1
+        >>> expr2 = df * m.x2 * 2 + 4
+        >>> expr1 * expr2
+        <Expression size=3 dimensions={'dim': 3} terms=6 degree=2>
+        [1]: 4 x1 +2 x2 * x1
+        [2]: 8 x1 +8 x2 * x1
+        [3]: 12 x1 +18 x2 * x1
+        >>> (expr1 * expr2) - df * m.x1 * df * m.x2 * 2
+        <Expression size=3 dimensions={'dim': 3} terms=3>
+        [1]: 4 x1
+        [2]: 8 x1
+        [3]: 12 x1
+    """
+    dims = self.dimensions_unsafe
+    other_dims = other.dimensions_unsafe
+    dims_in_common = [dim for dim in dims if dim in other_dims]
+
+    data = (
+        self.data.join(
+            other.data,
+            on=dims_in_common if len(dims_in_common) > 0 else None,
+            how="inner" if dims_in_common else "cross",
+        )
+        .with_columns(pl.col(COEF_KEY) * pl.col(COEF_KEY + "_right"))
+        .drop(COEF_KEY + "_right")
+        .rename({VAR_KEY + "_right": QUAD_VAR_KEY})
+        # Swap VAR_KEY and QUAD_VAR_KEY so that VAR_KEY is always the larger one
+        .with_columns(
+            pl.when(pl.col(VAR_KEY) < pl.col(QUAD_VAR_KEY))
+            .then(pl.col(QUAD_VAR_KEY))
+            .otherwise(pl.col(VAR_KEY))
+            .alias(VAR_KEY),
+            pl.when(pl.col(VAR_KEY) < pl.col(QUAD_VAR_KEY))
+            .then(pl.col(VAR_KEY))
+            .otherwise(pl.col(QUAD_VAR_KEY))
+            .alias(QUAD_VAR_KEY),
+        )
+    )
+
+    data = _sum_like_terms(data)
+
+    return self._new(data)
+
+
 def _add_expressions_core(*expressions: "Expression") -> "Expression":
     # Mapping of how a sum of two expressions should propogate the unmatched strategy
     propogatation_strategies = {
@@ -163,11 +297,24 @@ def get_indices(expr):
         propogate_strat = expressions[0].unmatched_strategy
         expr_data = [expr.data for expr in expressions]
 
+    # Add quadratic column if it is needed and doesn't already exist
+    if any(QUAD_VAR_KEY in df.columns for df in expr_data):
+        expr_data = [
+            (
+                df.with_columns(pl.lit(CONST_TERM).alias(QUAD_VAR_KEY).cast(KEY_TYPE))
+                if QUAD_VAR_KEY not in df.columns
+                else df
+            )
+            for df in expr_data
+        ]
+
     # Sort columns to allow for concat
-    expr_data = [e.select(sorted(e.columns)) for e in expr_data]
+    expr_data = [
+        e.select(dims + [c for c in e.columns if c not in dims]) for e in expr_data
+    ]
 
     data = pl.concat(expr_data, how="vertical_relaxed")
-    data = data.group_by(dims + [VAR_KEY], maintain_order=True).sum()
+    data = _sum_like_terms(data)
 
     new_expr = expressions[0]._new(data)
     new_expr.unmatched_strategy = propogate_strat
@@ -215,6 +362,20 @@ def _add_dimension(self: "Expression", target: "Expression") -> "Expression":
     return self._new(result)
 
 
+def _sum_like_terms(df: pl.DataFrame) -> pl.DataFrame:
+    """Combines terms with the same variables. Removes the quadratic column if all quadratic terms happen to cancel."""
+    dims = [c for c in df.columns if c not in RESERVED_COL_KEYS]
+    var_cols = [VAR_KEY] + ([QUAD_VAR_KEY] if QUAD_VAR_KEY in df.columns else [])
+    df = (
+        df.group_by(dims + var_cols, maintain_order=True)
+        .sum()
+        .filter(pl.col(COEF_KEY) != 0)
+    )
+    if QUAD_VAR_KEY in df.columns and (df.get_column(QUAD_VAR_KEY) == CONST_TERM).all():
+        df = df.drop(QUAD_VAR_KEY)
+    return df
+
+
 def _get_dimensions(df: pl.DataFrame) -> Optional[List[str]]:
     """
     Returns the dimensions of the DataFrame. Reserved columns do not count as dimensions.