This repository contains code meant to model interactions between actors competing to develop a new, risky technology (we have AI technology in mind in particular, but this could be any technology that carries some risk of unintended negative consequences).
Much of the functionality here overlaps with AIIncentives.jl: both packages are based on the same basic model, but this code specializes in multi-period extensions of that model, while AIIncentives.jl provides a robust set of tools for studying the static model.
The easiest way to use this package is likely via the provided Python bindings. I provide Python wheels in the releases section; if one of those wheels is compatible with your OS and Python version, you should be able to download it and install it with, e.g., pip.
If you want to interact directly with the Rust code or build for an unsupported version of Python, you'll need to take the following steps (otherwise, you can skip this section):
You'll probably want to start by creating a new virtual environment: if you have Conda (e.g., Anaconda), you can do that by running
```
conda create --name pyo3-dev python=3.7
```
You can create your venv in some other way and call it whatever you want, but you do need the Python version to be at least 3.7.
You'll then need to activate the virtual environment. With Conda:
```
conda activate pyo3-dev
```
You can then install the `maturin` build tool in this environment with

```
pip install maturin
```
Finally, to compile the Rust code and create a Python library, run the following from the main directory of this repository:
```
maturin develop
```
(To get added performance, run with the `--release` option, i.e., `maturin develop --release`.)
The `maturin develop` command builds the library and installs it in the active Python environment (the venv you created). If you instead want to build a wheel to install elsewhere, use the `maturin build` command.
As long as the venv you created is active, you should then be able to import the Python bindings as a module called `dynapai`.
Once installed, you can use this like any other Python package. Here's a simple example:
```python
import dynapai as dp

prod_func = dp.ProdFunc(
    a = [10., 10.],
    alpha = [0.5, 0.5],
    b = [10., 10.],
    beta = [0.5, 0.5]
)

payoff_func = dp.PayoffFunc(
    prod_func = prod_func,
    risk_func = dp.RiskFunc.winner_only([0.5, 0.5]),
    d = [1.0, 1.0],
    r = [0.1, 0.1]
)

actions = dp.Actions.from_inputs([1., 1.], [2., 2.])

print(f"Payoff from actions: {payoff_func(actions)}")
```
This should print:

```
Payoff from actions: [-0.2099315 -0.2099315]
```
To make sure things are working correctly, you can try running the `demo.py` script in this directory. That script also gives a good idea of what features are available in the package.
AIIncentives.jl models games where players choose safety and performance inputs, $x_s$ and $x_p$, to maximize expected payoffs of the form

$$u_i = \sum_j q_j \left[ \sigma_j \rho_{ij} - (1 - \sigma_j) d_{ij} \right] - c_i.$$
The notation used is:
- $\sigma_j$ is the probability of a safe outcome given that $j$ wins the contest
- $q_j$ is the probability that $j$ wins the contest
- $\rho_{ij}$ is the payoff to player $i$ if $j$ wins the contest
- $d_{ij}$ is the cost of an unsafe (disaster) outcome to player $i$ if $j$ wins the contest
- $c_i$ is the cost paid to use inputs $x_s, x_p$
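To make the formula concrete, here is a minimal NumPy sketch (independent of this package) that evaluates the expected payoff above for two players; all parameter values are made up for illustration:

```python
import numpy as np

# Illustrative two-player parameters (made up for this example)
sigma = np.array([0.8, 0.6])        # sigma[j]: P(safe outcome | j wins)
q = np.array([0.5, 0.5])            # q[j]: P(j wins the contest)
rho = np.array([[2.0, 0.5],         # rho[i, j]: reward to i if j wins
                [0.5, 2.0]])
d = np.array([[1.0, 1.0],           # d[i, j]: disaster cost to i if j wins
              [1.0, 1.0]])
c = np.array([0.3, 0.3])            # c[i]: input cost for player i

# u[i] = sum_j q[j] * (sigma[j] * rho[i, j] - (1 - sigma[j]) * d[i, j]) - c[i]
u = (q * (sigma * rho - (1 - sigma) * d)).sum(axis=1) - c
print(u)
```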
This package can model those same games, but it can also model dynamic versions of those games, in which players choose a schedule of actions to take over multiple time periods.
In a typical case, we'll assume that players are exponential discounters; i.e., each player has some discount rate, and a payoff received $t$ periods in the future is scaled down by a factor that decays exponentially in $t$.
Each player chooses a strategy, i.e., a sequence of actions, one for each time period.
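As a quick, self-contained illustration of the discounting idea (plain Python, assuming for illustration a per-period discount factor of the form $1/(1+r)$):

```python
# A minimal sketch of exponential discounting (not package code):
# a payoff u received t periods from now is weighted by beta**t.
r = 0.1                      # assumed per-period discount rate
beta = 1.0 / (1.0 + r)       # one common choice of discount factor

payoff_stream = [1.0, 1.0, 1.0, 1.0]   # made-up per-period payoffs
present_value = sum(beta**t * u for t, u in enumerate(payoff_stream))
print(present_value)
```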
The number of time periods (the game's horizon) need not be fixed in advance. One interesting case is to assume that the game ends the first time someone wins the contest; in this case, the length of the game is random and depends on the players' actions. We might also impose a cutoff time, after which the game ends regardless of whether anyone has won.
In addition to choosing the amounts of effort they want to expend on safety and performance ($x_s$ and $x_p$), players can make investments ($i_s$ and $i_p$) that improve their production capabilities in future periods. (This is the form used by the `InvestActions` type in the package; see the sketch below.)
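The text above doesn't pin down exactly how investments feed back into the production function, so the following is only a hedged sketch assuming a simple linear accumulation rule; the package's actual dynamics may differ:

```python
import numpy as np

# Hypothetical update rule, purely for illustration -- the package's
# actual dynamics may differ. Investments raise each player's
# production parameters for the next period.
A = np.array([10.0, 10.0])   # safety productivity (the A in s = A x_s^alpha)
B = np.array([10.0, 10.0])   # performance productivity
i_s = np.array([1.0, 0.5])   # safety investments chosen this period
i_p = np.array([0.5, 1.0])   # performance investments chosen this period

A_next = A + i_s             # assumed: linear accumulation
B_next = B + i_p
```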
If we want to model technology/knowledge spillovers and assume that players can choose how much information they share, we can introduce choice variables $ʃ_s$ and $ʃ_p$. The interpretation here is that players may choose to share a fraction $ʃ_s$ of their safety technology and a fraction $ʃ_p$ of their performance technology with the other players. (This is the form used by the `SharingActions` type in the package; see the sketch below.)
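Again purely as a hedged sketch (the package's actual spillover rule isn't specified here), sharing might let each player benefit from the fraction of productivity that the others share:

```python
import numpy as np

# Hypothetical spillover rule, for illustration only.
A = np.array([12.0, 10.0])        # current safety productivities
share_s = np.array([0.5, 0.2])    # fraction of safety tech each player shares

# Each player gains the shared fraction of every *other* player's productivity.
spillover = (share_s * A).sum() - share_s * A
A_next = A + spillover
```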
This section is meant to give an overview of how the package code is arranged.
The fundamental trait is the `ActionType`, which represents actions for all players in a single time period. There are three action types implemented by default:
- `Actions` -- represents $\{x_s(t), x_p(t)\}$ for some $t$
- `InvestActions` -- represents $\{x_s(t), x_p(t), i_s(t), i_p(t)\}$ for some $t$
- `SharingActions` -- represents $\{x_s(t), x_p(t), i_s(t), i_p(t), ʃ_s(t), ʃ_p(t)\}$ for some $t$
A vector of actions can be packaged into a `Strategies` object, the sequence of actions that constitute a set of strategies for all players.
Key idea: actions are for a single period (but all players); strategies are for multiple time periods (and all players).
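For instance, you might assemble a strategy set from per-period actions along these lines; note that `Strategies.from_actions` is an assumed constructor name, so check `demo.py` for the actual API:

```python
import dynapai as dp

# One Actions object per time period (two players here).
actions_t0 = dp.Actions.from_inputs([1., 1.], [2., 2.])
actions_t1 = dp.Actions.from_inputs([1.5, 1.5], [2.5, 2.5])

# Hypothetical constructor -- check demo.py for the real one.
strategies = dp.Strategies.from_actions([actions_t0, actions_t1])
```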
All of the following types of objects come associated with a specific action type. That means that you cannot define, for example, a payoff function meant for `InvestActions` and then use it on `SharingActions` -- you need to make sure that all the model components you use are compatible with the type of actions you want to work with.
An object with the `PayoffFunction` trait defines a utility function (payoff) for a single period -- i.e., it maps from actions to player payoffs.
The default payoff function is the `ModularPayoff`, which calculates a payoff from the following components:
- A production function, which determines how actions/strategies translate into the outputs $s$ and $p$. The default implementation has $s = A x_s^\alpha$ and $p = B x_p^\beta$. When players make investments and/or share technology, the production function parameters are mutated between time periods.
- A risk function, which determines the probability of a disaster outcome conditional on each player winning the contest.
- A reward function, which determines a matrix of rewards: what player $i$ gets if player $j$ wins.
- A contest success function (CSF), which determines each player's probability of winning.
- A disaster cost function, which determines a matrix of penalties: what player $i$ loses if player $j$ causes a disaster.
- A cost function, which determines the price each player pays for their choice of actions.
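To show how these components compose, here is a rough NumPy sketch of a single-period payoff in the spirit of `ModularPayoff`; the specific risk function, CSF, and cost function below are stand-ins, not the package's defaults:

```python
import numpy as np

def modular_payoff(x_s, x_p, A, alpha, B, beta, sigma_fn, csf, rho, d, cost_fn):
    """Sketch of a ModularPayoff-style computation (illustration only)."""
    s = A * x_s**alpha            # production: safety output
    p = B * x_p**beta             # production: performance output
    sigma = sigma_fn(s)           # risk function: P(safe | j wins), per player
    q = csf(p)                    # contest success function: win probabilities
    # Expected payoff: rewards if safe, penalties if disaster, minus costs.
    return (q * (sigma * rho - (1 - sigma) * d)).sum(axis=1) - cost_fn(x_s, x_p)

# Made-up example components for two players:
u = modular_payoff(
    x_s=np.array([1., 1.]), x_p=np.array([2., 2.]),
    A=np.array([10., 10.]), alpha=np.array([0.5, 0.5]),
    B=np.array([10., 10.]), beta=np.array([0.5, 0.5]),
    sigma_fn=lambda s: s / (1 + s),              # assumed risk function
    csf=lambda p: p / p.sum(),                   # simple Tullock-style CSF
    rho=np.eye(2),                               # winner-takes-all rewards
    d=np.ones((2, 2)),                           # uniform disaster costs
    cost_fn=lambda x_s, x_p: 0.1 * (x_s + x_p),  # assumed linear costs
)
print(u)
```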
An object with the `State` trait describes players' beliefs about the payoff function in a given time period. A simple version is the `CommonBeliefs` type, which just encapsulates a single payoff function.
An object with the `Aggregator` trait can take a starting state and a strategy set (a `Strategies` object) and calculate (aggregate) players' payoffs from those. The default implementations assume players discount future payoffs exponentially.
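Conceptually, an aggregator does something like the following (a pseudocode-style sketch with assumed `state.payoff` and `state.step` methods, not the package's actual implementation):

```python
def aggregate(initial_state, strategies, discount_factor):
    """Sketch of exponential-discounting aggregation (illustration only).

    strategies: a list of per-period actions. state.payoff and state.step
    are assumed interfaces: the former evaluates the current beliefs'
    payoff function, the latter advances the state (e.g., applying
    investment/sharing mutations to the production function).
    """
    state = initial_state
    total = 0.0
    for t, actions in enumerate(strategies):
        total += discount_factor**t * state.payoff(actions)
        state = state.step(actions)   # mutate production params, beliefs, etc.
    return total
```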