From 93aba11b40ceba6665a17996b0eab8d5abe75efa Mon Sep 17 00:00:00 2001 From: Zachary Clement Date: Thu, 18 May 2023 10:06:12 -0400 Subject: [PATCH] causal discovery post --- causal_discovery_store_distance.ipynb | 4190 +++++++++++++++++++++++++ 1 file changed, 4190 insertions(+) create mode 100644 causal_discovery_store_distance.ipynb diff --git a/causal_discovery_store_distance.ipynb b/causal_discovery_store_distance.ipynb new file mode 100644 index 0000000..2f5270a --- /dev/null +++ b/causal_discovery_store_distance.ipynb @@ -0,0 +1,4190 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "8a2b1d76", + "metadata": {}, + "source": [ + "# Causal discovery using Julia: A brief introduction, a simulation, and a possible use case\n", + "\n", + "In situations where the causal structure of a dataset is unknown, algorithms can be used to estimate DAGs for use in causal inference." + ] + }, + { + "cell_type": "markdown", + "id": "34538ef6", + "metadata": {}, + "source": [ + "In [this blog post](https://medium.com/juliazoid/simulating-causal-effects-with-julia-47abca8ab73), I introduced the concept of a DAG (directed acyclic graph) and explained why a DAG is necessary for causal inference. While DAGs are a powerful tool for inferring causal effects, if an incorrect DAG is used to conduct an analysis, the resulting parameter estimates will be biased. When researchers are uncertain about the causal structure of the data but still want to conduct causal inference, causal discovery algorithms can be a useful first step toward determining causal relationships." + ] + }, + { + "cell_type": "markdown", + "id": "fdfa8a70", + "metadata": {}, + "source": [ + "### Causal diagram used in this article\n", + "\n", + "In this article, we will simulate spending patterns at a store. 
We will have covariates for age, a person's distance from the store, a person's personal income, the population density of the person's neighborhood, and the zoning restrictions in place around the person's neighborhood. " + ] + }, + { + "cell_type": "code", + "execution_count": 63, + "id": "33762d17", + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "\u001b[32m\u001b[1m Resolving\u001b[22m\u001b[39m package versions...\n", + "\u001b[32m\u001b[1m Installed\u001b[22m\u001b[39m CausalInference ─ v0.9.1\n", + "\u001b[32m\u001b[1m Updating\u001b[22m\u001b[39m `~/.julia/environments/v1.8/Project.toml`\n", + " \u001b[90m [8e462317] \u001b[39m\u001b[93m~ CausalInference v0.9.1 `/Volumes/SanDisk/repos/CausalInference.jl#tikz_better_regex` ⇒ v0.9.1\u001b[39m\n", + "\u001b[32m\u001b[1m Updating\u001b[22m\u001b[39m `~/.julia/environments/v1.8/Manifest.toml`\n", + " \u001b[90m [8e462317] \u001b[39m\u001b[93m~ CausalInference v0.9.1 `/Volumes/SanDisk/repos/CausalInference.jl#tikz_better_regex` ⇒ v0.9.1\u001b[39m\n", + "\u001b[32m\u001b[1mPrecompiling\u001b[22m\u001b[39m project...\n", + "\u001b[33m ✓ \u001b[39mCausalInference\n", + " 1 dependency successfully precompiled in 8 seconds. 329 already precompiled. 1 skipped during auto due to previous errors.\n", + " \u001b[33m1\u001b[39m dependency precompiled but a different version is currently loaded. 
Restart julia to access the new version\n" + ] + } + ], + "source": [ + "using Pkg\n", + "Pkg.add(\"CausalInference\")" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "04079c0c", + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "", + "image/svg+xml": [ + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + " \n", + " \n", + " Spending\n", + " \n", + " \n", + " \n", + " \n", + " Age\n", + " \n", + " \n", + " \n", + " \n", + " StoreDistance\n", + " \n", + " \n", + " \n", + " \n", + " PersonalIncome\n", + " \n", + " \n", + " \n", + " \n", + " PopulationDensity\n", + " \n", + " \n", + " \n", + " \n", + " Zoning\n", + " \n", + " \n", + "\n", + "\n" + ], + "text/html": [ + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", 
+ " \n", + " \n", + " Spending\n", + " \n", + " \n", + " \n", + " \n", + " Age\n", + " \n", + " \n", + " \n", + " \n", + " StoreDistance\n", + " \n", + " \n", + " \n", + " \n", + " PersonalIncome\n", + " \n", + " \n", + " \n", + " \n", + " PopulationDensity\n", + " \n", + " \n", + " \n", + " \n", + " Zoning\n", + " \n", + " \n", + "\n", + "\n", + "\n" + ], + "text/plain": [ + "Context(Measures.BoundingBox{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}, Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}((0.0w, 0.0h), (1.0w, 1.0h)), UnitBox{Float64, Float64, Float64, Float64}(-1.2, -1.2, 2.4, 2.4, 0.0mm, 0.0mm, 0.0mm, 0.0mm), nothing, nothing, nothing, List([Context(Measures.BoundingBox{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}, Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}((0.0w, 0.0h), (1.0w, 1.0h)), nothing, nothing, nothing, nothing, List([]), List([Compose.Form{Compose.LinePrimitive}(Compose.LinePrimitive[Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(-0.8939339828220179cx, -0.8939339828220179cy), (0.8939339828220179cx, 0.8939339828220179cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(-0.8658359213500126cx, -0.9329179606750063cy), (0.19916925468334587cx, -0.40041537265832705cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(0.40041537265832694cx, -0.199169254683346cy), (0.9329179606750063cx, 0.8658359213500126cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(-0.85cx, 1.0cy), (0.85cx, 1.0cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(-0.8939339828220179cx, 0.8939339828220179cy), (-0.4393993505113155cx, 0.4393993505113154cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(-0.85cx, -0.33333333333333337cy), (0.18333333333333326cx, -0.33333333333333337cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, 
Measure}[(-0.8939339828220179cx, -0.22726731615535126cy), (-0.4393993505113155cx, 0.22726731615535115cy)])], Symbol(\"\"))]), List([Compose.Property{Compose.LineWidthPrimitive}(Compose.LineWidthPrimitive[Compose.LineWidthPrimitive(1.2247448713915892mm)]), Compose.Property{Compose.FillPrimitive}(Compose.FillPrimitive[Compose.FillPrimitive(RGBA{Float64}(0.0,0.0,0.0,0.0))]), Compose.Property{Compose.StrokePrimitive}(Compose.StrokePrimitive[Compose.StrokePrimitive(RGBA{Float64}(0.8274509803921568,0.8274509803921568,0.8274509803921568,1.0))])]), 0, false, false, false, false, nothing, nothing, 0.0, Symbol(\"\")), Context(Measures.BoundingBox{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}, Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}((0.0w, 0.0h), (1.0w, 1.0h)), nothing, nothing, nothing, nothing, List([]), List([Compose.Form{Compose.LinePrimitive}(Compose.LinePrimitive[Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(0.8516721566479479cx, 0.8033032041183529cy), (0.8939339828220179cx, 0.8939339828220179cy), (0.8033032041183529cx, 0.8516721566479479cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(0.13041619736506527cx, -0.4730309158249882cy), (0.19916925468334587cx, -0.40041537265832705cy), (0.09982498575904862cx, -0.41184849261295486cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(0.9214848407203785cx, 0.7664916524257154cy), (0.9329179606750063cx, 0.8658359213500126cy), (0.8603024175083452cx, 0.797082864031732cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(0.7560307379214091cx, 0.9657979856674331cy), (0.85cx, 1.0cy), (0.7560307379214091cx, 1.034202014332567cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(-0.5300301292149805cx, 0.48166117668538533cy), (-0.4393993505113155cx, 0.4393993505113154cy), (-0.48166117668538544cx, 0.5300301292149804cy)]), Compose.LinePrimitive{Tuple{Measure, 
Measure}}(Tuple{Measure, Measure}[(0.08936407125474241cx, -0.36753534766590024cy), (0.18333333333333326cx, -0.33333333333333337cy), (0.08936407125474241cx, -0.2991313190007665cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(-0.48166117668538544cx, 0.13663653745168614cy), (-0.4393993505113155cx, 0.22726731615535115cy), (-0.5300301292149805cx, 0.1850054899812812cy)])], Symbol(\"\"))]), List([Compose.Property{Compose.LineWidthPrimitive}(Compose.LineWidthPrimitive[Compose.LineWidthPrimitive(1.2247448713915892mm)]), Compose.Property{Compose.StrokePrimitive}(Compose.StrokePrimitive[Compose.StrokePrimitive(RGBA{Float64}(0.8274509803921568,0.8274509803921568,0.8274509803921568,1.0))])]), 0, false, false, false, false, nothing, nothing, 0.0, Symbol(\"\")), Context(Measures.BoundingBox{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}, Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}((0.0w, 0.0h), (1.0w, 1.0h)), nothing, nothing, nothing, nothing, List([]), List([]), List([Compose.Property{Compose.FontSizePrimitive}(Compose.FontSizePrimitive[Compose.FontSizePrimitive(4.0mm)]), Compose.Property{Compose.StrokePrimitive}(Compose.StrokePrimitive[Compose.StrokePrimitive(RGBA{Float64}(0.0,0.0,0.0,0.0))]), Compose.Property{Compose.FillPrimitive}(Compose.FillPrimitive[Compose.FillPrimitive(RGBA{Float64}(0.0,0.0,0.0,1.0))])]), 0, false, false, false, false, nothing, nothing, 0.0, Symbol(\"\")), Context(Measures.BoundingBox{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}, Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}((0.0w, 0.0h), (1.0w, 1.0h)), nothing, nothing, nothing, nothing, List([]), List([Compose.Form{Compose.CirclePrimitive{Tuple{Measure, Measure}, Measure}}(Compose.CirclePrimitive{Tuple{Measure, Measure}, Measure}[Compose.CirclePrimitive{Tuple{Measure, Measure}, Measure}((1.0cx, 1.0cy), 0.06w), Compose.CirclePrimitive{Tuple{Measure, Measure}, Measure}((-1.0cx, -1.0cy), 
0.06w), Compose.CirclePrimitive{Tuple{Measure, Measure}, Measure}((0.33333333333333326cx, -0.33333333333333337cy), 0.06w), Compose.CirclePrimitive{Tuple{Measure, Measure}, Measure}((-1.0cx, 1.0cy), 0.06w), Compose.CirclePrimitive{Tuple{Measure, Measure}, Measure}((-0.33333333333333337cx, 0.33333333333333326cy), 0.06w), Compose.CirclePrimitive{Tuple{Measure, Measure}, Measure}((-1.0cx, -0.33333333333333337cy), 0.06w)], Symbol(\"\"))]), List([Compose.Property{Compose.LineWidthPrimitive}(Compose.LineWidthPrimitive[Compose.LineWidthPrimitive(0.0mm), Compose.LineWidthPrimitive(0.0mm), Compose.LineWidthPrimitive(0.0mm), Compose.LineWidthPrimitive(0.0mm), Compose.LineWidthPrimitive(0.0mm), Compose.LineWidthPrimitive(0.0mm)]), Compose.Property{Compose.StrokePrimitive}(Compose.StrokePrimitive[Compose.StrokePrimitive(RGBA{Float64}(0.0,0.0,0.0,1.0))]), Compose.Property{Compose.FillPrimitive}(Compose.FillPrimitive[Compose.FillPrimitive(RGBA{Float64}(0.25098039215686274,0.8784313725490196,0.8156862745098039,1.0)), Compose.FillPrimitive(RGBA{Float64}(0.25098039215686274,0.8784313725490196,0.8156862745098039,1.0)), Compose.FillPrimitive(RGBA{Float64}(0.25098039215686274,0.8784313725490196,0.8156862745098039,1.0)), Compose.FillPrimitive(RGBA{Float64}(0.8274509803921568,0.8274509803921568,0.8274509803921568,1.0)), Compose.FillPrimitive(RGBA{Float64}(0.25098039215686274,0.8784313725490196,0.8156862745098039,1.0)), Compose.FillPrimitive(RGBA{Float64}(0.8274509803921568,0.8274509803921568,0.8274509803921568,1.0))])]), 0, false, false, false, false, nothing, nothing, 0.0, Symbol(\"\")), Context(Measures.BoundingBox{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}, Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}((0.0w, 0.0h), (1.0w, 1.0h)), nothing, nothing, nothing, nothing, List([]), List([Compose.Form{Compose.TextPrimitive{Tuple{Measures.Length{:cx, Float64}, Measures.Length{:cy, Float64}}, Rotation{Tuple{Measures.Length{:w, Float64}, 
Measures.Length{:h, Float64}}}, Tuple{Measures.AbsoluteLength, Measures.AbsoluteLength}}}(Compose.TextPrimitive{Tuple{Measures.Length{:cx, Float64}, Measures.Length{:cy, Float64}}, Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}, Tuple{Measures.AbsoluteLength, Measures.AbsoluteLength}}[Compose.TextPrimitive{Tuple{Measures.Length{:cx, Float64}, Measures.Length{:cy, Float64}}, Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}, Tuple{Measures.AbsoluteLength, Measures.AbsoluteLength}}((1.0cx, 1.0cy), \"Spending\", Compose.HCenter(), Compose.VCenter(), Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}(0.0, (0.5w, 0.5h)), (0.0mm, 0.0mm)), Compose.TextPrimitive{Tuple{Measures.Length{:cx, Float64}, Measures.Length{:cy, Float64}}, Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}, Tuple{Measures.AbsoluteLength, Measures.AbsoluteLength}}((-1.0cx, -1.0cy), \"Age\", Compose.HCenter(), Compose.VCenter(), Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}(0.0, (0.5w, 0.5h)), (0.0mm, 0.0mm)), Compose.TextPrimitive{Tuple{Measures.Length{:cx, Float64}, Measures.Length{:cy, Float64}}, Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}, Tuple{Measures.AbsoluteLength, Measures.AbsoluteLength}}((0.33333333333333326cx, -0.33333333333333337cy), \"Store\\nDistance\", Compose.HCenter(), Compose.VCenter(), Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}(0.0, (0.5w, 0.5h)), (0.0mm, 0.0mm)), Compose.TextPrimitive{Tuple{Measures.Length{:cx, Float64}, Measures.Length{:cy, Float64}}, Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}, Tuple{Measures.AbsoluteLength, Measures.AbsoluteLength}}((-1.0cx, 1.0cy), \"Personal\\nIncome\", Compose.HCenter(), Compose.VCenter(), Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}(0.0, (0.5w, 0.5h)), (0.0mm, 0.0mm)), 
Compose.TextPrimitive{Tuple{Measures.Length{:cx, Float64}, Measures.Length{:cy, Float64}}, Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}, Tuple{Measures.AbsoluteLength, Measures.AbsoluteLength}}((-0.33333333333333337cx, 0.33333333333333326cy), \"Population\\nDensity\", Compose.HCenter(), Compose.VCenter(), Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}(0.0, (0.5w, 0.5h)), (0.0mm, 0.0mm)), Compose.TextPrimitive{Tuple{Measures.Length{:cx, Float64}, Measures.Length{:cy, Float64}}, Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}, Tuple{Measures.AbsoluteLength, Measures.AbsoluteLength}}((-1.0cx, -0.33333333333333337cy), \"Zoning\", Compose.HCenter(), Compose.VCenter(), Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}(0.0, (0.5w, 0.5h)), (0.0mm, 0.0mm))], Symbol(\"\"))]), List([Compose.Property{Compose.FontSizePrimitive}(Compose.FontSizePrimitive[Compose.FontSizePrimitive(3.0mm)]), Compose.Property{Compose.StrokePrimitive}(Compose.StrokePrimitive[Compose.StrokePrimitive(RGBA{Float64}(0.0,0.0,0.0,0.0))]), Compose.Property{Compose.FillPrimitive}(Compose.FillPrimitive[Compose.FillPrimitive(RGBA{Float64}(0.0,0.0,0.0,1.0))])]), 0, false, false, false, false, nothing, nothing, 0.0, Symbol(\"\"))]), List([]), List([]), 0, false, false, false, false, nothing, nothing, 0.0, Symbol(\"\"))" + ] + }, + "execution_count": 4, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "using Graphs, GraphPlot\n", + "using Compose, Cairo, Fontconfig\n", + "using Colors\n", + "\n", + "g = SimpleDiGraph(6)\n", + "add_edge!(g, 3,1)\n", + "add_edge!(g, 4,1)\n", + "add_edge!(g, 4,5)\n", + "add_edge!(g, 6,3)\n", + "add_edge!(g, 6,5)\n", + "add_edge!(g, 2,1)\n", + "add_edge!(g, 2,3)\n", + "\n", + "\n", + "\n", + "\n", + "nodelabel = [\"Spending\", \"Age\", \"Store\\nDistance\",\n", + " \"Personal\\nIncome\", \"Population\\nDensity\", \"Zoning\"\n", + "\n", + 
"]\n", + "x_dims = Dict(\"Spending\" =>4, \n", + " \"Age\"=>1, \n", + " \"Store\\nDistance\"=>3, \n", + " \"Personal\\nIncome\"=>1, \n", + " \"Population\\nDensity\"=>2, \n", + " \"Zoning\"=>1)\n", + "locs_x = [x_dims[label] for label in nodelabel]\n", + "y_dims = Dict(\"Spending\" =>4, \n", + " \"Age\"=>1, \n", + " \"Gender\"=>1, \n", + " \"Store\\nDistance\"=>2, \n", + " \"Personal\\nIncome\"=>4, \n", + " \"Population\\nDensity\"=>3, \n", + " \"Zoning\"=>2)\n", + "locs_y = [y_dims[label] for label in nodelabel]\n", + "\n", + "\n", + "\n", + "nodefillc_dict = \n", + "Dict(\"Spending\" =>colorant\"turquoise\", \n", + " \"Age\"=>colorant\"turquoise\", \n", + " \"Gender\"=>colorant\"turquoise\", \n", + " \"Store\\nDistance\"=>colorant\"turquoise\", \n", + " \"Personal\\nIncome\"=>colorant\"lightgrey\", \n", + " \"Population\\nDensity\"=>colorant\"turquoise\", \n", + " \"Zoning\"=>colorant\"lightgrey\")\n", + "\n", + "nodefillc = [nodefillc_dict[label] for label in nodelabel]\n", + "\n", + "nodestrokelw = [0, 0, 0, 0, 0, 0]\n", + "\n", + "\n", + "g_plot = gplot(g, locs_x, locs_y, nodelabel=nodelabel, \n", + " nodestrokec = \"black\", nodestrokelw = nodestrokelw, NODESIZE = .15, NODELABELSIZE = 3, nodefillc=nodefillc)\n", + "draw(PNG(\"causal_discovery_1.png\", 16cm, 16cm),g_plot)\n", + "g_plot" + ] + }, + { + "cell_type": "markdown", + "id": "4ee968ad", + "metadata": {}, + "source": [ + "As shown in the DAG, we will assume that: \n", + "1. Older people spend more than younger people\n", + "2. Older people will choose to live farther from stores than younger people (this is kind of contrived, but my wife has told me before that she wouldn't ever want to live more than an hour from a Target, so it could be true in some hypothetical world)\n", + "3. Zoning requirements may force stores to be located at a greater distance from the individual\n", + "4. Zoning requirements influence population density in the area\n", + "5. 
Individuals with higher personal income levels will be willing to live in areas of higher population density\n", + "6. Individuals with higher personal income levels will spend more at stores\n", + "7. There are no outside variables that are a common cause of any of the variables we have mentioned\n", + "\n", + "We will further assume that information on age, distance from stores, spending patterns, and population density in an individual's zip code is available for the analysis, but that information on zoning and personal income is expensive to obtain. \n", + "\n", + "This simulation is intended to evaluate whether causal discovery may be used in a specific use case. Imagine that we are working on a marketing team, and we want to get a precise estimate of the effect of distance from the store on spending patterns. Further imagine that we want to avoid spending too much money on obtaining information regarding zoning patterns and personal income, but that this information is available at a given cost per customer. \n", + "\n", + "In this scenario, we might be able to spend a relatively small amount of money to obtain all available information for a small subset of the customers, and use that data to create a causal diagram. Then, we could use that causal diagram to determine which variables in our freely available dataset we should condition on to do our analysis." + ] + }, + { + "cell_type": "markdown", + "id": "ecc1ed04", + "metadata": {}, + "source": [ + "By looking at the DAG, we can see that we can block the backdoor path between store distance and spending by conditioning on age alone. 
" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "2ea5dd64", + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "", + "image/svg+xml": [ + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + " \n", + " \n", + " Spending\n", + " \n", + " \n", + " \n", + " \n", + " Age\n", + " \n", + " \n", + " \n", + " \n", + " StoreDistance\n", + " \n", + " \n", + " \n", + " \n", + " PersonalIncome\n", + " \n", + " \n", + " \n", + " \n", + " PopulationDensity\n", + " \n", + " \n", + " \n", + " \n", + " Zoning\n", + " \n", + " \n", + "\n", + "\n" + ], + "text/html": [ + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + " \n", + " \n", + " Spending\n", + " \n", + " \n", + " \n", + " \n", + " Age\n", + " \n", + " \n", + " \n", + " \n", 
+ " StoreDistance\n", + " \n", + " \n", + " \n", + " \n", + " PersonalIncome\n", + " \n", + " \n", + " \n", + " \n", + " PopulationDensity\n", + " \n", + " \n", + " \n", + " \n", + " Zoning\n", + " \n", + " \n", + "\n", + "\n", + "\n" + ], + "text/plain": [ + "Context(Measures.BoundingBox{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}, Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}((0.0w, 0.0h), (1.0w, 1.0h)), UnitBox{Float64, Float64, Float64, Float64}(-1.2, -1.2, 2.4, 2.4, 0.0mm, 0.0mm, 0.0mm, 0.0mm), nothing, nothing, nothing, List([Context(Measures.BoundingBox{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}, Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}((0.0w, 0.0h), (1.0w, 1.0h)), nothing, nothing, nothing, nothing, List([]), List([Compose.Form{Compose.LinePrimitive}(Compose.LinePrimitive[Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(-0.8939339828220179cx, -0.8939339828220179cy), (0.8939339828220179cx, 0.8939339828220179cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(-0.8658359213500126cx, -0.9329179606750063cy), (0.19916925468334587cx, -0.40041537265832705cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(0.40041537265832694cx, -0.199169254683346cy), (0.9329179606750063cx, 0.8658359213500126cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(-0.85cx, 1.0cy), (0.85cx, 1.0cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(-0.8939339828220179cx, 0.8939339828220179cy), (-0.4393993505113155cx, 0.4393993505113154cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(-0.85cx, -0.33333333333333337cy), (0.18333333333333326cx, -0.33333333333333337cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(-0.8939339828220179cx, -0.22726731615535126cy), (-0.4393993505113155cx, 0.22726731615535115cy)])], 
Symbol(\"\"))]), List([Compose.Property{Compose.LineWidthPrimitive}(Compose.LineWidthPrimitive[Compose.LineWidthPrimitive(1.2247448713915892mm)]), Compose.Property{Compose.FillPrimitive}(Compose.FillPrimitive[Compose.FillPrimitive(RGBA{Float64}(0.0,0.0,0.0,0.0))]), Compose.Property{Compose.StrokePrimitive}(Compose.StrokePrimitive[Compose.StrokePrimitive(RGBA{Float64}(0.8274509803921568,0.8274509803921568,0.8274509803921568,1.0))])]), 0, false, false, false, false, nothing, nothing, 0.0, Symbol(\"\")), Context(Measures.BoundingBox{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}, Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}((0.0w, 0.0h), (1.0w, 1.0h)), nothing, nothing, nothing, nothing, List([]), List([Compose.Form{Compose.LinePrimitive}(Compose.LinePrimitive[Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(0.8516721566479479cx, 0.8033032041183529cy), (0.8939339828220179cx, 0.8939339828220179cy), (0.8033032041183529cx, 0.8516721566479479cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(0.13041619736506527cx, -0.4730309158249882cy), (0.19916925468334587cx, -0.40041537265832705cy), (0.09982498575904862cx, -0.41184849261295486cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(0.9214848407203785cx, 0.7664916524257154cy), (0.9329179606750063cx, 0.8658359213500126cy), (0.8603024175083452cx, 0.797082864031732cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(0.7560307379214091cx, 0.9657979856674331cy), (0.85cx, 1.0cy), (0.7560307379214091cx, 1.034202014332567cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(-0.5300301292149805cx, 0.48166117668538533cy), (-0.4393993505113155cx, 0.4393993505113154cy), (-0.48166117668538544cx, 0.5300301292149804cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(0.08936407125474241cx, -0.36753534766590024cy), (0.18333333333333326cx, 
-0.33333333333333337cy), (0.08936407125474241cx, -0.2991313190007665cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(-0.48166117668538544cx, 0.13663653745168614cy), (-0.4393993505113155cx, 0.22726731615535115cy), (-0.5300301292149805cx, 0.1850054899812812cy)])], Symbol(\"\"))]), List([Compose.Property{Compose.LineWidthPrimitive}(Compose.LineWidthPrimitive[Compose.LineWidthPrimitive(1.2247448713915892mm)]), Compose.Property{Compose.StrokePrimitive}(Compose.StrokePrimitive[Compose.StrokePrimitive(RGBA{Float64}(0.8274509803921568,0.8274509803921568,0.8274509803921568,1.0))])]), 0, false, false, false, false, nothing, nothing, 0.0, Symbol(\"\")), Context(Measures.BoundingBox{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}, Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}((0.0w, 0.0h), (1.0w, 1.0h)), nothing, nothing, nothing, nothing, List([]), List([]), List([Compose.Property{Compose.FontSizePrimitive}(Compose.FontSizePrimitive[Compose.FontSizePrimitive(4.0mm)]), Compose.Property{Compose.StrokePrimitive}(Compose.StrokePrimitive[Compose.StrokePrimitive(RGBA{Float64}(0.0,0.0,0.0,0.0))]), Compose.Property{Compose.FillPrimitive}(Compose.FillPrimitive[Compose.FillPrimitive(RGBA{Float64}(0.0,0.0,0.0,1.0))])]), 0, false, false, false, false, nothing, nothing, 0.0, Symbol(\"\")), Context(Measures.BoundingBox{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}, Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}((0.0w, 0.0h), (1.0w, 1.0h)), nothing, nothing, nothing, nothing, List([]), List([Compose.Form{Compose.CirclePrimitive{Tuple{Measure, Measure}, Measure}}(Compose.CirclePrimitive{Tuple{Measure, Measure}, Measure}[Compose.CirclePrimitive{Tuple{Measure, Measure}, Measure}((1.0cx, 1.0cy), 0.06w), Compose.CirclePrimitive{Tuple{Measure, Measure}, Measure}((-1.0cx, -1.0cy), 0.06w), Compose.CirclePrimitive{Tuple{Measure, Measure}, Measure}((0.33333333333333326cx, 
-0.33333333333333337cy), 0.06w), Compose.CirclePrimitive{Tuple{Measure, Measure}, Measure}((-1.0cx, 1.0cy), 0.06w), Compose.CirclePrimitive{Tuple{Measure, Measure}, Measure}((-0.33333333333333337cx, 0.33333333333333326cy), 0.06w), Compose.CirclePrimitive{Tuple{Measure, Measure}, Measure}((-1.0cx, -0.33333333333333337cy), 0.06w)], Symbol(\"\"))]), List([Compose.Property{Compose.LineWidthPrimitive}(Compose.LineWidthPrimitive[Compose.LineWidthPrimitive(0.0mm), Compose.LineWidthPrimitive(1.2247448713915892mm), Compose.LineWidthPrimitive(0.0mm), Compose.LineWidthPrimitive(0.0mm), Compose.LineWidthPrimitive(0.0mm), Compose.LineWidthPrimitive(0.0mm)]), Compose.Property{Compose.StrokePrimitive}(Compose.StrokePrimitive[Compose.StrokePrimitive(RGBA{Float64}(0.0,0.0,0.0,1.0))]), Compose.Property{Compose.FillPrimitive}(Compose.FillPrimitive[Compose.FillPrimitive(RGBA{Float64}(0.25098039215686274,0.8784313725490196,0.8156862745098039,1.0)), Compose.FillPrimitive(RGBA{Float64}(0.25098039215686274,0.8784313725490196,0.8156862745098039,1.0)), Compose.FillPrimitive(RGBA{Float64}(0.25098039215686274,0.8784313725490196,0.8156862745098039,1.0)), Compose.FillPrimitive(RGBA{Float64}(0.8274509803921568,0.8274509803921568,0.8274509803921568,1.0)), Compose.FillPrimitive(RGBA{Float64}(0.25098039215686274,0.8784313725490196,0.8156862745098039,1.0)), Compose.FillPrimitive(RGBA{Float64}(0.8274509803921568,0.8274509803921568,0.8274509803921568,1.0))])]), 0, false, false, false, false, nothing, nothing, 0.0, Symbol(\"\")), Context(Measures.BoundingBox{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}, Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}((0.0w, 0.0h), (1.0w, 1.0h)), nothing, nothing, nothing, nothing, List([]), List([Compose.Form{Compose.TextPrimitive{Tuple{Measures.Length{:cx, Float64}, Measures.Length{:cy, Float64}}, Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}, Tuple{Measures.AbsoluteLength, 
Measures.AbsoluteLength}}}(Compose.TextPrimitive{Tuple{Measures.Length{:cx, Float64}, Measures.Length{:cy, Float64}}, Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}, Tuple{Measures.AbsoluteLength, Measures.AbsoluteLength}}[Compose.TextPrimitive{Tuple{Measures.Length{:cx, Float64}, Measures.Length{:cy, Float64}}, Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}, Tuple{Measures.AbsoluteLength, Measures.AbsoluteLength}}((1.0cx, 1.0cy), \"Spending\", Compose.HCenter(), Compose.VCenter(), Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}(0.0, (0.5w, 0.5h)), (0.0mm, 0.0mm)), Compose.TextPrimitive{Tuple{Measures.Length{:cx, Float64}, Measures.Length{:cy, Float64}}, Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}, Tuple{Measures.AbsoluteLength, Measures.AbsoluteLength}}((-1.0cx, -1.0cy), \"Age\", Compose.HCenter(), Compose.VCenter(), Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}(0.0, (0.5w, 0.5h)), (0.0mm, 0.0mm)), Compose.TextPrimitive{Tuple{Measures.Length{:cx, Float64}, Measures.Length{:cy, Float64}}, Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}, Tuple{Measures.AbsoluteLength, Measures.AbsoluteLength}}((0.33333333333333326cx, -0.33333333333333337cy), \"Store\\nDistance\", Compose.HCenter(), Compose.VCenter(), Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}(0.0, (0.5w, 0.5h)), (0.0mm, 0.0mm)), Compose.TextPrimitive{Tuple{Measures.Length{:cx, Float64}, Measures.Length{:cy, Float64}}, Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}, Tuple{Measures.AbsoluteLength, Measures.AbsoluteLength}}((-1.0cx, 1.0cy), \"Personal\\nIncome\", Compose.HCenter(), Compose.VCenter(), Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}(0.0, (0.5w, 0.5h)), (0.0mm, 0.0mm)), Compose.TextPrimitive{Tuple{Measures.Length{:cx, Float64}, 
Measures.Length{:cy, Float64}}, Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}, Tuple{Measures.AbsoluteLength, Measures.AbsoluteLength}}((-0.33333333333333337cx, 0.33333333333333326cy), \"Population\\nDensity\", Compose.HCenter(), Compose.VCenter(), Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}(0.0, (0.5w, 0.5h)), (0.0mm, 0.0mm)), Compose.TextPrimitive{Tuple{Measures.Length{:cx, Float64}, Measures.Length{:cy, Float64}}, Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}, Tuple{Measures.AbsoluteLength, Measures.AbsoluteLength}}((-1.0cx, -0.33333333333333337cy), \"Zoning\", Compose.HCenter(), Compose.VCenter(), Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}(0.0, (0.5w, 0.5h)), (0.0mm, 0.0mm))], Symbol(\"\"))]), List([Compose.Property{Compose.FontSizePrimitive}(Compose.FontSizePrimitive[Compose.FontSizePrimitive(3.0mm)]), Compose.Property{Compose.StrokePrimitive}(Compose.StrokePrimitive[Compose.StrokePrimitive(RGBA{Float64}(0.0,0.0,0.0,0.0))]), Compose.Property{Compose.FillPrimitive}(Compose.FillPrimitive[Compose.FillPrimitive(RGBA{Float64}(0.0,0.0,0.0,1.0))])]), 0, false, false, false, false, nothing, nothing, 0.0, Symbol(\"\"))]), List([]), List([]), 0, false, false, false, false, nothing, nothing, 0.0, Symbol(\"\"))" + ] + }, + "execution_count": 5, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "\n", + "\n", + "nodestrokelw = [0, 2, 0, 0, 0, 0]\n", + "\n", + "\n", + "g_plot = gplot(g, locs_x, locs_y, nodelabel=nodelabel, \n", + " nodestrokec = \"black\", nodestrokelw = nodestrokelw, NODESIZE = .15, NODELABELSIZE = 3, nodefillc=nodefillc)\n", + "draw(PNG(\"causal_discovery_2.png\", 16cm, 16cm),g_plot)\n", + "g_plot" + ] + }, + { + "cell_type": "markdown", + "id": "2bb05756", + "metadata": {}, + "source": [ + "However, if we were to condition on population density, we would be conditioning on a collider (also known as [M 
bias](https://towardsdatascience.com/causal-inference-in-data-science-structure-of-m-bias-with-confounding-adjustment-70e4a263ad08)) and opening a backdoor path between store distance and spending." + ] + }, + { + "cell_type": "code", + "execution_count": 64, + "id": "3ec1d23c", + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "", + "image/svg+xml": [ + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + " \n", + " \n", + " Spending\n", + " \n", + " \n", + " \n", + " \n", + " Age\n", + " \n", + " \n", + " \n", + " \n", + " StoreDistance\n", + " \n", + " \n", + " \n", + " \n", + " PersonalIncome\n", + " \n", + " \n", + " \n", + " \n", + " PopulationDensity\n", + " \n", + " \n", + " \n", + " \n", + " Zoning\n", + " \n", + " \n", + "\n", + "\n" + ], + "text/html": [ + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", 
+ " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + " \n", + " \n", + " Spending\n", + " \n", + " \n", + " \n", + " \n", + " Age\n", + " \n", + " \n", + " \n", + " \n", + " StoreDistance\n", + " \n", + " \n", + " \n", + " \n", + " PersonalIncome\n", + " \n", + " \n", + " \n", + " \n", + " PopulationDensity\n", + " \n", + " \n", + " \n", + " \n", + " Zoning\n", + " \n", + " \n", + "\n", + "\n", + "\n" + ], + "text/plain": [ + "Context(BBox{l,t,r,b,w,h = 0.0w,0.0h, 1.0w,1.0h, 1.0w,1.0h}, UnitBox{Float64, Float64, Float64, Float64}(-1.2, -1.2, 2.4, 2.4, 0.0mm, 0.0mm, 0.0mm, 0.0mm), nothing, nothing, nothing, List([Context(BBox{l,t,r,b,w,h = 0.0w,0.0h, 1.0w,1.0h, 1.0w,1.0h}, nothing, nothing, nothing, nothing, List([]), List([Compose.Form{Compose.LinePrimitive}(Compose.LinePrimitive[Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(-0.8939339828220179cx, -0.8939339828220179cy), (0.8939339828220179cx, 0.8939339828220179cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(-0.8658359213500126cx, -0.9329179606750063cy), (0.19916925468334587cx, -0.40041537265832705cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(0.40041537265832694cx, -0.199169254683346cy), (0.9329179606750063cx, 0.8658359213500126cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(-0.85cx, 1.0cy), (0.85cx, 1.0cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(-0.8939339828220179cx, 0.8939339828220179cy), (-0.4393993505113155cx, 0.4393993505113154cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(-0.85cx, -0.33333333333333337cy), (0.18333333333333326cx, -0.33333333333333337cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(-0.8939339828220179cx, -0.22726731615535126cy), (-0.4393993505113155cx, 0.22726731615535115cy)])], Symbol(\"\"))]), 
List([Compose.Property{Compose.LineWidthPrimitive}(Compose.LineWidthPrimitive[Compose.LineWidthPrimitive(1.2247448713915892mm)]), Compose.Property{Compose.FillPrimitive}(Compose.FillPrimitive[Compose.FillPrimitive(RGBA{Float64}(0.0,0.0,0.0,0.0))]), Compose.Property{Compose.StrokePrimitive}(Compose.StrokePrimitive[Compose.StrokePrimitive(RGBA{Float64}(0.8274509803921568,0.8274509803921568,0.8274509803921568,1.0))])]), 0, false, false, false, false, nothing, nothing, 0.0, Symbol(\"\")), Context(BBox{l,t,r,b,w,h = 0.0w,0.0h, 1.0w,1.0h, 1.0w,1.0h}, nothing, nothing, nothing, nothing, List([]), List([Compose.Form{Compose.LinePrimitive}(Compose.LinePrimitive[Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(0.8516721566479479cx, 0.8033032041183529cy), (0.8939339828220179cx, 0.8939339828220179cy), (0.8033032041183529cx, 0.8516721566479479cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(0.13041619736506527cx, -0.4730309158249882cy), (0.19916925468334587cx, -0.40041537265832705cy), (0.09982498575904862cx, -0.41184849261295486cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(0.9214848407203785cx, 0.7664916524257154cy), (0.9329179606750063cx, 0.8658359213500126cy), (0.8603024175083452cx, 0.797082864031732cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(0.7560307379214091cx, 0.9657979856674331cy), (0.85cx, 1.0cy), (0.7560307379214091cx, 1.034202014332567cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(-0.5300301292149805cx, 0.48166117668538533cy), (-0.4393993505113155cx, 0.4393993505113154cy), (-0.48166117668538544cx, 0.5300301292149804cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(0.08936407125474241cx, -0.36753534766590024cy), (0.18333333333333326cx, -0.33333333333333337cy), (0.08936407125474241cx, -0.2991313190007665cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, 
Measure}[(-0.48166117668538544cx, 0.13663653745168614cy), (-0.4393993505113155cx, 0.22726731615535115cy), (-0.5300301292149805cx, 0.1850054899812812cy)])], Symbol(\"\"))]), List([Compose.Property{Compose.LineWidthPrimitive}(Compose.LineWidthPrimitive[Compose.LineWidthPrimitive(1.2247448713915892mm)]), Compose.Property{Compose.StrokePrimitive}(Compose.StrokePrimitive[Compose.StrokePrimitive(RGBA{Float64}(0.8274509803921568,0.8274509803921568,0.8274509803921568,1.0))])]), 0, false, false, false, false, nothing, nothing, 0.0, Symbol(\"\")), Context(BBox{l,t,r,b,w,h = 0.0w,0.0h, 1.0w,1.0h, 1.0w,1.0h}, nothing, nothing, nothing, nothing, List([]), List([]), List([Compose.Property{Compose.FontSizePrimitive}(Compose.FontSizePrimitive[Compose.FontSizePrimitive(4.0mm)]), Compose.Property{Compose.StrokePrimitive}(Compose.StrokePrimitive[Compose.StrokePrimitive(RGBA{Float64}(0.0,0.0,0.0,0.0))]), Compose.Property{Compose.FillPrimitive}(Compose.FillPrimitive[Compose.FillPrimitive(RGBA{Float64}(0.0,0.0,0.0,1.0))])]), 0, false, false, false, false, nothing, nothing, 0.0, Symbol(\"\")), Context(BBox{l,t,r,b,w,h = 0.0w,0.0h, 1.0w,1.0h, 1.0w,1.0h}, nothing, nothing, nothing, nothing, List([]), List([Compose.Form{Compose.CirclePrimitive{Tuple{Measure, Measure}, Measure}}(Compose.CirclePrimitive{Tuple{Measure, Measure}, Measure}[Compose.CirclePrimitive{Tuple{Measure, Measure}, Measure}((1.0cx, 1.0cy), 0.06w), Compose.CirclePrimitive{Tuple{Measure, Measure}, Measure}((-1.0cx, -1.0cy), 0.06w), Compose.CirclePrimitive{Tuple{Measure, Measure}, Measure}((0.33333333333333326cx, -0.33333333333333337cy), 0.06w), Compose.CirclePrimitive{Tuple{Measure, Measure}, Measure}((-1.0cx, 1.0cy), 0.06w), Compose.CirclePrimitive{Tuple{Measure, Measure}, Measure}((-0.33333333333333337cx, 0.33333333333333326cy), 0.06w), Compose.CirclePrimitive{Tuple{Measure, Measure}, Measure}((-1.0cx, -0.33333333333333337cy), 0.06w)], Symbol(\"\"))]), 
List([Compose.Property{Compose.LineWidthPrimitive}(Compose.LineWidthPrimitive[Compose.LineWidthPrimitive(0.0mm), Compose.LineWidthPrimitive(1.2247448713915892mm), Compose.LineWidthPrimitive(0.0mm), Compose.LineWidthPrimitive(0.0mm), Compose.LineWidthPrimitive(1.2247448713915892mm), Compose.LineWidthPrimitive(0.0mm)]), Compose.Property{Compose.StrokePrimitive}(Compose.StrokePrimitive[Compose.StrokePrimitive(RGBA{Float64}(0.0,0.0,0.0,1.0))]), Compose.Property{Compose.FillPrimitive}(Compose.FillPrimitive[Compose.FillPrimitive(RGBA{Float64}(0.25098039215686274,0.8784313725490196,0.8156862745098039,1.0)), Compose.FillPrimitive(RGBA{Float64}(0.25098039215686274,0.8784313725490196,0.8156862745098039,1.0)), Compose.FillPrimitive(RGBA{Float64}(0.25098039215686274,0.8784313725490196,0.8156862745098039,1.0)), Compose.FillPrimitive(RGBA{Float64}(0.8274509803921568,0.8274509803921568,0.8274509803921568,1.0)), Compose.FillPrimitive(RGBA{Float64}(0.25098039215686274,0.8784313725490196,0.8156862745098039,1.0)), Compose.FillPrimitive(RGBA{Float64}(0.8274509803921568,0.8274509803921568,0.8274509803921568,1.0))])]), 0, false, false, false, false, nothing, nothing, 0.0, Symbol(\"\")), Context(BBox{l,t,r,b,w,h = 0.0w,0.0h, 1.0w,1.0h, 1.0w,1.0h}, nothing, nothing, nothing, nothing, List([]), List([Compose.Form{Compose.TextPrimitive{Tuple{Measures.Length{:cx, Float64}, Measures.Length{:cy, Float64}}, Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}, Tuple{Measures.AbsoluteLength, Measures.AbsoluteLength}}}(Compose.TextPrimitive{Tuple{Measures.Length{:cx, Float64}, Measures.Length{:cy, Float64}}, Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}, Tuple{Measures.AbsoluteLength, Measures.AbsoluteLength}}[Compose.TextPrimitive{Tuple{Measures.Length{:cx, Float64}, Measures.Length{:cy, Float64}}, Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}, Tuple{Measures.AbsoluteLength, Measures.AbsoluteLength}}((1.0cx, 
1.0cy), \"Spending\", Compose.HCenter(), Compose.VCenter(), Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}(0.0, (0.5w, 0.5h)), (0.0mm, 0.0mm)), Compose.TextPrimitive{Tuple{Measures.Length{:cx, Float64}, Measures.Length{:cy, Float64}}, Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}, Tuple{Measures.AbsoluteLength, Measures.AbsoluteLength}}((-1.0cx, -1.0cy), \"Age\", Compose.HCenter(), Compose.VCenter(), Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}(0.0, (0.5w, 0.5h)), (0.0mm, 0.0mm)), Compose.TextPrimitive{Tuple{Measures.Length{:cx, Float64}, Measures.Length{:cy, Float64}}, Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}, Tuple{Measures.AbsoluteLength, Measures.AbsoluteLength}}((0.33333333333333326cx, -0.33333333333333337cy), \"Store\\nDistance\", Compose.HCenter(), Compose.VCenter(), Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}(0.0, (0.5w, 0.5h)), (0.0mm, 0.0mm)), Compose.TextPrimitive{Tuple{Measures.Length{:cx, Float64}, Measures.Length{:cy, Float64}}, Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}, Tuple{Measures.AbsoluteLength, Measures.AbsoluteLength}}((-1.0cx, 1.0cy), \"Personal\\nIncome\", Compose.HCenter(), Compose.VCenter(), Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}(0.0, (0.5w, 0.5h)), (0.0mm, 0.0mm)), Compose.TextPrimitive{Tuple{Measures.Length{:cx, Float64}, Measures.Length{:cy, Float64}}, Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}, Tuple{Measures.AbsoluteLength, Measures.AbsoluteLength}}((-0.33333333333333337cx, 0.33333333333333326cy), \"Population\\nDensity\", Compose.HCenter(), Compose.VCenter(), Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}(0.0, (0.5w, 0.5h)), (0.0mm, 0.0mm)), Compose.TextPrimitive{Tuple{Measures.Length{:cx, Float64}, Measures.Length{:cy, Float64}}, 
Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}, Tuple{Measures.AbsoluteLength, Measures.AbsoluteLength}}((-1.0cx, -0.33333333333333337cy), \"Zoning\", Compose.HCenter(), Compose.VCenter(), Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}(0.0, (0.5w, 0.5h)), (0.0mm, 0.0mm))], Symbol(\"\"))]), List([Compose.Property{Compose.FontSizePrimitive}(Compose.FontSizePrimitive[Compose.FontSizePrimitive(3.0mm)]), Compose.Property{Compose.StrokePrimitive}(Compose.StrokePrimitive[Compose.StrokePrimitive(RGBA{Float64}(0.0,0.0,0.0,0.0))]), Compose.Property{Compose.FillPrimitive}(Compose.FillPrimitive[Compose.FillPrimitive(RGBA{Float64}(0.0,0.0,0.0,1.0))])]), 0, false, false, false, false, nothing, nothing, 0.0, Symbol(\"\"))]), List([]), List([]), 0, false, false, false, false, nothing, nothing, 0.0, Symbol(\"\"))" + ] + }, + "execution_count": 64, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nodestrokelw = [0, 2, 0, 0, 2, 0]\n", + "\n", + "\n", + "g_plot = gplot(g, locs_x, locs_y, nodelabel=nodelabel, \n", + " nodestrokec = \"black\", nodestrokelw = nodestrokelw, NODESIZE = .15, NODELABELSIZE = 3, nodefillc=nodefillc)\n", + "draw(PNG(\"causal_discovery_m_bias.png\", 16cm, 16cm),g_plot)\n", + "g_plot" + ] + }, + { + "cell_type": "markdown", + "id": "d5a11e41", + "metadata": {}, + "source": [ + "We could of course condition on all of these variables, but it would be expensive to obtain data for all customers, and our confidence intervals would be wide because of [multicollinearity](https://en.wikipedia.org/wiki/Multicollinearity). 
" + ] + }, + { + "cell_type": "code", + "execution_count": 66, + "id": "59598840", + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "", + "image/svg+xml": [ + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + " \n", + " \n", + " Spending\n", + " \n", + " \n", + " \n", + " \n", + " Age\n", + " \n", + " \n", + " \n", + " \n", + " StoreDistance\n", + " \n", + " \n", + " \n", + " \n", + " PersonalIncome\n", + " \n", + " \n", + " \n", + " \n", + " PopulationDensity\n", + " \n", + " \n", + " \n", + " \n", + " Zoning\n", + " \n", + " \n", + "\n", + "\n" + ], + "text/html": [ + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + " \n", + " \n", + " Spending\n", + " \n", + " \n", + " \n", + " \n", + " Age\n", + " \n", + " \n", + " \n", + " 
\n", + " StoreDistance\n", + " \n", + " \n", + " \n", + " \n", + " PersonalIncome\n", + " \n", + " \n", + " \n", + " \n", + " PopulationDensity\n", + " \n", + " \n", + " \n", + " \n", + " Zoning\n", + " \n", + " \n", + "\n", + "\n", + "\n" + ], + "text/plain": [ + "Context(BBox{l,t,r,b,w,h = 0.0w,0.0h, 1.0w,1.0h, 1.0w,1.0h}, UnitBox{Float64, Float64, Float64, Float64}(-1.2, -1.2, 2.4, 2.4, 0.0mm, 0.0mm, 0.0mm, 0.0mm), nothing, nothing, nothing, List([Context(BBox{l,t,r,b,w,h = 0.0w,0.0h, 1.0w,1.0h, 1.0w,1.0h}, nothing, nothing, nothing, nothing, List([]), List([Compose.Form{Compose.LinePrimitive}(Compose.LinePrimitive[Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(-0.8939339828220179cx, -0.8939339828220179cy), (0.8939339828220179cx, 0.8939339828220179cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(-0.8658359213500126cx, -0.9329179606750063cy), (0.19916925468334587cx, -0.40041537265832705cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(0.40041537265832694cx, -0.199169254683346cy), (0.9329179606750063cx, 0.8658359213500126cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(-0.85cx, 1.0cy), (0.85cx, 1.0cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(-0.8939339828220179cx, 0.8939339828220179cy), (-0.4393993505113155cx, 0.4393993505113154cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(-0.85cx, -0.33333333333333337cy), (0.18333333333333326cx, -0.33333333333333337cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(-0.8939339828220179cx, -0.22726731615535126cy), (-0.4393993505113155cx, 0.22726731615535115cy)])], Symbol(\"\"))]), List([Compose.Property{Compose.LineWidthPrimitive}(Compose.LineWidthPrimitive[Compose.LineWidthPrimitive(1.2247448713915892mm)]), Compose.Property{Compose.FillPrimitive}(Compose.FillPrimitive[Compose.FillPrimitive(RGBA{Float64}(0.0,0.0,0.0,0.0))]), 
Compose.Property{Compose.StrokePrimitive}(Compose.StrokePrimitive[Compose.StrokePrimitive(RGBA{Float64}(0.8274509803921568,0.8274509803921568,0.8274509803921568,1.0))])]), 0, false, false, false, false, nothing, nothing, 0.0, Symbol(\"\")), Context(BBox{l,t,r,b,w,h = 0.0w,0.0h, 1.0w,1.0h, 1.0w,1.0h}, nothing, nothing, nothing, nothing, List([]), List([Compose.Form{Compose.LinePrimitive}(Compose.LinePrimitive[Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(0.8516721566479479cx, 0.8033032041183529cy), (0.8939339828220179cx, 0.8939339828220179cy), (0.8033032041183529cx, 0.8516721566479479cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(0.13041619736506527cx, -0.4730309158249882cy), (0.19916925468334587cx, -0.40041537265832705cy), (0.09982498575904862cx, -0.41184849261295486cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(0.9214848407203785cx, 0.7664916524257154cy), (0.9329179606750063cx, 0.8658359213500126cy), (0.8603024175083452cx, 0.797082864031732cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(0.7560307379214091cx, 0.9657979856674331cy), (0.85cx, 1.0cy), (0.7560307379214091cx, 1.034202014332567cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(-0.5300301292149805cx, 0.48166117668538533cy), (-0.4393993505113155cx, 0.4393993505113154cy), (-0.48166117668538544cx, 0.5300301292149804cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(0.08936407125474241cx, -0.36753534766590024cy), (0.18333333333333326cx, -0.33333333333333337cy), (0.08936407125474241cx, -0.2991313190007665cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(-0.48166117668538544cx, 0.13663653745168614cy), (-0.4393993505113155cx, 0.22726731615535115cy), (-0.5300301292149805cx, 0.1850054899812812cy)])], Symbol(\"\"))]), 
List([Compose.Property{Compose.LineWidthPrimitive}(Compose.LineWidthPrimitive[Compose.LineWidthPrimitive(1.2247448713915892mm)]), Compose.Property{Compose.StrokePrimitive}(Compose.StrokePrimitive[Compose.StrokePrimitive(RGBA{Float64}(0.8274509803921568,0.8274509803921568,0.8274509803921568,1.0))])]), 0, false, false, false, false, nothing, nothing, 0.0, Symbol(\"\")), Context(BBox{l,t,r,b,w,h = 0.0w,0.0h, 1.0w,1.0h, 1.0w,1.0h}, nothing, nothing, nothing, nothing, List([]), List([]), List([Compose.Property{Compose.FontSizePrimitive}(Compose.FontSizePrimitive[Compose.FontSizePrimitive(4.0mm)]), Compose.Property{Compose.StrokePrimitive}(Compose.StrokePrimitive[Compose.StrokePrimitive(RGBA{Float64}(0.0,0.0,0.0,0.0))]), Compose.Property{Compose.FillPrimitive}(Compose.FillPrimitive[Compose.FillPrimitive(RGBA{Float64}(0.0,0.0,0.0,1.0))])]), 0, false, false, false, false, nothing, nothing, 0.0, Symbol(\"\")), Context(BBox{l,t,r,b,w,h = 0.0w,0.0h, 1.0w,1.0h, 1.0w,1.0h}, nothing, nothing, nothing, nothing, List([]), List([Compose.Form{Compose.CirclePrimitive{Tuple{Measure, Measure}, Measure}}(Compose.CirclePrimitive{Tuple{Measure, Measure}, Measure}[Compose.CirclePrimitive{Tuple{Measure, Measure}, Measure}((1.0cx, 1.0cy), 0.06w), Compose.CirclePrimitive{Tuple{Measure, Measure}, Measure}((-1.0cx, -1.0cy), 0.06w), Compose.CirclePrimitive{Tuple{Measure, Measure}, Measure}((0.33333333333333326cx, -0.33333333333333337cy), 0.06w), Compose.CirclePrimitive{Tuple{Measure, Measure}, Measure}((-1.0cx, 1.0cy), 0.06w), Compose.CirclePrimitive{Tuple{Measure, Measure}, Measure}((-0.33333333333333337cx, 0.33333333333333326cy), 0.06w), Compose.CirclePrimitive{Tuple{Measure, Measure}, Measure}((-1.0cx, -0.33333333333333337cy), 0.06w)], Symbol(\"\"))]), List([Compose.Property{Compose.LineWidthPrimitive}(Compose.LineWidthPrimitive[Compose.LineWidthPrimitive(0.0mm), Compose.LineWidthPrimitive(1.2247448713915892mm), Compose.LineWidthPrimitive(0.0mm), 
Compose.LineWidthPrimitive(1.2247448713915892mm), Compose.LineWidthPrimitive(1.2247448713915892mm), Compose.LineWidthPrimitive(1.2247448713915892mm)]), Compose.Property{Compose.StrokePrimitive}(Compose.StrokePrimitive[Compose.StrokePrimitive(RGBA{Float64}(0.0,0.0,0.0,1.0))]), Compose.Property{Compose.FillPrimitive}(Compose.FillPrimitive[Compose.FillPrimitive(RGBA{Float64}(0.25098039215686274,0.8784313725490196,0.8156862745098039,1.0)), Compose.FillPrimitive(RGBA{Float64}(0.25098039215686274,0.8784313725490196,0.8156862745098039,1.0)), Compose.FillPrimitive(RGBA{Float64}(0.25098039215686274,0.8784313725490196,0.8156862745098039,1.0)), Compose.FillPrimitive(RGBA{Float64}(0.8274509803921568,0.8274509803921568,0.8274509803921568,1.0)), Compose.FillPrimitive(RGBA{Float64}(0.25098039215686274,0.8784313725490196,0.8156862745098039,1.0)), Compose.FillPrimitive(RGBA{Float64}(0.8274509803921568,0.8274509803921568,0.8274509803921568,1.0))])]), 0, false, false, false, false, nothing, nothing, 0.0, Symbol(\"\")), Context(BBox{l,t,r,b,w,h = 0.0w,0.0h, 1.0w,1.0h, 1.0w,1.0h}, nothing, nothing, nothing, nothing, List([]), List([Compose.Form{Compose.TextPrimitive{Tuple{Measures.Length{:cx, Float64}, Measures.Length{:cy, Float64}}, Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}, Tuple{Measures.AbsoluteLength, Measures.AbsoluteLength}}}(Compose.TextPrimitive{Tuple{Measures.Length{:cx, Float64}, Measures.Length{:cy, Float64}}, Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}, Tuple{Measures.AbsoluteLength, Measures.AbsoluteLength}}[Compose.TextPrimitive{Tuple{Measures.Length{:cx, Float64}, Measures.Length{:cy, Float64}}, Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}, Tuple{Measures.AbsoluteLength, Measures.AbsoluteLength}}((1.0cx, 1.0cy), \"Spending\", Compose.HCenter(), Compose.VCenter(), Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}(0.0, (0.5w, 0.5h)), (0.0mm, 0.0mm)), 
Compose.TextPrimitive{Tuple{Measures.Length{:cx, Float64}, Measures.Length{:cy, Float64}}, Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}, Tuple{Measures.AbsoluteLength, Measures.AbsoluteLength}}((-1.0cx, -1.0cy), \"Age\", Compose.HCenter(), Compose.VCenter(), Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}(0.0, (0.5w, 0.5h)), (0.0mm, 0.0mm)), Compose.TextPrimitive{Tuple{Measures.Length{:cx, Float64}, Measures.Length{:cy, Float64}}, Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}, Tuple{Measures.AbsoluteLength, Measures.AbsoluteLength}}((0.33333333333333326cx, -0.33333333333333337cy), \"Store\\nDistance\", Compose.HCenter(), Compose.VCenter(), Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}(0.0, (0.5w, 0.5h)), (0.0mm, 0.0mm)), Compose.TextPrimitive{Tuple{Measures.Length{:cx, Float64}, Measures.Length{:cy, Float64}}, Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}, Tuple{Measures.AbsoluteLength, Measures.AbsoluteLength}}((-1.0cx, 1.0cy), \"Personal\\nIncome\", Compose.HCenter(), Compose.VCenter(), Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}(0.0, (0.5w, 0.5h)), (0.0mm, 0.0mm)), Compose.TextPrimitive{Tuple{Measures.Length{:cx, Float64}, Measures.Length{:cy, Float64}}, Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}, Tuple{Measures.AbsoluteLength, Measures.AbsoluteLength}}((-0.33333333333333337cx, 0.33333333333333326cy), \"Population\\nDensity\", Compose.HCenter(), Compose.VCenter(), Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}(0.0, (0.5w, 0.5h)), (0.0mm, 0.0mm)), Compose.TextPrimitive{Tuple{Measures.Length{:cx, Float64}, Measures.Length{:cy, Float64}}, Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}, Tuple{Measures.AbsoluteLength, Measures.AbsoluteLength}}((-1.0cx, -0.33333333333333337cy), \"Zoning\", 
Compose.HCenter(), Compose.VCenter(), Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}(0.0, (0.5w, 0.5h)), (0.0mm, 0.0mm))], Symbol(\"\"))]), List([Compose.Property{Compose.FontSizePrimitive}(Compose.FontSizePrimitive[Compose.FontSizePrimitive(3.0mm)]), Compose.Property{Compose.StrokePrimitive}(Compose.StrokePrimitive[Compose.StrokePrimitive(RGBA{Float64}(0.0,0.0,0.0,0.0))]), Compose.Property{Compose.FillPrimitive}(Compose.FillPrimitive[Compose.FillPrimitive(RGBA{Float64}(0.0,0.0,0.0,1.0))])]), 0, false, false, false, false, nothing, nothing, 0.0, Symbol(\"\"))]), List([]), List([]), 0, false, false, false, false, nothing, nothing, 0.0, Symbol(\"\"))" + ] + }, + "execution_count": 66, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nodestrokelw = [0, 2, 0, 2, 2, 2]\n", + "\n", + "\n", + "g_plot = gplot(g, locs_x, locs_y, nodelabel=nodelabel, \n", + " nodestrokec = \"black\", nodestrokelw = nodestrokelw, NODESIZE = .15, NODELABELSIZE = 3, nodefillc=nodefillc)\n", + "draw(PNG(\"causal_discovery_3.png\", 16cm, 16cm),g_plot)\n", + "g_plot" + ] + }, + { + "cell_type": "markdown", + "id": "31af0202", + "metadata": {}, + "source": [ + "First, let's define a function to simulate data from this DAG" + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "id": "b1f76f0f", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "get_data" + ] + }, + "execution_count": 17, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "using DataFrames\n", + "\"\"\"\n", + "Create a dataframe with the relationships shown in the DAG\n", + "\"\"\"\n", + "function get_data(n::Int, actual_spending_decrease::Float64 = -2.0)\n", + " age = rand(n) .* 50 \n", + " zoning = rand(n) \n", + " personal_income = rand(n) \n", + " store_distance = ( rand(n) .* .1 .+ (age ./ 100) .+ zoning ) .* 100 ## pretend that people who are old live farther from stores\n", + "\n", + " \n", + " population_density = 
(zoning .+ personal_income .+ rand(n) )\n", + " spending = ((personal_income .* 100) .+ age .+ (store_distance .* actual_spending_decrease) .+ rand(n) * 100)\n", + "## Note: the order of these needs to be the same as the order of the names in the original DAG so that the causal discovery algorithm can work right\n", + " return DataFrame(spending = spending, age = age, store_distance = store_distance,\n", + " personal_income = personal_income, \n", + " population_density = population_density, \n", + " zoning = zoning, )\n", + " \n", + "end\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "id": "d34bda8d", + "metadata": {}, + "source": [ + "If we conduct an unadjusted analysis, we will get a biased estimate of the effect of store distance on spending" + ] + }, + { + "cell_type": "code", + "execution_count": 67, + "id": "af58eb8a", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "StatsModels.TableRegressionModel{LinearModel{GLM.LmResp{Vector{Float64}}, GLM.DensePredChol{Float64, LinearAlgebra.CholeskyPivoted{Float64, Matrix{Float64}, Vector{Int64}}}}, Matrix{Float64}}\n", + "\n", + "spending ~ 1 + store_distance\n", + "\n", + "Coefficients:\n", + "────────────────────────────────────────────────────────────────────────────────\n", + " Coef. Std. 
Error t Pr(>|t|) Lower 95% Upper 95%\n", + "────────────────────────────────────────────────────────────────────────────────\n", + "(Intercept) 109.103 0.0510028 2139.15 <1e-99 109.003 109.203\n", + "store_distance -1.80103 0.000590968 -3047.59 <1e-99 -1.80219 -1.79987\n", + "────────────────────────────────────────────────────────────────────────────────" + ] + }, + "execution_count": 67, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "using GLM \n", + "df = get_data(5000000)\n", + "lm(@formula(spending ~ store_distance), df)" + ] + }, + { + "cell_type": "markdown", + "id": "df4da5e1", + "metadata": {}, + "source": [ + "As discussed above, if we condition only on age, we obtain an unbiased estimate of the effect of store distance on spending (-2.0)" + ] + }, + { + "cell_type": "code", + "execution_count": 68, + "id": "4af0cb65", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "StatsModels.TableRegressionModel{LinearModel{GLM.LmResp{Vector{Float64}}, GLM.DensePredChol{Float64, LinearAlgebra.CholeskyPivoted{Float64, Matrix{Float64}, Vector{Int64}}}}, Matrix{Float64}}\n", + "\n", + "spending ~ 1 + store_distance + age\n", + "\n", + "Coefficients:\n", + "───────────────────────────────────────────────────────────────────────────────\n", + " Coef. Std. 
Error t Pr(>|t|) Lower 95% Upper 95%\n", + "───────────────────────────────────────────────────────────────────────────────\n", + "(Intercept) 99.9799 0.0503169 1987.01 <1e-99 99.8813 100.079\n", + "store_distance -1.99915 0.000629403 -3176.26 <1e-99 -2.00038 -1.99792\n", + "age 0.99863 0.00141326 706.61 <1e-99 0.99586 1.0014\n", + "───────────────────────────────────────────────────────────────────────────────" + ] + }, + "execution_count": 68, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "lm(@formula(spending ~ store_distance + age), df)" + ] + }, + { + "cell_type": "markdown", + "id": "e716c9d9", + "metadata": {}, + "source": [ + "If we were instead to condition on both age and population density (the \"free\" variables available to us), we would induce M-bias, and the estimate would be biased again." + ] + }, + { + "cell_type": "code", + "execution_count": 69, + "id": "05872134", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "StatsModels.TableRegressionModel{LinearModel{GLM.LmResp{Vector{Float64}}, GLM.DensePredChol{Float64, LinearAlgebra.CholeskyPivoted{Float64, Matrix{Float64}, Vector{Int64}}}}, Matrix{Float64}}\n", + "\n", + "spending ~ 1 + store_distance + age + population_density\n", + "\n", + "Coefficients:\n", + "───────────────────────────────────────────────────────────────────────────────────\n", + " Coef. Std. 
Error t Pr(>|t|) Lower 95% Upper 95%\n", + "───────────────────────────────────────────────────────────────────────────────────\n", + "(Intercept) 52.4224 0.0571568 917.17 <1e-99 52.3104 52.5345\n", + "store_distance -2.49213 0.000666522 -3739.00 <1e-99 -2.49343 -2.49082\n", + "age 1.49162 0.00128333 1162.31 <1e-99 1.48911 1.49414\n", + "population_density 49.7762 0.0386726 1287.12 <1e-99 49.7004 49.852\n", + "───────────────────────────────────────────────────────────────────────────────────" + ] + }, + "execution_count": 69, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "lm(@formula(spending ~ store_distance + age + population_density), df)" + ] + }, + { + "cell_type": "markdown", + "id": "73d9e97d", + "metadata": {}, + "source": [ + "We could also adjust for all of the available variables, but notice that the standard errors are much larger here than when we adjusted for age alone. " + ] + }, + { + "cell_type": "code", + "execution_count": 70, + "id": "af3a9855", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "StatsModels.TableRegressionModel{LinearModel{GLM.LmResp{Vector{Float64}}, GLM.DensePredChol{Float64, LinearAlgebra.CholeskyPivoted{Float64, Matrix{Float64}, Vector{Int64}}}}, Matrix{Float64}}\n", + "\n", + "spending ~ 1 + store_distance + age + population_density + zoning + personal_income\n", + "\n", + "Coefficients:\n", + "────────────────────────────────────────────────────────────────────────────────────────\n", + " Coef. Std. 
Error t Pr(>|t|) Lower 95% Upper 95%\n", + "────────────────────────────────────────────────────────────────────────────────────────\n", + "(Intercept) 49.9691 0.0516369 967.70 <1e-99 49.8679 50.0703\n", + "store_distance -1.994 0.00447123 -445.96 <1e-99 -2.00276 -1.98524\n", + "age 0.993022 0.00455984 217.78 <1e-99 0.984085 1.00196\n", + "population_density 0.00501536 0.0447225 0.11 0.9107 -0.0826392 0.0926699\n", + "zoning -0.557078 0.451542 -1.23 0.2173 -1.44208 0.327928\n", + "personal_income 100.012 0.0632306 1581.70 <1e-99 99.8881 100.136\n", + "────────────────────────────────────────────────────────────────────────────────────────" + ] + }, + "execution_count": 70, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "lm(@formula(spending ~ store_distance + age + population_density + zoning + personal_income), df)" + ] + }, + { + "cell_type": "markdown", + "id": "738b07eb", + "metadata": {}, + "source": [ + "Now that we understand what's going on, let's see if causal discovery is able to recover our DAG from the simulated data. In this post, I'll use the `CausalInference` Julia package to attempt to recover causal relationships." + ] + }, + { + "cell_type": "markdown", + "id": "01723019", + "metadata": {}, + "source": [ + "### PC vs FCI algorithms\n", + "\n", + "The `CausalInference` Julia package uses two algorithms for causal discovery. The PC algorithm (named after Peter Spirtes and Clark Glymour) can be used in situations where causal sufficiency can be assumed. That is, the PC algorithm should only be used when a researcher is sure that there are no unmeasured common causes of the variables included. 
In the graph returned by the PC algorithm, a bidirectional edge between two variables indicates that the variables are related but that the direction of the relationship could not be determined from the data, a unidirectional edge indicates that one variable is a cause of the other, and the absence of an edge indicates that no direct relationship was found between the two variables.\n", + "\n", + "The FCI (Fast Causal Inference) algorithm is intended for use in cases where there may be unmeasured confounders between variables. There are many types of edges in graphs produced by FCI, and the documentation for [this Python package](https://causal-learn.readthedocs.io/en/latest/search_methods_index/Constraint-based%20causal%20discovery%20methods/FCI.html) is a good resource for understanding what they mean. " + ] + }, + { + "cell_type": "markdown", + "id": "3bcc825f", + "metadata": {}, + "source": [ + "In both algorithms, one may set the cutoff `p`, the significance level used in the tests for dependence between nodes. If `p` is very low, the resulting graph may have too few edges." 
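, + "\n",
+ "\n",
+ "To make the role of the cutoff concrete, here is a minimal sketch of a Fisher z-test for zero partial correlation, the general kind of Gaussian conditional independence test that `gausscitest` refers to. This is not the package's own code: the helpers `partial_cor` and `fisher_z_pvalue` are hypothetical names written for illustration, and the sketch assumes `Distributions.jl` is installed.\n",
+ "\n",
+ "```julia\n",
+ "using Statistics, LinearAlgebra, Random\n",
+ "using Distributions  # assumption: installed; provides Normal and ccdf\n",
+ "\n",
+ "# Partial correlation of variables i and j given the variables in s,\n",
+ "# computed from a correlation matrix C (hypothetical helper).\n",
+ "function partial_cor(C, i, j, s)\n",
+ "    idx = [i; j; s]\n",
+ "    P = inv(C[idx, idx])  # precision matrix of the selected variables\n",
+ "    return -P[1, 2] / sqrt(P[1, 1] * P[2, 2])\n",
+ "end\n",
+ "\n",
+ "# Two-sided p-value of Fisher's z-test for a partial correlation r,\n",
+ "# with n observations and k conditioning variables (hypothetical helper).\n",
+ "function fisher_z_pvalue(r, n, k)\n",
+ "    z = atanh(r) * sqrt(n - k - 3)\n",
+ "    return 2 * ccdf(Normal(), abs(z))\n",
+ "end\n",
+ "\n",
+ "Random.seed!(1)\n",
+ "n = 300\n",
+ "x = randn(n)\n",
+ "y = 2 .* x .+ randn(n)  # y depends on x\n",
+ "z = y .+ randn(n)       # z depends on x only through y\n",
+ "C = cor(hcat(x, y, z))\n",
+ "\n",
+ "# x and z are marginally dependent, so this p-value should be very small.\n",
+ "p_marginal = fisher_z_pvalue(partial_cor(C, 1, 3, Int[]), n, 0)\n",
+ "# Given y, x and z are independent, so this p-value is typically not small.\n",
+ "p_given_y = fisher_z_pvalue(partial_cor(C, 1, 3, [2]), n, 1)\n",
+ "@show p_marginal p_given_y\n",
+ "```\n",
+ "\n",
+ "A constraint-based search removes the edge between two variables when some test of this kind returns a p-value at or above the cutoff, so a very small cutoff removes edges easily and a large cutoff keeps most of them."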
+ ] + },
+ { + "cell_type": "code", + "execution_count": 22, + "id": "5093c6c3", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "[Figure: graph estimated by the PC algorithm with a cutoff of 1e-14 on 300 simulated observations (saved as causal_discovery_4.png). Nodes: Spending, Age, Store Distance, Personal Income, Population Density, Zoning. Edges shown: Age -> Store Distance, Zoning -> Store Distance, and a bidirectional edge between Personal Income and Population Density; Spending has no edges.]" + ] + }, + "execution_count": 22, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "using CausalInference\n", + "using TikzGraphs\n", + "df = get_data(300)\n", + "est_g = pcalg(df, 0.00000000000001, gausscitest)\n", + "nodestrokelw = [0, 0, 0, 0, 0, 0]\n", + "g_plot = gplot(est_g, locs_x, locs_y, nodelabel=nodelabel, \n", + " nodestrokec = \"black\", nodestrokelw = nodestrokelw, NODESIZE = .15, NODELABELSIZE = 3, nodefillc=nodefillc)\n", + "draw(PNG(\"causal_discovery_4.png\", 16cm, 16cm),g_plot)\n", + "g_plot" + ] + },
+ { + "cell_type": "markdown", + "id": "fe5c0152", + "metadata": {}, + "source": [ + "If `p` is too high, the resulting graph will probably have too many edges:" + ] + },
+ { + "cell_type": "code", + "execution_count": 23, + "id": "bba03d44", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "[Figure: graph estimated by the PC algorithm with a cutoff of 0.9 on the same 300 observations (saved as causal_discovery_5.png). Nodes: Spending, Age, Store Distance, Personal Income, Population Density, Zoning. Ten directed edges shown: Age -> Spending, Age -> Store Distance, Store Distance -> Spending, Personal Income -> Spending, Personal Income -> Population Density, Personal Income -> Zoning, Population Density -> Store Distance, Population Density -> Zoning, Zoning -> Spending, Zoning -> Store Distance.]" + ] + }, + "execution_count": 23, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "est_g = pcalg(df, .9, gausscitest)\n", + "nodestrokelw = [0, 0, 0, 0, 0, 0]\n", + "g_plot = gplot(est_g, locs_x, locs_y, nodelabel=nodelabel, \n", + " nodestrokec = \"black\", nodestrokelw = nodestrokelw, NODESIZE = .15, NODELABELSIZE = 3, nodefillc=nodefillc)\n", + "draw(PNG(\"causal_discovery_5.png\", 16cm, 16cm),g_plot)\n", + "g_plot" + ] + },
+ { + "cell_type": "markdown", + "id": "d861e5c0", + "metadata": {}, + "source": [ + "The 
[package documentation](https://mschauer.github.io/CausalInference.jl/latest/examples/pc_basic_examples/) uses a parameter value of 0.01, which we will use for the rest of the post. In practice, `p` should be selected based on the size of the data (because tests done on a larger dataset will be more powerful even at the same cutoff level) and a range of parameter values should be used for sensitivity analyses. Note that the DAG we recover using 0.01 as our parameter value is the same as our original DAG" + ] + }, + { + "cell_type": "code", + "execution_count": 72, + "id": "04ad22e1", + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "", + "image/svg+xml": [ + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + " \n", + " \n", + " Spending\n", + " \n", + " \n", + " \n", + " \n", + " Age\n", + " \n", + " \n", + " \n", + " \n", + " StoreDistance\n", + " \n", + " \n", + " \n", + " \n", + " PersonalIncome\n", + " \n", + " \n", + " \n", + " \n", + " PopulationDensity\n", + " \n", + " \n", + " \n", + " \n", + " Zoning\n", + " \n", + " \n", + "\n", + "\n" + ], + "text/html": [ + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + " \n", 
+ " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + " \n", + " \n", + " Spending\n", + " \n", + " \n", + " \n", + " \n", + " Age\n", + " \n", + " \n", + " \n", + " \n", + " StoreDistance\n", + " \n", + " \n", + " \n", + " \n", + " PersonalIncome\n", + " \n", + " \n", + " \n", + " \n", + " PopulationDensity\n", + " \n", + " \n", + " \n", + " \n", + " Zoning\n", + " \n", + " \n", + "\n", + "\n", + "\n" + ], + "text/plain": [ + "Context(BBox{l,t,r,b,w,h = 0.0w,0.0h, 1.0w,1.0h, 1.0w,1.0h}, UnitBox{Float64, Float64, Float64, Float64}(-1.2, -1.2, 2.4, 2.4, 0.0mm, 0.0mm, 0.0mm, 0.0mm), nothing, nothing, nothing, List([Context(BBox{l,t,r,b,w,h = 0.0w,0.0h, 1.0w,1.0h, 1.0w,1.0h}, nothing, nothing, nothing, nothing, List([]), List([Compose.Form{Compose.LinePrimitive}(Compose.LinePrimitive[Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(-0.8939339828220179cx, -0.8939339828220179cy), (0.8939339828220179cx, 0.8939339828220179cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(-0.8658359213500126cx, -0.9329179606750063cy), (0.19916925468334587cx, -0.40041537265832705cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(0.40041537265832694cx, -0.199169254683346cy), (0.9329179606750063cx, 0.8658359213500126cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(-0.85cx, 1.0cy), (0.85cx, 1.0cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(-0.8939339828220179cx, 0.8939339828220179cy), (-0.4393993505113155cx, 0.4393993505113154cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(-0.85cx, 
-0.33333333333333337cy), (0.18333333333333326cx, -0.33333333333333337cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(-0.8939339828220179cx, -0.22726731615535126cy), (-0.4393993505113155cx, 0.22726731615535115cy)])], Symbol(\"\"))]), List([Compose.Property{Compose.LineWidthPrimitive}(Compose.LineWidthPrimitive[Compose.LineWidthPrimitive(1.2247448713915892mm)]), Compose.Property{Compose.FillPrimitive}(Compose.FillPrimitive[Compose.FillPrimitive(RGBA{Float64}(0.0,0.0,0.0,0.0))]), Compose.Property{Compose.StrokePrimitive}(Compose.StrokePrimitive[Compose.StrokePrimitive(RGBA{Float64}(0.8274509803921568,0.8274509803921568,0.8274509803921568,1.0))])]), 0, false, false, false, false, nothing, nothing, 0.0, Symbol(\"\")), Context(BBox{l,t,r,b,w,h = 0.0w,0.0h, 1.0w,1.0h, 1.0w,1.0h}, nothing, nothing, nothing, nothing, List([]), List([Compose.Form{Compose.LinePrimitive}(Compose.LinePrimitive[Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(0.8516721566479479cx, 0.8033032041183529cy), (0.8939339828220179cx, 0.8939339828220179cy), (0.8033032041183529cx, 0.8516721566479479cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(0.13041619736506527cx, -0.4730309158249882cy), (0.19916925468334587cx, -0.40041537265832705cy), (0.09982498575904862cx, -0.41184849261295486cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(0.9214848407203785cx, 0.7664916524257154cy), (0.9329179606750063cx, 0.8658359213500126cy), (0.8603024175083452cx, 0.797082864031732cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(0.7560307379214091cx, 0.9657979856674331cy), (0.85cx, 1.0cy), (0.7560307379214091cx, 1.034202014332567cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(-0.5300301292149805cx, 0.48166117668538533cy), (-0.4393993505113155cx, 0.4393993505113154cy), (-0.48166117668538544cx, 0.5300301292149804cy)]), Compose.LinePrimitive{Tuple{Measure, 
Measure}}(Tuple{Measure, Measure}[(0.08936407125474241cx, -0.36753534766590024cy), (0.18333333333333326cx, -0.33333333333333337cy), (0.08936407125474241cx, -0.2991313190007665cy)]), Compose.LinePrimitive{Tuple{Measure, Measure}}(Tuple{Measure, Measure}[(-0.48166117668538544cx, 0.13663653745168614cy), (-0.4393993505113155cx, 0.22726731615535115cy), (-0.5300301292149805cx, 0.1850054899812812cy)])], Symbol(\"\"))]), List([Compose.Property{Compose.LineWidthPrimitive}(Compose.LineWidthPrimitive[Compose.LineWidthPrimitive(1.2247448713915892mm)]), Compose.Property{Compose.StrokePrimitive}(Compose.StrokePrimitive[Compose.StrokePrimitive(RGBA{Float64}(0.8274509803921568,0.8274509803921568,0.8274509803921568,1.0))])]), 0, false, false, false, false, nothing, nothing, 0.0, Symbol(\"\")), Context(BBox{l,t,r,b,w,h = 0.0w,0.0h, 1.0w,1.0h, 1.0w,1.0h}, nothing, nothing, nothing, nothing, List([]), List([]), List([Compose.Property{Compose.FontSizePrimitive}(Compose.FontSizePrimitive[Compose.FontSizePrimitive(4.0mm)]), Compose.Property{Compose.StrokePrimitive}(Compose.StrokePrimitive[Compose.StrokePrimitive(RGBA{Float64}(0.0,0.0,0.0,0.0))]), Compose.Property{Compose.FillPrimitive}(Compose.FillPrimitive[Compose.FillPrimitive(RGBA{Float64}(0.0,0.0,0.0,1.0))])]), 0, false, false, false, false, nothing, nothing, 0.0, Symbol(\"\")), Context(BBox{l,t,r,b,w,h = 0.0w,0.0h, 1.0w,1.0h, 1.0w,1.0h}, nothing, nothing, nothing, nothing, List([]), List([Compose.Form{Compose.CirclePrimitive{Tuple{Measure, Measure}, Measure}}(Compose.CirclePrimitive{Tuple{Measure, Measure}, Measure}[Compose.CirclePrimitive{Tuple{Measure, Measure}, Measure}((1.0cx, 1.0cy), 0.06w), Compose.CirclePrimitive{Tuple{Measure, Measure}, Measure}((-1.0cx, -1.0cy), 0.06w), Compose.CirclePrimitive{Tuple{Measure, Measure}, Measure}((0.33333333333333326cx, -0.33333333333333337cy), 0.06w), Compose.CirclePrimitive{Tuple{Measure, Measure}, Measure}((-1.0cx, 1.0cy), 0.06w), Compose.CirclePrimitive{Tuple{Measure, Measure}, 
Measure}((-0.33333333333333337cx, 0.33333333333333326cy), 0.06w), Compose.CirclePrimitive{Tuple{Measure, Measure}, Measure}((-1.0cx, -0.33333333333333337cy), 0.06w)], Symbol(\"\"))]), List([Compose.Property{Compose.LineWidthPrimitive}(Compose.LineWidthPrimitive[Compose.LineWidthPrimitive(0.0mm), Compose.LineWidthPrimitive(0.0mm), Compose.LineWidthPrimitive(0.0mm), Compose.LineWidthPrimitive(0.0mm), Compose.LineWidthPrimitive(0.0mm), Compose.LineWidthPrimitive(0.0mm)]), Compose.Property{Compose.StrokePrimitive}(Compose.StrokePrimitive[Compose.StrokePrimitive(RGBA{Float64}(0.0,0.0,0.0,1.0))]), Compose.Property{Compose.FillPrimitive}(Compose.FillPrimitive[Compose.FillPrimitive(RGBA{Float64}(0.25098039215686274,0.8784313725490196,0.8156862745098039,1.0)), Compose.FillPrimitive(RGBA{Float64}(0.25098039215686274,0.8784313725490196,0.8156862745098039,1.0)), Compose.FillPrimitive(RGBA{Float64}(0.25098039215686274,0.8784313725490196,0.8156862745098039,1.0)), Compose.FillPrimitive(RGBA{Float64}(0.8274509803921568,0.8274509803921568,0.8274509803921568,1.0)), Compose.FillPrimitive(RGBA{Float64}(0.25098039215686274,0.8784313725490196,0.8156862745098039,1.0)), Compose.FillPrimitive(RGBA{Float64}(0.8274509803921568,0.8274509803921568,0.8274509803921568,1.0))])]), 0, false, false, false, false, nothing, nothing, 0.0, Symbol(\"\")), Context(BBox{l,t,r,b,w,h = 0.0w,0.0h, 1.0w,1.0h, 1.0w,1.0h}, nothing, nothing, nothing, nothing, List([]), List([Compose.Form{Compose.TextPrimitive{Tuple{Measures.Length{:cx, Float64}, Measures.Length{:cy, Float64}}, Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}, Tuple{Measures.AbsoluteLength, Measures.AbsoluteLength}}}(Compose.TextPrimitive{Tuple{Measures.Length{:cx, Float64}, Measures.Length{:cy, Float64}}, Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}, Tuple{Measures.AbsoluteLength, Measures.AbsoluteLength}}[Compose.TextPrimitive{Tuple{Measures.Length{:cx, Float64}, Measures.Length{:cy, 
Float64}}, Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}, Tuple{Measures.AbsoluteLength, Measures.AbsoluteLength}}((1.0cx, 1.0cy), \"Spending\", Compose.HCenter(), Compose.VCenter(), Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}(0.0, (0.5w, 0.5h)), (0.0mm, 0.0mm)), Compose.TextPrimitive{Tuple{Measures.Length{:cx, Float64}, Measures.Length{:cy, Float64}}, Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}, Tuple{Measures.AbsoluteLength, Measures.AbsoluteLength}}((-1.0cx, -1.0cy), \"Age\", Compose.HCenter(), Compose.VCenter(), Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}(0.0, (0.5w, 0.5h)), (0.0mm, 0.0mm)), Compose.TextPrimitive{Tuple{Measures.Length{:cx, Float64}, Measures.Length{:cy, Float64}}, Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}, Tuple{Measures.AbsoluteLength, Measures.AbsoluteLength}}((0.33333333333333326cx, -0.33333333333333337cy), \"Store\\nDistance\", Compose.HCenter(), Compose.VCenter(), Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}(0.0, (0.5w, 0.5h)), (0.0mm, 0.0mm)), Compose.TextPrimitive{Tuple{Measures.Length{:cx, Float64}, Measures.Length{:cy, Float64}}, Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}, Tuple{Measures.AbsoluteLength, Measures.AbsoluteLength}}((-1.0cx, 1.0cy), \"Personal\\nIncome\", Compose.HCenter(), Compose.VCenter(), Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}(0.0, (0.5w, 0.5h)), (0.0mm, 0.0mm)), Compose.TextPrimitive{Tuple{Measures.Length{:cx, Float64}, Measures.Length{:cy, Float64}}, Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}, Tuple{Measures.AbsoluteLength, Measures.AbsoluteLength}}((-0.33333333333333337cx, 0.33333333333333326cy), \"Population\\nDensity\", Compose.HCenter(), Compose.VCenter(), Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, 
Float64}}}(0.0, (0.5w, 0.5h)), (0.0mm, 0.0mm)), Compose.TextPrimitive{Tuple{Measures.Length{:cx, Float64}, Measures.Length{:cy, Float64}}, Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}, Tuple{Measures.AbsoluteLength, Measures.AbsoluteLength}}((-1.0cx, -0.33333333333333337cy), \"Zoning\", Compose.HCenter(), Compose.VCenter(), Rotation{Tuple{Measures.Length{:w, Float64}, Measures.Length{:h, Float64}}}(0.0, (0.5w, 0.5h)), (0.0mm, 0.0mm))], Symbol(\"\"))]), List([Compose.Property{Compose.FontSizePrimitive}(Compose.FontSizePrimitive[Compose.FontSizePrimitive(3.0mm)]), Compose.Property{Compose.StrokePrimitive}(Compose.StrokePrimitive[Compose.StrokePrimitive(RGBA{Float64}(0.0,0.0,0.0,0.0))]), Compose.Property{Compose.FillPrimitive}(Compose.FillPrimitive[Compose.FillPrimitive(RGBA{Float64}(0.0,0.0,0.0,1.0))])]), 0, false, false, false, false, nothing, nothing, 0.0, Symbol(\"\"))]), List([]), List([]), 0, false, false, false, false, nothing, nothing, 0.0, Symbol(\"\"))" + ] + }, + "execution_count": 72, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "est_g = pcalg(df, 0.01, gausscitest)\n", + "nodestrokelw = [0, 0, 0, 0, 0, 0]\n", + "g_plot = gplot(est_g, locs_x, locs_y, nodelabel=nodelabel, \n", + " nodestrokec = \"black\", nodestrokelw = nodestrokelw, NODESIZE = .15, NODELABELSIZE = 3, nodefillc=nodefillc)\n", + "draw(PNG(\"causal_discovery_6.png\", 16cm, 16cm),g_plot)\n", + "g_plot" + ] + }, + { + "cell_type": "markdown", + "id": "eb75b158", + "metadata": {}, + "source": [ + "Now, let's look at how the FCI algorithm handles these data. (The nodes are oriented differently because we are using a different graphing library here to accommodate the different types of edges.) Note that I increased the sample size a lot, and it still ended up with one of the arrows reversed. 
I believe that the FCI algorithm would do better if we used longitudinal data, and gave the algorithm more information on which causal relationships are impossible." + ] + }, + { + "cell_type": "code", + "execution_count": 25, + "id": "1f736e37", + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "\u001b[33m\u001b[1m┌ \u001b[22m\u001b[39m\u001b[33m\u001b[1mWarning: \u001b[22m\u001b[39mtest.pdf already exists, overwriting!\n", + "\u001b[33m\u001b[1m└ \u001b[22m\u001b[39m\u001b[90m@ TikzPictures ~/.julia/packages/TikzPictures/4zjh8/src/TikzPictures.jl:333\u001b[39m\n" + ] + }, + { + "data": { + "image/svg+xml": [ + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + " \n", + "\n", + "\n", + " \n", + "\n", + "\n", + " \n", + "\n", + "\n", + " \n", + "\n", + "\n", + " \n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + " \n", + " 
\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + "\n", + "\n" + ], + "text/plain": [ + "TikzPicture(\"\\\\graph [layered layout, , ] {\\n1/\\\"spending\\\" [draw, rounded corners, fill=blue!10],\\n2/\\\"age\\\" [draw, rounded corners, fill=blue!10],\\n3/\\\"store\\\\_distance\\\" [draw, rounded corners, fill=blue!10],\\n4/\\\"personal\\\\_income\\\" [draw, rounded corners, fill=blue!10],\\n5/\\\"population\\\\_density\\\" [draw, rounded corners, fill=blue!10],\\n6/\\\"zoning\\\" [draw, rounded corners, fill=blue!10],\\n;\\n1 -> [,] 2;\\n1 -> [,<-,] 3;\\n1 -> [,<-o,] 4;\\n2 -> [,] 3;\\n3 -> [,<-o,] 6;\\n4 -> [,o->,] 5;\\n5 -> [,<-o,] 6;\\n};\\n\", \"scale=2\", \"\\\\usepackage{fontspec}\\n\\\\setmainfont{Latin Modern Math}\\n\\\\usetikzlibrary{arrows}\\n\\\\usetikzlibrary{graphs}\\n\\\\usetikzlibrary{graphdrawing}\\n\\n% from: https://tex.stackexchange.com/questions/453132/fresh-install-of-tl2018-no-tikz-graph-drawing-libraries-found\\n\\\\usepackage{luacode}\\n\\\\begin{luacode*}\\n\\tfunction pgf_lookup_and_require(name)\\n\\tlocal sep = package.config:sub(1,1)\\n\\tlocal function lookup(name)\\n\\tlocal sub = name:gsub('%.',sep) \\n\\tif kpse.find_file(sub, 'lua') then\\n\\trequire(name)\\n\\telseif kpse.find_file(sub, 'clua') then\\n\\tcollectgarbage('stop') \\n\\trequire(name)\\n\\tcollectgarbage('restart')\\n\\telse\\n\\treturn false\\n\\tend\\n\\treturn true\\n\\tend\\n\\treturn\\n\\tlookup('pgf.gd.' .. name .. '.library') or\\n\\tlookup('pgf.gd.' .. name) or\\n\\tlookup(name .. 
'.library') or\\n\\tlookup(name) \\n\\tend\\n\\\\end{luacode*}\\n\\n\\\\usegdlibrary{layered}\", \"tikzpicture\", \"\", \"\", true, true)" + ] + }, + "execution_count": 25, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "using TikzPictures\n", + "df = get_data(1000000)\n", + "est_g = fcialg(df, 0.1, gausscitest)\n", + "p = plot_fci_graph_tikz(est_g, [col for col in names(df)])\n", + "TikzPictures.save(TikzPictures.PDF(\"test\"), p)\n", + "p" + ] + }, + { + "cell_type": "code", + "execution_count": 40, + "id": "a842e08a", + "metadata": {}, + "outputs": [ + { + "data": { + "image/svg+xml": [ + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + " \n", + "\n", + "\n", + " \n", + "\n", + "\n", + " \n", + "\n", + "\n", + " \n", + "\n", + "\n", + " \n", + "\n", + "\n", + " \n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", 
+ " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "\n", + "\n", + "\n", + "\n" + ], + "text/plain": [ + "TikzPicture(\"\\\\graph [layered layout, , ] {\\n1/\\\"curly\\\\{braces\\\\}\\\" [draw, rounded corners, fill=blue!10],\\n2/\\\"dollar\\\\\\$sign\\\" [draw, rounded corners, fill=blue!10],\\n3/\\\"under\\\\_score\\\\^{}caret\\\" [draw, rounded corners, fill=blue!10],\\n4/\\\"amper\\\\&sand\\\" [draw, rounded corners, fill=blue!10],\\n5/\\\"hash\\\\#tag\\\" [draw, rounded corners, fill=blue!10],\\n6/\\\"per\\\\%cent\\\" [draw, rounded corners, fill=blue!10],\\n;\\n1 -> [,] 2;\\n1 -> [,<-,] 3;\\n1 -> [,<-o,] 4;\\n2 -> [,] 3;\\n3 -> [,<-o,] 6;\\n4 -> [,o->,] 5;\\n5 -> [,<-o,] 6;\\n};\\n\", \"scale=2\", \"\\\\usepackage{fontspec}\\n\\\\setmainfont{Latin Modern Math}\\n\\\\usetikzlibrary{arrows}\\n\\\\usetikzlibrary{graphs}\\n\\\\usetikzlibrary{graphdrawing}\\n\\n% from: https://tex.stackexchange.com/questions/453132/fresh-install-of-tl2018-no-tikz-graph-drawing-libraries-found\\n\\\\usepackage{luacode}\\n\\\\begin{luacode*}\\n\\tfunction pgf_lookup_and_require(name)\\n\\tlocal sep = package.config:sub(1,1)\\n\\tlocal function lookup(name)\\n\\tlocal sub = name:gsub('%.',sep) \\n\\tif kpse.find_file(sub, 'lua') then\\n\\trequire(name)\\n\\telseif kpse.find_file(sub, 'clua') then\\n\\tcollectgarbage('stop') \\n\\trequire(name)\\n\\tcollectgarbage('restart')\\n\\telse\\n\\treturn false\\n\\tend\\n\\treturn true\\n\\tend\\n\\treturn\\n\\tlookup('pgf.gd.' .. name .. '.library') or\\n\\tlookup('pgf.gd.' .. name) or\\n\\tlookup(name .. 
'.library') or\\n\\tlookup(name) \\n\\tend\\n\\\\end{luacode*}\\n\\n\\\\usegdlibrary{layered}\", \"tikzpicture\", \"\", \"\", true, true)" + ] + }, + "execution_count": 40, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "## for testing \n", + "names_to_use = [\"curly{braces}\"\n", + " \"dollar\\$sign\"\n", + " \"under_score^caret\"\n", + " \"amper&sand\"\n", + " \"hash#tag\"\n", + " \"per%cent\"]\n", + "p = plot_fci_graph_tikz(est_g, names_to_use)\n", + "p" + ] + }, + { + "cell_type": "markdown", + "id": "b335803e", + "metadata": {}, + "source": [ + "Now, we will define some functions to conduct our simulations." + ] + }, + { + "cell_type": "code", + "execution_count": 29, + "id": "bd4f5d09", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "simulate_discovery_pc" + ] + }, + "execution_count": 29, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "using CausalInference\n", + "\n", + "\"\"\"Create data and determine whether the causal discovery algorithm worked correctly\"\"\"\n", + "function simulate_discovery_pc(n::Int)\n", + " df = get_data(n)\n", + " est_g = pcalg(df, 0.01, gausscitest)\n", + " return est_g == g\n", + " \n", + " \n", + "end" + ] + }, + { + "cell_type": "code", + "execution_count": 46, + "id": "362a3353", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "get_confint_width_control_all" + ] + }, + "execution_count": 46, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "using GLM, Distributions\n", + "\"\"\"Get the confidence interval of the desired parameter when controlling for all covariates\"\"\"\n", + "function get_confint_width_control_all(n::Int)\n", + " df = get_data(n)\n", + " mod = lm(@formula(spending ~ store_distance + age + population_density + zoning + personal_income), df)\n", + " return stderror(mod)[2] * 2* quantile(Normal(0.0, 1.0),.975)\n", + " \n", + "end" + ] + }, + { + "cell_type": "code", + 
"execution_count": 47, + "id": "2de31d53", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "get_confint_width_control_age" + ] + }, + "execution_count": 47, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "\"\"\"Get the width of the 95% confidence interval for the store_distance coefficient when controlling for just age\"\"\"\n", + "function get_confint_width_control_age(n::Int)\n", + " df = get_data(n)\n", + " mod = lm(@formula(spending ~ store_distance + age), df)\n", + " return stderror(mod)[2] * 2 * quantile(Normal(0.0, 1.0), 0.975)\n", + " \n", + "end" + ] + }, + { + "cell_type": "markdown", + "id": "40534a58", + "metadata": {}, + "source": [ + "Next, we'll use simulation to see how causal discovery and causal inference perform across a range of sample sizes. Here, we'll pretend that there are 100,000 possible customers in our store's dataset, so we get information on age and population density for \"free\" for all of those customers. However, we may choose to purchase additional information for a smaller number of customers." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 48, + "id": "6f3ad753", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "9-element Vector{Int64}:\n", + " 200\n", + " 500\n", + " 1000\n", + " 2000\n", + " 3000\n", + " 5000\n", + " 8000\n", + " 10000\n", + " 100000" + ] + }, + "execution_count": 48, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "sample_sizes = [200,500,1000, 2000, 3000, 5000, 8000, 10000, 100000]" + ] + }, + { + "cell_type": "code", + "execution_count": 49, + "id": "15639f14", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "9-element Vector{Float64}:\n", + " 2.8115630705511534\n", + " 1.7676608862714198\n", + " 1.244541315132393\n", + " 0.8793204902309119\n", + " 0.7164897592324871\n", + " 0.5547073387754579\n", + " 0.43917852198231894\n", + " 0.39189421712184797\n", + " 0.12395185564974553" + ] + }, + "execution_count": 49, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "widths_control_all = [ mean([get_confint_width_control_all(sample_size) for i in 1:50]) \n", + " for sample_size in sample_sizes]" + ] + }, + { + "cell_type": "code", + "execution_count": 55, + "id": "d316d9e3", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "9-element Vector{Float64}:\n", + " 0.39066356290977355\n", + " 0.24687584500294046\n", + " 0.17445306399686047\n", + " 0.12343422155246556\n", + " 0.10035008540387573\n", + " 0.0782271754568998\n", + " 0.06165551987110357\n", + " 0.05514468317732508\n", + " 0.017436101603674883" + ] + }, + "execution_count": 55, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "widths_control_age = [ mean([get_confint_width_control_age(sample_size) for i in 1:50]) \n", + " for sample_size in sample_sizes]" + ] + }, + { + "cell_type": "code", + "execution_count": 51, + "id": "7c720344", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "9-element Vector{Float64}:\n", 
+ " 0.02\n", + " 0.08\n", + " 0.26\n", + " 0.66\n", + " 0.9\n", + " 0.98\n", + " 1.0\n", + " 1.0\n", + " 0.98" + ] + }, + "execution_count": 51, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "prop_correct_dag = [ mean([simulate_discovery_pc(sample_size) for i in 1:50]) \n", + " for sample_size in sample_sizes]" + ] + }, + { + "cell_type": "code", + "execution_count": 52, + "id": "a0f86f08", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "0.017441668616459544" + ] + }, + "execution_count": 52, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "estimate_from_age_control = mean([get_confint_width_control_age(sample_sizes[length(sample_sizes)]) \n", + " for i in 1:500]) ## Assume that we can get this data for free\n" + ] + }, + { + "cell_type": "markdown", + "id": "73d73110", + "metadata": {}, + "source": [ + "Finally, let's look at how the efficiency of estimation changes as we increase our sample size. \n", + "\n", + "In this hypothetical scenario, there are three possible strategies we could use to estimate the effect of store distance on spending. \n", + "1. Sample a given amount of data (e.g. for 3,000 customers), and use all variables (both those for which we have \"free\" data and those for which we have \"expensive\" data) to estimate the effect \n", + "2. Sample a given amount of data, use causal discovery to identify age as the variable that needs to be controlled, and estimate the effect controlling for only age on the smaller subsample of data \n", + "3. 
Sample a given amount of data, use causal discovery to identify age as the variable that needs to be controlled, and estimate the effect controlling for only age on the full population that we have \"free\" data for (100,000 customers) \n", + "\n", + "Note that both controlling for age and controlling for all possible variables yield unbiased estimates of our causal parameter.\n", + "\n", + "In our graph, we can see that the confidence interval width when controlling for only age (strategies 2 and 3) is always smaller than the interval width when controlling for all variables. However, we can only realistically use those strategies if we have enough data to correctly identify the causal structure of the data. So, strategies 2 and 3 are better than strategy 1 as long as we have at least 5,000 customers. \n", + "\n", + "\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": 62, + "id": "5f234539", + "metadata": {}, + "outputs": [ + { + "data": { + "image/svg+xml": [ + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + "\n", + "\n", + "\n", + " \n", + " \n", + " \n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n" + ] + }, + "execution_count": 62, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "using StatsPlots\n", + "p = StatsPlots.plot(xlabel = \"Sample size\", ylabel = \"Confidence Interval Width\")\n", + "p_twin = StatsPlots.twinx(p)\n", + "\n", + "StatsPlots.plot!(p, sample_sizes, widths_control_all,xaxis=:log, \n", + " legend = :right, linecolor = :steelblue, label = \"Controlling for all\")\n", + "StatsPlots.plot!(p, sample_sizes, widths_control_age,xaxis=:log, \n", + " legend = :right, linecolor = :turquoise, label = \"Controlling for Age\")\n", + "\n", + 
"StatsPlots.plot!(p_twin, sample_sizes, [i * 100 for i in prop_correct_dag], xaxis=:log, legend = :topright, linecolor = :green, \n", + "label = \"% correct DAGs\", ylabel = \"Percentage of correct DAGs\")\n" + ] + }, + { + "cell_type": "markdown", + "id": "417d23dc", + "metadata": {}, + "source": [ + "In this example, I used `gausscitest`, which assumes that relationships between variables are linear. This assumption may not hold in many situations. \n", + "\n", + "In my previous posts, I have discussed how [double machine learning](https://medium.com/@clementzach_38631/double-machine-learning-for-causal-inference-from-a-partially-linear-model-ada4c39914e3) may be used to recover estimates of causal parameters when relationships are not linear, so causal inference can still be done in those situations as long as you have a valid causal discovery algorithm. \n", + "\n", + "If you find yourself in a situation where the assumption of linear relationships between covariates is untenable, you can still do causal discovery! If you're using the Julia package I used in this post, you just need to use `cmitest` rather than `gausscitest` as demonstrated in [this example](https://mschauer.github.io/CausalInference.jl/latest/examples/pc_cmi_examples/). This will take much longer because the algorithm for assessing independence without assuming linearity is much more complicated than the algorithm which assumes linearity, but it lets you do causal discovery in situations where you do not think that linear relationships between covariates will hold. " + ] + }, + { + "cell_type": "markdown", + "id": "db0e7a9e", + "metadata": {}, + "source": [ + "And, I've said this before, but I want to reiterate that you should be cautious about using causal discovery methods in cases when there are [unmeasured common causes of variables in your dataset](https://arxiv.org/pdf/2106.02234.pdf) or in cases of selection bias, regardless of the algorithm you use. 
In reality, it's probably next to impossible to rule out these possibilities, but causal discovery is a fun thing to learn about anyway!" + ] + }, + { + "cell_type": "markdown", + "id": "274eb527", + "metadata": {}, + "source": [ + "If you're interested in learning more, here are a couple of resources: \n", + "\n", + "[Causation, Prediction, and Search book (the definitive resource)](https://philarchive.org/archive/SPICPA-2)\n", + "\n", + "[A chapter from Cosma Shalizi](https://www.stat.cmu.edu/~cshalizi/uADA/12/lectures/ch25.pdf)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b5ff5a24", + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Julia 1.8.3", + "language": "julia", + "name": "julia-1.8" + }, + "language_info": { + "file_extension": ".jl", + "mimetype": "application/julia", + "name": "julia", + "version": "1.8.3" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +}