04_RunChart.Rmd

---
title: "05_RunChart"
output: pdf_document
---

# Run Chart {#run_chart}

Run charts are designed to show a metric of interest over time. They do not rely on parametric statistical theory, so they cannot distinguish between common cause and special cause variation as formally as control charts can. Control charts can be more powerful when properly constructed, but run charts are easier to implement where statistical knowledge is limited, and they still provide practical monitoring and useful insights into the process.

Run charts typically use the median as the reference line. They help you determine whether there are unusual runs in the data, which suggest non-random variation. They tend to be better than control charts at detecting moderate (~$\pm$ 1.5$\sigma$) changes in a process when the control chart relies on its typical $\pm$ 3$\sigma$ limits rule alone.

<br>
\vspace{12pt}

*In other words, a run chart can be more useful than a control chart when trying to detect improvement while that improvement work is still going on.*

<br>
\vspace{12pt}

There are two basic "tests" for run charts (spotting an astronomical data point or looking for cycles are not tests *per se*):

- *Process shift:* A non-random run is a set of more than $\log_2(n) + 3$ consecutive data points (with the limit rounded to the nearest integer) that are all above or all below the median line, where *n* is the number of points that do *not* fall directly on the median line. For example, if there are 34 points and 2 fall on the median, then $n = 32$ useful observations. Plugging this value into the equation: $\log_2(32) + 3 = 5 + 3 = 8$. So, in this case, the longest run should be no more than 8 points; a run of 9 or more suggests a shift.

- *Number of crossings:* Too many or too few median line crossings suggest a pattern inconsistent with natural variation. You can use the binomial distribution (`qbinom(0.05, n-1, 0.50)` in R, for a 5% false positive rate and an expected 50% of values on each side of the median) or a table (e.g., [Table 1 in Anhøj & Olesen 2014](http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0113825#pone-0113825-t001)) to find the minimum expected number of crossings.
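As a quick check of the arithmetic, the two limits for the example above (34 points, 2 of them on the median) can be computed in base R. This is only a sketch; the variable names are illustrative and are not part of any package.

```{r}
# Worked example: 34 observations, 2 of which fall on the median
n_total     <- 34
n_on_median <- 2
n_useful    <- n_total - n_on_median   # points not on the median

# Process shift: upper limit for the longest run
longest_max <- round(log2(n_useful) + 3)

# Number of crossings: lower limit at a 5% false positive rate
crossings_min <- qbinom(0.05, n_useful - 1, 0.50)

c(n_useful = n_useful, longest_max = longest_max, crossings_min = crossings_min)
```

A run longer than `longest_max`, or fewer crossings than `crossings_min`, would suggest non-random variation.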

<br>
\vspace{12pt}

Rather than working these limits out by hand, this book uses a package called `qicharts2`, which creates run charts and control charts and calculates the relevant values for you. The best way to understand it is to jump into an example. Let's plot the run chart for Rachel's data. The arguments `data`, `y`, and `x` are straightforward. The `n` argument is only needed for certain applications. CLABSI data is one of those applications, where `n` is the number of line days (the `linedays` column in Rachel's dataset). The `multiply` argument scales the plotted rate; in this case we want to look at infections per 1000 patient days, so we set `multiply = 1000`. We encourage you to read through the documentation of the package (`?qicharts2::qic`) to determine which arguments are needed for *your* data.

The argument `print.summary = TRUE` can be added to the `qic` function, *or* the `summary` function can be called on a stored plot object (e.g., `summary(run_chart)`). Both give a table of calculated values that you can evaluate.


```{r}
qicharts2::qic(x = months, y = infections, n = linedays, data = rachel_data, 
               multiply = 1000, chart = 'run', x.angle = 45, 
               title = "Run chart of Rachel's data", 
               xlab = "Month", 
               ylab = "Infection count per 1000 patient days", 
               print.summary = TRUE)
```


By viewing the summary, you can compare the number of observations, `n.obs`, with the number of useful observations (those not on the median), `n.useful`. You can check that the longest run in the data, `longest.run`, does not exceed the maximum allowed, `longest.max`. Finally, you can check that the number of crossings, `n.crossings`, is at least the minimum expected, `n.crossings.min`.
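To see what these summary values represent, the sketch below computes them by hand in base R. It is not the code `qicharts2` uses internally. The function `run_chart_tests()` and the `hypothetical_rates` vector are made up for illustration, and the function assumes a single series with at least one point off the median.

```{r}
# Hand-rolled sketch of the two run chart tests (illustrative only)
run_chart_tests <- function(y, alpha = 0.05) {
  centre <- median(y, na.rm = TRUE)
  side   <- sign(y - centre)                # +1 above, -1 below, 0 on the median
  side   <- side[!is.na(side) & side != 0]  # keep only the useful observations
  n_useful <- length(side)

  runs            <- rle(side)                 # consecutive points on one side
  longest_run     <- max(runs$lengths)
  n_crossings     <- length(runs$lengths) - 1  # each new run is one crossing
  longest_max     <- round(log2(n_useful) + 3)
  n_crossings_min <- qbinom(alpha, n_useful - 1, 0.5)

  list(
    n.obs           = sum(!is.na(y)),
    n.useful        = n_useful,
    longest.run     = longest_run,
    longest.max     = longest_max,
    n.crossings     = n_crossings,
    n.crossings.min = n_crossings_min,
    runs.signal     = longest_run > longest_max || n_crossings < n_crossings_min
  )
}

# Hypothetical monthly rates, invented for this example
hypothetical_rates <- c(4.1, 3.2, 5.0, 2.8, 4.6, 3.9, 5.3, 2.5, 4.4, 3.1, 4.8, 3.6)
str(run_chart_tests(hypothetical_rates))
```

The element names mirror the columns discussed above so that the hand calculation can be read side by side with the package's summary table.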

***
```{remark}
\nop

\vspace{12pt}
\vspace{12pt}

Looking at the run chart and the summary table for Rachel's data, is any non-random variation suggested?

\vspace{12pt}
```

***

<br>
\vspace{12pt}
<br>
\vspace{12pt}
<br>
\vspace{12pt}

No, no non-random variation is suggested. There are no unusual runs in the data: the longest run is four points, and the maximum allowed is eight. Furthermore, there are twelve crossings, and we need at least eight. So both tests (process shift and number of crossings) suggest that there is no non-random variation in this process.