Commit

Phrasing 'on top of axis' in vignette
hanneoberman committed Sep 7, 2023
1 parent beaa379 commit 2bce810
Showing 1 changed file with 3 additions and 3 deletions.
6 changes: 3 additions & 3 deletions vignettes/ggmice.Rmd
@@ -90,7 +90,7 @@ The `mapping` argument in `ggmice()` cannot be empty. An `x` or `y` mapping (or

## Incomplete data

If the object supplied to the `data` argument in `ggmice()` is a `data.frame`, the visualization will contain observed data in blue and missing data in red. Since missing data points are by definition unobserved, the values themselves cannot be plotted. What we *can* plot are sets of variable pairs. Any missing values on one variable can be displayed on top of the axis of the other. This provides a visual cue that the missing data is distinct from the observed values, but still displays the observed value of the other variable.
If the object supplied to the `data` argument in `ggmice()` is a `data.frame`, the visualization will contain observed data in blue and missing data in red. Since missing data points are by definition unobserved, the values themselves cannot be plotted. What we *can* plot are sets of variable pairs. Any missing values in one variable can be displayed on the axis of the other. This provides a visual cue that the missing data is distinct from the observed values, but still displays the observed value of the other variable.

For example, the variable `age` is completely observed, while there are some missing entries for the height variable `hgt`. We can create a scatter plot of these two variables with:

@@ -99,7 +99,7 @@ ggmice(dat, aes(age, hgt)) +
geom_point()
```

The `age` values of cases with missing `hgt` are plotted on top of the horizontal axis. This is in contrast to a regular `ggplot()` call with the same arguments, which would leave out all cases with missing `hgt`. So, with `ggmice()` we lose less information, and may even gain valuable insight into the missingness in the data.
The `age` values of cases with missing `hgt` are plotted on the horizontal axis. This is in contrast to a regular `ggplot()` call with the same arguments, which would leave out all cases with missing `hgt`. So, with `ggmice()` we lose less information, and may even gain valuable insight into the missingness in the data.

Another example of `ggmice()` in action on incomplete data is when one of the variables is categorical. The incomplete continuous variable `hgt` is plotted against the incomplete categorical variable `reg` with:

@@ -108,7 +108,7 @@ ggmice(dat, aes(reg, hgt)) +
geom_point()
```

Again, missing values are plotted on top of the axes. Cases with observed `hgt` and missing `reg` are plotted on top of the vertical axis. Cases with observed `reg` and missing `hgt` are plotted on top of the horizontal axis. There are no cases where neither is observed, but otherwise these would be plotted on the intersection of the two axes.
Again, missing values are plotted on the axes. Cases with observed `hgt` and missing `reg` are plotted on the vertical axis. Cases with observed `reg` and missing `hgt` are plotted on the horizontal axis. There are no cases where neither is observed, but otherwise these would be plotted on the intersection of the two axes.
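
The placement rule described above can be sketched in base R. This is only an illustration on a made-up data frame: `toy` and its `placement` column are hypothetical, and the code is not how `ggmice()` is implemented. It labels where each case would appear, based on which of the two variables is missing:

```r
# Hypothetical toy data with the same variable names as in the text.
toy <- data.frame(
  reg = c("north", NA, "city", "south", NA),
  hgt = c(75.0, 121.3, NA, 168.9, NA)
)

miss_x <- is.na(toy$reg)  # x variable (reg) is missing
miss_y <- is.na(toy$hgt)  # y variable (hgt) is missing

# Label where each case would be drawn, following the description in the text.
toy$placement <- ifelse(
  miss_x & miss_y, "intersection of the axes",
  ifelse(miss_x, "vertical axis",
    ifelse(miss_y, "horizontal axis", "plot area")
  )
)

print(toy)
```

In a regular `ggplot()` scatter plot, only the cases labelled "plot area" would be drawn; the others would be dropped.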

The 'grammar of graphics' makes it easy to adjust the plots programmatically. For example, we could be interested in the differences in growth data between the city and other regions. Add facets based on a clustering variable with:
