qcc.groups() with only one observation cause error in qcc() from sd.xbar, etc. #13

Open
mcaselli opened this issue Nov 20, 2018 · 4 comments

@mcaselli

With variable group sizes, if a group happens to have only one member, qcc() will throw an error from sd.xbar().

Ideally, the functions should be robust to this type of input so that the tool can be used for in-process monitoring: e.g. with time-based subgroups, the user may run the report before a second (or third, or perhaps even first) observation is collected.

library(qcc)
data(pistonrings)

grp <- qcc.groups(pistonrings[-c(7:10, 11:13, 16:17),"diameter"], pistonrings[-c(7:10, 11:13, 16:17),"sample"])
qcc(grp, type="xbar")
## Error in sd.xbar(c(74.03, 73.995, 74.005, 73.993, 73.992, 74.009, 73.995,  : 
##  group sizes must be larger than one
@luca-scr
Owner

OK, this case falls outside the two standard situations: "rational subgroups" of at least two observations each, or samples that all consist of a single observation. The two cases need different methods for estimating the within-sample standard deviation.
In the case you presented, which is the best way to estimate the within-sample standard deviation?
I don't know of any theoretical study dealing with this, but perhaps a simple solution would be to just remove the singleton samples. Something like this would work:

size <- apply(grp, 1, function(x) sum(!is.na(x)))
qcc(grp, type="xbar", std.dev = sd.xbar(grp[size > 1,]))

This can be easily implemented within the sd.xbar() function.
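
As a rough sketch of what that could look like without modifying qcc itself, the wrapper below filters out singleton subgroups before delegating to sd.xbar(). The name sd_xbar_skip_singletons is hypothetical (it is not part of qcc), and the sketch assumes the input is a matrix with one subgroup per row, as returned by qcc.groups().

```r
library(qcc)

# Hypothetical helper (not part of qcc): estimate the within-sample standard
# deviation using only the subgroups that have at least two observations.
sd_xbar_skip_singletons <- function(data, ...) {
  data <- as.matrix(data)
  size <- rowSums(!is.na(data))        # observations per subgroup (row)
  keep <- size > 1
  if (!any(keep))
    stop("no subgroup has more than one observation")
  # drop = FALSE keeps a matrix even if only one subgroup remains
  sd.xbar(data[keep, , drop = FALSE], ...)
}

data(pistonrings)
drop_rows <- c(7:10, 11:13, 16:17)
grp <- qcc.groups(pistonrings[-drop_rows, "diameter"],
                  pistonrings[-drop_rows, "sample"])

# The singleton subgroup is still charted; it is only excluded from std.dev
sigma_hat <- sd_xbar_skip_singletons(grp)
qcc(grp, type = "xbar", std.dev = sigma_hat)
```

Any extra arguments (for example a different std.dev method) are passed through to sd.xbar() unchanged.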

@mcaselli
Author

Indeed I did something like this as a hack in my present application:

QCCGroups  <- function(data, sample){
    mat <- qcc::qcc.groups(data, sample)
    mat <- mat[which(apply(mat, 1, function(x) length(which(!is.na(x))))>1),,drop=FALSE]
    return(mat)
}

I think the "real" issue here is that using stop() in sd.xbar() at line 664 is perhaps a bit heavy-handed, since the error propagates all the way up the call stack and aborts any script or report that calls it.

stats::sd() simply returns NA for a vector of length 1; maybe that's the best way to go here?
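
To illustrate the difference in behaviour, here is a small base-R sketch. strict_sd is a hypothetical stand-in for a function that calls stop() on a singleton group (it is not qcc code); the tryCatch() call shows how a calling script could keep running today without any change to the package.

```r
# stats::sd() does not fail on a single value; it returns NA
single_obs <- 74.03
sd(single_obs)

# Hypothetical stand-in for a function that stops on singleton groups
strict_sd <- function(x) {
  if (length(x) < 2)
    stop("group sizes must be larger than one")
  sd(x)
}

# A caller can trap the error and fall back to NA instead of aborting
result <- tryCatch(
  strict_sd(single_obs),
  error = function(e) {
    message("skipping estimate: ", conditionMessage(e))
    NA_real_
  }
)
```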

@mcaselli
Author

also...thanks again for the great package, and for taking the time to look at this issue!

@luca-scr
Owner

Your hack simply removes the sample(s) with a single observation (and the code can be simplified to improve readability), so you can do this without touching any qcc code.
sd.xbar() can't return NA because qcc() needs an estimate for std.dev. If you provide one (as I did in the example), then sd.xbar() won't be called.
Finally, my proposal differs from yours: you drop the singleton sample from the data, whereas I keep it for plotting but exclude it when computing std.dev.
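
The two approaches can be put side by side as follows. This is only a sketch based on the snippets in this thread; the variable names are illustrative, and rowSums() is used as one possible way to simplify the size computation.

```r
library(qcc)
data(pistonrings)

drop_rows <- c(7:10, 11:13, 16:17)
grp <- qcc.groups(pistonrings[-drop_rows, "diameter"],
                  pistonrings[-drop_rows, "sample"])
size <- rowSums(!is.na(grp))           # observations per sample

# Approach A: drop singleton samples from the data entirely
grp_a <- grp[size > 1, , drop = FALSE]
qa <- qcc(grp_a, type = "xbar")

# Approach B: keep every sample on the chart, but estimate std.dev
# from the samples with more than one observation
qb <- qcc(grp, type = "xbar",
          std.dev = sd.xbar(grp[size > 1, , drop = FALSE]))

# A charts one sample fewer than B
c(A = nrow(grp_a), B = nrow(grp))
```

Both calls should end up with the same std.dev estimate, since it is computed from the same rows. The chart from approach B still includes the singleton sample, so its plotted points, and possibly its center line, can differ from approach A.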
