Towards being able to handle crashing processes. #317

stevana · 2019-06-11T15:16:40Z

See #162 for details.

kderme · 2019-06-12T14:30:44Z

@stevana does this resolve the error "simplify: impossible, because of the structure of linearise.", which sometimes appears when there is an exception?

stevana · 2019-06-13T07:18:01Z

@kderme: no, that error should never happen -- please open a ticket if you see it.

kderme · 2019-06-14T13:36:11Z

src/Test/StateMachine/Parallel.hs

+               , reason2
+               , env1 <> env2
+               ) pairs
+    go hchan (Ok, ExceptionThrown, env) (Pair cmds1 _cmds2 : pairs) = do


@stevana is there any reason to continue execution if there is an ExceptionThrown?

Sequential execution also stops if there is an exception.

stevana · 2019-06-15

I think we should stop execution for a specific process (or Pid) if there's an ExceptionThrown on that process (but continue executing on the other processes). I believe that's what I've implemented here.

This is also why we need your generalisation to allow execution on more than two processes, so that we can introduce faults (exceptions) without stopping all execution altogether.
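The per-process behaviour described above can be sketched in a few lines. This is a simplified, pure illustration and not the code in `Parallel.hs`: `runPid` and `runAll` are hypothetical names, a `Left` result stands in for a thrown exception, and only the `Ok` and `ExceptionThrown` constructor names are borrowed from the diff.

```haskell
-- Why a process stopped; constructor names borrowed from the diff above.
data Reason = Ok | ExceptionThrown
  deriving (Eq, Show)

-- Run the commands of one process in order, stopping at the first failure.
-- A 'Left' from the hypothetical executor models an exception being thrown.
runPid :: (cmd -> Either String resp) -> [cmd] -> ([resp], Reason)
runPid exec = go
  where
    go []       = ([], Ok)
    go (c : cs) = case exec c of
      Left _  -> ([], ExceptionThrown)
      Right r -> let (rs, reason) = go cs in (r : rs, reason)

-- Each process is run independently, so a crash in one of them does not
-- prevent the others from running to completion.
runAll :: (cmd -> Either String resp) -> [[cmd]] -> [([resp], Reason)]
runAll exec = map (runPid exec)
```

With an executor that fails on one particular command, only the process containing that command is cut short; the responses it produced before the crash are kept.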

kderme · 2019-06-19

I see. I guess it depends on how we want to simulate user usage of the API, and since this gives the user more options I think it's a step in the right direction. @stevana do you think this is close to being merged? I see some errors like "executeCommands: impossible" and some false-positive tests, which I believe will be solved by this PR.

stevana · 2019-06-19

Another important thing that this PR adds is the ability to complete a history which contains crashed processes. We do so by appending a user-specified response to the end of the history. For example, in the memory reference example, if a Write fails then we can have a history like this:

    Process 1: |-- x <- Create --| |---- Read x ---|
    Process 2:                       |---- Write 5 x --...

In this case Read can return either 0 (the default value of Create) or 5, depending on whether Write crashed before or after writing.

To account for this we complete the above history as follows:

    Process 1: |-- x <- Create --| |---- Read x ---|
    Process 2:                       |---- Write 5 x -------------------------|

Notice that the Write is concurrent with the Read. Now because of the way linearise works (it tries all possible sequential interleavings of the action calls) it will accept Read returning either 0 or 5.
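To make the argument concrete, here is a toy brute-force checker for the completed history. It is a sketch under assumptions, not the library's `linearise`: the types `Cmd`, `Resp`, `Op`, the integer timestamps, and the functions `step`, `respectsRealTime`, `agrees`, `linearisable` and `completedHistory` are all made up for illustration, and the model is a single memory cell.

```haskell
import Data.List (permutations)

-- Hypothetical commands and responses for a single memory cell.
data Cmd = Create | Read | Write Int
  deriving (Eq, Show)

data Resp = Created | ReadValue Int | Ack
  deriving (Eq, Show)

-- An operation together with its invocation and response times.
data Op = Op
  { opCmd   :: Cmd
  , opResp  :: Resp
  , opStart :: Int
  , opEnd   :: Int
  } deriving (Eq, Show)

-- Model: Nothing before Create, Just the stored value afterwards.
type Model = Maybe Int

step :: Model -> Cmd -> Maybe (Model, Resp)
step Nothing  Create    = Just (Just 0, Created)
step (Just v) Read      = Just (Just v, ReadValue v)
step (Just _) (Write n) = Just (Just n, Ack)
step _        _         = Nothing

-- A sequential order is allowed only if it never places an operation
-- before another one that had already finished when it started.
respectsRealTime :: [Op] -> Bool
respectsRealTime ops =
  and [ not (opEnd b < opStart a) | (i, a) <- indexed, (j, b) <- indexed, i < j ]
  where
    indexed = zip [0 :: Int ..] ops

-- Replay the operations against the model and compare the responses.
agrees :: [Op] -> Bool
agrees = go Nothing
  where
    go _ []         = True
    go m (op : ops) = case step m (opCmd op) of
      Just (m', resp) | resp == opResp op -> go m' ops
      _                                   -> False

-- Try every interleaving; accept if at least one is valid.
linearisable :: [Op] -> Bool
linearisable ops = any (\o -> respectsRealTime o && agrees o) (permutations ops)

-- The completed history from above: the crashed Write gets an Ack whose
-- response time is the very end of the history.
completedHistory :: Int -> [Op]
completedHistory readResult =
  [ Op Create    Created                0 1
  , Op Read      (ReadValue readResult) 3 5
  , Op (Write 5) Ack                    2 maxBound
  ]
```

Under these assumptions `linearisable (completedHistory 0)` and `linearisable (completedHistory 5)` both hold, because the Write may be ordered after or before the Read, while any other read result is rejected.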

Completing Write is easy, as the response is just an Ack, and it doesn't really matter what we complete Read with, as it doesn't change the model. If a Create crashes we are kind of screwed; that's why in this PR I also allowed the pre-condition failed exception to be thrown. No doubt there will be examples for which complete will be harder...
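A minimal sketch of what such a completion step could look like for the memory reference example. The names and types here (`Event`, `complete`, `pending`, `completeHistory`) are hypothetical and do not reflect the signatures used in the PR; a `Nothing` result stands in for the case where a history cannot be completed.

```haskell
type Pid = Int

data Cmd = Create | Read | Write Int
  deriving (Eq, Show)

data Resp = Created | ReadValue Int | Ack
  deriving (Eq, Show)

-- A history is a flat list of invocations and responses, tagged by process.
data Event = Invocation Pid Cmd | Response Pid Resp
  deriving (Eq, Show)

-- User-specified response for a command that crashed.
complete :: Cmd -> Maybe Resp
complete (Write _) = Just Ack            -- a Write only ever acknowledges
complete Read      = Just (ReadValue 0)  -- placeholder: Read leaves the model unchanged
complete Create    = Nothing             -- no sensible response to make up

-- Invocations that never got a response, oldest first. Assumes each process
-- issues its commands sequentially, so it has at most one pending invocation.
pending :: [Event] -> [(Pid, Cmd)]
pending = reverse . foldl track []
  where
    track acc (Invocation pid cmd) = (pid, cmd) : dropPid pid acc
    track acc (Response pid _)     = dropPid pid acc
    dropPid pid = filter ((/= pid) . fst)

-- Append a response for every pending invocation to the end of the history,
-- or give up if some pending command has no completion.
completeHistory :: [Event] -> Maybe [Event]
completeHistory hist = do
  resps <- traverse (\(pid, cmd) -> Response pid <$> complete cmd) (pending hist)
  pure (hist ++ resps)
```

Because the made-up responses are appended at the very end, every crashed operation stays concurrent with everything invoked after it, which is what lets the checker consider both orderings.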

A possible alternative would be to have the user catch all exceptions (due to fault injection) and account for the non-determinism in the model, e.g. have not just a single value for each memory reference, but a set of values. This complicates the model and is a lot more work for the user though.
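For comparison, the set-of-values alternative might look roughly like this for a single memory cell. This is only a sketch: `Outcome`, `writeModel` and `readOk` are hypothetical names, and a plain list is used as the set.

```haskell
import Data.List (nub)

-- Whether the Write returned normally or its exception was caught.
data Outcome = Succeeded | Crashed
  deriving (Eq, Show)

-- The model tracks every value the cell might currently hold.
type Model = [Int]

initModel :: Model
initModel = [0]

-- A successful Write pins the cell to one value; a crashed Write leaves
-- both the old values and the new value possible.
writeModel :: Int -> Outcome -> Model -> Model
writeModel n Succeeded _  = [n]
writeModel n Crashed   vs = nub (n : vs)

-- A Read is acceptable if it returns any of the possible values.
readOk :: Model -> Int -> Bool
readOk vs v = v `elem` vs
```

Even in this tiny example every transition and postcondition has to be written against a set rather than a value, which is the extra burden on the user mentioned above.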

Yet another possibility might be to have the fault injection be more precise, so that we know whether the Write failed before or after it wrote to the memory. That way we know if Read should return 0 or 5. I'm not sure if having this precision is always possible. It also complicates the model because we need to keep track of exactly what faults are injected and return the correct response in the presence of the fault.
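A rough sketch of the precise-fault variant, assuming the fault injector can report where the crash happened. The `Fault` type and `writeModel` are hypothetical and only cover a single cell.

```haskell
-- Hypothetical record of which fault, if any, was injected into a Write.
data Fault = NoFault | CrashBeforeWrite | CrashAfterWrite
  deriving (Eq, Show)

-- The model stays deterministic: given the fault, the new value, and the
-- old value, there is exactly one value the cell holds afterwards.
writeModel :: Fault -> Int -> Int -> Int
writeModel CrashBeforeWrite _ old = old
writeModel _                n _   = n
```

The model keeps a single value per cell, but every command now has to carry the injected fault with it.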

Does this make sense? If not, let me know where I lost you and I can try to explain further.

This is my current understanding of the Linearizability paper and what Jepsen does. It took me a while to get here, and I'm still not certain if I understand things correctly. So I think it would be great if 1) I could convince you that this approach makes sense, and 2) we developed more examples that confirm that it indeed works.


kderme reviewed Jun 14, 2019


stevana added 2 commits June 19, 2019 10:23

Towards being able to handle crashing processes.

59f7e88

See #162 for details.

Increase maxSuccess to 1000 for the CrashAndLogicBugParallel property.

27e4138

stevana force-pushed the feat/complete-crashing-history branch from d66cb09 to 27e4138 June 19, 2019 09:19

Increase maxSuccess to 3000 for the CrashAndLogicBugParallel property.

cda082b

stevana merged commit 1bec806 into master Jun 19, 2019

stevana deleted the feat/complete-crashing-history branch June 19, 2019 12:53

kderme mentioned this pull request Jul 1, 2019

Property Result in case of exceptions #326

Open
