Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nbgrader cell grading issues: kernel not agnostic, confusing, English #542

Closed
dsblank opened this issue Sep 8, 2016 · 18 comments
Closed
Labels
Milestone

Comments

@dsblank
Copy link
Member

dsblank commented Sep 8, 2016

Currently, when you create a nbgrader notebook and mark a cell to be autograded, it will replace the cell's contents (if any) with:

# YOUR CODE HERE
raise NotImplementedError()

This is confusing for two reasons:

  1. If you actually put in code, such as:
(define my-test-function
    (lambda (arg)
        ;; your code here...

it is just removed.

  1. It is replaced with Python code. Beginning Python students are confused by these lines. It is very confusing if they are beginning students of a different language.

Constraints: there is an automatic feature that will replace/search for text, such as :

### BEGIN SOLUTION
...
### END SOLUTION

This is, of course, language specific (eg, the comment hash is specific to Python).

A related problem is that text Markdown cells that are marked for autograding are also removed. For example, the text is always replaced with:

YOUR ANSWER HERE

My overall feeling is that nbgrader is attempting to do too much, and that causes issues because what it does do is not kernel (nor human-language) agnostic. If it instead left the text/code alone the cells, many of these issues would go away.

But what to do about automatic solution hiding? Perhaps something more general:

  • if a line contains the text "*** begin solution ***" (can be embedded in comments of kernel), then start of solution
  • if a line contains the text "*** end solution ***" (can be embedded in comments of kernel), then end of solution

Also, the system should not rely on raising NotImplementedError(). This is confusing to see, and is a weak way to check for "validation".

@jhamrick
Copy link
Member

jhamrick commented Sep 8, 2016

The solution regions and the code that they are replaced with are all customizable, so if the defaults aren't quite right for your class (e.g. because of the language or the level of the students), you can definitely change them, e.g.:

c.ClearSolutions.comment_mark = ";"
c.ClearSolutions.begin_solution_delimeter = ";; BEGIN SOLUTION"
c.ClearSolutions.end_solution_delimeter = ";; END SOLUTION"
c.ClearSolutions.code_stub = ";; your code here..."

This actually has changed on master, too, so you can specify different values depending on the kernel language, allowing you to have different notebooks in different languages within the same assignment:

c.ClearSolutions.begin_solution_delimeter = {'python': '### BEGIN SOLUTION', 'lisp': ';;; BEGIN SOLUTION'}
c.ClearSolutions.end_solution_delimeter = {'python': '### END SOLUTION', 'lisp': ';;; END SOLUTION'}
c.ClearSolutions.code_stub = {'python': '# YOUR CODE HERE', 'lisp': ';; your code here...'}

I agree this isn't super well documented; I can definitely improve the documentation to make this stuff clearer. I do think this functionality is incredibly important, though -- maintaining separate instructor and student versions of our assignments used to be one of our biggest pain points. Being able to only maintain a single master copy and generate the student version from it has made things for our class much, much simpler.

A related problem is that text Markdown cells that are marked for autograding are also removed.

I'm not sure I understand the issue here -- is there are reason you can't split the markdown cell into multiple cells, one with the text you want to keep (that isn't autograded) and one where you want students to put their answer? Can you give an example of the cell contents for this sort of thing?

Also, the system should not rely on raising NotImplementedError(). This is confusing to see, and is a weak way to check for "validation".

nbgrader definitely doesn't rely on the NotImplementedError -- you can use whatever value you want there. The only constraint for nbgrader validate is that all the test cells pass without error, and the only constraint for nbgrader validate --invert is that all of the test cells fail with an error (not necessarily NotImplementedError).

@dsblank
Copy link
Member Author

dsblank commented Sep 8, 2016

@jhamrick Thanks for the feedback!

I think I want to push back on the philosophy in the answer a bit, though. It is great that these are configurable values. However, it would be better if it just worked in an intuitive way without having to have configurable values at all. It would be great to have less documentation (because it is easy to use), and fewer configs. nbgrader is already too complicated to use. Having to learn esoteric internals (such as c.ClearSolutions.begin_solution_delimeter) is asking a lot from the users. It could be implemented in an easier way that worked for all languages (as I suggested).

Regarding the removing of text in an autograded Markdown, three points:

  1. sometimes I give hints about what to answer. For example, I could have some Markdown syntax with a similar your answer here.

  2. There is no indication that what I write in the autograded markdown will be removed.

  3. The notebook is a general framework. Having the English message "YOUR ANSWER HERE" stuffed in there is inflexible. That could be a default, but I had already written a more appropriate text that was removed.

nbgrader definitely doesn't rely on the NotImplementedError -- you can use whatever value you want there.

Of course, I didn't put that there... nbgrader put it there. Are you saying that that is another configurable?

I'll have to add an assert-like component to the language to be able to make easy checks. So you are saying that as long as the code causes an error (and properly causes the front end to halt) any kernel should work?

@jhamrick
Copy link
Member

jhamrick commented Sep 8, 2016

It could be implemented in an easier way that worked for all languages (as I suggested).

Unfortunately, I'm not sure it's quite so simple... the delimiters need to be prefixed by the relevant comment mark of the language or else it's invalid syntax. The comment mark is then something that would either need to be tracked explicitly by nbgrader itself, or still something that needs to be configured. I don't think it should be nbgrader's responsibility to keep a mapping of all comment marks for every possible kernel -- it would be a pain to maintain, and more importantly, it would be frustrating for users who want to use a new kernel to wait for the next release of nbgrader for their kernel to be supported/added to the mapping. Also, although it is unusual, there are some languages (e.g. Mathematica) that only have support for block comments (rather than inline comments), which means it would not be sufficient to just prefix the delimeters with the comment mark.

I agree nbgrader can be quite complicated to use, but I think that is sometimes an unfortunate necessity if nbgrader is going to be flexible enough to support the different needs of a wide range of classes. I definitely think there are some places where nbgrader can be simplified and made less complicated, but I don't think there is much that there can be done in this particular case that wouldn't introduce additional complexity in some other way.

There is no indication that what I write in the autograded markdown will be removed.

There isn't an indication in the interface, that's true, though this behavior is documented. I would definitely be in favor of adding something to the UI that would indicate that this will happen, but I haven't been able to come up with anything good (UI is not my strength). Do you have any suggestions?

The notebook is a general framework. Having the English message "YOUR ANSWER HERE" stuffed in there is inflexible. That could be a default, but I had already written a more appropriate text that was removed.

It is a default -- you can customize it with the ClearSolutions.text_stub option.

Of course, I didn't put that there... nbgrader put it there. Are you saying that that is another configurable?

Yes, it is configurable through the ClearSolutions.code_stub option. The default is '# YOUR CODE HERE\\nraise NotImplementedError()'. The list of configuration options shows all the ways the ClearSolutions preprocessor can be customized, including default values -- but as I mentioned earlier, I agree this should be documented better.

So you are saying that as long as the code causes an error (and properly causes the front end to halt) any kernel should work?

The code stub itself doesn't even have to cause an error (though that is the easiest thing to do, in my opinion) -- it all depends on how you write your test cases. The only constraint is that when autograding happens, the behavior is such that:

  1. If the tests pass, the student gets full credit.
  2. If the tests fail, the student gets no credit.

So if the student hasn't given an answer, the tests should ideally fail by default. How they fail is totally up to you.

@willingc
Copy link
Member

willingc commented Sep 9, 2016

@jhamrick Would making a cookiecutter for config options help at all?

@dsblank
Copy link
Member Author

dsblank commented Sep 9, 2016

Unfortunately, I'm not sure it's quite so simple... the delimiters need to be prefixed by the relevant comment mark of the language or else it's invalid syntax.

The delimiter can be inside a comment, whatever it happens to be. I don't think the delimiter needs to be the entire comment. For example, these could all work:

def function():
    ## **BEGIN SOLUTION**
    ## **END SOLUTION**
(define function
    (lambda (v)
        ;; **BEGIN SOLUTION**
        ;; **END SOLUTION**
    )
)

The comment mark is then something that would either need to be tracked explicitly by nbgrader itself,

No, I wouldn't do that. I would just look for the text delimiters in a line, and use that whole line to indicate indentation level (starting with whatever starts that line).

Also, although it is unusual, there are some languages (e.g. Mathematica) that only have support for block comments (rather than inline comments), which means it would not be sufficient to just prefix the delimeters with the comment mark.

Wouldn't this work:

some.other(language);
   like-mathematica...
        /* **BEGIN SOLUTION** */
        /* **END SOLUTION** */

I really don't like "'# YOUR CODE HERE\nraise NotImplementedError()'" as a default, for a couple of reasons. One is that it does cause an error, which can be intimidating. Error messages take some practice to be able to read and understand. The other is that it is advanced Python, and Python-specific. I would try to keep defaults as neutral and simple as possible. But, at least it is configurable!

I thought that kernels knew how to comment code for use in the "Download as ..." as a native file in that language. But it looks like that didn't happen (saves as a txt file with no added comments). If kernels did know how to insert comments, you could use it here too.

but I haven't been able to come up with anything good (UI is not my strength). Do you have any suggestions?

  • Empty the contents
  • Make it read-only
  • Make it red if it has stuff in it

Better: allow text to be able to be put in there, and then don't use the "YOUR ANSWER HERE" text. Then you don't need any UI, and it would match expectations.

Some of the text in your answer should be in the docs too. I agree that by default tests should fail. But that doesn't mean that they need to raise exceptions in the actual code. It isn't clear from the default stubs and all the examples that that isn't necessary.

As far as design of the tests using assert: to me, it would be cool if a student could "run all" and see the results "8 out of 10 correct" or "110/230 points". Having more than one assert in a cell prevents that from happening. And of course, it will stop at the first error anyway. This could be re-thought to be more useful pedagogically, I think.

@dsblank
Copy link
Member Author

dsblank commented Sep 9, 2016

@willingc I don't know about a cookiecutter, but I wouldn't be able to use nbgrader if I didn't have a notebook for each course. I can't remember all of the steps and syntax (sqlite, files, commands, flags). So, I copy a notebook into the directory, and it helps me manage the class. I can show you a template, if I scrub the details...

@jhamrick
Copy link
Member

jhamrick commented Sep 9, 2016

@dsblank Ah, so is what you're suggesting matching the delimeter to any part of the line, i.e. it would also work to have # a;owfija;iwejfa;**BEGIN SOLUTION**aoef;aefjiowejroweij? I had thought about doing it this way, but I was concerned about the edge case where you really did want to have the delimeter (like **BEGIN SOLUTION**) somewhere but didn't want to use it as a delimeter. So I decided on the stricter version where the whole line (excluding whitespace) is the delimeter.

If kernels did know how to insert comments, you could use it here too.

Yeah, if kernels knew how to insert comments then I think this would be a moot point.

Empty the contents
Make it read-only
Make it red if it has stuff in it

Hmm, I don't think that's the right solution. Instructors should be able to put stuff in those cells for their own use (e.g. a reference solution, grading rubric, etc.) -- it just shouldn't end up in the student version. I don't want to force the cells to be empty or convey to instructors that they can't put stuff in those cells.

Better: allow text to be able to be put in there, and then don't use the "YOUR ANSWER HERE" text. Then you don't need any UI, and it would match expectations.

This would again though require maintaining separate versions, one with the solutions and one without, which is something I'd really like to avoid.

Some of the text in your answer should be in the docs too. I agree that by default tests should fail. But that doesn't mean that they need to raise exceptions in the actual code. It isn't clear from the default stubs and all the examples that that isn't necessary.

What specifically do you think would be useful to include in the docs? I added a note about the exceptions being optional in #545.

As far as design of the tests using assert: to me, it would be cool if a student could "run all" and see the results "8 out of 10 correct" or "110/230 points". Having more than one assert in a cell prevents that from happening. And of course, it will stop at the first error anyway. This could be re-thought to be more useful pedagogically, I think.

Yes, I agree this would be nice. One of the very early versions of nbgrader did something more like this but it ended up getting scrapped because it made things way more complicated. Now that nbgrader is fairly mature though I think it would certainly be possible to build a notebook extension that does something like this that could be included as part of nbgrader.

@dsblank
Copy link
Member Author

dsblank commented Sep 9, 2016

Here is a stripped-down version of my nbgrader course notebook. I have one in each course directory in my ~/nbgrader folder: https://athena.brynmawr.edu/jupyter/hub/dblank/public/nbgrader_template.ipynb

@dsblank
Copy link
Member Author

dsblank commented Sep 9, 2016

I liked this text:

The code stub itself doesn't even have to cause an error (though that is the easiest thing to do, in my opinion) -- it all depends on how you write your test cases. The only constraint is that when autograding happens, the behavior is such that:

1. If the tests pass, the student gets full credit.
1. If the tests fail, the student gets no credit.

So if the student hasn't given an answer, the tests should ideally fail by default. How they fail is totally up to you.

@dsblank
Copy link
Member Author

dsblank commented Sep 9, 2016

Hmm, I don't think that's the right solution. Instructors should be able to put stuff in those cells for their own use (e.g. a reference solution, grading rubric, etc.) -- it just shouldn't end up in the student version. I don't want to force the cells to be empty or convey to instructors that they can't put stuff in those cells.

Are such solutions visible while grading? Otherwise, a solution written in a manually-graded Markdown isn't very useful anyway.

@jhamrick
Copy link
Member

jhamrick commented Sep 9, 2016

Are such solutions visible while grading?

No, but I will open up the instructor version and refer to it during grading. We also use the instructor version of the assignment to generate solutions that we give out to the students after we've returned their grades.

@dsblank
Copy link
Member Author

dsblank commented Sep 9, 2016

No, but I will open up the instructor version and refer to it during grading. We also use the instructor version of the assignment to generate solutions that we give out to the students after we've returned their grades.

Useful information about workflow! Thanks!

@dsblank
Copy link
Member Author

dsblank commented Sep 9, 2016

but I was concerned about the edge case where you really did want to have the delimeter (like BEGIN SOLUTION) somewhere but didn't want to use it as a delimeter. So I decided on the stricter version where the whole line (excluding whitespace) is the delimeter.

Personally, I wouldn't worry about that weird usage. It would make everything easier, I think, if you prevent that and look for it more loosely. I'm all about making it easier :)

@jhamrick
Copy link
Member

jhamrick commented Sep 9, 2016

I liked this text

Cool, I will add that to the docs!

Personally, I wouldn't worry about that weird usage. It would make everything easier, I think, if you prevent that and look for it more loosely. I'm all about making it easier :)

Yeah, it is probably unlikely to occur in practice. Ok, I think I am sold on the idea. I'll open an issue specifically for that.

@jhamrick
Copy link
Member

jhamrick commented Sep 9, 2016

@willingc

Would making a cookiecutter for config options help at all?

I am not sure. There are so many config options and they're not relevant to everybody. I could create a default nbgrader_config.py and include it in the docs (with nbgrader --generate-config) but that is mostly redundant with the existing docs. The user guide has a very simple example of nbgrader_config.py -- I guess the issue is in intermediate cases, but those are the cases that are hardest to document exactly because everybody has different things that they want to configure.

@jhamrick jhamrick added this to the 0.4.0 milestone Sep 9, 2016
@willingc
Copy link
Member

willingc commented Sep 9, 2016

@jhamrick @dsblank Happy to help on this and other 0.4.0 milestones. You know my heart is with education and Jupyter/JupyerHub, and Doug is doing very cool stuff. I'll give the notebook template a closer look tomorrow.

P.S. Nice to have you back @jhamrick.

@jhamrick
Copy link
Member

jhamrick commented Sep 9, 2016

@willingc Thanks! I would definitely really appreciate anything you want to contribute 😄 🎉

@jhamrick
Copy link
Member

To summarize, a few issues came out of this discussion:

I think the bulk of what we talked about in this issue has been turned into these other issues, so I am going to close this. @dsblank if you disagree, please let me know and either I or you can open up new issues for specific things that should be addressed.

@jhamrick jhamrick modified the milestones: 0.4.0, 0.5.0 Jan 13, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants