automatically fill the id attributes of the headings #267
Thanks for the contribution!
I've just one question, but it generally looks solid to me. 🎉
Meant to leave review status as "Comment" :)
Ah, I just looked at the failures in CI. There are two issues:
How does that sound?
Force-pushed from d01fbf0 to eec7388.
I just inlined the function as it was straightforward to do: 0570385
I added an
As for the identifiers, there are actually some rules described in the doc on how to derive them:
I had to switch to an implementation that supports Unicode characters in eec7388 (using the
I improved the algorithm slightly; it should be OK for a second review. There are still some differences compared to the result I get on https://pandoc.org/try/, but pandoc can only generate ids when using
The only thing missing, I think, is:
So for instance in pandoc, the following
becomes:
I guess it means that we'll need to keep a state somewhere with all the ids we've already generated. Not sure how best to implement that, or whether that's something we want. Would love to have your input on that :)
We'll also have to decide how to handle an "empty" id. In pandoc:
So
becomes
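For reference, the derivation rules in the pandoc manual (the `auto_identifiers` extension) can be approximated as below. This is only a minimal ASCII-only sketch, not the code in this PR: `slugify_ascii` is a hypothetical name, the input is assumed to be plain text with formatting already stripped, non-ASCII characters are simply dropped (the PR handles them properly), and runs of whitespace are not collapsed.

```ocaml
(* Hypothetical ASCII-only approximation of pandoc's identifier rules.
   Input is assumed to be plain text with formatting already removed. *)
let slugify_ascii (s : string) : string =
  let buf = Buffer.create (String.length s) in
  let started () = Buffer.length buf > 0 in
  String.iter
    (fun c ->
      match Char.lowercase_ascii c with
      (* Letters are always kept (lowercased). *)
      | 'a' .. 'z' as c -> Buffer.add_char buf c
      (* Digits, '_', '-' and '.' are kept only after the first letter,
         which drops everything up to the first letter. *)
      | ('0' .. '9' | '_' | '-' | '.') as c ->
          if started () then Buffer.add_char buf c
      (* Spaces and newlines become hyphens. *)
      | ' ' | '\n' | '\t' -> if started () then Buffer.add_char buf '-'
      (* Any other character is removed. *)
      | _ -> ())
    (String.trim s);
  (* Fall back to "section" when nothing is left, as pandoc does. *)
  if started () then Buffer.contents buf else "section"
```

With this sketch, `slugify_ascii "3. Applications"` gives `"applications"` and `slugify_ascii "33"` gives `"section"`, which matches the examples in the pandoc manual.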
Force-pushed from c1eccba to 2a64b8c.
cc @koonwen, this would probably fix ocaml/ocaml.org#523!
I'm fine with following pandoc here.
Looking good! I finally was able to address your outstanding questions. Sorry again for the long delay here!
@@ -128,9 +178,13 @@ and inline = function
   | Image (attr, { label; destination; title }) ->
       img label destination title attr

-let rec block = function
+let rec block ~auto_identifiers = function
> I guess it means that we'll need to keep a state somewhere with all the ids we've already generated. Not sure how to best implement that and if that's something we want. Would love to have your input on that :)

Perhaps we can have the `block` function take a record for this argument that can carry some configuration/state? For the identifier numerical suffix, it could be an `int StringMap.t`. Of course, we could also use a mutable hash map for this, but it would be nice to avoid that if feasible. WDYT?
Thanks for the suggestion, I added an `int StringMap.t` to keep track of the numerical suffix. Let me know what you think :)
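To make the `int StringMap.t` idea concrete, here is a rough sketch of how the counter could be threaded through without mutable state. The names `touch`, `unique_id` and `unique_ids` are illustrative (`touch` is borrowed from the `Identifiers.touch` call in the `src/html.ml` snippet in this thread), and the actual module in the PR may look different.

```ocaml
module StringMap = Map.Make (String)

(* [touch id seen] returns how many times [id] was seen before, together
   with the updated map. Hypothetical helper, not the PR's actual code. *)
let touch (id : string) (seen : int StringMap.t) : int * int StringMap.t =
  let count = Option.value (StringMap.find_opt id seen) ~default:0 in
  (count, StringMap.add id (count + 1) seen)

(* The first occurrence keeps the bare id; later ones get "-1", "-2", ... *)
let unique_id (id : string) (seen : int StringMap.t) : string * int StringMap.t =
  let count, seen = touch id seen in
  ((if count = 0 then id else Printf.sprintf "%s-%d" id count), seen)

(* Thread the map through a list of slugs with a plain left fold. *)
let unique_ids (ids : string list) : string list =
  let step (seen, acc) id =
    let id, seen = unique_id id seen in
    (seen, id :: acc)
  in
  let _, acc = List.fold_left step (StringMap.empty, []) ids in
  List.rev acc
```

One limitation of this sketch: a generated `intro-1` can still collide with a heading whose own slug is already `intro-1`.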
Force-pushed from 8ad77e1 to 5a2d985.
Force-pushed from 5a2d985 to 621589b.
Force-pushed from 621589b to c07e958.
Thanks to the index and accumulator arguments given by `Uutf.String.fold_utf_8`, we can avoid the need for a sentinel "start", the need for repeated checks after we've found the suffix, and the need to allocate a new string.
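The idea in this commit message can be sketched with the standard library alone. The code below is an illustration under assumptions, not the commit itself: `fold_bytes` is a hypothetical byte-wise stand-in whose folder takes the accumulator, the character's index, and the character, roughly the shape of a `Uutf.String.fold_utf_8` folder, and `drop_while` here works on bytes rather than decoded Unicode characters.

```ocaml
(* Byte-wise stand-in for a UTF-8 fold: the folder receives the accumulator,
   the index of the current character, and the character itself. *)
let fold_bytes (f : 'a -> int -> char -> 'a) (init : 'a) (s : string) : 'a =
  let acc = ref init in
  String.iteri (fun i c -> acc := f !acc i c) s;
  !acc

(* Drop the longest prefix whose characters satisfy [p]. The accumulator is
   [None] until the first character failing [p]; from then on it holds that
   character's index and [p] is not called again. *)
let drop_while (p : char -> bool) (s : string) : string =
  let step acc i c =
    match acc with
    | Some _ -> acc
    | None -> if p c then None else Some i
  in
  match fold_bytes step None s with
  | None -> ""
  | Some i -> String.sub s i (String.length s - i)
```

Here the accumulator doubles as the "have we found the suffix yet" flag, so no sentinel value is needed, and the only allocation is the final `String.sub`.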
Just a few followup suggestions. I think we're just about there!
src/html.ml (outdated)
    let id = slugify (to_plain_text text) in
    (* Default identifier if empty. It matches what pandoc does. *)
    let id = if id = "" then "section" else id in
    let count, identifiers = Identifiers.touch id identifiers in
    let id =
      if count = 0 then id else Printf.sprintf "%s-%i" id count
    in
I feel like this might better belong in the `slugify` function. WDYT?
I don't have any strong opinion on that one. I made the changes in 820f310 (#267).
Similar to the approach with `drop_while`, we can use the accumulator argument of the fold to track whether or not we've seen white space, which allows us to clean up the logic a bit, IMO. However, unlike that case, the need to construct a new string still requires some side effects.
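As a rough illustration of the white space tracking described above, the accumulator of a fold can record whether the previous character was white space, while the output is still built by side effect in a `Buffer`. This is a byte-wise sketch with a hypothetical name (`hyphenate_whitespace`) and it assumes that each run of white space should become a single hyphen. It does not reproduce the PR's Unicode-aware code.

```ocaml
(* Replace each run of white space with a single '-'. The fold's accumulator
   records whether the previous character was white space; the output is
   built by side effect in a Buffer. *)
let hyphenate_whitespace (s : string) : string =
  let buf = Buffer.create (String.length s) in
  let is_space c = c = ' ' || c = '\t' || c = '\n' || c = '\r' in
  let step prev_was_space c =
    if is_space c then begin
      (* Emit a hyphen only for the first character of a run. *)
      if not prev_was_space then Buffer.add_char buf '-';
      true
    end
    else begin
      Buffer.add_char buf c;
      false
    end
  in
  let (_ : bool) = Seq.fold_left step false (String.to_seq s) in
  Buffer.contents buf
```

The boolean accumulator replaces a separate mutable flag, but the `Buffer` is still needed because the result is a new string.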
Looks great. Thanks so much, and sorry for the sluggish review loops!
Fixes: #251
This is my attempt at implementing #251
I didn't add an option to enable/disable that feature. Let me know if I should.