-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: rendering issues for migrated content #1371
Conversation
Netlify Deployments: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Device | URL |
---|---|
mobile | https://ocw-hugo-themes-www-pr-1371--ocw-next.netlify.app/ |
Device | URL |
---|---|
mobile | https://ocw-hugo-themes-www-pr-1371--ocw-next.netlify.app/search/ |
Device | URL |
---|---|
mobile | https://ocw-hugo-themes-course-v2-pr-1371--ocw-next.netlify.app/ |
Not what you expected? Are your scores flaky? GitHub runners could be the cause.
Try running on Foo instead
{{- $text := default .title .text | htmlEscape | page.RenderString -}} | ||
{{- $text := default .title .text -}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
htmlEscape
is not required, rather it should be avoided. This is because Studio renders escaped content where required.
RenderString
is unnecessary because markdown is rendered automatically in the relevant shortcode.
{{- if findRE "^(?:http|https|ftp|mailto)" $title -}} | ||
{{- $title = (replace $title "/" "\\/") -}} | ||
{{- end -}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shortcode creates a new anchor tag whenever it sees a URL in the text.
So a markdown of
<a href="https://google.com">https://google.com</a>
produces two anchor tags.
<a href="https://google.com"></a>
<a href="https://google.com">https://google.com</a>
I was unable to figure out why this is the case. My best guess is that the shortcode's output is run through an additional markdown renderer.
Escaping the URL fixes the problem. Though a better solution might still exist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took a look into this and I think I found an explanation. Although, I should preface this with the fact that I was able to revert resource_link.html
to the old version without seeing the two anchor tags like you mentioned. Anyway, here's what I think is going on:
- The content calls the shortcode with the
%
delimiters that tell Hugo that the shortcode will produce markdown that needs to be processed, i.e.{{% resource_link "f24cc1d7-0985-4902-bf27-f9fc8a03e9b4" "Google" %}}
- The shortcode calls
{{- partial "external_resource_link" (merge .Params (dict "text" $title)) -}}
, which calls a partial and not a shortcode - The
external_resource_link
partial then gathers all the properties of the link and calls{{- partial "link" (dict "href" $href "text" $text "class" $className "onClick" $onClick "target" "_blank") -}}
- The
link.html
partial builds and returns an HTML link
Hugo then goes to render the page, and since the original shortcode was called with %
delimiters, it assumes that it is returning markdown. Therefore, the raw HTML link generated by link.html
is processed as markdown, resulting in the behavior you saw. I think in general, it is problematic to have a shortcode that can generate markdown through one code path, and HTML through another. The link.html
partial was created to be able to create links in templates that set more than just the href
and inner text, rather than the limited syntax of markdown links which only let you set those properties. I can think of two solutions here:
- Change the
else
block inresource_link.html
to return a raw HTML link and change all calls toresource_link
in content to use<
/>
delimiters, something like:
{{- else -}}
<a href="{{- .Permalink -}}{{- $anchor_id -}}">{{ $title }}</a>
{{- end -}}
- Create a separate shortcode, maybe
resource_link_external.html
and call that partial instead for external links but also with the<
/>
delimiters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gumaerc, I looked at this again and you're exactly right about the cause.
I was hoping to find something that could mark the partial's output as "does not need markdown rendering" but I don't think such a feature exists in Hugo for {{%
tags.
From the possible solutions you shared above, we might not be able to rely on 2. Because using a new shortcode might require us to create a new UI flow.
I think 1 is our best option, however, it involves many steps. We'd have to:
- Update themes
- Update Studio (i.e. CKEditor syntax because
{{< resource_link...
is not currently supported) - Update content and publish it
This issue with the URL as text is the only glitch I have found. I think we can apply this PR code as a temporary fix while we work on the proper solution. @gumaerc what do you think?
In any case, I'll write up an issue for this.
{{- .context.RenderString (truncate 320 $courseData.course_description) -}} | ||
{{- truncate 320 (.context.RenderString $courseData.course_description) -}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Truncating description before resolving shortcodes breaks the shortcodes. The order has been changes here to first render content (i.e. shortcodes) then truncate the output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. Do you think this will fix this recently reported bug, https://github.com/mitodl/hq/issues/3858 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems this was my fault in #876. @pdpinch I was able to confirm that this will fix https://github.com/mitodl/hq/issues/3858.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All the links functioned properly for me at http://localhost:3000/pages/external-links/.
I was not seeing a proper link in the course description as you mentioned:
Looking at the course.json
in your test course, I saw this at the beginning of the description
field:
"course_description": "{{\\< resource\\_link \"be7e0a90-ad6e-4085-80f2-2b6b6f4a77fe\" \"OCW\" >}}
etc...
I noticed that the opening angle brace and the underscore in resource_link
are escaped. This will cause Hugo to render this as plain text. After removing those escape characters so it looked like this:
"course_description": "{{< resource_link \"be7e0a90-ad6e-4085-80f2-2b6b6f4a77fe\" \"OCW\" >}}
etc...
...then I was able to see the link:
I believe this is an issue in ocw-studio
. Previous to gohugoio/hugo#6703 being resolved and Hugo 0.100.0 being released, it was not possible put shortcodes inside markdown being sent to .RenderString
. Since this is now possible (and is the mechanism being used here to render a link in the description), we likely need to look at the character escaping being done on these fields when they're being written out to course.json
in ocw-studio
.
{{- if findRE "^(?:http|https|ftp|mailto)" $title -}} | ||
{{- $title = (replace $title "/" "\\/") -}} | ||
{{- end -}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took a look into this and I think I found an explanation. Although, I should preface this with the fact that I was able to revert resource_link.html
to the old version without seeing the two anchor tags like you mentioned. Anyway, here's what I think is going on:
- The content calls the shortcode with the
%
delimiters that tell Hugo that the shortcode will produce markdown that needs to be processed, i.e.{{% resource_link "f24cc1d7-0985-4902-bf27-f9fc8a03e9b4" "Google" %}}
- The shortcode calls
{{- partial "external_resource_link" (merge .Params (dict "text" $title)) -}}
, which calls a partial and not a shortcode - The
external_resource_link
partial then gathers all the properties of the link and calls{{- partial "link" (dict "href" $href "text" $text "class" $className "onClick" $onClick "target" "_blank") -}}
- The
link.html
partial builds and returns an HTML link
Hugo then goes to render the page, and since the original shortcode was called with %
delimiters, it assumes that it is returning markdown. Therefore, the raw HTML link generated by link.html
is processed as markdown, resulting in the behavior you saw. I think in general, it is problematic to have a shortcode that can generate markdown through one code path, and HTML through another. The link.html
partial was created to be able to create links in templates that set more than just the href
and inner text, rather than the limited syntax of markdown links which only let you set those properties. I can think of two solutions here:
- Change the
else
block inresource_link.html
to return a raw HTML link and change all calls toresource_link
in content to use<
/>
delimiters, something like:
{{- else -}}
<a href="{{- .Permalink -}}{{- $anchor_id -}}">{{ $title }}</a>
{{- end -}}
- Create a separate shortcode, maybe
resource_link_external.html
and call that partial instead for external links but also with the<
/>
delimiters.
{{- .context.RenderString (truncate 320 $courseData.course_description) -}} | ||
{{- truncate 320 (.context.RenderString $courseData.course_description) -}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems this was my fault in #876. @pdpinch I was able to confirm that this will fix https://github.com/mitodl/hq/issues/3858.
Thanks for the review @gumaerc. I'll work on the feedback. A small note about the comment below.
You're correct about Studio not supporting shortcodes for these fields. Did you download the content from RC? I intended for you to use this repo from my org. (I should've been more clear about this in the instructions.) In practice, we will use |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This issue with the URL as text is the only glitch I have found. I think we can apply this PR code as a temporary fix while we work on the proper solution.
I think this is fine, as long as you create an issue to address the problem at hand.
I was hoping to find something that could mark the partial's output as "does not need markdown rendering" but I don't think such a feature exists in Hugo for
{{%
tags.
This is the whole point of {{<
/ >}}
vs {{%
%}}
, by using the angle brace delimiters when you call the shortcode you are telling it exactly that; that you don't need the shortcode to render markdown. It will indeed be a bit of a pain to address, as we'll need to switch the way resource_link
is called. We will run into the same issues we always have when trying to make a change that affects all content. Most people who make Hugo sites have direct control over their content, so our situation is not likely a common issue. The situation surrounding shortcode delimiters has a lot of history. This is an old article (2019), but it details a lot of the historical problems surrounding this issue: https://oostens.me/posts/hugo-shortcodes-with-markdown-gotchas/. In general I think it's problematic to enforce that a shortcode be called one way or the other. The functionality behind the scenes should be transparent to the "user" (in this case, the content creator) and you should be able to use either one. Like Nelis points out in his article, having different behavior between the two delimiters exposes implementation details to the user. If you look at the ref
shortcode documentation as an example though, they say "Always use the {{% %}}
notation when calling this shortcode." This indicates to me that the ref
shortcode always returns markdown that needs to be processed, so if you call it the other way then unrendered markdown will end up in your content.
From the Hugo documentation on shortcodes:
"Shortcodes using the %
as the outer-most delimiter will be fully rendered when sent to the content renderer. This means that the rendered output from a shortcode can be part of the page’s table of contents, footnotes, etc."
"The < character indicates that the shortcode’s inner content does not need further rendering. Often shortcodes without Markdown include internal HTML."
What they are referring to with that last bit is the ability to call {{ .Inner }}
in a shortcode template, which will give you the inner HTML from the called shortcode like in their example:
{{< myshortcode >}}<p>Hello <strong>World!</strong></p>{{< /myshortcode >}}
Meanwhile what they mention about %
shortcodes also makes sense, as if your shortcode is rendering markdown that contains a header (for example) and your page also has a markdown Table of Contents, you need the shortcode to return markdown in order for that to work. The same goes for the footnote syntax.
In our case with resource_link
I think we're fine to switch to angle brace syntax as resource links will not make use of either the TOC or footnotes markdown syntax. If you look at the ocw-hugo-themes
source code, there are some shortcodes that already require using the angle brace syntax, since they return raw HTML and use the {{ .Inner }}
property, such as:
- Image Gallery
- Tables
- Quizzes
Whether it's worth it or not to switch just to address this one issue, I'm not sure. Relying on string replacement of escape characters doesn't feel great, but neither does the idea of changing the expected resource_link
syntax.
Did you download the content from RC? I intended for you to use this repo from my org. (I should've been more clear about this in the instructions.)
I thought I had grabbed this from your test org, as I have a folder specifically for content from there for testing your PR's, but apparently I cloned this repo from RC inside that folder:
gumaerc@gumaerc-work:~/Code/hussaintaj-mitodl/ht-test-2024$ git remote -v
origin [email protected]:ocw-content-rc/ht-test-2024.git (fetch)
origin [email protected]:ocw-content-rc/ht-test-2024.git (push)
I removed and replaced it with the version from your test org and the description
field is fine now.
To summarize, I think we are okay to merge this PR. Performing the string replacement feels hacky, so I wanted to call it out and explain the situation a little better. I think I've said all there is to say on the matter though, so we need to make a decision about what to do about the resource_link
syntax afterward.
What are the relevant tickets?
Relates to https://github.com/mitodl/hq/issues/3678
Required for mitodl/ocw-studio#2130
Description (What does it do?)
In https://github.com/mitodl/hq/issues/3678, we will migrate old external link data to the new external resource type content. I expect a few issues to arise so I'm adding these fixes proactively.
Please see individual comments below for details of each issue this PR solves.
How can this be tested?
ocw-hugo-themes
project directory.hussaintaj/3678-migrate-data-fixes
.