Use array reduction when cleaning paragraphs #906

JKingweb · 2024-08-23T13:07:05Z

The current code for cleaning non-content paragraphs is straightforward, but brittle against change: one must remember to modify both the list of element counts and totalCount. This change removes the need to modify totalCount (or indeed to set any more variables) by instead having a list of tag names reduce()d into totalCount in one operation. It's slightly less readable, but it seemed worth it to me.

This is more robust if the list of content tags is ever changed, since we no longer need to add terms to totalCount

gijsk

It's a good thing to identify that this code is a little repetitive. However, I think we can juice this particular fruit a little more - I left a comment inline.

gijsk · 2024-08-29T17:24:23Z

Readability.js

+        // We use an array reduction here so the counts of elements are summed
+        // without anyone having to make further code edits if the list of
+        // content tags is changed.
+        var totalCount = contentTags.reduce(function (total, tag) {
+          return total + paragraph.getElementsByTagName(tag).length;
+        }, 0);


So I'm a little lost. Any reason not to use this._getAllNodesWithTag(paragraph, contentTags).length? That would seem easier than hand-rolling the reduce.
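For illustration, here is a minimal sketch of that first suggestion. The standalone `getAllNodesWithTag` below is a hypothetical stand-in for `this._getAllNodesWithTag`, written on the assumption that the real helper returns every descendant matching any of the given tag names; the real method may differ in detail. `countContentElements` is likewise an illustrative name, not something in the PR.

```javascript
// Hypothetical stand-in for this._getAllNodesWithTag (assumption: the real
// helper collects all descendants matching any of the given tag names).
function getAllNodesWithTag(node, tagNames) {
  if (typeof node.querySelectorAll === "function") {
    // A single selector query such as "img,embed,iframe".
    return Array.from(node.querySelectorAll(tagNames.join(",")));
  }
  // Fallback for DOM implementations without querySelectorAll.
  return tagNames.flatMap(function (tag) {
    return Array.from(node.getElementsByTagName(tag));
  });
}

// Total number of content elements in the paragraph, without a
// hand-rolled reduce and without one variable per tag.
function countContentElements(paragraph, contentTags) {
  return getAllNodesWithTag(paragraph, contentTags).length;
}
```

Like the reduce, this needs no further edits when the tag list changes, but it still counts every match even though the caller only cares whether the total is zero.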

The other option, for reasons of speed, would be to iterate over contentTags (without a nested function) and to return false as soon as we find any of the content tags (as the return below will be false as soon as we find any of the elements in question). It might also make sense to put the _getInnerText call first in that case.
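A rough sketch of that early-exit variant might look like the following. The function name `isEmptyParagraph` and the `getInnerText` parameter are hypothetical; `getInnerText` stands in for a call along the lines of `this._getInnerText(paragraph, false)`, and the function is assumed to return true when the paragraph should be removed.

```javascript
// Returns true when the paragraph has no text and contains none of the
// content tags, i.e. when it is safe to remove.
function isEmptyParagraph(paragraph, contentTags, getInnerText) {
  // Check the text first: a paragraph with text is kept no matter
  // which elements it contains.
  if (getInnerText(paragraph)) {
    return false;
  }
  // Plain loop, no nested function: stop at the first content tag found.
  for (var i = 0; i < contentTags.length; i++) {
    if (paragraph.getElementsByTagName(contentTags[i]).length > 0) {
      return false;
    }
  }
  return true;
}
```

Whether the text check or the tag check should come first depends on which is cheaper in practice; the ordering here simply follows the suggestion above.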

Both of those would still have the same benefit in terms of not requiring manual copy-pasting for each kind of inner tag that we're looking for.

JKingweb · 2024-08-30

I see what you mean: I fixed one problem of weird methodology in the original, but not the other. I'll iterate further when I find some time.

Use array reduction when cleaning paragraphs

b03b1eb


gijsk requested changes Aug 29, 2024

JKingweb Aug 30, 2024
