gsub_diff: Compare LookupTypes 2, 3, 4, 7 #455

m4rc1e · 2017-10-04T12:53:47Z

Currently gsub_diff.GsubDiffFinder can only parse and compare LookupType 1 Single subs. We cannot compare more complicated lookups. I'm in the middle of checking a collection of families and discovered this limitation. This pr will add support for the aforementioned lookup types.

My tests use some of the examples mentioned in the .fea spec. I've also added tests for the lookup types this pr does not support; ensuring they don't get parsed incorrectly.

I decided against adding support for LookupTypes 5 and 6 because parsing OT features using regex and .ttx files is rather messy. I'd prefer notodiff and its corresponding modules to get refactored once fontTools implements fonttools/fonttools#468 fully. It can currently 'build' features for not 'unbuild'

I apologise for the size of this pr, its pretty much a complete rewrite.

If you accept this pr, I'd like to add support for finding missing features as well. this pr will only find new/missing rules which is the same behaviour as before.

The current implementation only supports LookupType 1, Single sub

In the previous implementation, only LookupType 1, Single substitutions could be compared. This commit adds support for LookupTypes 2, 3, 4 and 7. LookupTypes 5, 6, 8 are still not supported by this commit. Tests have been added to ensure each LookupType gets parsed correctly. Further tests have been added for each missing LookupType to ensure they are not getting parsed.

dougfelt · 2017-10-11T19:56:51Z

@anthrotype Can you look at this?

anthrotype · 2017-10-12T12:45:33Z

nototools/gsub_diff.py

-            if rule not in rules_a:
-                diffs.append(('+',) + rule)
-        # ('+', 'smcp', 'Q', 'Q.sc')
+        new = [self._format_rule(r, '+') for r in self.find_new_rules()]


I see that refactoring those two for loops into new find_new_rules and find_missing_rules methods make the code more readable, but you are calling the same _get_gsub_rules method twice now, which may be an expensive operation. Perhaps you could compute rules_a and rules_b once and pass them as arguments to the previous methods.. but then again the methods would just become a label for a list comprehension (readable enough from POV). Well, up to you.

anthrotype · 2017-10-12T13:05:33Z

nototools/gsub_diff.py

        for name in re.findall(feature_name_rx, text):
-            contents = re.findall(contents_rx % (name, name), text, re.S)
-            assert len(contents) == 1, 'Multiple %s features in %s' % (


the previous code would assert that only one feature block with a given name exists (which might not be true, at least for handwritten, non-ttxn-generated, feature files, since the spec allows multiple feature blocks with same name).
In your _get_feature_content below you have removed the assert and simply returning the result of re.findall, which may contain more than one match. But you are only retaining the first one, since you do contents[0] later on.
Was this intentional?
If we only support reading lookups from the first feature block that matches, then we need to assert that there's only one. Probably, it would be better to collect lookup rules for all the feature blocks with that name.

anthrotype · 2017-10-12T13:08:57Z

nototools/gsub_diff.py

+                s.append(item)
+        return s
+
+    def _gsub_rule_group_to_string(self, seq):


I find defining a method a bit overkill for such a trivial operation as seq[1:-1], but I don't mind.

m4rc1e · 2017-10-12T13:10:10Z

@anthrotype Thanks for this. We are working on a design proposal for modifying the diff objects. I'm aware of the repetition. We're probably going to split the OT parsing, diffing and reporting into separate objects.

I'd keep this open for now but expect more changes, to these diff modules.

anthrotype · 2017-10-12T13:12:07Z

nototools/gsub_diff.py

+            if op == 'by': # parse LookupType 1, 2, 4
+                parsed.append((feature, left.split(), op, right.split()))
+            elif op == 'from': # parse LookupType 3
+                for glyph in right.split(): # 'a.alt a.sc' -> ['a.alt', 'a.sc']


comment is redundant

anthrotype · 2017-10-12T13:14:49Z

nototools/gsub_diff.py

+                parsed.append((feature, left.split(), op, right.split()))
+            elif op == 'from': # parse LookupType 3
+                for glyph in right.split(): # 'a.alt a.sc' -> ['a.alt', 'a.sc']
+                    parsed.append((feature, left.split(), op, [glyph]))


left.split() shouldn't be necessary since in lookuptype 3 alternate substitutions the format is substitute <glyph> from <glyphclass>; so the left slot can contain only one glyph. A left.strip() would do.

anthrotype · 2019-02-21T17:52:59Z

@m4rc1e status?

m4rc1e · 2019-02-22T12:47:45Z

@anthrotype Unfortunately I've let this rot. I moved away from using ttxn. I don't know whether I should just spend an hour to finish this off?

I'm still looking into otlLibunbuild ;-)

m4rc1e added 2 commits October 4, 2017 13:33

added unit tests for gsub_diff module.

a655a97

The current implementation only supports LookupType 1, Single sub

dougfelt requested a review from anthrotype October 11, 2017 19:56

anthrotype reviewed Oct 12, 2017

View reviewed changes

Base automatically changed from master to main March 1, 2021 20:19

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

gsub_diff: Compare LookupTypes 2, 3, 4, 7 #455

gsub_diff: Compare LookupTypes 2, 3, 4, 7 #455

m4rc1e commented Oct 4, 2017 •

edited

Loading

dougfelt commented Oct 11, 2017

anthrotype Oct 12, 2017

anthrotype Oct 12, 2017

anthrotype Oct 12, 2017

m4rc1e commented Oct 12, 2017

anthrotype Oct 12, 2017

anthrotype Oct 12, 2017

anthrotype commented Feb 21, 2019

m4rc1e commented Feb 22, 2019

gsub_diff: Compare LookupTypes 2, 3, 4, 7 #455

Are you sure you want to change the base?

gsub_diff: Compare LookupTypes 2, 3, 4, 7 #455

Conversation

m4rc1e commented Oct 4, 2017 • edited Loading

dougfelt commented Oct 11, 2017

anthrotype Oct 12, 2017

Choose a reason for hiding this comment

anthrotype Oct 12, 2017

Choose a reason for hiding this comment

anthrotype Oct 12, 2017

Choose a reason for hiding this comment

m4rc1e commented Oct 12, 2017

anthrotype Oct 12, 2017

Choose a reason for hiding this comment

anthrotype Oct 12, 2017

Choose a reason for hiding this comment

anthrotype commented Feb 21, 2019

m4rc1e commented Feb 22, 2019

m4rc1e commented Oct 4, 2017 •

edited

Loading