Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add list of parsers that keep up with this testsuite in CI #141

Open
untitaker opened this issue Jan 16, 2022 · 25 comments
Open

Add list of parsers that keep up with this testsuite in CI #141

untitaker opened this issue Jan 16, 2022 · 25 comments

Comments

@untitaker
Copy link
Contributor

untitaker commented Jan 16, 2022

I want to add some tokenizer testcases, and so far have been field-testing them on html5lib-python, but would like to know if there are any parsers out there that would break on those testcases. Particularly because html5lib-python's test harness doesn't seem to compare error locations, and I'm about to add some tests that are all about error locations.

It would be nice to start collecting parsers that use any part of this testsuite in CI, in some sort of document, e.g. README. That would solve my usecase and also make it easier to discover conformant parsers for various languages.

Another idea would be to add all of those parsers to this repo's CI, but that sounds like too much effort

@untitaker untitaker changed the title Add list of parsers that keep up with this in CI Add list of parsers that keep up with this testsuite in CI Jan 16, 2022
@untitaker
Copy link
Contributor Author

@JKingweb
Copy link

JKingweb commented Mar 9, 2023

For what it's worth my parser does test error locations, and includes patches for tokenizer tests because some of the error locations seem wrong but there doesn't seem to be any interest in ironing that out.

I don't use CI, though, so nothing would break immediately. I also don't use submodules, so it's not included in that Sourcegraph search.

@flavorjones
Copy link
Contributor

👋 Nokogiri (https://github.com/sparklemotion/nokogiri), a Ruby library that includes HTML5 parsing, uses this test suite. We do regular upstream builds against this repository's master branch, so we would catch any breakage in a day or two.

I'd be happy to do the legwork to add Nokogiri's test suite in a "downstream" CI job on this repository if that would be useful.

@sideshowbarker
Copy link
Contributor

👋 Nokogiri (sparklemotion/nokogiri), a Ruby library that includes HTML5 parsing, uses this test suite. We do regular upstream builds against this repository's master branch, so we would catch any breakage in a day or two.

I'd be happy to do the legwork to add Nokogiri's test suite in a "downstream" CI job on this repository if that would be useful.

That sounds great. If you can provide the details about what would be necessary upstream here in this repo, I can work with you to get the necessary changes landed here.

Assuming we’d want to work from a PR: I’m a member of the html5lib org — though not full admin or maintainer perms — and I have perms to approve and merge PRs (write/push perms).

But if any changes to the settings for the repo are needed — in contrast to what we could get done in a PR — then at that point we’d need to have somebody with admin/owner perms make any necessary settings changes.

@untitaker
Copy link
Contributor Author

Is this something we want to constrain to a handful of parsers? I am more than happy to also add a job for my parser html5gum but it pales in relevance compared to nokogiri.

I would add all of those CI jobs as non-required, because parsers can indeed be sometimes broken and updating the testsuite shouldn't be blocked on fixing them. As such I don't think repo settings changes are required.

@sideshowbarker
Copy link
Contributor

Is this something we want to constrain to a handful of parsers?

As far as I understand it, No, we have no reason to constrain it to just a particular/small number of parsers.

I am more than happy to also add a job for my parser html5gum but it pales in relevance compared to nokogiri.

I think the relevance shouldn’t be a factor in deciding. It seems to me the only factors in deciding should be: whether the parser project has somebody willing to contribute the code for running CI for their parser — and that CI for their parser is passing rather than failing — and whether they can commit to continuing to keeping that up to date.

But even in the worst case if somebody no longer has time to maintain their CI contribution here — and their CI is failing rather than passing — then we just yank it out here and quit running it.

But in general, in my opinion at least, the more, the merrier (running more parsers through the test suite here under CI is better for html5lib than running fewer).

@sideshowbarker
Copy link
Contributor

For what it's worth my parser does test error locations, and includes patches for tokenizer tests because some of the error locations seem wrong but there doesn't seem to be any interest in ironing that out.

I don't use CI, though, so nothing would break immediately. I also don't use submodules, so it's not included in that Sourcegraph search.

If you don’t have existing CI set up but could help write something to run under a GitHub Actions workflow here, than that would be great.

@untitaker
Copy link
Contributor Author

I have a PR here to add my parser: #152

for some reason it doesn't show any status

it does run as part of GHA within my own fork: https://github.com/untitaker/html5lib-tests/actions/runs/4386180990/jobs/7679878268

@sideshowbarker
Copy link
Contributor

I have a PR here to add my parser: #152

for some reason it doesn't show any status

it does run as part of GHA within my own fork: untitaker/html5lib-tests/actions/runs/4386180990/jobs/7679878268

I guess that’s most likely due to some repository setting here. I’ll try to get it figured out.

@flavorjones
Copy link
Contributor

flavorjones commented Mar 13, 2023

@sideshowbarker Github Actions won't run a new workflow unless that workflow file already exists on the default branch.

Let me ship a PR with an skeleton "downstream" workflow file that you can review and merge, and then @untitaker and I (and others) can modify that workflow in subsequent PRs? I think I can do that tonight.

@flavorjones
Copy link
Contributor

I've created #153 as that skeleton workflow file.

@sideshowbarker
Copy link
Contributor

@sideshowbarker Github Actions won't run a new workflow unless that workflow file already exists on the default branch.

Aha, that makes sense

Let me ship a PR with an skeleton "downstream" workflow file that you can review and merge, and then @untitaker and I (and others) can modify that workflow in subsequent PRs?

I've created #153 as that skeleton workflow file.

Super — thanks, now merged

@untitaker
Copy link
Contributor Author

Thank you for setting this up, I have updated #152 to look like #154

@fb55
Copy link
Contributor

fb55 commented Mar 15, 2023

Awesome idea! I've opened a PR for parse5 at #155.

@untitaker
Copy link
Contributor Author

hello @sideshowbarker, any update on this? I think the three above PRs are pretty much ready to go

@flavorjones
Copy link
Contributor

@untitaker I wanted to check if you were still considering merging the downstream test suites at #152, #154, and #155? Do you have any concerns that we can address?

@untitaker
Copy link
Contributor Author

untitaker commented May 20, 2023 via email

@untitaker
Copy link
Contributor Author

untitaker commented May 21, 2023

ping @gsnedders @sideshowbarker again, can somebody make a decision on whether we want to merge above PRs for CI?

@flavorjones
Copy link
Contributor

@untitaker Sorry about that -- I meant to ping @sideshowbarker, sorry for the notification noise.

@flavorjones
Copy link
Contributor

@sideshowbarker Are you still open merging the downstream test suites at #152, #154, and #155? Do you have any concerns that we can address?

@flavorjones
Copy link
Contributor

@sideshowbarker Are you still open to merging #152, #154, and #155? Do you have any concerns with this approach, or those PRs, that we should address?

@sideshowbarker
Copy link
Contributor

I have no objections to merging them and I’ve now approved them all — but the problem continues to be that every PR has a stalled “lint Expected — Waiting for status to be reported” check, and I don’t have sufficient perms in this repo to force-merge the PRs without the check being green.

@untitaker
Copy link
Contributor Author

I've merged master into my branch, that should fix it. I think maintainers can merge master into arbitrary PRs to resolve this too

@fb55
Copy link
Contributor

fb55 commented Aug 9, 2023

I've updated #155 as well. Happy to see this move forward!

@flavorjones
Copy link
Contributor

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants