apply one-sentence-per-line rule for rst files via github workflow. #4996

fujitatomoya · 2025-02-05T05:56:56Z

follow-up of #4990, that adds
please make sure to write only one sentence per line as it makes reviewing and modifying your file much easier.

this is still good, but without checking it by system, eventually we could miss and extra iteration.
having this as custom rule using python script, possibly support this requirement and maintenance & developer friendly environment.

this can be still draft, there could be other way to explore. (AFAIK, sphinx-lint does not support this kind of rule.) any comments and suggestions are welcome 🤔

github-actions · 2025-02-05T06:00:10Z

HTML artifacts: https://github.com/ros2/ros2_documentation/actions/runs/13151376674/artifacts/2538770703.

To view the resulting site:

Click on the above link to download the artifacts archive
Extract it
Open html-artifacts-4996/index.html in your favorite browser

Signed-off-by: Tomoya Fujita <[email protected]>

fujitatomoya · 2025-02-05T06:23:59Z

I did some test if this actually fails the github workflow as following test patch.

diff --git a/source/Citations.rst b/source/Citations.rst
index a433d9fc..2f4e3a27 100644
--- a/source/Citations.rst
+++ b/source/Citations.rst
@@ -3,6 +3,13 @@
 Citations
 =========

+This is a test.
+This is a test. This should fail.
+Is this a test?
+Is this a test? Shoud this fail?
+This is a test!
+This is a test! This should fail!
+
 If you use ROS 2 in your work please cite the 2022 Science Robotics paper `Robot Operating System 2: Design, architecture, and uses in the wild <https://www.science.org/doi/10.1126/scirobotics.abm6074>`_.

     | S. Macenski, T. Foote, B. Gerkey, C. Lalancette, W. Woodall, "Robot Operating System 2: Design, architecture, and uses in the wild," Science Robotics vol. 7, May 2022.

and it works as expected.

.github/workflows/rst_check.yml

fujitatomoya · 2025-02-05T06:27:48Z

scripts/sentence_line_checker.py

+    # this assumes a sentence ends with a period, question mark, or exclamation mark.
+    sentence_endings = re.findall(r'[.!?]', line)


Could be more rules applied here? i would like to have some feedback for more use cases.

One way to find out would be to run it on all files and see.

Hmm one use case I can think of is for example using ellipses ....
As this is right now this f() would treat it as multiple sentences instead of just one.

Another not so strange use case could be using multiple exclamation marks: "This is one sentence!!!"

I suggest slightly changing the regex to allow multiple appearances. Well, at least that would solve the problems above.

Suggested change

# this assumes a sentence ends with a period, question mark, or exclamation mark.

sentence_endings = re.findall(r'[.!?]', line)

# this assumes a sentence ends with a period, question mark, or exclamation mark.

sentence_endings = re.findall(r'[.!?]+', line)

An small example applying above change
Notice that the last 3 sentences do not cause the test to fail, as expected.

fujitatomoya · 2025-02-05T06:29:31Z

@kscottz @christophebedard got some spare time for this, what do you think? would love to hear some opinions 😃

fujitatomoya · 2025-02-05T06:37:01Z

scripts/sentence_line_checker.py

+    if violations:
+        print(f"\n⚠️ Found {len(violations)} violations: One sentence per line is required.")
+        for lineno, line in violations:
+            print(f"  ❌ Line {lineno}: {line}")


hmmm, this is not user-friendly output, it would be better to output original file name and line to be fixed.

see my comment about annotations: #4996 (comment)

clalancette · 2025-02-05T14:22:13Z

@kscottz @christophebedard got some spare time for this, what do you think? would love to hear some opinions 😃

I think this is a great idea, for what it is worth. See also #4592, which attempts to do a similar thing.

If we do this, here are a few thoughts:

We need to have a way to run this locally. It looks like this PR accomplishes that, but it is an important use-case.
The output needs to point to exactly which line and exactly which file caused the violation, without too much other output. Otherwise contributors tend to be confused about where the problem is (we have that problem in rosdistro a lot).
We have tons of existing violations of this, so this definitely should only be done on new changes for now. This PR also accomplishes that, so that is great.
We may need a way to have exceptions to the rule. There are some places, particularly in bullet-points, where we may need to have multiple sentences per line.

christophebedard · 2025-02-05T15:10:01Z

I think this is good if it doesn't have many false positives.

The output needs to point to exactly which line and exactly which file caused the violation, without too much other output. Otherwise contributors tend to be confused about where the problem is (we have that problem in rosdistro a lot).

I don't want to increase the initial scope of this tool, but this would be a nice use-case for PR file annotations. It doesn't seem that complex to set up: https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/workflow-commands-for-github-actions#setting-an-error-message

fujitatomoya · 2025-02-05T16:52:44Z

@clalancette @christophebedard #4592 looks good too.

plugins/ros_checkers.py of Sentence checker #4592, tries to address in different design that 1st tries to split the line into sentences, this could be much complicated than i am trying to here. i am not even sure which one is better, one thing i am concerned is false alarms.

One way to find out would be to run it on _all) files and see.

yeah, that would be good practice. i will try with all .md and .rst files to see if what is gonna happen. probably we can have 2 cases, the one is to run the checker for all .md and .rst files, the other is to run the checker for only changes part via workflow.

We need to have a way to run this locally. It looks like this PR accomplishes that, but it is an important use-case.

1st i wanted to try with github workflow, but yes definitely this should be done locally. i would consider Makefile target for local check.

The output needs to point to exactly which line and exactly which file caused the violation, without too much other output.

agree. thanks for the comment and info. the thing is that we are doing here is checking the only changed part via git, that means we need to implement this functionality in the python, i guess. i will try to give it a shot.

We have tons of existing violations of this, so this definitely should only be done on new changes for now. This PR also accomplishes that, so that is great.

totally agree. i was considering 2 different perspectives. as maintainer's one, we should probably run the checker for all .md and .rst files once in a while (including updating rules). as contributor's aspect, they only want to focus on what has been changed.

We may need a way to have exceptions to the rule. There are some places, particularly in bullet-points, where we may need to have multiple sentences per line.

related to doing One way to find out would be to run it on _all) files and see., i will see how complicated it would be... 😓 probably.

i also would love to hear #4592 (comment)

kscottz · 2025-02-05T17:15:48Z

Tomoya you've made me the happiest women in the world and I love it. I'll take a look this afternoon.

DLu · 2025-02-06T03:41:48Z

What is the overhead to add a pip dependency to the ros2_documentation tool chain? I'd much rather write a sentence-parser package with a bunch of tests than just check code in here.

fujitatomoya · 2025-02-06T05:09:25Z

What is the overhead to add a pip dependency to the ros2_documentation tool chain?

i did not add any pip packages here, just simply add python script to detect that. i think adding pip dependency should be okay to ros2_documentation.

DLu · 2025-02-06T16:46:56Z

Sorry, to be clear, I was proposing rather than adding a bunch of sentence checking logic here, I want to create a separate sentence checking logic Python project, and upload it to pypi so that it can be depended on here.

kscottz · 2025-02-06T18:02:19Z

Let me preface this by saying I don't want great to be the enemy of good here. If this works, let's roll with it and we can always circle back. Perhaps we can get a rough set of ideas together and then find a summer GSoC student to work this all into an ament tool down the road?

Having said that, I am going to agree with @DLu on this, what if we just push all of this stuff down into Python package so we could (1) use that package as part of an ament lint utility that could deliver the same functionality to package maintainers (2) use it locally, and (3) use it part of CI task?

@mjcarroll can we add this to the ROS 2 meeting for next week? It seems like rolling this and #4997 into ament lint would be a great GSoC summer project. With the wiki being deprecated it seems like improved tooling and some guidance for package maintainers would be really useful right about now.

mjcarroll · 2025-02-07T00:44:21Z

Added to agenda. Thanks.

Santti4go · 2025-02-12T02:13:01Z

scripts/sentence_line_checker.py

+    """
+    Check only changed lines for violations.
+
+    .. warning:: It checks only added / modified lines for the viaolation in the file,


Suggested change

.. warning:: It checks only added / modified lines for the viaolation in the file,

::warning:: It checks only added / modified lines for the violation in the file,

Santti4go · 2025-02-12T02:50:49Z

scripts/sentence_line_checker.py

+    # this assumes a sentence ends with a period, question mark, or exclamation mark.
+    sentence_endings = re.findall(r'[.!?]', line)


Hmm one use case I can think of is for example using ellipses ....
As this is right now this f() would treat it as multiple sentences instead of just one.

Another not so strange use case could be using multiple exclamation marks: "This is one sentence!!!"

I suggest slightly changing the regex to allow multiple appearances. Well, at least that would solve the problems above.

Suggested change

# this assumes a sentence ends with a period, question mark, or exclamation mark.

sentence_endings = re.findall(r'[.!?]', line)

# this assumes a sentence ends with a period, question mark, or exclamation mark.

sentence_endings = re.findall(r'[.!?]+', line)

An small example applying above change
Notice that the last 3 sentences do not cause the test to fail, as expected.

fujitatomoya requested review from clalancette and audrow as code owners February 5, 2025 05:56

fujitatomoya force-pushed the fujitatomoya/one-sentence-per-line branch from 96762ad to 621e2a2 Compare February 5, 2025 05:58

fujitatomoya force-pushed the fujitatomoya/one-sentence-per-line branch 2 times, most recently from 1ca12ba to 3edda48 Compare February 5, 2025 06:05

apply one-sentence-per-line rule for rst files via github workflow.

bcf5d90

Signed-off-by: Tomoya Fujita <[email protected]>

fujitatomoya force-pushed the fujitatomoya/one-sentence-per-line branch from 3edda48 to bcf5d90 Compare February 5, 2025 06:13

fujitatomoya changed the title ~~one-sentence-per-line github workflow test.~~ apply one-sentence-per-line rule for rst files via github workflow. Feb 5, 2025

fujitatomoya commented Feb 5, 2025

View reviewed changes

fujitatomoya requested review from kscottz and christophebedard February 5, 2025 06:28

fujitatomoya commented Feb 5, 2025

View reviewed changes

fujitatomoya mentioned this pull request Feb 7, 2025

Clean up sentences (white space changes only) #5001

Merged

Santti4go reviewed Feb 12, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

apply one-sentence-per-line rule for rst files via github workflow. #4996

apply one-sentence-per-line rule for rst files via github workflow. #4996

fujitatomoya commented Feb 5, 2025 •

edited

Loading

github-actions bot commented Feb 5, 2025 •

edited

Loading

fujitatomoya commented Feb 5, 2025

fujitatomoya Feb 5, 2025

christophebedard Feb 5, 2025 •

edited

Loading

Santti4go Feb 12, 2025 •

edited

Loading

fujitatomoya commented Feb 5, 2025

fujitatomoya Feb 5, 2025

christophebedard Feb 5, 2025

clalancette commented Feb 5, 2025

christophebedard commented Feb 5, 2025

fujitatomoya commented Feb 5, 2025

kscottz commented Feb 5, 2025

DLu commented Feb 6, 2025

fujitatomoya commented Feb 6, 2025

DLu commented Feb 6, 2025

kscottz commented Feb 6, 2025

mjcarroll commented Feb 7, 2025

Santti4go Feb 12, 2025

Santti4go Feb 12, 2025 •

edited

Loading

		# this assumes a sentence ends with a period, question mark, or exclamation mark.
		sentence_endings = re.findall(r'[.!?]', line)

	.. warning:: It checks only added / modified lines for the viaolation in the file,
	::warning:: It checks only added / modified lines for the violation in the file,

apply one-sentence-per-line rule for rst files via github workflow. #4996

Are you sure you want to change the base?

apply one-sentence-per-line rule for rst files via github workflow. #4996

Conversation

fujitatomoya commented Feb 5, 2025 • edited Loading

github-actions bot commented Feb 5, 2025 • edited Loading

fujitatomoya commented Feb 5, 2025

fujitatomoya Feb 5, 2025

Choose a reason for hiding this comment

christophebedard Feb 5, 2025 • edited Loading

Choose a reason for hiding this comment

Santti4go Feb 12, 2025 • edited Loading

Choose a reason for hiding this comment

fujitatomoya commented Feb 5, 2025

fujitatomoya Feb 5, 2025

Choose a reason for hiding this comment

christophebedard Feb 5, 2025

Choose a reason for hiding this comment

clalancette commented Feb 5, 2025

christophebedard commented Feb 5, 2025

fujitatomoya commented Feb 5, 2025

kscottz commented Feb 5, 2025

DLu commented Feb 6, 2025

fujitatomoya commented Feb 6, 2025

DLu commented Feb 6, 2025

kscottz commented Feb 6, 2025

mjcarroll commented Feb 7, 2025

Santti4go Feb 12, 2025

Choose a reason for hiding this comment

Santti4go Feb 12, 2025 • edited Loading

Choose a reason for hiding this comment

fujitatomoya commented Feb 5, 2025 •

edited

Loading

github-actions bot commented Feb 5, 2025 •

edited

Loading

christophebedard Feb 5, 2025 •

edited

Loading

Santti4go Feb 12, 2025 •

edited

Loading

Santti4go Feb 12, 2025 •

edited

Loading