Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Debugger tweaks, added tracing #1080

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open

Conversation

rlpowell
Copy link

@rlpowell rlpowell commented Feb 5, 2025

  • Fixed debugger argument handling
  • Expanded debugger behaviour a bit (like being able to continue a bunch of times)
  • Added support for tracing the productions of a grammar during parsing, in both the debugger and normal modes

Commentary on this as a PR:

  • Just for your interest: I really needed to add tracing to debug why what is probably the largest PEG grammar in existence ( https://github.com/lojban/ilmentufa/blob/master/camxes.peg ) is parsing things very differently with Pest vs. the other parsers we've used
  • I've never done macros before and I've never worked with pest before; if I should do something differently, let me know
  • It was not at all obvious to me how to run all the tests in all the places in this repo, so I didn't; please guide me. Unless the PR process already does the tests, I guess.
  • Similarly, I'd love to set up a test that runs a simple grammar through both the debugger and the regular flow and shows that their tracing output is the same, but I don't really have any good ideas for how to do that besides a shell script.
  • This is kind of two things in one, but they're tied together so I'd rather not split them unless it's a big deal
  • My changes to the debugger had a mid-air collision with Jonas's, sorry about that; I believe my changes do all the things they added, though

Summary by CodeRabbit

  • New Features

    • Introduced specialized parser debugger and optional tracing output.
    • Enhanced command-line interface with new options for grammar, input, session files, and tracing configurations.
    • Added tracing capabilities to the parser, allowing for detailed debugging and analysis.
  • Documentation

    • Updated user guides to include a new tracing section with examples, detailing how to enable and leverage the improved tracing features.

@rlpowell rlpowell requested a review from a team as a code owner February 5, 2025 01:50
@rlpowell rlpowell requested review from tomtau and removed request for a team February 5, 2025 01:50
Copy link
Contributor

coderabbitai bot commented Feb 5, 2025

Walkthrough

This pull request introduces extensive tracing capabilities across the project. It updates documentation by adding new features and sections, refines the debugger to incorporate tracing options into its context and CLI handling via the clap library, and enhances code generation to support tracing configurations. Additionally, modifications in the parser and virtual machine components ensure that tracing is integrated into parsing routines, with new configurations, wrappers, and extended dependency specifications.

Changes

File(s) Change Summary
README.md, derive/src/lib.rs Updated documentation: Added “Specialized parser debugger”, “Optional tracing output”, new tracing section, header hierarchy adjustment, and updated derive attributes to include tracing.
debugger/Cargo.toml, debugger/src/lib.rs, debugger/src/main.rs Improved debugger: Added new dependency (clap), introduced tracing configuration field and methods in DebuggerContext, and refactored CLI handling with new structs and increased timeout.
generator/Cargo.toml, generator/src/generator.rs, generator/src/macros.rs, generator/src/parse_derive.rs Enhanced generator: Updated dependency for syn, added a tracing_config parameter to generation functions, modified macros (additional parameter for implicit), and updated parsing to extract and handle tracing attributes.
pest/Cargo.toml, pest/src/lib.rs, pest/src/parser_state.rs, pest/src/iterators/queueable_token.rs Upgraded parser: Added new dependencies (quote, proc-macro2), introduced TracingConfig and TracingType, updated exports and parser state creation with tracing wrappers, and fixed a minor documentation typo.
vm/src/lib.rs Modified virtual machine: Updated parsing flow to use parse_with_tracing, integrating tracing configurations into rule processing and error handling.

Suggested reviewers

  • NoahTheDuke
  • tomtau

Poem

I’m a little rabbit, hopping with cheer,
New tracing trails shine bright and clear,
Debuggers and parsers now dance in sync,
Code leaps forward in just a blink,
With CLI magic and a trace so fine,
Our project’s hopping on a brilliant line!
🐰✨

Tip

🌐 Web search-backed reviews and chat
  • We have enabled web search-based reviews and chat for all users. This feature allows CodeRabbit to access the latest documentation and information on the web.
  • You can disable this feature by setting web_search: false in the knowledge_base settings.
  • Please share any feedback in the Discord discussion.

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (8)
debugger/src/main.rs (3)

18-25: Avoid panics in user-facing paths.

The usage of panic!() (line 18) is generally discouraged in non-testing code. Converting it to a graceful error return, or properly reporting errors via Result or logging, would enhance resilience.


81-128: Use built-in help and unify subcommand usage.

Defining subcommands for grammar, input, and breakpoints simplifies the CLI. The custom Help {} subcommand offers help text, but consider letting Clap handle it automatically to reduce manual maintenance and ensure consistency with built-in usage patterns.


326-326: Thirty-second timeout for requests is a reasonable fallback.

Increasing from 5s to 30s for network requests can significantly improve reliability for slower hosts. Still, consider user-configurable timeouts for maximum flexibility.

vm/src/lib.rs (3)

93-99: Graceful early return on forced termination.

You check if the listener requests stopping by returning Err(ParserState::new(...). This is a neat mechanism, though the default error remains somewhat generic. Consider adding a more specific message if user intervention is the reason for termination.


124-136: Comprehensive range matching is sys-tematically traced.

Matching digits and letters with trace wrappers ensures full coverage in debugging. This is correct, although the repeated .or_else(...) might reduce clarity; consider factoring to smaller, dedicated methods if it grows further.


357-389: Complex skip logic merges whitespace/comments.

The nesting with trace_wrapper inside sequence is thorough. Code remains correct but is large. Splitting out subroutines or macros could improve readability for future maintenance.

pest/src/parser_state.rs (1)

24-26: Using std::println! might be too intrusive.
Relying on println! for tracing output can clutter STDOUT. Consider a logging approach (e.g., a configurable logger) to avoid mixing parser messages with user-facing console output.

pest/Cargo.toml (1)

34-35: Addition of quote and proc-macro2 Dependencies
These new dependencies are common and useful for procedural macro expansion and code generation tasks—likely supporting the enhanced tracing features. Please confirm that these libraries are needed unconditionally; if they are only used in tracing contexts, you might consider gating them behind a feature flag for optimal dependency management.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a13c05c and 3e5c66a.

📒 Files selected for processing (14)
  • README.md (1 hunks)
  • debugger/Cargo.toml (1 hunks)
  • debugger/src/lib.rs (6 hunks)
  • debugger/src/main.rs (9 hunks)
  • derive/src/lib.rs (2 hunks)
  • generator/Cargo.toml (1 hunks)
  • generator/src/generator.rs (10 hunks)
  • generator/src/macros.rs (1 hunks)
  • generator/src/parse_derive.rs (5 hunks)
  • pest/Cargo.toml (1 hunks)
  • pest/src/iterators/queueable_token.rs (1 hunks)
  • pest/src/lib.rs (2 hunks)
  • pest/src/parser_state.rs (6 hunks)
  • vm/src/lib.rs (4 hunks)
✅ Files skipped from review due to trivial changes (1)
  • pest/src/iterators/queueable_token.rs
🔇 Additional comments (56)
generator/src/macros.rs (3)

12-12: Consider verifying the boolean nature of the second macro parameter.

Passing false for $implicit here works fine as an example, but notice $implicit: ident doesn’t inherently enforce a boolean type. If you accidentally pass a non-boolean token, the macro won’t catch it. A more explicit naming or usage pattern might improve safety and clarity.


18-25: Macro signature expansion looks consistent.

By adding $implicit to the signature and forwarding it to state.trace_wrapper, you’ve cleanly integrated the new parameter. No immediate functional problems are apparent. Ensure that call sites always match the updated signature to avoid compile-time errors.


33-40: Maintain consistency across macro variants under different configurations.

The non-std macro behavior mirrors the std version, which is good. If future changes occur, ensure both versions remain aligned to prevent configuration drift.

debugger/src/main.rs (6)

35-79: Clap-based CLI initialization looks clear and organized.

The fields for grammar, input, session, and tracing are neatly structured. This is a user-friendly approach that clarifies which options are available. Nice job removing custom argument parsing in favor of Clap’s built-in functionality.


151-159: Flexible tracing selection is a welcome addition.

The code properly distinguishes among PegViz, Indented, and "none," though the fallback to Indented for "none" is slightly surprising. Double-check whether that’s truly intended behavior; if not, consider a distinct approach or a no-op for "none."


226-233: Long help rendering is well-handled.

Calling ClapRLArgs::command().render_long_help() is neat. This approach is user-friendly, but ensure that it’s tested thoroughly to avoid issues with large help texts or edge cases from repeated calls.


315-317: Prompt for session data reusability improved.

Storing session_file in a dedicated variable helps keep logic straightforward. Ensure that writing/reading from the session file is robust in error scenarios, such as locked or read-only files.


347-347: Runtime helper injection looks tidy.

You properly set the rustyline helper after constructing CliHelper. No issues found.


351-397: Graceful error messages on load failure.

Printing an explicit message when history or file loading fails is great for UX. Ensure future expansions of error handling remain consistent.

vm/src/lib.rs (11)

21-21: Imports for tracing.

Adding TracingConfig is essential for hooking up the new tracing features. This aligns well with the broader changes.


71-73: Reuse existing parse logic in parse_with_tracing.

Delegating from parse to parse_with_tracing avoids code duplication. This is a clean approach.


75-83: Configuration-based parsing extension.

state_with_tracing is used effectively to incorporate user-defined tracing preferences. Watch for potential performance implications when toggling heavy tracing in large grammars.


100-122: Extensive trace_wrapping covers built-in rules.

From ANY to PEEK_ALL, the approach is consistent. Carefully ensure the repeated false for implicit is correct for all.


138-149: Case merging for ASCII alpha checks is correct.

By or-ing the ranges a..z and A..Z, you handle either case seamlessly. The tracing approach remains consistent across these rules.


151-157: Combining numeric ranges fosters clarity.

Chaining multiple .or_else statements is an understandable pattern for ASCII alphanumeric. Maintain caution about potential performance overhead if the grammar expands significantly.


159-161: Matches ASCII range.

Using '\x00'..'\x7f' is valid for ASCII. This addition is well-documented via trace_wrapper.


164-170: Cross-platform line break support.

Catching \n, \r\n, and \r covers different OS line endings. The fallback approach is robust, and the trace wrapping is consistent.


177-185: Silent or atomic states for whitespace/comment.

Using RuleType to categorize whitespace/comment logic helps maintain clarity. Good job ensuring consistent trace flags (true for implicit).


325-335: “skip” rule applies repeated whitespace usage.

trace_wrapper("skip", true, false, ...) is used to label implicit whitespace, matching your new tracing style. Just verify performance if invoked repeatedly on large inputs.


343-349: Consistent approach for comment skipping.

Logic parallels the whitespace skipping. Repeated patterns keep the code uniform.

generator/src/generator.rs (17)

38-38: Extracting tracing config is a clean approach.
Retrieving tracing_config from parsed_derive aligns well with the new tracing feature, ensuring the generator respects user-defined tracing settings.


80-80: Importing TracingType is good.
No issues with adding the use ::pest::TracingType; statement. This keeps the code more explicit about the tracing dependency.


88-90: Switch to state_with_tracing for tracing integration.
Replacing the direct ::pest::state call with ::pest::state_with_tracing correctly injects the tracing configuration. This helps unify standard parsing and traced parsing in one function.


307-312: Marking WHITESPACE and COMMENT as implicit is clear.
Setting the implicit flag when the rule name is "WHITESPACE" or "COMMENT" is straightforward for ignoring them in tracing mode, improving readability of traced logs.


320-323: Atomic rule wrapped with trace support.
Using trace_wrapper before atomic(::pest::Atomicity::Atomic, ...) ensures consistent tracing coverage.


337-340: Trace wrapper for normal rules.
Wrapping normal rules in trace_wrapper provides uniform tracing outputs across atomic and non-atomic rules.


348-350: Silent rule traced or optionally excluded.
Passing true for silent to trace_wrapper matches the new skip logic in TracingConfig.


357-361: Consistent atomic tracing usage.
Ensuring full coverage of atomic rules under trace_wrapper preserves the new debugging approach.


370-374: CompoundAtomic integrated with tracing.
Same pattern as the atomic variant, continuing consistent tracing for compound-atomic rules.


383-387: NonAtomic traced similarly.
Maintaining the same logic flow for NonAtomic ensures minimal divergence in how each rule variant is traced.


400-400: Adjusting skip rule for no whitespace/comment.
This specialized branch better reflects the new skipping logic, making traced output simpler.


403-403: Skip rule with atomicity check (whitespace only).
The updated macro call clarifies that we always pass true for the skip’s implicit parameter, consistent with the adjacency logic for whitespace.


410-410: Skip with comment logic.
Similarly extends the approach for comment skipping.


412-412: Combined whitespace/comment skip.
Allows the parser to handle both cases in an all-encompassing skip rule under tracing.


513-515: Introducing tracing_config in ParserState.
Storing the tracing configuration in the parser state ensures that all parsing methods can reference and respect the user’s desired tracing behavior.


596-596: Potential breaking change in ParserState::new signature.
Changing the constructor to accept tracing_config might break existing code that used the original signature with fewer parameters. Consider maintaining a separate constructor or a default-based fallback to preserve backward compatibility.


607-609: Field initialization for tracing.
Revising the struct initializer for parse_attempts, tracing_config, tracing_depth is coherent, seamlessly integrating the new tracing feature.

pest/src/parser_state.rs (3)

80-101: TracingType enum introduction is well-structured.
The enum clearly captures multiple tracing styles (None, Indented, PegViz). This extension is straightforward and easy to maintain.


102-159: TracingConfig is coherent and easily extensible.
Defining skip flags (skip_implicit, skip_silent) and spacing for indentation offers flexible control over how trace information is presented. Implementing Default is a good practice for fallback usage.


533-536: state_with_tracing integration.
Switching the standard state(...) to state_with_tracing(...) ensures that any parse closure can seamlessly incorporate optional tracing data rather than forcing an entirely separate parser path.

generator/src/parse_derive.rs (5)

12-12: Imports for new tracing constructs.
Bringing in TracingConfig and TracingType underscores integration of the new tracer into the derive logic.


29-29: New tracing_config field in ParsedDerive.
Allowing the macro input to supply user-specific tracing settings is a logical extension of the struct.


52-52: Collecting and parsing #[tracing(...)] attributes.
These lines transparently retrieve tracing-related attributes from the AST, forwarding them to the custom parser function. This maintains a clear separation of concerns.

Also applies to: 55-61


79-79: Renaming attribute processing function is consistent.
The new attr_to_grammar_source name better reflects its role in extracting grammar sources.


98-160: Robust attr_to_tracing_config logic.
This function neatly supports multiple tracing directives (Indented, PegViz, etc.) with optional parameters like spacing, verifying them case-insensitively. Overall, it’s straightforward and easy to extend.

derive/src/lib.rs (2)

55-88: Well-documented tracing functionality!

The documentation clearly explains the tracing feature with good examples and formatting. It covers all tracing options including indentation, PegViz output, and rule skipping configurations.


352-352: LGTM: Added tracing attribute to proc macro.

The addition of the 'tracing' attribute to the proc macro aligns with the new tracing functionality.

pest/src/lib.rs (2)

65-101: Documentation matches derive/src/lib.rs.

The tracing documentation is consistent with the documentation in derive/src/lib.rs, which is good for maintaining consistency across the codebase.


369-370: LGTM: Added tracing exports.

The addition of state_with_tracing, TracingConfig, and TracingType to the public exports provides the necessary types for the tracing functionality.

debugger/src/lib.rs (4)

136-136: LGTM: Added tracing configuration field.

The addition of tracing_config field to DebuggerContext enables tracing support.


216-233: Well-structured tracing configuration methods.

The new methods for configuring tracing are well-documented and provide a clean API for setting up tracing options.


294-294: LGTM: Updated parsing to use tracing configuration.

The handle method now correctly uses parse_with_tracing with the tracing configuration.


389-389: LGTM: Added default tracing configuration.

The default implementation correctly initializes the tracing configuration.

generator/Cargo.toml (1)

29-29: LGTM: Added extra-traits feature to syn dependency.

The addition of the "extra-traits" feature to the syn dependency is appropriate for supporting the enhanced tracing functionality.

debugger/Cargo.toml (1)

20-20: New Clap Dependency Added
The introduction of the clap dependency (version "4.0.32" with the derive feature) is well aligned with the CLI refactoring objectives. This upgrade should improve command-line argument handling within the debugger module.

README.md (1)

164-166: Documentation Enhancement for New Features
The README now documents the newly added "Specialized parser debugger" and "Optional tracing output" features. This update clearly informs users about the debugger enhancements and tracing capabilities. Ensure that the descriptions remain consistent with the implemented behavior in the CLI and parser modules.

Copy link
Contributor

@tomtau tomtau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR contains many semver-breaking changes -- I suggest either changing it to be major version non-breaking or putting it under a feature flag

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
generator/src/parse_derive.rs (1)

102-102: Remove debug println statements.

Debug print statements should be removed before merging.

-        println!("attr");
-        println!("indent: {:#?}", 2);

Also applies to: 116-117, 120-121

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3e5c66a and 1f05060.

📒 Files selected for processing (14)
  • README.md (1 hunks)
  • debugger/Cargo.toml (1 hunks)
  • debugger/src/lib.rs (6 hunks)
  • debugger/src/main.rs (9 hunks)
  • derive/src/lib.rs (2 hunks)
  • generator/Cargo.toml (1 hunks)
  • generator/src/generator.rs (11 hunks)
  • generator/src/macros.rs (1 hunks)
  • generator/src/parse_derive.rs (5 hunks)
  • pest/Cargo.toml (1 hunks)
  • pest/src/iterators/queueable_token.rs (1 hunks)
  • pest/src/lib.rs (2 hunks)
  • pest/src/parser_state.rs (6 hunks)
  • vm/src/lib.rs (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (6)
  • pest/src/iterators/queueable_token.rs
  • README.md
  • debugger/Cargo.toml
  • generator/Cargo.toml
  • pest/Cargo.toml
  • derive/src/lib.rs
🔇 Additional comments (25)
generator/src/macros.rs (3)

12-12: LGTM: Updated builtin rule generation to support tracing.

The insert_builtin! macro now correctly passes false for the $implicit parameter, indicating that builtin rules are not implicit rules.


18-26: LGTM: Added tracing support to rule generation (std version).

The generate_rule! macro now wraps the pattern execution with trace_wrapper, enabling tracing capabilities. The implementation correctly handles:

  • Rule name via stringify!($name)
  • Implicit rule flag via $implicit
  • Debug flag (set to false)

33-41: LGTM: Added tracing support to rule generation (no_std version).

The no_std version mirrors the std version's tracing implementation, maintaining consistency across both environments.

generator/src/parse_derive.rs (5)

12-12: LGTM: Added tracing imports.

The required types for tracing configuration are properly imported.


29-30: LGTM: Added tracing configuration to ParsedDerive.

The tracing_config field is properly documented and typed.


55-61: LGTM: Added tracing attribute parsing.

The implementation correctly collects and processes tracing attributes from the AST.


98-159: LGTM: Comprehensive tracing configuration parsing.

The implementation handles all tracing options:

  • Indented with configurable spacing
  • PegViz format
  • Skip implicit rules
  • Skip silent rules

The error handling is robust, with descriptive error messages.


161-178: LGTM: Case-insensitive identifier checking.

The helper function provides a robust way to compare identifiers case-insensitively, improving user experience by allowing flexible casing in attributes.

debugger/src/main.rs (4)

37-79: LGTM: Well-structured CLI arguments using clap.

The ClapCliArgs struct provides a comprehensive and well-documented set of CLI options, including tracing configuration.


81-128: LGTM: Well-structured command handling using clap.

The Commands enum provides a clean and type-safe way to handle various debugger commands.


177-177: Verify the increased timeout duration.

The timeout duration has been increased from 5 seconds to 30 seconds in multiple locations. While this might help with complex grammars, it could impact user experience for simpler cases.

Please confirm if this increase is necessary based on the grammar complexity mentioned in the PR objectives.

Also applies to: 200-200, 326-326


151-171: LGTM: Added tracing support to Cli struct.

The implementation provides comprehensive tracing configuration:

  • Support for PegViz and Indented formats
  • Configurable indentation spacing
  • Options to skip implicit and silent rules
pest/src/lib.rs (2)

72-101: LGTM: Clear and comprehensive tracing documentation.

The documentation clearly explains:

  • How to enable tracing
  • Available tracing formats (Indented, PegViz)
  • Additional options (SkipImplicit, SkipSilent)

369-371: LGTM: Added tracing exports.

The necessary types for tracing support are properly exported:

  • state_with_tracing
  • TracingConfig
  • TracingType
pest/src/parser_state.rs (4)

81-100: Well-designed tracing type enum with clear documentation.

The TracingType enum provides a clean and extensible way to configure tracing output with three well-documented variants. The ToTokens implementation enables seamless code generation.


102-143: Well-structured tracing configuration with comprehensive defaults.

The TracingConfig struct provides a clean API for configuring tracing behavior with:

  • Clear field documentation
  • Sensible default values
  • Code generation support via ToTokens

513-517: Clean integration of tracing into parser state.

The tracing functionality is well-integrated into ParserState with appropriate fields and initialization.

Also applies to: 595-609


612-714: Robust tracing wrapper with comprehensive output options.

The trace_wrapper implementation provides:

  • Clean handling of different tracing modes
  • Proper skipping of implicit/silent rules based on configuration
  • Well-formatted output for debugging
debugger/src/lib.rs (3)

73-73: Clean integration of tracing configuration into debugger context.

The tracing configuration is well-integrated into the DebuggerContext struct with appropriate imports.

Also applies to: 136-136


215-234: Well-designed API for configuring tracing behavior.

The tracing configuration methods provide a clean and intuitive interface for controlling tracing output with clear documentation.


267-268: Proper integration of tracing into parsing process.

The handle method correctly integrates tracing configuration into the parsing process by passing it to the VM.

Also applies to: 294-294

vm/src/lib.rs (2)

72-84: Clean implementation of tracing-enabled parsing.

The parse method changes maintain backward compatibility while adding tracing support through a new parse_with_tracing method.


177-205: Consistent integration of tracing in rule handling.

The trace_wrapper is well-integrated throughout the rule handling logic with proper support for:

  • Implicit rules
  • Silent rules
  • Different rule types

Also applies to: 329-335, 357-388

generator/src/generator.rs (2)

38-38: Clean integration of tracing into code generation.

The generate function properly integrates tracing configuration into the generated parser implementation.

Also applies to: 88-94


309-313: Consistent integration of tracing in generated code.

The trace_wrapper is well-integrated into the generated code with proper handling of:

  • Implicit rules
  • Silent rules
  • Different rule types

Also applies to: 320-325, 337-341

@rlpowell
Copy link
Author

rlpowell commented Feb 5, 2025

Ah, running cargo test at the top actually runs everything looks like; working on cleaning up the tests now.

@rlpowell
Copy link
Author

rlpowell commented Feb 5, 2025

This PR contains many semver-breaking changes -- I suggest either changing it to be major version non-breaking or putting it under a feature flag

This should change no behaviour for normal users, except users of the debugger. What semver-breaking changes are you pointing at? (Also, I'm in the discord if you want to chat there.)

- Fixed debugger argument handling
- Expanded debugger behaviour a bit (like being able to continue a bunch
  of times)
- Added support for tracing the productions of a grammar during parsing,
  in both the debugger and normal modes
@rlpowell
Copy link
Author

rlpowell commented Feb 5, 2025

OK, all the tests should pass now except that tests::json_handles_deep_nesting has a stack overflow; they all pass when I run RUST_MIN_STACK=16777216 cargo test though. Not sure what to do about that. Haven't gone over the coderabbit nits yet.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (5)
generator/src/parse_derive.rs (1)

102-102: Remove debug println statements.

These debug statements should be removed before merging:

-        println!("attr");
-        println!("indent: {:#?}", n);
-        println!("indent: {:#?}", 2);

Also applies to: 116-116, 120-120

vm/src/lib.rs (2)

98-116: Consider consolidating repeated trace_wrapper calls.

The code has multiple similar trace_wrapper calls for different built-in rules. This pattern could be refactored to reduce duplication.

Consider introducing a helper function:

+fn trace_builtin_rule<F>(state: Box<ParserState>, rule: &str, f: F) -> ParseResult<Box<ParserState>>
+where
+    F: FnOnce(Box<ParserState>) -> ParseResult<Box<ParserState>>,
+{
+    state.trace_wrapper(rule, false, false, f)
+}

-return state.trace_wrapper(rule, false, false, |state| state.skip(1));
+return trace_builtin_rule(state, rule, |state| state.skip(1));

332-390: Consider simplifying nested trace_wrapper calls.

The whitespace/comment handling code has deeply nested trace_wrapper calls that could be simplified for better readability.

Consider extracting the nested logic into separate functions:

+fn handle_whitespace(state: Box<ParserState>) -> ParseResult<Box<ParserState>> {
+    state.trace_wrapper("WHITESPACE", true, false, |state| {
+        self.parse_rule("WHITESPACE", state)
+    })
+}

+fn handle_comment(state: Box<ParserState>) -> ParseResult<Box<ParserState>> {
+    state.trace_wrapper("COMMENT", true, false, |state| {
+        self.parse_rule("COMMENT", state)
+    })
+}

-state.trace_wrapper("skip", true, false, |state| {
-    state.sequence(|state| {
-        state.repeat(|state| {
-            state.trace_wrapper("WHITESPACE", true, false, |state| {
-                self.parse_rule("WHITESPACE", state)
-            })
-        })
-    })
-})
+state.trace_wrapper("skip", true, false, |state| {
+    state.sequence(|state| {
+        state.repeat(handle_whitespace)
+    })
+})
generator/src/generator.rs (1)

176-206: Consider consolidating rule type handling.

The code has similar patterns for handling different rule types with tracing. This could be simplified to reduce duplication.

Consider introducing a helper function:

+fn generate_rule_with_tracing(rule_name: &str, implicit: bool, silent: bool, expr: TokenStream) -> TokenStream {
+    quote! {
+        state.trace_wrapper(#rule_name, #implicit, #silent, |state| {
+            state.rule(#rule_name, |state| {
+                state.atomic(::pest::Atomicity::Atomic, |state| {
+                    #expr
+                })
+            })
+        })
+    }
+}

-RuleType::Normal => state.trace_wrapper(&rule.name, true, false, |state| {
-    state.rule(&rule.name, |state| {
-        state.atomic(Atomicity::Atomic, |state| {
-            self.parse_expr(&rule.expr, state)
-        })
-    })
-}),
+RuleType::Normal => generate_rule_with_tracing(&rule.name, true, false, self.parse_expr(&rule.expr, state)),
pest/src/parser_state.rs (1)

625-718: Consider breaking down the trace_wrapper method.

The trace_wrapper method is quite long and handles multiple concerns. It could be split into smaller, focused methods.

Consider breaking it down:

+impl<'i, R: RuleType> ParserState<'i, R> {
+    fn handle_tracing_output(
+        &self,
+        rule_name: &str,
+        pos: Position<'i>,
+        tracing_type: TracingType,
+        success: bool
+    ) {
+        match (tracing_type, success) {
+            (TracingType::Indented, true) => println!(
+                "{}MATCHED `{}` at {}:{}",
+                " ".repeat(self.tracing_depth * self.tracing_config.spacing),
+                rule_name,
+                pos.line_col().0,
+                pos.line_col().1,
+            ),
+            (TracingType::Indented, false) => println!(
+                "{}FAILED `{}` at {}:{}",
+                " ".repeat(self.tracing_depth * self.tracing_config.spacing),
+                rule_name,
+                pos.line_col().0,
+                pos.line_col().1,
+            ),
+            // ... handle other cases
+        }
+    }
+}

-match (tracing_type, result) {
-    (TracingType::None, Ok(new_state)) => Ok(new_state),
-    // ... long match block
-}
+self.handle_tracing_output(rule_name, pos, tracing_type, result.is_ok());
+result
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1f05060 and 74c00ed.

📒 Files selected for processing (14)
  • README.md (1 hunks)
  • debugger/Cargo.toml (1 hunks)
  • debugger/src/lib.rs (6 hunks)
  • debugger/src/main.rs (9 hunks)
  • derive/src/lib.rs (2 hunks)
  • generator/Cargo.toml (1 hunks)
  • generator/src/generator.rs (11 hunks)
  • generator/src/macros.rs (1 hunks)
  • generator/src/parse_derive.rs (5 hunks)
  • pest/Cargo.toml (1 hunks)
  • pest/src/iterators/queueable_token.rs (1 hunks)
  • pest/src/lib.rs (2 hunks)
  • pest/src/parser_state.rs (6 hunks)
  • vm/src/lib.rs (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (7)
  • pest/src/iterators/queueable_token.rs
  • README.md
  • pest/Cargo.toml
  • debugger/Cargo.toml
  • generator/Cargo.toml
  • pest/src/lib.rs
  • derive/src/lib.rs
🔇 Additional comments (13)
generator/src/macros.rs (2)

12-12: LGTM! The macro signature update is consistent.

The addition of the $implicit parameter to generate_rule! macro and its usage in insert_builtin! is implemented consistently across both std and no_std features.

Also applies to: 18-18, 33-33


23-25: LGTM! The trace wrapper implementation is correct.

The trace_wrapper call is properly implemented with:

  • Rule name using stringify!($name)
  • Implicit flag from the new parameter
  • Consistent implementation in both std and no_std versions

Also applies to: 38-40

generator/src/parse_derive.rs (1)

161-178: LGTM! The case-insensitive implementation is robust.

The case-insensitive identifier check is well-implemented with:

  • Clear documentation
  • Proper error handling
  • Efficient string comparison
debugger/src/main.rs (2)

226-280: LGTM! The command execution is well-structured.

The command execution implementation:

  • Uses clear pattern matching
  • Handles all commands consistently
  • Provides good error handling

177-177: Verify the increased timeout duration.

The timeout duration has been increased from 5 to 30 seconds in multiple locations. While this might help with long-running operations, it could impact user experience.

Also applies to: 200-200, 326-326

✅ Verification successful

🌐 Web query:

What is the recommended timeout duration for CLI applications?

Length of output: 4781


Timeout Duration Verification in Debugger Operations

  • The changes increasing the timeout from 5 to 30 seconds have been applied consistently (e.g., at lines 177, 200, and 326).
  • The recv_timeout(Duration::from_secs(30)) call is used to wait for asynchronous responses; this duration is separate from typical CLI session timeouts.
  • As the web query shows, recommended CLI session timeouts are usually configured in minutes for longer interactive sessions. In contrast, a 30‑second wait in a debugging operation is often more appropriate to accommodate slower responses without unnecessarily prolonging idle waits.

Overall, the increased timeout duration is acceptable for the intended long-running operations while balancing responsiveness.

debugger/src/lib.rs (2)

216-233: LGTM! The tracing methods are well-implemented.

The tracing configuration methods are:

  • Well-documented
  • Have clear, single-responsibility implementations
  • Follow consistent patterns

267-267: LGTM! The handle method update properly integrates tracing.

The tracing integration in the handle method:

  • Correctly passes tracing_config to parse_with_tracing
  • Maintains existing functionality

Also applies to: 294-294

vm/src/lib.rs (2)

67-73: LGTM! Clean delegation to parse_with_tracing.

The parse method now cleanly delegates to parse_with_tracing with default tracing configuration, maintaining backward compatibility while enabling tracing support.


77-84: LGTM! Well-structured tracing integration.

The new parse_with_tracing method cleanly integrates tracing support by passing the configuration to pest::state_with_tracing. The method signature is clear and maintains the same error handling pattern as the original parse method.

generator/src/generator.rs (2)

38-39: LGTM! Clean tracing configuration extraction.

The tracing configuration is cleanly extracted from the parsed derive input.


309-313: LGTM! Well-structured implicit rule handling.

The code cleanly determines if a rule is implicit based on its name (WHITESPACE or COMMENT) and sets the flag appropriately.

pest/src/parser_state.rs (2)

81-89: LGTM! Well-designed tracing type enum.

The TracingType enum provides clear options for tracing output formats with good documentation.


103-113: LGTM! Well-structured tracing configuration.

The TracingConfig struct provides a clean way to configure tracing behavior with good documentation for each field.

})
.expect("Macro attribute `tracing' parse failed.");
}
_ => panic!("wut"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Improve error message for unexpected tracing attribute.

The error message "wut" is not descriptive enough. Consider providing a more informative message.

-            _ => panic!("wut"),
+            _ => panic!("Tracing attribute must be a list of the form `tracing(...)`."),
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
_ => panic!("wut"),
_ => panic!("Tracing attribute must be a list of the form `tracing(...)`."),

Comment on lines 151 to 159
fn tracing(&mut self, ttype: &str) {
let ttype = match ttype.to_lowercase().as_str() {
"pegviz" => TracingType::PegViz,
"indented" => TracingType::Indented,
"none" => TracingType::Indented,
_ => panic!("Bad tracing type {ttype}!"),
};
self.context.tracing(ttype);
}
Copy link
Contributor

@coderabbitai coderabbitai bot Feb 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Replace panic with proper error handling in tracing type matching.

The function panics on invalid tracing type. Consider returning a Result instead:

-    fn tracing(&mut self, ttype: &str) {
+    fn tracing(&mut self, ttype: &str) -> Result<(), String> {
         let ttype = match ttype.to_lowercase().as_str() {
             "pegviz" => TracingType::PegViz,
             "indented" => TracingType::Indented,
             "none" => TracingType::Indented,
-            _ => panic!("Bad tracing type {ttype}!"),
+            _ => return Err(format!("Invalid tracing type: {}", ttype)),
         };
         self.context.tracing(ttype);
+        Ok(())
     }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
fn tracing(&mut self, ttype: &str) {
let ttype = match ttype.to_lowercase().as_str() {
"pegviz" => TracingType::PegViz,
"indented" => TracingType::Indented,
"none" => TracingType::Indented,
_ => panic!("Bad tracing type {ttype}!"),
};
self.context.tracing(ttype);
}
fn tracing(&mut self, ttype: &str) -> Result<(), String> {
let ttype = match ttype.to_lowercase().as_str() {
"pegviz" => TracingType::PegViz,
"indented" => TracingType::Indented,
"none" => TracingType::Indented,
_ => return Err(format!("Invalid tracing type: {}", ttype)),
};
self.context.tracing(ttype);
Ok(())
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that's not how the rest of the debugger works; the standard here appears to be "print an error and do nothing", which my new version (not yet pushed) does.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

@rlpowell Understood, I apologize for the misaligned suggestion. You're absolutely right about the debugger's error handling pattern. I'll make note of this for future reviews.


🏁 Script executed:

#!/bin/bash
# Look for error handling patterns in the Cli struct methods
rg "Error:|println!.*Error:" debugger/src/main.rs

Length of output: 516

@rlpowell Thanks for the clarification. Looking at the codebase's error handling patterns more closely, I can see that your approach of printing the error and continuing is indeed consistent with the rest of the debugger. Your unpushed changes are on the right track.


✏️ Learnings added
Learnt from: rlpowell
PR: pest-parser/pest#1080
File: debugger/src/main.rs:151-159
Timestamp: 2025-02-05T05:17:46.419Z
Learning: In the pest debugger, the standard error handling pattern is to print an error message and continue execution, rather than propagating errors via Result.

Learnt from: rlpowell
PR: pest-parser/pest#1080
File: debugger/src/main.rs:151-159
Timestamp: 2025-02-05T05:17:46.419Z
Learning: The pest debugger's CLI methods handle errors by:
1. Using `println!("Error: ...")` or `eprintln!("Error: ...")` to display the error
2. Continuing execution after error display
3. Not propagating errors via Result types in CLI methods

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

@rlpowell
Copy link
Author

rlpowell commented Feb 5, 2025

Replying to coderabbit comments that I think might be important but I don't have good solutions for:

vm/src/lib.rs

326-326: Thirty-second timeout for requests is a reasonable fallback.

Increasing from 5s to 30s for network requests can significantly improve reliability for slower hosts. Still, consider user-configurable timeouts for maximum flexibility.

This was done solely because the grammar I'm using Very Large; people should object if it feels like a problem.

pest/src/parser_state.rs

24-26: Using std::println! might be too intrusive.
Relying on println! for tracing output can clutter STDOUT. Consider a logging approach (e.g., a configurable logger) to avoid mixing parser messages with user-facing console output.

This is definitely something people should express opinions about if they have them.

generator/src/macros.rs
12-12: Consider verifying the boolean nature of the second macro parameter.

Passing false for $implicit here works fine as an example, but notice $implicit: ident doesn’t inherently enforce a boolean type. If you accidentally pass a non-boolean token, the macro won’t catch it. A more explicit naming or usage pattern might improve safety and clarity.

Yeah I have no idea what a better fix looks like here.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (7)
generator/src/parse_derive.rs (1)

149-149: Improve error message for tracing attribute parsing failure.

The current error message is not descriptive enough.

-                .expect("Macro attribute `tracing' parse failed.");
+                .expect("Failed to parse tracing attribute. Please ensure the attribute follows the format: #[tracing(Indented(N), SkipImplicit, SkipSilent)]");
debugger/src/main.rs (6)

63-65: Consider using an enum for tracing type.

Instead of accepting any string, consider using an enum to restrict the values at compile-time:

-    /// Turn on tracing (string should be PegViz or Indented)
-    #[arg(short, long)]
-    tracing: Option<String>,
+    /// Turn on tracing
+    #[arg(short, long, value_enum)]
+    tracing: Option<TracingMode>,
+}
+
+#[derive(clap::ValueEnum, Clone)]
+enum TracingMode {
+    PegViz,
+    Indented,
+    None,

124-126: Convert code comment to doc comment.

The comment about help implementation should be a doc comment to maintain consistency with other command documentation.

-    // We have to do help ourselves so it doesn't exit :D
-    /// Show help
+    /// Show help (implemented manually to prevent automatic exit)

150-161: Simplify tracing type matching.

The match expression can be more concise using map_or_else:

     fn tracing(&mut self, ttype: &str) {
-        let ttype = match ttype.to_lowercase().as_str() {
-            "pegviz" => TracingType::PegViz,
-            "indented" => TracingType::Indented,
-            "none" => TracingType::None,
-            _ => {
-                eprintln!("Bad tracing type {ttype}!");
-                return;
-            }
-        };
-        self.context.tracing(ttype);
+        ttype.to_lowercase().as_str().into_iter().map_or_else(
+            || eprintln!("Bad tracing type {ttype}!"),
+            |t| match t {
+                "pegviz" => self.context.tracing(TracingType::PegViz),
+                "indented" => self.context.tracing(TracingType::Indented),
+                "none" => self.context.tracing(TracingType::None),
+                _ => eprintln!("Bad tracing type {ttype}!"),
+            }
+        );
     }

179-179: Consider making timeout duration configurable.

The timeout duration of 30 seconds is hardcoded in multiple places. Consider making it configurable:

+const DEFAULT_TIMEOUT_SECS: u64 = 30;
+
 struct ClapCliArgs {
+    /// Set timeout duration in seconds (default: 30)
+    #[arg(long, default_value_t = DEFAULT_TIMEOUT_SECS)]
+    timeout: u64,
     // ... other fields
 }

Then use cli_args.timeout instead of the hardcoded value:

-            .timeout(Some(Duration::from_secs(30)))
+            .timeout(Some(Duration::from_secs(cli_args.timeout)))

Also applies to: 202-202, 328-328


260-264: Convert comment to a proper TODO with tracking.

The comment about potential clap argument issues should be converted to a proper TODO with tracking:

-            // NOTE: It seems possible that there are conditions under which this doesn't work
-            // because the "arguments" are not ones that Clap likes; if that happens, just pull the
-            // `id string` handling out of here.  Just check that the string starts with "id " and
-            // keep the rest.
+            // TODO(#issue): Investigate and fix potential clap argument parsing issues
+            // If the arguments are not clap-compatible, fallback to simple string handling:
+            // 1. Check if the string starts with "id "
+            // 2. Keep the rest as raw input

Would you like me to create an issue to track this TODO?


374-378: Simplify breakpoint initialization using iterators.

The nested loops can be simplified using iterator methods:

-    for breakpoint_list in cli_args.breakpoints {
-        for breakpoint in breakpoint_list.split_whitespace() {
-            context.breakpoint(breakpoint);
-        }
-    }
+    cli_args.breakpoints
+        .iter()
+        .flat_map(|list| list.split_whitespace())
+        .for_each(|breakpoint| context.breakpoint(breakpoint));
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 74c00ed and 70e9b54.

📒 Files selected for processing (2)
  • debugger/src/main.rs (9 hunks)
  • generator/src/parse_derive.rs (5 hunks)
🧰 Additional context used
📓 Learnings (1)
debugger/src/main.rs (1)
Learnt from: rlpowell
PR: pest-parser/pest#1080
File: debugger/src/main.rs:151-159
Timestamp: 2025-02-05T05:17:46.512Z
Learning: In the pest debugger, the standard error handling pattern is to print an error message and continue execution, rather than propagating errors via Result.
🔇 Additional comments (4)
generator/src/parse_derive.rs (4)

12-12: LGTM! Clean addition of tracing support.

The new imports and struct field are well-documented and properly integrated.

Also applies to: 29-30


55-61: LGTM! Well-structured tracing attribute collection.

The code follows consistent patterns and maintains good readability.


158-175: LGTM! Well-implemented case-insensitive comparison.

The helper function is well-documented and properly handles error cases. The case-insensitive comparison improves user experience by being more flexible with input.


177-263: Add test coverage for tracing functionality.

The new tracing configuration parsing needs test coverage. Consider adding tests for:

  • Different tracing configurations (Indented, PegViz, etc.)
  • Invalid tracing attributes
  • Case sensitivity handling

Would you like me to generate the test cases?

@rlpowell
Copy link
Author

rlpowell commented Feb 5, 2025

I believe I've fixed everything obviously-fixable in coderabbit's comments, but see my long comment just above.

@tomtau tomtau requested a review from NoahTheDuke February 5, 2025 07:28
@tomtau
Copy link
Contributor

tomtau commented Feb 5, 2025

This should change no behaviour for normal users, except users of the debugger. What semver-breaking changes are you pointing at? (Also, I'm in the discord if you want to chat there.)

@rlpowell I only had a quick look and I saw that pub fn new(... added parameters, but that's been fixed. Now, it looks better, but I haven't looked in detail -- especially I'm not sure about the derive-macro related changes.
In general, it should be fine, but one issue I encountered before was that cargo can (for whatever mysterious dependency resolution reasons) decide to include different versions of pest crates (e.g. pest 2.7.15 and pest_generator 2.5.7), so some of these "internal cross-crate" changes could accidentally break the end user code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants