Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parser problem with chained method calls #29

Open
Leeft opened this issue Mar 29, 2023 · 3 comments
Open

Parser problem with chained method calls #29

Leeft opened this issue Mar 29, 2023 · 3 comments

Comments

@Leeft
Copy link
Contributor

Leeft commented Mar 29, 2023

I've come up with a bit of code we use a lot at work, and the parser doesn't handle it very well. I'm not sure where to go with the solution, so here are some thoughts and hoping for feedback.

my $rs = $self->schema->resultset('Customer')
    ->active
    ->uk_based(  );

The current parser fails rather spectacularly at this:

# ex1
(source_file [0, 0] - [2, 19]
  (variable_declaration [0, 0] - [2, 18]                                    # my $rs = $self->schema->resultset('Customer')->active->uk_based();
    (scope [0, 0] - [0, 2])                                                 # my
    (single_var_declaration [0, 3] - [0, 6]                                 # $rs
        name: (scalar_variable [0, 3] - [0, 6]))                            #
    value: (method_invocation [0, 9] - [2, 18]                              # $self->schema->resultset('Customer')->active->uk_based();
      object: (scalar_variable [0, 9] - [0, 14])                            # $self
      (arrow_operator [0, 14] - [0, 16])                                    #      ->
      function_name: (identifier [0, 16] - [0, 22])                         #        schema
      (arrow_operator [0, 22] - [0, 24])                                    #              ->
      function_name: (identifier [0, 24] - [0, 33])                         #                resultset
      args: (arguments [0, 33] - [2, 18]                                    #                         ('Customer')->active->uk_based();
        (argument [0, 33] - [2, 18]                                         #                         ('Customer')->active->uk_based();
          (method_invocation [0, 33] - [2, 18]                              #                         ('Customer')->active->uk_based();
            object_return_value: (string_single_quoted [0, 34] - [0, 44])   #                          'Customer'
            (arrow_operator [1, 4] - [1, 6])                                #                                     ->
            function_name: (identifier [1, 6] - [1, 12])                    #                                       active
            (arrow_operator [2, 4] - [2, 6])                                #                                             ->
            function_name: (identifier [2, 6] - [2, 14])                    #                                               uk_based
            args: (empty_parenthesized_argument [2, 14] - [2, 18]))))))     #                                                       ()
  (semi_colon [2, 18] - [2, 19]))

I was fairly easily able to make it stop do that, though the result isn't that much better overall. Note how there is a single method_invocation while there are 4 method calls:

  # ex2
  (variable_declaration [0, 0] - [2, 18]        
    (scope [0, 0] - [0, 2])
    (single_var_declaration [0, 3] - [0, 6]     
        name: (scalar_variable [0, 3] - [0, 6]))
    value: (method_invocation [0, 9] - [2, 18]  
      object: (scalar_variable [0, 9] - [0, 14])
      (arrow_operator [0, 14] - [0, 16])
      function_name: (identifier [0, 16] - [0, 22])
      (arrow_operator [0, 22] - [0, 24])
      function_name: (identifier [0, 24] - [0, 33])
      args: (parenthesized_argument [0, 33] - [0, 45]
        (arguments [0, 34] - [0, 44]
          (argument [0, 34] - [0, 44]
            (string_single_quoted [0, 34] - [0, 44]))))
      (arrow_operator [1, 4] - [1, 6])
      function_name: (identifier [1, 6] - [1, 12])
      (arrow_operator [2, 4] - [2, 6])
      function_name: (identifier [2, 6] - [2, 14])
      args: (empty_parenthesized_argument [2, 14] - [2, 18])))
  (semi_colon [2, 18] - [2, 19]))

Part of the problem would seem to be that method_invocation includes that what it is invoked on. I think that makes sense if the methods were executed right to left and thus somewhat recursively, but they're not, they're executed sequentially. Aside from that, I think the scope/single_var_declaration should've been wrapped in an lvalue node, and the rest after the assignment in an rvalue node (adding more structure). In these fictional results I've added this rvalue.

More method_invocation calls would start to look something like this:

# ex3
(source_file
  (variable_declaration
    (scope)                                                                   # my
    (single_var_declaration (scalar_variable))                                # $rs
	                                                                          # = (anonymous)
    (rvalue                                                                          
      (method_invocation (scalar_variable) (arrow_operator) (identifier))       # $self -> schema
      (method_invocation (arrow_operator) (identifier)
         (parenthesized_argument (arguments (argument (string_single_quoted))))
      )                                                                         # -> resultset( 'Customer' )
      (method_invocation (arrow_operator) (identifier))                         # -> active
      (method_invocation (arrow_operator) (identifier)
         (empty_parenthesized_argument)
      )                                                                         # -> uk_based ()
    )
  )
  (semi_colon)
)

But now the 1st one has the variable, the rest has nothing. I'm not keen on that. And a method call is just that, it is called on whatever is fed into it. Which led me to something like this:

# ex4
(source_file
  (variable_declaration
    (scope)                                                                     # my
    (single_var_declaration (scalar_variable))                                  # $rs
	                                                                            # = (anonymous)
    (rvalue                                                                     
      (scalar_variable)                                                         # $self
      (method_invocation (arrow_operator) (identifier))                         # -> schema
      (method_invocation (arrow_operator) (identifier)
         (parenthesized_argument (arguments (argument (string_single_quoted))))
      )                                                                         # -> resultset( 'Customer' )
      (method_invocation (arrow_operator) (identifier))                         # -> active
      (method_invocation (arrow_operator) (identifier)
         (empty_parenthesized_argument)
      )                                                                         # -> uk_based ()
    )
  )
  (semi_colon)
)

I've not yet investigated how hard it might be to get to this result (doesn't seem to be an inconsequential change to the parse tree), but I'd love to know what others think.

@Leeft
Copy link
Contributor Author

Leeft commented Mar 29, 2023

And yeah, maybe things should be more nested, but maybe not. I'm not decided yet either way. FWIW, here's what that other parser (tree-sitter-perl/tree-sitter-perl) makes of it:

(source_file [0, 0] - [2, 19]
  (expression_statement [0, 0] - [2, 18]
    (assignment_expression [0, 0] - [2, 18]
      left: (variable_declaration [0, 0] - [0, 6]
        variable: (scalar [0, 3] - [0, 6]))
      right: (method_call_expression [0, 9] - [2, 18]
        invocant: (method_call_expression [0, 9] - [1, 12]
          invocant: (method_call_expression [0, 9] - [0, 45]
            invocant: (method_call_expression [0, 9] - [0, 22]
              invocant: (scalar [0, 9] - [0, 14])
              method: (method [0, 16] - [0, 22]))
            method: (method [0, 24] - [0, 33])
            arguments: (string_literal [0, 34] - [0, 44]))
          method: (method [1, 6] - [1, 12]))
        method: (method [2, 6] - [2, 14])))))

(not very readable at least imo)

@ganezdragon
Copy link
Owner

@Leeft let me work on this, this weekend. Even in my org, we have several expressions as such. Let me refactor it.

@Leeft
Copy link
Contributor Author

Leeft commented Mar 29, 2023

Okay @ganezdragon, thank you and please keep us posted :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants