Skip to content
  • Koichi ITO's avatar
    75ed0863
    [ruby/prism] Fix `kDO_LAMBDA` token incompatibility for `Prism::Translation::Parser::Lexer` · 75ed0863
    Koichi ITO authored
    ## Summary
    
    This PR fixes `kDO_LAMBDA` token incompatibility between Parser gem and `Prism::Translation::Parser` for lambda `do` block.
    
    ### Parser gem (Expected)
    
    Returns `kDO_LAMBDA` token:
    
    ```console
    $ bundle exec ruby -Ilib -rparser/ruby33 -ve \
    'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "-> do end"; p Parser::Ruby33.new.tokenize(buf)[2]'
    ruby 3.4.0dev (2024-09-01T11:00:13Z master https://github.com/ruby/prism/commit/eb144ef91e) [x86_64-darwin23]
    [[:tLAMBDA, ["->", #<Parser::Source::Range example.rb 0...2>]], [:kDO_LAMBDA, ["do", #<Parser::Source::Range example.rb 3...5>]],
    [:kEND, ["end", #<Parser::Source::Range example.rb 6...9>]]]
    ```
    
    ### `Prism::Translation::Parser` (Actual)
    
    Previously, the parser returned `kDO` token when parsing the following:
    
    ```console
    $ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
    'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "-> do end"; p Prism::Translation::Parser33.new.tokenize(buf)[2]'
    ruby 3.4.0dev (2024-09-01T11:00:13Z master https://github.com/ruby/prism/commit/eb144ef91e) [x86_64-darwin23]
    [[:tLAMBDA, ["->", #<Parser::Source::Range example.rb 0...2>]], [:kDO, ["do", #<Parser::Source::Range example.rb 3...5>]],
    [:kEND, ["end", #<Parser::Source::Range example.rb 6...9>]]]
    ```
    
    After the update, the parser now returns `kDO_LAMBDA` token for the same input:
    
    ```console
    $ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
    'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "-> do end"; p Prism::Translation::Parser33.new.tokenize(buf)[2]'
    ruby 3.4.0dev (2024-09-01T11:00:13Z master https://github.com/ruby/prism/commit/eb144ef91e) [x86_64-darwin23]
    [[:tLAMBDA, ["->", #<Parser::Source::Range example.rb 0...2>]], [:kDO_LAMBDA, ["do", #<Parser::Source::Range example.rb 3...5>]],
    [:kEND, ["end", #<Parser::Source::Range example.rb 6...9>]]]
    ```
    
    ## Additional Information
    
    Unfortunately, this kind of edge case doesn't work as expected; `kDO` is returned instead of `kDO_LAMBDA`.
    However, since `kDO` is already being returned in this case, there is no change in behavior.
    
    ### Parser gem
    
    Returns `tLAMBDA` token:
    
    ```console
    $ bundle exec ruby -Ilib -rparser/ruby33 -ve \
    'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "-> (foo = -> (bar) {}) do end"; p Parser::Ruby33.new.tokenize(buf)[2]'
    ruby 3.3.5 (2024-09-03 revision https://github.com/ruby/prism/commit/ef084cc8f4) [x86_64-darwin23]
    [[:tLAMBDA, ["->", #<Parser::Source::Range example.rb 0...2>]], [:tLPAREN2, ["(", #<Parser::Source::Range example.rb 3...4>]],
    [:tIDENTIFIER, ["foo", #<Parser::Source::Range example.rb 4...7>]], [:tEQL, ["=", #<Parser::Source::Range example.rb 8...9>]],
    [:tLAMBDA, ["->", #<Parser::Source::Range example.rb 10...12>]], [:tLPAREN2, ["(", #<Parser::Source::Range example.rb 13...14>]],
    [:tIDENTIFIER, ["bar", #<Parser::Source::Range example.rb 14...17>]], [:tRPAREN, [")", #<Parser::Source::Range example.rb 17...18>]],
    [:tLAMBEG, ["{", #<Parser::Source::Range example.rb 19...20>]], [:tRCURLY, ["}", #<Parser::Source::Range example.rb 20...21>]],
    [:tRPAREN, [")", #<Parser::Source::Range example.rb 21...22>]], [:kDO_LAMBDA, ["do", #<Parser::Source::Range example.rb 23...25>]],
    [:kEND, ["end", #<Parser::Source::Range example.rb 26...29>]]]
    ```
    
    ### `Prism::Translation::Parser`
    
    Returns `kDO` token:
    
    ```console
    $ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
    'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "-> (foo = -> (bar) {}) do end"; p Prism::Translation::Parser33.new.tokenize(buf)[2]'
    ruby 3.3.5 (2024-09-03 revision https://github.com/ruby/prism/commit/ef084cc8f4) [x86_64-darwin23]
    [[:tLAMBDA, ["->", #<Parser::Source::Range example.rb 0...2>]], [:tLPAREN2, ["(", #<Parser::Source::Range example.rb 3...4>]],
    [:tIDENTIFIER, ["foo", #<Parser::Source::Range example.rb 4...7>]], [:tEQL, ["=", #<Parser::Source::Range example.rb 8...9>]],
    [:tLAMBDA, ["->", #<Parser::Source::Range example.rb 10...12>]], [:tLPAREN2, ["(", #<Parser::Source::Range example.rb 13...14>]],
    [:tIDENTIFIER, ["bar", #<Parser::Source::Range example.rb 14...17>]], [:tRPAREN, [")", #<Parser::Source::Range example.rb 17...18>]],
    [:tLAMBEG, ["{", #<Parser::Source::Range example.rb 19...20>]], [:tRCURLY, ["}", #<Parser::Source::Range example.rb 20...21>]],
    [:tRPAREN, [")", #<Parser::Source::Range example.rb 21...22>]], [:kDO, ["do", #<Parser::Source::Range example.rb 23...25>]],
    [:kEND, ["end", #<Parser::Source::Range example.rb 26...29>]]]
    ```
    
    As the intention is not to address such special cases at this point, a comment has been left indicating that this case still returns `kDO`.
    In other words, `kDO_LAMBDA` will now be returned except for edge cases after this PR.
    
    https://github.com/ruby/prism/commit/2ee480654c
    75ed0863
    [ruby/prism] Fix `kDO_LAMBDA` token incompatibility for `Prism::Translation::Parser::Lexer`
    Koichi ITO authored
    ## Summary
    
    This PR fixes `kDO_LAMBDA` token incompatibility between Parser gem and `Prism::Translation::Parser` for lambda `do` block.
    
    ### Parser gem (Expected)
    
    Returns `kDO_LAMBDA` token:
    
    ```console
    $ bundle exec ruby -Ilib -rparser/ruby33 -ve \
    'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "-> do end"; p Parser::Ruby33.new.tokenize(buf)[2]'
    ruby 3.4.0dev (2024-09-01T11:00:13Z master https://github.com/ruby/prism/commit/eb144ef91e) [x86_64-darwin23]
    [[:tLAMBDA, ["->", #<Parser::Source::Range example.rb 0...2>]], [:kDO_LAMBDA, ["do", #<Parser::Source::Range example.rb 3...5>]],
    [:kEND, ["end", #<Parser::Source::Range example.rb 6...9>]]]
    ```
    
    ### `Prism::Translation::Parser` (Actual)
    
    Previously, the parser returned `kDO` token when parsing the following:
    
    ```console
    $ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
    'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "-> do end"; p Prism::Translation::Parser33.new.tokenize(buf)[2]'
    ruby 3.4.0dev (2024-09-01T11:00:13Z master https://github.com/ruby/prism/commit/eb144ef91e) [x86_64-darwin23]
    [[:tLAMBDA, ["->", #<Parser::Source::Range example.rb 0...2>]], [:kDO, ["do", #<Parser::Source::Range example.rb 3...5>]],
    [:kEND, ["end", #<Parser::Source::Range example.rb 6...9>]]]
    ```
    
    After the update, the parser now returns `kDO_LAMBDA` token for the same input:
    
    ```console
    $ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
    'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "-> do end"; p Prism::Translation::Parser33.new.tokenize(buf)[2]'
    ruby 3.4.0dev (2024-09-01T11:00:13Z master https://github.com/ruby/prism/commit/eb144ef91e) [x86_64-darwin23]
    [[:tLAMBDA, ["->", #<Parser::Source::Range example.rb 0...2>]], [:kDO_LAMBDA, ["do", #<Parser::Source::Range example.rb 3...5>]],
    [:kEND, ["end", #<Parser::Source::Range example.rb 6...9>]]]
    ```
    
    ## Additional Information
    
    Unfortunately, this kind of edge case doesn't work as expected; `kDO` is returned instead of `kDO_LAMBDA`.
    However, since `kDO` is already being returned in this case, there is no change in behavior.
    
    ### Parser gem
    
    Returns `tLAMBDA` token:
    
    ```console
    $ bundle exec ruby -Ilib -rparser/ruby33 -ve \
    'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "-> (foo = -> (bar) {}) do end"; p Parser::Ruby33.new.tokenize(buf)[2]'
    ruby 3.3.5 (2024-09-03 revision https://github.com/ruby/prism/commit/ef084cc8f4) [x86_64-darwin23]
    [[:tLAMBDA, ["->", #<Parser::Source::Range example.rb 0...2>]], [:tLPAREN2, ["(", #<Parser::Source::Range example.rb 3...4>]],
    [:tIDENTIFIER, ["foo", #<Parser::Source::Range example.rb 4...7>]], [:tEQL, ["=", #<Parser::Source::Range example.rb 8...9>]],
    [:tLAMBDA, ["->", #<Parser::Source::Range example.rb 10...12>]], [:tLPAREN2, ["(", #<Parser::Source::Range example.rb 13...14>]],
    [:tIDENTIFIER, ["bar", #<Parser::Source::Range example.rb 14...17>]], [:tRPAREN, [")", #<Parser::Source::Range example.rb 17...18>]],
    [:tLAMBEG, ["{", #<Parser::Source::Range example.rb 19...20>]], [:tRCURLY, ["}", #<Parser::Source::Range example.rb 20...21>]],
    [:tRPAREN, [")", #<Parser::Source::Range example.rb 21...22>]], [:kDO_LAMBDA, ["do", #<Parser::Source::Range example.rb 23...25>]],
    [:kEND, ["end", #<Parser::Source::Range example.rb 26...29>]]]
    ```
    
    ### `Prism::Translation::Parser`
    
    Returns `kDO` token:
    
    ```console
    $ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
    'buf = Parser::Source::Buffer.new("example.rb"); buf.source = "-> (foo = -> (bar) {}) do end"; p Prism::Translation::Parser33.new.tokenize(buf)[2]'
    ruby 3.3.5 (2024-09-03 revision https://github.com/ruby/prism/commit/ef084cc8f4) [x86_64-darwin23]
    [[:tLAMBDA, ["->", #<Parser::Source::Range example.rb 0...2>]], [:tLPAREN2, ["(", #<Parser::Source::Range example.rb 3...4>]],
    [:tIDENTIFIER, ["foo", #<Parser::Source::Range example.rb 4...7>]], [:tEQL, ["=", #<Parser::Source::Range example.rb 8...9>]],
    [:tLAMBDA, ["->", #<Parser::Source::Range example.rb 10...12>]], [:tLPAREN2, ["(", #<Parser::Source::Range example.rb 13...14>]],
    [:tIDENTIFIER, ["bar", #<Parser::Source::Range example.rb 14...17>]], [:tRPAREN, [")", #<Parser::Source::Range example.rb 17...18>]],
    [:tLAMBEG, ["{", #<Parser::Source::Range example.rb 19...20>]], [:tRCURLY, ["}", #<Parser::Source::Range example.rb 20...21>]],
    [:tRPAREN, [")", #<Parser::Source::Range example.rb 21...22>]], [:kDO, ["do", #<Parser::Source::Range example.rb 23...25>]],
    [:kEND, ["end", #<Parser::Source::Range example.rb 26...29>]]]
    ```
    
    As the intention is not to address such special cases at this point, a comment has been left indicating that this case still returns `kDO`.
    In other words, `kDO_LAMBDA` will now be returned except for edge cases after this PR.
    
    https://github.com/ruby/prism/commit/2ee480654c
Loading