    3605d607
    [ruby/prism] Fix token incompatibility for `Prism::Translation::Parser::Lexer` · 3605d607
    Koichi ITO authored
    This PR fixes a token incompatibility in `Prism::Translation::Parser::Lexer` when a backquoted heredoc identifier is used:
    
    ```ruby
    <<-`  FOO`
    a
    b
         FOO
    ```
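
    For context, a backquoted heredoc identifier makes the heredoc body a command string, which is why the Parser gem reports an `xstr` node and a `tXSTRING_BEG` token rather than a plain string. A minimal sketch, assuming a POSIX-like shell with `echo` is available (the `CMD` identifier and the commands are illustrative only, not taken from this PR):

    ```ruby
    # The backquoted identifier turns the heredoc body into a shell command,
    # just like a `...` literal; the value is the command's standard output.
    # `<<-` allows the closing identifier to be indented.
    output = <<-`CMD`
    echo a
    echo b
    CMD

    puts output
    ```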
    
    ## Parser gem (Expected)
    
    Returns `tXSTRING_BEG` as the first token:
    
    ```console
    $ bundle exec ruby -Ilib -rparser/ruby33 -ve \
    'buf = Parser::Source::Buffer.new("example.rb"); buf.source = File.read("example.rb"); p Parser::Ruby33.new.tokenize(buf)'
    ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [x86_64-darwin22]
    [s(:xstr,
      s(:str, "a\n"),
      s(:str, "b\n")), [], [[:tXSTRING_BEG, ["<<`", #<Parser::Source::Range example.rb 0...10>]],
    [:tSTRING_CONTENT, ["a\n", #<Parser::Source::Range example.rb 11...13>]],
    [:tSTRING_CONTENT, ["b\n", #<Parser::Source::Range example.rb 13...15>]],
    [:tSTRING_END, ["  FOO", #<Parser::Source::Range example.rb 15...23>]], [:tNL, [nil, #<Parser::Source::Range example.rb 10...11>]]]]
    ```
    
    ## `Prism::Translation::Parser` (Actual)
    
    Previously, the tokens differed from those returned by the Parser gem. The first token was `tSTRING_BEG` with an escaped double quote (`"<<\""`) instead of `tXSTRING_BEG` with a backquote, and the value of the `tSTRING_END` token included the backquotes.
    
    ```console
    $ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
    'buf = Parser::Source::Buffer.new("example.rb"); buf.source = File.read("example.rb"); p Prism::Translation::Parser33.new.tokenize(buf)'
    ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [x86_64-darwin22]
    [s(:xstr,
      s(:str, "a\n"),
      s(:str, "b\n")), [], [[:tSTRING_BEG, ["<<\"", #<Parser::Source::Range example.rb 0...10>]],
    [:tSTRING_CONTENT, ["a\n", #<Parser::Source::Range example.rb 11...13>]],
    [:tSTRING_CONTENT, ["b\n", #<Parser::Source::Range example.rb 13...15>]],
    [:tSTRING_END, ["`  FOO`", #<Parser::Source::Range example.rb 15...23>]], [:tNL, [nil, #<Parser::Source::Range example.rb 10...11>]]]]
    ```
    
    After this fix, the AST and tokens are the same as those returned by the Parser gem:
    
    ```console
    $ bundle exec ruby -Ilib -rprism -rprism/translation/parser33 -ve \
    'buf = Parser::Source::Buffer.new("example.rb"); buf.source = File.read("example.rb"); p Prism::Translation::Parser33.new.tokenize(buf)'
    ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [x86_64-darwin22]
    [s(:xstr,
      s(:str, "a\n"),
      s(:str, "b\n")), [], [[:tXSTRING_BEG, ["<<`", #<Parser::Source::Range example.rb 0...10>]],
    [:tSTRING_CONTENT, ["a\n", #<Parser::Source::Range example.rb 11...13>]],
    [:tSTRING_CONTENT, ["b\n", #<Parser::Source::Range example.rb 13...15>]],
    [:tSTRING_END, ["  FOO", #<Parser::Source::Range example.rb 15...23>]], [:tNL, [nil, #<Parser::Source::Range example.rb 10...11>]]]]
    ```
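
    The difference between the expected output and the output before the fix can also be listed mechanically. The sketch below copies the `[type, value]` pairs from the console outputs in this message (source ranges are omitted) and compares them position by position. `token_mismatches` is a hypothetical helper written for illustration; it is not part of Prism or the Parser gem.

    ```ruby
    # [type, value] pairs copied from the Parser gem output (expected).
    expected = [
      [:tXSTRING_BEG, "<<`"],
      [:tSTRING_CONTENT, "a\n"],
      [:tSTRING_CONTENT, "b\n"],
      [:tSTRING_END, "  FOO"],
      [:tNL, nil]
    ]

    # [type, value] pairs copied from the Prism::Translation::Parser output before the fix.
    actual_before_fix = [
      [:tSTRING_BEG, "<<\""],
      [:tSTRING_CONTENT, "a\n"],
      [:tSTRING_CONTENT, "b\n"],
      [:tSTRING_END, "`  FOO`"],
      [:tNL, nil]
    ]

    # Hypothetical helper: returns [index, expected, actual] for every
    # position at which the two token lists differ.
    def token_mismatches(expected, actual)
      expected.zip(actual).each_with_index.filter_map do |(exp, act), index|
        [index, exp, act] unless exp == act
      end
    end

    token_mismatches(expected, actual_before_fix).each do |index, exp, act|
      puts "token #{index}: expected #{exp.inspect}, got #{act.inspect}"
    end
    ```

    Only the opening and closing tokens are reported; the `tSTRING_CONTENT` and `tNL` tokens already matched before the fix.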
    
    https://github.com/ruby/prism/commit/308f8d85a1